# NumPy

[NumPy: the absolute basics for beginners](https://numpy.org/doc/1.26/user/absolute_beginners.html)

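## NumPy Core Concepts

- ndarray: understand the N-dimensional array, especially its shape (the size of each dimension).

- Broadcasting: one of NumPy's most powerful features; it lets arrays of different but compatible shapes take part in element-wise arithmetic.

- Vectorized computation: replace Python `for` loops with array operations, which is often tens of times faster on large arrays.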
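To make the vectorization point concrete, here is a minimal sketch that computes the same expression with a Python loop and with a single array expression. The array size and the expression `v * 2.0 + 1.0` are arbitrary choices for illustration, and no timing is claimed here because the speedup depends on the machine and the array size.

```python
import numpy as np

values = np.arange(1_000, dtype=np.float64)

# Pure-Python loop: one interpreter-level multiply-add per element.
loop_result = [v * 2.0 + 1.0 for v in values]

# Vectorized: the same arithmetic runs in compiled code over the whole array.
vectorized_result = values * 2.0 + 1.0

print(np.allclose(loop_result, vectorized_result))  # True
```

Wrapping each version in `timeit` is one way to measure the difference on your own machine.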

## ndarray

In [5]:
import numpy as np

a = np.array([[1,2], [2, 3]])
print(a.shape)
a.dtype

(2, 2)


dtype('int32')
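The `int32` above reflects the default integer type of the platform the cell ran on; on other platforms or NumPy versions the same cell commonly reports `int64`. A minimal sketch, assuming you want a platform-independent result, passes `dtype` explicitly and inspects the other basic attributes:

```python
import numpy as np

# Explicit dtype, so the result does not depend on the platform default.
a = np.array([[1, 2], [2, 3]], dtype=np.int64)

print(a.ndim)   # number of axes: 2
print(a.shape)  # size along each axis: (2, 2)
print(a.size)   # total number of elements: 4
print(a.dtype)  # int64
```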

## array creation

In [6]:
np.zeros((2,3))

array([[0., 0., 0.],
       [0., 0., 0.]])

In [7]:
np.ones((2, 3))

array([[1., 1., 1.],
       [1., 1., 1.]])

In [9]:
np.arange(0, 10, 2)

array([0, 2, 4, 6, 8])

In [10]:
np.linspace(0, 1, 5)

array([0.  , 0.25, 0.5 , 0.75, 1.  ])

In [13]:
np.eye(3, k=1)

array([[0., 1., 0.],
       [0., 0., 1.],
       [0., 0., 0.]])

In [14]:
np.eye(3, k=-1)

array([[0., 0., 0.],
       [1., 0., 0.],
       [0., 1., 0.]])

## Broadcasting

[Broadcasting](https://numpy.org/doc/1.26/user/basics.broadcasting.html#basics-broadcasting)

In [62]:
a = np.array([1, 2, 3])
b = np.array([2, 2, 2])
a * b 

array([2, 4, 6])

In [56]:
a = np.array([1, 2, 3])
b = 2
a * b 

array([2, 4, 6])

### stretch on row

In [67]:
a = np.array([[ 0.0,  0.0,  0.0],
              [10.0, 10.0, 10.0],
              [20.0, 20.0, 20.0],
              [30.0, 30.0, 30.0]])
a.shape
b = np.array([1.0, 2.0, 3.0])
a * b

array([[ 0.,  0.,  0.],
       [10., 20., 30.],
       [20., 40., 60.],
       [30., 60., 90.]])

![alt text](../imgs/image.png)

In [None]:
# b = np.array([1.0, 2.0, 3.0, 4.0])
# a * b 
# ValueError: operands could not be broadcast together with shapes (4,3) (4,) 

ValueError: operands could not be broadcast together with shapes (4,3) (4,) 

![alt text](../imgs/image2.png)
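The two cases above follow from the broadcasting rule: shapes are compared from the trailing dimension backwards, and each pair of sizes must be equal or one of them must be 1 (a missing leading dimension counts as 1). As a sketch, `np.broadcast_shapes` (available in NumPy 1.20 and later) applies that rule to shapes alone, without building any arrays:

```python
import numpy as np

# (4, 3) with (3,): trailing sizes 3 and 3 match, so b is stretched over 4 rows.
row_case = np.broadcast_shapes((4, 3), (3,))
print(row_case)  # (4, 3)

# (4, 1) with (3,): the size-1 axis and the missing axis are both stretched.
both_case = np.broadcast_shapes((4, 1), (3,))
print(both_case)  # (4, 3)

# (4, 3) with (4,): trailing sizes 3 and 4 differ and neither is 1.
try:
    np.broadcast_shapes((4, 3), (4,))
    compatible = True
except ValueError:
    compatible = False
print(compatible)  # False
```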

### stretch both row and column

In [None]:
a = np.array([0.0, 10.0, 20.0, 30.0])
b = np.array([1.0, 2.0, 3.0])
# a[:, np.newaxis] turns the 1-D array of shape (4,) into a 2-D column of shape (4, 1)
a[:, np.newaxis] + b

array([[ 1.,  2.,  3.],
       [11., 12., 13.],
       [21., 22., 23.],
       [31., 32., 33.]])

![alt text](../imgs/image3.png)

In [82]:
from numpy import array, argmin, sqrt, sum
observation = array([111.0, 188.0])
codes = array([[102.0, 203.0],
               [132.0, 193.0],
               [45.0, 155.0],
               [57.0, 173.0]])
diff = codes - observation    # the broadcast happens here
print(diff)
dist = sqrt(sum(diff**2,axis=-1))
print(dist)
argmin(dist)

[[ -9.  15.]
 [ 21.   5.]
 [-66. -33.]
 [-54. -15.]]
[17.49285568 21.58703314 73.79024326 56.04462508]


0

In [None]:
from numpy import array, argmin, sqrt, sum
observation = array([111.0, 188.0])
codes = array([[102.0, 203.0],
               [132.0, 193.0],
               [45.0, 155.0],
               [57.0, 173.0]])
diff = codes - observation
# for comparison with the axis=-1 result: sum along axis=0 instead
sqrt(sum(diff**2,axis=0))


array([88.28363382, 39.54743987])

| axis    | sum operation                                              | result shape | meaning                                                                  |
| ------- | ---------------------------------------------------------- | ------------ | ------------------------------------------------------------------------ |
| -1 or 1 | sum across the components of each vector (along each row)  | (4,)         | squared distance per vector                                              |
| 0       | sum down the rows for each component (along each column)   | (2,)         | sum of the squared first components, sum of the squared second components |
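The table can be checked directly on the `diff` values printed earlier. This sketch hard-codes those values so it runs on its own; `keepdims=True` is an extra option, not used in the original cells, that keeps the summed axis as size 1.

```python
import numpy as np

diff = np.array([[ -9.0,  15.0],
                 [ 21.0,   5.0],
                 [-66.0, -33.0],
                 [-54.0, -15.0]])

per_vector = np.sum(diff**2, axis=-1)            # collapses the last axis
per_component = np.sum(diff**2, axis=0)          # collapses the first axis
kept = np.sum(diff**2, axis=-1, keepdims=True)   # summed axis kept as size 1

print(per_vector.shape)     # (4,)
print(per_component.shape)  # (2,)
print(kept.shape)           # (4, 1)
```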


## Universal Functions

In [21]:
a = np.array([1, 2, 3])  # reset a; the broadcasting cells above rebound it
np.sqrt(a)

array([1.        , 1.41421356, 1.73205081])

In [None]:
# element-wise exponential: y = e^x
np.exp(a)

array([ 2.71828183,  7.3890561 , 20.08553692])

In [26]:
np.maximum(a, 2)

array([2, 2, 3])

## Aggregations

In [27]:
a.sum()

6

In [30]:
a.mean(axis=0)

2.0

In [32]:
a.max(axis=0)

3

## Shape Manipulation

In [34]:
a.reshape(3, 1)

array([[1],
       [2],
       [3]])

In [35]:
a.T  # transposing a 1-D array returns it unchanged

array([1, 2, 3])
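The output is identical to `a` because a 1-D array has only one axis, so there is nothing to swap. A small sketch of two common ways to get an explicit column or row first, after which `.T` behaves as expected:

```python
import numpy as np

a = np.array([1, 2, 3])

print(a.T.shape)            # (3,)  unchanged

column = a.reshape(-1, 1)   # -1 lets NumPy infer the length: shape (3, 1)
row = a[np.newaxis, :]      # insert a leading axis: shape (1, 3)

print(column.shape)         # (3, 1)
print(column.T.shape)       # (1, 3)
print(row.shape)            # (1, 3)
```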

In [36]:
np.concatenate([a, a], axis=0)

array([1, 2, 3, 1, 2, 3])

In [37]:
np.stack((a, a), axis=0)

array([[1, 2, 3],
       [1, 2, 3]])

## Linear Algebra

$$
A = 
\begin{bmatrix}
a & b \\
c & d
\end{bmatrix}
$$

$$
\det(A) = ad - bc
$$

In [None]:
A = np.array([[1, 2], [3, 4]])
# 1*4 - 2*3 = -2
np.linalg.det(A)

-2.0000000000000004
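The result is not exactly -2 because `np.linalg.det` works in floating point (it goes through an LU factorization rather than evaluating `ad - bc` directly). A sketch of how one might compare it against the hand-computed value with a tolerance instead of `==`:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])

det = np.linalg.det(A)
by_hand = 1 * 4 - 2 * 3     # ad - bc = -2

print(det == by_hand)            # may be False because of rounding error
print(np.isclose(det, by_hand))  # True
```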

$$
A = 
\begin{bmatrix}
a & b \\
c & d
\end{bmatrix}
$$

$$
\operatorname{inv}(A) = A^{-1} = \frac{1}{ad - bc}
\begin{bmatrix}
d & -b \\
-c & a
\end{bmatrix}
$$

In [40]:
np.linalg.inv(A)

array([[-2. ,  1. ],
       [ 1.5, -0.5]])
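As a check on the formula, this sketch builds the 2×2 closed form by hand and compares it with `np.linalg.inv`, then confirms that the product with `A` is the identity. The names `a`, `b`, `c`, `d` mirror the symbols in the formula; comparisons use `np.allclose` because the results are floating point.

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
(a, b), (c, d) = A

# Closed form for a 2x2 matrix: 1 / (ad - bc) * [[d, -b], [-c, a]]
closed_form = (1.0 / (a * d - b * c)) * np.array([[d, -b], [-c, a]])
A_inv = np.linalg.inv(A)

print(np.allclose(A_inv, closed_form))    # True
print(np.allclose(A @ A_inv, np.eye(2)))  # True
```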

$$
A = 
\begin{bmatrix}
a & b \\
c & d
\end{bmatrix}
$$

$$
A \cdot A = 
\begin{bmatrix}
a^2 + bc & ab + bd \\
ca + dc & cb + d^2
\end{bmatrix}
$$

In [41]:
np.dot(A, A)

array([[ 7, 10],
       [15, 22]])
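For 2-D arrays the `@` operator gives the same matrix product as `np.dot`, while `*` multiplies element by element, which is an easy thing to mix up. A short sketch of the difference:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])

matrix_product = A @ A   # same as np.dot(A, A) for 2-D arrays
elementwise = A * A      # squares each entry independently

print(matrix_product)
# [[ 7 10]
#  [15 22]]
print(elementwise)
# [[ 1  4]
#  [ 9 16]]
```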

## Random Module

In [42]:
rng = np.random.default_rng(42)
rng.integers(0, 10, size=5)

array([0, 7, 6, 4, 4], dtype=int64)

In [43]:
rng.normal(0, 1, size=(2, 2))

array([[ 0.94056472, -1.95103519],
       [-1.30217951,  0.1278404 ]])
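Passing a seed to `default_rng` makes the stream reproducible: two generators created with the same seed produce the same draws, and each further call continues the stream rather than restarting it. A minimal sketch (the names `rng_a` and `rng_b` are only for illustration):

```python
import numpy as np

rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

draw_a = rng_a.integers(0, 10, size=5)
draw_b = rng_b.integers(0, 10, size=5)

# Same seed, same position in the stream: identical draws.
print(np.array_equal(draw_a, draw_b))  # True

# The upper bound is exclusive by default, so values fall in 0..9.
print(draw_a.min() >= 0 and draw_a.max() < 10)  # True
```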

## Implementing a simple Cosine Similarity calculation with NumPy

`np.linalg.norm(a)`

$$
\vec{a} = 
\begin{bmatrix}
1 \\ 2 \\ 3
\end{bmatrix}, \quad
\|\vec{a}\| = \sqrt{1^2 + 2^2 + 3^2}
$$

$$
\|\vec{a}\| = \sqrt{\sum_{i=1}^{n} a_i^2}
$$
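The formula can be written out directly and compared with `np.linalg.norm`, which by default computes this same Euclidean (L2) norm for a 1-D array. A minimal sketch:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])

manual = np.sqrt(np.sum(a**2))   # sqrt(1^2 + 2^2 + 3^2) = sqrt(14)
builtin = np.linalg.norm(a)

print(np.isclose(manual, builtin))  # True
```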

In [50]:
import numpy as np

def cosine_similarity(v1, v2):
    """
    Compute the cosine similarity of two vectors.
    Formula: (v1 . v2) / (|v1| * |v2|)
    """
    # 1. Dot product
    np_dot = np.dot(v1, v2)
    # 2. Norm (length) of each vector
    norm_v1 = np.linalg.norm(v1)
    norm_v2 = np.linalg.norm(v2)
    # 3. Similarity
    return np_dot / (norm_v1 * norm_v2)

# Test data (stand-ins for the vectors of two texts)
vector_a = np.array([1.2, 0.5, 3.1])
vector_b = np.array([1.1, 0.6, 3.0]) # should be very similar to a
vector_c = np.array([-1.0, 2.0, 0.1]) # should be very different from a

print(f"Similarity of A and B: {cosine_similarity(vector_a, vector_b)}")
print(f"Similarity of A and C: {cosine_similarity(vector_a, vector_c)}")

Similarity of A and B: 0.9991850397771528
Similarity of A and C: 0.014619570047666973


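$$\text{Similarity} = \cos(\theta) = \frac{\vec{v}_1 \cdot \vec{v}_2}{\|\vec{v}_1\| \times \|\vec{v}_2\|}$$

- Numerator $(\vec{v}_1 \cdot \vec{v}_2)$: the dot product. It measures how closely the two vectors point in the same direction. For fixed lengths it is largest when the vectors point in exactly the same direction, and it is 0 when they are perpendicular (unrelated).
- Denominator $(\|\vec{v}_1\| \times \|\vec{v}_2\|)$: the product of the two vectors' lengths (norms).
- Role of the division: this is the key step. Dividing by the lengths is a normalization: it removes the effect of length and keeps only the difference in direction.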
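The normalization claim can be checked numerically: scaling a vector changes its length but not its direction, so the similarity should stay at 1, while perpendicular vectors should give 0 and opposite vectors -1. This sketch redefines `cosine_similarity` in compact form so it runs on its own. A zero-length vector would make the denominator 0, so real inputs may need a guard, which is left out here.

```python
import numpy as np

def cosine_similarity(v1, v2):
    # dot product divided by the product of the two norms
    return np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))

v = np.array([1.2, 0.5, 3.1])

same_direction = cosine_similarity(v, 10.0 * v)   # longer vector, same direction
opposite = cosine_similarity(v, -v)               # reversed direction
orthogonal = cosine_similarity(np.array([1.0, 0.0]),
                               np.array([0.0, 1.0]))  # perpendicular vectors

print(np.isclose(same_direction, 1.0))  # True
print(np.isclose(opposite, -1.0))       # True
print(np.isclose(orthogonal, 0.0))      # True
```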