# Python's Scientific Computing Package: NumPy

**Numerical Python extensions** (NumPy) is a third-party Python package for scientific computing. It has become the foundation for most scientific computing in Python.

https://zhuanlan.zhihu.com/p/24309547

----

## 1. The Basic Type (Array)

NumPy makes it easy to create tensors (multi-dimensional arrays) of various types and to perform basic operations on them.

### 1.1 One-Dimensional Arrays

In [1]:
# By convention, numpy is imported under the alias np
import numpy as np

In [2]:
# Create a numpy array from a list
a = [1, 2, 3, 4]
b = np.array(a)
type(b)

numpy.ndarray

In [3]:
# Shape of the array
b.shape

(4,)

In [4]:
# Index of the maximum value
b.argmax()

3

In [5]:
# Maximum value
b.max()

4

In [6]:
# Mean
b.mean()

2.5

### 1.2 Two-Dimensional Arrays
$$ \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} $$

In [7]:
c = [[1, 2], [3, 4]]
d = np.array(c)
d

array([[1, 2],
       [3, 4]])

In [8]:
d.shape

(2, 2)

In [9]:
d.size

4

In [10]:
d.max(axis=0)

array([3, 4])

In [11]:
d.max(axis=1)

array([2, 4])

In [12]:
d.mean(axis=0)

array([ 2.,  3.])
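The `axis` argument names the dimension that a reduction collapses. A minimal sketch of that behaviour follows; the array `m` and the use of `keepdims` are illustrative additions, not part of the cells above.

```python
import numpy as np

m = np.array([[1, 2], [3, 4]])  # hypothetical example array

# axis=0 collapses the rows, leaving one value per column
col_max = m.max(axis=0)

# axis=1 collapses the columns, leaving one value per row
row_max = m.max(axis=1)

# keepdims=True keeps the collapsed axis with length 1,
# which is convenient when the result is later combined with m
row_mean = m.mean(axis=1, keepdims=True)

print(col_max, row_max, row_mean.shape)
```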

In [13]:
# Flatten the array into a 1-D array (always returns a copy)
d.flatten()

array([1, 2, 3, 4])

In [14]:
# Return a flattened 1-D array; np.ravel accepts any array-like, here the list c
np.ravel(c)

array([1, 2, 3, 4])
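`flatten` and `ravel` produce the same values but differ in whether the result shares memory with the input. The sketch below re-creates the 2x2 array `d` and checks this with `np.shares_memory`; it assumes a contiguous input, which is the case where `ravel` can return a view.

```python
import numpy as np

d = np.array([[1, 2], [3, 4]])

flat = d.flatten()  # always a new copy
rav = d.ravel()     # a view when the memory layout allows it

shares_flat = np.shares_memory(d, flat)
shares_rav = np.shares_memory(d, rav)
print(shares_flat, shares_rav)

rav[0] = 100        # writes through to d because rav is a view here
print(d[0, 0])
```

If the flattened result will be modified independently of the original, `flatten` (or an explicit `.copy()`) is the safer choice.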

### 1.3 Creating Arrays with NumPy

NumPy provides many functions for creating arrays.

In [15]:
e = np.ones((3, 3), dtype=np.float64)  # the np.float alias was removed in NumPy 1.24
e

array([[ 1.,  1.,  1.],
       [ 1.,  1.,  1.],
       [ 1.,  1.,  1.]])

In [16]:
f = np.repeat(3, 4)
f

array([3, 3, 3, 3])

In [17]:
# Create a 2x2x3 tensor
g = np.zeros((2, 2, 3), dtype=np.uint8)
g.shape

(2, 2, 3)

In [18]:
h = g.astype(np.float64)  # the np.float alias was removed in NumPy 1.24
h

array([[[ 0.,  0.,  0.],
        [ 0.,  0.,  0.]],

       [[ 0.,  0.,  0.],
        [ 0.,  0.,  0.]]])

In [19]:
# Similar to the built-in range function
l = np.arange(10)
l

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [20]:
# 5 evenly spaced values from 0 to 6, both endpoints included
m = np.linspace(0, 6, 5)
m

array([ 0. ,  1.5,  3. ,  4.5,  6. ])
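`arange` and `linspace` differ in how the range is specified: `arange` takes a step and excludes the stop value, while `linspace` takes a count and includes the stop value by default. A small sketch with illustrative arguments:

```python
import numpy as np

by_step = np.arange(0, 6, 1.5)                   # stop excluded: 0, 1.5, 3, 4.5
by_count = np.linspace(0, 6, 5)                  # stop included: 0, 1.5, 3, 4.5, 6
no_end = np.linspace(0, 6, 4, endpoint=False)    # stop excluded: 0, 1.5, 3, 4.5

print(by_step)
print(by_count)
print(no_end)
```

With non-integer steps, `linspace` is often the easier way to get a predictable number of points.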

In [21]:
# save and load
p = np.array(
    [[1, 2, 3, 4],
     [5, 6, 7, 8]]
)

np.save('p.npy', p)
q = np.load('p.npy')
q

array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

### 1.4 Slicing, Transposing, and Flipping NumPy Arrays

In [22]:
a = np.arange(24).reshape((2, 3, 4))
a

array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])

In [23]:
a[1][1][1]

17

In [24]:
# A colon (:) selects every index along that dimension
a[:, 2, :]

array([[ 8,  9, 10, 11],
       [20, 21, 22, 23]])

In [25]:
a[:, :, 1]

array([[ 1,  5,  9],
       [13, 17, 21]])

In [26]:
# An ellipsis (...) stands for all dimensions not listed explicitly
a[..., 1]

array([[ 1,  5,  9],
       [13, 17, 21]])

In [27]:
a[:, 1:, 1:-1]

array([[[ 5,  6],
        [ 9, 10]],

       [[17, 18],
        [21, 22]]])
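Basic slices like the ones above return views, not copies, so writing to a slice changes the original array. A minimal sketch, re-creating `a` and using the hypothetical names `view` and `copy`:

```python
import numpy as np

a = np.arange(24).reshape((2, 3, 4))

view = a[:, 1:, 1:-1]          # basic slicing: shares memory with a
copy = a[:, 1:, 1:-1].copy()   # explicit, independent copy

view[0, 0, 0] = -1             # also changes a[0, 1, 1]
copy[0, 0, 1] = -2             # leaves a untouched

print(a[0, 1, 1], a[0, 1, 2])
```

Calling `.copy()` is the usual way to detach a slice when the original must stay unchanged.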

In [28]:
np.split(np.arange(9), 3)

[array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8])]

In [29]:
# Split at the given indices: [:2], [2:-3], [-3:]
np.split(np.arange(9), [2, -3])

[array([0, 1]), array([2, 3, 4, 5]), array([6, 7, 8])]

In [30]:
l0 = np.arange(6).reshape((2, 3))
l1 = np.arange(6, 12).reshape((2, 3))

In [31]:
# Stack the two arrays vertically (row-wise)
np.vstack((l0, l1))

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])

In [32]:
# Stack the two arrays horizontally (column-wise)
np.hstack((l0, l1))

array([[ 0,  1,  2,  6,  7,  8],
       [ 3,  4,  5,  9, 10, 11]])

In [33]:
np.concatenate((l0, l1))

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])

In [34]:
np.concatenate((l0, l1), axis=-1)

array([[ 0,  1,  2,  6,  7,  8],
       [ 3,  4,  5,  9, 10, 11]])

In [35]:
# stack joins the arrays along a new axis
s = np.stack((l0, l1))
s

array([[[ 0,  1,  2],
        [ 3,  4,  5]],

       [[ 6,  7,  8],
        [ 9, 10, 11]]])
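The difference between `concatenate` and `stack` is easiest to see from the result shapes: `concatenate` joins along an existing axis, while `stack` inserts a new one. A short sketch re-creating `l0` and `l1`; the `*_shape` names are illustrative.

```python
import numpy as np

l0 = np.arange(6).reshape((2, 3))
l1 = np.arange(6, 12).reshape((2, 3))

# concatenate joins along an existing axis, so the result stays 2-D
cat0_shape = np.concatenate((l0, l1), axis=0).shape   # (4, 3)
cat1_shape = np.concatenate((l0, l1), axis=1).shape   # (2, 6)

# stack inserts a new axis at the given position, so the result is 3-D
stack0_shape = np.stack((l0, l1), axis=0).shape       # (2, 2, 3)
stack_last_shape = np.stack((l0, l1), axis=-1).shape  # (2, 3, 2)

print(cat0_shape, cat1_shape, stack0_shape, stack_last_shape)
```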

In [36]:
# Transpose by permuting the axes in the given order
s.transpose((2, 0, 1))

array([[[ 0,  3],
        [ 6,  9]],

       [[ 1,  4],
        [ 7, 10]],

       [[ 2,  5],
        [ 8, 11]]])

In [37]:
u = a[0].transpose()
u

array([[ 0,  4,  8],
       [ 1,  5,  9],
       [ 2,  6, 10],
       [ 3,  7, 11]])

In [38]:
# Rotate u counterclockwise by 90 degrees three times (equivalent to 90 degrees clockwise)
v = np.rot90(u, 3)
v

array([[ 3,  2,  1,  0],
       [ 7,  6,  5,  4],
       [11, 10,  9,  8]])

In [39]:
# Flip upside down (reverse the row order)
w = np.flipud(u)
w

array([[ 3,  7, 11],
       [ 2,  6, 10],
       [ 1,  5,  9],
       [ 0,  4,  8]])

In [40]:
# Flip left to right (reverse the column order)
x = np.fliplr(u)
x

array([[ 8,  4,  0],
       [ 9,  5,  1],
       [10,  6,  2],
       [11,  7,  3]])

In [41]:
# Roll elements by one position in flattened order (the original shape is restored)
np.roll(u, 1)

array([[11,  0,  4],
       [ 8,  1,  5],
       [ 9,  2,  6],
       [10,  3,  7]])

In [42]:
# Roll elements by one position along the given axis
np.roll(u, 1, axis=1)

array([[ 8,  0,  4],
       [ 9,  1,  5],
       [10,  2,  6],
       [11,  3,  7]])

### 1.5 Basic Math Operations in NumPy

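NumPy's basic math functions accept arrays as well as scalars; functions such as `np.sin` and `np.exp` are applied element-wise.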

In [44]:
np.abs(-1)

1

In [45]:
np.sin(np.pi/2)

1.0

In [46]:
a = np.array([0.462884, 0.978545])
np.arctanh(a)

array([ 0.50097551,  2.26207964])

In [47]:
np.exp(a)

array([ 1.58864905,  2.66058228])

In [48]:
np.power(a, 3)

array([ 0.09917827,  0.93700607])

In [49]:
np.dot([1, 2], [3, 4])

11

In [50]:
np.sqrt(a)

array([ 0.68035579,  0.98921433])

In [51]:
np.sum([1, 2, 3, 4])

10

In [52]:
np.mean([5, 6, 7, 8])

6.5

In [53]:
np.std([1, 2, 3])

0.81649658092772603
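The value above is the population standard deviation, because `np.std` divides by N by default. For the sample standard deviation (dividing by N - 1), pass `ddof=1`. A minimal sketch with illustrative variable names:

```python
import numpy as np

x = np.array([1, 2, 3])

pop_std = np.std(x)              # divides by N (ddof=0, the default)
sample_std = np.std(x, ddof=1)   # divides by N - 1

print(pop_std, sample_std)
```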

### 1.6 Broadcasting

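When two arrays have different shapes, NumPy stretches the dimensions that do not match so that the shapes agree, provided the mismatched dimension has length 1 or is missing in one of the arrays.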

In [54]:
a = np.arange(1,7).reshape((2, 3))
b = np.array([1, 2, 3, 1, 2, 3]).reshape((2, 3))

In [55]:
a

array([[1, 2, 3],
       [4, 5, 6]])

In [56]:
b

array([[1, 2, 3],
       [1, 2, 3]])

In [57]:
a + b

array([[2, 4, 6],
       [5, 7, 9]])

In [58]:
a - b

array([[0, 0, 0],
       [3, 3, 3]])

In [59]:
a * b

array([[ 1,  4,  9],
       [ 4, 10, 18]])

In [60]:
a / b

array([[ 1. ,  1. ,  1. ],
       [ 4. ,  2.5,  2. ]])

In [61]:
a ** 2

array([[ 1,  4,  9],
       [16, 25, 36]], dtype=int32)

In [62]:
a ** b

array([[  1,   4,  27],
       [  4,  25, 216]], dtype=int32)

In [63]:
c = np.arange(1, 13).reshape((4, 3))
c

array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])

In [64]:
d = np.array([2, 2, 2]).reshape((1, 3))
d

array([[2, 2, 2]])

In [65]:
c + d

array([[ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11],
       [12, 13, 14]])

In [66]:
c * d

array([[ 2,  4,  6],
       [ 8, 10, 12],
       [14, 16, 18],
       [20, 22, 24]])

In [67]:
c - 1

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])
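The rule behind these results is that shapes are compared from the trailing dimension backwards, and two dimensions are compatible when they are equal or one of them is 1. The sketch below illustrates both a successful and a failing case; `row`, `col`, and `failed` are hypothetical names introduced for the example.

```python
import numpy as np

c = np.arange(1, 13).reshape((4, 3))
row = np.array([2, 2, 2])             # shape (3,), treated as (1, 3)
col = np.arange(4).reshape((4, 1))    # shape (4, 1)

print((c + row).shape)    # (4, 3)
print((c + col).shape)    # (4, 3)
print((row + col).shape)  # (4, 3): both operands are stretched

# broadcast_to makes the stretched operand explicit (as a read-only view)
stretched = np.broadcast_to(row, (4, 3))
print(stretched.shape)    # (4, 3)

# Shapes (4, 3) and (4,) do not line up on the trailing axis: 3 != 4
failed = False
try:
    c + np.arange(4)
except ValueError as err:
    failed = True
    print("broadcast failed:", err)
```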

## 2. The Linear Algebra Module (linalg)

NumPy provides a module for common linear algebra operations.
```
Core linear algebra functions:

- norm            Vector or matrix norm
- inv             Inverse of a square matrix
- solve           Solve a linear system of equations
- det             Determinant of a square matrix
- lstsq           Solve linear least-squares problem
- pinv            Pseudo-inverse (Moore-Penrose) calculated using a singular value decomposition
- matrix_power    Integer power of a square matrix

Eigenvalues and decompositions:

- eig             Eigenvalues and vectors of a square matrix
- eigh            Eigenvalues and eigenvectors of a Hermitian matrix
- eigvals         Eigenvalues of a square matrix
- eigvalsh        Eigenvalues of a Hermitian matrix
- qr              QR decomposition of a matrix
- svd             Singular value decomposition of a matrix
- cholesky        Cholesky decomposition of a matrix

Tensor operations:

- tensorsolve     Solve a linear tensor equation
- tensorinv       Calculate an inverse of a tensor

Exceptions:

- LinAlgError     Indicates a failed linear algebra operation
```
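Of the functions listed above, `solve` is typically preferred over computing an explicit inverse when the goal is to solve a linear system. A minimal sketch on a hypothetical 2x2 system (the matrix `A` and right-hand side `rhs` are made up for illustration):

```python
import numpy as np

# Hypothetical system: 2x + y = 3, x + 2y = 3
A = np.array([[2.0, 1.0], [1.0, 2.0]])
rhs = np.array([3.0, 3.0])

x = np.linalg.solve(A, rhs)       # solves A @ x = rhs without forming the inverse
x_inv = np.linalg.inv(A) @ rhs    # same answer here, via the explicit inverse

print(x)
print(np.allclose(A @ x, rhs), np.allclose(x, x_inv))
```

For a singular matrix, both `solve` and `inv` raise `LinAlgError`.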

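When computing the dot product of a vector and a matrix, first reshape the vector into a 2-D array (a row or a column) so that the result keeps an explicit orientation.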

In [68]:
a = np.array([3, 4])
np.linalg.norm(a)

5.0

In [69]:
b = np.arange(1, 10).reshape((3, 3))
b

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [70]:
c = np.array([1, 0, 1]).reshape((3, 1))
c

array([[1],
       [0],
       [1]])

In [71]:
np.dot(b, c)

array([[ 4],
       [10],
       [16]])

In [72]:
np.dot(c.T, b)

array([[ 8, 10, 12]])
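The reason for reshaping becomes clearer when comparing a 1-D vector with its column form. The sketch below re-creates `b` and uses the hypothetical names `v` and `col`; it also shows the `@` operator, which performs the same matrix multiplication as `np.dot` for these 2-D inputs.

```python
import numpy as np

b = np.arange(1, 10).reshape((3, 3))
v = np.array([1, 0, 1])     # 1-D vector, shape (3,)
col = v.reshape((3, 1))     # column vector, shape (3, 1)

print(np.dot(b, v).shape)    # (3,): 1-D in, 1-D out
print(np.dot(b, col).shape)  # (3, 1): the column shape is preserved
print((col.T @ b).shape)     # (1, 3): row vector times matrix

# A 1-D array has no row/column orientation, so .T leaves it unchanged
print(v.T.shape)             # (3,)
```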

In [73]:
np.trace(b)

15

In [74]:
np.linalg.det(b)

-9.5161973539299405e-16

In [75]:
np.linalg.matrix_rank(b)

2

In [76]:
d = np.array([2, 1, 1, 2]).reshape((2, 2))

In [77]:
# Eigendecomposition
u, v = np.linalg.eig(d)
print(u) # eigenvalues
print(v) # eigenvectors (one per column)

[ 3.  1.]
[[ 0.70710678 -0.70710678]
 [ 0.70710678  0.70710678]]


In [78]:
np.dot(np.dot(v, np.diag(u)), v.T)

array([[ 2.,  1.],
       [ 1.,  2.]])
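The reconstruction above uses the transpose of the eigenvector matrix in place of its inverse, which works because `d` is symmetric and its eigenvectors are orthonormal. For symmetric input, `np.linalg.eigh` is the specialised routine; the sketch below uses it to check the defining relation directly. The names `w` and `q` are illustrative.

```python
import numpy as np

d = np.array([[2, 1], [1, 2]])

# eigh is meant for symmetric/Hermitian matrices and returns
# the eigenvalues in ascending order
w, q = np.linalg.eigh(d)

# Each column q[:, i] satisfies d @ q[:, i] == w[i] * q[:, i]
print(np.allclose(d @ q, q * w))

# Reconstruct d from its eigendecomposition
print(np.allclose(q @ np.diag(w) @ q.T, d))
```

For a non-symmetric matrix the transpose would have to be replaced by `np.linalg.inv(v)` in the reconstruction.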

In [79]:
# Cholesky decomposition
l = np.linalg.cholesky(d)
l

array([[ 1.41421356,  0.        ],
       [ 0.70710678,  1.22474487]])

In [80]:
np.dot(l, l.T)

array([[ 2.,  1.],
       [ 1.,  2.]])

In [81]:
# Singular value decomposition (SVD)
e = np.array([1, 2, 3, 4, 5, 6]).reshape((2, 3))
e

array([[1, 2, 3],
       [4, 5, 6]])

In [82]:
# Note: the third return value is V transposed (often written Vh), not V itself
U, s, V = np.linalg.svd(e)

In [83]:
U

array([[-0.3863177 , -0.92236578],
       [-0.92236578,  0.3863177 ]])

In [84]:
s

array([ 9.508032  ,  0.77286964])

In [85]:
V

array([[-0.42866713, -0.56630692, -0.7039467 ],
       [ 0.80596391,  0.11238241, -0.58119908],
       [ 0.40824829, -0.81649658,  0.40824829]])

In [86]:
# Pad the singular values with a zero column to form the 2x3 matrix S
S = np.hstack((np.diag(s), np.zeros((2, 1))))
S

array([[ 9.508032  ,  0.        ,  0.        ],
       [ 0.        ,  0.77286964,  0.        ]])

In [87]:
np.dot(U, np.dot(S, V))

array([[ 1.,  2.,  3.],
       [ 4.,  5.,  6.]])
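The zero padding is only needed because `svd` returns full-size factors by default. As an alternative sketch, passing `full_matrices=False` yields the reduced decomposition, whose factors multiply back together directly. The names `Vh` and `recon` are illustrative.

```python
import numpy as np

e = np.array([[1, 2, 3], [4, 5, 6]])

# Reduced SVD: U is (2, 2), s is (2,), Vh is (2, 3)
U, s, Vh = np.linalg.svd(e, full_matrices=False)

recon = U @ np.diag(s) @ Vh   # no zero padding needed
print(U.shape, s.shape, Vh.shape)
print(np.allclose(recon, e))
```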

## 3. The Random Module

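The random module contains basic functions for random number generation and statistical distributions. Python ships its own `random` module as well, but NumPy's version offers far more functionality.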
```
========================
Random Number Generation
========================

==================== =========================================================
Utility functions
==============================================================================
random_sample        Uniformly distributed floats over ``[0, 1)``.
random               Alias for `random_sample`.
bytes                Uniformly distributed random bytes.
random_integers      Uniformly distributed integers in a given range.
permutation          Randomly permute a sequence / generate a random sequence.
shuffle              Randomly permute a sequence in place.
seed                 Seed the random number generator.
choice               Random sample from 1-D array.

==================== =========================================================

==================== =========================================================
Compatibility functions
==============================================================================
rand                 Uniformly distributed values.
randn                Normally distributed values.
ranf                 Uniformly distributed floating point numbers.
randint              Uniformly distributed integers in a given range.
==================== =========================================================

==================== =========================================================
Univariate distributions
==============================================================================
beta                 Beta distribution over ``[0, 1]``.
binomial             Binomial distribution.
chisquare            :math:`\chi^2` distribution.
exponential          Exponential distribution.
f                    F (Fisher-Snedecor) distribution.
gamma                Gamma distribution.
geometric            Geometric distribution.
gumbel               Gumbel distribution.
hypergeometric       Hypergeometric distribution.
laplace              Laplace distribution.
logistic             Logistic distribution.
lognormal            Log-normal distribution.
logseries            Logarithmic series distribution.
negative_binomial    Negative binomial distribution.
noncentral_chisquare Non-central chi-square distribution.
noncentral_f         Non-central F distribution.
normal               Normal / Gaussian distribution.
pareto               Pareto distribution.
poisson              Poisson distribution.
power                Power distribution.
rayleigh             Rayleigh distribution.
triangular           Triangular distribution.
uniform              Uniform distribution.
vonmises             Von Mises circular distribution.
wald                 Wald (inverse Gaussian) distribution.
weibull              Weibull distribution.
zipf                 Zipf's distribution over ranked data.
==================== =========================================================

==================== =========================================================
Multivariate distributions
==============================================================================
dirichlet            Multivariate generalization of Beta distribution.
multinomial          Multivariate generalization of the binomial distribution.
multivariate_normal  Multivariate generalization of the normal distribution.
==================== =========================================================

==================== =========================================================
Standard distributions
==============================================================================
standard_cauchy      Standard Cauchy-Lorentz distribution.
standard_exponential Standard exponential distribution.
standard_gamma       Standard Gamma distribution.
standard_normal      Standard normal distribution.
standard_t           Standard Student's t-distribution.
==================== =========================================================

==================== =========================================================
Internal functions
==============================================================================
get_state            Get tuple representing internal state of generator.
set_state            Set state of generator.
==================== =========================================================
```

In [88]:
import numpy.random as random

In [89]:
random.seed(42)

In [90]:
random.rand(1, 3)

array([[ 0.37454012,  0.95071431,  0.73199394]])

In [91]:
# Draw one random number in [0, 1)
random.random()

0.5986584841970366

In [92]:
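# The following four functions are equivalent; each call draws fresh values and only the last result is displayed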
random.random((3, 3))
random.sample((3, 3))
random.random_sample((3, 3))
random.ranf((3, 3))

array([[ 0.17052412,  0.06505159,  0.94888554],
       [ 0.96563203,  0.80839735,  0.30461377],
       [ 0.09767211,  0.68423303,  0.44015249]])

In [93]:
# Draw 10 random floats in [1, 6)
5 * random.random(10) + 1

array([ 1.61019117,  3.47588455,  1.17194261,  5.54660201,  2.29389991,
        4.31261142,  2.55855538,  3.60034011,  3.7335514 ,  1.92427228])

In [94]:
random.uniform(1, 6, 10)

array([ 5.84792314,  4.87566412,  5.69749471,  5.47413675,  3.98949989,
        5.60937118,  1.44246251,  1.97991431,  1.22613644,  2.62665165])

In [95]:
# Draw 10 random integers in [1, 6)
random.randint(1, 6, 10)

array([5, 2, 5, 2, 1, 4, 4, 4, 5, 1])

In [96]:
# Draw a 5x2 sample from a normal distribution with mean 2 and standard deviation 0.5
mu, sigma = 2, 0.5
random.normal(mu, sigma, size=(5, 2))

array([[ 1.57829004,  1.7433373 ],
       [ 1.9782316 ,  1.86234954],
       [ 1.2184666 ,  1.6021845 ],
       [ 2.4652922 ,  2.3388837 ],
       [ 2.34922013,  2.08680103]])

In [97]:
# Draw 20 samples from a binomial distribution with n=10, p=0.5: the number of heads in 10 coin tosses
random.binomial(n=10, p=0.5, size=20)

array([6, 7, 7, 5, 3, 4, 6, 6, 5, 4, 5, 5, 8, 7, 6, 4, 4, 2, 6, 3])

In [98]:
a = np.arange(10)
# Random sampling with replacement
random.choice(a, 7)

array([2, 6, 0, 3, 3, 4, 6])

In [99]:
# Random sampling without replacement
random.choice(a, 7, replace=False)

array([0, 7, 8, 1, 5, 2, 9])

In [100]:
# Random permutation (returns a shuffled copy and leaves a unchanged)
indexes = random.permutation(a)
indexes

array([0, 2, 1, 9, 7, 6, 3, 5, 4, 8])

In [101]:
b = np.arange(10) * 2 + 2
b

array([ 2,  4,  6,  8, 10, 12, 14, 16, 18, 20])

In [102]:
b[indexes]

array([ 2,  6,  4, 20, 16, 14,  8, 12, 10, 18])
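Indexing with a permutation, as above, is a common way to shuffle two related arrays in the same order. A minimal sketch; the names `features` and `labels`, the local `RandomState`, and the seed value are assumptions made for illustration.

```python
import numpy as np

rng_state = np.random.RandomState(0)   # hypothetical local generator, arbitrary seed

features = np.arange(10)
labels = features * 2 + 2

idx = rng_state.permutation(len(features))   # a random ordering of 0..9

# Indexing both arrays with the same idx keeps each pair aligned
shuffled_features = features[idx]
shuffled_labels = labels[idx]

print(np.array_equal(shuffled_labels, shuffled_features * 2 + 2))
```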

In [103]:
# Shuffle the array in place
random.shuffle(a)
a

array([1, 2, 7, 4, 8, 5, 0, 3, 6, 9])

In [104]:
# Generate 9 random bytes, returned as a bytes object
random.bytes(9)

b'\xbd\xefVmx\x1a"\x00s'
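The cells above use the module-level functions with a global seed. NumPy 1.17 and later also provide a `Generator` API through `np.random.default_rng`, which keeps the random state in an explicit object. The sketch below shows rough equivalents of some of the calls above, under the assumption that a recent NumPy is installed; it produces different numbers from `random.seed(42)` because the underlying bit generator differs.

```python
import numpy as np

rng = np.random.default_rng(42)   # Generator API, NumPy 1.17+

u = rng.random((1, 3))                  # floats in [0, 1)
k = rng.integers(1, 6, size=10)         # integers in [1, 6), high is exclusive
n = rng.normal(2, 0.5, size=(5, 2))     # normal with mean 2, std 0.5
pick = rng.choice(np.arange(10), 7, replace=False)  # sampling without replacement

# Two generators built from the same seed produce the same stream
same = np.random.default_rng(42).random((1, 3))
print(np.array_equal(u, same))
```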