### numpy.array Basics

In [1]:
import numpy as np

In [2]:
# Check the NumPy version
np.__version__

'1.16.4'

### Characteristics of Python Lists

In [3]:
L = [i for i in range(10)]
L

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [4]:
L[5] = 100
L

[0, 1, 2, 3, 4, 100, 6, 7, 8, 9]

In [5]:
L[5] = "Machine Learning"
L

[0, 1, 2, 3, 4, 'Machine Learning', 6, 7, 8, 9]

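A Python list can hold elements of different data types. This flexibility also makes lists comparatively inefficient, because each element is a separate object whose type has to be checked when it is used.

Python's standard library also has an array module, whose arrays can hold only one data type.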

In [6]:
import array

In [7]:
arr = array.array('i', [i for i in range(10)])
arr

array('i', [0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [8]:
arr[5] = 100
arr

array('i', [0, 1, 2, 3, 4, 100, 6, 7, 8, 9])

In [9]:
arr[5] = "Machine Learning"

TypeError: an integer is required (got type str)

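The array module treats its data only as a plain array, not as a vector or matrix, and provides no vector or matrix operations. That makes it inconvenient for studying machine learning algorithms.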
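To make that limitation concrete, here is a minimal sketch comparing how the `*` operator behaves on a list, on an `array.array`, and on a `numpy.array` (introduced in the next section). The variable names are illustrative only.

```python
import array
import numpy as np

values = [1, 2, 3]

# For a list, * means repetition, not arithmetic.
doubled_list = values * 2

# array.array behaves like a list here: * repeats the sequence.
doubled_arr = array.array('i', values) * 2

# A NumPy array treats * as element-wise multiplication.
doubled_np = np.array(values) * 2

print(doubled_list)          # [1, 2, 3, 1, 2, 3]
print(doubled_arr.tolist())  # [1, 2, 3, 1, 2, 3]
print(doubled_np.tolist())   # [2, 4, 6]
```

Only the NumPy result is what vector arithmetic would expect.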

### numpy.array

In [10]:
nparr = np.array([i for i in range(10)])
nparr

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [11]:
nparr[5] = 100
nparr

array([  0,   1,   2,   3,   4, 100,   6,   7,   8,   9])

In [12]:
nparr[5] = "Machine Learning"

ValueError: invalid literal for int() with base 10: 'Machine Learning'

In [13]:
nparr.dtype

dtype('int32')

If the array has an integer dtype, a float assigned to one of its elements is automatically truncated to an integer.

In [14]:
nparr[5] = 5.0
nparr

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [15]:
nparr.dtype

dtype('int32')

In [16]:
nparr[3] = 3.14
nparr

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [17]:
nparr.dtype

dtype('int32')

In [18]:
nparr2 = np.array([1, 2, 3.0])
nparr2

array([1., 2., 3.])

In [19]:
nparr2.dtype

dtype('float64')
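The cells above suggest two practical points: assignment into an integer array drops the fractional part rather than rounding, and the dtype is fixed when the array is created. The following is a small sketch of both, using made-up values; `astype` is one way, not the only way, to get a float copy.

```python
import numpy as np

int_arr = np.arange(5)     # integer dtype
int_arr[0] = 3.99          # stored as 3: the fraction is dropped, not rounded
int_arr[1] = -3.99         # stored as -3: truncation is toward zero

# A float copy keeps fractional values.
float_arr = np.arange(5).astype(float)
float_arr[0] = 3.99

# The dtype can also be requested explicitly at creation time.
explicit = np.array([1, 2, 3], dtype=float)

print(int_arr.tolist())    # [3, -3, 2, 3, 4]
print(float_arr.tolist())  # [3.99, 1.0, 2.0, 3.0, 4.0]
print(explicit.dtype)      # float64
```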

### Other Ways to Create a numpy.array

Create an array of all zeros

In [20]:
np.zeros(10)

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

In [21]:
np.zeros(10).dtype

dtype('float64')

In [22]:
np.zeros(10, dtype=int)

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [23]:
np.zeros(shape=(3,5), dtype=int)

array([[0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]])

Create an array of all ones

In [24]:
np.ones(10)

array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])

In [25]:
np.ones(shape=(3,5))

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])

Create an array filled with a specified value

In [26]:
np.full(shape=(3,5), fill_value=666)

array([[666, 666, 666, 666, 666],
       [666, 666, 666, 666, 666],
       [666, 666, 666, 666, 666]])
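Unlike `np.zeros` and `np.ones`, which default to floats, `np.full` takes its dtype from the fill value unless told otherwise. A brief sketch, with illustrative variable names:

```python
import numpy as np

int_full = np.full(shape=(3, 5), fill_value=666)               # integer fill value
float_full = np.full(shape=(3, 5), fill_value=666.0)           # float fill value
forced = np.full(shape=(3, 5), fill_value=666, dtype=float)    # explicit dtype wins

print(np.issubdtype(int_full.dtype, np.integer))  # True
print(float_full.dtype)                           # float64
print(forced.dtype)                               # float64
```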

### arange

In [27]:
[i for i in range(0, 20, 2)] # arguments: start, stop (exclusive), step

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

In [28]:
np.arange(0, 20, 2)

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])

In [29]:
# Passing a float step
[i for i in range(0, 1, 0.2)]

TypeError: 'float' object cannot be interpreted as an integer

In [30]:
np.arange(0, 1, 0.2)

array([0. , 0.2, 0.4, 0.6, 0.8])

### linspace

In [31]:
np.linspace(0, 20, 10) # 10 evenly spaced points over [0, 20], both endpoints included

array([ 0.        ,  2.22222222,  4.44444444,  6.66666667,  8.88888889,
       11.11111111, 13.33333333, 15.55555556, 17.77777778, 20.        ])

In [32]:
np.linspace(0, 20, 11)

array([ 0.,  2.,  4.,  6.,  8., 10., 12., 14., 16., 18., 20.])
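The difference between `arange` and `linspace` comes down to what the third argument means (a step versus a number of points) and whether the end value is included. The sketch below lines the two up; `endpoint=False` is an existing `linspace` parameter, and the variable names are illustrative.

```python
import numpy as np

a = np.arange(0, 20, 2)                      # step of 2, stop value 20 excluded
b = np.linspace(0, 20, 10, endpoint=False)   # 10 points, 20 excluded
c = np.linspace(0, 20, 11)                   # 11 points, 20 included

print(np.allclose(a, b))   # True: same ten values 0, 2, ..., 18
print(len(c), c[-1])       # 11 20.0
```

When a specific number of points matters, `linspace` is usually the safer choice, because with a float step the length of an `arange` result depends on floating-point rounding.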

### random

In [33]:
np.random.randint(0, 10) # the first two arguments give the range [0, 10); the upper bound is excluded

3

In [34]:
np.random.randint(0, 10, 10) # an optional third argument gives the output shape: a vector or a higher-dimensional array

array([5, 9, 0, 4, 0, 8, 7, 5, 8, 2])

In [35]:
np.random.randint(0, 10, size=10) # for readability, pass the shape by its keyword name, size

array([9, 9, 9, 5, 4, 4, 5, 7, 6, 4])
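The `size` argument also accepts a tuple, which produces a matrix instead of a vector. The following sketch uses an arbitrary range of 4 to 8 to show that and to check the exclusive upper bound; no outputs are listed for the values themselves because they are random.

```python
import numpy as np

# A 3x5 matrix of random integers drawn from [4, 8).
matrix = np.random.randint(4, 8, size=(3, 5))

print(matrix.shape)              # (3, 5)
print(bool(matrix.min() >= 4))   # True
print(bool(matrix.max() < 8))    # True: the upper bound 8 is never drawn
```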

When debugging, if the same random numbers must be generated on every run, set a *random seed*.

In [36]:
np.random.seed(666)
np.random.randint(0, 10, size=10)

array([2, 6, 9, 4, 3, 1, 0, 8, 7, 5])

In [37]:
np.random.seed(666)
np.random.randint(0, 10, size=10)

array([2, 6, 9, 4, 3, 1, 0, 8, 7, 5])
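`np.random.seed` resets NumPy's global generator. As an alternative sketch, a separate `np.random.RandomState` object can carry its own seed, so that seeding one part of a program does not affect another. The names `rng_a` and `rng_b` are illustrative.

```python
import numpy as np

# Two independent generators created with the same seed.
rng_a = np.random.RandomState(666)
rng_b = np.random.RandomState(666)

sample_a = rng_a.randint(0, 10, size=10)
sample_b = rng_b.randint(0, 10, size=10)

# Same seed, same sequence of calls, same numbers.
print(np.array_equal(sample_a, sample_b))  # True
```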

Generate random floats

In [38]:
np.random.random() # returns a float in [0, 1)

0.19289200304058374

In [39]:
np.random.random(5) # the argument is size

array([0.70084475, 0.29322811, 0.77447945, 0.00510884, 0.11285765])

In [41]:
np.random.random((3,5))

array([[0.11095367, 0.24766823, 0.0232363 , 0.72732115, 0.34003494],
       [0.19750316, 0.90917959, 0.97834699, 0.53280254, 0.25913185],
       [0.58381262, 0.32569065, 0.88889931, 0.62640453, 0.81887369]])
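`np.random.random` always samples from [0, 1). To get floats from another interval, one option is to rescale by hand; another is `np.random.uniform`, which takes the bounds directly. A minimal sketch with an arbitrary interval of 5 to 10:

```python
import numpy as np

low, high = 5.0, 10.0

# Option 1: rescale samples from [0, 1) by hand.
scaled = low + (high - low) * np.random.random(5)

# Option 2: let np.random.uniform take the bounds directly.
direct = np.random.uniform(low, high, size=(3, 5))

print(scaled.shape)   # (5,)
print(direct.shape)   # (3, 5)
```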

Random numbers can also be drawn from a normal distribution with a specified mean and standard deviation

In [43]:
np.random.normal() # defaults to the standard normal distribution (mean 0, standard deviation 1)

-2.2349687669162455

In [44]:
np.random.normal(10, 100) # mean 10, standard deviation 100 (the second argument is the standard deviation, not the variance)

137.04776597908875

In [45]:
np.random.normal(10, 100, size=(3,5))

array([[ -62.92882099,  145.9252547 , -110.99026555,    5.38172775,
         -34.11824366],
       [  56.95343143,   54.32581718, -156.73887544, -171.73174936,
        -129.75391595],
       [  88.39269053,  -19.12996494,   77.04904296,   80.69310022,
         152.96524101]])
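One way to confirm that the second argument is the standard deviation rather than the variance is to draw a large sample and measure it. The sketch below does that; the seed 0 and the sample size are arbitrary choices, and exact values are not listed because they depend on the generator.

```python
import numpy as np

rng = np.random.RandomState(0)               # seeded so the check is repeatable
samples = rng.normal(10, 100, size=100000)   # loc=10, scale=100

print(samples.mean())   # close to 10
print(samples.std())    # close to 100, not 10 (which a variance of 100 would give)
```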

In [46]:
np.random.normal? # IPython shortcut for viewing the documentation

In [47]:
help(np.random.normal)

Help on built-in function normal:

normal(...) method of mtrand.RandomState instance
    normal(loc=0.0, scale=1.0, size=None)
    
    Draw random samples from a normal (Gaussian) distribution.
    
    The probability density function of the normal distribution, first
    derived by De Moivre and 200 years later by both Gauss and Laplace
    independently [2]_, is often called the bell curve because of
    its characteristic shape (see the example below).
    
    The normal distributions occurs often in nature.  For example, it
    describes the commonly occurring distribution of samples influenced
    by a large number of tiny, random disturbances, each with its own
    unique distribution [2]_.
    
    Parameters
    ----------
    loc : float or array_like of floats
        Mean ("centre") of the distribution.
    scale : float or array_like of floats
        Standard deviation (spread or "width") of the distribution.
    size : int or tuple of ints, optional
        Output shap