# The numpy library

## 1. Installing the library

* Install directly with pip
* Install through Anaconda

## 2. Importing the library

* Method 1: import modname

       import modname as shortname

In [9]:
import numpy

In [10]:
numpy.sqrt(4)

2.0

In [15]:
import numpy as np

In [16]:
np.arange(9)

array([0, 1, 2, 3, 4, 5, 6, 7, 8])

* Method 2: from modname import funcname

       from modname import fa, fb, fc

In [11]:
from numpy import sqrt

In [12]:
sqrt(4)

2.0

* Method 3: from modname import *

In [13]:
from numpy import *

In [14]:
sin(pi/6)

0.49999999999999994
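The three import styles bind the same function objects under different names. The sketch below checks this for `sqrt`. It deliberately leaves out the star import, because `from numpy import *` also brings in names such as `sum` that replace Python built-ins in the current namespace.

```python
import numpy
import numpy as np
from numpy import sqrt

# All three names refer to the same underlying function object.
same_object = (numpy.sqrt is np.sqrt) and (np.sqrt is sqrt)
print(same_object)

# The qualified form makes it obvious where a name comes from.
print(np.sqrt(4))
```

In scripts, the `import numpy as np` form is the usual choice because every numpy name stays visibly qualified.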

## 3. Demonstrating numpy operations in a financial setting

|Stock|Sep 3, 2018|Sep 4, 2018|Sep 5, 2018|Sep 6, 2018|Sep 7, 2018|
|-------|-----------|-----------|-----------|-----------|-----------|
|PetroChina (中国石油)|0.3731%|2.1066%|-0.4854%|0.6098%|-0.6060%|
|ICBC (工商银行)|-0.1838%|0.1842%|-1.6544%|-0.3738%|0.3752%|
|SAIC Motor (上汽集团)|-0.3087%|-0.0344%|-3.3391%|0.7123%|0.4597%|
|Baosteel (宝钢股份)|-2.4112%|1.1704%|-2.9563%|-1.4570%|1.6129%|

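numpy is the foundational package (module) for scientific computing in Python, and it allows arbitrary data types to be defined. It provides: (1) a powerful N-dimensional array object; (2) sophisticated broadcasting functions; (3) tools for integrating C/C++ and Fortran code; (4) useful linear algebra, Fourier transform, and random number capabilities.

Beyond its obvious scientific uses, numpy can also serve as an efficient multi-dimensional container for generic data, which lets it integrate seamlessly and quickly with a wide variety of databases.

numpy is an external Python module, so it must be imported before use, and it is worth checking which version is installed.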

### 3.1 Importing numpy

In [19]:
import numpy as np # import the numpy module

In [21]:
np.__version__   # check the numpy version number

'1.16.4'

### 3.2 Entering the stock data

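The defining feature of numpy is its array data structure. An array resembles the lists covered earlier, but an array has an explicit number of dimensions, which is why its full name is N-dimensional array. Arrays are well suited to numerical and algebraic computation. The basic forms are:
* One-dimensional array: np.array(a sequence)
* Two-dimensional array: np.array([sequence 1, sequence 2, ..., sequence m])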

#### Enter the portfolio weights of the 4 stocks directly in Python as a one-dimensional array

In [22]:
weight=np.array([0.15,0.2,0.25,0.4])

In [23]:
type(weight)

numpy.ndarray

In [25]:
weight.shape 

(4,)

The shape attribute shows that this variable is a one-dimensional array, i.e. a vector of 4 elements.

#### Enter the daily returns of the 4 stocks in Python as an array

In [32]:
stock_return=np.array([[0.003731,0.021066,-0.004854,0.006098,-0.00606],
                       [-0.001838,0.001842,-0.016544,-0.003738,0.003752],
                       [-0.003087,-0.000344,-0.033391,0.007123,0.004597],
                       [-0.024112,0.011704,-0.029563,-0.01457,0.016129]])

In [34]:
stock_return

array([[ 0.003731,  0.021066, -0.004854,  0.006098, -0.00606 ],
       [-0.001838,  0.001842, -0.016544, -0.003738,  0.003752],
       [-0.003087, -0.000344, -0.033391,  0.007123,  0.004597],
       [-0.024112,  0.011704, -0.029563, -0.01457 ,  0.016129]])

In [35]:
stock_return.shape

(4, 5)

##### Converting a list to an array with np.array

Exercise: enter weight and stock_return as lists first, then convert them to arrays.
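One possible answer to the exercise, as a minimal sketch. The names `weight_list` and `return_list` are hypothetical and chosen only for this illustration; the numbers are the ones entered above.

```python
import numpy as np

# Plain Python lists holding the same data as in the text
# (weight_list and return_list are illustrative names).
weight_list = [0.15, 0.2, 0.25, 0.4]
return_list = [
    [0.003731, 0.021066, -0.004854, 0.006098, -0.00606],
    [-0.001838, 0.001842, -0.016544, -0.003738, 0.003752],
    [-0.003087, -0.000344, -0.033391, 0.007123, 0.004597],
    [-0.024112, 0.011704, -0.029563, -0.01457, 0.016129],
]

# np.array copies the list contents into a new ndarray.
weight = np.array(weight_list)
stock_return = np.array(return_list)

print(type(weight_list), type(weight))
print(weight.shape, stock_return.shape)
```

A flat list becomes a one-dimensional array, and a list of equal-length lists becomes a two-dimensional array.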

##### The ndim and size attributes
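Both `ndim` and `size` are attributes of an array rather than functions: `ndim` gives the number of dimensions and `size` gives the total number of elements. A short sketch using the two arrays from this section:

```python
import numpy as np

weight = np.array([0.15, 0.2, 0.25, 0.4])
stock_return = np.array([
    [0.003731, 0.021066, -0.004854, 0.006098, -0.00606],
    [-0.001838, 0.001842, -0.016544, -0.003738, 0.003752],
    [-0.003087, -0.000344, -0.033391, 0.007123, 0.004597],
    [-0.024112, 0.011704, -0.029563, -0.01457, 0.016129],
])

# ndim: number of axes; size: product of the entries of shape.
print(weight.ndim, weight.size)              # 1 4
print(stock_return.ndim, stock_return.size)  # 2 20
```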

### 3.3 Convenient ways to generate arrays

#### 1. Generating integer sequences

In [37]:
a=np.arange(10)

In [38]:
a

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [39]:
b=np.arange(1,15,3)

In [40]:
b

array([ 1,  4,  7, 10, 13])

#### 2. Generating evenly spaced sequences (arithmetic progressions)

In [41]:
np.linspace(0,100,51)

array([  0.,   2.,   4.,   6.,   8.,  10.,  12.,  14.,  16.,  18.,  20.,
        22.,  24.,  26.,  28.,  30.,  32.,  34.,  36.,  38.,  40.,  42.,
        44.,  46.,  48.,  50.,  52.,  54.,  56.,  58.,  60.,  62.,  64.,
        66.,  68.,  70.,  72.,  74.,  76.,  78.,  80.,  82.,  84.,  86.,
        88.,  90.,  92.,  94.,  96.,  98., 100.])
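Unlike `np.arange`, which takes a step, `np.linspace` takes the number of points and includes the end point by default, so the spacing here is (100 - 0) / (51 - 1) = 2. A small sketch comparing the two (the variable names are illustrative):

```python
import numpy as np

grid = np.linspace(0, 100, 51)       # 51 points, end point included
step = (100 - 0) / (51 - 1)          # spacing implied by linspace
same_values = np.arange(0, 101, 2)   # stop value 101 is excluded

print(step)
print(np.allclose(grid, same_values))
# linspace returns floats, while arange with integer arguments returns integers.
print(grid.dtype, same_values.dtype)
```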

#### 3. Creating arrays of zeros

In [44]:
zero_array1=np.zeros(5) # 5 is the shape argument: a one-dimensional array of length 5

In [45]:
zero_array1

array([0., 0., 0., 0., 0.])

In [47]:
zero_array2=np.zeros([5,6]) # [5,6] is the shape argument: 5 rows by 6 columns

In [48]:
zero_array2

array([[0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.]])

In [50]:
weight

array([0.15, 0.2 , 0.25, 0.4 ])

In [51]:
zero_weight=np.zeros_like(weight)

In [52]:
zero_weight

array([0., 0., 0., 0.])

#### 4. Creating arrays of ones

In [53]:
stock_return

array([[ 0.003731,  0.021066, -0.004854,  0.006098, -0.00606 ],
       [-0.001838,  0.001842, -0.016544, -0.003738,  0.003752],
       [-0.003087, -0.000344, -0.033391,  0.007123,  0.004597],
       [-0.024112,  0.011704, -0.029563, -0.01457 ,  0.016129]])

In [54]:
one_stock=np.ones_like(stock_return)

In [55]:
one_stock

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])

In [57]:
np.ones([5,6])

array([[1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1.]])

#### 5. Creating an identity matrix

In [59]:
np.eye(5)

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])

### 3.4 Indexing, slicing, and sorting arrays

#### 1. Indexing

* An investor wants to find the daily return of ICBC (工商银行) on September 5, 2018

In [61]:
stock_return

array([[ 0.003731,  0.021066, -0.004854,  0.006098, -0.00606 ],
       [-0.001838,  0.001842, -0.016544, -0.003738,  0.003752],
       [-0.003087, -0.000344, -0.033391,  0.007123,  0.004597],
       [-0.024112,  0.011704, -0.029563, -0.01457 ,  0.016129]])

In [62]:
stock_return[1,2]

-0.016544

* An investor wants to find the array indices of the entries whose daily return is below -1%

In [63]:
np.where(stock_return<-0.01)

(array([1, 2, 3, 3, 3]), array([2, 2, 0, 2, 3]))

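The first array holds the row indices and the second holds the column indices; read position by position, each pair locates one matching element.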
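The two index arrays returned by `np.where` can be paired up or fed straight back into the array to pull out the matching values. A minimal sketch (the names `rows`, `cols`, `positions`, and `values` are illustrative):

```python
import numpy as np

stock_return = np.array([
    [0.003731, 0.021066, -0.004854, 0.006098, -0.00606],
    [-0.001838, 0.001842, -0.016544, -0.003738, 0.003752],
    [-0.003087, -0.000344, -0.033391, 0.007123, 0.004597],
    [-0.024112, 0.011704, -0.029563, -0.01457, 0.016129],
])

rows, cols = np.where(stock_return < -0.01)

# Pair the indices into (row, column) tuples.
positions = list(zip(rows.tolist(), cols.tolist()))
print(positions)  # [(1, 2), (2, 2), (3, 0), (3, 2), (3, 3)]

# Index arrays select the matching elements directly.
values = stock_return[rows, cols]
print(values)

# A boolean mask selects the same elements without the intermediate indices.
print(np.array_equal(values, stock_return[stock_return < -0.01]))
```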

#### 2. Slicing

* An investor wants to extract the daily returns of SAIC Motor (上汽集团) and Baosteel (宝钢股份) from September 4 to September 6, 2018

In [64]:
stock_return[2:,1:4]

array([[-0.000344, -0.033391,  0.007123],
       [ 0.011704, -0.029563, -0.01457 ]])

* An investor wants to extract all of row 2 and all of column 3 separately

In [65]:
stock_return[1] # extract all of row 2

array([-0.001838,  0.001842, -0.016544, -0.003738,  0.003752])

In [67]:
stock_return[:,2] # extract all of column 3

array([-0.004854, -0.016544, -0.033391, -0.029563])

#### 3. Sorting

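Investors usually care a great deal about the size of a stock's daily returns. The most convenient way to compare them is to sort the data in ascending order.

* An investor wants to sort the stocks by daily return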

In [69]:
np.sort(stock_return,axis=0) # sort the elements within each column in ascending order

array([[-0.024112, -0.000344, -0.033391, -0.01457 , -0.00606 ],
       [-0.003087,  0.001842, -0.029563, -0.003738,  0.003752],
       [-0.001838,  0.011704, -0.016544,  0.006098,  0.004597],
       [ 0.003731,  0.021066, -0.004854,  0.007123,  0.016129]])

In [70]:
np.sort(stock_return,axis=1) # sort the elements within each row in ascending order

array([[-0.00606 , -0.004854,  0.003731,  0.006098,  0.021066],
       [-0.016544, -0.003738, -0.001838,  0.001842,  0.003752],
       [-0.033391, -0.003087, -0.000344,  0.004597,  0.007123],
       [-0.029563, -0.024112, -0.01457 ,  0.011704,  0.016129]])

In [71]:
np.sort(stock_return) # axis defaults to the last axis, so each row is sorted in ascending order

array([[-0.00606 , -0.004854,  0.003731,  0.006098,  0.021066],
       [-0.016544, -0.003738, -0.001838,  0.001842,  0.003752],
       [-0.033391, -0.003087, -0.000344,  0.004597,  0.007123],
       [-0.029563, -0.024112, -0.01457 ,  0.011704,  0.016129]])

In [72]:
stock_return

array([[ 0.003731,  0.021066, -0.004854,  0.006098, -0.00606 ],
       [-0.001838,  0.001842, -0.016544, -0.003738,  0.003752],
       [-0.003087, -0.000344, -0.033391,  0.007123,  0.004597],
       [-0.024112,  0.011704, -0.029563, -0.01457 ,  0.016129]])
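As the last cell shows, `np.sort` leaves `stock_return` unchanged because it returns a sorted copy. The sketch below contrasts that with the in-place `ndarray.sort` method and with `np.argsort`, which returns the positions of the sorted elements instead of the values. The helper names are illustrative.

```python
import numpy as np

stock_return = np.array([
    [0.003731, 0.021066, -0.004854, 0.006098, -0.00606],
    [-0.001838, 0.001842, -0.016544, -0.003738, 0.003752],
    [-0.003087, -0.000344, -0.033391, 0.007123, 0.004597],
    [-0.024112, 0.011704, -0.029563, -0.01457, 0.016129],
])

before = stock_return.copy()
sorted_rows = np.sort(stock_return, axis=1)   # returns a new array
print(np.array_equal(stock_return, before))   # True: the input is untouched

in_place = stock_return.copy()
in_place.sort(axis=1)                         # sorts in place and returns None
print(np.array_equal(in_place, sorted_rows))  # True

# argsort gives, per row, the column indices from smallest to largest return.
order = np.argsort(stock_return, axis=1)
print(order[0])  # [4 2 0 3 1]
```

For the first stock, index 4 (September 7) comes first because that day had its lowest return.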

### 3.5 Array computations

#### 1. Operations within an array

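* Suppose an investor needs the average, cumulative, maximum, and minimum daily returns of the stocks over the 5 trading days from September 3 to September 7, 2018

* Sum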

In [74]:
stock_return.sum(axis=0) # sum down each column (one total per trading day)

array([-0.025306,  0.034268, -0.084352, -0.005087,  0.018418])

In [75]:
stock_return.sum(axis=1) # sum along each row, giving each stock's cumulative return over the 5 days

array([ 0.019981, -0.016526, -0.025102, -0.040412])

In [76]:
stock_return.sum()  # sum of all elements

-0.062059
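Summing daily returns gives a simple additive cumulative figure. As a hedged aside that is not part of the original walkthrough: if the daily figures are treated as simple returns, the compounded return over the period is the product of (1 + r) minus 1. For moves this small the two are close, as the sketch below suggests.

```python
import numpy as np

stock_return = np.array([
    [0.003731, 0.021066, -0.004854, 0.006098, -0.00606],
    [-0.001838, 0.001842, -0.016544, -0.003738, 0.003752],
    [-0.003087, -0.000344, -0.033391, 0.007123, 0.004597],
    [-0.024112, 0.011704, -0.029563, -0.01457, 0.016129],
])

# Additive cumulative return per stock, as computed in the text.
simple_sum = stock_return.sum(axis=1)

# Compounded return per stock (assumes the inputs are simple daily returns).
compounded = (1 + stock_return).prod(axis=1) - 1

print(simple_sum)
print(compounded)
print(np.abs(simple_sum - compounded).max())  # largest gap between the two
```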

* Product

In [77]:
stock_return.prod(axis=0) # product down each column

array([-5.10435205e-10, -1.56230010e-10,  7.92717092e-08,  2.36564304e-09,
       -1.68584406e-09])

In [78]:
stock_return.prod(axis=1) # product along each row

array([ 1.40983129e-11, -7.85557141e-13, -1.16107947e-12, -1.96057312e-09])

In [79]:
stock_return.prod() # product of all elements

-2.521099098076826e-44

* Maximum and minimum

In [80]:
stock_return.max(axis=0) # maximum of each column

array([ 0.003731,  0.021066, -0.004854,  0.007123,  0.016129])

In [81]:
stock_return.max(axis=1) # maximum of each row

array([0.021066, 0.003752, 0.007123, 0.016129])

In [82]:
stock_return.max() # maximum of all elements

0.021066

In [83]:
stock_return.min(axis=0) # minimum of each column

array([-0.024112, -0.000344, -0.033391, -0.01457 , -0.00606 ])

In [84]:
stock_return.min(axis=1) # minimum of each row

array([-0.00606 , -0.016544, -0.033391, -0.029563])

In [85]:
stock_return.min() # minimum of all elements

-0.033391

* Mean

In [86]:
stock_return.mean(axis=0) # mean of each column

array([-0.0063265 ,  0.008567  , -0.021088  , -0.00127175,  0.0046045 ])

In [87]:
stock_return.mean(axis=1) # mean of each row

array([ 0.0039962, -0.0033052, -0.0050204, -0.0080824])

In [88]:
stock_return.mean() # mean of all elements

-0.00310295

* Variance and standard deviation

In [89]:
stock_return.var(axis=0) # variance of each column

array([1.12029577e-04, 7.26743290e-05, 1.26845031e-04, 7.69277212e-05,
       6.18181183e-05])

In [90]:
stock_return.var(axis=1) # variance of each row

array([9.50638330e-05, 5.07807114e-05, 2.14090849e-04, 3.47629344e-04])

In [91]:
stock_return.var() # variance of all elements

0.0001966187774475

In [92]:
stock_return.std(axis=0) # standard deviation of each column

array([0.0105844 , 0.00852492, 0.01126255, 0.00877084, 0.00786245])

In [93]:
stock_return.std(axis=1) # standard deviation of each row

array([0.00975007, 0.00712606, 0.01463184, 0.01864482])

In [94]:
stock_return.std() # standard deviation of all elements

0.014022081780088862
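By default `var` and `std` divide by the number of observations N (`ddof=0`, the population statistic). With only 5 observations per stock, the sample statistic that divides by N - 1 is noticeably larger. A minimal sketch of the relationship, using the `ddof` parameter:

```python
import numpy as np

stock_return = np.array([
    [0.003731, 0.021066, -0.004854, 0.006098, -0.00606],
    [-0.001838, 0.001842, -0.016544, -0.003738, 0.003752],
    [-0.003087, -0.000344, -0.033391, 0.007123, 0.004597],
    [-0.024112, 0.011704, -0.029563, -0.01457, 0.016129],
])

n = stock_return.shape[1]                          # 5 trading days per stock
pop_var = stock_return.var(axis=1)                 # divides by N
sample_var = stock_return.var(axis=1, ddof=1)      # divides by N - 1

# The two differ by the factor N / (N - 1).
print(np.allclose(sample_var, pop_var * n / (n - 1)))  # True

# The standard deviation is the square root of the variance.
print(np.allclose(stock_return.std(axis=1), np.sqrt(pop_var)))  # True
```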

* Powers and roots

In [95]:
np.sqrt(stock_return) # square root of each element

array([[0.06108191, 0.14514131,        nan, 0.07808969,        nan],
       [       nan, 0.04291853,        nan,        nan, 0.06125357],
       [       nan,        nan,        nan, 0.08439787, 0.06780118],
       [       nan, 0.10818503,        nan,        nan, 0.127     ]])

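The real square root is defined only for non-negative numbers, so numpy returns nan (not a number) for the square root of a negative value, meaning there is no real result.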
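The nan entries line up exactly with the negative inputs. The sketch below silences the runtime warning locally with `np.errstate` and uses `np.isnan` to locate and count the undefined results (the variable names are illustrative):

```python
import numpy as np

stock_return = np.array([
    [0.003731, 0.021066, -0.004854, 0.006098, -0.00606],
    [-0.001838, 0.001842, -0.016544, -0.003738, 0.003752],
    [-0.003087, -0.000344, -0.033391, 0.007123, 0.004597],
    [-0.024112, 0.011704, -0.029563, -0.01457, 0.016129],
])

# Suppress the "invalid value" warning only inside this block.
with np.errstate(invalid="ignore"):
    roots = np.sqrt(stock_return)

nan_mask = np.isnan(roots)
print(nan_mask.sum())                              # 11 negative inputs
print(np.array_equal(nan_mask, stock_return < 0))  # True
```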

In [96]:
np.square(stock_return) # square of each element

array([[1.39203610e-05, 4.43776356e-04, 2.35613160e-05, 3.71856040e-05,
        3.67236000e-05],
       [3.37824400e-06, 3.39296400e-06, 2.73703936e-04, 1.39726440e-05,
        1.40775040e-05],
       [9.52956900e-06, 1.18336000e-07, 1.11495888e-03, 5.07371290e-05,
        2.11324090e-05],
       [5.81388544e-04, 1.36983616e-04, 8.73970969e-04, 2.12284900e-04,
        2.60144641e-04]])

In [97]:
np.exp(stock_return) # e raised to the power of each element

array([[1.00373797, 1.02128945, 0.99515776, 1.00611663, 0.99395832],
       [0.99816369, 1.0018437 , 0.9835921 , 0.99626898, 1.00375905],
       [0.99691776, 0.99965606, 0.96716033, 1.00714843, 1.00460758],
       [0.97617637, 1.01177276, 0.97086971, 0.98553563, 1.01625977]])

* Logarithms

In [98]:
np.log(stock_return) # natural logarithm of each element

array([[-5.59107898, -3.86009491,         nan, -5.09979443,         nan],
       [        nan, -6.29690334,         nan,         nan, -5.58546625],
       [        nan,         nan,         nan, -4.94442629, -5.38235136],
       [        nan, -4.44782462,         nan,         nan, -4.12713639]])

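The real logarithm is defined only for positive numbers, so numpy returns nan (not a number) for the logarithm of a negative value, meaning there is no real result.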

In [99]:
np.log10(stock_return) # base-10 logarithm of each element

array([[-2.42817475, -1.67641792,         nan, -2.21481258,         nan],
       [        nan, -2.73471037,         nan,         nan, -2.42573717],
       [        nan,         nan,         nan, -2.14733706, -2.3375255 ],
       [        nan, -1.93166569,         nan,         nan, -1.79239256]])

In [100]:
np.log2(stock_return) # base-2 logarithm of each element

array([[-8.06622192, -5.56893979,         nan, -7.35744813,         nan],
       [        nan, -9.08451122,         nan,         nan, -8.05812446],
       [        nan,         nan,         nan, -7.13329929, -7.76509162],
       [        nan, -6.41685452,         nan,         nan, -5.9541992 ]])
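Taking the logarithm of raw returns produces many nan values because returns are often negative. A common alternative in finance, offered here as an illustrative assumption rather than something from the original, is the log return ln(1 + r). `np.log1p` computes it, and `np.expm1` inverts it.

```python
import numpy as np

stock_return = np.array([
    [0.003731, 0.021066, -0.004854, 0.006098, -0.00606],
    [-0.001838, 0.001842, -0.016544, -0.003738, 0.003752],
    [-0.003087, -0.000344, -0.033391, 0.007123, 0.004597],
    [-0.024112, 0.011704, -0.029563, -0.01457, 0.016129],
])

# ln(1 + r) is defined for every r > -1, so no nan appears here.
log_return = np.log1p(stock_return)
print(np.isnan(log_return).any())                        # False

# expm1 computes exp(x) - 1 and recovers the original returns.
print(np.allclose(np.expm1(log_return), stock_return))   # True
```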

#### 2. Operations between arrays

In [101]:
stock_return

array([[ 0.003731,  0.021066, -0.004854,  0.006098, -0.00606 ],
       [-0.001838,  0.001842, -0.016544, -0.003738,  0.003752],
       [-0.003087, -0.000344, -0.033391,  0.007123,  0.004597],
       [-0.024112,  0.011704, -0.029563, -0.01457 ,  0.016129]])

In [102]:
one_stock

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])

In [108]:
new_array1=stock_return+one_stock # element-wise sum of the two arrays

In [109]:
new_array1

array([[1.003731, 1.021066, 0.995146, 1.006098, 0.99394 ],
       [0.998162, 1.001842, 0.983456, 0.996262, 1.003752],
       [0.996913, 0.999656, 0.966609, 1.007123, 1.004597],
       [0.975888, 1.011704, 0.970437, 0.98543 , 1.016129]])

In [117]:
new_array2=stock_return-one_stock # element-wise difference of the two arrays

In [118]:
new_array2

array([[-0.996269, -0.978934, -1.004854, -0.993902, -1.00606 ],
       [-1.001838, -0.998158, -1.016544, -1.003738, -0.996248],
       [-1.003087, -1.000344, -1.033391, -0.992877, -0.995403],
       [-1.024112, -0.988296, -1.029563, -1.01457 , -0.983871]])

In [119]:
new_array3=new_array1*new_array2 # element-wise product of the two arrays

In [120]:
new_array3

array([[-0.99998608, -0.99955622, -0.99997644, -0.99996281, -0.99996328],
       [-0.99999662, -0.99999661, -0.9997263 , -0.99998603, -0.99998592],
       [-0.99999047, -0.99999988, -0.99888504, -0.99994926, -0.99997887],
       [-0.99941861, -0.99986302, -0.99912603, -0.99978772, -0.99973986]])

In [121]:
new_array4=new_array1/new_array2 # element-wise quotient of the two arrays

In [122]:
new_array4

array([[-1.00748994, -1.04303865, -0.9903389 , -1.01227083, -0.987953  ],
       [-0.99633074, -1.0036908 , -0.9674505 , -0.99255184, -1.00753226],
       [-0.993845  , -0.99931224, -0.93537586, -1.0143482 , -1.00923646],
       [-0.9529114 , -1.02368521, -0.94257175, -0.97127847, -1.03278682]])

In [123]:
new_array5=new_array1**new_array2 # element-wise power between the two arrays

In [124]:
new_array5

array([[0.99629671, 0.97979882, 1.00490141, 0.99397581, 1.00613401],
       [1.00184477, 0.99816477, 1.01710298, 1.00376608, 0.99627602],
       [1.00310613, 1.00034424, 1.03571831, 0.99297758, 0.99544502],
       [1.02531098, 0.98856602, 1.03137818, 1.01500246, 0.98438102]])

In [125]:
stock_return+1 # add 1 to every element of the array

array([[1.003731, 1.021066, 0.995146, 1.006098, 0.99394 ],
       [0.998162, 1.001842, 0.983456, 0.996262, 1.003752],
       [0.996913, 0.999656, 0.966609, 1.007123, 1.004597],
       [0.975888, 1.011704, 0.970437, 0.98543 , 1.016129]])

In [126]:
stock_return-1 # subtract 1 from every element of the array

array([[-0.996269, -0.978934, -1.004854, -0.993902, -1.00606 ],
       [-1.001838, -0.998158, -1.016544, -1.003738, -0.996248],
       [-1.003087, -1.000344, -1.033391, -0.992877, -0.995403],
       [-1.024112, -0.988296, -1.029563, -1.01457 , -0.983871]])

In [127]:
stock_return*2  # multiply every element of the array by 2

array([[ 0.007462,  0.042132, -0.009708,  0.012196, -0.01212 ],
       [-0.003676,  0.003684, -0.033088, -0.007476,  0.007504],
       [-0.006174, -0.000688, -0.066782,  0.014246,  0.009194],
       [-0.048224,  0.023408, -0.059126, -0.02914 ,  0.032258]])

In [128]:
stock_return/2  # divide every element of the array by 2

array([[ 0.0018655,  0.010533 , -0.002427 ,  0.003049 , -0.00303  ],
       [-0.000919 ,  0.000921 , -0.008272 , -0.001869 ,  0.001876 ],
       [-0.0015435, -0.000172 , -0.0166955,  0.0035615,  0.0022985],
       [-0.012056 ,  0.005852 , -0.0147815, -0.007285 ,  0.0080645]])

In [129]:
stock_return**2  # square every element of the array

array([[1.39203610e-05, 4.43776356e-04, 2.35613160e-05, 3.71856040e-05,
        3.67236000e-05],
       [3.37824400e-06, 3.39296400e-06, 2.73703936e-04, 1.39726440e-05,
        1.40775040e-05],
       [9.52956900e-06, 1.18336000e-07, 1.11495888e-03, 5.07371290e-05,
        2.11324090e-05],
       [5.81388544e-04, 1.36983616e-04, 8.73970969e-04, 2.12284900e-04,
        2.60144641e-04]])
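Combining an array with a scalar is the simplest case of broadcasting: the scalar is stretched to the array's shape, which is why `stock_return+1` matches `stock_return+one_stock`. The same rule applies between arrays of compatible shapes. The sketch below subtracts each stock's own mean from its row; the names `row_mean` and `demeaned` are illustrative.

```python
import numpy as np

stock_return = np.array([
    [0.003731, 0.021066, -0.004854, 0.006098, -0.00606],
    [-0.001838, 0.001842, -0.016544, -0.003738, 0.003752],
    [-0.003087, -0.000344, -0.033391, 0.007123, 0.004597],
    [-0.024112, 0.011704, -0.029563, -0.01457, 0.016129],
])

# A scalar behaves like an array of the same shape filled with that value.
print(np.array_equal(stock_return + 1, stock_return + np.ones_like(stock_return)))

# keepdims=True keeps the result as a (4, 1) column so it broadcasts across the 5 days.
row_mean = stock_return.mean(axis=1, keepdims=True)
demeaned = stock_return - row_mean
print(row_mean.shape)                            # (4, 1)
print(np.allclose(demeaned.mean(axis=1), 0.0))   # True
```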

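In financial analysis, it is often necessary to compare the corresponding elements of two or more arrays of the same shape and to build a new array that holds the larger or the smaller element at each position.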

In [133]:
np.maximum(stock_return,one_stock)  # new array holding the larger of each pair of corresponding elements

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])

In [134]:
np.minimum(stock_return,one_stock) # new array holding the smaller of each pair of corresponding elements

array([[ 0.003731,  0.021066, -0.004854,  0.006098, -0.00606 ],
       [-0.001838,  0.001842, -0.016544, -0.003738,  0.003752],
       [-0.003087, -0.000344, -0.033391,  0.007123,  0.004597],
       [-0.024112,  0.011704, -0.029563, -0.01457 ,  0.016129]])
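Because every return here is below 1, comparing against an array of ones simply reproduces one of the inputs. A threshold of 0 is more informative. As an illustrative sketch, `np.maximum` and `np.minimum` with a scalar split the returns into their gain and loss parts:

```python
import numpy as np

stock_return = np.array([
    [0.003731, 0.021066, -0.004854, 0.006098, -0.00606],
    [-0.001838, 0.001842, -0.016544, -0.003738, 0.003752],
    [-0.003087, -0.000344, -0.033391, 0.007123, 0.004597],
    [-0.024112, 0.011704, -0.029563, -0.01457, 0.016129],
])

gains = np.maximum(stock_return, 0)    # negative returns are replaced by 0
losses = np.minimum(stock_return, 0)   # positive returns are replaced by 0

print(gains)
print(losses)
# The two parts add back up to the original array.
print(np.allclose(gains + losses, stock_return))  # True
```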

#### 3. Matrix operations

* Matrix properties

In [137]:
corrcoef_return=np.corrcoef(stock_return) # correlation coefficient matrix of the 4 stocks' daily returns

In [136]:
corrcoef_return

array([[1.        , 0.38215651, 0.36338676, 0.30254781],
       [0.38215651, 1.        , 0.89216018, 0.80740528],
       [0.36338676, 0.89216018, 1.        , 0.60483848],
       [0.30254781, 0.80740528, 0.60483848, 1.        ]])

In [139]:
np.diag(corrcoef_return) # extract the diagonal of the matrix

array([1., 1., 1., 1.])

In [140]:
np.triu(corrcoef_return) # extract the upper triangle of the matrix

array([[1.        , 0.38215651, 0.36338676, 0.30254781],
       [0.        , 1.        , 0.89216018, 0.80740528],
       [0.        , 0.        , 1.        , 0.60483848],
       [0.        , 0.        , 0.        , 1.        ]])

In [141]:
np.tril(corrcoef_return) # extract the lower triangle of the matrix

array([[1.        , 0.        , 0.        , 0.        ],
       [0.38215651, 1.        , 0.        , 0.        ],
       [0.36338676, 0.89216018, 1.        , 0.        ],
       [0.30254781, 0.80740528, 0.60483848, 1.        ]])

In [142]:
np.transpose(corrcoef_return) # transpose the matrix

array([[1.        , 0.38215651, 0.36338676, 0.30254781],
       [0.38215651, 1.        , 0.89216018, 0.80740528],
       [0.36338676, 0.89216018, 1.        , 0.60483848],
       [0.30254781, 0.80740528, 0.60483848, 1.        ]])

In [144]:
corrcoef_return.T # transpose the matrix (attribute form)

array([[1.        , 0.38215651, 0.36338676, 0.30254781],
       [0.38215651, 1.        , 0.89216018, 0.80740528],
       [0.36338676, 0.89216018, 1.        , 0.60483848],
       [0.30254781, 0.80740528, 0.60483848, 1.        ]])

In [145]:
np.trace(corrcoef_return) # trace of the matrix (sum of the diagonal)

4.0
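The outputs above reflect two properties of a correlation matrix: it is symmetric, so the transpose equals the original, and its diagonal is all ones, so the trace equals the number of stocks. A short sketch that checks both and rebuilds the matrix from its triangles (the names `corr` and `rebuilt` are illustrative):

```python
import numpy as np

stock_return = np.array([
    [0.003731, 0.021066, -0.004854, 0.006098, -0.00606],
    [-0.001838, 0.001842, -0.016544, -0.003738, 0.003752],
    [-0.003087, -0.000344, -0.033391, 0.007123, 0.004597],
    [-0.024112, 0.011704, -0.029563, -0.01457, 0.016129],
])

corr = np.corrcoef(stock_return)   # each row of the input is one variable

print(np.allclose(corr, corr.T))                    # symmetric
print(np.isclose(np.trace(corr), corr.shape[0]))    # trace equals 4

# Upper plus lower triangle counts the diagonal twice, so subtract it once.
rebuilt = np.triu(corr) + np.tril(corr) - np.diag(np.diag(corr))
print(np.allclose(rebuilt, corr))
```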

* Matrix computations

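Using each stock's allocation (weight) in the portfolio to compute the portfolio's weighted average return on each trading day is equivalent to taking the inner (dot) product of the weight vector and the return matrix.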

In [147]:
average_return=np.dot(weight,stock_return)

In [148]:
average_return

array([-0.0102245 ,  0.0081239 , -0.02420985, -0.00388015,  0.00744225])
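The dot product of the length-4 weight vector with the 4 by 5 return matrix yields one weighted return per trading day. The sketch below reproduces the result in two other equivalent ways, the `@` operator and an explicit broadcast-and-sum (the names `via_matmul` and `via_broadcast` are illustrative):

```python
import numpy as np

weight = np.array([0.15, 0.2, 0.25, 0.4])
stock_return = np.array([
    [0.003731, 0.021066, -0.004854, 0.006098, -0.00606],
    [-0.001838, 0.001842, -0.016544, -0.003738, 0.003752],
    [-0.003087, -0.000344, -0.033391, 0.007123, 0.004597],
    [-0.024112, 0.011704, -0.029563, -0.01457, 0.016129],
])

average_return = np.dot(weight, stock_return)   # shape (5,)

# The @ operator performs the same matrix product.
via_matmul = weight @ stock_return

# Scale each stock's row by its weight, then add the rows up per day.
via_broadcast = (weight[:, None] * stock_return).sum(axis=0)

print(average_return)
print(np.allclose(average_return, via_matmul))
print(np.allclose(average_return, via_broadcast))
```

The weights sum to 1, so each entry is a weighted average of that day's four returns.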