# NumPy Quickstart Tutorial

## Basic Concepts

In [None]:
'''
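NumPy basic concepts
NumPy's main object is the multidimensional array, ndarray, whose elements all have the same type.
The dimensions of an array are called axes, and the number of axes is called the rank.
The main attributes of an ndarray:
ndarray.ndim      number of axes (dimensions) of the array
ndarray.shape     tuple giving the size of the array along each axis
ndarray.size      total number of elements in the array
ndarray.dtype     object describing the type of the elements
ndarray.itemsize  size in bytes of each element
ndarray.data      buffer containing the actual elements of the array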
'''

### An Example

In [None]:
import numpy as np

In [None]:
a = np.arange(15).reshape(3,5)
# number of dimensions of a, i.e. the number of axes (the rank)
print(a.ndim)
print(a.shape)
print(a.dtype)
print(a.dtype.name)
print(a.size)
print(a.itemsize)
# a is of type numpy.ndarray
type(a)

### Array Creation

In [None]:
'''
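There are several ways to create arrays in NumPy:
1. From a Python list or tuple; the dtype of the array is deduced from the type of the elements in the sequence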
'''

In [None]:
import numpy as np

a = np.array([1,2,3])  # note: pass a single list, not the bare numbers 1, 2, 3
b = np.array((1., 2., 3.))
print(a)
print(type(a))
print(a.dtype)
print(b, b.dtype)

In [None]:
# Note: pass a single list, not the bare numbers 1, 2, 3
a = np.array(1,2,3)  # this line raises an error

In [None]:
# array() turns sequences of sequences into multidimensional arrays automatically
a = np.array([[1,2,3],[4,5,6]])
b = np.array([[[1,2],[3,4]],[[5,6],[7,8]]])
# print(a, a.ndim, a.shape)
print(b, b.shape, sep='\n')
b

In [None]:
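# The data type can be specified explicitly at creation time
# (the np.complex alias was removed from recent NumPy releases; the builtin complex works everywhere)
a = np.array([[1,2],[3,4]], dtype=complex)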
print(a, a.dtype, sep='\n')

In [None]:
'''
Often, the elements of an array are originally unknown, 
but its size is known. Hence, NumPy offers several functions to create arrays with initial placeholder content. 
These minimize the necessity of growing arrays, an expensive operation.

The function zeros creates an array full of zeros, 
the function ones creates an array full of ones, 
and the function empty creates an array whose content is uninitialized: the values are arbitrary and depend on the state of the memory.
By default, the dtype of the created array is float64.
'''

In [None]:
np.zeros((3,4))

In [None]:
np.ones([2,3])

In [None]:
np.empty((2,3))
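
The default dtype mentioned above can be overridden with the `dtype` argument. A minimal sketch (the shapes and the `int16` type are chosen only for illustration):

```python
import numpy as np

# zeros and ones default to float64 unless a dtype is given
z = np.zeros((3, 4))
o = np.ones((2, 3, 4), dtype=np.int16)

print(z.dtype)   # float64
print(o.dtype)   # int16
print(o.shape)   # (2, 3, 4)
```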

In [None]:
'''
To create sequences of numbers, NumPy provides a function analogous to range that returns arrays instead of lists.
arange([start,] stop[, step,], dtype=None)
'''

In [None]:
np.arange(10)

In [None]:
np.arange(10,100,10,dtype=float)  # the np.float alias was removed from recent NumPy releases

In [None]:
np.arange(0,1,0.1)

In [None]:
'''
When arange is used with floating point arguments, 
it is generally not possible to predict the number of elements obtained, due to the finite floating point precision.
For this reason, it is usually better to use the function linspace 
that receives as an argument the number of elements that we want, instead of the step:

'''

In [None]:
np.linspace(0,2,9)
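
The difference between the two functions can be checked directly. The sketch below uses illustrative endpoints (1 and 1.3); it prints the length of the `arange` result rather than assuming it, since that length depends on floating point rounding:

```python
import numpy as np

# With a float step, the element count follows from rounded arithmetic,
# so it is safer to inspect the length than to assume it
r = np.arange(1, 1.3, 0.1)
print(len(r), r)

# linspace takes the element count directly and includes both endpoints
s = np.linspace(1, 1.3, 4)
print(len(s), s)
```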

In [None]:
x = np.linspace(0, 2 * np.pi, 100)
f = np.sin(x)
print(x,f,sep='\n')

In [None]:
# More functions related to array creation are listed below
'''
array, 
zeros, zeros_like, 
ones, ones_like, 
empty, empty_like, 
arange, 
linspace, 
numpy.random.rand, numpy.random.randn,
fromfunction, fromfile
'''

### Printing Arrays

In [None]:
'''
When you print an array, NumPy displays it in a similar way to nested lists, but with the following layout:

the last axis is printed from left to right,
the second-to-last is printed from top to bottom,
the rest are also printed from top to bottom, with each slice separated from the next by an empty line.
One-dimensional arrays are then printed as rows, two-dimensional arrays as matrices and three-dimensional arrays as lists of matrices.

'''

In [None]:
# a one-dimensional array is printed as a row
a = np.arange(6)
print(a)

In [None]:
# a two-dimensional array is printed as a matrix
a = np.arange(12).reshape(4,3)
print(a)

In [None]:
# a three-dimensional array is printed as a list of matrices
a = np.arange(24).reshape(2,3,4)
print(a)

In [None]:
'''
If an array is too large to be printed, 
NumPy automatically skips the central part of the array and only prints the corners:

'''

In [None]:
print(np.arange(10000))

In [None]:
print(np.arange(10000).reshape(100,100))

In [None]:
'''
To disable this behaviour and force NumPy to print the entire array, 
you can change the printing options using set_printoptions.
'''

In [None]:
# import sys
# np.set_printoptions(threshold=sys.maxsize)  # threshold must be a number, not the string 'nan'
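
If the change should only apply temporarily, one option is the `np.printoptions` context manager. This is a sketch that assumes a NumPy version providing it (1.15 or later) and the default print threshold; the array size of 2000 is arbitrary:

```python
import sys
import numpy as np

big = np.arange(2000)

# Default options: the middle of a large array is replaced by "..."
summarized = str(big)

# Inside the context manager the threshold is raised so nothing is skipped
with np.printoptions(threshold=sys.maxsize):
    full = str(big)

print("..." in summarized)  # True
print("..." in full)        # False
```

The previous options are restored when the `with` block exits, so later cells keep the abbreviated output.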

### Basic Operations

In [None]:
'''
Arithmetic operators on arrays apply elementwise. A new array is created and filled with the result.

'''

In [None]:
a = np.array([10, 20, 30, 40])
b = np.arange(4)
b

In [None]:
c = a - b
c

In [None]:
b ** 2

In [None]:
10 * np.sin(a)

In [None]:
a < 35

In [None]:
'''
Unlike in many matrix languages, the product operator * operates elementwise in NumPy arrays. 
The matrix product can be performed using the dot function or method:
'''

In [None]:
A = np.array([[1,1],[0,1]])
B = np.array([[2,0],[3,4]])
A * B

In [None]:
A.dot(B)

In [None]:
np.dot(A,B)
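
On Python 3.5 and later, the `@` operator is another way to write the matrix product. A short sketch comparing it with the elementwise product, reusing the same two matrices:

```python
import numpy as np

A = np.array([[1, 1], [0, 1]])
B = np.array([[2, 0], [3, 4]])

elementwise = A * B   # [[2, 0], [0, 4]]
matrix = A @ B        # [[5, 4], [3, 4]]

print(elementwise)
print(matrix)
print(np.array_equal(matrix, A.dot(B)))  # True
```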

In [None]:
'''
Some operations, such as += and *=, act in place to modify an existing array rather than create a new one.
'''

In [None]:
a = np.ones((2,3),dtype=int)
b = np.random.random((2,3))
a *= 3
a

In [None]:
b += a
b

In [None]:
a += b  # raises an error: the float result cannot be cast back into the integer array a
a
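
The in-place addition of a float array to an integer array fails because the result cannot be stored in the existing integer buffer. A sketch of the failure and two ways around it (fixed values replace the random ones here so the output is reproducible):

```python
import numpy as np

a = np.ones((2, 3), dtype=int) * 3
b = np.full((2, 3), 0.5)   # fixed floats instead of random values

failed = False
try:
    a += b                 # float result cannot be stored in an int array
except TypeError as err:
    failed = True
    print("in-place add failed:", err)

c = a + b                  # a new float64 array is created instead
print(c.dtype)             # float64

a += b.astype(int)         # explicit, lossy conversion: 0.5 becomes 0
print(a)
```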

In [None]:
'''
When operating with arrays of different types, 
the type of the resulting array corresponds to the more general or precise one (a behavior known as upcasting).
'''

In [None]:
a = np.ones(3, dtype=np.int32)
a

In [None]:
b = np.linspace(0, np.pi, 3)
b

In [None]:
b.dtype.name

In [None]:
c = a + b
c

In [None]:
c.dtype.name

In [None]:
d = np.exp(c*1j)
d

In [None]:
d.dtype.name

In [None]:
'''
Many unary operations, such as computing the sum of all the elements in the array, 
are implemented as methods of the ndarray class.
'''

In [None]:
a = np.random.random((2,3))
a

In [None]:
a.sum()

In [None]:
a.min()

In [None]:
a.max()

In [None]:
'''
By default, these operations apply to the array as though it were a list of numbers, regardless of its shape. 
However, by specifying the axis parameter you can apply an operation along the specified axis of an array:
'''

In [None]:
b = np.arange(12).reshape(3,4)
b

In [None]:
b.sum(axis=0)

In [None]:
b.min(axis=1)

In [None]:
b.cumsum(axis=1)

### Universal Functions

In [None]:
'''
NumPy provides familiar mathematical functions such as sin, cos, and exp. 
In NumPy, these are called “universal functions”(ufunc). 
Within NumPy, these functions operate elementwise on an array, producing an array as output.
'''

In [None]:
B = np.arange(3)
B

In [None]:
np.exp(B)

In [None]:
np.sqrt(B)

In [None]:
C = np.array([2., -1., 4.])
np.add(B ,C)

In [None]:
# More functions worth knowing (not all of them are ufuncs) are listed below
'''
all, any, apply_along_axis, argmax, argmin, argsort, average, 
bincount, ceil, clip, conj, corrcoef, cov, cross, cumprod, cumsum, 
diff, dot, floor, inner, inv, lexsort, max, maximum, mean, median, 
min, minimum, nonzero, outer, prod, re, round, sort, std, sum, trace,
transpose, var, vdot, vectorize, where
'''

### Indexing, Slicing and Iterating

In [None]:
'''
One-dimensional arrays can be indexed, sliced and iterated over, much like lists and other Python sequences.
'''

In [None]:
a = np.arange(10) ** 3
a

In [None]:
a[2]

In [None]:
a[2:5]

In [None]:
a[:6:2] = -100
a

In [None]:
for i in a:
    print(i**(1/3.))

In [None]:
'''
Multidimensional arrays can have one index per axis. These indices are given in a tuple separated by commas:
'''

In [None]:
def f(x, y):
    return 10*x + y

In [None]:
d = np.fromfunction(f, (5,4),dtype=int)  # the np.int alias was removed from recent NumPy releases
d

In [None]:
d[2,3]

In [None]:
d[0:5,1]

In [None]:
d[:,1]

In [None]:
d[1:3,:]

In [None]:
'''
When fewer indices are provided than the number of axes, the missing indices are considered complete slices:
'''

In [None]:
d[-1]

In [None]:
'''
The expression within brackets in b[i] is treated as an i followed by as many instances of : as needed to represent the remaining axes. 
NumPy also allows you to write this using dots as b[i,...].

The dots (...) represent as many colons as needed to produce a complete indexing tuple. 
For example, if x is a rank 5 array (i.e., it has 5 axes), then

x[1,2,...] is equivalent to x[1,2,:,:,:],
x[...,3] to x[:,:,:,:,3] and
x[4,...,5,:] to x[4,:,:,5,:].


'''

In [None]:
c = np.array([[[0,1,2],[11,12,12]],[[100,101,101],[201,202,203]]])
c

In [None]:
c[1,...]

In [None]:
c[...,2]
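
The three equivalences stated for a rank 5 array can be verified on a concrete array. The shape below is hypothetical, chosen only so that every index used is in range:

```python
import numpy as np

# a hypothetical array with 5 axes, large enough for the indices used below
x = np.arange(5 * 3 * 2 * 6 * 4).reshape(5, 3, 2, 6, 4)

print(np.array_equal(x[1, 2, ...], x[1, 2, :, :, :]))      # True
print(np.array_equal(x[..., 3], x[:, :, :, :, 3]))         # True
print(np.array_equal(x[4, ..., 5, :], x[4, :, :, 5, :]))   # True
print(x[4, ..., 5, :].shape)                               # (3, 2, 4)
```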

In [None]:
'''
Iterating over multidimensional arrays is done with respect to the first axis:
'''

In [None]:
b

In [None]:
for row in b:
    print(row)

In [None]:
'''
However, if one wants to perform an operation on each element in the array, 
one can use the flat attribute which is an iterator over all the elements of the array:
'''

In [None]:
b.flat

In [None]:
for element in b.flat:
    print(element)

## Shape Manipulation

### Changing the Shape of an Array

In [None]:
'''
An array has a shape given by the number of elements along each axis:
'''


In [None]:
a = np.floor(10 * np.random.random((3,4)))
a

In [None]:
a.shape

In [None]:
'''
The shape of an array can be changed with various commands. 
Note that the following three commands all return a modified array, but do not change the original array:
'''

In [None]:
a.ravel() # returns the array, flattened

In [None]:
a.reshape(6,2) # returns the array with a modified shape

In [None]:
a.T # returns the array, transposed
a.transpose()

In [None]:
a.T.shape

In [None]:
a.shape

In [None]:
'''
The order of the elements in the array resulting from ravel() is normally “C-style”, that is, 
the rightmost index “changes the fastest”, so the element after a[0,0] is a[0,1]. 
If the array is reshaped to some other shape, again the array is treated as “C-style”. 
NumPy normally creates arrays stored in this order, so ravel() will usually not need to copy its argument, 
but if the array was made by taking slices of another array or created with unusual options, it may need to be copied.
The functions ravel() and reshape() can also be instructed, using an optional argument, to use FORTRAN-style arrays, 
in which the leftmost index changes the fastest.
'''
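
The two element orders, and the point about copying, can be seen on a small array. A minimal sketch; the `order='F'` argument is the optional argument referred to above, and `np.shares_memory` is used here only to show whether a copy was made:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)     # [[0, 1, 2], [3, 4, 5]]

c_order = a.ravel()                # rightmost index changes fastest
f_order = a.ravel(order='F')       # leftmost index changes fastest
print(c_order)                     # [0 1 2 3 4 5]
print(f_order)                     # [0 3 1 4 2 5]

# ravel() of a C-contiguous array can reuse the data; the transpose is
# not C-contiguous, so ravel() has to copy
print(np.shares_memory(a, a.ravel()))     # True
print(np.shares_memory(a, a.T.ravel()))   # False
```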

In [None]:
'''
The reshape function returns its argument with a modified shape, 
whereas the ndarray.resize method modifies the array itself:

'''

In [None]:
a

In [None]:
a.resize((2,6))
a

In [None]:
'''
If a dimension is given as -1 in a reshaping operation, the other dimensions are automatically calculated:
'''

In [None]:
a.reshape(3,-1)

### Stacking Together Different Arrays

In [None]:
a = np.floor(10 * np.random.random((2,2)))
a

In [None]:
b = np.floor(10 * np.random.random((2,2)))
b

In [None]:
np.vstack((a,b))

In [None]:
np.hstack((a,b))

In [None]:
'''
The function column_stack stacks 1D arrays as columns into a 2D array. 
It is equivalent to hstack only for 2D arrays:
'''

In [None]:
np.column_stack((a,b))

In [None]:
a = np.array([4., 2.])
b = np.array([2.,8.])

In [None]:
a[:,np.newaxis]

In [None]:
np.column_stack((a[:,np.newaxis],b[:,np.newaxis]))

In [None]:
np.vstack((a[:,np.newaxis],b[:,np.newaxis]))

In [None]:
'''
For arrays with more than two dimensions, hstack stacks along their second axes, 
vstack stacks along their first axes, 
and concatenate accepts an optional argument giving the number of the axis 
along which the concatenation should happen.
'''
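
For three-dimensional inputs the difference shows up in the resulting shapes. A small sketch with two illustrative arrays of shape (2, 3, 4):

```python
import numpy as np

a = np.zeros((2, 3, 4))
b = np.ones((2, 3, 4))

print(np.vstack((a, b)).shape)               # (4, 3, 4)  first axis grows
print(np.hstack((a, b)).shape)               # (2, 6, 4)  second axis grows
print(np.concatenate((a, b), axis=2).shape)  # (2, 3, 8)  chosen axis grows
```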

In [None]:
'''
In complex cases, r_ and c_ are useful for creating arrays by stacking numbers along one axis. 
They allow the use of range literals (":")
'''

In [None]:
np.r_[1:4,0,4]
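
As a companion to `r_`, here is a sketch of `c_`, which places its arguments side by side as columns (the input values are arbitrary):

```python
import numpy as np

row = np.r_[1:4, 0, 4]              # range 1..3 followed by 0 and 4
print(row)                          # [1 2 3 0 4]

cols = np.c_[np.array([1, 2, 3]), np.array([4, 5, 6])]
print(cols)                         # rows [1 4], [2 5], [3 6]
```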

### Splitting One Array into Several Smaller Ones

In [None]:
'''
Using hsplit, you can split an array along its horizontal axis, 
either by specifying the number of equally shaped arrays to return, 
or by specifying the columns after which the division should occur:
'''

In [None]:
a = np.floor(10 * np.random.random((4,12)))
a

In [None]:
np.hsplit(a, 3)

In [None]:
np.hsplit(a, (3,5))

In [None]:
np.vsplit(a, 4)

## Copies and Views

In [None]:
'''
When operating and manipulating arrays, their data is sometimes copied into a new array and sometimes not. 
This is often a source of confusion for beginners. There are three cases:
'''

### No Copy at All

In [None]:
'''
Simple assignments make no copy of array objects or of their data.
'''

In [None]:
a = np.arange(12)
b = a
b is a

In [None]:
b.shape = 3,4
a.shape

In [None]:
'''
Python passes mutable objects as references, so function calls make no copy.
'''

In [None]:
print(id(a))
print(id(b))
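
The statement about function calls can be checked with a small helper. The function name `set_first` is hypothetical and exists only for this sketch:

```python
import numpy as np

def set_first(x):
    # x refers to the caller's array; no copy is made
    x[0] = -1
    return id(x)

a = np.arange(5)
same_object = set_first(a) == id(a)

print(same_object)   # True
print(a)             # [-1  1  2  3  4]
```

Because the function receives a reference, the assignment inside it is visible to the caller.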

### View or Shallow Copy

In [None]:
'''
Different array objects can share the same data. 
The view method creates a new array object that looks at the same data.
'''

In [None]:
c = a.view()
c

In [None]:
c is a

In [None]:
c.base is a

In [None]:
c.flags.owndata

In [None]:
c.shape = 2,6
c

In [None]:
a.shape

In [None]:
c[0,4] = 1234

In [None]:
a

In [None]:
'''
Slicing an array returns a view of it:
'''

In [None]:
s = a[:,1:3]
s[:] = 10
a

### Deep Copy

In [None]:
d = a.copy()
d is a

In [None]:
d.base is a

In [None]:
d[0,0] = 99
a

### Functions and Methods Overview

In [None]:
'''
Array Creation
    arange, array, copy, empty, empty_like, eye, fromfile, fromfunction, identity, 
    linspace, logspace, mgrid, ogrid, ones, ones_like, r_, zeros, zeros_like
Conversions
    ndarray.astype, atleast_1d, atleast_2d, atleast_3d, mat
Manipulations
    array_split, column_stack, concatenate, diagonal, dsplit, dstack, hsplit, hstack, 
    ndarray.item, newaxis, ravel, repeat, reshape, resize, squeeze, swapaxes, take, transpose, vsplit, vstack
Questions
    all, any, nonzero, where
Ordering
    argmax, argmin, argsort, max, min, ptp, searchsorted, sort
Operations
    choose, compress, cumprod, cumsum, inner, ndarray.fill, imag, prod, put, putmask, real, sum
Basic Statistics
    cov, mean, std, var
Basic Linear Algebra
    cross, dot, outer, linalg.svd, vdot
'''

## Less Basic

### Broadcasting Rules

In [None]:
'''
Broadcasting allows universal functions to deal in a meaningful way with inputs that do not have exactly the same shape.

The first rule of broadcasting is that if all input arrays do not have the same number of dimensions, a “1” will be repeatedly prepended to the shapes of the smaller arrays until all the arrays have the same number of dimensions.

The second rule of broadcasting ensures that arrays with a size of 1 along a particular dimension act as if they had the size of the array with the largest shape along that dimension. The value of the array element is assumed to be the same along that dimension for the “broadcast” array.

After application of the broadcasting rules, the sizes of all arrays must match. More details can be found in Broadcasting.
'''
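
The two rules can be followed step by step on a column and a row. A minimal sketch with illustrative shapes, including a case where the sizes cannot be matched:

```python
import numpy as np

col = np.arange(3).reshape(3, 1)   # shape (3, 1)
row = np.arange(4)                 # shape (4,), treated as (1, 4) by rule 1

total = col + row                  # both are stretched to (3, 4) by rule 2
print(total.shape)                 # (3, 4)
print(total)

mismatch = False
try:
    np.ones((3, 4)) + np.ones(3)   # trailing sizes 4 and 3 do not match
except ValueError as err:
    mismatch = True
    print("cannot broadcast:", err)
```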

## Fancy indexing and index tricks

In [None]:
'''
NumPy offers more indexing facilities than regular Python sequences. 
In addition to indexing by integers and slices, as we saw before, 
arrays can be indexed by arrays of integers and arrays of booleans.
'''
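
Indexing with an array of booleans is the second facility mentioned above. A short sketch, assuming a mask with the same shape as the array being indexed:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
mask = a > 4                  # boolean array with the same shape as a

selected = a[mask]            # 1-D array of the elements where mask is True
print(selected)               # [ 5  6  7  8  9 10 11]

a[mask] = 0                   # boolean indexing also works in assignments
print(a)
```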

### Indexing with Arrays of Indices

In [None]:
a = np.arange(12) ** 2
i = np.array([1,1,3,8,5])
a[i]

In [None]:
j = np.array([[3,4],[9,7]])
a[j]

In [None]:
'''
When the indexed array a is multidimensional, a single array of indices refers to the first dimension of a. 
The following example shows this behavior by converting an image of labels into a color image using a palette.
'''

In [None]:
palette = np.array( [ [0,0,0],                # black
                      [255,0,0],              # red
                      [0,255,0],              # green
                      [0,0,255],              # blue
                      [255,255,255] ] )       # white

In [None]:
image = np.array( [ [ 0, 1, 2, 0 ],           # each value corresponds to a color in the palette
                    [ 0, 3, 4, 0 ]  ] )

In [None]:
palette[image]

In [None]:
'''
We can also give indexes for more than one dimension. 
The arrays of indices for each dimension must have the same shape.
'''
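
A sketch of indexing two dimensions at once: `i` and `j` below are illustrative index arrays of the same shape, and each output element pairs one entry of `i` with the matching entry of `j`:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
i = np.array([[0, 1], [1, 2]])   # indices for the first axis
j = np.array([[2, 1], [3, 3]])   # indices for the second axis

picked = a[i, j]                 # picked[m, n] == a[i[m, n], j[m, n]]
print(picked)                    # [[ 2  5] [ 7 11]]
```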