# NumPy

NumPy, short for Numerical Python, is the core library for scientific computing in Python. It provides a high-performance multidimensional array object and tools for working with these arrays. If you are already familiar with MATLAB, you may find this tutorial a useful way to get started with NumPy.

Here are some of the things you’ll find in NumPy:
* ndarray, an efficient multidimensional array providing fast array-oriented arithmetic operations and flexible broadcasting capabilities.
* Mathematical functions for fast operations on entire arrays of data without having to write loops.
* Tools for reading/writing array data to disk and working with memory-mapped files.
* Linear algebra, random number generation, and Fourier transform capabilities.
* A C API for connecting NumPy with libraries written in C, C++, or FORTRAN.

Because NumPy provides an easy-to-use C API, it is straightforward to pass data to external libraries written in a low-level language and also for external libraries to return data to Python as NumPy arrays. This feature has made Python a language of choice for wrapping legacy C/C++/Fortran codebases and giving them a dynamic and easy-to-use interface.


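## NumPy Data Types

|Data type|Description|
|---------|-----------|
|bool|Boolean (True or False) stored as one byte|
|int|Default integer type, equivalent to C `long`; usually int32 or int64|
|intc|Equivalent to C `int`; usually int32 or int64|
|intp|Integer used for indexing, equivalent to C `ssize_t`; usually int32 or int64|
|int8|Byte (-128 to 127)|
|int16|16-bit integer (-32768 to 32767)|
|int32|32-bit integer (-2147483648 to 2147483647)|
|int64|64-bit integer (-9223372036854775808 to 9223372036854775807)|
|uint8|8-bit unsigned integer (0 to 255)|
|uint16|16-bit unsigned integer (0 to 65535)|
|uint32|32-bit unsigned integer (0 to 4294967295)|
|uint64|64-bit unsigned integer (0 to 18446744073709551615)|
|float|Shorthand for float64|
|float16|Half-precision float: sign bit, 5-bit exponent, 10-bit mantissa|
|float32|Single-precision float: sign bit, 8-bit exponent, 23-bit mantissa|
|float64|Double-precision float: sign bit, 11-bit exponent, 52-bit mantissa|
|complex|Shorthand for complex128|
|complex64|Complex number represented by two 32-bit floats (real and imaginary parts)|
|complex128|Complex number represented by two 64-bit floats (real and imaginary parts)|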
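
The ranges and bit layouts in the table do not have to be memorized. As a small sketch, assuming a reasonably recent NumPy, `np.iinfo`, `np.finfo`, and `dtype.itemsize` report them directly (the variable names `info8` and `f16` are only illustrative):

```python
import numpy as np

# Integer ranges can be queried from the type itself.
info8 = np.iinfo(np.int8)
print(info8.min, info8.max)               # -128 127
print(np.iinfo(np.uint16).max)            # 65535

# itemsize is the number of bytes used by a single element.
print(np.dtype(np.float32).itemsize)      # 4
print(np.dtype(np.complex128).itemsize)   # 16

# Floating-point layout: number of mantissa and exponent bits.
f16 = np.finfo(np.float16)
print(f16.nmant, f16.nexp)                # 10 5
```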

## The NumPy ndarray: A Multidimensional Array Object
One of the key features of NumPy is its N-dimensional array object, or ndarray, which is a fast, flexible container for large datasets in Python. Arrays enable you to perform mathematical operations on whole blocks of data using similar syntax to the equivalent operations between scalar elements.

An ndarray is a generic multidimensional container for **homogeneous data;** that is, **all of the elements must be the same type.** Every array has a `shape`, a tuple indicating the size of each dimension, and a `dtype`, an object describing the data type of the array.

The most important object defined in NumPy is the N-dimensional array type called `ndarray`. It describes a **collection of items of the same type**. Items in the collection can be accessed using a zero-based index.

Every item in an `ndarray` takes the same size of block in memory. Each element in an ndarray is described by a data-type object (called `dtype`).

Any item extracted from an `ndarray` object (by indexing) is represented by a Python object of one of the array scalar types. The following figure shows the relationship between `ndarray`, the data-type object (`dtype`), and the array scalar type.

![](https://www.tutorialspoint.com//numpy/images/ndarray.jpg)
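
The relationship between an array, its `dtype`, and the array scalars extracted from it can be checked interactively. The following is a minimal sketch; the names `arr`, `item`, and `mixed` are illustrative only:

```python
import numpy as np

arr = np.array([[1.5, 2.0], [3.0, 4.5]])
print(arr.shape)      # (2, 2)
print(arr.dtype)      # float64
print(arr.itemsize)   # 8 bytes per element

# Indexing a single element returns an array scalar, not a plain Python float.
item = arr[0, 1]
print(type(item))                 # <class 'numpy.float64'>
print(item.dtype == arr.dtype)    # True

# Because the data must be homogeneous, mixed inputs are upcast to one dtype.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)    # float64
```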

## Creating ndarrays

The easiest way to create an array is to use the array function. This accepts any sequence-like object (including other arrays) and produces a new NumPy array containing the passed data.

Its signature is `numpy.array(object, dtype=None, copy=True, order=None, subok=False, ndmin=0)`, with the following parameters:

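|Parameter|Description|
|-----|:---|
|object|Any object exposing the array interface, or any (nested) sequence|
|dtype|Desired data type of the array; optional|
|copy|Optional, defaults to true; whether the object is copied|
|order|C (row-major), F (column-major), or A (any, the default)|
|subok|By default, the returned array is forced to be a base-class array; if true, subclasses are passed through|
|ndmin|Minimum number of dimensions of the returned array|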

In [56]:
import numpy as np

a = np.array([1,2,3])
print(a)

# More than one dimension
a = np.array([[1,2,3],[2,3,4]])  
print(a)
print("shape:",a.shape)
print("# of dim:",a.ndim)
print('size:',a.size)

# Minimum number of dimensions
a = np.array([1,2,3,4,5], ndmin =  2)  
print(a)


# The dtype parameter
a = np.array([1,2,3], dtype = complex)  
print(a)

# int8, int16, int32, and int64 can be replaced by the equivalent strings 'i1', 'i2', 'i4', 'i8'
dt = np.dtype('i4')  
print(dt) 

[1 2 3]
[[1 2 3]
 [2 3 4]]
shape: (2, 3)
# of dim: 2
size: 6
[[1 2 3 4 5]]
[1.+0.j 2.+0.j 3.+0.j]
int32


NumPy also provides many functions to create arrays:

|Function|Description|
|--------|-----------|
|array|Convert input data (list, tuple, array, or other sequence type) to an ndarray, either by inferring a dtype or by explicitly specifying one; copies the input data by default|
|asarray|Convert input to ndarray, but do not copy if the input is already an ndarray|
|arange|Like the built-in range, but returns an ndarray instead of a list|
|ones, ones_like|Produce an array of all 1s with the given shape and dtype; ones_like takes another array and produces a ones array of the same shape and dtype|
|zeros, zeros_like|Like ones and ones_like, but producing arrays of 0s instead|
|empty, empty_like|Create new arrays by allocating new memory, but do not populate them with any values the way ones and zeros do|
|full, full_like|Produce an array of the given shape and dtype with all values set to the indicated "fill value"; full_like takes another array and produces a filled array of the same shape and dtype|
|eye, identity|Create a square N × N identity matrix (1s on the diagonal and 0s elsewhere)|

In [18]:
import numpy as np

a = np.zeros((2,2))   # Create an array of all zeros
print(a)              # Prints "[[ 0.  0.]
                      #          [ 0.  0.]]"

b = np.ones((1,2))    # Create an array of all ones
print(b)              # Prints "[[ 1.  1.]]"

c = np.full((2,2), 7)  # Create a constant array
print(c)               # Prints "[[7 7]
                       #          [7 7]]"

d = np.empty((3,4))   # Create an uninitialized array
print(d)              # empty, unlike zeros, does not set the array values to zero, 
                      # and may therefore be marginally faster. On the other hand, 
                      # it requires the user to manually set all the values in the array, 
                      # and should be used with caution.
    
e = np.eye(2)         # Create a 2x2 identity matrix
print(e)              # Prints "[[ 1.  0.]
                      #          [ 0.  1.]]"

f = np.random.random((2,2))  # Create an array filled with random values
print(f)                     # Might print "[[ 0.91940167  0.08143941]
                             #               [ 0.68744134  0.87236687]]"
    
g = np.arange(10,22,2)    # Create evenly spaced values in [10, 22) with step 2
print(g)                  # Prints "[10 12 14 16 18 20]"
print(g.reshape((2,3)))

[[0. 0.]
 [0. 0.]]
[[1. 1.]]
[[7 7]
 [7 7]]
[[5.43e-323 9.88e-324 1.48e-323 1.98e-323]
 [2.47e-323 7.91e-323 8.40e-323 3.95e-323]
 [4.45e-323 4.94e-323 1.04e-322 5.93e-323]]
[[1. 0.]
 [0. 1.]]
[[0.53701068 0.98133066]
 [0.49735809 0.62795559]]
[10 12 14 16 18 20]
[[10 12 14]
 [16 18 20]]
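
The table of creation functions notes that `array` copies its input by default while `asarray` does not copy an existing ndarray. A brief sketch of that difference, together with the `*_like` helpers, might look as follows (the names `src`, `copied`, `aliased`, and `template` are hypothetical):

```python
import numpy as np

src = np.arange(4)            # [0 1 2 3]

copied = np.array(src)        # np.array copies the data by default
aliased = np.asarray(src)     # input is already an ndarray, so no copy is made

src[0] = 99
print(copied[0])              # 0   (the copy is unaffected)
print(aliased[0])             # 99  (same underlying data as src)
print(aliased is src)         # True

# The *_like functions take shape and dtype from an existing array.
template = np.zeros((2, 3), dtype=np.int16)
ones = np.ones_like(template)
sevens = np.full_like(template, 7)
print(ones.shape, ones.dtype)  # (2, 3) int16
print(sevens)
```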


## Array indexing

Numpy offers several ways to index into arrays.
* Slicing
* Integer array indexing
* Boolean array indexing

**a) Slicing**: Similar to Python lists, NumPy arrays can be sliced.

The basic slice syntax is `i:j:k`, where `i` is the starting index, `j` is the stopping index, and `k` is the step (`k ≠ 0`). This selects the *m* elements (in the corresponding dimension) with index values *i, i + k, …, i + (m - 1) k*, where *m = q + (r ≠ 0)* and *q* and *r* are the quotient and remainder obtained by dividing *j - i* by *k*: *j - i = q k + r*, so that *i + (m - 1) k < j*. Here *(r ≠ 0)* counts as 1 when the remainder is nonzero and as 0 otherwise.
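
To make the element count concrete, here is a small worked sketch. The helper `slice_length` is hypothetical (it is not part of NumPy) and assumes a positive step with `0 <= i <= j` inside the array bounds:

```python
import numpy as np

def slice_length(i, j, k):
    """Number of elements selected by i:j:k, assuming 0 <= i <= j and k > 0."""
    q, r = divmod(j - i, k)   # j - i = q*k + r
    return q + (r != 0)       # add one more element when the remainder is nonzero

a = np.arange(10)
i, j, k = 1, 8, 3
m = slice_length(i, j, k)
print(m)                      # 3
print(a[i:j:k])               # [1 4 7]
print(i + (m - 1) * k < j)    # True: the last selected index stays below j
```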

Since arrays may be multidimensional, you can specify a slice for each dimension of the array (any trailing dimensions you leave out are taken in full):

In [42]:
import numpy as np

# Create the following rank 2 array with shape (3, 4)
# [[ 1  2  3  4]
#  [ 5  6  7  8]
#  [ 9 10 11 12]]
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])

# Use slicing to pull out the subarray consisting of the first 2 rows
# and columns 1 and 2; b is the following array of shape (2, 2):
# [[2 3]
#  [6 7]]
b = a[:2, 1:3]

# A slice of an array is a view into the same data, so modifying it
# will modify the original array.
print(a[0, 1])   # Prints "2"
b[0, 0] = 77     # b[0, 0] is the same piece of data as a[0, 1]
print(a[0, 1])   # Prints "77"

#Select the specific column
print(a[:,1])     # Prints "[77 6 10]"
print(a[:,:2])    # Prints "[[ 1 77]
                           # [ 5  6]
                           # [ 9 10]]"

#Select the specific row
print(a[2,:])       # Prints "[9 10 11 12]"
print(a[2:,:])      # Prints "[[ 9 10 11 12]]"

2
77
[77  6 10]
[[ 1 77]
 [ 5  6]
 [ 9 10]]
[ 9 10 11 12]
[[ 9 10 11 12]]


You can also mix integer indexing with slice indexing. However, doing so **will yield an array of lower rank than the original array.** Note that this is quite different from the way that MATLAB handles array slicing:

In [43]:
import numpy as np

# Create the following rank 2 array with shape (3, 4)
# [[ 1  2  3  4]
#  [ 5  6  7  8]
#  [ 9 10 11 12]]
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])

# Two ways of accessing the data in the middle row of the array.
# Mixing integer indexing with slices yields an array of lower rank,
# while using only slices yields an array of the same rank as the
# original array:
row_r1 = a[1, :]    # Rank 1 view of the second row of a
row_r2 = a[1:2, :]  # Rank 2 view of the second row of a
print(row_r1, row_r1.shape)  # Prints "[5 6 7 8] (4,)"
print(row_r2, row_r2.shape)  # Prints "[[5 6 7 8]] (1, 4)"

# We can make the same distinction when accessing columns of an array:
col_r1 = a[:, 1]
col_r2 = a[:, 1:2]
print(col_r1, col_r1.shape)  # Prints "[ 2  6 10] (3,)"
print(col_r2, col_r2.shape)  # Prints "[[ 2]
                             #          [ 6]
                             #          [10]] (3, 1)"

[5 6 7 8] (4,)
[[5 6 7 8]] (1, 4)
[ 2  6 10] (3,)
[[ 2]
 [ 6]
 [10]] (3, 1)


**b) Integer array indexing**: When you index into NumPy arrays using slicing, the resulting array view will always be a subarray of the original array. In contrast, integer array indexing allows you to **construct arbitrary arrays using the data from another array.** Here is an example:

In [50]:
import numpy as np

a = np.array([[1,2],[3,4],[5,6]])

# An example of integer array indexing (also called "fancy indexing").
# The returned array will have shape (3,).
print(a[[0, 1, 2], [0, 1, 0]])  # Prints "[1 4 5]"

# The above example of integer array indexing is equivalent to this:
print(np.array([a[0, 0], a[1, 1], a[2, 0]]))  # Prints "[1 4 5]"

# When using integer array indexing, you can reuse the same
# element from the source array:
print(a[[0, 0], [1, 1]])  # Prints "[2 2]"

# Equivalent to the previous integer array indexing example
print(np.array([a[0, 1], a[0, 1]]))  # Prints "[2 2]"

[1 4 5]
[1 4 5]
[2 2]
[2 2]
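
One practical consequence of the difference between slicing and integer array indexing is that a slice is a view while an integer-array selection is a new array with its own data. The sketch below illustrates this with `np.shares_memory`; the names `view` and `picked` are illustrative:

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])

view = a[:2, :]          # basic slicing returns a view of a
picked = a[[0, 1], :]    # integer array indexing returns a copy

print(np.shares_memory(a, view))     # True
print(np.shares_memory(a, picked))   # False

picked[0, 0] = -1        # modifies only the copy
print(a[0, 0])           # 1
view[0, 0] = -1          # modifies a through the view
print(a[0, 0])           # -1
```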


One useful trick with integer array indexing is **selecting or mutating one element from each row of a matrix**:

In [32]:
import numpy as np

# Create a new array from which we will select elements
a = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])

print(a)  # prints "array([[ 1,  2,  3],
          #                [ 4,  5,  6],
          #                [ 7,  8,  9],
          #                [10, 11, 12]])"

# Create an array of indices
b = np.array([0, 2, 0, 1])

# Select one element from each row of a using the indices in b
print(a[np.arange(4), b])  # Prints "[ 1  6  7 11]"

# Mutate one element from each row of a using the indices in b
a[np.arange(4), b] += 10

print(a)  # prints "array([[11,  2,  3],
          #                [ 4,  5, 16],
          #                [17,  8,  9],
          #                [10, 21, 12]])"

# Select a specific row
print(a[2])       # Prints "[17  8  9]"

[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]
[ 1  6  7 11]
[[11  2  3]
 [ 4  5 16]
 [17  8  9]
 [10 21 12]]
[17  8  9]


**c) Boolean array indexing:** Boolean array indexing lets you pick out arbitrary elements of an array. Frequently this type of indexing is used to **select the elements of an array that satisfy some condition.** Here is an example:

In [16]:
import numpy as np

a = np.array([[1,2], [3, 4], [5, 6]])

bool_idx = (a > 2)   # Find the elements of a that are bigger than 2;
                     # this returns a numpy array of Booleans of the same
                     # shape as a, where each slot of bool_idx tells
                     # whether that element of a is > 2.

print(bool_idx)      # Prints "[[False False]
                     #          [ True  True]
                     #          [ True  True]]"

# We use boolean array indexing to construct a rank 1 array
# consisting of the elements of a corresponding to the True values
# of bool_idx
print(a[bool_idx])  # Prints "[3 4 5 6]"

# We can do all of the above in a single concise statement:
print(a[a > 2])     # Prints "[3 4 5 6]"

[[False False]
 [ True  True]
 [ True  True]]
[3 4 5 6]
[3 4 5 6]
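
Boolean masks can also be combined and used on the left-hand side of an assignment. The following sketch assumes the same array `a` as above; `mask` and `b` are illustrative names. Note that conditions are combined with `&` and `|` (with parentheses), not with the Python keywords `and` and `or`:

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])

# Combine conditions elementwise: greater than 2 AND even.
mask = (a > 2) & (a % 2 == 0)
print(a[mask])                    # [4 6]

# Assigning through a mask changes only the selected elements.
b = a.copy()
b[b > 4] = 0
print(b)                          # [[1 2]
                                  #  [3 4]
                                  #  [0 0]]

# Counting how many elements satisfy a condition.
print(np.count_nonzero(a > 2))    # 4
```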


## Array Math

Basic mathematical functions operate **elementwise** on arrays, and are available both as operator overloads and as functions in the numpy module:

In [51]:
import numpy as np

x = np.array([[1,2],[3,4]], dtype=np.float64)
y = np.array([[5,6],[7,8]], dtype=np.float64)

# Elementwise sum; both produce the array
# [[ 6.0  8.0]
#  [10.0 12.0]]
print(x + y)
print(np.add(x, y))

# Elementwise difference; both produce the array
# [[-4.0 -4.0]
#  [-4.0 -4.0]]
print(x - y)
print(np.subtract(x, y))

# Elementwise product; both produce the array
# [[ 5.0 12.0]
#  [21.0 32.0]]
print(x * y)
print(np.multiply(x, y))

# Elementwise division; both produce the array
# [[ 0.2         0.33333333]
#  [ 0.42857143  0.5       ]]
print(x / y)
print(np.divide(x, y))

# Elementwise square root; produces the array
# [[ 1.          1.41421356]
#  [ 1.73205081  2.        ]]
print(np.sqrt(x))

[[ 6.  8.]
 [10. 12.]]
[[ 6.  8.]
 [10. 12.]]
[[-4. -4.]
 [-4. -4.]]
[[-4. -4.]
 [-4. -4.]]
[[ 5. 12.]
 [21. 32.]]
[[ 5. 12.]
 [21. 32.]]
[[0.2        0.33333333]
 [0.42857143 0.5       ]]
[[0.2        0.33333333]
 [0.42857143 0.5       ]]
[[1.         1.41421356]
 [1.73205081 2.        ]]


Note that unlike MATLAB, `*` is elementwise multiplication, not matrix multiplication. We instead use the `dot` function to compute inner products of vectors, to multiply a vector by a matrix, and to multiply matrices. `dot` is available both as a function in the numpy module and as an instance method of array objects:

In [52]:
import numpy as np

x = np.array([[1,2],[3,4]])
y = np.array([[5,6],[7,8]])

v = np.array([9,10])
w = np.array([11,12])

# Inner product of vectors; both produce 219
print(v.dot(w))
print(np.dot(v, w))

# Matrix / vector product; both produce the rank 1 array [29 67]
print(x.dot(v))
print(np.dot(x, v))

# Matrix / matrix product; both produce the rank 2 array
# [[19 22]
#  [43 50]]
print(x.dot(y))
print(np.dot(x, y))

219
219
[29 67]
[29 67]
[[19 22]
 [43 50]]
[[19 22]
 [43 50]]
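
In Python 3.5 and later, the `@` operator offers another spelling for these products. As a short sketch using the same `x`, `y`, and `v` as above, it gives the same results as `dot` for one- and two-dimensional arrays:

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])
v = np.array([9, 10])

print(x @ y)                                  # [[19 22]
                                              #  [43 50]]
print(x @ v)                                  # [29 67]
print(np.array_equal(x @ y, np.dot(x, y)))    # True

# Elementwise multiplication gives a different result from the matrix product.
print(np.array_equal(x * y, x @ y))           # False
```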


NumPy provides many useful functions for performing computations on arrays. Here are some examples:

In [67]:
import numpy as np

x = np.array([[1,2],[3,4]])

# Maximum of all elements
print("max:",np.max(x))
# Index of the maximum in the flattened array
print("index of max:",np.argmax(x))

# sum function
print("sum function")
print(np.sum(x))  # Compute sum of all elements; prints "10"
print(np.sum(x, axis=0))  # Compute sum of each column; prints "[4 6]"
print(np.sum(x, axis=1))  # Compute sum of each row; prints "[3 7]"

# cumsum function
print("cumsum function")
print(np.cumsum(x)) # Cumulative sum over the flattened array
print(np.cumsum(x, axis = 0)) # Cumulative sum down each column (axis 0)
print(np.cumsum(x, axis = 1)) # Cumulative sum along each row (axis 1)


# diff function
print("diff function")
print(np.diff(x)) # First discrete difference along the last axis (the default)
print(np.diff(x, axis = 0)) # Difference between consecutive rows
print(np.diff(x, axis = 1)) # Difference between consecutive columns


# prod function
print("prod function")
print(np.prod(x)) # Product of all array elements
print(np.prod(x,axis = 0)) # Product of each column (axis 0)
print(np.prod(x,axis = 1)) # Product of each row (axis 1)


# gradient function
print("gradient function")
print(np.gradient(x)) # Gradient along each axis; returns one array per dimension

max: 4
index of max: 3
sum function
10
[4 6]
[3 7]
cumsum function
[ 1  3  6 10]
[[1 2]
 [4 6]]
[[1 3]
 [3 7]]
diff function
[[1]
 [1]]
[[2 2]]
[[1]
 [1]]
prod function
24
[3 8]
[ 2 12]
gradient function
[array([[2., 2.],
       [2., 2.]]), array([[1., 1.],
       [1., 1.]])]
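
Two details of these reductions are easy to miss: `argmax` without an `axis` returns an index into the flattened array, and a reduction drops the reduced axis unless `keepdims=True` is passed. The sketch below shows both; `flat_idx` and `row_sums` are illustrative names:

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])

# Convert the flat index returned by argmax back to (row, column).
flat_idx = np.argmax(x)                          # 3
row, col = np.unravel_index(flat_idx, x.shape)
print(row, col)                                  # 1 1

# argmax along axis 0 gives the row index of the maximum in each column.
print(np.argmax(x, axis=0))                      # [1 1]

# keepdims=True keeps the reduced axis with size 1, which is convenient
# when the result is combined with the original array.
row_sums = np.sum(x, axis=1, keepdims=True)
print(row_sums.shape)                            # (2, 1)
print(x / row_sums)                              # each row now sums to 1
```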


Apart from computing mathematical functions using arrays, we frequently need to reshape or otherwise manipulate data in arrays. The simplest example of this type of operation is transposing a matrix; to transpose a matrix, simply use the `T` attribute of an array object:

In [54]:
import numpy as np

x = np.array([[1,2], [3,4]])
print(x)    # Prints "[[1 2]
            #          [3 4]]"
print(x.T)  # Prints "[[1 3]
            #          [2 4]]"

# Note that taking the transpose of a rank 1 array does nothing:
v = np.array([1,2,3])
print(v)    # Prints "[1 2 3]"
print(v.T)  # Prints "[1 2 3]"

[[1 2]
 [3 4]]
[[1 3]
 [2 4]]
[1 2 3]
[1 2 3]
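
Since transposing a rank 1 array has no effect, a column vector has to be created by adding an axis explicitly. A minimal sketch of two common ways to do this (the names `col` and `row` are illustrative):

```python
import numpy as np

v = np.array([1, 2, 3])
print(v.T.shape)          # (3,)  transposing a rank 1 array changes nothing

# Add a new axis to obtain a column vector of shape (3, 1).
col = v[:, np.newaxis]
print(col.shape)          # (3, 1)

# reshape can do the same; -1 lets NumPy infer that dimension.
row = v.reshape(1, -1)
print(row.shape)          # (1, 3)
print(row.T.shape)        # (3, 1)  a rank 2 array does transpose
```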


## Broadcasting

Broadcasting is a powerful mechanism that allows numpy to work with arrays of different shapes when performing arithmetic operations. Frequently we have a smaller array and a larger array, and we want to use the smaller array multiple times to perform some operation on the larger array.

For example, suppose that we want to add a constant vector to each row of a matrix. We could do it like this:

In [68]:
import numpy as np

# We will add the vector v to each row of the matrix x,
# storing the result in the matrix y
x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
v = np.array([1, 0, 1])
y = np.empty_like(x)   # Create an empty matrix with the same shape as x

# Add the vector v to each row of the matrix x with an explicit loop
for i in range(4):
    y[i, :] = x[i, :] + v

# Now y is the following
# [[ 2  2  4]
#  [ 5  5  7]
#  [ 8  8 10]
#  [11 11 13]]
print(y)

[[ 2  2  4]
 [ 5  5  7]
 [ 8  8 10]
 [11 11 13]]


This works; however, when the matrix `x` is very large, running an explicit loop in Python can be slow. Note that adding the vector `v` to each row of the matrix `x` is equivalent to forming a matrix `vv` by stacking multiple copies of `v` vertically, then performing elementwise summation of `x` and `vv`. We could implement this approach like this:

In [69]:
import numpy as np

# We will add the vector v to each row of the matrix x,
# storing the result in the matrix y
x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
v = np.array([1, 0, 1])
vv = np.tile(v, (4, 1))   # Stack 4 copies of v on top of each other
print(vv)                 # Prints "[[1 0 1]
                          #          [1 0 1]
                          #          [1 0 1]
                          #          [1 0 1]]"
y = x + vv  # Add x and vv elementwise
print(y)  # Prints "[[ 2  2  4]
          #          [ 5  5  7]
          #          [ 8  8 10]
          #          [11 11 13]]"

[[1 0 1]
 [1 0 1]
 [1 0 1]
 [1 0 1]]
[[ 2  2  4]
 [ 5  5  7]
 [ 8  8 10]
 [11 11 13]]


Numpy broadcasting allows us to perform this computation without actually creating multiple copies of `v`. Consider this version, using broadcasting:

In [70]:
import numpy as np

# We will add the vector v to each row of the matrix x,
# storing the result in the matrix y
x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
v = np.array([1, 0, 1])
y = x + v  # Add v to each row of x using broadcasting
print(y)  # Prints "[[ 2  2  4]
          #          [ 5  5  7]
          #          [ 8  8 10]
          #          [11 11 13]]"

[[ 2  2  4]
 [ 5  5  7]
 [ 8  8 10]
 [11 11 13]]


The line `y = x + v` works even though `x` has shape `(4, 3)` and `v` has shape `(3,)` due to broadcasting; this line works as if `v` actually had shape `(4, 3)`, where each row was a copy of `v`, and the sum was performed elementwise.

Broadcasting two arrays together follows these rules:

1. If the arrays do not have the same rank, prepend the shape of the lower rank array with 1s until both shapes have the same length.
2. The two arrays are said to be compatible in a dimension if they have the same size in the dimension, or if one of the arrays has size 1 in that dimension.
3. The arrays can be broadcast together if they are compatible in all dimensions.
4. After broadcasting, each array behaves as if it had shape equal to the elementwise maximum of shapes of the two input arrays.
5. In any dimension where one array had size 1 and the other array had size greater than 1, the first array behaves as if it were copied along that dimension.
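
The rules above can be sketched as a small shape calculation. The function `broadcast_shape` below is a hypothetical helper written for illustration, not a NumPy API; it follows the rules step by step and is compared against the shape NumPy itself produces:

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Apply the broadcasting rules to two shapes; raise ValueError if incompatible."""
    ndim = max(len(shape_a), len(shape_b))
    # Rule 1: left-pad the shorter shape with 1s.
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    result = []
    for da, db in zip(a, b):
        # Rules 2 and 3: sizes must match, or one of them must be 1.
        if da != db and da != 1 and db != 1:
            raise ValueError(f"incompatible shapes {shape_a} and {shape_b}")
        # Rule 4: take the size that is not 1 (the maximum for positive sizes).
        result.append(db if da == 1 else da)
    return tuple(result)

print(broadcast_shape((4, 3), (3,)))             # (4, 3)
print(broadcast_shape((3, 1), (2,)))             # (3, 2)
print((np.zeros((3, 1)) + np.zeros(2)).shape)    # (3, 2), as computed by NumPy
```

Trying `broadcast_shape((4, 3), (4,))` raises `ValueError`, matching the fact that NumPy cannot add arrays of those shapes.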
    

Functions that support broadcasting are known as universal functions. You can find the list of all universal functions in the [documentation](https://docs.scipy.org/doc/numpy/reference/ufuncs.html#available-ufuncs).
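
As a quick illustration, universal functions are instances of `np.ufunc`, and besides broadcasting their inputs they expose methods such as `reduce` and `outer`. The arrays `col` and `row` below are made up for this sketch:

```python
import numpy as np

print(isinstance(np.add, np.ufunc))     # True
print(isinstance(np.sqrt, np.ufunc))    # True

col = np.array([[0], [10], [20]])       # shape (3, 1)
row = np.array([1, 2, 3])               # shape (3,)

# The ufunc broadcasts (3, 1) against (3,) to produce shape (3, 3).
print(np.add(col, row))                 # [[ 1  2  3]
                                        #  [11 12 13]
                                        #  [21 22 23]]

print(np.add.reduce(row))               # 6, same as np.sum(row)
print(np.multiply.outer(row, row))      # 3 x 3 multiplication table
```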

Broadcasting typically makes your code more concise and faster, so you should strive to use it where possible. Here are some applications of broadcasting:

In [72]:
import numpy as np

# Compute outer product of vectors
v = np.array([1,2,3])  # v has shape (3,)
w = np.array([4,5])    # w has shape (2,)
# To compute an outer product, we first reshape v to be a column
# vector of shape (3, 1); we can then broadcast it against w to yield
# an output of shape (3, 2), which is the outer product of v and w:
# [[ 4  5]
#  [ 8 10]
#  [12 15]]
print(np.reshape(v, (3, 1)) * w)

# Add a vector to each row of a matrix
x = np.array([[1,2,3], [4,5,6]])
# x has shape (2, 3) and v has shape (3,) so they broadcast to (2, 3),
# giving the following matrix:
# [[2 4 6]
#  [5 7 9]]
print(x + v)

# Add a vector to each column of a matrix
# x has shape (2, 3) and w has shape (2,).
# If we transpose x then it has shape (3, 2) and can be broadcast
# against w to yield a result of shape (3, 2); transposing this result
# yields the final result of shape (2, 3) which is the matrix x with
# the vector w added to each column. Gives the following matrix:
# [[ 5  6  7]
#  [ 9 10 11]]
print((x.T + w).T)
# Another solution is to reshape w to be a column vector of shape (2, 1);
# we can then broadcast it directly against x to produce the same
# output.
print(x + np.reshape(w, (2, 1)))

# Multiply a matrix by a constant:
# x has shape (2, 3). Numpy treats scalars as arrays of shape ();
# these can be broadcast together to shape (2, 3), producing the
# following array:
# [[ 2  4  6]
#  [ 8 10 12]]
print(x * 2)

[[ 4  5]
 [ 8 10]
 [12 15]]
[[2 4 6]
 [5 7 9]]
[[ 5  6  7]
 [ 9 10 11]]
[[ 5  6  7]
 [ 9 10 11]]
[[ 2  4  6]
 [ 8 10 12]]
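
Another common application, sketched here with made-up data, is centering a matrix by subtracting per-column or per-row means. The per-row case relies on `keepdims=True` so that the means keep shape `(2, 1)` and broadcast against the rows:

```python
import numpy as np

x = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

# Column means have shape (3,), which broadcasts against (2, 3).
col_means = x.mean(axis=0)                   # [2.5 3.5 4.5]
centered = x - col_means
print(centered)                              # [[-1.5 -1.5 -1.5]
                                             #  [ 1.5  1.5  1.5]]

# Row means need shape (2, 1) to broadcast along the columns.
row_means = x.mean(axis=1, keepdims=True)    # [[2.] [5.]]
print(x - row_means)                         # [[-1.  0.  1.]
                                             #  [-1.  0.  1.]]
```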
