<img src="NiceStart.jpeg" width=400 height=400 align="center"/>

# NumPy

NumPy (sometimes written Numpy) is a linear algebra library for Python. It matters for data science with Python because almost all of the libraries in the PyData ecosystem rely on NumPy as one of their main building blocks.

NumPy is also very fast, because it has bindings to C libraries. For more on why you would use arrays instead of lists, see this [StackOverflow post](http://stackoverflow.com/questions/993984/why-numpy-instead-of-python-lists).

In [159]:
# Display the output of every expression in a cell, not only the last one
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity='all'

In [160]:
import numpy as np

## Creating NumPy Arrays

Numpy arrays essentially come in two flavors: vectors and matrices. Vectors are strictly 1-d arrays and matrices are 2-d (but you should note a matrix can still have only one row or one column).

Trick: to tell how many dimensions an array has, count the opening square brackets at the start of its printed output.


Differences between `np.array` and a Python list:<br/>
* An array cannot mix data types; every element shares one dtype, like a vector in R
* Arrays offer more built-in functions and methods than lists (e.g. `.max()`)
* Higher-dimensional arrays exist (e.g. a matrix is a two-dimensional array)
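
A small sketch of these differences, assuming NumPy is imported as `np`. The variable names are illustrative only and do not appear elsewhere in this notebook.

```python
import numpy as np

# Mixing ints and floats: NumPy upcasts so every element shares one dtype.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)                   # float64

py_list = [1, 2, 3]
np_arr = np.array(py_list)

# The same operator means different things for a list and an array.
print(py_list * 2)                   # list repetition: [1, 2, 3, 1, 2, 3]
print(np_arr * 2)                    # element-wise product: [2 4 6]

# Arrays carry numeric methods that plain lists do not have.
print(np_arr.max(), np_arr.mean())   # 3 2.0
```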

### From a Python List
We can create an array by directly converting a list or list of lists

In [161]:
my_list = [1,2,3]
my_list

[1, 2, 3]

In [162]:
np.array(my_list)

array([1, 2, 3])

In [163]:
my_matrix = [[1,2,3],[4,5,6],[7,8,9]]
my_matrix

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

In [164]:
np.array(my_matrix)  # each inner list becomes one row

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

### Built-in Methods
There are lots of built-in ways to generate Arrays
#### arange
Return evenly spaced values within a given interval

In [165]:
np.arange(10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [166]:
np.arange(0,10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [167]:
np.arange(0,11,2)

array([ 0,  2,  4,  6,  8, 10])
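
As a further sketch, `arange` also accepts non-integer and negative steps. The names `quarters` and `countdown` below are made up for illustration.

```python
import numpy as np

# A non-integer step works, although floating-point rounding can make the
# element count hard to predict for some step values.
quarters = np.arange(0, 1, 0.25)
print(quarters)        # values 0, 0.25, 0.5, 0.75

# A negative step counts down; the stop value is still excluded.
countdown = np.arange(10, 0, -2)
print(countdown)       # values 10, 8, 6, 4, 2
```

When you need an exact number of points rather than an exact step, `linspace` is usually the safer choice.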

#### zeros and ones
Generate arrays of zeros or ones

In [168]:
np.zeros(3)

array([0., 0., 0.])

In [169]:
np.zeros((3,4))

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [170]:
np.ones(3)

array([1., 1., 1.])

In [171]:
np.ones((3,4))

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

#### linspace
Return evenly spaced numbers over a specified interval

In [172]:
np.linspace(0,10,3)

array([ 0.,  5., 10.])

In [173]:
np.linspace(0,10,20)

array([ 0.        ,  0.52631579,  1.05263158,  1.57894737,  2.10526316,
        2.63157895,  3.15789474,  3.68421053,  4.21052632,  4.73684211,
        5.26315789,  5.78947368,  6.31578947,  6.84210526,  7.36842105,
        7.89473684,  8.42105263,  8.94736842,  9.47368421, 10.        ])
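
Unlike `arange`, `linspace` includes the end point by default and takes the number of points rather than the step. The sketch below, with illustrative variable names, shows the `retstep` and `endpoint` parameters.

```python
import numpy as np

# retstep=True also returns the spacing between consecutive points.
points, step = np.linspace(0, 10, 5, retstep=True)
print(points)          # values 0, 2.5, 5, 7.5, 10
print(step)            # 2.5

# endpoint=False leaves out the stop value, like arange does.
open_end = np.linspace(0, 10, 5, endpoint=False)
print(open_end)        # values 0, 2, 4, 6, 8
```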

#### eye
Creates an identity matrix

In [174]:
np.eye(4)

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])
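
A brief sketch of why the identity matrix is useful: multiplying by it leaves a matrix unchanged. The `k` parameter, shown afterwards, shifts the diagonal. The names `a`, `identity` and `upper` are illustrative.

```python
import numpy as np

a = np.arange(9).reshape(3, 3)
identity = np.eye(3)

# Matrix-multiplying by the identity returns the same values.
print(np.array_equal(a @ identity, a))   # True

# k=1 places the ones on the diagonal just above the main one.
upper = np.eye(3, k=1)
print(upper)
```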

### Random Number
Numpy also has lots of ways to create random number arrays
#### rand
Create an array of the given shape and populate it with random samples from a uniform distribution over [0, 1), i.e. 0 is included and 1 is excluded

In [175]:
np.random.rand(3)

array([0.97928677, 0.80058103, 0.10942878])

In [176]:
np.random.rand(3,4)

array([[0.32222765, 0.98389761, 0.67516037, 0.89825841],
       [0.42444744, 0.4764231 , 0.94662101, 0.06248068],
       [0.46823505, 0.23671653, 0.25618383, 0.38190445]])

#### randn
Return samples from the standard normal distribution

In [177]:
np.random.randn(2)

array([-0.19565799,  0.37675517])

In [178]:
np.random.randn(3,4)

array([[-0.35306524,  0.88404574,  0.03970243,  1.38776232],
       [-0.54703924, -0.71338695, -0.15135416,  1.65505114],
       [-0.53095802, -1.29168894,  0.9343582 , -0.84680923]])

#### randint
Return random integers from `low` (inclusive) to `high` (exclusive)

In [179]:
np.random.randint(1,50)

36

In [180]:
np.random.randint(1,50,10)

array([26, 42,  4, 23, 11, 37,  9,  8, 34, 10])
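
The random outputs above change on every run. If you need reproducible numbers, one option is a seeded generator. The sketch below assumes NumPy 1.17 or later, where `np.random.default_rng` is available; the seed value 42 is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=42)       # arbitrary seed, chosen for illustration

uniform = rng.random((3, 4))               # uniform over [0, 1), like rand
normal = rng.standard_normal((3, 4))       # standard normal, like randn
integers = rng.integers(1, 50, size=10)    # low inclusive, high exclusive, like randint

# A second generator with the same seed reproduces the same numbers.
repeat = np.random.default_rng(seed=42).random((3, 4))
print(np.array_equal(uniform, repeat))     # True
```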

## Array Attributes and Methods

In [181]:
arr = np.arange(25)
ranarr = np.random.randint(0,50,10)

In [182]:
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24])

In [183]:
ranarr

array([32, 39, 45,  2, 36, 42, 27, 24,  9, 18])

### Reshape
Return an array containing the same data with a new shape

In [184]:
arr.reshape(5,5)

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])

In [185]:
ranarr.reshape(1,-1)

array([[32, 39, 45,  2, 36, 42, 27, 24,  9, 18]])
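
In the cell above, `-1` asks NumPy to infer that dimension from the array's size. A minimal sketch with illustrative names, which also shows what happens when the requested shape does not fit:

```python
import numpy as np

values = np.arange(12)

# -1 means "work this dimension out": 12 elements / 3 rows = 4 columns.
grid = values.reshape(3, -1)
print(grid.shape)              # (3, 4)

# The new shape must hold exactly the same number of elements.
try:
    values.reshape(5, 5)       # 25 slots for 12 elements
except ValueError as err:
    print("reshape failed:", err)
```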

### max,min,argmax,argmin
These methods find the maximum or minimum value. To get the index location of that value instead, use argmax or argmin

In [186]:
ranarr
ranarr.max()
ranarr.min()

array([32, 39, 45,  2, 36, 42, 27, 24,  9, 18])

45

2

In [187]:
ranarr.argmax()
ranarr.argmin()

2

3
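
On a 2-d array these methods also accept an `axis` argument. The sketch below uses a made-up `scores` array to show the difference.

```python
import numpy as np

scores = np.array([[3, 9, 1],
                   [7, 2, 8]])

print(scores.max())            # 9, the maximum over the whole array
print(scores.max(axis=0))      # per-column maxima: 7, 9, 8
print(scores.argmax(axis=1))   # column index of each row's maximum: 1, 2

# Without an axis, argmax returns an index into the flattened array.
flat_index = scores.argmax()
position = np.unravel_index(flat_index, scores.shape)   # back to (row, column)
print(flat_index, position)
```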

### Shape
Shape is an attribute that arrays have

In [188]:
#Vector
arr.shape

(25,)

In [189]:
# Notice the two sets of brackets
arr.reshape(1,-1)

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15,
        16, 17, 18, 19, 20, 21, 22, 23, 24]])

In [190]:
arr.reshape(1,-1).shape

(1, 25)

In [191]:
arr.reshape(-1,1).shape

(25, 1)

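When an array is 1-d, NumPy automatically treats it as a row vector or a column vector, as needed, during matrix multiplication.<br/>
When it is 2-d, you need to make sure the matrix and vector dimensions are compatible for multiplication.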
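
The note above about 1-d and 2-d shapes in matrix multiplication can be sketched as follows. The arrays `m`, `v` and `w` are illustrative.

```python
import numpy as np

m = np.arange(6).reshape(2, 3)   # shape (2, 3)
v = np.array([1, 2, 3])          # shape (3,)
w = np.array([1, 2])             # shape (2,)

# A 1-d array is used as a column on the right and as a row on the left.
print(m @ v)                     # shape (2,): 8, 26
print(w @ m)                     # shape (3,): 6, 9, 12

# A 2-d operand must have a matching inner dimension.
col = v.reshape(-1, 1)           # shape (3, 1)
print((m @ col).shape)           # (2, 1)

row = v.reshape(1, -1)           # shape (1, 3)
try:
    m @ row                      # (2, 3) @ (1, 3) does not line up
except ValueError as err:
    print("matmul failed:", err)
```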

### dtype
Get the data type of the elements in the array

In [192]:
starr = np.array(["good","nice","lucky"])

In [193]:
starr.dtype

dtype('<U5')

In [194]:
arr.dtype

dtype('int64')
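
The default integer dtype depends on the platform and NumPy version, so `int64` above is what this machine produced. A short sketch, with illustrative names, of setting or converting the dtype explicitly:

```python
import numpy as np

ints = np.arange(5)
floats = ints.astype(np.float64)              # astype returns a converted copy
small = np.array([1, 2, 3], dtype=np.int8)    # choose the dtype at creation time

print(floats.dtype)      # float64
print(small.dtype)       # int8

# Converting floats to ints drops the fractional part (truncates toward zero).
truncated = np.array([1.9, -1.9]).astype(int)
print(truncated)         # values 1, -1
```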

## NumPy Indexing and Selection

In [195]:
#Creating a sample array
arr = np.arange(0,11)
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

### Bracket Indexing and Selection
The simplest way to pick one or more elements of an array looks very similar to indexing Python lists

In [196]:
arr[8]
arr[1:5]
arr[0:-1]

8

array([1, 2, 3, 4])

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [197]:
arr[:]
arr[::-1]

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

array([10,  9,  8,  7,  6,  5,  4,  3,  2,  1,  0])

### Broadcasting
NumPy arrays differ from normal Python lists because of their ability to broadcast, similar to how vectors created with c() behave in R

In [198]:
#Setting a value with index range (Broadcasting)
arr[0:5] = 100
arr

array([100, 100, 100, 100, 100,   5,   6,   7,   8,   9,  10])
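
For contrast, here is a sketch of what a plain list does with the same assignment, plus broadcasting between two arrays of different shapes. The names are illustrative.

```python
import numpy as np

values = list(range(11))
try:
    values[0:5] = 100            # a list slice needs an iterable on the right
except TypeError as err:
    print("list assignment failed:", err)

# Broadcasting also stretches arrays against each other:
# a (3, 1) column plus a (3,) row gives a (3, 3) grid.
col = np.array([[0], [10], [20]])
row = np.array([1, 2, 3])
grid = col + row
print(grid)
```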

In [199]:
arr = np.arange(0,11)
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [200]:
slice_of_arr = arr[0:6]
slice_of_arr

array([0, 1, 2, 3, 4, 5])

In [201]:
slice_of_arr[:] = 100 
slice_of_arr

array([100, 100, 100, 100, 100, 100])

In [202]:
arr

array([100, 100, 100, 100, 100, 100,   6,   7,   8,   9,  10])

**Note that the changes also occur in our original array: slice_of_arr is just a view of arr, so the data is not copied.**

In [203]:
# To get a copy, you need to be explicit
arr_copy = arr.copy()
arr_copy

array([100, 100, 100, 100, 100, 100,   6,   7,   8,   9,  10])

In [204]:
arr_copy[:] = 666
arr_copy

array([666, 666, 666, 666, 666, 666, 666, 666, 666, 666, 666])

In [205]:
arr

array([100, 100, 100, 100, 100, 100,   6,   7,   8,   9,  10])
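
If you are unsure whether two arrays share data, `np.shares_memory` can check. A minimal sketch with illustrative names:

```python
import numpy as np

base = np.arange(0, 11)
view = base[0:6]             # slicing returns a view
independent = base.copy()    # copy() allocates new memory

print(np.shares_memory(base, view))          # True
print(np.shares_memory(base, independent))   # False

view[0] = 100                # writes through to base
independent[1] = -1          # leaves base untouched
print(base[0], base[1])      # 100 1
```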

### Indexing a 2D array (matrices)

In [206]:
arr_2d = np.array([[5,10,15],[20,25,30],[35,40,45]])
arr_2d
arr_2d.shape

array([[ 5, 10, 15],
       [20, 25, 30],
       [35, 40, 45]])

(3, 3)

In [207]:
#Indexing row
arr_2d[1]

arr_2d[1,:]

array([20, 25, 30])

array([20, 25, 30])

In [208]:
#Indexing individual element value
#Method 1
arr_2d[1][0]

#Method 2 (recommended)
arr_2d[1,0]

20

20

In [209]:
#Slicing
arr_2d[:2,1:]

array([[10, 15],
       [25, 30]])

*Fancy Indexing:* <br/>
Allows you to select entire rows or columns out of order

In [216]:
#set up matrix
arr_2d = np.zeros((5,5))

arr_2d[0] = np.arange(5)
arr_2d[:,0] = np.arange(5)

arr_2d

array([[0., 1., 2., 3., 4.],
       [1., 0., 0., 0., 0.],
       [2., 0., 0., 0., 0.],
       [3., 0., 0., 0., 0.],
       [4., 0., 0., 0., 0.]])

In [217]:
arr_2d[[1,4,3,0]]

array([[1., 0., 0., 0., 0.],
       [4., 0., 0., 0., 0.],
       [3., 0., 0., 0., 0.],
       [0., 1., 2., 3., 4.]])

In [218]:
arr_2d[:,[1,4,3,0]]

array([[1., 4., 3., 0.],
       [0., 0., 0., 1.],
       [0., 0., 0., 2.],
       [0., 0., 0., 3.],
       [0., 0., 0., 4.]])
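
Two details of fancy indexing are worth a sketch: it returns a copy rather than a view, and passing two index lists pairs them up element by element. The `grid` array below is illustrative; `np.ix_` is the NumPy helper for selecting a full row-by-column block.

```python
import numpy as np

grid = np.arange(25).reshape(5, 5)

# Fancy indexing copies the data, so editing the result leaves grid alone.
rows = grid[[1, 4]]
rows[:] = -1
print(grid[1, 0])                        # still 5

# Two index lists are paired: elements (0, 1) and (2, 3).
print(grid[[0, 2], [1, 3]])              # values 1, 13

# np.ix_ selects the block where rows 0 and 2 meet columns 1 and 3.
block = grid[np.ix_([0, 2], [1, 3])]
print(block)                             # [[1, 3], [11, 13]]
```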

### Selection
Let's briefly go over how to use brackets for selection based on comparison operators

In [213]:
arr = np.arange(1,11)
arr

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [214]:
arr[arr > 4]

array([ 5,  6,  7,  8,  9, 10])

In [215]:
arr[arr % 2 == 0]

array([ 2,  4,  6,  8, 10])
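
The comparison inside the brackets produces a boolean array, and conditions can be combined with `&`, `|` and `~`. A short sketch with illustrative names; the parentheses are required because these operators bind more tightly than the comparisons.

```python
import numpy as np

nums = np.arange(1, 11)

mask = nums > 4                              # boolean array, one entry per element
print(mask)

both = nums[(nums > 4) & (nums % 2 == 0)]    # greater than 4 AND even
either = nums[(nums < 3) | (nums > 8)]       # less than 3 OR greater than 8
print(both)                                  # values 6, 8, 10
print(either)                                # values 1, 2, 9, 10

# np.where keeps the array's shape and fills in a value where the test fails.
clipped = np.where(nums > 4, nums, 0)
print(clipped)                               # values 0, 0, 0, 0, 5, 6, 7, 8, 9, 10
```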

## NumPy Operations

### Arithmetic
You can easily perform array with array arithmetic, or scalar with array arithmetic

In [227]:
arr = np.arange(1,10)
arr

array([1, 2, 3, 4, 5, 6, 7, 8, 9])

In [228]:
arr + arr

array([ 2,  4,  6,  8, 10, 12, 14, 16, 18])

In [229]:
arr * arr

array([ 1,  4,  9, 16, 25, 36, 49, 64, 81])

In [230]:
arr/arr

array([1., 1., 1., 1., 1., 1., 1., 1., 1.])

In [231]:
1/arr

array([1.        , 0.5       , 0.33333333, 0.25      , 0.2       ,
       0.16666667, 0.14285714, 0.125     , 0.11111111])

In [226]:
arr ** 3

array([  1,   8,  27,  64, 125, 216, 343, 512, 729])
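
One case to watch is division when the array contains zero. NumPy emits a warning instead of raising an error and fills in `nan` or `inf`. The sketch below uses an illustrative array and `np.errstate` to silence the warnings locally.

```python
import numpy as np

with_zero = np.arange(0, 5)

with np.errstate(divide="ignore", invalid="ignore"):
    ratio = with_zero / with_zero    # 0/0 becomes nan
    inverse = 1 / with_zero          # 1/0 becomes inf

print(ratio[0], inverse[0])          # nan inf
print(ratio[1:], inverse[1:])        # the remaining entries are ordinary floats
```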

### Universal Array Functions
NumPy comes with many [universal array functions](http://docs.scipy.org/doc/numpy/reference/ufuncs.html), which are essentially mathematical functions that apply an operation element by element across the array.

In [234]:
#Taking Square Roots
np.sqrt(arr)

array([1.        , 1.41421356, 1.73205081, 2.        , 2.23606798,
       2.44948974, 2.64575131, 2.82842712, 3.        ])

In [235]:
#Calculating exponential
np.exp(arr)

array([2.71828183e+00, 7.38905610e+00, 2.00855369e+01, 5.45981500e+01,
       1.48413159e+02, 4.03428793e+02, 1.09663316e+03, 2.98095799e+03,
       8.10308393e+03])

In [237]:
np.max(arr) #Same as arr.max()

9

In [239]:
np.log(arr)

array([0.        , 0.69314718, 1.09861229, 1.38629436, 1.60943791,
       1.79175947, 1.94591015, 2.07944154, 2.19722458])
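
Universal functions also come in two-argument form and keep the shape of their input. A brief sketch with illustrative arrays:

```python
import numpy as np

a = np.array([1, 4, 9])
b = np.array([2, 3, 10])

# A binary ufunc compares or combines the two arrays element by element.
print(np.maximum(a, b))              # values 2, 4, 10

# The result has the same shape as the input, whatever that shape is.
column = a.reshape(3, 1)
print(np.sqrt(column).shape)         # (3, 1)

# Every ufunc has a reduce method; for np.add it matches a.sum().
print(np.add.reduce(a), a.sum())     # 14 14
```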