## NumPy
NumPy is the fundamental package for scientific computing with Python. 

NumPy’s main object is the homogeneous multidimensional array.  
It is a table of elements (usually numbers), all of the **same type**, indexed by a tuple of non-negative integers.  
In NumPy, dimensions are called **axes**.

### Concept of axis

Reference: 手把手打開資料分析大門 (roughly, "Opening the Door to Data Analysis, Step by Step")
https://www.slideshare.net/tw_dsconf/python-83977705/61

In [1]:
# one axis
[1, 3, 5]  # one axis holding 3 elements, so its length is 3.

# 2 axes
[[ 1, 3, 5],
 [ 2, 4, 6]]  # The first axis has a length of 2, the second axis has a length of 3.

[[1, 3, 5], [2, 4, 6]]
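
To connect the nested lists above with NumPy, the following minimal sketch wraps them in `np.array` and shows how each axis corresponds to one entry of `shape`. The variable names (`one_axis`, `two_axes`, `col_sums`, `row_sums`) are illustrative and do not come from the original notebook.

```python
import numpy as np

one_axis = np.array([1, 3, 5])
two_axes = np.array([[1, 3, 5],
                     [2, 4, 6]])

print(one_axis.shape)  # (3,)
print(two_axes.shape)  # (2, 3)

# Reducing along axis 0 collapses the rows (one result per column);
# reducing along axis 1 collapses the columns (one result per row).
col_sums = two_axes.sum(axis=0)
row_sums = two_axes.sum(axis=1)
print(col_sums)  # [ 3  7 11]
print(row_sums)  # [ 9 12]
```

The axis passed to a reduction is the one that disappears from the result's shape.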

NumPy’s array class is called **ndarray**. It is also known by the alias **array**.  
Note that `numpy.array` is not the same as the Python standard library class `array.array`, which only handles one-dimensional arrays and offers less functionality.
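
As a rough illustration of that difference, the sketch below builds one object of each kind and applies `*` to both. The names `py_arr`, `np_arr`, `doubled`, and `repeated` are made up for this example.

```python
import array
import numpy as np

py_arr = array.array('i', [1, 2, 3])        # standard library: one axis only
np_arr = np.array([[1, 2, 3], [4, 5, 6]])   # NumPy: any number of axes

print(type(py_arr))  # <class 'array.array'>
print(type(np_arr))  # <class 'numpy.ndarray'>
print(np_arr.ndim)   # 2

# On an ndarray, `*` is elementwise arithmetic.
doubled = np_arr * 2
print(doubled)  # [[ 2  4  6]
                #  [ 8 10 12]]

# On array.array, `*` repeats the sequence, as it does for a list.
repeated = py_arr * 2
print(repeated)  # array('i', [1, 2, 3, 1, 2, 3])
```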

### Create ndarray

In [2]:
import numpy as np

# create an array with one axis
x = np.arange(3)

print(x)
print(type(x))  # <class 'numpy.ndarray'>

# check if ndarray type
isinstance(x, np.ndarray)  # True

# specify the element type explicitly with dtype
y = np.arange(3, dtype='float64')  # [ 0.  1.  2.]
print(y)

[0 1 2]
<class 'numpy.ndarray'>
[ 0.  1.  2.]


In [3]:
import numpy as np

existed_list = [18, 15, 21, 10, 88, 76, 29, 20]

np_array = np.array(existed_list)
print(np_array)  # [18 15 21 10 88 76 29 20]

[18 15 21 10 88 76 29 20]


### Important attributes of ndarray

In [4]:
import numpy as np

x = np.arange(3)
print(x)

# ndim - the number of axes (dimensions) of the array.
print(x.ndim)  # 1 dim

# shape - the dimensions of the array. 
# This is a tuple of integers indicating the size of the array in each dimension.
print(x.shape)  # (3,)

# size - the total number of elements of the array. 
print(x.size)  # 3

# dtype - the type of the elements in the array.
print(x.dtype)  # int64 (the default integer type can differ by platform)

[0 1 2]
1
(3,)
3
int64
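
The same attributes are easier to tell apart on an array with more than one axis. The sketch below uses a hypothetical 3 x 4 array `m` and fixes the dtype explicitly, so the printed type does not depend on the platform.

```python
import numpy as np

m = np.arange(12, dtype=np.float64).reshape(3, 4)

print(m.ndim)   # 2        -> number of axes
print(m.shape)  # (3, 4)   -> length of each axis
print(m.size)   # 12       -> total number of elements
print(m.dtype)  # float64  -> element type

# size is always the product of the entries in shape
total = m.shape[0] * m.shape[1]
print(total == m.size)  # True
```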


### Axes reshape
Gives a new shape to an array without changing its data.

In [5]:
# reshape
x = np.arange(6)
print(x)  # [0 1 2 3 4 5]

new_shape = x.reshape(2, 3)
print(new_shape) # [[0 1 2]
                 #  [3 4 5]]

# equivalently
new_shape = np.reshape(x, (2, 3))

# create and reshape in one line
y = np.arange(6).reshape(2, 3)

[0 1 2 3 4 5]
[[0 1 2]
 [3 4 5]]
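
Two details of `reshape` are worth seeing in action: one axis length can be given as `-1` and inferred from the total size, and for a contiguous array such as the result of `np.arange`, the reshaped array is a view that shares memory with the original. The sketch below is only an illustration; `inferred` and `grid` are hypothetical names.

```python
import numpy as np

x = np.arange(6)

# -1 asks NumPy to infer that axis length from the total size (6 / 2 = 3).
inferred = x.reshape(-1, 2)
print(inferred.shape)  # (3, 2)

# Here reshape returns a view, so writing through `grid` also changes `x`.
grid = x.reshape(2, 3)
grid[0, 0] = 99
print(x[0])                       # 99
print(np.shares_memory(x, grid))  # True

# A shape whose total size differs from x.size raises ValueError.
try:
    x.reshape(4, 2)
except ValueError as exc:
    print("cannot reshape:", exc)
```

When an independent array is needed, calling `.copy()` on the reshaped result avoids the shared memory.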


### Initial placeholder content

In [6]:
# np.zeros - full of zeros
np.zeros(3)  # array([ 0.,  0.,  0.])

np.zeros((2, 3))  # array([[ 0.,  0.,  0.],
                  #        [ 0.,  0.,  0.]])

    
# np.ones - full of ones
np.ones((2,3))  # array([[ 1.,  1.,  1.],
                #        [ 1.,  1.,  1.]])

    
# np.identity - a square array with ones on the main diagonal
np.identity(3)  # array([[ 1.,  0.,  0.],
                #        [ 0.,  1.,  0.],
                #        [ 0.,  0.,  1.]])


# By default, the dtype of the created array is float64.
# Pass dtype to choose a different type, for example:
# np.zeros(3, dtype=np.int16)

array([[ 1.,  0.,  0.],
       [ 0.,  1.,  0.],
       [ 0.,  0.,  1.]])
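
The `dtype` argument mentioned in the last comment can be tried directly. The sketch below also shows `np.full` and `np.ones_like`, two related NumPy constructors that the cell above does not cover; the variable names are illustrative.

```python
import numpy as np

# Override the float64 default with an explicit dtype.
int_zeros = np.zeros(3, dtype=np.int16)
print(int_zeros)        # [0 0 0]
print(int_zeros.dtype)  # int16

# np.full fills a new array with an arbitrary constant.
sevens = np.full((2, 2), 7)
print(sevens)  # [[7 7]
               #  [7 7]]

# The *_like functions take shape and dtype from an existing array.
template = np.arange(4, dtype=np.float32).reshape(2, 2)
like = np.ones_like(template)
print(like.shape, like.dtype)  # (2, 2) float32
```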

### Array Index

Array indexing refers to any use of square brackets (`[ ]`) to select array values.

In [7]:
import numpy as np

# 1-D array
x = np.arange(6)  # array([0, 1, 2, 3, 4, 5])
x[2]   # 2
x[-2]  # 4


# 2-D array
x = np.arange(6).reshape(2, 3)  #[[0, 1, 2],
                                # [3, 4, 5]])
x[0, 2]   # 2
x[1, -1]  # 5

5

### Array Slice & Stride

Slicing and striding use the same `start:stop:step` syntax as Python lists, and can be applied to each axis of a multidimensional array. Unlike a list slice, which is a copy, a NumPy slice is a view on the original data.

In [8]:
import numpy as np

# 1-D array
x = np.arange(6)  # array([0, 1, 2, 3, 4, 5])
x[1:5]   # [1, 2, 3, 4]
x[:2]    # [0, 1]
x[1:5:2] # [1, 3]


# 2-D array
x = np.arange(6).reshape(2, 3)  #[[0, 1, 2],
                                # [3, 4, 5]])
x[0, 0:2]    #  [0, 1]
x[:, 1:]     # [[1, 2],
             #  [4, 5]]
x[::1, ::2]  # [[0, 2],
             #  [3, 5]]

array([[0, 2],
       [3, 5]])
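
One behavior that differs from lists is what happens when a slice is modified: a list slice is an independent copy, while a NumPy slice is a view on the same data. The following sketch, with made-up names, shows both side by side.

```python
import numpy as np

py_list = [0, 1, 2, 3, 4, 5]
list_slice = py_list[1:4]
list_slice[0] = 99
print(py_list)  # [0, 1, 2, 3, 4, 5]   unchanged: the slice was a copy

arr = np.arange(6)
arr_slice = arr[1:4]
arr_slice[0] = 99
print(arr)      # [ 0 99  2  3  4  5]  changed: the slice is a view

# .copy() gives an independent array when that is what is wanted.
safe = arr[1:4].copy()
safe[0] = -1
print(arr[1])   # 99
```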

### Boolean / Mask Index 
Boolean arrays must be of the same shape as the initial dimensions of the array being indexed.

In [9]:
import numpy as np

# 1-D array
x = np.arange(6)  # array([0, 1, 2, 3, 4, 5])
condition = x < 3
x[condition]      # [0, 1, 2]

x[condition] = 0
x                 # [0, 0, 0, 3, 4, 5]


# Why is it called a mask?
# The boolean array overlays the data: only positions marked True are selected.
# original x      # [ 0,    1,    2,   3,    4,    5]
# where x < 3, assign 0
print(condition)  # [ True  True  True False False False]
x                 # [ 0,    0,    0,   3,    4,    5]

[ True  True  True False False False]


array([0, 0, 0, 3, 4, 5])
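
The shape rule stated at the top of this section can be seen on a 2-D array. In the sketch below (names are illustrative), a mask with the full shape picks individual elements, a mask matching only the first axis picks whole rows, and `np.where` is shown as a non-mutating alternative to masked assignment.

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)  # [[0, 1, 2],
                                   #  [3, 4, 5]]

# Full-shape mask: selects elements and returns them as a 1-D array.
even = grid % 2 == 0
print(grid[even])  # [0 2 4]

# Mask matching only the first axis: selects whole rows.
row_mask = np.array([False, True])
print(grid[row_mask])  # [[3 4 5]]

# np.where builds a new array and leaves `grid` untouched.
clipped = np.where(grid < 3, 0, grid)
print(clipped)  # [[0 0 0]
                #  [3 4 5]]
```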

### Concatenate
Join a sequence of arrays along an existing axis.

In [10]:
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([[7, 8, 9]])

np.concatenate((a, b), axis=0)  # [[1, 2, 3],
                                #  [4, 5, 6],
                                #  [7, 8, 9]]

c = [[0], [0]]  # a nested list is accepted and treated as shape (2, 1)
np.concatenate((a, c), axis=1)  # [[1, 2, 3, 0],
                                #  [4, 5, 6, 0]]

array([[1, 2, 3, 0],
       [4, 5, 6, 0]])
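
Concatenation only works when the arrays agree on every axis except the one being joined. The sketch below reuses `a` and `b` from the cell above to show one join that succeeds, one that fails, and the `axis=None` variant that flattens the inputs first; `flat` is a hypothetical name.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)
b = np.array([[7, 8, 9]])             # shape (1, 3)

# axis=0 works: both arrays have 3 columns.
print(np.concatenate((a, b), axis=0).shape)  # (3, 3)

# axis=1 fails: the row counts (2 and 1) differ.
try:
    np.concatenate((a, b), axis=1)
except ValueError as exc:
    print("concatenate failed:", exc)

# axis=None flattens every input before joining.
flat = np.concatenate((a, b), axis=None)
print(flat)  # [1 2 3 4 5 6 7 8 9]
```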

### Basic Operations

In [11]:
import numpy as np
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

print(a + b)  # array([[6, 8], [10, 12]])
print(a - b)  # array([[-4, -4], [-4, -4]])
print(a * b)  # array([[5, 12], [21, 32]])
print(a / b)  # array([[0.2, 0.33333333], [0.42857143, 0.5]])

print(a - 1)  # array([[0, 1], [2, 3]])
print(a * 2)  # array([[2, 4], [6, 8]])

[[ 6  8]
 [10 12]]
[[-4 -4]
 [-4 -4]]
[[ 5 12]
 [21 32]]
[[ 0.2         0.33333333]
 [ 0.42857143  0.5       ]]
[[0 1]
 [2 3]]
[[2 4]
 [6 8]]
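
All of the operators above act element by element, which means `*` is not the matrix product. The sketch below contrasts `*` with the `@` operator and adds a small broadcasting case in which a 1-D array is combined with every row; `row` is an illustrative name.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

print(a * b)  # elementwise product
              # [[ 5 12]
              #  [21 32]]

print(a @ b)  # matrix product
              # [[19 22]
              #  [43 50]]

# Broadcasting: the length-2 array is applied to each row of `a`.
row = np.array([10, 20])
print(a + row)  # [[11 22]
                #  [13 24]]
```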


### Basic Linear Algebra

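Transpose: the transpose of an m \* n matrix is an n \* m matrix whose rows are the columns of the original.  
Inverse: an n \* n matrix A is invertible if there exists an n \* n matrix B such that AB = BA = I; B is then the inverse of A.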

In [12]:
import numpy as np

# transpose
a = np.array([[0, 1], 
              [2, 3]])

print(a.T)  #[[0, 2],
            # [1, 3]]

# inverse
inverse = np.linalg.inv(a)
print(inverse)             # [[-1.5, 0.5], 
                           #  [1,    0]]

# dot product (for 2-D arrays this is the matrix product)
print(np.dot(a, inverse))  # [[ 1.  0.]
                           #  [ 0.  1.]]

[[0 2]
 [1 3]]
[[-1.5  0.5]
 [ 1.   0. ]]
[[ 1.  0.]
 [ 0.  1.]]
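
The definition AB = BA = I can be checked numerically. Because the inverse is computed in floating point, the sketch below compares against the identity with `np.allclose` rather than `==`, and it also shows what happens for a matrix that has no inverse. The `singular` matrix is a made-up example.

```python
import numpy as np

a = np.array([[0, 1],
              [2, 3]])
inverse = np.linalg.inv(a)

# Check both products against the identity, within a tolerance.
print(np.allclose(a @ inverse, np.eye(2)))  # True
print(np.allclose(inverse @ a, np.eye(2)))  # True

# The second row is twice the first, so the determinant is 0.
singular = np.array([[1, 2],
                     [2, 4]])
try:
    np.linalg.inv(singular)
except np.linalg.LinAlgError as exc:
    print("not invertible:", exc)
```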


### Vector Stacking

In [13]:
import numpy as np

a = np.array([[0, 1], 
              [2, 3]])

b = np.array([[4, 5], 
              [6, 7]])

c = np.array([[8,  9], 
              [10, 11]])

# vertical
v = np.vstack((a, b, c))
print(v.shape)  # (6, 2)
print(v)

# horizontal
h = np.hstack((a, b, c))
print(h.shape)  # (2, 6)
print(h)

# stack 
s = np.stack([a, b, c], axis=0)
print(s.shape)  # (3, 2, 2)
print(s)

(6, 2)
[[ 0  1]
 [ 2  3]
 [ 4  5]
 [ 6  7]
 [ 8  9]
 [10 11]]
(2, 6)
[[ 0  1  4  5  8  9]
 [ 2  3  6  7 10 11]]
(3, 2, 2)
[[[ 0  1]
  [ 2  3]]

 [[ 4  5]
  [ 6  7]]

 [[ 8  9]
  [10 11]]]
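
For 2-D inputs like these, `vstack` and `hstack` give the same result as `concatenate` along axis 0 and axis 1 respectively, while `stack` adds a new axis whose position is set by `axis`. The sketch below checks both points with the same three arrays.

```python
import numpy as np

a = np.array([[0, 1], [2, 3]])
b = np.array([[4, 5], [6, 7]])
c = np.array([[8, 9], [10, 11]])

# vstack / hstack join along an existing axis.
same_v = np.array_equal(np.vstack((a, b, c)), np.concatenate((a, b, c), axis=0))
same_h = np.array_equal(np.hstack((a, b, c)), np.concatenate((a, b, c), axis=1))
print(same_v, same_h)  # True True

# stack creates a new axis of length 3 at the requested position.
print(np.stack([a, b, c], axis=0).shape)   # (3, 2, 2)
print(np.stack([a, b, c], axis=1).shape)   # (2, 3, 2)
print(np.stack([a, b, c], axis=-1).shape)  # (2, 2, 3)
```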


## Exercises

In [14]:
import numpy as np

In [15]:
# Create a vector with values from 10 to 49
Z = np.arange(10, 50)
print(Z)

[10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34
 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49]


In [16]:
# Reverse a vector so that the first element becomes the last
Z = np.arange(50)
Z = Z[::-1]
print(Z)

[49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25
 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0]


In [17]:
# Create a 3x3x3 array of random values in [0, 1)
Z = np.random.random((3, 3, 3))
print(Z)

[[[ 0.67564188  0.09083812  0.71603631]
  [ 0.46063409  0.1201112   0.7253742 ]
  [ 0.59102889  0.73559909  0.5441022 ]]

 [[ 0.11115843  0.67817467  0.01560134]
  [ 0.70915343  0.54937783  0.66858778]
  [ 0.73977152  0.57440465  0.32654832]]

 [[ 0.07959621  0.13005374  0.80079367]
  [ 0.61201666  0.34747518  0.68889982]
  [ 0.57648818  0.03267974  0.9140913 ]]]


In [18]:
# Find the minimum and maximum of a 10x10 random array
Z = np.random.random((10, 10))
Zmin, Zmax = Z.min(), Z.max()
print(Zmin, Zmax)

0.00700726147983 0.967893477194


In [19]:
# Surround an existing array with a border of zeros
Z = np.ones((3, 3))
Z = np.pad(Z, pad_width=1, mode='constant', constant_values=0)
print(Z)

[[ 0.  0.  0.  0.  0.]
 [ 0.  1.  1.  1.  0.]
 [ 0.  1.  1.  1.  0.]
 [ 0.  1.  1.  1.  0.]
 [ 0.  0.  0.  0.  0.]]


In [20]:
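Z = np.random.random((5, 5))
# Z2: divide by the maximum, so the largest value becomes 1
Z2 = Z / Z.max()
# Z1: min-max normalization, so the values span exactly 0 to 1
Zmax, Zmin = Z.max(), Z.min()
Z1 = (Z - Zmin) / (Zmax - Zmin)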
print(Z1)
print(Z2)

[[ 0.64044462  1.          0.33549354  0.62608569  0.79055145]
 [ 0.76212792  0.42360372  0.89677917  0.41086149  0.        ]
 [ 0.12873463  0.25682202  0.96319046  0.48846749  0.57964635]
 [ 0.29389396  0.15668918  0.14826989  0.64431352  0.10519078]
 [ 0.33226473  0.09995767  0.1282338   0.6657058   0.53270008]]
[[ 0.65080249  1.          0.35463627  0.6368572   0.79658513]
 [ 0.76898041  0.44020822  0.8997527   0.42783306  0.02880744]
 [ 0.15383355  0.27823108  0.96425085  0.50320343  0.59175566]
 [ 0.31423507  0.18098281  0.17280605  0.65455993  0.13096794]
 [ 0.35150048  0.12588559  0.15334715  0.67533596  0.5461618 ]]


In [21]:
# Negate, in place, every element greater than 3 and at most 8
Z = np.arange(11)
Z[(3 < Z) & (Z <= 8)] *= -1
print(Z)

[ 0  1  2  3 -4 -5 -6 -7 -8  9 10]


In [22]:
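A = np.array([3, 4, 6, 10, 24, 89, 45, 43, 46, 99, 100])

# elements that are not divisible by 3
not_div3 = A[A % 3 != 0]
print(not_div3)

# elements divisible by 5
div5 = A[A % 5 == 0]
print(div5)

# elements divisible by both 3 and 5
div15 = A[(A % 3 == 0) & (A % 5 == 0)]
print(div15)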

[  4  10  89  43  46 100]
[ 10  45 100]
[45]


In [23]:
# Replace the maximum value of a random vector with 0
Z = np.random.random(10)
Z[Z.argmax()] = 0
print(Z)

[ 0.37198232  0.25117761  0.68209449  0.04650935  0.67797053  0.41547822
  0.35110955  0.26371556  0.          0.56822811]


In [24]:
# Build a 5x5 matrix whose rows are 0 to 4; the 1-D range is broadcast to every row
Z = np.zeros((5, 5))
Z += np.arange(5)
print(Z)

[[ 0.  1.  2.  3.  4.]
 [ 0.  1.  2.  3.  4.]
 [ 0.  1.  2.  3.  4.]
 [ 0.  1.  2.  3.  4.]
 [ 0.  1.  2.  3.  4.]]


### See [100-numpy-exercises](https://github.com/rougier/numpy-100/blob/master/100%20Numpy%20exercises.md) for more NumPy practice