<table align="left">
  <td>
    <a href="https://colab.research.google.com/github/bioinfo-gao/Classical_ML/blob/main/1_data_computations_with_numpy/1_intro_to_Numpy_for_data_computation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
  </td>
  <td>
</table>

*This notebook was created by [Zhen Gao]*

<a name='0'></a>
# Intro to NumPy for Data Computations

This lab covers data computations with NumPy. NumPy is a scientific computing library for Python that makes numerical computations on arrays convenient.

In this lab, you will learn to:

 * [1. Create a NumPy array](#1)
 * [2. Select data: indexing and slicing of array](#2)
 * [3. Perform mathematical and other basic operations](#3)
 * [4. Perform basic statistics](#4)
 * [5. Manipulate data](#5)

If you are using Google Colab, you do not need to install NumPy. You only have to import it:

`import numpy as np`

If you are using local Jupyter notebooks, make sure you have it installed already.

<a name='1'></a>
## 1. Creating an Array in NumPy

An array can be a vector, a matrix, or a higher-dimensional array. A vector is a one-dimensional array and a matrix is a two-dimensional array; arrays with three or more dimensions are usually called n-dimensional arrays.

In [1]:
## Importing numpy

import numpy as np

#### Highlights of Arrays and Tensors by ZG

+ Arrays are the most common data structure used in Python for numerical computations. 

+ Arrays include 1-dimensional, 2-dimensional, and n-dimensional arrays. 

+ NumPy is one of the most popular libraries for numerical computations in Python.  

+ Arrays are comparable to the tensors used in deep learning frameworks (PyTorch and TensorFlow).

+ Tensors also have dimensions: 1-dimensional, 2-dimensional, and n-dimensional tensors.

+ Tensors and arrays are mathematically different concepts, but the terms are often used interchangeably in deep learning.

+ In practice, the main difference is where they are used: tensors are the data structure of deep learning frameworks, 
while NumPy arrays are used for general numerical computations.
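The dimensions mentioned in the list above can be inspected directly on any array. The following is a small illustrative sketch (the variable names are made up for this example, not taken from the notebook) that prints the `ndim`, `shape`, and `dtype` attributes of arrays with 1, 2, and 3 dimensions:

```python
import numpy as np

# Hypothetical example arrays with 1, 2, and 3 dimensions
vector = np.array([1, 2, 3])
matrix = np.array([[1, 2, 3], [4, 5, 6]])
cube = np.zeros((2, 3, 4))

for name, arr in [("vector", vector), ("matrix", matrix), ("cube", cube)]:
    # ndim is the number of dimensions; shape is the size along each dimension
    print(name, "ndim:", arr.ndim, "shape:", arr.shape, "dtype:", arr.dtype)
```
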

In [None]:
## Creating a simple 1-dimensional array (a vector)
np.array([1,2,3,4,5])

array([1, 2, 3, 4, 5])

In [5]:
## Creating a 2-dimensional array (a matrix)

np.array([(1,2,3,4,5), (6,7,8,9,10)])

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10]])

In [7]:
## Creating an array from a list

num_list = [1,2,3,4,5]

np.array(num_list)

array([1, 2, 3, 4, 5])

In [8]:
print(np.array(num_list))

[1 2 3 4 5]


In [None]:
# Create a 3D array of zeros: 2 layers x 5 rows x 5 columns
zeros_3d = np.zeros((2, 5, 5))         # all 0s
print("\n--- Zeros 3D Array Shape ---")
print(zeros_3d.shape)

# Create a 4D array: 10 batches x 3 layers (channels) x 32 rows x 32 columns
# This layout is common in deep learning, e.g. 10 RGB images of 32x32 pixels
ones_4d = np.ones((10, 3, 32, 32))     # all 1s
print("\n--- Ones 4D Array Shape ---")
print(ones_4d.shape)


--- Zeros 3D Array Shape ---
(2, 5, 5)

--- Ones 4D Array Shape ---
(10, 3, 32, 32)


In [None]:
import numpy as np

# 1. Create the sequence 1 to 100
# np.arange(1, 101) creates an array from 1 to 100 (the stop value is excluded)
data = np.arange(1, 101)

print("Original data shape:", data.shape) # (100,)
print("-" * 30) # print a separator line


Original data shape: (100,)
------------------------------


In [None]:
# --- 2. Create a 3D array ---

# The shape must satisfy a * b * c = 100. Here we choose (5, 5, 4):
# 5 layers x 5 rows x 4 columns = 100 elements
shape_3d = (5, 5, 4) 
array_3d = data.reshape(shape_3d)

print("3D Array Shape:", array_3d.shape) # (5, 5, 4)
print("3D Array (first two layers):")
print(array_3d[:2]) 
print("-" * 30)

3D Array Shape: (5, 5, 4)
3D Array (first two layers):
[[[ 1  2  3  4]
  [ 5  6  7  8]
  [ 9 10 11 12]
  [13 14 15 16]
  [17 18 19 20]]

 [[21 22 23 24]
  [25 26 27 28]
  [29 30 31 32]
  [33 34 35 36]
  [37 38 39 40]]]
------------------------------


In [None]:
# --- 3. Create a 4D array ---

# The shape must satisfy a * b * c * d = 100. Here we choose (2, 5, 5, 2):
# 2 batches x 5 layers x 5 rows x 2 columns = 100 elements
shape_4d = (2, 5, 5, 2)
array_4d = data.reshape(shape_4d)

print("4D Array Shape:", array_4d.shape) # (2, 5, 5, 2)
print("4D Array (first layer of the first batch):")
print(array_4d[0, 0]) # elements of the first layer of the first batch
print("-" * 30)


4D Array Shape: (2, 5, 5, 2)
4D Array (first layer of the first batch):
[[ 1  2]
 [ 3  4]
 [ 5  6]
 [ 7  8]
 [ 9 10]]
------------------------------


In [None]:
# --- 4. Select elements by index ---

# Select a single element from the 3D array.
# Index (2, 3, 1) means layer 2, row 3, column 1 (indices start at 0),
# i.e. the 3rd layer, 4th row, 2nd column.
# Its position in the original data is (2*5*4) + (3*4) + 1 = 40 + 12 + 1 = 53,
# so the element is data[53] = 54.
print("3rd layer, all rows, all columns:")
print(array_3d[2]) 
print("-" * 30)

index_3d = (2, 3, 1)
element_3d = array_3d[index_3d]

print(f"Element selected from the 3D array (index {index_3d}):")
print(f"Element value: {element_3d}")
print("-" * 30)


3rd layer, all rows, all columns:
[[41 42 43 44]
 [45 46 47 48]
 [49 50 51 52]
 [53 54 55 56]
 [57 58 59 60]]
------------------------------
Element selected from the 3D array (index (2, 3, 1)):
Element value: 54
------------------------------


In [18]:

# Select a single element from the 4D array.
# Index (0, 3, 4, 0) means batch 0, layer 3, row 4, column 0 (indices start at 0).
# Its position in the original data is (0*50) + (3*10) + (4*2) + 0 = 38,
# so the element is data[38] = 39.
print("1st batch, 4th layer, all rows, all columns:")
print(array_4d[0,3]) 
print("-" * 30)


index_4d = (0, 3, 4, 0)
element_4d = array_4d[index_4d]

print(f"Element selected from the 4D array (index {index_4d}):")
print(f"Element value: {element_4d}")

1st batch, 4th layer, all rows, all columns:
[[31 32]
 [33 34]
 [35 36]
 [37 38]
 [39 40]]
------------------------------
Element selected from the 4D array (index (0, 3, 4, 0)):
Element value: 39
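
The position of an element in the original 1D data can be checked from its multi-dimensional index. As a small sketch (not part of the original cells), `np.ravel_multi_index` converts an index tuple into a flat position and `np.unravel_index` does the reverse, assuming NumPy's default row-major (C) order:

```python
import numpy as np

data = np.arange(1, 101)            # values 1..100
array_3d = data.reshape(5, 5, 4)
array_4d = data.reshape(2, 5, 5, 2)

# Flat position of index (2, 3, 1) in a (5, 5, 4) array: 2*20 + 3*4 + 1 = 53
flat_3d = np.ravel_multi_index((2, 3, 1), array_3d.shape)
print(flat_3d, data[flat_3d], array_3d[2, 3, 1])

# Flat position of index (0, 3, 4, 0) in a (2, 5, 5, 2) array: 0*50 + 3*10 + 4*2 + 0 = 38
flat_4d = np.ravel_multi_index((0, 3, 4, 0), array_4d.shape)
print(flat_4d, data[flat_4d], array_4d[0, 3, 4, 0])

# Going the other way: which 3D index holds flat position 53?
index_back = np.unravel_index(53, array_3d.shape)
print(index_back)
```
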


### 1.1 Generating Arrays

NumPy offers several functions for generating arrays, depending on your needs, such as:

* Generating an identity array
* Generating an array of zeros of a given size
* Generating an array of ones of a given size
* Generating an array over a given range
* Generating an array with random values


In [19]:
## Generating an identity array 

identity_array = np.identity(4)
print(identity_array)

[[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]]


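`np.identity(n)` creates an identity matrix of size n x n. `np.eye(n)` returns the same matrix, but `np.eye` is more general: it also accepts a number of columns (for non-square results) and a diagonal offset `k`.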

In [21]:
# Square matrix: np.identity(4) and np.eye(4) are equivalent
print("np.identity(4):")
print(np.identity(4))
print("\nnp.eye(4):")
print(np.eye(4))


np.identity(4):
[[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]]

np.eye(4):
[[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]]


In [22]:

# Non-square matrix
print("\nnp.eye(2, 5):")
print(np.eye(2, 5))



np.eye(2, 5):
[[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]]


In [23]:

# Diagonal offset
print("\nnp.eye(4, k=1):")
print(np.eye(4, k=1))
print("\nnp.eye(4, k=-1):")
print(np.eye(4, k=-1))



np.eye(4, k=1):
[[0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]
 [0. 0. 0. 0.]]

np.eye(4, k=-1):
[[0. 0. 0. 0.]
 [1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]]


In [24]:

# Specify the dtype
print("\nnp.eye(3, dtype=int):")
print(np.eye(3, dtype=int))


np.eye(3, dtype=int):
[[1 0 0]
 [0 1 0]
 [0 0 1]]


In [None]:
## Generating an identity matrix (1s on the main diagonal)
# For a square matrix, np.eye and np.identity give the same result

np.eye(4)

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])

In [26]:
# You can multiply with any constant

np.eye(4) * 7

array([[7., 0., 0., 0.],
       [0., 7., 0., 0.],
       [0., 0., 7., 0.],
       [0., 0., 0., 7.]])

In [27]:
# Generating zero array of a given size
# 1 dimensional zero array
np.zeros(5)

array([0., 0., 0., 0., 0.])

In [None]:
# Creating a 2-dimensional zero array: pass a tuple with the number of rows and columns
# np.zeros((rows, columns))

np.zeros((5,6))

array([[0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.]])

In [29]:
# Generating ones array of a given size
# 1 dimensional one array

np.ones(5)

array([1., 1., 1., 1., 1.])

In [30]:
# Creating a 2-dimensional ones array: pass a tuple with the number of rows and columns
# np.ones((rows, columns))

np.ones((5,6))

array([[1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1.]])

In [31]:
## Generating an array in a given range or interval

np.arange(0,5)

array([0, 1, 2, 3, 4])

In [14]:
## If you want to control the step size

np.arange(0,20,2)

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])

In [15]:
## You can also use linspace to generate evenly spaced numbers over a given interval

np.linspace(0,20,5)

array([ 0.,  5., 10., 15., 20.])

In [None]:

np.linspace(0,100,5) # arithmetic sequence: 5 numbers spanning 0 to 100, with 4 intervals between them

array([  0.,  25.,  50.,  75., 100.])

In [None]:
np.linspace(0,10,10) # 10 numbers have 9 intervals between them

array([ 0.        ,  1.11111111,  2.22222222,  3.33333333,  4.44444444,
        5.55555556,  6.66666667,  7.77777778,  8.88888889, 10.        ])

In [None]:
np.linspace(0,10,11) # 11 numbers have 10 intervals between them

array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.])
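
The spacing in `np.linspace` follows from `(stop - start) / (num - 1)`, because both endpoints are included by default. As a small illustration, `retstep=True` returns the step alongside the array, and `np.arange` with that step produces the same values as long as its stop value lies beyond the last point:

```python
import numpy as np

# 11 points from 0 to 10 inclusive -> 10 intervals of width 1.0
points, step = np.linspace(0, 10, 11, retstep=True)
print(step)

# 10 points from 0 to 10 inclusive -> 9 intervals of width 10/9
_, step_9 = np.linspace(0, 10, 10, retstep=True)
print(step_9)

# np.arange excludes the stop value, so stop=11 is needed to reach 10
print(np.arange(0, 11, 1))
```
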

In [33]:
## Generating an array with random values
# Create a 1D array with 4 random numbers

np.random.rand(4)

array([0.11655377, 0.12693173, 0.63112944, 0.92938675])

In [None]:
np.random.rand(4)

# We will NOT get the same values

array([0.44371504, 0.90432934, 0.54703702, 0.2923613 ])

In [35]:
np.random.rand(4,5)

array([[0.53506065, 0.03423894, 0.13901229, 0.55267991, 0.3255945 ],
       [0.41655887, 0.21239268, 0.34545019, 0.89703206, 0.40701225],
       [0.85341896, 0.7367693 , 0.61615398, 0.55667015, 0.38743268],
       [0.40793122, 0.53289593, 0.96101842, 0.58164344, 0.87592115]])

In [None]:
np.random.rand(2,3,4,5) # generate a 4D array with random values between 0 and 1

array([[[[0.36414937, 0.32168149, 0.67132955, 0.29972905, 0.17914346],
         [0.69791812, 0.10197262, 0.65781328, 0.62649562, 0.27322106],
         [0.25508026, 0.00819287, 0.9600838 , 0.47943193, 0.07743098],
         [0.31399504, 0.62306826, 0.53422514, 0.90283386, 0.13469619]],

        [[0.20683858, 0.68678721, 0.08383997, 0.93233672, 0.16457511],
         [0.0526936 , 0.28934074, 0.14721155, 0.08065422, 0.45816505],
         [0.20382484, 0.92746226, 0.11294466, 0.81648976, 0.97019855],
         [0.12207582, 0.30000193, 0.24689782, 0.99527165, 0.85249933]],

        [[0.65987183, 0.5219687 , 0.80214209, 0.08548365, 0.12696269],
         [0.62533803, 0.08361198, 0.5317037 , 0.51083862, 0.0623106 ],
         [0.057591  , 0.92122157, 0.36156678, 0.40297267, 0.16993175],
         [0.55156914, 0.72351845, 0.58565898, 0.27079268, 0.95974654]]],


       [[[0.57076773, 0.0139942 , 0.35215824, 0.28929615, 0.82692962],
         [0.9584955 , 0.33637976, 0.45164173, 0.45987053, 0.46701112]

In [None]:
### Generate ONE random integer in a given range

np.random.randint(5,50) # random integer from 5 (inclusive) up to 50 (exclusive), i.e. 5 to 49

5

In [None]:
### Generate one random integer in a given range
np.random.randint(5,50)  # the result changes on every run, e.g. 17


17

In [None]:
### Generate one random integer in a given range and store it in a variable

random_value = np.random.randint(5,50)   # a single integer, not an array
print(random_value) # e.g. 7

7


In [43]:
import numpy as np

# Use print() to display the generated random number
random_number = np.random.randint(5, 50)
print(random_number) 

# Or print it directly
print(np.random.randint(5, 50))



9
26


In [41]:
### Generate 10 random integers in a given range

np.random.randint(5,50,10)

array([11, 23, 48, 47, 31, 31, 33, 20,  8, 28], dtype=int32)

In [45]:
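## A random seed makes the random values identical on every run
## Note: this cell uses Python's built-in random module, whose randint includes both endpoints;
## random.seed does not affect np.random
import random

random.seed(10)

random.randint(5,50)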

41
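
Python's `random.seed` does not affect NumPy's random functions. For reproducible NumPy output, one option is a seeded `Generator` created with `np.random.default_rng`. The sketch below assumes an arbitrary seed of 10; note that the upper bound of `integers` is exclusive by default, as it is for `np.random.randint`.

```python
import numpy as np

# Two generators created with the same (arbitrary) seed produce the same values
rng_a = np.random.default_rng(10)
rng_b = np.random.default_rng(10)

sample_a = rng_a.integers(5, 50, size=10)   # integers in [5, 50)
sample_b = rng_b.integers(5, 50, size=10)

print(sample_a)
print(np.array_equal(sample_a, sample_b))   # True: same seed, same sequence
```
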

<a name='2'></a>
## 2. Data Selection: Indexing and slicing an Array

Indexing: selecting individual elements from an array.

Slicing: selecting a group of elements from an array.


### 2.1 1D Array Indexing and Selection

In [47]:
# Creating a 1 dimensional vector

array_1d = np.array([1,2,3,4,5])

In [49]:
## Indexing: selecting an element from an array

array_1d[1]

np.int64(2)

In [48]:

array_1d [-1]

np.int64(5)

In [50]:
# Slicing: returning a group of elements from an array

array_1d [2:4]

array([3, 4])

### 2.2 2D Array Indexing and Selection

In [51]:
## Indexing 2D array

array_2d = np.array([[1,2,3],[4,5,6],[7,8,9]])

In [52]:
## Selecting individual element
## array_2d[row][column]
## let's select 5..that is row 1, column 1 (we start from 0!!)

array_2d[1][1]

np.int64(5)

In [None]:
## Selecting an individual element with a single pair of brackets
## array_2d[row, column]
## let's select 5 again: row 1, column 1 (we start from 0!!)

array_2d[1,1] # equivalent to array_2d[1][1], and the more idiomatic form

np.int64(5)

In [54]:
# let's select 9..that is row 2, column 2

array_2d[2][2]

np.int64(9)

In [55]:
## Selecting whole row
#array_2d[row]

array_2d[1]

array([4, 5, 6])

In [56]:
## Selecting group of elements in 2D array
## array_2d[rows, columns]..You select rows and columns

## Let's select the first two rows
## Rows :2 denotes that we are selecting all rows up to, but not including, index 2.
## Columns : denotes that all columns are selected.


array_2d[:2,:]

array([[1, 2, 3],
       [4, 5, 6]])

In [None]:
array_2d[:2:] # still works: a single slice start:stop:step with an empty step, same as array_2d[:2]

array([[1, 2, 3],
       [4, 5, 6]])

In [58]:

## Selecting the first two rows and first two columns

array_2d[:2,0:2]

array([[1, 2],
       [4, 5]])

In [59]:
## array_2d[0:2,:] is the same as array_2d[:2,:]: the first two rows, all columns

array_2d[0:2,:]

array([[1, 2, 3],
       [4, 5, 6]])

In [35]:
## This returns all rows and all columns, so it is the same as the original array
array_2d[0:3,:]

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [36]:
## return the third row (index 2)

array_2d[2,:]

array([7, 8, 9])

In [None]:
## return all rows from index 2 onward

array_2d[2:] # without the comma this is a slice, so the result stays 2D

array([[7, 8, 9]])

In [61]:
## return the third column (index 2)

array_2d[:,2]

array([3, 6, 9])

In [62]:
## return the last two columns

array_2d[:,1:3]

array([[2, 3],
       [5, 6],
       [8, 9]])

In [63]:
## return the first column

array_2d[:,0]

array([1, 4, 7])

In [64]:
## return the first row

array_2d[0,:]

array([1, 2, 3])

Indexing or selecting from a 2D array may seem confusing at first, but it becomes natural with practice. Selecting an entire row means selecting every column of that row, and selecting an entire column means selecting every row of that column.

As shown below, we select the first row (index 0), and the `:` in the column position selects all columns.

```
array_2d[0,:]
```
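
One detail worth checking is the shape of each result: indexing with an integer drops that dimension, while slicing keeps it. A short sketch, using the same values as `array_2d` above:

```python
import numpy as np

array_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

row = array_2d[2, :]       # integer index on rows -> 1D array
rows = array_2d[2:]        # slice on rows -> still a 2D array
col = array_2d[:, 2]       # integer index on columns -> 1D array
col_2d = array_2d[:, 2:3]  # slice on columns -> 2D column

print(row.shape, rows.shape, col.shape, col_2d.shape)
```
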





### 2.3 Conditional selection

You can use a condition to select values in an array. Let's use comparison operators to select the values. 

In [3]:
## Let's create an array
import numpy as np
arr= np.array(([1,2,3],[4,5,6],[7,8,9]))

#### Select <span style="color: magenta">  all elements </span> in an array which are less than 6


In [4]:

arr[arr <6 ]

array([1, 2, 3, 4, 5])

In [5]:
## Select all elements in an array which are greater than 6

arr[arr > 6]

array([7, 8, 9])

In [6]:
## Select all even numbers in an array

arr[arr % 2 ==0 ]

array([2, 4, 6, 8])

In [7]:
## Select all odd numbers in an array

arr[arr % 2 !=0 ]

array([1, 3, 5, 7, 9])

In [8]:
## You can also have multiple conditions

## Among the odd numbers, return the values which are greater than or equal to 5


arr[(arr % 2 !=0 ) & (arr >=5) ]

array([5, 7, 9])

In [9]:
## A comparison on its own returns a boolean array: True where the condition is met, False elsewhere

arr > 5

array([[False, False, False],
       [False, False,  True],
       [ True,  True,  True]])

In [48]:
## We do not have 0 in our array

arr == 0

array([[False, False, False],
       [False, False, False],
       [False, False, False]])
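
A comparison such as `arr > 5` produces a boolean mask, and that mask can be reused for counting, selecting, or replacing values. The following is a minimal sketch; `np.where` is used here as one common way to replace the values that fail the condition:

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

mask = arr > 5                       # boolean array with the same shape as arr
count = mask.sum()                   # True counts as 1, so this counts matches
selected = arr[mask]                 # 1D array of the matching values
replaced = np.where(mask, arr, 0)    # keep matches, set everything else to 0

print(count)
print(selected)
print(replaced)
```
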

<a name='3'></a>
## 3. Basic Array Operations

### 3.1 Quick Arithmetic operation: Addition, Subtraction, Multiplication, Division, Squaring

In [49]:
# Let's create two arrays

arr1 = np.arange(0,5)
arr2 = np.arange(6,11)

In [50]:
## Addition

arr1 + arr2

array([ 6,  8, 10, 12, 14])

In [51]:
## Subtraction

arr2 - arr1

array([6, 6, 6, 6, 6])

In [52]:
## Multiplication

arr1 * arr2

array([ 0,  7, 16, 27, 40])

In [53]:
## Division

arr1 / arr2

array([0.        , 0.14285714, 0.25      , 0.33333333, 0.4       ])

In [54]:
## Squaring

arr1 ** 2

array([ 0,  1,  4,  9, 16])

### 3.2 Universal functions

NumPy universal functions (`ufunc`s) compute math, trigonometric, logical, and comparison operations element by element, such as sin, cos, tan, exponent (exp), log, square, greater, less, etc.
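
The arithmetic operators from section 3.1 are shorthand for the corresponding universal functions. A small check, assuming the same `arr1` and `arr2` values as used in this section:

```python
import numpy as np

arr1 = np.arange(0, 5)
arr2 = np.arange(6, 11)

# Each operator dispatches to the matching ufunc, so the results are identical
same_add = np.array_equal(arr1 + arr2, np.add(arr1, arr2))
same_mul = np.array_equal(arr1 * arr2, np.multiply(arr1, arr2))
same_sq = np.array_equal(arr1 ** 2, np.square(arr1))

print(same_add, same_mul, same_sq)
```
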

In [55]:
## creating two arrays 

arr1 = np.arange(0,5)
arr2 = np.arange(6,11)

In [56]:
## Calculating the sum of two arrays

np.add(arr1, arr2)

array([ 6,  8, 10, 12, 14])

In [57]:
## Calculating the product of two arrays

np.multiply(arr1, arr2)

array([ 0,  7, 16, 27, 40])

In [58]:
## Calculating the difference between two arrays

np.subtract(arr1, arr2)

array([-6, -6, -6, -6, -6])

In [59]:
## Calculating the division of two arrays

np.divide(arr1, arr2)

array([0.        , 0.14285714, 0.25      , 0.33333333, 0.4       ])

In [60]:
## Calculating the sin of arr1

np.sin(arr1)

array([ 0.        ,  0.84147098,  0.90929743,  0.14112001, -0.7568025 ])

In [61]:
np.sin([0,45,90,180])

array([ 0.        ,  0.85090352,  0.89399666, -0.80115264])

In [62]:
## Calculating the cosine of arr 1

np.cos(arr1)

array([ 1.        ,  0.54030231, -0.41614684, -0.9899925 , -0.65364362])

In [63]:
np.cos([0,45,90,180])

array([ 1.        ,  0.52532199, -0.44807362, -0.59846007])
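
NumPy's trigonometric functions expect angles in radians, so the values above are the sine and cosine of 45, 90, and 180 radians rather than degrees. If degrees are intended, one option is to convert with `np.deg2rad` first:

```python
import numpy as np

angles_deg = np.array([0, 45, 90, 180])
angles_rad = np.deg2rad(angles_deg)   # degrees -> radians

sines = np.sin(angles_rad)
cosines = np.cos(angles_rad)

# Rounded to hide tiny floating-point residue close to zero
print(np.round(sines, 4))
print(np.round(cosines, 4))
```
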

In [64]:
## Calculating the tangent(tan) of the array

np.tan(arr2)

array([-0.29100619,  0.87144798, -6.79971146, -0.45231566,  0.64836083])

In [65]:
## Calculating the natural logarithm (log) of the array

np.log(arr2)

array([1.79175947, 1.94591015, 2.07944154, 2.19722458, 2.30258509])

In [66]:
## Calculating the exponent(exp or e^) of the array

np.exp(arr2)

array([  403.42879349,  1096.63315843,  2980.95798704,  8103.08392758,
       22026.46579481])

In [67]:
## Calculating the power of the array
## Each element of arr1 is raised to the matching element of arr2: 0^6=0, 1^7=1, 2^8=256, etc.

np.power(arr1, arr2)

array([      0,       1,     256,   19683, 1048576])

In [68]:
## Comparison operations return true or false
## Every element of arr1 is less than the matching element of arr2, so greater returns False

np.greater(arr1, arr2)

array([False, False, False, False, False])

In [69]:
## Comparison operations return true or false
## Every element of arr1 is less than the matching element of arr2, so less returns True

np.less(arr1, arr2)

array([ True,  True,  True,  True,  True])

<a name='4'></a>
## 4. Basic Statistics

With NumPy, we can compute basic statistics such as the standard deviation (std), variance (var), mean, median, minimum value, and maximum value of an array. 

More about NumPy statistics: https://numpy.org/doc/stable/reference/routines.statistics.html#order-statistics

In [70]:
## Creating an array 

arr = np.arange(0,5)
arr

array([0, 1, 2, 3, 4])

### 4.1 Standard Deviation

In [71]:
## calculating the standard deviation of the array
## The standard deviation measures how much the elements typically deviate from the mean of the array

np.std(arr)

1.4142135623730951

In [72]:
arr2 = np.array([[3,4], [5,6]])

np.std(arr2)

1.118033988749895

In [73]:
## Specifying the axis
## By default, the std is computed over the flattened array (all values together)
## axis=0 computes it down each column; axis=1 computes it across each row

np.std(arr2, axis=0)

array([1., 1.])

In [74]:
np.std(arr2, axis=1)

array([0.5, 0.5])

### 4.2 Variance

In [75]:
## Calculating the Variance (var)

arr = np.arange(0,5)

np.var(arr)

2.0

In [76]:
np.var(arr2)

1.25
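
Variance is the square of the standard deviation. By default both functions divide by `N` (population statistics); passing `ddof=1` divides by `N - 1` instead, which gives the sample estimate. A short sketch:

```python
import numpy as np

arr = np.arange(0, 5)   # [0, 1, 2, 3, 4]

pop_std = np.std(arr)              # divides by N
pop_var = np.var(arr)              # divides by N
sample_var = np.var(arr, ddof=1)   # divides by N - 1

print(np.isclose(pop_var, pop_std ** 2))
print(pop_var, sample_var)
```
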

### 4.3 Mean

In [77]:
## Calculating the mean of the array

np.mean(arr)

2.0

In [78]:
## For an unweighted array, np.average gives the same result as np.mean (np.average also accepts weights)
np.average(arr)

2.0

### 4.4 Median

In [79]:
## Calculating the median of the array

np.median(arr)

2.0

### 4.5 Minimum and Maximum

In [80]:
## Calculating the minimum value

np.min(arr)

0

In [81]:
## Calculating the maximum value

np.max(arr)

4

<a name='5'></a>
## 5. Data Manipulation

Data manipulation is an important step in a machine learning project. Let's look at some of the NumPy methods and functions which are useful for data manipulation. 

### 5.1 Shape of the array

In [82]:
## Creating an array 

arr1 = np.arange(0,10)
arr2 = np.array(([1,2,3],[4,5,6],[7,8,9]))

In [83]:
arr1

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [84]:
arr2

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [85]:
np.shape(arr1)

(10,)

In [86]:
np.shape(arr2)

(3, 3)

In [87]:
arr2.shape

(3, 3)

### 5.2 Shaping the Array

`np.reshape(array_name, (rows, columns))` or `array_name.reshape(rows, columns)` changes the shape of an array. The new shape has to conform to the existing data: the total number of elements must stay the same. For example, you can convert a (3,3) array into (1,9), but you can't convert it into (5,5). 
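
To illustrate the size constraint, the sketch below reshapes a 9-element array, lets NumPy infer one dimension with `-1`, and shows that an incompatible shape raises a `ValueError`:

```python
import numpy as np

arr = np.arange(1, 10)            # 9 elements

square = arr.reshape(3, 3)        # 3 * 3 = 9, so this works
inferred = arr.reshape(1, -1)     # -1 asks NumPy to infer that dimension
print(square.shape, inferred.shape)

try:
    arr.reshape(5, 5)             # 5 * 5 = 25 != 9
except ValueError as err:
    print("reshape failed:", err)
```
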

In [88]:
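### arr1 has shape (10,): a 1D array with 10 elements. Let's reshape it into (5,2)
### The shape is passed positionally; the newshape= keyword is deprecated in recent NumPy releases
np.reshape(arr1, (5,2))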

array([[0, 1],
       [2, 3],
       [4, 5],
       [6, 7],
       [8, 9]])

In [89]:
## This would also work
arr1.reshape(5,2)

array([[0, 1],
       [2, 3],
       [4, 5],
       [6, 7],
       [8, 9]])

In [90]:
arr2_reshaped = arr2.reshape(9,1)
arr2_reshaped.T

array([[1, 2, 3, 4, 5, 6, 7, 8, 9]])

In [91]:
arr2_reshaped.reshape(3,3)

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [92]:
## np.resize can also change the shape of an array to a specific size
## (unlike reshape, it repeats the data if the new size is larger)

np.resize(arr2, (1,9))

array([[1, 2, 3, 4, 5, 6, 7, 8, 9]])

### 5.3 Copying array

In [93]:
arr1 = np.arange(0,10)
arr1

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [94]:
arr1_copy = arr1.copy()
arr1_copy

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [95]:
## Copying the values of one array into the other 

## Let's copy array 2 into 1 --they have the same shape

arr1 = np.arange(0,6)
arr2 = np.arange(6,12)

In [96]:
## arr1 is destination, arr2 is source
np.copyto(arr1, arr2)

In [97]:
arr1

array([ 6,  7,  8,  9, 10, 11])
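
The reason to use `copy()` is that plain assignment and slicing do not duplicate the data. A minimal sketch of the difference (the variable names are illustrative):

```python
import numpy as np

original = np.arange(0, 5)

alias = original               # same array object, no data copied
view = original[:3]            # a slice is a view onto the same data
independent = original.copy()  # a separate array with its own data

original[0] = 99               # modify the original in place

print(alias[0], view[0], independent[0])
print(np.shares_memory(original, view), np.shares_memory(original, independent))
```
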

### 5.4 Joining arrays

In [98]:
### Creating two arrays

arr1 = np.array([[1,2,3],[4,5,6],[7,8,9]])
arr2 = np.array([[10,11,12]])

In [99]:
## Joining them

np.concatenate((arr1, arr2))

array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])

In [100]:
## Transposing arr2
## arr2.T is transpose operation

np.concatenate((arr1, arr2.T), axis=1)

array([[ 1,  2,  3, 10],
       [ 4,  5,  6, 11],
       [ 7,  8,  9, 12]])

In [101]:
### Setting axis to None flattens the arrays before joining them

np.concatenate((arr1, arr2), axis=None)

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12])

In [102]:
### Joining two 1D arrays into a 2D array: stacking

# Column stacking

arr1 = np.arange(0,6)
arr2 = np.arange(6,12)

np.column_stack((arr1, arr2))

array([[ 0,  6],
       [ 1,  7],
       [ 2,  8],
       [ 3,  9],
       [ 4, 10],
       [ 5, 11]])

In [103]:
## Row stacking 
## np.vstack is used here; np.row_stack is an alias for it that is deprecated as of NumPy 2.0

np.vstack((arr1, arr2))

array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11]])

### 5.5 Splitting arrays

In [104]:
arr1 = np.arange(0,6)
arr1

array([0, 1, 2, 3, 4, 5])

In [105]:
### Splitting the array into two arrays

np.split(arr1, 2)

[array([0, 1, 2]), array([3, 4, 5])]

In [106]:
### Splitting the array into three arrays

np.split(arr1, 3)

[array([0, 1]), array([2, 3]), array([4, 5])]
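
`np.split` requires the array to divide evenly and raises a `ValueError` otherwise. When sections of unequal size are acceptable, `np.array_split` is an alternative:

```python
import numpy as np

arr1 = np.arange(0, 6)

try:
    np.split(arr1, 4)              # 6 elements cannot be split into 4 equal parts
except ValueError as err:
    print("np.split failed:", err)

parts = np.array_split(arr1, 4)    # sections of sizes 2, 2, 1, 1
print(parts)
```
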

### 5.6 Adding and repeating elements in an array

In [107]:
arr1 = np.arange(0,6)
arr1

array([0, 1, 2, 3, 4, 5])

In [108]:
## Adding the values at the end of the array
np.append(arr1,7)

array([0, 1, 2, 3, 4, 5, 7])

In [109]:
### Repeating an array: np.tile repeats the whole array, while np.repeat repeats each element

arr = np.array([[1,2,3]])
np.tile(arr, 3)

array([[1, 2, 3, 1, 2, 3, 1, 2, 3]])

In [110]:
np.repeat(arr,3)

array([1, 1, 1, 2, 2, 2, 3, 3, 3])

### 5.7 Sorting elements in an array

In [111]:
arr = np.array([[1,2,3,4,5,3,2,1,3,5,6,7,7,5,9,5]])

np.sort(arr)

array([[1, 1, 2, 2, 3, 3, 3, 4, 5, 5, 5, 5, 6, 7, 7, 9]])

In [112]:
## Finding the unique elements in an array

arr = np.array([[1,2,3,4,5,3,2,1,3,5,6,7,7,5,9,5]])

np.unique(arr)

array([1, 2, 3, 4, 5, 6, 7, 9])

### 5.8 Reversing an array

In [113]:
## You can also flip the array

arr = np.array([[1,2,3],[4,5,6],[7,8,9]])
arr

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [114]:
## Up/down flipping

np.flipud(arr)

array([[7, 8, 9],
       [4, 5, 6],
       [1, 2, 3]])

In [115]:
## left/right flipping

np.fliplr(arr)

array([[3, 2, 1],
       [6, 5, 4],
       [9, 8, 7]])



---




That's it for NumPy. In this lab, you learned how to create an array, perform basic operations, and also how to manipulate an array. 


In the next lab, we will learn about Pandas, another important tool used for real-world data manipulation.

### [BACK TO TOP](#0)