## CM4044: AI In Chemistry
## Semester 1 2020/21

<hr>

## Tutorial 1a: Introduction to Numpy Part I
## Objectives
### $\bullet$ Numpy Package
### $\bullet$ One-dimensional Arrays
### $\bullet$ Multi-dimensional Arrays
### $\bullet$ Other Numpy Array Creation Methods

<hr>



## 1. import numpy

**Numpy** is the core library for scientific computing in **Python**. It provides a high-performance multidimensional array object, and tools for working with these arrays.

To use **Numpy**, we first need to import the `numpy` package. The conventional import statement, which binds the package to the short name `np`, is shown below:



In [2]:
import numpy as np

print(np.__version__)

1.18.1


## 2. One-dimensional Arrays

A numpy array is a grid of values, all of the **same** type, and is indexed by a tuple of nonnegative integers. The number of dimensions is the rank of the array; the shape of an array is a tuple of integers giving the size of the array along each dimension.

We can initialize a numpy array from a Python list (or from nested lists) with `numpy.array()`. The resulting object has the type `numpy.ndarray`.

In [3]:
a = [1, 2, 3, 4]
a = np.array(a)
a

array([1, 2, 3, 4])

In [4]:
type(a) 

numpy.ndarray
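
Because every element must have the same type, `numpy.array()` converts mixed input to one common type. The short sketch below illustrates this; the variable names `ints` and `mixed` are only for illustration and are not used elsewhere in this tutorial.

```python
import numpy as np

ints = np.array([1, 2, 3, 4])       # all integers -> an integer dtype
mixed = np.array([1, 2.5, 3, 4])    # one float -> every element is stored as a float

print(ints.dtype.kind)    # 'i' means a signed integer type
print(mixed.dtype)        # float64
print(mixed)
```

The exact integer type (for example `int32` or `int64`) depends on the platform, which is why the sketch checks only the kind of the integer array.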

## 3. One-dimensional Array Properties

Several functions/attributes are available to check `ndarray` object properties, such as `type()`, `size`, `dtype`, `itemsize`, `shape`, `nbytes`, `ndim`, and many others. 

In [9]:
a = np.array([1, 2, 3, 4])
print(type(a))       # the type of the object a
print(a.size)        # the number of elements in array a
print(a.dtype)       # the data type of the elements in array a
print(a.itemsize)    # size of each element in bytes
print(a.shape)       # length of the array along each dimension
print(a.nbytes)      # total bytes used by all elements
print(a.ndim)        # number of dimensions of the array

<class 'numpy.ndarray'>
4
int32
4
(4,)
16
1
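
These properties are related to one another: the total number of bytes is the number of elements times the size of one element. The sketch below checks this, and it requests `int64` explicitly (an assumption made here so that the result does not depend on the machine).

```python
import numpy as np

a = np.array([1, 2, 3, 4], dtype=np.int64)   # ask for 8-byte integers explicitly

print(a.dtype)                            # int64
print(a.itemsize)                         # 8 bytes per element
print(a.nbytes == a.size * a.itemsize)    # True
print(a.shape == (a.size,))               # True for a 1D array
```

Without the `dtype` argument, the default integer type depends on the platform and the NumPy version, so `a.dtype` may print `int64` rather than `int32` on other machines.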


## 4. Indexing and Slicing

The contents of an `ndarray` object can be accessed and modified by indexing or slicing with square brackets `[]`, just like Python's `list` objects.

In [8]:
a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print(a[0])       # the first element's index is 0!!!
print(a[3])       # the fourth element
b = a[1:7:2]      # a slice (child array) is created; 1:7:2 starts at index 1, stops before index 7, and uses step 2
print(b)

1
4
[2 4 6]


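The slicing parameters are written inside the square brackets and separated by colons `:` **(start:stop:step)**. The element at index **stop** is not included. If **step** is not specified, **step** is 1. For a positive **step**, if **start** is not specified, **start** is 0, and if **stop** is not specified, **stop** is the array size, so the slice runs to the last element. **step** is allowed to be negative, in which case the slice runs backwards through the array. Finally, slicing does not create a new data object; it returns a view that refers to the data of the original array.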

In [6]:
print(a[:8])        # the first element (index 0) to the eighth element (index 7)!
print(a[1:])        # the second element (index 1) to the last element
print(a[::])        # the whole array
print(a[::-1])      # reverse the order!

[1 2 3 4 5 6 7 8]
[ 2  3  4  5  6  7  8  9 10]
[ 1  2  3  4  5  6  7  8  9 10]
[10  9  8  7  6  5  4  3  2  1]


In [7]:
a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print(a)        
b = a[1:4]
print(b)
b[0] = 0     
print(a)      # b is a reference to a sub-array of a, so when the content referenced by b changes, the content of a changes too!

[ 1  2  3  4  5  6  7  8  9 10]
[2 3 4]
[ 1  0  3  4  5  6  7  8  9 10]
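
When an independent sub-array is needed, the slice can be copied explicitly. The following is a minimal sketch using the `copy()` method and `np.shares_memory()`; the names `view` and `independent` are chosen here for illustration.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

view = a[1:4]                  # a slice shares its data with a
independent = a[1:4].copy()    # an explicit copy owns its own data

independent[0] = 0             # does not touch a
print(a[1])                    # 2

view[0] = 0                    # writes through to a
print(a[1])                    # 0

print(np.shares_memory(a, view))           # True
print(np.shares_memory(a, independent))    # False
```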


## 5. Multidimensional Arrays and Properties

`array()` can create multidimensional arrays too. For example,

In [11]:
a = np.array([[ 0, 1, 2, 3],
           [10,11,12,13]])         # the argument passed in is a list of lists
print(a)

[[ 0  1  2  3]
 [10 11 12 13]]


The properties of a multidimensional array can be queried with the same functions and attributes used for a one-dimensional `ndarray` object.

In [12]:
print(type(a))       # the type of the object a
print(np.size(a))    # the number of elements in array a
print(a.dtype)       # the data type of the elements in array a
print(a.itemsize)    # size of each element in bytes
print(a.shape)       # length of the array along each dimension
print(a.nbytes)      # total bytes used by all elements
print(a.ndim)        # number of dimensions of the array

<class 'numpy.ndarray'>
8
int32
4
(2, 4)
32
2


## 6. Indexing and Slicing in Multidimensional Arrays

For a two-dimensional array, use a pair of integers (row, column) to index an element in the array. For example,

In [13]:
a = np.array([[ 0, 1, 2, 3],
           [10,11,12,13]])
print(a[0,0])       # the first element's index is (0,0) !!!
print(a[1,3])       # the element in second row, fourth column, which is 13

0
13
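
Integer indices and slices can be mixed, one per axis, to pick out whole rows, columns, or blocks. The sketch below reuses the same array; the names `first_row`, `second_col` and `block` are illustrative only.

```python
import numpy as np

a = np.array([[ 0,  1,  2,  3],
              [10, 11, 12, 13]])

first_row = a[0]         # same as a[0, :]
second_col = a[:, 1]     # every row, column index 1
block = a[:, 1:3]        # every row, columns 1 and 2

print(first_row)         # [0 1 2 3]
print(second_col)        # [ 1 11]
print(block)
print(block.shape)       # (2, 2)
```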


Here is a three-dimensional array. To create the 3D array, we use the `np.arange()` function and the `shape` attribute of a **Numpy** array.

In [14]:
a = np.arange(3*4*5)  # this function creates a 1D ndarray object
print(a)
a.shape = 3,4,5       # change the 1D array to a 3D array - the last index varies fastest
print(a)
print(a[0,0,0])       # the first element's index is (0,0,0) !!!
print(a[0,2,3])       # the element is 13

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
 48 49 50 51 52 53 54 55 56 57 58 59]
[[[ 0  1  2  3  4]
  [ 5  6  7  8  9]
  [10 11 12 13 14]
  [15 16 17 18 19]]

 [[20 21 22 23 24]
  [25 26 27 28 29]
  [30 31 32 33 34]
  [35 36 37 38 39]]

 [[40 41 42 43 44]
  [45 46 47 48 49]
  [50 51 52 53 54]
  [55 56 57 58 59]]]
0
13
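
The position of an element in the original 1D array can be worked out from its three indices, because the last index varies fastest. The sketch below assumes the default row-major layout; it uses `reshape()` instead of assigning to `shape`, and the names `flat`, `cube` and `offset` are illustrative.

```python
import numpy as np

flat = np.arange(3 * 4 * 5)
cube = flat.reshape(3, 4, 5)        # reshaped array; flat itself keeps its 1D shape

i, j, k = 0, 2, 3
offset = i * (4 * 5) + j * 5 + k    # row-major position of element (i, j, k)

print(offset)                                          # 13
print(cube[i, j, k] == flat[offset])                   # True
print(np.ravel_multi_index((i, j, k), cube.shape))     # 13
print(flat.shape, cube.shape)                          # (60,) (3, 4, 5)
```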


## 7. Other Numpy Array Creation Functions

## 7.1 arange

The `arange()` function is similar to the `range()` function in Python, but it returns an `ndarray` object.

    arange([start,] stop[, step], dtype=None)

It creates an `ndarray` object with numbers in the half-open interval `[start, stop)`, spaced by `step`. If there is only one input, it is used as `stop`, with `start` = 0 and `step` = 1.

In [12]:
a = np.arange(4)         # start = 0, stop = 4, step = 1
print(a) 
a = np.arange(1, 4, 2)   # start = 1, stop = 4, step = 2
print(a)
a = np.arange(0, np.pi, np.pi/4)  #start= 0.0, stop = np.pi, step = np.pi/4
print(a)

[0 1 2 3]
[1 3]
[0.         0.78539816 1.57079633 2.35619449]
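
The number of elements returned by `arange()` follows from the interval and the step. The sketch below checks the length for an integer case and shows that, for a non-integer step, the same values can be requested by count with `linspace()`; the variable names are illustrative.

```python
import math
import numpy as np

start, stop, step = 1, 10, 3
a = np.arange(start, stop, step)

print(a)                                              # [1 4 7]
print(len(a) == math.ceil((stop - start) / step))     # True

# With a non-integer step, asking for a number of points is often clearer
b = np.linspace(0, np.pi, 4, endpoint=False)
print(np.allclose(b, np.arange(0, np.pi, np.pi / 4)))  # True
```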


## 7.2 linspace

The basic form of the `linspace()` function is:

    linspace(start, stop, N)

It creates an `ndarray` with `N` numbers evenly spaced over the closed interval `[start, stop]`.

The more general form of this function is:

    numpy.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None, axis=0)
   

In [15]:
a = np.linspace(0, 1, 5)                  #endpoint included, the interval is [0, 1]
print(a)
a = np.linspace(0, 1, 5, endpoint=False)  #endpoint excluded, the interval is [0, 1)
print(a)
a,dx = np.linspace(0,1,5,retstep=True)    #retstep=True, return the array and the spacing between two neighbouring elements
print(a)
print(dx)

[0.   0.25 0.5  0.75 1.  ]
[0.  0.2 0.4 0.6 0.8]
[0.   0.25 0.5  0.75 1.  ]
0.25
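
The spacing returned by `retstep=True` can be checked against the interval length. A minimal sketch, with illustrative names `step_closed` and `step_open`:

```python
import numpy as np

start, stop, num = 0.0, 1.0, 5

_, step_closed = np.linspace(start, stop, num, retstep=True)
_, step_open = np.linspace(start, stop, num, endpoint=False, retstep=True)

# endpoint included: num points span num - 1 gaps
print(np.isclose(step_closed, (stop - start) / (num - 1)))   # True (0.25)
# endpoint excluded: num points span num gaps
print(np.isclose(step_open, (stop - start) / num))           # True (0.2)
```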


## 7.3 logspace

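`logspace()` returns an `ndarray` of numbers spaced evenly on a log scale. The `start` and `stop` arguments are exponents, so the values run from `base**start` to `base**stop`.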

    numpy.logspace(start, stop, num=50, endpoint=True, base=10.0, dtype=None, axis=0)


In [14]:
a = np.logspace(1, 5, num=4)
print(a)
a = np.logspace(1,5, num=4, base=2)
print(a)

[1.00000000e+01 2.15443469e+02 4.64158883e+03 1.00000000e+05]
[ 2.          5.0396842  12.69920842 32.        ]
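
One way to read these results is that `logspace()` raises the base to evenly spaced exponents. The sketch below compares it with `linspace()`; the name `exponents` is illustrative.

```python
import numpy as np

exponents = np.linspace(1, 5, num=4)    # evenly spaced exponents from 1 to 5

print(np.allclose(np.logspace(1, 5, num=4), 10.0 ** exponents))           # True
print(np.allclose(np.logspace(1, 5, num=4, base=2), 2.0 ** exponents))    # True
```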


## 7.4 empty

`empty()` creates an array without initializing its elements, so it contains arbitrary values left over in memory.
    
    numpy.empty(shape, dtype=float, order='C')
    
The parameter `shape` is an integer or a tuple giving the size of the array along each axis.
    

In [15]:
a = np.empty(2)           # the shape is (2,), which means a 1D array with two elements
print(a)
a = np.empty((2,2))       # the shape is (2,2), two rows and two columns
print(a)
a = np.empty((3,4,5))     # the shape is (3,4,5), first axis 3, second axis 4, and third axis 5
print(a)

[2.12199579e-314 6.36598737e-314]
[[ 2.          5.0396842 ]
 [12.69920842 32.        ]]
[[[1.23452885e-311 1.23451203e-311 0.00000000e+000 0.00000000e+000
   1.23287955e-311]
  [1.16095484e-028 5.28595592e-085 6.14837643e-071 1.05161522e-153
   8.82142681e+199]
  [4.35294652e-114 6.01347002e-154 1.27734658e-152 6.32275266e+233
   1.96086574e+243]
  [1.87673447e-152 4.11085842e+223 2.76006031e-085 6.01352277e-154
   1.46914500e+195]]

 [[1.14073631e+243 9.45312995e+218 1.81450400e-152 9.95876748e+136
   1.07441485e+160]
  [9.47069335e-154 7.13222113e-154 2.86530675e+161 2.64519874e+185
   3.68205002e+180]
  [7.26613256e+223 1.96086573e+243 1.75300468e+243 5.76928887e+252
   9.47074989e-154]
  [6.19489883e+223 1.32915250e+179 3.68205018e+180 1.81450400e-152
   3.77089701e+233]]

 [[6.63784985e+246 6.01352277e-154 9.13546568e+242 3.81388253e+180
   5.97857699e+135]
  [1.07441485e+160 5.16179392e-109 7.13222113e-154 5.02069306e+276
   2.19993033e-152]
  [9.45312995e+218 1.75300480e+243 5.
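
Since the contents of an array from `empty()` are arbitrary, every element should be assigned before it is read. A small sketch of that pattern follows; the name `squares` and the loop are illustrative only.

```python
import numpy as np

n = 5
squares = np.empty(n)       # contents are arbitrary at this point

for i in range(n):
    squares[i] = i ** 2     # overwrite every element before reading it

print(squares)              # [ 0.  1.  4.  9. 16.]
print(squares.dtype)        # float64, the default element type of empty()
```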

## 7.5 zeros, ones and eye

```python
numpy.zeros(shape, dtype=float)
numpy.ones(shape, dtype=float)
```

`zeros(shape)` creates an array filled with 0, and `ones(shape)` creates an array filled with 1. `shape` is an integer or a tuple giving the size along each axis.

```python
numpy.eye(N, M=None, k=0, dtype=float, order='C')
```
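`eye()` creates a 2D array with N rows and M columns that has `1` on the diagonal and `0` elsewhere. If M is not given, the array is square (M = N). If N and M differ, the ones fill the diagonal of the largest square sub-matrix.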

In [16]:
a = np.zeros((2,3))
print(a)
a = np.ones((2,3))
print(a)

[[0. 0. 0.]
 [0. 0. 0.]]
[[1. 1. 1.]
 [1. 1. 1.]]


In [17]:
a = np.eye(5,5)    # 5 by 5 identity matrix
print(a)

a = np.eye(5,3)    # the main matrix is five by three, value of 1 is placed in diagonal position of 3 by 3 submatrix
print(a)

a = np.eye(2,5)    # the main matrix is two by five, value of 1 is placed in the diagonal position of 2 by 2 submatrix
print(a)

[[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]
 [0. 0. 0.]
 [0. 0. 0.]]
[[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]]
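
The remaining parameters in the signatures above can be sketched in the same way: `k` moves the diagonal of `eye()`, and `dtype` changes the element type. The names `upper` and `counts` are illustrative.

```python
import numpy as np

upper = np.eye(3, k=1)                   # ones on the first diagonal above the main one
print(upper)

counts = np.zeros((2, 3), dtype=int)     # integer zeros instead of the default floats
print(counts.dtype.kind)                 # 'i'

print(np.array_equal(np.eye(3), np.identity(3)))   # True
```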


## 7.6 empty_like, zeros_like and ones_like

    empty_like(a)
    ones_like(a)
    zeros_like(a)

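Each of these functions creates a new array with the same shape and the same data type as array `a`. As with `empty()`, the values returned by `empty_like()` are arbitrary.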

In [18]:
a = np.arange(1, 4, 0.5)
print(a)
b = np.empty_like(a)      # same shape and dtype as a; the values are arbitrary
print(b)
b = np.zeros_like(a)
print(b)
b = np.ones_like(a)
print(b)

[1.  1.5 2.  2.5 3.  3.5]
[1.  1.5 2.  2.5 3.  3.5]
[0. 0. 0. 0. 0. 0.]
[1. 1. 1. 1. 1. 1.]
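
The `*_like` functions copy the data type as well as the shape, which matters when the template array holds integers. A short sketch, with illustrative names `ints`, `z` and `f`:

```python
import numpy as np

ints = np.arange(6).reshape(2, 3)        # integer array of shape (2, 3)

z = np.zeros_like(ints)
print(z.shape, z.dtype == ints.dtype)    # (2, 3) True

f = np.ones_like(ints, dtype=float)      # the dtype argument overrides the element type
print(f.dtype)                           # float64
```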


## 8. Understanding Numpy Array Axes

Let us start by reviewing the Cartesian coordinate system. For example, here is a 2D Cartesian coordinate system:

<img src="./numpy-axes_point-in-cartesian-coordinates-example.png" width="300" height="300" />

Given a pair of numbers (2,3), we can easily locate the point in this 2D plane because we know the point lies 2 units along the x axis and 3 units along the y axis. In other words, the coordinates act as the indices of the point in this example.

NumPy arrays also have axes. 

<img src="./numpy-arrays-have-axes.png" width="400" height="400" />

In a 2-dimensional NumPy array, the axes are the directions along the rows and columns.
Axis 0 is the "first" axis and runs down the rows; axis 1 is the "second" axis and runs across the columns. For example,

In [19]:
a = np.eye(3,5)   # axis 0 has length 3 (three rows), axis 1 has length 5 (five columns)
print(a.shape)
print(a)

(3, 5)
[[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]]
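
One place where the axis numbers show up is in functions that take an `axis` argument. As an illustration that goes slightly beyond array creation, the sketch below sums the same array along each axis; `col_sums` and `row_sums` are names chosen here.

```python
import numpy as np

a = np.eye(3, 5)

col_sums = a.sum(axis=0)    # collapse axis 0 (the rows): one value per column
row_sums = a.sum(axis=1)    # collapse axis 1 (the columns): one value per row

print(col_sums)                          # [1. 1. 1. 0. 0.]
print(row_sums)                          # [1. 1. 1.]
print(col_sums.shape, row_sums.shape)    # (5,) (3,)
```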


Arrays with three or more dimensions generalise the above use of axes. For example, a 3D array is a collection of 2D matrices, and it needs three indices i, j, k, one for each axis.
1. The first index, i, selects the matrix
2. The second index, j, selects the row
3. The third index, k, selects the column

<img src="./3d-array.png" width="400" height="400" />


In [20]:
a = np.array([[[10, 11, 12], [13, 14, 15], [16, 17, 18]],
               [[20, 21, 22], [23, 24, 25], [26, 27, 28]],
               [[30, 31, 32], [33, 34, 35], [36, 37, 38]]])

print(a.shape)
print(a)

(3, 3, 3)
[[[10 11 12]
  [13 14 15]
  [16 17 18]]

 [[20 21 22]
  [23 24 25]
  [26 27 28]]

 [[30 31 32]
  [33 34 35]
  [36 37 38]]]


You can access any row or column in a 3D array. There are 3 cases.

Case 1: specifying the first two indices. In this case, you are choosing the i value (the matrix), and the j value (the row). This will select a specific row. In this example we are selecting row 2 from matrix 1:

<img src="./3d-array-col-1.png" width="400" height="400" />

In [21]:
print(a[1,2]) #[26, 27, 28]

[26 27 28]


Case 2: specifying the i value (the matrix) and the k value (the column), using a full slice (:) for the j value (the row). This will select a specific column. In this example we are selecting column 1 from matrix 0:

<img src="./3d-array-col-2.png" width="400" height="400" />

In [22]:
print(a[0, :, 1]) # [11 14 17]

[11 14 17]


Case 3: specifying the j value (the row) and the k value (the column), using a full slice (:) for the i value (the matrix). This will create a row by taking the same element from each matrix. In this case we are taking row 1, column 2 from each matrix:

<img src="./3d-array-col-3.png" width="400" height="400" />


In [23]:
print(a[:, 1, 2]) # [15, 25, 35]

[15 25 35]
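
The number of integer indices supplied determines how many axes remain in the result. The sketch below rebuilds the same 3D array and prints the shapes; the names `matrix`, `row` and `element` are illustrative.

```python
import numpy as np

a = np.array([[[10, 11, 12], [13, 14, 15], [16, 17, 18]],
              [[20, 21, 22], [23, 24, 25], [26, 27, 28]],
              [[30, 31, 32], [33, 34, 35], [36, 37, 38]]])

matrix = a[1]           # one index: a whole 2D matrix
row = a[1, 2]           # two indices: one row of that matrix
element = a[1, 2, 0]    # three indices: a single number

print(matrix.shape)       # (3, 3)
print(row)                # [26 27 28]
print(element)            # 26
print(a[:, 1, 2].shape)   # (3,)
```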
