# Introduction to NumPy
## PISIKAsanayan 2023 Python Workshop
In this part of the workshop, we introduce another Python package called ```NumPy```. It is one of the most widely used Python packages in computational physics, research, and data manipulation in general. Note that ```NumPy``` has a lot of mathematical functionality that will not be discussed in depth in this tutorial and is reserved for further work! In this workshop, we will only cover the basics of getting started with ```NumPy``` and how it can be used to store and manipulate data!

## What is NumPy?
According to [NumPy.org](https://numpy.org/), `NumPy` is an open source Python library that is of utmost use when working with any **numerical** data in Python, and it is one of the core libraries, along with `Matplotlib`, for computational and scientific work.

`NumPy` is commonly used because it is designed to work on **multidimensional arrays** and **matrix data structures**, which will be discussed in the latter part of the workshop. In other words, it lets the user efficiently operate on a **homogeneous** $n$-dimensional array object. Recall what **homogeneous** means in the context of Python lists!

Operation-wise, `NumPy` adds powerful data structures to Python that support efficient calculations with arrays and matrices. It also supplies an enormous library of high-level mathematical functions that operate on these arrays and matrices.

## What makes NumPy arrays powerful compared to Python Lists?
`NumPy` arrays are like Python's built-in `list` type, but they provide much more efficient storage and data operations as the **arrays grow larger in size**. `NumPy` arrays form the core of nearly the entire ecosystem of computational tools in Python, so learning to use `NumPy` effectively will be valuable no matter what aspect of computational work or data science interests you!
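As a rough illustration of the efficiency claim above, the sketch below squares the same numbers once with a Python `list` and once with a `NumPy` array. The size `n` and the variable names are assumptions chosen for illustration, and the timings will differ from machine to machine.

```python
import timeit

import numpy as np

n = 100_000  # hypothetical problem size chosen for illustration
numbers_list = list(range(n))
numbers_array = np.arange(n, dtype=np.int64)

# Square every element: a Python-level loop versus one vectorized operation.
squares_list = [value * value for value in numbers_list]
squares_array = numbers_array * numbers_array

# Both approaches give the same values.
print(np.array_equal(squares_array, squares_list))

# Rough timings; the exact numbers depend on your machine.
time_list = timeit.timeit(lambda: [v * v for v in numbers_list], number=20)
time_array = timeit.timeit(lambda: numbers_array * numbers_array, number=20)
print(f"list: {time_list:.4f} s, array: {time_array:.4f} s")
```

On most machines the array version should come out noticeably faster, and the gap typically widens as `n` grows.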

***

## Importing NumPy

Just like any other library in Python, `NumPy` must be imported first.

In [1]:
import numpy as np

As shown above, we imported `NumPy` as `np`, so whenever we call a `NumPy` function as `np.function`, Python looks up the specified `function` in the module itself.

> **Coding tip** It is a good practice to import all of the libraries or modules you need at the start of your notebook!

***
## Initializing NumPy arrays
The most notable feature of the `NumPy` module is the array. Arrays are similar to lists, although `NumPy` provides a great deal of additional functionality, which we will be utilising in this workshop. Initializing arrays is the first step of working with large sets of data, and it can be achieved in several ways as discussed below.

### np.array
This command turns a **list** into a `NumPy` array. *Nothing fancy here hehe*.

In [2]:
np.array([0, 1, 2, 3])

array([0, 1, 2, 3])

In [3]:
np.array([[0, 1, 2, 3], [4, 5, 6, 7]])

array([[0, 1, 2, 3],
       [4, 5, 6, 7]])

The first call produces a one-dimensional array, while the second produces a two-dimensional array.
> **Food for thought** Why does `a = np.array([0, 1, 2, 3])` work while `a = np.array(0, 1, 2, 3)` results in an error? Hmmmm.

Remember that unlike a Python `list`, a `NumPy` array must contain elements of the same type, i.e. the elements of the array must be **homogeneous**. If the types do not match, `NumPy` will upcast if possible (here, integers are upcast to floating point):

In [4]:
np.array([2.1718, 1, 4, 10, 145])

array([  2.1718,   1.    ,   4.    ,  10.    , 145.    ])
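The upcasting described above can be checked through the `dtype` attribute of an array. The following is a minimal sketch with illustrative variable names; the exact integer type printed depends on the platform.

```python
import numpy as np

ints_only = np.array([1, 4, 10])
mixed = np.array([2.1718, 1, 4, 10, 145])

print(ints_only.dtype)  # an integer type; the exact width depends on the platform
print(mixed.dtype)      # float64

# The dtype can also be requested explicitly with the `dtype` keyword.
as_float = np.array([1, 4, 10], dtype=float)
print(as_float)
```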

### np.linspace
This function takes in three arguments: start, end, and number of points. `np.linspace` creates an array of **evenly spaced values** between a user-defined start and end number, both of which are included. For example,

In [5]:
np.linspace(0, 10, 6)

array([ 0.,  2.,  4.,  6.,  8., 10.])

Or what if we want a hundred evenly spaced points between $0$ and $\pi$?

In [6]:
np.linspace(0, np.pi, 100)

array([0.        , 0.03173326, 0.06346652, 0.09519978, 0.12693304,
       0.1586663 , 0.19039955, 0.22213281, 0.25386607, 0.28559933,
       0.31733259, 0.34906585, 0.38079911, 0.41253237, 0.44426563,
       0.47599889, 0.50773215, 0.53946541, 0.57119866, 0.60293192,
       0.63466518, 0.66639844, 0.6981317 , 0.72986496, 0.76159822,
       0.79333148, 0.82506474, 0.856798  , 0.88853126, 0.92026451,
       0.95199777, 0.98373103, 1.01546429, 1.04719755, 1.07893081,
       1.11066407, 1.14239733, 1.17413059, 1.20586385, 1.23759711,
       1.26933037, 1.30106362, 1.33279688, 1.36453014, 1.3962634 ,
       1.42799666, 1.45972992, 1.49146318, 1.52319644, 1.5549297 ,
       1.58666296, 1.61839622, 1.65012947, 1.68186273, 1.71359599,
       1.74532925, 1.77706251, 1.80879577, 1.84052903, 1.87226229,
       1.90399555, 1.93572881, 1.96746207, 1.99919533, 2.03092858,
       2.06266184, 2.0943951 , 2.12612836, 2.15786162, 2.18959488,
       2.22132814, 2.2530614 , 2.28479466, 2.31652792, 2.34826

Feel free to experiment with your own values, but note that the number of points must be of `int` type. Otherwise, an error will be encountered!
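If you are curious about the spacing that `np.linspace` settles on, the optional `retstep` keyword returns it alongside the array. This is a small sketch with illustrative variable names:

```python
import numpy as np

points, step = np.linspace(0, 10, 6, retstep=True)
print(points)
print(step)  # spacing = (end - start) / (number of points - 1)

# Both end points are part of the result.
print(points[0], points[-1])
```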

### np.arange
Similar to `np.linspace`, `np.arange` also takes in three arguments. The first two correspond to the start and end points, where the end point is **not inclusive**! The third argument, however, indicates the spacing between consecutive values rather than the number of points. This is what makes `np.arange` different from `np.linspace`.

In [7]:
np.arange(0, 25, 2)

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18, 20, 22, 24])

In [8]:
np.arange(-10, 10, 5)

array([-10,  -5,   0,   5])
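To see the difference between the two functions side by side, here is a short sketch (the variable names are only illustrative) that builds values between 0 and 10 in both ways:

```python
import numpy as np

with_arange = np.arange(0, 10, 2)      # step of 2, stop value 10 is excluded
with_linspace = np.linspace(0, 10, 6)  # 6 points, end value 10 is included

print(with_arange)
print(with_linspace)

# To make np.arange reach 10, move the stop value just past it.
print(np.arange(0, 10 + 2, 2))
```

Note that `np.arange` returns integers here because all of its arguments are integers, while `np.linspace` returns floating-point values.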

### np.zeros and np.ones
When you need an array of a specific size, these are probably your most useful choices. It is more efficient to create arrays from scratch using routines built into `NumPy`, such as `np.zeros` and `np.ones`. Both functions interpret their arguments in the same way: a single number passed to the function simply specifies how many values are in your array. As the name implies, `np.zeros` produces an array of zeros while `np.ones` yields an array of ones. For example,

In [9]:
np.ones(10)

array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])

In [10]:
np.zeros(10)

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

These produce length-10 one-dimensional arrays of ones and zeros, respectively. If you want a multidimensional floating-point array of zeros or ones,

In [11]:
np.zeros(shape = (3, 4))

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [12]:
np.zeros(shape = (7, 2))

array([[0., 0.],
       [0., 0.],
       [0., 0.],
       [0., 0.],
       [0., 0.],
       [0., 0.],
       [0., 0.]])

where the argument `shape` dictates the size of the array along each dimension, given as `(rows, columns)`.
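The shape of an existing array can be read back through its `shape` attribute. The sketch below (illustrative names only) also shows the optional `dtype` keyword, in case integers are preferred over the default floating-point values:

```python
import numpy as np

grid = np.ones(shape=(2, 5))
print(grid.shape)  # (rows, columns)
print(grid.size)   # total number of elements

# The default element type is floating point; `dtype` requests another one.
int_zeros = np.zeros((2, 5), dtype=int)
print(int_zeros)
```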

### np.full
Similar to `np.zeros` and `np.ones`, `np.full` initializes an array filled with any chosen value, not just zero or one. For example, if we want a 6 by 6 array whose elements are all approximately $\pi$,

In [13]:
np.full((6, 6), 3.14)

array([[3.14, 3.14, 3.14, 3.14, 3.14, 3.14],
       [3.14, 3.14, 3.14, 3.14, 3.14, 3.14],
       [3.14, 3.14, 3.14, 3.14, 3.14, 3.14],
       [3.14, 3.14, 3.14, 3.14, 3.14, 3.14],
       [3.14, 3.14, 3.14, 3.14, 3.14, 3.14],
       [3.14, 3.14, 3.14, 3.14, 3.14, 3.14]])

In [14]:
np.full((3, 5), 11)

array([[11, 11, 11, 11, 11],
       [11, 11, 11, 11, 11],
       [11, 11, 11, 11, 11]])

In [15]:
np.full((2, 2), [0, 1])

array([[0, 1],
       [0, 1]])
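If the full precision of $\pi$ is wanted instead of the rounded value `3.14`, the constant `np.pi` can be passed as the fill value. A minimal sketch, with an illustrative variable name:

```python
import numpy as np

pi_grid = np.full((6, 6), np.pi)
print(pi_grid[0, 0])

# The element type follows the fill value: 11 gives integers, 3.14 gives floats.
print(np.full((3, 5), 11).dtype)
print(np.full((3, 5), 3.14).dtype)
```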

***
## Common array operations
### Array indexing
Similar to indexing in Python's `list` as discussed in the previous parts of the workshop, indexing in `NumPy` will be quite familiar. In a **one-dimensional array**, the $n$th value (counting from zero) can be accessed by specifying the desired index in square brackets, just as with Python lists:

In [16]:
a = np.array([0, 1, 2, 3])
a[0] # first element

0

In [17]:
b = np.array([x for x in range(0, 20, 3)])
b

array([ 0,  3,  6,  9, 12, 15, 18])

In [18]:
b[4] # fifth element

12

To index from the end of the array, you can use negative indices as follows

In [19]:
b[-1] # last element

18

In [20]:
b[-3] # third to the last element, similar to b[4]

12

To index a **two-dimensional array**, it is intuitive to select an element by its row and column! This is demonstrated explicitly in the following examples. Of course, arrays of higher dimensions exist and can be indexed as well, but these are outside the scope of the workshop.

In general, indexing takes the syntax of `x[row number, column number]` where `x` is an arbitrary two-dimensional array. **Remember that Python indexing always starts at 0**, i.e. the first row and column of `x` are always indexed as 0.

In [21]:
c = np.array([[0,1,2],[3,4,5], [6, 7, 8]])
c

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

Let us say we are interested in the entry in the third row and second column of the two-dimensional array (or matrix) `c`,

In [22]:
c[2,1]

7

or the first row and last column entry of `c` as well,

In [23]:
c[0,2]

2

In [24]:
c[0,-1]

2

Similar to lists, arrays are also **mutable**, i.e. the elements can still be changed after creating them. For example, if we wish to change the first row and first column entry of `c`,

In [25]:
c[0,0] = 2
c

array([[2, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

As expected, the element corresponding to `c[0,0]` was changed from 0 to 2.
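One thing to watch out for when changing elements: because arrays are homogeneous, a new value is converted to the type the array already holds. The sketch below uses illustrative names and assumes an integer array like `c`:

```python
import numpy as np

int_matrix = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]])
int_matrix[0, 0] = 2.9  # the array holds integers, so the decimal part is dropped
print(int_matrix[0, 0])

float_matrix = int_matrix.astype(float)  # a floating-point copy keeps decimals
float_matrix[0, 0] = 2.9
print(float_matrix[0, 0])
```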

***
### Array slicing
Similar to the syntax for array indexing, array slicing makes use of the slice notation `:` to indicate the range of subarrays that needs to be accessed. The `NumPy` slicing syntax follows that of the standard Python `list`. In general, the syntax is `x[start:stop:step]`. If any of these are unspecified, they default to the values `start = 0`, `stop = size of dimension`, and `step = 1`. We'll take a look at accessing subarrays in one dimension and in two dimensions.
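The defaults described above can be verified by writing each slice out in full. This is a small sketch on a hypothetical array `values`:

```python
import numpy as np

values = np.arange(10)  # hypothetical array: 0, 1, ..., 9

# Omitted parts fall back to their defaults, so these pairs are equivalent.
print(np.array_equal(values[:5], values[0:5:1]))
print(np.array_equal(values[5:], values[5:10:1]))
print(np.array_equal(values[::2], values[0:10:2]))

# The stop index itself is never included.
print(values[3:8])
```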

To slice a **one-dimensional array**,

In [26]:
e = np.arange(-10, 10, 2)
e

array([-10,  -8,  -6,  -4,  -2,   0,   2,   4,   6,   8])

In [27]:
e[:5]

array([-10,  -8,  -6,  -4,  -2])

returns the first five elements of the array `e`. Similarly, if we want to return the elements from index 5 onward,

In [28]:
e[5:]

array([0, 2, 4, 6, 8])

To slice out the middle elements of the array,

In [29]:
e[3:8]

array([-4, -2,  0,  2,  4])

To return every other element of the array starting at index 0 and 1, respectively,

In [30]:
e[::2]

array([-10,  -6,  -2,   2,   6])

In [31]:
e[1::2]

array([-8, -4,  0,  4,  8])

Interestingly, we can reverse an array by setting the `step` to be negative. For example,

In [32]:
e[::-1]

array([  8,   6,   4,   2,   0,  -2,  -4,  -6,  -8, -10])

> **Question** What does the code below do and why does it return such specific element(s)?

In [33]:
e[2::-2]

array([ -6, -10])

On the other hand, slicing **two-dimensional** arrays can get confusing at first, but as long as you take into account that Python indexing always starts at 0, you have nothing to worry about! We can consider the following examples to establish clarity.

In [34]:
f = np.array([[1,0,3,5],[4,5,0,1],[9,2,0,-2]])
f

array([[ 1,  0,  3,  5],
       [ 4,  5,  0,  1],
       [ 9,  2,  0, -2]])

To access the first row only,

In [35]:
f[0,:]

array([1, 0, 3, 5])

where `:` tells the program to access all columns of the matrix. But, if we only want the last three entries of the first row,

In [36]:
f[0,1:4]

array([0, 3, 5])

Similarly, if we want the first two columns of the matrix,

In [37]:
f[:,0:2]

array([[1, 0],
       [4, 5],
       [9, 2]])

In [38]:
f[0:2,1:]

array([[0, 3, 5],
       [5, 0, 1]])
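A detail that often trips people up is the shape of the result: a plain integer index removes that dimension, while a slice keeps it. The sketch below uses a hypothetical array `grid` with the same entries as `f`:

```python
import numpy as np

grid = np.array([[1, 0, 3, 5], [4, 5, 0, 1], [9, 2, 0, -2]])

column_1d = grid[:, 0]    # integer index: the column comes back one-dimensional
column_2d = grid[:, 0:1]  # slice: the result stays two-dimensional

print(column_1d, column_1d.shape)
print(column_2d.shape)
```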

> **Question** How do we access the $\begin{bmatrix} 0&1\\0&-2 \end{bmatrix}$ submatrix of the matrix `f`?

> **Question** How do we reverse the elements of the matrix such that the resulting matrix is $\begin{bmatrix} -2&0&2&9 \\ 1&0&5&4 \\ 5&3&0&1 \end{bmatrix}$ ?
***
### np.reshape
This function takes the array to be reshaped as its first argument, followed by the dimensions you want to reshape it to, given in parentheses. The same operation is also available as the array method `.reshape`. For example, if we want to put the numbers 1 through 9 in a 3 by 3 grid, we can do the following. Make sure that the array being reshaped fits into the set dimensions, otherwise an error will be encountered!

In [39]:
np.arange(1, 10, 1).reshape((3,3))

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [40]:
a = np.arange(2, 20)
np.reshape(a, (3, 6))

array([[ 2,  3,  4,  5,  6,  7],
       [ 8,  9, 10, 11, 12, 13],
       [14, 15, 16, 17, 18, 19]])
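The following sketch, with illustrative names, shows two related behaviours: passing `-1` lets `NumPy` infer one dimension from the total number of elements, and a mismatch between the number of elements and the requested shape raises a `ValueError`.

```python
import numpy as np

numbers = np.arange(12)

# -1 asks NumPy to work out that dimension from the total size.
inferred = numbers.reshape((3, -1))
print(inferred.shape)

# 12 elements cannot fill a 5 by 5 grid, so this raises a ValueError.
try:
    numbers.reshape((5, 5))
except ValueError as error:
    print("reshape failed:", error)
```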

### np.append
Unlike `append()` for a `list`, which grows the list in place, `np.append` returns a **new** array containing the elements of both inputs and leaves the originals unchanged. For example, we have two one-dimensional arrays and we want to attach the elements of the second array to the first,

In [41]:
x = np.array([1, 2, 8, 4, 5, 7])
y = np.array([3, 6, 9, 12, 15])
z = np.append(x, y)
z

array([ 1,  2,  8,  4,  5,  7,  3,  6,  9, 12, 15])

Another example would be appending a two-dimensional array to a one-dimensional array. When no `axis` is given, both inputs are flattened before being joined,

In [42]:
np.append([1, 2, 3], [[4, 5, 6], [7, 8, 9]])

array([1, 2, 3, 4, 5, 6, 7, 8, 9])

Additionally, sorting the array `z` in an ascending order can be done by

In [43]:
np.sort(z)

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 12, 15])
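It may help to confirm that neither `np.append` nor `np.sort` touches the arrays passed in. This sketch uses illustrative names and also shows the optional `axis` keyword of `np.append`:

```python
import numpy as np

first = np.array([1, 2, 8])
second = np.array([3, 6])

joined = np.append(first, second)
print(first)   # unchanged: np.append does not modify its inputs
print(joined)

# With `axis`, the shapes are kept instead of being flattened.
stacked = np.append([[1, 2, 3]], [[4, 5, 6]], axis=0)
print(stacked.shape)

# np.sort likewise returns a sorted copy and leaves `joined` as it was.
print(np.sort(joined))
```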

### np.concatenate
This is similar to `np.append`, but the arrays are joined along a chosen axis and keep their dimensions instead of being flattened into a single one-dimensional array. Consider two new two-dimensional arrays `a` and `b`,

In [44]:
a = np.array([[-2, 2], [3, 0]])
b = np.array([[4, -1]])

In [45]:
np.concatenate((a, b), axis = 0)

array([[-2,  2],
       [ 3,  0],
       [ 4, -1]])

In [46]:
np.concatenate((a, b.T), axis = 1)

array([[-2,  2,  4],
       [ 3,  0, -1]])

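where `axis` specifies the direction along which the arrays are joined. In a two-dimensional array, remember that `axis = 0` corresponds to the first dimension, i.e. the rows, while `axis = 1` points to the columns as it is the second dimension. Here `b.T` is the transpose of `b`, which turns the 1 by 2 array into a 2 by 1 column so that it fits beside `a`.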
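The shapes involved are worth checking explicitly, since the arrays must agree in every dimension except the one being joined. A minimal sketch with illustrative names, using the same entries as `a` and `b`:

```python
import numpy as np

top = np.array([[-2, 2], [3, 0]])   # shape (2, 2)
extra = np.array([[4, -1]])         # shape (1, 2)

print(np.concatenate((top, extra), axis=0).shape)    # rows are stacked
print(np.concatenate((top, extra.T), axis=1).shape)  # columns are placed side by side

# Without the transpose, the row counts differ (2 versus 1), so joining
# along axis=1 raises a ValueError.
try:
    np.concatenate((top, extra), axis=1)
except ValueError as error:
    print("concatenate failed:", error)
```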

***
## Mathematical operations and constants
Mathematical operations behave differently when applied to a `list` than they do to arrays. In fact, many of the operations cannot be applied to a `list` at all, and in the case of addition one can only add another list, which joins the two together. This makes `NumPy` much more convenient and efficient to work with compared to `list`. Suppose we have a one-dimensional array

In [47]:
h = np.arange(1, 8)
h

array([1, 2, 3, 4, 5, 6, 7])

Performing the usual mathematical operations,

In [48]:
h + h

array([ 2,  4,  6,  8, 10, 12, 14])

In [49]:
h + 2*h

array([ 3,  6,  9, 12, 15, 18, 21])

In [50]:
h**2

array([ 1,  4,  9, 16, 25, 36, 49], dtype=int32)
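For contrast, here is how the same operators behave on a plain `list`. This is a small sketch with illustrative names:

```python
import numpy as np

plain_list = [1, 2, 3]
array = np.array(plain_list)

print(plain_list + plain_list)  # lists are joined end to end
print(plain_list * 2)           # the list is repeated
print(array + array)            # element-wise addition
print(array * 2)                # every element is doubled

# Operations such as a power are not defined for lists at all.
try:
    plain_list ** 2
except TypeError as error:
    print("list ** 2 failed:", error)
```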

Other useful functions built into `NumPy` can also be used,

In [51]:
np.sum(h)

28

In [52]:
np.mean(h)

4.0

In [53]:
np.min(h)

1

In [54]:
np.max(h)

7
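These functions also accept the `axis` argument on two-dimensional arrays, with the same meaning as in `np.concatenate`. The array `table` below is a hypothetical example:

```python
import numpy as np

table = np.array([[1, 2, 3], [4, 5, 6]])

print(np.sum(table))          # sum of every element
print(np.sum(table, axis=0))  # one sum per column
print(np.sum(table, axis=1))  # one sum per row
print(np.mean(table, axis=0))
```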

Some useful mathematical constants that are of use in common physics problems are

In [55]:
np.pi

3.141592653589793

In [56]:
np.e

2.718281828459045

Some common functions are

In [57]:
np.sqrt(9)

3.0

In [58]:
np.exp(2)

7.38905609893065

In [59]:
np.sin(np.pi/2)

1.0

In [60]:
np.tan(np.pi)

-1.2246467991473532e-16
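Two remarks on the results above, sketched with illustrative names: these functions also work element by element on whole arrays, and the value of `np.tan(np.pi)` is tiny but not exactly zero because of floating-point round-off, so a comparison with a tolerance such as `np.isclose` is usually safer than `==`.

```python
import numpy as np

angles = np.array([0, np.pi / 2, np.pi])
print(np.sin(angles))  # the function is applied to each element

# Floating-point round-off leaves tan(pi) tiny but non-zero, so compare
# with a tolerance instead of testing for exact equality.
print(np.tan(np.pi) == 0)
print(np.isclose(np.tan(np.pi), 0.0))
```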

***
# Exercises
The best way to learn is to do it by yourself! Try to execute the following as instructed.

1. Create an array from -100 to 100 with 250 elements, inclusively.
2. Set up an array from 1 to 27 with 9 elements, exclusively.
3. Create an arbitrary matrix with 6 rows and 10 columns whose entries are the numbers from 0 to 59.
    1. What are the second and fourth rows of the resulting matrix?
    2. What is the last column of the matrix?
4. Consider the array `[[3, 6, 9, 12], [15, 18, 21, 24], [27, 30, 33, 36], [39, 42, 45, 48], [51, 54, 57, 60]]`. Write a **single** line of code that will return the array of odd rows and even columns of the array.
5. Initialize a 4 by 4 matrix whose entries on the main diagonal are the first four integers and the remaining elements are all zeros.
    1. Now, redefine the resulting matrix such that the entries on the main diagonal are now all nines.
6. **Optional** Consider two matrices $A = \begin{bmatrix} 1&2&7\\9&3&-2 \end{bmatrix}$ and $B = \begin{bmatrix} 0&-4\\3&-3\\0&1 \end{bmatrix}$.
    1. Find $A \cdot B$ and $A \times B$, i.e. the (dot) product and cross product between $A$ and $B$, respectively. Hint: The corresponding `NumPy` functions for these operations were not discussed in the workshop, but you can look them up on the internet!
    2. What is the shape of the resulting matrices for both matrix operations?