# NumPy

Version: 2024-9-17

For calculations that involve numeric data, the ```numpy``` library is the standard choice. ```numpy``` is a workhorse library whose data format and functions are used by most Python data processing packages. In this notebook we will go through basic manipulation of numeric data, including linear algebra functions.

We start with importing the NumPy library:

In [None]:
# Import the NumPy library
import numpy as np

You might notice we are using a slightly different way to import the library
than before. The ```as``` keyword allows us to shorten the name of the imported
module to whatever we wish it to be, which is convenient when the original 
module name is long.

### A. Data Format: Matrix vs Array
When you convert data into numpy format, you can store it as a ```matrix``` or an ```array```. Which one to use depends on your needs: 
- ```matrix```: data in this format behaves exactly as you would expect in linear algebra. This format also provides shortcuts to functions commonly used in linear algebra. The downside is that a matrix, by definition, has exactly two dimensions, so operations on ```matrix``` assume the data is 2D. You also lose some convenient features of numpy, and the NumPy documentation no longer recommends ```matrix``` for new code.
- ```array```: data in this format can have any number of dimensions, which is often necessary in data processing. Operations such as multiplication and power are also treated differently from ```matrix```. Most Python data processing libraries return data in this format.

We will cover ```matrix``` first, because linear algebra is likely more familiar to you, and then move on to ```array```.

### B. Creating Matrices

Let us start with creating a matrix. The syntax to create a matrix is
```python 
matrix_name = np.matrix([[row1],[row2],...])
```

So if we have a matrix 
$ A =
\begin{bmatrix}
a & b \\
c & d
\end{bmatrix}
$, we would create it by
```python
A = np.matrix([[a,b],[c,d]])
```

In [None]:
# Create a matrix


We can get the dimension/shape of the matrix by ```matrix_name.shape```.
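As a minimal sketch of the two steps above, the following builds a small matrix and reads its shape. The name `M` and its values are illustrative only.

```python
import numpy as np

# Illustrative 2x3 matrix: two rows, three columns
M = np.matrix([[1, 2, 3], [4, 5, 6]])

# shape is a (rows, columns) tuple
print(M.shape)  # (2, 3)
```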

In [None]:
# Shape of matrix


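There are additional commands to create special matrices. These return ```array``` objects, which you can wrap in ```np.matrix()``` if you need the ```matrix``` format: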
- Identity matrix: ```np.identity(dimension)``` or ```np.eye(dimension)```
- Matrix of ones: ```np.ones((row,column))```
- Matrix of zeros: ```np.zeros((row,column))```
- Empty matrix (allocated but uninitialized, so it holds arbitrary values): ```np.empty((row,column))```
- Replacing all values with a constant: ```matrix_name.fill(new_value)```
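The commands in the list above can be sketched as follows. The variable names and sizes are illustrative assumptions, chosen to differ from the exercises below.

```python
import numpy as np

ident = np.identity(2)     # 2x2 identity; np.eye(2) gives the same values
ones = np.ones((3, 2))     # 3 rows, 2 columns, every entry 1.0
zeros = np.zeros((2, 2))   # 2 rows, 2 columns, every entry 0.0

# np.empty only allocates memory, so fill it before reading from it.
# fill() modifies the object in place and returns None.
sevens = np.empty((2, 3))
sevens.fill(7)

# Wrap a result in np.matrix() when the matrix format is needed
ones_matrix = np.matrix(ones)
```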

In [None]:
# Identity matrix with 3 rows and 3 columns


In [None]:
# A 2x3 matrix of ones


In [None]:
# A 3x4 matrix of zeros


In [None]:
# A 3x3 matrix of 5
# There are many ways to achieve this

# Method 1


# Method 2


# Method 3


### C. Arithmetic

Basic arithmetic on matrices works as you would expect in linear algebra:
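A short sketch of these operations on two illustrative 2x2 matrices (the names and values are assumptions made for this example):

```python
import numpy as np

A = np.matrix([[1, 2], [3, 4]])
B = np.matrix([[5, 6], [7, 8]])

total = A + B     # [[6, 8], [10, 12]]
diff = A - B      # [[-4, -4], [-4, -4]]
product = A * B   # matrix product: [[19, 22], [43, 50]]
scaled = 2 * A    # every entry doubled: [[2, 4], [6, 8]]
```

Note that `*` between two `matrix` objects is the matrix product, not an entry-by-entry product.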

In [None]:
# Addition


In [None]:
# Subtraction


In [None]:
# Multiplication


In [None]:
# Multiplication with constant term


### D. Transpose and Inverse

- Transpose: ```matrix_name.T```
- Inverse: ```matrix_name.I```
- Determinant: ```np.linalg.det(matrix_name)```
- Rank: ```np.linalg.matrix_rank(matrix_name)```
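The four operations listed above can be tried on a small invertible matrix. This is a sketch with an illustrative matrix `A`; the inverse and determinant are computed in floating point, so the results are approximate.

```python
import numpy as np

A = np.matrix([[1, 2], [3, 4]])

A_t = A.T                         # [[1, 3], [2, 4]]
A_inv = A.I                       # approximately [[-2, 1], [1.5, -0.5]]
det = np.linalg.det(A)            # approximately -2.0
rank = np.linalg.matrix_rank(A)   # 2

# Sanity check: A times its inverse is (numerically) the identity
print(np.allclose(A * A_inv, np.identity(2)))  # True
```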

In [None]:
# Transpose


In [None]:
# Inverse


In [None]:
# Determinant


In [None]:
# Rank (number of linearly independent rows)


To change the shape of a matrix, use ```matrix_name.reshape(size)```. The new shape must hold the same number of elements as the original.
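A brief sketch of reshaping, using an illustrative matrix `M`. Elements are read and refilled row by row, so the result is generally not the same as the transpose.

```python
import numpy as np

M = np.matrix([[1, 2, 3, 4], [5, 6, 7, 8]])   # shape (2, 4)

R = M.reshape((4, 2))   # [[1, 2], [3, 4], [5, 6], [7, 8]]

# The transpose has the same shape but a different element order
print(R.shape == M.T.shape)   # True
print((R == M.T).all())       # False
```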

In [None]:
# A 2x3 matrix


# B is A reshaped to 3x2


### E. Accessing Individual Elements, Rows and Columns

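You can access an individual element within a matrix with the syntax below. Indices start at 0, which is why 1 is subtracted from the row and column numbers: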
```python
matrix_name[row - 1,column - 1]
```

In [None]:
A = np.matrix([[1,2],[3,4]])

# Element in the 1st row and 2nd column


Access a whole row by specifying only the row number:
```python
matrix_name[row - 1]
```

In [None]:
# 1st Row 


Access a whole column by giving ```:``` as the row index,
which means all rows, followed by the column number:
```python
matrix_name[:,column - 1]
```
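The three access patterns above can be sketched together on an illustrative matrix `M` (the name and values are assumptions for this example). Note that a row or column taken from a `matrix` is still a 2D `matrix`.

```python
import numpy as np

M = np.matrix([[10, 20], [30, 40]])

element = M[1, 0]      # 2nd row, 1st column: 30
second_row = M[1]      # [[30, 40]], shape (1, 2)
second_col = M[:, 1]   # [[20], [40]], shape (2, 1)
```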

In [None]:
# 1st Column


The ```:``` above represents a range. You can specify the 
beginning and the end of the range as follows:
```python
start:end
```
As with ```range()```, the range starts at ```start``` but 
ends *before* ```end```. 

If you do not specify ```start```, 
the range starts from the first element. If you do not 
specify ```end```, the range runs through the last element.
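A few slices on an illustrative 3x3 matrix `M` show how `start:end` and the defaults behave. The chosen sub-blocks are arbitrary examples.

```python
import numpy as np

M = np.matrix([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

top_left = M[0:2, 0:2]   # rows 0-1, columns 0-1: [[1, 2], [4, 5]]
last_row = M[2:, :]      # from row 2 to the end, all columns: [[7, 8, 9]]
first_two = M[:2, 2]     # rows 0-1 of the last column: [[3], [6]]
```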

In [None]:
# Create a 3x3 matrix
A = np.matrix([[1,2,3],[4,5,6],[7,8,9]])

# Bottom-right 2x2 matrix


# The two leftmost columns



### F. Arrays

To see the difference between ```matrix``` and ```array```, let us create identical sets of data in both formats:

In [None]:
A = np.matrix([[1,2],[3,4]])
B = np.matrix([[2,2],[2,2]])

C = np.array([[1,2],[3,4]])
D = np.array([[2,2],[2,2]])

Now let us try multiplying the data:
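A minimal sketch of the comparison, redefining the four objects so that the block runs on its own:

```python
import numpy as np

A = np.matrix([[1, 2], [3, 4]])
B = np.matrix([[2, 2], [2, 2]])
C = np.array([[1, 2], [3, 4]])
D = np.array([[2, 2], [2, 2]])

print(A * B)   # [[ 6  6]
               #  [14 14]]
print(C * D)   # [[2 4]
               #  [6 8]]
```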

The results are different. ```A*B``` gave us the result of a matrix multiplication, while ```C*D``` gave us something else. What did it do, exactly?

Let us try squaring ```A``` and ```C``` instead:
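The same data squared in both formats, as a self-contained sketch:

```python
import numpy as np

A = np.matrix([[1, 2], [3, 4]])
C = np.array([[1, 2], [3, 4]])

print(A ** 2)   # matrix product A * A: [[7, 10], [15, 22]]
print(C ** 2)   # each entry squared:   [[1, 4], [9, 16]]
```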

Now it should be clear what ```array``` operations do: they are applied *elementwise*. In other words,
$$
A \text{ op } B = \left[a_{ij} \text{ op } b_{ij} \right]
$$

For matrix operations that are elementwise in nature, such as addition, the ```matrix``` and ```array``` formats give the same results:
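Addition is one such operation. A short sketch, again redefining the data so the block is self-contained:

```python
import numpy as np

A = np.matrix([[1, 2], [3, 4]])
B = np.matrix([[2, 2], [2, 2]])
C = np.array([[1, 2], [3, 4]])
D = np.array([[2, 2], [2, 2]])

print(A + B)   # [[3, 4], [5, 6]]
print(C + D)   # [[3, 4], [5, 6]]
```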

To perform matrix multiplication on arrays, you can use ```np.dot()```:
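A sketch of matrix multiplication on arrays. For 2D arrays the `@` operator gives the same result as `np.dot()`.

```python
import numpy as np

C = np.array([[1, 2], [3, 4]])
D = np.array([[2, 2], [2, 2]])

dot_result = np.dot(C, D)   # [[6, 6], [14, 14]]
at_result = C @ D           # same values as np.dot(C, D) for 2D arrays
```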

```array``` does not have a ```.I``` shortcut for the inverse:
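Trying the shortcut on an array raises an `AttributeError`. The sketch below catches the error so that the cell still runs to completion:

```python
import numpy as np

C = np.array([[1, 2], [3, 4]])

try:
    C.I
except AttributeError as err:
    # Arrays have no .I attribute, so this branch is taken
    print("AttributeError:", err)
```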

Instead you have to use ```np.linalg.inv()```:
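A minimal sketch, with a numerical check that the result really is the inverse:

```python
import numpy as np

C = np.array([[1, 2], [3, 4]])

C_inv = np.linalg.inv(C)   # approximately [[-2, 1], [1.5, -0.5]]

# C times its inverse should be (numerically) the identity
print(np.allclose(np.dot(C, C_inv), np.identity(2)))  # True
```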

Conversion between the two formats can be done with ```np.asarray()``` and ```np.asmatrix()```:
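The sketch below converts in both directions and shows that the meaning of `*` follows the format. The variable names are illustrative.

```python
import numpy as np

A = np.matrix([[1, 2], [3, 4]])

as_array = np.asarray(A)           # now a plain ndarray
as_matrix = np.asmatrix(as_array)  # back to the matrix format

print(type(as_array).__name__)     # ndarray
print(type(as_matrix).__name__)    # matrix

print(as_array * as_array)         # elementwise: [[1, 4], [9, 16]]
print(as_matrix * as_matrix)       # matrix product: [[7, 10], [15, 22]]
```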

### G. Basic Statistics

- Mean: ```np.mean()```
- Median: ```np.median()```
- Variance: ```np.var()```
- Standard Deviation: ```np.std()```
- Mode: ```scipy.stats.mode()```

The default behavior of these functions is to take all elements in the array into consideration:
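A short sketch on an illustrative 2x3 array `X`. Note that `np.var()` and `np.std()` divide by the number of elements by default (`ddof=0`), which gives the population rather than the sample statistic.

```python
import numpy as np

X = np.array([[1, 2, 3], [4, 5, 6]])

mean_all = np.mean(X)      # 3.5, over all six elements
median_all = np.median(X)  # 3.5
var_all = np.var(X)        # 17.5 / 6, approximately 2.9167
std_all = np.std(X)        # square root of the variance above
```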

In [None]:
# Mean


If you want the mean across a particular dimension, you need to specify that with the ```axis``` option:
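The `axis` argument names the dimension that is collapsed. In the sketch below, on an illustrative 2x3 array `X`, `axis=0` averages down the rows and returns one value per column, while `axis=1` averages along each row and returns one value per row.

```python
import numpy as np

X = np.array([[1, 2, 3], [4, 5, 6]])

col_means = np.mean(X, axis=0)   # one value per column: [2.5, 3.5, 4.5]
row_means = np.mean(X, axis=1)   # one value per row:    [2.0, 5.0]
```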

In [None]:
# Mean across rows


# Mean across columns


### H. Random Number Generators

- Uniform, from 0 up to but not including 1: ```np.random.rand(d0,d1,...)```
- Uniform, integers, from *low* up to but not including *high*: ```np.random.randint(low,high,size)```
- Standard normal distribution: ```np.random.randn(d0,d1,...)```
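A small sketch of the three generators. The seed is an assumption added only so that reruns give the same draws; the shapes and the die-roll range are illustrative. The standard normal draw uses `np.random.randn()`.

```python
import numpy as np

np.random.seed(0)   # fixed seed, only for reproducibility of this sketch

uniform = np.random.rand(2, 3)        # shape (2, 3), values in [0, 1)
rolls = np.random.randint(1, 7, 5)    # 5 integers from 1 to 6 (7 is excluded)
normal = np.random.randn(2, 3)        # shape (2, 3), standard normal draws
```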

In [None]:
# 10 random integers between 0 and 9 (inclusive)
