# NumPy

When it comes to calculations involving numeric data, look no further than the ```numpy``` library. ```numpy``` is a workhorse library whose data format and functions are used by most Python data processing packages. In this notebook we will go through basic manipulation of numeric data, including linear algebra functions.

We start with importing the NumPy library:

In [2]:
#Import the NumPy library
import numpy as np

You might notice we are using a slightly different way to import the library
than before. The ```as``` keyword allows us to shorten the name of the imported
module to whatever we wish it to be, which is convenient when the original 
module name is long.

### A. Data Format: Matrix vs Array
When you convert data into numpy format, you have a choice of converting it into a ```matrix``` or an ```array```. Which one to use depends on your needs:
- ```matrix```: data in this format behaves the way you would expect in linear algebra, and the format provides shortcuts to functions commonly used in linear algebra. The downside is that a matrix, by definition, has exactly two dimensions, so operations on a ```matrix``` assume the data is 2D. You also lose some convenient features of numpy. Note that the NumPy documentation no longer recommends this class for new code and advises using regular arrays instead.
- ```array```: data in this format can have any number of dimensions, which is often necessary when processing data. Operations such as multiplication and power are also treated differently from ```matrix```. Most Python data processing libraries return data in this format.

We will first cover ```matrix``` because linear algebra is what you should be more familiar with, followed by ```array```.
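
The difference in dimensionality described above can be seen in a short sketch. The variable names here are illustrative and not part of the original notebook:

```python
import numpy as np

# A flat list becomes a 1 x 3 matrix: np.matrix is always two-dimensional
m = np.matrix([1, 2, 3])
print(m.shape)   # (1, 3)

# The same list becomes a one-dimensional array
a = np.array([1, 2, 3])
print(a.shape)   # (3,)

# Arrays can have more than two dimensions
t = np.zeros((2, 3, 4))
print(t.ndim)    # 3
```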

### B. Creating Matrices

Let us start with creating a matrix. The syntax to create a matrix is
```python 
matrix_name = np.matrix([[row1],[row2],...])
```

So if we have a matrix 
$ A =
\begin{bmatrix}
a & b \\
c & d
\end{bmatrix}
$, we would create it by
```python
A = np.matrix([[a,b],[c,d]])
```

In [3]:
#Create a matrix
A = np.matrix([[1,2],[3,4]])
A

matrix([[1, 2],
        [3, 4]])

We can get the dimension/shape of the matrix by ```matrix_name.shape```.

In [4]:
#Shape of matrix
A.shape

(2, 2)

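There are additional commands to create special matrices. Note that these functions return an ```array```, as the outputs below show; wrap the result in ```np.matrix()``` if you need a ```matrix```:
- Identity matrix: ```np.identity(dimension)``` or ```np.eye(dimension)```
- Matrix of ones: ```np.ones((row,column))```
- Matrix of zeros: ```np.zeros((row,column))```
- Uninitialized matrix (its contents are arbitrary until you assign them): ```np.empty((row,column))```
- Replacing all values with a constant, in place: ```matrix_name.fill(new_value)```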

In [5]:
#Identity matrix with 3 rows and 3 columns
I = np.eye(3)
I

array([[ 1.,  0.,  0.],
       [ 0.,  1.,  0.],
       [ 0.,  0.,  1.]])

In [6]:
#A 2x3 matrix of ones
np.ones((2,3))

array([[ 1.,  1.,  1.],
       [ 1.,  1.,  1.]])

In [7]:
#A 3x4 matrix of zeros
np.zeros((3,4))

array([[ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.]])

In [8]:
#A 3x3 matrix of 5
#There are many ways to achieve this

#Method 1
B = np.zeros((3,3))
B.fill(5)
print(B)

#Method 2
C = np.eye(3)
C.fill(5)
print(C)

#Method 3
D = np.ones((3,3))
D = D * 5
print(D)

[[ 5.  5.  5.]
 [ 5.  5.  5.]
 [ 5.  5.  5.]]
[[ 5.  5.  5.]
 [ 5.  5.  5.]
 [ 5.  5.  5.]]
[[ 5.  5.  5.]
 [ 5.  5.  5.]
 [ 5.  5.  5.]]
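
A fourth option, which the notebook does not use, is ```np.full()```. The sketch below also shows that ```fill()``` works in place and returns ```None```, a common source of bugs when its result is assigned back to the variable. The names ```F```, ```G``` and ```result``` are illustrative:

```python
import numpy as np

# np.full builds the constant array in one step
F = np.full((3, 3), 5.0)
print(F)

# fill() changes the array in place and returns None,
# so do not write G = G.fill(5)
G = np.zeros((3, 3))
result = G.fill(5)
print(result)                 # None
print(np.array_equal(F, G))   # True
```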


To change the shape of a matrix, use ```matrix_name.reshape(size)```. The new shape must hold the same number of elements as the original.

In [12]:
#A 2x3 matrix
A = np.array([[1,2,3],[4,5,6]])
print(A)
#B is A reshaped to 3x2
B = A.reshape(3,2)
print(B)

[[1 2 3]
 [4 5 6]]
[[1 2]
 [3 4]
 [5 6]]
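
A few more ```reshape``` calls, as a hedged sketch: passing ```-1``` for one dimension lets NumPy infer it, and a shape with the wrong number of elements raises a ```ValueError```. The names ```B``` and ```flat``` are illustrative:

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])

# -1 asks NumPy to infer that dimension from the total number of elements
B = A.reshape(-1, 2)
print(B.shape)   # (3, 2)

# Flatten to one dimension
flat = A.reshape(6)
print(flat)      # [1 2 3 4 5 6]

# The element count must be preserved: 6 elements cannot fill a 4 x 2 shape
try:
    A.reshape(4, 2)
except ValueError as err:
    print("reshape failed:", err)
```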


### C. Arithmetic

Basic arithmetic on a ```matrix``` works as you would expect from linear algebra:

In [14]:
#Addition
A = np.matrix([[1,2],[3,4]])
B = np.ones((2,2)) * 2
print(A)
print(B)
print(A + B)

[[1 2]
 [3 4]]
[[ 2.  2.]
 [ 2.  2.]]
[[ 3.  4.]
 [ 5.  6.]]


In [15]:
#Subtraction
A - B

matrix([[-1.,  0.],
        [ 1.,  2.]])

In [16]:
#Multiplication
A * B

matrix([[  6.,   6.],
        [ 14.,  14.]])

In [17]:
#Multiplication with constant term
A * 2

matrix([[2, 4],
        [6, 8]])
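
Because ```*``` on a ```matrix``` is a true matrix product, the inner dimensions must agree. The following sketch uses two hypothetical non-square matrices, ```P``` and ```Q```, to illustrate this:

```python
import numpy as np

P = np.matrix([[1, 2, 3], [4, 5, 6]])      # 2 x 3
Q = np.matrix([[1, 0], [0, 1], [1, 1]])    # 3 x 2

PQ = P * Q
print(PQ)             # 2 x 2 result: [[ 4  5], [10 11]]
print((Q * P).shape)  # (3, 3)

# A 2 x 3 matrix cannot be multiplied by another 2 x 3 matrix
try:
    P * P
except ValueError as err:
    print("cannot multiply:", err)
```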

### D. Transpose and Inverse

- Transpose: ```matrix_name.T```
- Inverse: ```matrix_name.I```
- Determinant: ```np.linalg.det(matrix_name)```
- Rank: ```np.linalg.matrix_rank(matrix_name)```

In [18]:
#Transpose
A.T

matrix([[1, 3],
        [2, 4]])

In [19]:
#Inverse
A.I

matrix([[-2. ,  1. ],
        [ 1.5, -0.5]])

In [20]:
#Determinant
np.linalg.det(A)

-2.0000000000000004
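
The determinant above is not exactly -2 because of floating-point rounding. A minimal sketch of how such results can be checked with a tolerance, and of how rank behaves for a singular matrix (```S``` is an illustrative example, not from the notebook):

```python
import numpy as np

A = np.matrix([[1, 2], [3, 4]])

# A times its inverse should be the identity, up to rounding error
print(np.allclose(A * A.I, np.eye(2)))     # True

# Compare floating-point determinants with a tolerance, not with ==
print(np.isclose(np.linalg.det(A), -2.0))  # True

# A has full rank, while the second row of S is twice the first
S = np.matrix([[1, 2], [2, 4]])
print(np.linalg.matrix_rank(A))  # 2
print(np.linalg.matrix_rank(S))  # 1
```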

In [None]:
#Rank (number of linearly independent rows)
np.linalg.matrix_rank(A)


### E. Accessing Individual elements, Rows and Columns

Indices in NumPy start at 0, so the element in a given row and column of a matrix is accessed with
```python
matrix_name[row - 1,column - 1]
```

In [21]:
#Element in the 1st row and 2nd column
A[0,1]

2

Access a whole row by specifying only the row number:
```python
matrix_name[row - 1]
```

In [22]:
#1st Row 
A[0]

matrix([[1, 2]])

Access a whole column by specifying the row number to be ```:```,
which means all rows, and the column number:
```python
matrix_name[:,column - 1]
```

In [23]:
#1st Column
A[:,0]

matrix([[1],
        [3]])

The ```:``` above represents a range. You can specify the 
beginning and the end of the range as follows:
```python
start:end
```
As with ```range()```, the range starts at ```start``` but 
ends *before* ```end```. 

If you do not specify ```start```, 
the range starts from the first row or column. If you do not 
specify ```end```, the range runs through the last row or column.

In [28]:
#Create a 3x3 matrix
A=np.matrix([[1,2,3],[4,5,6],[7,8,9]])
print(A)
#Bottom-right 2x2 matrix
print(A[1:3,1:3])

#The two leftmost columns
print(A[:,0:2])


[[1 2 3]
 [4 5 6]
 [7 8 9]]
[[5 6]
 [8 9]]
[[1 2]
 [4 5]
 [7 8]]
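
The default ```start``` and ```end``` values can be sketched with the same 3x3 matrix. The slices below are illustrative variations on the cell above:

```python
import numpy as np

A = np.matrix([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Omitting end runs through the last row/column, so this equals A[1:3, 1:3]
print(A[1:, 1:])

# Omitting start begins at index 0, so this equals A[:, 0:2]
print(A[:, :2])

# end is excluded: rows 0 and 1 of the first column, but not row 2
print(A[0:2, 0])
```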


### F. Arrays

To see the difference between ```matrix``` and ```array```, let us create identical sets of data in both formats:

In [7]:
A = np.matrix([[1,2],[3,4]])
B = np.matrix([[2,2],[2,2]])

C = np.array([[1,2],[3,4]])
D = np.array([[2,2],[2,2]])

Now let us try multiplying the data:

In [8]:
A * B

matrix([[ 6,  6],
        [14, 14]])

In [9]:
C * D

array([[2, 4],
       [6, 8]])

The results are different. ```A*B``` gave us the result of a matrix multiplication, while ```C*D``` gave us something else. What did it do, exactly?

Let us try squaring ```A``` and ```C``` instead:

In [9]:
A**2

matrix([[ 7, 10],
        [15, 22]])

In [10]:
C**2

array([[ 1,  4],
       [ 9, 16]])

Now it should be clear what ```array``` operations do: they are performed *elementwise*. In other words,
$$
A \text{ op } B = \left[a_{ij} \text{ op } b_{ij} \right]
$$
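
As a sketch of this rule, the same pattern holds for other operators on arrays. For ```matrix``` objects, ```np.multiply()``` gives the elementwise product when that is what you need:

```python
import numpy as np

C = np.array([[1, 2], [3, 4]])
D = np.array([[2, 2], [2, 2]])

print(C * D)    # [[2 4], [6 8]]
print(C / D)    # [[0.5 1. ], [1.5 2. ]]
print(C ** D)   # [[ 1  4], [ 9 16]]

# For matrix objects, np.multiply gives the elementwise product
A = np.matrix([[1, 2], [3, 4]])
B = np.matrix([[2, 2], [2, 2]])
print(np.multiply(A, B))   # [[2 4], [6 8]]
```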

For operations that are elementwise in nature, such as addition and subtraction, the ```matrix``` and ```array``` formats give the same results:

In [7]:
A - B

matrix([[-1,  0],
        [ 1,  2]])

In [8]:
C - D 

array([[-1,  0],
       [ 1,  2]])

To perform matrix multiplication on arrays, you can use ```np.dot()```:

In [19]:
np.dot(C,D)

array([[ 6,  6],
       [14, 14]])
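
In Python 3.5 and later, the ```@``` operator is an alternative way to write the matrix product of two arrays. A minimal sketch:

```python
import numpy as np

C = np.array([[1, 2], [3, 4]])
D = np.array([[2, 2], [2, 2]])

# Matrix product with the @ operator
print(C @ D)                                 # [[ 6  6], [14 14]]

# Same result as np.dot, different from the elementwise product
print(np.array_equal(C @ D, np.dot(C, D)))   # True
print(np.array_equal(C @ D, C * D))          # False
```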

An ```array``` does not have a ```.I``` shortcut for the inverse:

In [17]:
C.I

AttributeError: 'numpy.ndarray' object has no attribute 'I'

Instead you have to use ```np.linalg.inv()```:

In [18]:
np.linalg.inv(C)

array([[-2. ,  1. ],
       [ 1.5, -0.5]])

Conversion between the two formats can be done with ```np.asarray()``` and ```np.asmatrix()```:

In [15]:
np.asarray(A)

array([[1, 2],
       [3, 4]])
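
One detail worth knowing, sketched below: ```np.asmatrix()``` does not copy an existing array, so the matrix and the array share the same data, whereas ```np.matrix()``` copies by default. The names ```M``` and ```N``` are illustrative:

```python
import numpy as np

C = np.array([[1, 2], [3, 4]])
M = np.asmatrix(C)

print(type(M).__name__)   # matrix
print(M * M)              # matrix product: [[ 7 10], [15 22]]

# asmatrix does not copy an ndarray, so a change to C is visible through M
C[0, 0] = 100
print(M[0, 0])            # 100

# np.matrix(C) copies by default, giving an independent object
N = np.matrix(C)
C[0, 0] = 1
print(N[0, 0])            # 100
print(M[0, 0])            # 1
```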

### G. Basic Statistics

- Mean: ```np.mean()```
- Median: ```np.median()```
- Variance: ```np.var()```
- Standard Deviation: ```np.std()```
- Mode: ```scipy.stats.mode()``` (from the SciPy library; NumPy itself has no mode function)

By default, these functions use every element of the array:

In [28]:
#Mean
np.mean(A)

2.5

If you want the mean along a particular dimension, specify it with the ```axis``` argument:

In [30]:
#Mean of each column (axis=0 averages over the rows)
print(np.mean(A, axis=0))

#Mean of each row (axis=1 averages over the columns)
print(np.mean(A, axis=1))

[[ 2.  3.]]
[[ 1.5]
 [ 3.5]]
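
The output above is two-dimensional because ```A``` is a ```matrix```. With an ```array```, the same calls return one-dimensional results. The sketch below uses an illustrative array ```X``` and also shows the ```ddof``` argument of ```np.var()```, which switches between population and sample variance:

```python
import numpy as np

X = np.array([[1, 2], [3, 4]])

print(np.mean(X, axis=0))   # [2. 3.]    one mean per column
print(np.mean(X, axis=1))   # [1.5 3.5]  one mean per row
print(np.median(X))         # 2.5

# np.var and np.std divide by N by default (population statistics);
# ddof=1 divides by N - 1 instead (sample statistics)
print(np.var(X))            # 1.25
print(np.var(X, ddof=1))    # 5/3, about 1.667
```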


### H. Random Number Generators

- Uniform on the interval from 0 (inclusive) to 1 (exclusive): ```np.random.rand(d0,d1,...)```
- Uniform integers from *low* (inclusive) to *high* (exclusive): ```np.random.randint(low,high,size)```
- Standard normal distribution: ```np.random.randn(d0,d1,...)```

In [31]:
#10 random integers between 0 and 9
E = np.random.randint(0,10,10)
E

array([0, 6, 6, 5, 6, 8, 1, 6, 3, 0])
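
Random draws differ on every run unless the generator is seeded. A minimal sketch using ```np.random.seed()```, where the seed value 42 and the variable names are arbitrary choices for illustration:

```python
import numpy as np

np.random.seed(42)                # 42 is an arbitrary seed chosen for illustration
U = np.random.rand(2, 3)          # uniform on [0, 1), shape (2, 3)
Z = np.random.randn(1000)         # 1000 draws from the standard normal
E = np.random.randint(0, 10, 10)  # 10 integers from 0 to 9 inclusive

print(U.shape)                          # (2, 3)
print(E.min() >= 0 and E.max() <= 9)    # True

# Re-seeding with the same value reproduces the same draws
np.random.seed(42)
U_again = np.random.rand(2, 3)
print(np.array_equal(U, U_again))       # True
```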

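Finally, returning to the format conversion from section F, ```np.asmatrix()``` converts an array to a matrix:

In [16]: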
np.asmatrix(C)

matrix([[1, 2],
        [3, 4]])