![alt text](https://upload.wikimedia.org/wikipedia/commons/thumb/1/1a/NumPy_logo.svg/775px-NumPy_logo.svg.png)

<h1> NumPy and Python </h1>

<h2> Introduction </h2> <br>
NumPy (Numerical Python) is a very popular Python library with a large collection of functions and utilities for numerical computing.
Later, when we introduce PyTorch (the deep learning framework used in this unit), we will see that it has many similarities to NumPy. In this way NumPy is our stepping stone between basic Python and PyTorch!

In [1]:
print("Hello")

Hello


<h3> Python recap </h3> <br>
Let's use pure Python to perform some matrix operations

In [2]:
#Create some "Matrices" as lists of lists  

#3x3
W = [[1, 1, 1],
     [1.5, 1.5, 1.5],
     [2, 2, 2]]

#3x1
x = [[6], [7], [8]]
#3x1
b = [[1], [1], [1]]

#Variable to store output
#3x1
y = [[0], [0], [0]]

In [7]:
for i in range(len(W)):
    for j in range(len(x)):
        y[i][0] += W[i][j] * x[j][0]

for i in range(len(y)):
    y[i][0] += b[i][0]
print("Output y:", y)

Output y: [[22], [32.5], [43]]


Let's now compute Wx + b

In [None]:
#Reset y so that re-running this cell does not add to a previous result
y = [[0], [0], [0]]

#First let's compute W*x
for i in range(len(W)):
    for j in range(len(x)):
        y[i][0] += W[i][j] * x[j][0]

#now let's add b
for i in range(len(y)):
    y[i][0] += b[i][0]
    
#print out the result
print("Output:\n", y)

<h3> Introducing.... NumPy! </h3> <br>
See how tedious matrix operations are in pure Python!?<br> Instead of writing our own functions (often a waste of time, as existing implementations are usually much better than our own), let's use NumPy to do the same thing!

In [10]:
#First we must import the numpy library! 
import numpy as np

In [11]:
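W_np = np.array(W)
x_np = np.array(x)
b_np = np.array(b)
#Careful: "*" multiplies element by element (broadcasting x across the columns of W)
#so this is NOT the matrix product Wx + b, and the result is 3x3 instead of 3x1
y_np = W_np * x_np + b_np
print("Output using numpy:\n", y_np)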

Output using numpy:
 [[ 7.   7.   7. ]
 [11.5 11.5 11.5]
 [17.  17.  17. ]]


In [None]:
#We can transform our list of lists into a "numpy array" by using the function "array"
W_np = np.array(W)

x_np = np.array(x)

#let's use the function "ones" to create an array of ones!
b_np = np.ones((3, 1))

#Let's now compute Wx + b using these numpy variables!
output = np.matmul(W_np, x_np) + b_np

#print out the result
print("Output:\n", output)
print("Output shape:\n", output.shape)
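It is easy to mix up the element-wise `*` operator with a true matrix product. The cell below is a minimal sketch of the difference, using the same values as above; the `_demo` names are only for illustration.

```python
import numpy as np

W_demo = np.array([[1, 1, 1],
                   [1.5, 1.5, 1.5],
                   [2, 2, 2]])
x_demo = np.array([[6], [7], [8]])
b_demo = np.ones((3, 1))

# "*" multiplies element by element, stretching the 3x1 column across the 3 columns of W
elementwise = W_demo * x_demo + b_demo

# "@" (the same as np.matmul for 2D arrays) computes the matrix product
matrix_product = W_demo @ x_demo + b_demo

print("Element-wise shape:", elementwise.shape)
print("Matrix product shape:", matrix_product.shape)
print("Matrix product:\n", matrix_product)
```

Only the matrix product gives the 3x1 result we expect from Wx + b.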

<h2> Tensors aka Multi-Dimensional Arrays </h2> <br>
We've seen how we can create a "matrix" using np.array and a list of lists. Let's take a closer look at these multidimensional "arrays", also known as "Tensors"
<br>
e.g.:<br>
A "Scalar" is a 0D Tensor<br>
A "Vector" is a 1D Tensor<br>
A "Matrix" is a 2D Tensor, and so on<br>
An understanding of Tensors will be essential when we move to PyTorch!!

In [None]:
#Let's use the "random" function (part of the np.random module) to create a 3D Tensor!
T = np.random.random((2,3,4))
#Let's print it out!
print("Our 3D Tensor:\n", T)

From the above printout we can visualise what our 3D Tensor looks like: we can think of it (for this Tensor) as two 3x4 matrices stacked together. This interpretation will be useful when we move to higher dimensional Tensors later
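To make the "stacked matrices" picture concrete, here is a small sketch that pulls the two 3x4 matrices out of a 2x3x4 Tensor and stacks them back together. The names `T_demo`, `first_matrix`, `second_matrix` and `rebuilt` are just for this example.

```python
import numpy as np

T_demo = np.random.random((2, 3, 4))

# Indexing the first dimension picks out one of the two 3x4 matrices
first_matrix = T_demo[0]
second_matrix = T_demo[1]
print("First matrix shape:", first_matrix.shape)

# Stacking the two matrices along a new leading dimension rebuilds the Tensor
rebuilt = np.stack([first_matrix, second_matrix])
print("Rebuilt shape:", rebuilt.shape)
print("Same as the original:", np.array_equal(rebuilt, T_demo))
```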

<h2> Basic Element-wise Operations </h2> <br>
Let's see how we can perform some basic "element-wise" operations on our numpy arrays (aka our Tensors)<br>
Note: By "element-wise" we mean that the operation is applied independently to every value, as opposed to something like a matrix operation

In [None]:
#let's create a 1D Tensor using "arange"
y = np.arange(10)
#this will create a "Vector" of the integers from 0 to 9 (the end value 10 is excluded)
print("Our 1D Tensor:\n",y)

#We can perform normal python scalar arithmetic on numpy arrays
print("\nScalar Multiplication:\n",y * 10)
print("Addition and Square:\n",(y + 1)**2)
print("Addition:\n",y + y)
print("Addition and division:\n",y / (y + 1))

#We can use a combination of numpy functions and normal python arithmetic
print("\nPower and square root:\n",np.sqrt(y**2))

#Numpy arrays are objects and have methods
print("\nY -\n Min:%.2f\n Max:%.2f\n Standard Deviation:%.2f\n Sum:%.2f" %(y.min(), y.max(), y.std(), y.sum()))
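As a quick check of what "element-wise" means, the sketch below compares one of the expressions above with the same calculation written as an explicit Python loop. The names `y_demo`, `vectorised` and `looped` are only for illustration.

```python
import numpy as np

y_demo = np.arange(10)

# Element-wise: one expression applied to every value
vectorised = (y_demo + 1) ** 2

# The same calculation written as an explicit Python loop
looped = np.array([(value + 1) ** 2 for value in y_demo])

print("Vectorised:", vectorised)
print("Match:", np.array_equal(vectorised, looped))

# Dividing integer arrays with "/" gives a floating point result
print("dtype after division:", (y_demo / (y_demo + 1)).dtype)
```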

<h2> Matrix operations </h2> <br>
Now let's use some Numpy functions to perform some real matrix operations

In [None]:
#Create Matrix-1
matrix_1 = np.random.random((3,3))
#Create Matrix-2
matrix_2 = np.random.random((3,3))

#Add the 2 Matrices
print("Addition:\n",np.add(matrix_1,matrix_2))

#Subtraction
print("Subtraction:\n",np.subtract(matrix_1,matrix_2))

#Matrix multiplication (np.matmul, not the element-wise *)
print("Multiplication:\n",np.matmul(matrix_1,matrix_2))

print("\nFor Matrix 1:")
#Calculate its inverse
print("The inverse is:\n",np.linalg.inv(matrix_1))

#Transpose the matrix
print("The Transpose is:\n", matrix_1.T)

#Calculate the Determinant
print("The Determinant is:\n", np.linalg.det(matrix_1))

#Print the Trace
print("The Trace is:\n", matrix_1.trace())

#Calculate the Rank
print("The Rank is:\n", np.linalg.matrix_rank(matrix_1))

# Calculate the Eigenvalues and Eigenvectors of that Matrix
eigenvalues ,eigenvectors=np.linalg.eig(matrix_1)

print("The Eigenvalues are:\n",eigenvalues)
print("The Eigenvectors are:\n",eigenvectors)
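With random matrices it is hard to tell whether the results are right, so here is a small sanity check on a fixed 2x2 matrix. The matrix `A` is an arbitrary choice for illustration; the checks use standard identities (A times its inverse is the identity, A v = lambda v, trace = sum of eigenvalues, determinant = product of eigenvalues).

```python
import numpy as np

# A small fixed symmetric matrix (chosen only for illustration)
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

A_inv = np.linalg.inv(A)
# A multiplied by its inverse should be (numerically) the identity matrix
print("A @ A_inv is identity:", np.allclose(A @ A_inv, np.eye(2)))

eigenvalues, eigenvectors = np.linalg.eig(A)
# Each COLUMN of "eigenvectors" pairs with the eigenvalue at the same position
for k in range(len(eigenvalues)):
    v = eigenvectors[:, k]
    print("A v == lambda v:", np.allclose(A @ v, eigenvalues[k] * v))

# The trace equals the sum of the eigenvalues, the determinant equals their product
print("Trace check:", np.isclose(A.trace(), eigenvalues.sum()))
print("Determinant check:", np.isclose(np.linalg.det(A), eigenvalues.prod()))
```

Note that the eigenvectors are stored as columns, which is easy to get wrong when indexing.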

<h2> Tensor Manipulation </h2>
<b>Being able to index and change the shape of Tensors is one of the most important skills that you will need to learn going forward!<br>
I cannot overstate the importance of developing an intuitive understanding of the following!!</b>
<br>
<h3> Indexing </h3> <br>
Just as with lists, we can index and slice numpy arrays, and being able to do this with matrices (or Tensors) is very important!

<b> Vectors! </b>

In [None]:
#Create a vector
vector = np.array([ 1,2,3,4,5,6 ])
print("Our Vector:\n",vector)

#Select all elements of a vector
print("All elements:\n",vector[:])

#Select everything up to and including the 3rd element
print("Within a range:\n",vector[:3])

#Select everything after the 3rd element
print("Another range:\n",vector[3:])

#Select the last element
print("The last element:\n",vector[-1])

#Select 3rd element of Vector
#INDEXING STARTS AT 0
print("3rd Element:\n",vector[2])
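Two more things are worth knowing about slices: they accept a step, and a basic slice of a numpy array is a view onto the same memory rather than a copy (unlike slicing a Python list). The sketch below illustrates both; `vector_demo`, `first_three` and `independent` are names made up for this example.

```python
import numpy as np

vector_demo = np.array([1, 2, 3, 4, 5, 6])

# A slice is start:stop:step, so this takes every 2nd element
print("Every 2nd element:", vector_demo[::2])

# A negative step walks backwards through the vector
print("Reversed:", vector_demo[::-1])

# Basic slices are views: writing to the slice also changes the original array
first_three = vector_demo[:3]
first_three[0] = 100
print("Original after editing the slice:", vector_demo)

# Use .copy() when an independent array is needed
independent = vector_demo[:3].copy()
independent[0] = -1
print("Original is unchanged by the copy:", vector_demo)
```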

<b> Matrix! </b>

In [None]:
#Create a Matrix
matrix = np.array([[1,2,3],[4,5,6],[7,8,9]])
print("Our Matrix:\n",matrix)

#Select 2nd row 2nd column
print("A single element:\n",matrix[1,1])

#Select the first 2 rows and all the columns of the matrix
print("Index rows with all columns:\n",matrix[:2,:])

#Select all rows and the first 2 columns of the matrix
print("Index columns with all rows:\n",matrix[:,:2])

#Select the first 2 rows and the 2nd and 3rd columns of the matrix
print("Index columns and rows:\n",matrix[:2,1:])
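A detail that often causes shape bugs later on: indexing with an integer removes that dimension, while indexing with a slice keeps it. A minimal sketch, with `matrix_demo`, `column_1d` and `column_2d` as illustrative names:

```python
import numpy as np

matrix_demo = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# An integer index removes that dimension: the 2nd column comes back as a 1D vector
column_1d = matrix_demo[:, 1]
print("Integer index:", column_1d, "shape:", column_1d.shape)

# A slice keeps the dimension: the same column comes back as a 3x1 matrix
column_2d = matrix_demo[:, 1:2]
print("Slice:\n", column_2d, "\nshape:", column_2d.shape)
```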

<h3> Describing Tensors </h3> <br>
Let's see how we can view the characteristics of our Tensors

In [None]:
#Lets create a large 3D Tensor
Tensor = np.random.randint(0, 10, (3, 5, 5))

#View the number of elements in every dimension
print("The Tensor's shape is:", Tensor.shape)

#View the number of elements in total
print("The Tensor's size is:", Tensor.size)

#View the number of dimensions (3 in this case)
print("There are %d Dimensions" %(Tensor.ndim))


<h3> Reshaping </h3> <br>
We can change a Tensor to one of the same size (same number of elements) but a different shape by using some Numpy functions

In [None]:
print("Original Tensor:\n", Tensor)

#We can use the flatten method to convert to a 1D Tensor
print("Flatten to a 1D Tensor:\n",Tensor.flatten())

#Let us reshape our Tensor to a 2D Tensor
print("Reshape to 3x25:\n", Tensor.reshape(3,25))

#Here the -1 tells Numpy to work out the size of this dimension from the total number of elements
#AKA "I don't care about the size of this dimension as long as the first one is 5"
print("Reshape to 5xwhatever:\n",Tensor.reshape(5,-1))
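The -1 only works when the remaining size divides evenly. The sketch below uses a 3x5x5 Tensor of 75 elements (built with `arange` so the values are predictable; `tensor_demo` is an illustrative name) to show one reshape that works and one that fails.

```python
import numpy as np

tensor_demo = np.arange(75).reshape(3, 5, 5)

# 75 elements / 5 rows = 15, so NumPy fills in the -1 with 15
reshaped = tensor_demo.reshape(5, -1)
print("reshape(5, -1):", reshaped.shape)

# 75 is not divisible by 4, so no size for the -1 dimension can work
try:
    tensor_demo.reshape(4, -1)
except ValueError as error:
    print("Cannot reshape 75 elements into 4 rows:", error)

# Reshaping never changes the total number of elements
print("Size is preserved:", reshaped.size == tensor_demo.size)
```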

<h3>Plotting with matplotlib</h3>
We've already seen the matplotlib library and used it to plot lists of data. The matplotlib library works very closely with numpy and can plot numpy arrays directly

In [None]:
import matplotlib.pyplot as plt

In [None]:
#Create a sine wave using numpy functions and matplotlib
#create an array of 1000 points from 0 - 100
x = np.linspace(0, 100, 1000)
#calculate the sine of the points
y = np.sin(x)
#create a line plot
plt.plot(x, y)

In [None]:
#Let's create an image as a 2D array
img = np.zeros((60, 60))
#Set some points to 1
img[10:20, (15, 45)] = 1
img[35:45, (15, 45)] = 1
img[45, 15:46] = 1

#imshow can display numpy arrays as colour (3D array - HxWxC) or grayscale images (2D array - HxW)!
plt.imshow(img)

<h2> Broadcasting </h2>
<b>Important to know!</b> <br>
Broadcasting is a powerful tool that lets us perform element-wise operations between Tensors of different (but compatible) shapes. <br>

![alt text](https://www.researchgate.net/profile/Thanh_Dang_Diep/publication/326377197/figure/fig1/AS:647925902368772@1531488981904/Broadcasting-in-NumPy.png)

Let's see what we mean by this

In [None]:
#Let's create 2 differently shaped 2D Tensors (Matrices)
Tensor1 = np.random.randint(0, 10, (1, 4))
Tensor2 = np.random.randint(0, 10, (2, 1))

print("Tensor 1:\n", Tensor1)
print("With shape:\n", Tensor1.shape)

print("\nTensor 2:\n", Tensor2)
print("With shape:\n", Tensor2.shape)

We know from our high school days that we cannot perform a normal matrix addition on two matrices of different shapes, so when we try, Numpy should give us an error, right?

In [None]:
Tensor3 = np.add(Tensor1, Tensor2)

print("The resulting Tensor:\n", Tensor3)
print("The resulting shape is:\n", Tensor3.shape)

WHAT!?! A 1x4 Matrix added to a 2x1 Matrix, resulting in a 2x4 Matrix? What did Numpy do here?<br>
Well, as suggested, Numpy is NOT performing a normal Matrix addition. Instead Numpy is performing a broadcast operation, THEN an element-wise addition. <br>
So then, what is Broadcasting? <br>
Let's look again at the shape of those two 2D Tensors and the resulting Tensor

In [None]:
print("Tensor 1 shape:\n", Tensor1.shape)
print("Tensor 2 shape:\n", Tensor2.shape)
print("Resulting Tensor shape:\n", Tensor3.shape)

We can see that each dimension of the resulting shape is the larger of the corresponding dimensions of the two Tensors being added<br>
1x<b>4</b> + <b>2</b>x1 = <b>2x4</b> <br>
During the "Broadcast" operation Numpy "repeats" (Broadcasts) dimensions of the two Tensors so that they are the same shape, and then performs the addition. <br>
Let's do this manually:

In [None]:
#Repeat entries of dim0 2 times
Tensor1_repeated = Tensor1.repeat(2, 0)
#Repeat entries of dim1 4 times
Tensor2_repeated = Tensor2.repeat(4, 1)

print("Tensor 1:\n", Tensor1_repeated)
print("Tensor 1 shape:\n", Tensor1_repeated.shape)

print("\nTensor 2:\n", Tensor2_repeated)
print("Tensor 2 shape:\n", Tensor2_repeated.shape)

We've repeated each dimension of size 1 to match the corresponding larger dimension.<br>
NOTE: You can only broadcast two Tensors if each pair of corresponding dimensions is either equal or one of them has size 1!
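NumPy's broadcasting rule compares the two shapes from the last dimension backwards: each pair of sizes must be equal, or one of them must be 1. The sketch below checks a few shape pairs with `np.broadcast_shapes`, which assumes NumPy 1.20 or newer.

```python
import numpy as np

# Sizes of 1 are stretched to match the other Tensor
shape_a = np.broadcast_shapes((1, 4), (2, 1))
print(shape_a)

# Equal sizes are left alone
shape_b = np.broadcast_shapes((100, 10), (1, 10))
print(shape_b)

# Missing leading dimensions are treated as size 1
shape_c = np.broadcast_shapes((100, 10), (10,))
print(shape_c)

# 4 is a multiple of 2, but neither size is 1, so these shapes cannot be broadcast
try:
    np.broadcast_shapes((2, 4), (2, 2))
except ValueError as error:
    print("Not compatible:", error)
```

The last case shows that one size being a multiple of the other is not enough.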

Now let's add the two resulting Tensors

In [None]:
Tensor3_repeated = np.add(Tensor1_repeated, Tensor2_repeated)
print("Tensor 3:\n", Tensor3_repeated)

We can see that the resulting Tensor is the same as the result from the initial addition!<br>
NOTE: Numpy does not actually repeat the stored memory when performing the operation
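We can get a feel for this with `np.broadcast_to`, which returns a read-only broadcast view of an array. This is only a sketch to illustrate the idea (the names `row`, `row_view` and `row_copy` are made up here); a stride of 0 along a dimension means every step along it reads the same memory.

```python
import numpy as np

row = np.arange(4).reshape(1, 4)

# broadcast_to returns a read-only view with the broadcast shape
row_view = np.broadcast_to(row, (2, 4))
print("Broadcast view:\n", row_view)

# A stride of 0 along dim0 means both rows read the same memory
print("Strides:", row_view.strides)
print("Shares memory with the original:", np.shares_memory(row, row_view))

# repeat, by contrast, allocates a new array holding real copies
row_copy = row.repeat(2, 0)
print("repeat shares memory:", np.shares_memory(row, row_copy))
```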

<b>Broadcasting use case</b><br>
Let's see WHY we would use broadcasting with an example.<br>
Let’s create a "dataset" of 100 datapoints, where each datapoint is a vector of 10 values (let's imagine each value represents some measured "quantity")

In [None]:
#Create a random dataset
data = np.random.randint(0, 100, (100, 10))
print("Data shape:", data.shape)

Let's use broadcasting to normalise every datapoint (i) independently for every quantity (q), i.e. we need to find the mean and standard deviation of every "quantity" (q) over the dataset and then perform a normalisation<br>
\begin{equation*}
Xnorm_i^q = \frac{(x_i^q -\mu^q)}{\sigma^q}
\end{equation*}

In [None]:
#Find the mean across the 0th dimension
#Aka find the mean of each quantity over the 100 datapoints
#This should give us 10 means
mean_vector = data.mean(0).reshape(1, 10)
#Find the standard deviation across the 0th dimension
std_vector = data.std(0).reshape(1, 10)

print("Quantity mean:\n", mean_vector)
print("Quantity std:\n", std_vector.round(2))

print("mean shape:\n", mean_vector.shape)
print("std shape:\n", std_vector.shape)

Let's perform the normalisation on the whole dataset

In [None]:
data_norm = (data-mean_vector)/std_vector
print("Normalised dataset  shape:\n", data_norm.shape)

By using our knowledge of broadcasting we don't have to use loops and indexing to perform operations!<br>
This is incredibly useful, as matrix (Tensor) operations are MUCH faster than iterative Python loops, and MUCH MUCH faster when we start to use GPUs, where operations are performed in parallel
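To see exactly what the broadcast expression replaces, the sketch below writes the same normalisation with explicit loops and checks that both versions agree. It uses a seeded `np.random.default_rng` (an assumption made here for reproducibility, not what the cells above use), and the `_demo` names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
data_demo = rng.integers(0, 100, (100, 10))

mean_demo = data_demo.mean(0).reshape(1, 10)
std_demo = data_demo.std(0).reshape(1, 10)

# Broadcasting: the 1x10 statistics are stretched across all 100 rows
norm_broadcast = (data_demo - mean_demo) / std_demo

# The same normalisation with explicit loops over datapoints (i) and quantities (q)
norm_loop = np.zeros(data_demo.shape)
for i in range(data_demo.shape[0]):
    for q in range(data_demo.shape[1]):
        norm_loop[i, q] = (data_demo[i, q] - mean_demo[0, q]) / std_demo[0, q]

print("Loop and broadcast agree:", np.allclose(norm_loop, norm_broadcast))
# After normalisation every quantity has mean ~0 and standard deviation ~1
print("Means close to 0:", np.allclose(norm_broadcast.mean(0), 0))
print("Stds close to 1:", np.allclose(norm_broadcast.std(0), 1))
```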