# LAB1

## Numpy ##

Documentation: https://numpy.org/doc/stable/

NumPy is a well-maintained library used to perform (complex) mathematical computations on data structures such as vectors and matrices.

Radu-Daniel Voit, Last update 31/01/2022

Copyright University of Southampton, 2022. Permission is granted for copies to be made for personal use by University of Southampton students. This content should not be shared or published outside the University of Southampton.

### Initialisation  ###

Numpy arrays are initialised using the np.array() command. 
Run the following code:

In [None]:
import numpy as np

#Here is an example of vector initialisation
x = np.array([1,2,3])


Now, check x's datatype using the *type()* function

In [None]:
# Write code here:



The correct output: *numpy.ndarray*


Now, let's initialise a **matrix** *m*:

In [None]:
# Matrix initialisation
m = np.array([[1, 5],
              [2, 3],
              [4, 3]])


**Note:** When working with different mathematical structures in Python, it is always a good idea to keep track of each variable's dimensions.

Write the code that outputs m's shape and run the code block:

In [None]:
print("x's size is:", x.shape)

#print("m's dimension is:", - your code here -)


**Note:** x is a one-dimensional array, so its shape is printed as (3,): there is no second dimension. To avoid bugs, it is preferable to represent vectors as arrays of shape (n,1).
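
To see why the shape matters, the sketch below compares a 1-D array with an explicitly two-dimensional column vector. The names `flat` and `column` are illustrative only and are not part of the lab code.

```python
import numpy as np

# Illustrative names; not part of the lab code.
flat = np.array([1, 2, 3, 4])              # shape (4,)
column = np.array([[1], [2], [3], [4]])    # shape (4, 1)

print(flat.shape, column.shape)

# Transposing a 1-D array does nothing, because there is only one axis.
print(flat.T.shape)      # (4,)
print(column.T.shape)    # (1, 4)

# Adding an array to its transpose behaves very differently in the two cases.
same_shape = flat + flat.T      # elementwise sum, shape (4,)
grid = column + column.T        # broadcast to shape (4, 4)
print(same_shape.shape, grid.shape)
```

With the 1-D array, a transpose that was meant to change the orientation is silently ignored, which is the kind of bug the (n,1) convention helps to expose.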

To convert a 1-D array into this form, you can use the *np.reshape()* function (or the equivalent *reshape()* array method).
Search the NumPy documentation and write the code to reshape *x* into a **(3,1)** sized array. Print x and its resulting shape.


In [None]:
# Complete, uncomment and run the code

#x = 

# print()
# print()


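You can also initialise NumPy arrays filled with 0s or with random values by passing the dimensions of the array to the *np.zeros()* and *np.random.rand()* functions, respectively. Note that *np.zeros()* takes the shape as a single tuple, whereas *np.random.rand()* takes each dimension as a separate argument.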

Initialise a one dimensional array with 0 values, and a 3 x 3 matrix of random values:

In [None]:
# Write code here:


### Operators ###

In contrast to Python lists, NumPy arrays are "specialised data structures". Also, most of the operations in NumPy are implemented in C. These combined properties allow for much faster computations, especially on vectors and matrices.
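
As a small illustration of how the two structures differ in behaviour (not only in speed), the sketch below applies the same operators to a list and to an array. The variable names are chosen for this example only.

```python
import numpy as np

py_list = [1, 2, 3]
np_array = np.array([1, 2, 3])

# '+' concatenates lists but adds arrays element by element.
print(py_list + py_list)     # [1, 2, 3, 1, 2, 3]
print(np_array + np_array)   # [2 4 6]

# '*' with an integer repeats a list but scales an array.
print(py_list * 2)           # [1, 2, 3, 1, 2, 3]
print(np_array * 2)          # [2 4 6]

# A list can mix types; an array stores a single dtype for all its elements.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)           # float64
```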

NumPy offers a wide variety of operators, ranging from elementwise to logical and statistical. I invite you to check the official documentation for more detail on each operator, as this tutorial will only touch on the basics.
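
A minimal sketch of one operator from each of these three families is given below. The array `v` and the choice of functions are only examples; many more are listed in the documentation.

```python
import numpy as np

v = np.array([1.0, 4.0, 9.0, 16.0])

# Elementwise: applied independently to each entry.
roots = np.sqrt(v)                           # [1. 2. 3. 4.]

# Logical: comparisons return boolean arrays, which can be combined.
in_range = np.logical_and(v > 2, v < 10)     # [False  True  True False]

# Statistical: reduce the whole array to summary values.
print(roots, in_range)
print(v.mean(), v.max(), v.std())
```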

Run the code below to see the time difference when computing the dot (scalar) product of two equally sized vectors using NumPy's *np.dot()* function and a classic Python for-loop.

In [None]:
import time

#initialise 2 vectors
a = np.random.rand(1000000)
b = np.random.rand(1000000)

tic = time.time() #register time
p = np.dot(a,b)
toc= time.time()

print("Numpy dot product took:", str(1000*(toc-tic)), "ms")

#for loop implementation
c = 0

tic = time.time()
for i in range(1000000):
    c += a[i] * b[i]
toc = time.time()

print("For loop implementation took:", str(1000*(toc-tic)), "ms")

Now, let's test more elementwise operations. Using the documentation, write the code to initialise a vector x with the values (1,2,3,4,5) and perform scalar addition and multiplication (that is, add a scalar to x and multiply x by a scalar).

In [None]:
# x = -your code here-
# print("Addition result:", -your code here-)
# print("Multiplication result:", -your code here-)

Initialise a vector y with the values (3,3,4,5,5). Uncomment the rest of the code and run it.

In [None]:
#y= -your code here-

#addition = x + y
#multiplication = x * y
#print("Addition result:", addition)
#print("Multipl result", multiplication)


**Note** Array multiplication (x * y) is performed element by element; it is not the same as the matrix product.
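
The difference can be seen on two small matrices. This is a sketch with made-up values; `p` and `q` are not used elsewhere in the lab.

```python
import numpy as np

p = np.array([[1, 2],
              [3, 4]])
q = np.array([[0, 1],
              [1, 0]])

elementwise = p * q           # multiplies matching entries
matrix_prod = np.dot(p, q)    # rows of p times columns of q (same as p @ q)

print(elementwise)    # [[0 2]
                      #  [3 0]]
print(matrix_prod)    # [[2 1]
                      #  [4 3]]
```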

Now, run the code below to initialise two matrices.

In [None]:
# Initialise 2 3x3 matrices

a = np.array([[0, 3, 4],
              [1, 6, 4],
              [2, 3, 3]])

b = np.ones((3,3))

print("a =", a)
print()
print("b =", b)

Try some operations. What do you observe?

In [None]:
# Write your code here:


One of the most useful operations for array manipulation is *np.transpose*. In practice, for a NumPy array *a*, its transpose is computed using *a.T*. Now, let's take as an example one of the most popular equations in machine learning:

**Y = wX + b** 

In the following code sample, *X* is a (3 x 10) matrix representing the input data and
*w* is the weight vector of size 3.

In order to find Y, we have to calculate wX first. Read the following code, and run it.
Make the necessary changes in order to avoid errors.

In [None]:
# Run and modify this code

X = np.random.rand(3,10)
w = np.random.rand(3,1)

product = np.dot(w,X)
print("wX = ", product)
print()
print("The shape of wX is:", product.shape)

**Note** Given an array X, functions such as *np.exp(X)* apply the function to every element of X: np.exp(X) = (e^(x_1), e^(x_2), ..., e^(x_n))

In [None]:
# Run this code
np.exp(x)

In the following example, you will need to complete the code to implement a method for normalising a matrix row-wise. You will use *np.linalg.norm(x, axis=1, keepdims=True)* to compute the norm of each row of a matrix x (x_norm). Then, using the formula X_normalised = x / x_norm, we will normalise the matrix.

*axis=1* indicates that the norm will be computed row-wise. When axis=0 is set, the function will be applied column-wise.

*keepdims=True* ensures that the result has shape (n,1), so that it is shaped correctly for the division against x. Reshaping and broadcasting are discussed later in this tutorial.
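
The effect of *axis* and *keepdims* is not specific to the norm. The sketch below uses *np.sum* on a small made-up array `t` to show how the two arguments change the shape of the result.

```python
import numpy as np

t = np.array([[1, 2, 3],
              [4, 5, 6]])        # shape (2, 3)

col_sums = np.sum(t, axis=0)                       # one value per column: [5 7 9]
row_sums = np.sum(t, axis=1)                       # one value per row: [ 6 15]
row_sums_kept = np.sum(t, axis=1, keepdims=True)   # same values, shape (2, 1)

print(col_sums.shape, row_sums.shape, row_sums_kept.shape)

# With keepdims=True the (2, 1) result lines up with the rows of t,
# so each row can be divided by its own sum.
fractions = t / row_sums_kept
print(fractions.sum(axis=1))     # each row now sums to 1 (up to rounding)
```

Without *keepdims=True*, the row sums have shape (2,), which cannot be divided into an array of shape (2, 3) row by row.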

In [None]:
# Function for normalising rows in a matrix

def normalise(x):

    """
    Arg: x - a numpy matrix of shape(n,m)
    Output: norm - matrix x normalised by row
    
    """
#    x_norm =  -your code here- 
#    norm = -your code here-

    return norm

### Broadcasting and Reshaping ###
**Documentation broadcasting:** https://numpy.org/doc/stable/user/basics.broadcasting.html

**Documentation reshape:** https://numpy.org/doc/stable/reference/generated/numpy.reshape.html


In NumPy, **broadcasting** refers to the way an array of smaller size is treated as if it were stretched, to make it compatible for operations with arrays of higher dimensions. The simplest usage of broadcasting can be seen when performing operations between an array (*a*) and a scalar (*s*). As mentioned in the documentation, *s* is broadcast into an array of the same size as *a*:

*s := (s_1, s_2,..., s_n), where s_1 = s_2 = ... = s_n = s and n = len(a)*
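
The scalar case can be checked directly. In this sketch, *np.full* builds the stretched array explicitly only for illustration; NumPy itself does not need to create such a copy. The names `a_demo` and `s_stretched` are made up for the example.

```python
import numpy as np

a_demo = np.array([10, 20, 30])
s = 5

# Conceptually, s is stretched to the shape of a_demo before the addition.
s_stretched = np.full(a_demo.shape, s)   # [5 5 5]

print(a_demo + s)
print(np.array_equal(a_demo + s, a_demo + s_stretched))   # True
```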

**Note:** It is important to be mindful of when broadcasting is used: in some cases it leads to inefficient use of memory, which slows down the computation.

When an operation requires broadcasting, Numpy first checks the compatibility of the two arrays by comparing the shapes, starting with the rightmost dimensions. Two dimensions are considered compatible if they are equal or one of them is 1.
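
The compatibility rule can be written out as a few lines of plain Python. The function `broadcast_shape` below is a hypothetical helper written for this tutorial, not a NumPy function; it is a sketch of the rule as described above, cross-checked against what NumPy actually produces.

```python
import numpy as np
from itertools import zip_longest

def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: shape obtained by broadcasting two shapes together."""
    result = []
    # Walk both shapes from the rightmost dimension; a missing dimension counts as 1.
    for dim_a, dim_b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            raise ValueError(f"incompatible dimensions {dim_a} and {dim_b}")
    return tuple(reversed(result))

print(broadcast_shape((3, 3), (1, 3)))
print(broadcast_shape((6, 1, 5, 1, 3), (4, 1, 2, 3)))

# Cross-check against NumPy itself.
numpy_shape = (np.ones((6, 1, 5, 1, 3)) + np.ones((4, 1, 2, 3))).shape
print(numpy_shape == broadcast_shape((6, 1, 5, 1, 3), (4, 1, 2, 3)))
```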

Run the following code to see broadcasting at work.

In [None]:
# Run the following code:

a = np.array([[0, 3, 4],
              [1, 6, 4],
              [2, 3, 3]])

x = np.ones((1,3)) 

print("a_shape:", a.shape)
print("x_shape:", x.shape)
print()

ad = a + x
print("a + x = ")
print(ad)
print()
print("ad_shape:", ad.shape)

The shapes are compatible, so the operation can be performed. The same rule works for arrays with any number of dimensions:

In [None]:
# Run the following code:

a = np.ones((6,1,5,1,3))
x = np.ones((4,1,2,3))

print("a_shape:", a.shape)
print("x_shape:", "  ",x.shape)
print()
ad = a + x
print("ad_shape:", ad.shape)

When the arrays are not compatible, the following ValueError is raised:

In [None]:
# Run the following code

a = np.ones((3,3,3))
x = np.ones((3,2,3))

print("a_shape:", a.shape)
print("x_shape:", x.shape)
print()
ad = a + x
print("ad_shape:", ad.shape)

In addition to this, you might choose to perform reshaping yourself, depending on the task at hand. One of the most frequently encountered practical examples is the reshaping of images: an image represented as a 3-dimensional array of shape *(length, height, 3)* is reshaped (or "unrolled") into a column vector of shape *(length × height × 3, 1)* in order to be processed.
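
Before working with images, it may help to see what reshaping does to a small array. The sketch below uses a made-up 2 x 3 array called `small`; it shows that the elements keep their order and that the total number of elements cannot change.

```python
import numpy as np

small = np.arange(6).reshape(2, 3)   # [[0 1 2]
                                     #  [3 4 5]]
print(small)

# Reshaping keeps the elements in the same (row by row) order.
as_column = small.reshape(6, 1)
print(as_column.shape)               # (6, 1)
print(as_column.ravel())             # [0 1 2 3 4 5]

# The total number of elements must stay the same.
try:
    small.reshape(4, 2)
except ValueError as err:
    print("reshape failed:", err)
```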

Please complete the following code in order to implement a method for reshaping pictures. Use the documentation.

In [None]:
# Complete the method and run the code:

def image_to_vector(image):
    
    """
    Arg: image - a 3D array of shape (l, h, 3)
    Output: vector - a column vector (2D array) of shape (l*h*3, 1)

    """
    # Hint: x.shape[i] returns the size of the i-th dimension of array x
    
    #vector = 
    return vector  

Now, let's test it:

In [None]:
# Run the following code

test = np.random.rand(3,2,3)
print("test image:")
print(test)
print()

test_vectorised = image_to_vector(test)
print("The resulting vector is:")
print(test_vectorised)
print()
print("The shape of the vector is:", test_vectorised.shape)

## HOMEWORK ##

### Exercise 1

Now, I invite you to test your numpy knowledge on a practical NLP exercise. Check the following sentences: 

*Margaret has a dog.*

*Jane has a dog and a cat.*

*The brown dog chases Jane's cat.*

When looking at the vocabulary used in these sentences, you can see that 10 different words are used. Let the variable *sentences* be a list of sentences and *vocabulary* = [Margaret, has, a, dog, Jane, and, cat, the, brown, chases] be the list of words used.

Run the following code to declare them:

In [None]:
# Run this code:

sentences = ['Margaret has a dog',
             'Jane has a dog and a cat',
             'the brown dog chases Jane\'s cat']

vocabulary = ["Margaret", "has", "a" , "dog", "Jane", "and", "cat", "the", "brown", "chases"]

Please write the code to check the type of *sentences* and *vocabulary* variables

In [None]:
# Write your code

Now, to simplify the operations, let's transform each sentence into a list of words and store them in the same list:

In [None]:
# Run this code:

for i in range(0,len(sentences)):
    sentence = sentences[i]
    sentences[i] = sentence.split()
    
print(sentences)

**Bonus information** In NLP, extracting the words from a document and counting how often each one occurs is called a Bag of Words. You will learn more about this later in the course. Let's construct the bag-of-words for the given input sentences using a Python dictionary, a data structure which stores data as (key, value) pairs. The values in the dictionary are accessed using the keys.
Here is how it works:

In [None]:
# Run this code:

bow = {} # This is how an empty dictionary is initialised

for word in vocabulary:
    appearances = 0

    for sentence in sentences:
        appearances += sentence.count(word)

    # make a new entry in the dictionary using the word as a key
    # and store the number of appearances
    bow[word] = appearances
        
print("bow =",bow)
print("bow type is:", type(bow))

Now, in order to transform a sentence into a computable form, we can iterate through the vocabulary and check which words appear in the sentence. The sentence will be represented as a vector whose size equals that of the vocabulary, filled with 0s and 1s marking which words appear in the sentence. This method is called One-Hot Encoding, and you will learn about it in more detail later in the course.

Complete the code below in order to obtain the One-Hot encoding of the sentences.

In [None]:
# Complete and Run the code:

def One_Hot_Encoder(data, vocabulary):

    """
    Arg: data - a list of n sentences, where each sentence is a list of words
         vocabulary - a list of m words used in the data
         
    Output: a numpy matrix (n x m) containing the one-hot encoded data, based on the vocabulary
    """
    
    n = len(data)
    m = len(vocabulary) #your code here
    
    # Initialise a numpy matrix of shape n x m.
    # Think about what kind of initialisation would benefit this problem
    
    # Your code here 
    
    for i in range(0,n):
        sentence = data[i]
        
        for j in range(0,m):
            
            if vocabulary[j] in sentence:
                
                # Change the value at position i,j in the matrix to 1. 
                # Use the documentation as needed
                
                #Your code Here   
                
    return encoded_sentences

**Note** There are popular machine learning libraries, such as scikit-learn, which have this algorithm already implemented. Once you have a grasp of basic Python programming skills, you will have a multitude of options available.

https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html

Now, please write the code to apply the newly created method to our data. Print out the resulting matrix and its shape.

In [None]:
# Your code here

The expected result should be a matrix filled with 0's and 1's, of shape (3,10)

**Bonus exercise** Write a decoder which takes an encoded matrix and returns the sentences

In [None]:
#Your code here

Now, in real-world problems, One-Hot Encoding textual data with a varied and complex vocabulary can lead to very high-dimensional and sparse matrices (the *curse of dimensionality*).

In practice, for a model to gain a better understanding of the data, it can use an embedding layer. You will gain a more in-depth understanding of word embeddings later in the course, but for now just know that it is a method for representing textual data using dense vectors. Depending on the problem, an embedding layer can be specifically trained on a particular corpus/vocabulary, or it can use the weights of a pre-trained model (e.g. GloVe).
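
To build some intuition for how a one-hot vector and a weight matrix interact, here is a toy sketch. The sizes (4 words, 2 dimensions), the random values and the names `W_toy`, `one_hot` and `multi_hot` are assumptions made for illustration; they are not the sizes used in the exercise.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (assumed for illustration): 4 words, 2 embedding dimensions.
# Each column of W_toy is the dense vector of one word.
W_toy = rng.random((2, 4))

# One-hot vector selecting the word at index 2.
one_hot = np.array([0, 0, 1, 0])

# Multiplying by a one-hot vector picks out the matching column.
looked_up = np.dot(W_toy, one_hot)
print(np.allclose(looked_up, W_toy[:, 2]))    # True

# With several 1s, the result is the sum of the selected columns.
multi_hot = np.array([1, 0, 1, 0])
summed = np.dot(W_toy, multi_hot)
print(np.allclose(summed, W_toy[:, 0] + W_toy[:, 2]))    # True
```

In other words, the matrix product acts as a lookup: the 1s in the encoded vector decide which word vectors contribute to the result.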

For the purposes of this exercise, let's pretend we are initialising an embedding layer with random weights. Write a method that calculates and outputs the matrix product of the one-hot-encoded data and the weight matrix. Because the weight matrix stores one column per word, the encoded data must be multiplied by its transpose (OH_encoded x W^T).

In [None]:
# Write your code here:

# Initialise a random weight matrix in which each column represents a word and each word has 32 dimensions

#weight_matrix =

# Write the method definition:

Run the method you have written, applying it to your One-Hot Encoded matrix and weight matrix.

In [None]:
# Your code here

The result should be a matrix of shape (3, 32)

### Exercise 2

A simple exercise to test your understanding of how functions are applied to entire arrays in NumPy. You will have fewer hints than in the last one, but ask any demonstrator for help if needed.

You will have to create a method that implements a sigmoid function. 
**Hint:** *sigmoid(x) = 1 / (1 + e^(-x))*

The method should be able to take a scalar or a numpy array of any size *x* and return *sigmoid(x)*. To make things interesting, the sigmoid function implementation should take only one line of code

In [None]:
# Your code here

# Solutions:

1. type(x) 
2. print("m's dimension is:", m.shape)
3. x = x.reshape(3,1) 
4. z = np.zeros((5,1)); 
   r = np.random.rand(3,3)
5. x = np.array([1,2,3,4,5]); x + scalar; x * scalar

6. y = np.array([3,3,4,5,5])
7. product = np.dot(w.T,X)
8. x_norm = np.linalg.norm(x,axis=1,keepdims =  True); norm = x/x_norm
9. vector = image.reshape(image.shape[0]*image.shape[1]*image.shape[2],1)


### Homework Solutions:

1. use type(sentences)
2. encoded_sentences = np.zeros((n,m)) 
3. encoded_sentences[i][j] = 1   
4. OneHot_encoded = One_Hot_Encoder(sentences, vocabulary)
5. weight_matrix = np.random.rand(32,10)
6. def Simple_Embedder(W, OH_encoded): 
       return np.dot(OH_encoded, W.T)
7. e = Simple_Embedder(weight_matrix, OneHot_encoded)
8. def sigmoid(x): return 1/(1+np.exp(-x))

