# Assignment 1.3: NumPy
**<div style="text-align: right"> [Total score: 10]</div>**

NumPy is a Python library for working with multidimensional arrays and matrices. It also provides a large collection of high-level mathematical functions that operate on these arrays. [NumPy's official documentation](https://numpy.org/devdocs/) is available if you need it.

This assignment is divided into two sections. In the first section, you will implement the Normal Equation for linear regression. In the second section, you will build a simple, single-layer neural network that classifies a data point into one of three classes.

## Section 1: The Normal Equation

### Exercise 1: Generating Random Matrices

**<div style="text-align: right"> [Score: 5]</div>**
**Task:** Generate a `4x4` random matrix with `np.random.rand` using a random seed of `42`. Store it in a variable named `data`.


In [1]:
import numpy as np
data = None
# YOUR CODE HERE
np.random.seed(42)
data = np.random.rand(4,4)
data

array([[0.37454012, 0.95071431, 0.73199394, 0.59865848],
       [0.15601864, 0.15599452, 0.05808361, 0.86617615],
       [0.60111501, 0.70807258, 0.02058449, 0.96990985],
       [0.83244264, 0.21233911, 0.18182497, 0.18340451]])

In [2]:
assert data is not None
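
To illustrate why the seed matters, here is a minimal sketch that reseeds the generator and draws the matrix a second time. The names `first` and `second` are illustrative only and are not part of the assignment.

```python
import numpy as np

# Seeding the global generator makes np.random.rand reproducible.
np.random.seed(42)
first = np.random.rand(4, 4)

# Reseeding with the same value restarts the same pseudo-random sequence.
np.random.seed(42)
second = np.random.rand(4, 4)

print(np.array_equal(first, second))  # True
print(first.shape)                    # (4, 4)
```

Without the second call to `np.random.seed`, the second draw would continue the sequence and produce different values.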

### Exercise 2: Indexing, Slicing and Reshaping

**<div style="text-align: right"> [Score: 5]</div>**

**Task**: Slice out `X` and `y` from `data` such that `X` is the first three columns and `y` is the last column. Reshape `y` to be `(4, 1)`.

In [19]:
X = None
y = None
X = data[:, :3]
y = data[:, 3:].reshape(4, 1)
y.shape

(4, 1)

In [4]:
assert X.shape == (4, 3)
assert y.shape == (4, 1)
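
The reshape matters because the way a column is selected determines how many axes the result has. The sketch below compares an integer index with a slice. The names `col_int`, `col_slice`, and `col_reshaped` are hypothetical and used only for illustration.

```python
import numpy as np

np.random.seed(42)
data = np.random.rand(4, 4)

col_int = data[:, 3]                      # integer index drops the axis -> shape (4,)
col_slice = data[:, 3:]                   # slice keeps the axis -> shape (4, 1)
col_reshaped = data[:, 3].reshape(-1, 1)  # -1 lets NumPy infer the row count

print(col_int.shape, col_slice.shape, col_reshaped.shape)  # (4,) (4, 1) (4, 1)
```

Either the slice or the explicit reshape gives a column vector that can be used in the matrix products of the next exercise.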

### Exercise 3: The Normal Equation
**<div style="text-align: right"> [Score: 10]</div>**

**Task**: Implement the normal equation in a function called `theta` that takes in the matrices `X` and `y` and returns the least squares solution to the system $X\theta = y$. For your reference, $\theta = (X^TX)^{-1}X^Ty$.

In [5]:
import numpy as np
def theta(X, y):
    return np.dot(np.linalg.inv(np.dot(X.T,X)), np.dot(X.T,y))



In [6]:
### INTENTIONALLY LEFT BLANK
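
One way to sanity-check a normal-equation implementation is to compare it with NumPy's built-in least squares solver. The following is a sketch under the assumption that `X` has full column rank. It redefines `theta` with the `@` operator so the block is self-contained, and the name `reference` is illustrative.

```python
import numpy as np

def theta(X, y):
    # Normal equation: (X^T X)^{-1} X^T y
    return np.linalg.inv(X.T @ X) @ X.T @ y

np.random.seed(42)
data = np.random.rand(4, 4)
X, y = data[:, :3], data[:, 3:]

# np.linalg.lstsq solves the same least squares problem
# without forming the inverse explicitly.
reference, *_ = np.linalg.lstsq(X, y, rcond=None)

print(theta(X, y).shape)                    # (3, 1)
print(np.allclose(theta(X, y), reference))
```

Explicit inversion is fine for a small exercise like this one, but a solver such as `np.linalg.lstsq` is generally the more numerically robust choice when $X^TX$ is close to singular.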

## Section 2: Computational Network

In this task, we'll build a series of functions in NumPy that will form a computational network.

### Exercise 4: Generating Random Data

**<div style="text-align: right"> [Score: 5]</div>**

**Task:** Generate two matrices `X` and `W` with `np.random.rand` (with random seed `42`):

- **`X`**: `6 x 2 x 2` random array
- **`W`**: `4 x 3` random array

Also, reshape `X` to be `6 x 4`.

In [20]:
X = None
W = None
np.random.seed(42)
X = np.random.rand(6, 2, 2)
W = np.random.rand(4, 3)
X = X.reshape(6, 4)
X

array([[0.37454012, 0.95071431, 0.73199394, 0.59865848],
       [0.15601864, 0.15599452, 0.05808361, 0.86617615],
       [0.60111501, 0.70807258, 0.02058449, 0.96990985],
       [0.83244264, 0.21233911, 0.18182497, 0.18340451],
       [0.30424224, 0.52475643, 0.43194502, 0.29122914],
       [0.61185289, 0.13949386, 0.29214465, 0.36636184]])

In [21]:
assert X.shape == (6, 4)
assert W.shape == (4, 3)
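
As a small illustration of what the reshape does, the sketch below checks that each `2 x 2` block of the original array becomes one row of length 4, read row by row. The names `X3d` and `X2d` are hypothetical.

```python
import numpy as np

np.random.seed(42)
X3d = np.random.rand(6, 2, 2)
X2d = X3d.reshape(6, 4)

# Each 2x2 block is flattened in row-major order into one row of length 4.
print(np.array_equal(X2d[0], X3d[0].ravel()))  # True
print(X2d.shape)                               # (6, 4)
```

Because the seed is the same as in Exercise 1, the first four rows of the reshaped `X` match the `data` matrix generated there.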


### Exercise 5: Linear Equations

**<div style="text-align: right"> [Score: 5]</div>**
**Task:** Create a function named `linear` that accepts `X`, `W`, and `b`. It should return the value $XW + b$.

In [22]:
def linear(X, W, b):
    return np.dot(X,W)+b
linear([1,1],[1,1],[5,6])

array([7, 8])

In [23]:
### INTENTIONALLY LEFT BLANK
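
The `+ b` step relies on NumPy broadcasting. The sketch below shows how a scalar bias and a per-column bias vector behave. The input arrays of ones and the names `out_scalar` and `out_vector` are made up for illustration.

```python
import numpy as np

def linear(X, W, b):
    return np.dot(X, W) + b

X = np.ones((6, 4))
W = np.ones((4, 3))

out_scalar = linear(X, W, 1)                           # scalar b is added to every entry
out_vector = linear(X, W, np.array([0.0, 1.0, 2.0]))   # shape (3,) b is added to every row

print(out_scalar.shape, out_vector.shape)  # (6, 3) (6, 3)
print(out_vector[0])                       # [4. 5. 6.]
```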

### Exercise 6: Softmax Function
**<div style="text-align: right"> [Score: 5]</div>**
**Task:** Implement the softmax function

$$
\text{Softmax}(x_{i}) = \frac{\exp(x_i)}{\sum_j \exp(x_j)}
$$

Your function should be named `softmax` and it should take in a NumPy array `x`. It should find the softmax of every single row in `x` if it has multiple rows, or the softmax of `x` if `x` is a vector.

In [24]:
def softmax(x):
    # axis=-1 handles both a single vector and a matrix of rows
    e = np.exp(x)
    return e / np.sum(e, axis=-1, keepdims=True)

In [25]:
### INTENTIONALLY LEFT BLANK
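
For large inputs, `np.exp` can overflow. A common remedy, sketched here as a hypothetical variant named `softmax_stable` (not required by the assignment), is to subtract the row maximum before exponentiating. This leaves the result mathematically unchanged.

```python
import numpy as np

def softmax_stable(x):
    # Hypothetical variant: shifting by the row maximum keeps np.exp
    # from overflowing and does not change the softmax value.
    x = np.asarray(x, dtype=float)
    shifted = x - np.max(x, axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / np.sum(e, axis=-1, keepdims=True)

probs = softmax_stable(np.array([[1.0, 2.0, 3.0],
                                 [1000.0, 1000.0, 1000.0]]))
print(probs.sum(axis=-1))  # each row sums to 1

vec = softmax_stable([0.0, 0.0])
print(vec)  # [0.5 0.5]
```

Using `axis=-1` with `keepdims=True` lets the same function handle both a single vector and a matrix of rows.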

### Exercise 7: Cross Entropy

**<div style="text-align: right"> [Score: 5]</div>**
**Task:** Implement the cross-entropy loss:

$$ CE(y, \hat y) = -\sum_{i}y_{i}\log \hat y_{i} $$

Build a function `cross_entropy` that takes in two matrices `y` and `yhat`, and returns the cross entropy between each row in `y` and the corresponding row in `yhat`. That is, the shape of the returned array should be `(y.shape[0],)`.


In [13]:
def cross_entropy(y, yhat):
    return -np.sum(y*np.log(yhat),axis=1)


In [14]:
### INTENTIONALLY LEFT BLANK
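
A short worked example may help here. When a row of `y` is one-hot, the sum collapses to the negative log of the probability that `yhat` assigns to the true class. The arrays below are made-up illustration values.

```python
import numpy as np

def cross_entropy(y, yhat):
    return -np.sum(y * np.log(yhat), axis=1)

# One-hot targets: class 0 for the first row, class 1 for the second.
y = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
yhat = np.array([[0.5, 0.25, 0.25],
                 [0.5, 0.25, 0.25]])

loss = cross_entropy(y, yhat)
print(loss.shape)                                        # (2,)
print(np.allclose(loss, [-np.log(0.5), -np.log(0.25)]))  # True
```

The second row has the larger loss because `yhat` gives its true class a lower probability.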

### Exercise 8: Argmax

**<div style="text-align: right"> [Score: 5]</div>**

The argmax of a vector $x$ is defined as the index of the maximum value in $x$. For example, the argmax of the vector $(0,-5,9)$ is 2, since the third element in this vector is larger than all other elements. Remember that indexing in Python starts at 0.

**Task:**
Define a function `argmax` that takes in a matrix `A`, and returns a vector that contains the argmax of every single row in the matrix `A`.

In [15]:
def argmax(A):
    return np.argmax(A,axis=1)


In [16]:
assert argmax(np.array([[0, -5, 9]]))[0] == 2
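
The following sketch applies the same function to a matrix with several rows, including rows with ties. The matrix `A` is made up for illustration.

```python
import numpy as np

def argmax(A):
    return np.argmax(A, axis=1)

A = np.array([[0, -5, 9],
              [7,  7, 1],
              [2,  3, 3]])

# On ties, np.argmax returns the index of the first maximum.
print(argmax(A))  # [2 0 1]
```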


### Exercise 9: Bringing It All Together!

**<div style="text-align: right"> [Score: 10]</div>**

Even though the previous exercises in this section probably seemed unrelated to each other, they were in fact building up to what you will create now. We will build a simple model that takes in some random data and tries to *classify* each data point into one of **three** classes. Remember the two arrays `X` and `W` from the first exercise in this section? You will be using them in this exercise.


**Task:** You will build a function `predict` that takes in `X` and `W` from before and `y`, which is given below, and then computes and returns the following **in order**:

1. Treating the rows in `X` as data points, you will compute the output of the linear expression $XW + b$, setting `b` = 1. Store this result in the variable `linear_result`.
2. Then, you will compute the softmax of `linear_result`. Store this result in the variable `softmax_result`.
3. You will compute the cross entropy of `softmax_result` and `y`. Store this result in the variable `ce`.
4. You will then compute the argmax of the rows of `softmax_result`. Store this result in the variable `argmax_result`. The values in `argmax_result` are then the predicted classes for the data points in `X`.

Remember that all four of the above steps should be done using the functions you have defined in the previous exercises. If you do not, you will **not** receive any points for this exercise.

In [33]:
def predict(X, W, y):
    # fill in the code using the functions you have defined above
    linear_result = linear(X,W,1) # set b = 1
    softmax_result = softmax(linear_result)
    # Argument order follows the task statement: softmax_result is passed as `y`, the labels as `yhat`
    ce = cross_entropy(softmax_result,y)
    argmax_result = argmax(softmax_result)
    return linear_result, softmax_result, ce, argmax_result


y = np.array([
    [0.8,0.1,0.1],
    [0.1,0.8,0.1],
    [0.3,0.5,0.2],
    [0.3,0.3,0.4],
    [0.8,0.1,0.1],
    [0.1,0.2,0.7]
])
linear_result, softmax_result, ce, argmax_result= predict(X, W, y)
print(linear_result, softmax_result, ce, argmax_result,sep='\n\n')

[[2.67248407 2.56020334 1.65051822]
 [2.00856359 2.0612276  1.74239177]
 [2.57110525 2.83153974 1.93832878]
 [1.7733409  1.98751355 1.33617193]
 [1.94737268 1.90463428 1.34865193]
 [1.87590672 1.96663882 1.44382095]]

[[0.44371871 0.39659281 0.15968849]
 [0.35456179 0.37373487 0.27170334]
 [0.35353068 0.45870383 0.18776549]
 [0.34665664 0.42945091 0.22389245]
 [0.39877557 0.38209161 0.21913282]
 [0.36441431 0.39902479 0.2365609 ]]

[1.37989798 1.52542528 1.04578749 1.13956296 1.4733546  1.56567593]

[0 1 1 1 0 1]


In [18]:
assert linear_result is not None
assert softmax_result is not None
assert argmax_result is not None
assert ce is not None
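
As an optional follow-up, the predicted classes can be compared with the most likely class in each row of `y`. This is only a sketch: `true_classes` and `accuracy` are names introduced here, and the arrays are copied from the cells above. Since `X` and `W` are random, the resulting number says nothing about model quality.

```python
import numpy as np

# Target distributions and predicted classes copied from the cells above.
y = np.array([
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.5, 0.2],
    [0.3, 0.3, 0.4],
    [0.8, 0.1, 0.1],
    [0.1, 0.2, 0.7],
])
argmax_result = np.array([0, 1, 1, 1, 0, 1])

# The most probable class in each target row serves as the "true" label.
true_classes = np.argmax(y, axis=1)
accuracy = np.mean(argmax_result == true_classes)

print(true_classes)  # [0 1 1 2 0 2]
print(accuracy)      # fraction of rows where the prediction matches
```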
