# Multivariate Linear Regression: Cost function
M2U2 - Exercise 1

## What are we going to do?
- Implement the cost function for multivariate linear regression

Remember to follow the instructions for the submission of assignments indicated in [Submission Instructions](https://github.com/Tokio-School/Machine-Learning-EN/blob/main/Submission_instructions.md).

In [1]:
import numpy as np

## Task 1: Implement the cost function for multivariate linear regression

In this task, you must implement the cost function for multivariate linear regression in Python using NumPy. The cost function must follow the function included in the slides and in the course manual.

To do this, first fill in the code in the following cell to implement the cost function.

The differences between a vectorised and a non-vectorised implementation are as follows:
- Vectorised uses linear algebra operations between vectors and matrices.
- Non-vectorised is implemented with Python `for` loops, iterating over sequences/lists one element at a time.
- Vectorised uses NumPy, its ndarray arrays and operations such as np.matmul().
- Non-vectorised is less efficient, as it does not use NumPy's numerical routines, which run as compiled C code.
- However, non-vectorised is much easier to understand at first, being pure Python, without relying on other functions and vector dimensions.
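As a small illustration of this difference (a sketch only, not the solution to the exercise), the dot product of two vectors can be computed both ways. The names `dot_loop`, `a` and `b` are made up for this example.

```python
import numpy as np

def dot_loop(a, b):
    """Non-vectorised dot product: one multiplication per loop iteration."""
    total = 0.0
    for a_i, b_i in zip(a, b):
        total += a_i * b_i
    return total

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

loop_result = dot_loop(a, b)           # pure Python loop
vectorised_result = np.matmul(a, b)    # single NumPy call on the whole arrays

print(loop_result, vectorised_result)
```

Both approaches give the same number; the vectorised call simply delegates the iteration to NumPy.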

Recall the equation:

$$Y = h_\Theta(X) = X \times \Theta^T$$

$$J_\theta = \frac{1}{2m} \sum_{i = 1}^{m} (h_\theta(x^i) - y^i)^2$$
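
As a quick worked example with made-up numbers (not taken from the course material), take $m = 2$ examples with 2 features each, where the rows of $X$ are $[1, 2]$ and $[1, 3]$, $Y = [5, 8]$ and $\Theta = [1, 2]$:

$$h_\Theta(X) = X \times \Theta^T = [1 \cdot 1 + 2 \cdot 2,\; 1 \cdot 1 + 3 \cdot 2] = [5, 7]$$

$$J_\theta = \frac{1}{2 \cdot 2} \left( (5 - 5)^2 + (7 - 8)^2 \right) = \frac{1}{4} = 0.25$$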

To implement it, follow these steps:
1. Take some time to review the equation and make sure you understand all the mathematical operations reflected in it
1. Go back to previous exercises or review the slides and write down on a sheet of paper (or auxiliary cell) the dimensions of each vector or matrix in the equation
1. Write down the linear algebra operations step by step on this paper or in an auxiliary cell
    1. Start by substituting $h_{\theta}$ in the 2nd equation for its value from the 1st equation
    1. First, find $h_{\theta}$, the predicted Y, for each row of X (multiplying it by $\Theta$)
    1. Second, subtract the actual value of Y for that example/row of X to find its residual
    1. Then square the result
    1. Then, sum all the squares of the residuals for all examples/rows of X
    1. Finally, divide them by 2 * m
1. Write down next to each step the dimensions that your result should have. Remember that the final result of the cost function is a scalar or number
1. Finally, think about how to iterate with for loops for each value of X, Θ, and Y, to implement the cost function:
    1. Implement the formula using only for loops and the sum() function from the standard Python library, without using Numpy methods or operators.
    1. Iterate over all the rows or examples of X (m rows)
    1. Within that loop, iterate over the features or values of X and $\Theta$ to calculate the predicted Y for that example
    1. Once all the residuals have been found, find the total cost
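
To make the inner loop more concrete, here is a minimal sketch for a single example only, using plain Python lists and made-up values. The names `x_row`, `y_value`, `y_pred` and `squared_residual` are hypothetical; the full cost function still needs the outer loop over all m rows and the final division.

```python
# Hypothetical single example: one row of X, its target and a candidate theta
x_row = [1.0, 2.0, 3.0]
theta = [0.5, 1.0, 2.0]
y_value = 9.0

# Inner loop over the features: accumulate the predicted Y for this row
y_pred = 0.0
for x_j, theta_j in zip(x_row, theta):
    y_pred += x_j * theta_j

# Squared residual for this row; the cost adds these up over all m rows
squared_residual = (y_pred - y_value) ** 2

print(y_pred, squared_residual)
```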

*Notes:*
- The steps mentioned above are only a guide, an assist. In each exercise, implement your code in your own way, with the approach you prefer, using the cell code scheme or not
- Don't worry too much for now about whether it is working correctly or not, as we will check it in the next task. If there are any errors, you can return to this cell to correct your code.

In [None]:
# TODO: Implement the non-vectorised cost function following the template below

def cost_function_non_vectorized(x, y, theta):
    """ Computes the cost function for the considered dataset and coefficients.
    
    Positional arguments:
    x -- NumPy 2D array with the values of the independent variables from the examples, of size m x n
    y -- NumPy 1D array with the dependent/target variable, of size m
    theta -- NumPy 1D array with the model coefficients, of size n (one per column of x)
    
    Return:
    j -- float with the cost for this theta array
    """
    m = [...]
    
    # Remember to check the dimensions of the matrix multiplication to do it correctly
    j = [...]
    
    return j

## Task 2: Check your implementation

To test your implementation, retrieve your code from the previous notebook about synthetic datasets for multivariate linear regression and use it to generate a dataset in the following cell:

In [None]:
# TODO: Generate a synthetic dataset, with error term, in the form of your choice, with NumPy or Scikit-learn

m = 0
n = 0
e = 0.

X = [...]

Theta_true = [...]

Y = [...]

# Check the values and dimensions (form or "shape") of the vectors
print('Actual theta to be estimated:')
print()

print('First 10 rows and 5 columns of X and Y:')
print()
print()

print('Dimensions of X and Y:')
print('shape', 'shape')
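
If you do not have the previous notebook at hand, one possible way to build such a dataset with NumPy is sketched below. The `_demo` names, the sizes, the value ranges and the multiplicative form of the error term are all assumptions made for illustration; add a column of 1s for the bias term if your previous notebook did so.

```python
import numpy as np

rng = np.random.default_rng(42)   # fixed seed so the run is repeatable

m_demo = 100    # number of examples (illustrative value)
n_demo = 3      # number of features/columns of X (illustrative value)
e_demo = 0.1    # relative size of the error term (illustrative value)

# Feature matrix with random values: shape (m, n)
X_demo = rng.uniform(-1, 1, size=(m_demo, n_demo))

# "True" coefficients used to generate the targets: shape (n,)
Theta_true_demo = rng.uniform(0, 10, size=n_demo)

# Noise-free targets, then a random relative perturbation of up to +/- e_demo
Y_clean_demo = np.matmul(X_demo, Theta_true_demo)
Y_demo = Y_clean_demo * (1 + rng.uniform(-e_demo, e_demo, size=m_demo))

print(X_demo.shape, Theta_true_demo.shape, Y_demo.shape)
```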

Now let's check your implementation of the cost function in the following cells.

Remember that the cost function represents the "error" of your model: the sum of the squares of its residuals, divided by 2m.

Therefore, the cost function has the following features:
- Its value has no absolute reference scale, so we cannot tell whether it is "too high or too low" on its own; we can only compare the costs of two different models (sets of $\Theta$) on the same dataset
- It has a value of 0 for the theoretically optimal $\Theta$ when the dataset has no error term
- Its values are never negative
- It has a higher value the further away the $\Theta$ used is from the optimal $\Theta$
- Its value increases with the square of the residuals of the model

Therefore, use the next cell to check the implementation of your function with different $\Theta$, correcting your function if necessary. Check that:
1. If $\Theta$ is equal to $\Theta_{true}$ (obtained when defining the dataset), the cost is 0 when the dataset has no error term (e = 0), and small but non-zero otherwise
1. If $\Theta$ is different from $\Theta_{true}$, the cost is non-zero and positive
1. The further away $\Theta$ is from the $\Theta_{true}$, the higher the cost (check this with 3 different $\Theta$ other than $\Theta_{true}$, in order from lowest to highest)

*Note:* For this, use the same cell, modifying its variables several times.
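
If you prefer to automate these checks, a helper along the following lines could be used. This is only a sketch: `check_cost_function`, its `scales` parameter and the choice of moving $\Theta$ along a fixed all-ones direction are assumptions, and it builds noise-free targets internally so that the cost at $\Theta_{true}$ is exactly 0.

```python
import numpy as np

def check_cost_function(cost_fn, x, theta_true, scales=(0.5, 1.0, 2.0)):
    """Run the three checks on cost_fn using noise-free targets.

    cost_fn must accept (x, y, theta) and return a float.
    Returns the list of costs: first for theta_true, then for each scale.
    """
    # Noise-free targets, so that the cost at theta_true is exactly 0
    y_clean = np.matmul(x, theta_true)

    # Fixed direction along which theta is moved away from theta_true
    direction = np.ones_like(theta_true)

    costs = [cost_fn(x, y_clean, theta_true)]
    for s in sorted(scales):
        costs.append(cost_fn(x, y_clean, theta_true + s * direction))

    assert np.isclose(costs[0], 0.0), 'Cost at theta_true should be 0'
    assert all(c > 0 for c in costs[1:]), 'Cost away from theta_true should be positive'
    assert all(a < b for a, b in zip(costs, costs[1:])), 'Cost should grow with distance'
    return costs
```

For example, `check_cost_function(cost_function_non_vectorized, X, Theta_true)` would raise an `AssertionError` naming the first check that fails.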

In [None]:
# TODO: Check the implementation of your cost function

theta = Theta_true    # Modify and test several values of theta

j = cost_function_non_vectorized(X, Y, theta)

print('Cost of the model:')
print(j)
print('Theta checked and actual theta:')
print(theta)
print(Theta_true)

## Task 3: Vectorise the cost function

Now we are going to implement a new cost function, but this time vectorised.

A vectorised function is implemented with linear algebra operations instead of, e.g., the for loops used in the first function. Its computation is therefore much faster and more efficient, especially when it is run on GPUs or specialised processors.

Implement the cost function again, but this time using exclusively linear algebra operations on NumPy vectors/arrays.

Tips:
- Check the dimensions of the result of each operation or intermediate step, one by one if necessary
- Try to implement the equation with as few operations as possible, without loops or iterations
- Use functions like [numpy.matmul()](https://numpy.org/doc/stable/reference/generated/numpy.matmul.html) or numpy.sum().
- Use ndarray.reshape() for *theta* if you have any problems with the matrix multiplication, to get a 2D (n+1, 1) ndarray column vector instead of a 1D (n+1,) array.
- Make sure you return a float *j* value, not a 2D ndarray with 1 element. Extract it with its indices if necessary.
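
The reshaping and float-extraction tips can be illustrated with a few toy arrays. This is a sketch with made-up values and hypothetical names (`x_demo`, `theta_demo`, `theta_col`, and so on); it does not compute the cost itself.

```python
import numpy as np

x_demo = np.arange(6.0).reshape(3, 2)   # illustrative matrix, shape (3, 2)
theta_demo = np.array([1.0, 2.0])       # 1D array, shape (2,)

# Reshape theta into an explicit 2D column vector: shape (2, 1)
theta_col = theta_demo.reshape(-1, 1)

# (3, 2) x (2, 1) -> (3, 1): a 2D column of predictions
pred_2d = np.matmul(x_demo, theta_col)

# (3, 2) x (2,) -> (3,): with a 1D theta the result is also 1D
pred_1d = np.matmul(x_demo, theta_demo)

# A (1, 1) array is not a float; extract its single element explicitly
one_by_one = np.matmul(theta_col.T, theta_col)   # shape (1, 1)
as_float = float(one_by_one[0, 0])

print(pred_2d.shape, pred_1d.shape, one_by_one.shape, as_float)
```

Note that subtracting a 1D array of shape (m,) from a 2D array of shape (m, 1) broadcasts to shape (m, m), so make sure the predictions and *y* have the same shape before subtracting.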

In [None]:
# TODO: Implement the vectorised cost function following the template below

def cost_function(x, y, theta):
    """ Computes the cost function for the considered dataset and coefficients.
    
    Positional arguments:
    x -- NumPy 2D array with the values of the independent variables from the examples, of size m x n
    y -- NumPy 1D array with the dependent/target variable, of size m
    theta -- NumPy 1D array with the model coefficients, of size n (one per column of x)
    
    Return:
    j -- float with the cost for this theta array
    """
    m = [...]
    
    # Remember to check the dimensions of the matrix multiplication to do it correctly
    j = [...]
    
    return j

Finally, go back to task 2 and repeat the same steps, this time calling *cost_function* to check your vectorised function.