<a href="https://colab.research.google.com/github/williambrunos/Supervised-Machine-Learning-Regression-and-Classification/blob/main/Week_2/multi_linear_regression.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Multiple Linear Regression

In this section, we'll look at how to implement linear regression with several features instead of just one.

## Multiple Features

With multiple features, we write them as $x_1, x_2, \dots, x_n$, where $n$ is the number of features. The notation used in this section is:

- $x_j = j^{th}$ feature;
- $n = $ number of features;
- $\vec x^{(i)} = $ features of the $i^{th}$ training example;
- $x_j^{(i)} = j^{th}$ feature of the $i^{th}$ training example;
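
To make the notation concrete, here is a minimal sketch that stores a small training set as a NumPy array, with one row per training example and one column per feature. The array `X_train` and its values are made up for illustration and are not part of the original notebook. Note that the math notation counts from 1, while NumPy indexes from 0.

```python
import numpy as np

# Hypothetical training set: 3 examples (rows), n = 4 features (columns).
X_train = np.array([
    [2104.0, 5.0, 1.0, 45.0],
    [1416.0, 3.0, 2.0, 40.0],
    [852.0, 2.0, 1.0, 35.0],
])

m, n = X_train.shape  # m = number of examples, n = number of features

# Math notation is 1-indexed, NumPy is 0-indexed, so subtract 1.
i, j = 2, 1
x_i = X_train[i - 1]          # vector x^(i): all features of the 2nd example
x_ij = X_train[i - 1, j - 1]  # scalar x_j^(i): 1st feature of the 2nd example

print(n)
print(x_i)
print(x_ij)
```

Here `x_i` plays the role of $\vec x^{(i)}$ and `x_ij` the role of $x_j^{(i)}$.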

Previously:

$$f_{w, b}(x) = wx + b$$

where $x$ was a single feature. With $n$ features, the function becomes:

$$f_{\vec w, b}(\vec x) = \sum_{i=1}^{n}(w_ix_i) + b$$

Each $x_i$ is the $i^{th}$ feature of the input and $w_i$ is the corresponding weight that the model learns for that feature. The parameter $b$ is still the bias of the model.

We can write the model more compactly using vector notation:

- $\vec w = [w_1 \; w_2 \; w_3 \; \dots \; w_n]$;
- $b$ is a real number;

Those above are the **parameters** of the model. In the same way, we can collect all the features of an example into a vector:

- $\vec x = [x_1 \; x_2 \; x_3 \; \dots \; x_n]$

The model function can now be rewritten as:

$$f_{\vec w, b}(\vec x) = \vec w \bullet \vec x + b = \sum_{i=1}^{n}(w_ix_i) + b$$

The $\bullet$ operator is the **dot product** of two vectors. Linear regression with multiple input features is called **multiple linear regression**.
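
As a quick check that the dot-product form and the summation form agree, the sketch below evaluates both for one example. The helper `predict` and the parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def predict(x, w, b):
    """Return f_{w,b}(x) = w . x + b for a single example x."""
    return np.dot(w, x) + b

# Hypothetical parameters and one example with n = 4 features.
w = np.array([0.5, 2.0, -1.0, 0.1])
b = 3.0
x = np.array([4.0, 1.0, 2.0, 10.0])

# Dot-product form.
f_dot = predict(x, w, b)

# Summation form: add up w_i * x_i term by term, then add the bias.
f_sum = sum(w_i * x_i for w_i, x_i in zip(w, x)) + b

print(f_dot, f_sum)
print(np.isclose(f_dot, f_sum))
```

Both expressions compute the same quantity; the dot product is simply a compact way of writing the sum.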

## Vectorization

Vectorization means expressing a computation as operations on whole arrays instead of explicit element-by-element loops. Vectorized code is shorter and usually runs much faster.

In [1]:
import numpy as np

In [2]:
w = np.array([1, 2.5, -3.3], dtype=float)
b = 4
x = np.array([10, 20, 30])

Calculating $f$ with vectorization:

In [4]:
f = np.dot(w, x) + b
print(f)

-35.0


Alternatively, we can compute the same value with an explicit loop, which works but is less efficient than using NumPy:

In [6]:
n = len(x)
f = 0
for j in range(n):
    f += w[j] * x[j]  # accumulate w_j * x_j one term at a time
f += b  # add the bias once, after the sum
print(f)

-35.0


### Benefits of vectorization

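- Shorter, more readable code;
- Operations run in NumPy's optimized, compiled routines instead of the Python interpreter;
- NumPy can take advantage of parallel hardware, such as SIMD instructions on the CPU;
- Scales better to large datasets;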
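
To see the difference in practice, here is a rough timing sketch that compares `np.dot` with an explicit Python loop on randomly generated vectors. The vector length and random values are assumptions made for illustration, and the measured times depend on the machine, so no specific numbers are claimed here.

```python
import time
import numpy as np

rng = np.random.default_rng(0)
n = 500_000  # hypothetical number of features, large enough to time
w = rng.random(n)
x = rng.random(n)
b = 4.0

# Vectorized version: a single call into NumPy.
start = time.perf_counter()
f_vec = np.dot(w, x) + b
t_vec = time.perf_counter() - start

# Loop version: one multiply-add per Python iteration.
start = time.perf_counter()
f_loop = 0.0
for j in range(n):
    f_loop += w[j] * x[j]
f_loop += b
t_loop = time.perf_counter() - start

print(f"vectorized: {t_vec * 1e3:.2f} ms")
print(f"loop:       {t_loop * 1e3:.2f} ms")
print(np.isclose(f_vec, f_loop))  # both approaches agree up to rounding
```

The two results can differ in the last few digits because the additions happen in a different order, which is why the comparison uses `np.isclose` rather than `==`.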