# Multiple Linear Regression

This notebook serves as a template for building a multiple linear regression machine learning model using the scikit-learn library.

A multivariable linear relationship follows the general equation:

$y = b_0 + b_1x_1 + b_2x_2 + ... + b_nx_n$
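
As a quick illustration of the equation, the sketch below evaluates it for a single observation with three features. The coefficient and feature values are made up for the example and do not come from any dataset.

```python
import numpy as np

# Hypothetical coefficients for a model with three features
b0 = 1.0                         # intercept b_0
b = np.array([2.0, -1.0, 0.5])   # slopes b_1, b_2, b_3
x = np.array([3.0, 4.0, 2.0])    # one observation: x_1, x_2, x_3

# y = b_0 + b_1*x_1 + b_2*x_2 + b_3*x_3
y_hat = b0 + np.dot(b, x)
print(y_hat)
```

The dot product computes the sum of the `b_i * x_i` terms in one step, which is what the fitted model does internally for every row it predicts.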

## Importing the Libraries

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

## Importing the Dataset

In [None]:
# Imports `Data.csv` from the notebook's directory
# Slices columns 1 to (n-1) as `X`, and column n as `y`
# where n = number of columns
dataset = pd.read_csv('Data.csv')
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values
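
To see what the two `iloc` slices return, here is a minimal sketch on a small in-memory DataFrame. The column names and values are hypothetical stand-ins for `Data.csv`.

```python
import pandas as pd

# Hypothetical stand-in for Data.csv: two feature columns and one target column
toy = pd.DataFrame({
    "feature_a": [1.0, 2.0, 3.0],
    "feature_b": [10.0, 20.0, 30.0],
    "target": [0.5, 1.5, 2.5],
})

X_toy = toy.iloc[:, :-1].values   # every column except the last
y_toy = toy.iloc[:, -1].values    # only the last column

print(X_toy.shape, y_toy.shape)
```

`X_toy` is a 2-D array with one row per sample, while `y_toy` is a 1-D array. This slicing assumes the target is the last column of the file.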

## Encoding the Categorical Data

In [None]:
# One-hot encode the categorical column of X located at index `n`
# Replace `n` below with that column's integer index before running this cell
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
ct = ColumnTransformer(transformers=[('encoder', OneHotEncoder(), [n])], remainder='passthrough')
X = np.array(ct.fit_transform(X))
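
The sketch below applies the same transformer to a tiny made-up feature matrix, assuming the categorical column sits at index 2. The names `X_demo` and `ct_demo` and all of the values are hypothetical.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical feature matrix: two numeric columns and a categorical column at index 2
X_demo = np.array([
    [10.0, 1.0, "red"],
    [20.0, 2.0, "green"],
    [30.0, 3.0, "blue"],
    [40.0, 4.0, "red"],
], dtype=object)

ct_demo = ColumnTransformer(
    transformers=[("encoder", OneHotEncoder(), [2])],
    remainder="passthrough",   # keep the numeric columns unchanged
)
X_encoded = np.array(ct_demo.fit_transform(X_demo))

print(X_encoded.shape)   # 3 one-hot columns + 2 passthrough columns
print(X_encoded[0])      # encoded columns come first, then the remaining columns
```

The one-hot columns are placed before the passthrough columns, so column positions in the encoded matrix differ from those in the original one. Keep this in mind when building a row for prediction later.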

## Splitting the Dataset into Training and Test Sets

In [None]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
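
As a small check of what `test_size=0.2` does, the sketch below splits ten made-up samples and prints the resulting shapes. The arrays are placeholders, not the notebook's data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Ten hypothetical samples with two features each
X_demo = np.arange(20).reshape(10, 2)
y_demo = np.arange(10)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0
)

print(X_tr.shape, X_te.shape)   # 80% of the rows go to training, 20% to testing
```

Fixing `random_state` makes the shuffle reproducible, so the same rows land in the test set on every run.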

## Training the Regressor on the Training Set

In [None]:
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(X_train, y_train)
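
One way to convince yourself that the fit behaves as expected is to train on synthetic data whose coefficients are known in advance. The sketch below assumes a noiseless relationship `y = 3 + 2*x_1 - 1*x_2`; the data and the name `model` are illustrative only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data generated from known coefficients (no noise)
rng = np.random.default_rng(0)
X_syn = rng.normal(size=(50, 2))
y_syn = 3.0 + 2.0 * X_syn[:, 0] - 1.0 * X_syn[:, 1]

model = LinearRegression().fit(X_syn, y_syn)

print(model.intercept_)   # should be close to 3.0
print(model.coef_)        # should be close to [2.0, -1.0]
```

With real data the recovered coefficients will not match any "true" values exactly, since noise and correlated features both affect the estimate.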

## Predicting the Test Set Results

In [None]:
# Predict the target for every row in the test set
y_pred = regressor.predict(X_test)
np.set_printoptions(precision=2)  # print two decimal places for cleaner output
# Reshape both 1-D arrays into columns and print predictions (left) next to actual values (right)
print(np.concatenate((y_pred.reshape(len(y_pred), 1), y_test.reshape(len(y_test), 1)), axis=1))

## Predicting New Values 

In [None]:
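# Use the trained regressor to predict the target for a single new observation
# Fill the inner list with one value per column of the encoded X, in the same column order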
regressor.predict([[]])

In [None]:
# Displaying the coefficients of the multiple linear equation
regressor.coef_
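
The coefficients, together with the `intercept_` attribute, are all that `predict` needs. The sketch below fits a small model on made-up numbers and reproduces one prediction by hand; the data and the name `demo` are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical training data with two features
X_fit = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]])
y_fit = np.array([8.0, 7.0, 18.0, 17.0, 25.0])
demo = LinearRegression().fit(X_fit, y_fit)

# A new observation is passed as a 2-D array: one row, one value per feature
new_row = np.array([[6.0, 2.0]])
by_predict = demo.predict(new_row)[0]

# Same prediction computed manually: b_0 + b_1*x_1 + b_2*x_2
by_hand = demo.intercept_ + np.dot(demo.coef_, new_row[0])

print(by_predict, by_hand)
```

Both values agree up to floating-point error, which confirms that `coef_` holds $b_1 \dots b_n$ and `intercept_` holds $b_0$.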

## Evaluating the Model Performance

In [None]:
from sklearn.metrics import r2_score
r2_score(y_test, y_pred)
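
For reference, $R^2$ is defined as $1 - SS_{res} / SS_{tot}$. The sketch below computes it by hand on four made-up values and compares the result with `r2_score`; the arrays are illustrative only.

```python
import numpy as np
from sklearn.metrics import r2_score

# Hypothetical actual and predicted values
y_true_demo = np.array([3.0, 5.0, 7.0, 9.0])
y_pred_demo = np.array([2.5, 5.5, 7.0, 9.5])

ss_res = np.sum((y_true_demo - y_pred_demo) ** 2)           # residual sum of squares
ss_tot = np.sum((y_true_demo - y_true_demo.mean()) ** 2)    # total sum of squares
r2_manual = 1 - ss_res / ss_tot

print(r2_manual, r2_score(y_true_demo, y_pred_demo))
```

A score of 1 means the predictions match the actual values exactly, while a score of 0 means the model does no better than always predicting the mean of `y`.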