# Fitting a Multi-Regression Model  
The aim of this exercise is to practice fitting multiple linear regression models. We will observe how the test MSE changes as the set of predictors changes.  
![2.1.5.png](2.1.5.png)
### Instructions:  
Read the file Advertising.csv as a dataframe.  
We will fit one model for each combination of predictors. For example, if you have 2 predictors, A and B, you will end up with 3 models: one with only A, one with only B, and one with both A and B.  
Split the data into train and test sets.  
Fit a linear regression model on the train data.  
Compute the MSE of each model on the test data.  
Print each predictor combination together with its MSE value.  
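
The predictor combinations described above do not have to be typed out by hand. The following is a minimal sketch, assuming the three predictor names `TV`, `Radio`, and `Newspaper` from the dataset, that builds every non-empty subset with the standard library's `itertools.combinations`. The variable name `predictors` is only illustrative.

```python
from itertools import combinations

# Predictor names assumed from the Advertising dataset columns
predictors = ["TV", "Radio", "Newspaper"]

# Build every non-empty subset of predictors, smallest subsets first
cols = [
    list(combo)
    for size in range(1, len(predictors) + 1)
    for combo in combinations(predictors, size)
]

# n predictors give 2**n - 1 non-empty combinations
print(len(cols))
for combo in cols:
    print(combo)
```

With 3 predictors this yields 7 combinations, so 7 models are fitted. The count grows as 2**n - 1, which is why exhaustive enumeration is practical only for a small number of predictors.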

Hints:
pd.read_csv(filename)
Returns a pandas DataFrame containing the data read from the given file.


sklearn.model_selection.train_test_split()
Splits the data into random train and test subsets.


sklearn.linear_model.LinearRegression()
LinearRegression fits a linear model.


sklearn.linear_model.LinearRegression.fit()
Fits the linear model to the training data.


sklearn.linear_model.LinearRegression.predict()
Predicts the response using the fitted linear model.


sklearn.metrics.mean_squared_error()
Computes the mean squared error regression loss.
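
To make the last hint concrete, here is a small sketch of what the mean squared error measures: the average of the squared differences between observed and predicted values. The helper `mse` and the numbers below are made up for illustration; they are not part of the exercise or of scikit-learn.

```python
def mse(y_true, y_pred):
    """Mean of the squared differences between observed and predicted values."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(squared_errors) / len(squared_errors)


# Made-up values, for illustration only
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

example_mse = mse(y_true, y_pred)
print(example_mse)  # 0.375
```

Because the errors are squared, a single large miss contributes far more than several small ones, and the result is expressed in squared units of the response.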

In [2]:
# Import necessary libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import preprocessing
from prettytable import PrettyTable
from sklearn.metrics import mean_squared_error
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
%matplotlib inline


### Reading the dataset

In [3]:
# Read the file "Advertising.csv"
df = pd.read_csv("Advertising.csv")


In [4]:
# Take a quick look at the data to list all the predictors
df.head()


Unnamed: 0,TV,Radio,Newspaper,Sales
0,230.1,37.8,69.2,22.1
1,44.5,39.3,45.1,10.4
2,17.2,45.9,69.3,9.3
3,151.5,41.3,58.5,18.5
4,180.8,10.8,58.4,12.9


In [11]:
# Scratch cell: preview the columns selected by each predictor combination
cols = [["TV"],
        ["Radio"],
        ["Newspaper"],
        ["TV", "Radio"],
        ["TV", "Newspaper"],
        ["Radio", "Newspaper"],
        ["TV", "Radio", "Newspaper"]]
for i in cols:
    x = pd.DataFrame(df[i])
    print(x)

        TV
0    230.1
1     44.5
2     17.2
3    151.5
4    180.8
..     ...
195   38.2
196   94.2
197  177.0
198  283.6
199  232.1

[200 rows x 1 columns]
     Radio
0     37.8
1     39.3
2     45.9
3     41.3
4     10.8
..     ...
195    3.7
196    4.9
197    9.3
198   42.0
199    8.6

[200 rows x 1 columns]
     Newspaper
0         69.2
1         45.1
2         69.3
3         58.5
4         58.4
..         ...
195       13.8
196        8.1
197        6.4
198       66.2
199        8.7

[200 rows x 1 columns]
        TV  Radio
0    230.1   37.8
1     44.5   39.3
2     17.2   45.9
3    151.5   41.3
4    180.8   10.8
..     ...    ...
195   38.2    3.7
196   94.2    4.9
197  177.0    9.3
198  283.6   42.0
199  232.1    8.6

[200 rows x 2 columns]
        TV  Newspaper
0    230.1       69.2
1     44.5       45.1
2     17.2       69.3
3    151.5       58.5
4    180.8       58.4
..     ...        ...
195   38.2       13.8
196   94.2        8.1
197  177.0        6.4
198  283.6       66.2
199  232.1        8.7

[200 rows x 2 columns]
...

### Create different multi-predictor models

In [12]:
### edTest(test_mse) ###

# Initialize a list to store the MSE values
mse_list = []

# Create a list of lists of all unique predictor combinations
# For example, if you have 2 predictors, A and B, you would
# end up with [['A'],['B'],['A','B']]
cols = [["TV"],
        ["Radio"],
        ["Newspaper"],
        ["TV", "Radio"],
        ["TV", "Newspaper"],
        ["Radio", "Newspaper"],
        ["TV", "Radio", "Newspaper"]]

# Loop over all the predictor combinations 
for i in cols:

    # Set each of the predictors from the previous list as x
    x = pd.DataFrame(df[i])
    
    # Set the "Sales" column as the response variable
    y = df["Sales"]
   
    # Split the data into train-test sets with 80% training data and 20% testing data. 
    # Set random_state as 0
    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=0)

    # Initialize a Linear Regression model
    lreg = LinearRegression()

    # Fit the linear model on the train data
    lreg.fit(x_train, y_train)

    # Predict the response variable for the test set using the trained model
    y_pred = lreg.predict(x_test)
    
    # Compute the MSE for the test data
    MSE = mean_squared_error(y_test, y_pred)
    
    # Append the computed MSE to the initialized list
    mse_list.append(MSE)


### Display the MSE with predictor combinations

In [14]:
# Helper code to display the MSE for each predictor combination
t = PrettyTable(['Predictors', 'MSE'])

for i in range(len(mse_list)):
    t.add_row([cols[i],round(mse_list[i],3)])

print(t)


+------------------------------+--------+
|          Predictors          |  MSE   |
+------------------------------+--------+
|            ['TV']            | 10.186 |
|          ['Radio']           | 24.237 |
|        ['Newspaper']         | 32.137 |
|       ['TV', 'Radio']        | 4.391  |
|     ['TV', 'Newspaper']      | 8.688  |
|    ['Radio', 'Newspaper']    | 24.783 |
| ['TV', 'Radio', 'Newspaper'] | 4.402  |
+------------------------------+--------+
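
To read the table programmatically, the sketch below ranks the combinations by test MSE. The values are copied from the table above; the dictionary name `results` is only illustrative, and in the notebook the same ranking could be built from `cols` and `mse_list`.

```python
# Test MSE per predictor combination, copied from the table above
results = {
    ("TV",): 10.186,
    ("Radio",): 24.237,
    ("Newspaper",): 32.137,
    ("TV", "Radio"): 4.391,
    ("TV", "Newspaper"): 8.688,
    ("Radio", "Newspaper"): 24.783,
    ("TV", "Radio", "Newspaper"): 4.402,
}

# Sort the combinations from lowest to highest test MSE
ranking = sorted(results.items(), key=lambda item: item[1])
for combo, mse_value in ranking:
    print(f"{', '.join(combo):<25} {mse_value:>7.3f}")

# The combination with the lowest test MSE
best_combo, best_mse = ranking[0]
print("Lowest test MSE:", best_combo, best_mse)
```

On this particular split, TV and Radio together give the lowest test MSE, and adding Newspaper does not improve it (4.402 versus 4.391). A different `random_state` could change the exact values, so the ordering of close results should be read with some caution.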
