
# **Assignment #5**
**James Perdue 1012457081** 

**Daniel Schaefer 2001714504**


# **Multi-Layer Perceptron (MLP) Regressor:**
We used the Auto MPG Data Set from 
https://archive.ics.uci.edu/ml/datasets/Auto+MPG 
 
With this data set, we hope to train our models to accurately predict MPG from all of the 
attributes except car name, which is just a string that is unique to each instance and has no 
bearing on performance. 
As stated on the website, 
“Source: 
This dataset was taken from the StatLib library which is maintained at Carnegie Mellon 
University. The dataset was used in the 1983 American Statistical Association Exposition.”


# **Import Libraries**

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import RobustScaler
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error
from sklearn.metrics import mean_absolute_error
from sklearn.metrics import r2_score

# **Import Dataset**
We removed the 6 entries that had a blank horsepower field and converted the .data file to a .csv.

“Attribute Information: 
1. mpg: continuous  
2. cylinders: multi-valued discrete  
3. displacement: continuous  
4. horsepower: continuous  
5. weight: continuous  
6. acceleration: continuous  
7. model year: multi-valued discrete  
8. origin: multi-valued discrete  
9. car name: string (unique for each instance)” 

To preprocess the data, we converted the .data file into a .csv. For simplicity, we also deleted 
the instances that had blank or null horsepower values rather than making up values or 
inserting 0.
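
The conversion described above could be sketched roughly as follows. This is a minimal illustration rather than the exact steps we ran: it assumes the raw UCI file is whitespace-delimited, marks missing values with `?`, and wraps the car name in double quotes. The helper name `load_auto_mpg` and the second sample row are hypothetical.

```python
import io
import pandas as pd

COLUMNS = ["mpg", "cylinders", "displacement", "horsepower", "weight",
           "acceleration", "model year", "origin", "car name"]

def load_auto_mpg(source):
    """Read the whitespace-delimited raw file and drop rows with missing values."""
    raw = pd.read_csv(source, sep=r"\s+", names=COLUMNS, na_values="?")
    return raw.dropna().reset_index(drop=True)

# Two illustrative rows in the assumed raw layout; the second row is hypothetical.
sample = io.StringIO(
    '18.0 8 307.0 130.0 3504. 12.0 70 1 "chevrolet chevelle malibu"\n'
    '25.0 4 98.0 ? 2046. 19.0 71 1 "hypothetical car"\n'
)
clean = load_auto_mpg(sample)
print(clean)
# clean.to_csv("csvautompg.csv") would then write a CSV like the one read below.
```

Passing a file path instead of the in-memory sample should work the same way, provided the file follows the assumed layout.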

In [None]:
df = pd.read_csv("csvautompg.csv",sep=',')
df

Unnamed: 0,mpg,cylinders,displacement,horsepower,weight,acceleration,model year,origin,car name
0,18.0,8,307.0,130,3504,12.0,70,1,chevrolet chevelle malibu
1,15.0,8,350.0,165,3693,11.5,70,1,buick skylark 320
2,18.0,8,318.0,150,3436,11.0,70,1,plymouth satellite
3,16.0,8,304.0,150,3433,12.0,70,1,amc rebel sst
4,17.0,8,302.0,140,3449,10.5,70,1,ford torino
...,...,...,...,...,...,...,...,...,...
387,27.0,4,140.0,86,2790,15.6,82,1,ford mustang gl
388,44.0,4,97.0,52,2130,24.6,82,2,vw pickup
389,32.0,4,135.0,84,2295,11.6,82,1,dodge rampage
390,28.0,4,120.0,79,2625,18.6,82,1,ford ranger


# **Define x and y** 
We omitted the car name string, which is unique for each instance, and set the target r^t to MPG.

In [None]:
x=df.drop(['mpg', 'car name'], axis=1).values
y=df['mpg'].values

# **Split the dataset into training set and test set**

In [None]:
from sklearn.model_selection import train_test_split
# Note: test_size=0.8 holds out 80% of the rows for testing and trains on the remaining 20%.
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.8, random_state=1)
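
To see what this split means in practice, the sketch below runs the same call on placeholder data. The 391 rows are an assumption based on the 0 to 390 index shown in the table above, and the arrays are dummies rather than the real features.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 391 rows is an assumption based on the index range 0..390.
x_demo = np.arange(391 * 2).reshape(391, 2)
y_demo = np.arange(391)

Xd_train, Xd_test, yd_train, yd_test = train_test_split(
    x_demo, y_demo, test_size=0.8, random_state=1
)
print(len(Xd_train), len(Xd_test))  # 78 313
```

With `test_size=0.8` the models are trained on only 78 rows and evaluated on 313. A value such as `test_size=0.2` would reverse those proportions.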

# **Hyper Parameter**

Rather than picking three arbitrary combinations of model parameters, we use GridSearchCV to tune our hyperparameters and find the three best-performing combinations. The downside is that the search takes some time, about 15 seconds on average.

**Original Hyperparameters:**

**Solver:** 'lbfgs', 'sgd', 'adam'

**Learning Rate:** 'constant', 'invscaling', 'adaptive'

**Activation:** 'logistic', 'tanh', 'relu'

**Hidden Layer Sizes:** (15,10,3), (7,7,7), (7)

**Alpha:** 0.1, 0.001, 0.0001

We had problems with the model not converging when all of these parameters were available to GridSearchCV, so we experimented with the hyperparameters to tune them to our dataset as well as possible. The reduced set below gave the best balance of speed, accuracy, and convergence.
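
As a rough illustration of why the full grid was expensive, the sketch below counts the combinations in the original and reduced grids with scikit-learn's `ParameterGrid`. The variable names are ours, `(7)` is written as the tuple `(7,)`, and the fold count of 5 assumes the default used when `cv=None`.

```python
from sklearn.model_selection import ParameterGrid

original_grid = {
    "estimator__solver": ["lbfgs", "sgd", "adam"],
    "estimator__learning_rate": ["constant", "invscaling", "adaptive"],
    "estimator__activation": ["logistic", "tanh", "relu"],
    "estimator__hidden_layer_sizes": [(15, 10, 3), (7, 7, 7), (7,)],
    "estimator__alpha": [0.1, 0.001, 0.0001],
}
reduced_grid = {
    "estimator__solver": ["sgd"],
    "estimator__learning_rate": ["constant", "adaptive"],
    "estimator__activation": ["tanh", "relu"],
    "estimator__hidden_layer_sizes": [(7, 7, 7), (7,)],
    "estimator__alpha": [0.1, 0.001, 0.0001],
}

n_folds = 5  # assumption: scikit-learn's default when cv=None
for name, grid_spec in [("original", original_grid), ("reduced", reduced_grid)]:
    n_combinations = len(ParameterGrid(grid_spec))
    print(name, n_combinations, "combinations,", n_combinations * n_folds, "fits")
```

The original grid has 243 combinations against 24 for the reduced one, so the reduced search needs roughly a tenth of the model fits.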

In [None]:
from sklearn.model_selection import GridSearchCV

# Scale the features, then fit the MLP. The step name "estimator" gives the
# "estimator__" prefix used in the parameter grid.
estimator = Pipeline(steps=[
    ("scaler", StandardScaler()),
    ("estimator", MLPRegressor(max_iter=5000, early_stopping=True)),
])
hidden_layer_sizes = [(7, 7, 7), (7)]  # (7) is the integer 7: one hidden layer of 7 units
# Reduced grid; the trailing comments list the original candidates.
hyper_parameter = [
    {
        'estimator__solver': ['sgd'],                           # ['lbfgs', 'sgd', 'adam']
        'estimator__learning_rate': ['constant', 'adaptive'],   # ['constant', 'invscaling', 'adaptive']
        'estimator__activation': ['tanh', 'relu'],              # ['logistic', 'tanh', 'relu']
        'estimator__hidden_layer_sizes': hidden_layer_sizes,
        'estimator__alpha': [.1, 0.001, 0.0001],
    },
]
# cv=None uses scikit-learn's default 5-fold cross-validation.
grid = GridSearchCV(estimator, hyper_parameter, refit=True, n_jobs=-1, cv=None)
grid.fit(X_train, y_train)
grid_predictions = grid.predict(X_test)

# **Train the Model on the training set**
**Before training the models, we filtered the grid-search results so we could grab the parameters of the top three estimators.** 

We are not concerned with sorting by rank at this point; we just want three models to compare.

In [None]:
df = pd.DataFrame(grid.cv_results_)
# Keep the rows ranked in the top three (left in grid order, not sorted by rank).
df = df.loc[df['rank_test_score'] <= 3]
df = df.loc[:, ~df.columns.isin(['mean_fit_time', 'std_fit_time', 'mean_score_time', 'std_score_time', 'split0_test_score', 'split1_test_score', 'split2_test_score', 'split3_test_score', 'split4_test_score', 'mean_test_score', 'std_test_score' , 'params'])]
df = df.reset_index(drop=True)

def build_regrssor(i):
    # Rebuild a scaler + MLP pipeline from row i of the filtered results.
    mlp = MLPRegressor(
        learning_rate=df.at[i, 'param_estimator__learning_rate'], solver=df.at[i, 'param_estimator__solver'],
        hidden_layer_sizes=df.at[i, 'param_estimator__hidden_layer_sizes'], alpha=df.at[i, 'param_estimator__alpha'],
        activation=df.at[i, 'param_estimator__activation'], random_state=1, max_iter=5000)
    return Pipeline(steps=[("scaler", StandardScaler()), ("mlp", mlp)])

regrssor1, regrssor2, regrssor3 = build_regrssor(0), build_regrssor(1), build_regrssor(2)
regrssor1.fit(X_train, y_train)
regrssor2.fit(X_train, y_train)
regrssor3.fit(X_train, y_train)

Pipeline(steps=[('scaler', StandardScaler()),
                ('mlp',
                 MLPRegressor(hidden_layer_sizes=(7, 7, 7), max_iter=5000,
                              random_state=1, solver='sgd'))])
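
An alternative way to rebuild the top-ranked models is to reuse the `params` dictionaries stored in `cv_results_` together with `clone` and `set_params`. This is only a sketch and not what the notebook does, since we drop the `params` column above. The synthetic data, the `lbfgs` solver, the small grid, and the names `X_demo`, `pipe`, and `search` are assumptions chosen to keep the example short and fast.

```python
import pandas as pd
from sklearn.base import clone
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (an assumption; the notebook uses the Auto MPG features).
X_demo, y_demo = make_regression(n_samples=80, n_features=7, noise=5.0, random_state=1)

pipe = Pipeline(steps=[
    ("scaler", StandardScaler()),
    ("estimator", MLPRegressor(solver="lbfgs", max_iter=200, random_state=1)),
])
param_grid = {
    "estimator__hidden_layer_sizes": [(7,), (7, 7, 7)],
    "estimator__alpha": [0.1, 0.001],
}
search = GridSearchCV(pipe, param_grid, cv=3)
search.fit(X_demo, y_demo)

# Sort by rank so that row 0 is the best combination, then keep three.
results = pd.DataFrame(search.cv_results_).sort_values("rank_test_score", kind="stable")
top3_params = results["params"].head(3).tolist()

# clone() copies the unfitted pipeline; set_params() applies one candidate's settings.
models = [clone(pipe).set_params(**p).fit(X_demo, y_demo) for p in top3_params]
for p in top3_params:
    print(p)
```

Because each model is cloned from the searched pipeline, it keeps every setting that the grid did not vary, which avoids differences between the search and the rebuilt models.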

# **Table of Convergence** 

In [None]:
records = [regrssor1.named_steps["mlp"].get_params() , regrssor2.named_steps["mlp"].get_params(), regrssor3.named_steps["mlp"].get_params() ]
iter_array = [regrssor1.named_steps["mlp"].n_iter_ , regrssor2.named_steps["mlp"].n_iter_, regrssor3.named_steps["mlp"].n_iter_]
model_array = ["regrssor1", "regrssor2", "regrssor3"]
df = pd.DataFrame(records)
df.insert(0,"Model Name", model_array)
df.insert(1,"Iterations till Convergence", iter_array)
converge = df.loc[:, ~df.columns.isin(['batch_size', 'beta_1', 'beta_2', 'early_stopping', 'epsilon', 'max_fun', 'max_iter', 'momentum', 'n_iter_no_change', 'nesterovs_momentum', 'power_t', 'random_state', 'shuffle', 'tol', 'validation_fraction', 'verbose', 'warm_start', 'learning_rate_init'])]
converge

Unnamed: 0,Model Name,Iterations till Convergence,activation,alpha,hidden_layer_sizes,learning_rate,solver
0,regrssor1,4071,tanh,0.0001,7,constant,sgd
1,regrssor2,2416,relu,0.001,"(7, 7, 7)",adaptive,sgd
2,regrssor3,2362,relu,0.0001,"(7, 7, 7)",constant,sgd
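
One way to read the table above is to compare each iteration count with the `max_iter=5000` cap. The sketch below uses stand-in objects that carry the values from the table; the helper name `stopped_before_cap` is ours, and a fitted `MLPRegressor` exposes the same `n_iter_` and `max_iter` attributes.

```python
from types import SimpleNamespace

def stopped_before_cap(mlp):
    """Return True if training ended before max_iter was reached."""
    return mlp.n_iter_ < mlp.max_iter

# Stand-ins with the iteration counts from the table and max_iter=5000.
for name, n_iter in [("regrssor1", 4071), ("regrssor2", 2416), ("regrssor3", 2362)]:
    stand_in = SimpleNamespace(n_iter_=n_iter, max_iter=5000)
    print(name, stopped_before_cap(stand_in))
```

All three counts are below the cap, which suggests that the solver's stopping rule ended training rather than the iteration limit.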


# **Predict the training set results and test set results**


In [None]:
y1_pred = regrssor1.predict(X_train)
y2_pred = regrssor2.predict(X_train)
y3_pred = regrssor3.predict(X_train)
test1y_pred = regrssor1.predict(X_test)
test2y_pred = regrssor2.predict(X_test)
test3y_pred = regrssor3.predict(X_test)

# **Table of Evaluation Metrics for the Test Dataset and Training Dataset**

In [None]:
def metric_rows(y_true, preds):
    # One [name, MSE, MAE, R2] row per model.
    return [[name, mean_squared_error(y_true, p), mean_absolute_error(y_true, p), r2_score(y_true, p)]
            for name, p in zip(['regrssor1', 'regrssor2', 'regrssor3'], preds)]

columns = ['Model Name', 'MSE', 'MAE', 'R2']
test_metrics = pd.DataFrame(metric_rows(y_test, [test1y_pred, test2y_pred, test3y_pred]), columns=columns)
train_metrics = pd.DataFrame(metric_rows(y_train, [y1_pred, y2_pred, y3_pred]), columns=columns)
print("Test Dataset\n")
test_metrics

Test Dataset



Unnamed: 0,Model Name,MSE,MAE,R2
0,regrssor1,12.615433,2.644191,0.799848
1,regrssor2,10.966274,2.424678,0.826013
2,regrssor3,10.964056,2.424278,0.826048


In [None]:
print("Training Dataset\n")
train_metrics

Training Dataset



Unnamed: 0,Model Name,MSE,MAE,R2
0,regrssor1,0.651338,0.643469,0.98735
1,regrssor2,0.563804,0.60094,0.98905
2,regrssor3,0.565185,0.60214,0.989023
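
For reference, the three metrics in the tables above can be computed directly from their definitions. The sketch below is an illustration with a helper name of our own (`regression_metrics`). It uses only the first five test rows of regrssor1 from the Predicted Values Vs Actual Values table, so its numbers will not match the full-dataset tables.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return MSE, MAE and R2 computed directly from their definitions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    mse = np.mean(residuals ** 2)                    # mean squared error
    mae = np.mean(np.abs(residuals))                 # mean absolute error
    ss_res = np.sum(residuals ** 2)                  # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return {"MSE": mse, "MAE": mae, "R2": 1.0 - ss_res / ss_tot}

# First five test rows for regrssor1 (illustration only).
actual = [23.0, 29.0, 32.4, 19.0, 38.0]
predicted = [19.019481, 26.019011, 33.137711, 23.031938, 35.925165]
print(regression_metrics(actual, predicted))
```

MSE and MAE are in squared MPG and MPG respectively, so lower is better, while R2 approaches 1 as the predictions explain more of the variance in MPG.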


# **Predicted Values Vs Actual Values**

In [None]:
pred_y_df=pd.DataFrame({'Actual MPG':y_test, 'regrssor1 y^t':test1y_pred, 'Difference1': y_test-test1y_pred, 'regrssor2 y^t':test2y_pred, 'Difference2': y_test-test2y_pred, 'regrssor3 y^t':test3y_pred, 'Difference3': y_test-test3y_pred})
pred_y_df

Unnamed: 0,Actual MPG,regrssor1 y^t,Difference1,regrssor2 y^t,Difference2,regrssor3 y^t,Difference3
0,23.0,19.019481,3.980519,22.400873,0.599127,22.399721,0.600279
1,29.0,26.019011,2.980989,26.507434,2.492566,26.509200,2.490800
2,32.4,33.137711,-0.737711,34.328320,-1.928320,34.328999,-1.928999
3,19.0,23.031938,-4.031938,21.277331,-2.277331,21.276138,-2.276138
4,38.0,35.925165,2.074835,39.021756,-1.021756,39.025393,-1.025393
...,...,...,...,...,...,...,...
309,26.0,28.618199,-2.618199,27.052679,-1.052679,27.054060,-1.054060
310,18.0,17.374404,0.625596,17.371426,0.628574,17.370684,0.629316
311,34.0,34.357076,-0.357076,36.971336,-2.971336,36.972768,-2.972768
312,37.3,31.058356,6.241644,30.642236,6.657764,30.647616,6.652384
