
<font color="green">*To start working on this notebook, or any other notebook that we will use in the Moringa Data Science Course, we will need to save our own copy of it. We can do this by clicking File > Save a Copy in Drive. We will then be able to make edits to our own copy of this notebook.*</font>

## Python Programming: Regression using Neural Networks

### Import Libraries

In [0]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
# Import the Multi-Layer Perceptron regressor estimator from Scikit-Learn's neural_network module
from sklearn.neural_network import MLPRegressor
import matplotlib.pyplot as plt
from sklearn import metrics


### Example 1

In this example, we use a dataset that we have worked with before so that we can compare the performance of the models.

Recall the example we looked at while doing linear regression. We'll tackle the same problem, but now using neural networks.

As a refresher, we were trying to predict a student's chance of getting into grad school based on some test scores.

Let's dive in!!

**Loading data**

In [0]:
#load the data
data = pd.read_csv('http://bit.ly/uni_admission')
data.head()

#### Using 1 feature
First, we'll use only 1 feature and see how the model performs. Then we'll increase the number of features.

In [0]:
# First, we'll use the GRE test scores to predict the chance of admission
# as_matrix() was removed in pandas 1.0; .values returns the same NumPy array
X = data['GRE'].values
y = data['admit_chance']


data.plot(x='GRE', y='admit_chance', style='o')
plt.title('GRE Score VS Chance of admission')
plt.xlabel('GRE score')
plt.ylabel('chance of admission')
plt.show()

In [0]:
# Split the dataset into train and test set
X_train, X_test, y_train,y_test = train_test_split(X,y, test_size=0.2, random_state=20)

# Just as we did for the classifier, we need to standardize our data

# Initialize the scaler
scaler = StandardScaler()

# Fitting the scaler
# The scaler expects a 2D array but X_train is 1D, so reshape(-1, 1)
# turns it into 1 column with the number of rows inferred from the data
scaler.fit(X_train.reshape(-1,1))

# Applying the transformation to the data
X_train = scaler.transform(X_train.reshape(-1,1))

X_test = scaler.transform(X_test.reshape(-1,1))
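
The reshape and scaling steps above can be checked on a tiny array. This is a minimal sketch, assuming made-up GRE-like scores; the names `train_scores`, `test_scores`, and `demo_scaler` are hypothetical and are not part of the admissions dataset.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical GRE-like scores, used only for illustration
train_scores = np.array([300.0, 310.0, 320.0, 330.0])
test_scores = np.array([305.0, 340.0])

# reshape(-1, 1): one column, with the row count inferred from the data
train_2d = train_scores.reshape(-1, 1)
test_2d = test_scores.reshape(-1, 1)
print(train_2d.shape)

# Fit on the training data only, then reuse the same mean and scale
# when transforming the test data
demo_scaler = StandardScaler().fit(train_2d)
train_scaled = demo_scaler.transform(train_2d)
test_scaled = demo_scaler.transform(test_2d)

print(demo_scaler.mean_)                        # mean learned from the training data
print(train_scaled.mean(), train_scaled.std())  # close to 0 and 1
print(test_scaled.ravel())                      # test values on the training scale
```

Fitting the scaler on the training set alone keeps information about the test set out of the preprocessing step, which is why the test data is only transformed and never fitted.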


**Training the Model**

As with the classifier, the regressor takes the same parameters.

In [0]:
# Instantiating the model
# With activation='identity' the hidden layers apply no non-linearity, so the
# network can only learn a linear relationship, just like linear regression
mlp = MLPRegressor(hidden_layer_sizes=(50,50), solver='sgd', activation='identity')

# fitting the model
mlp.fit(X_train,y_train)
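
To see why the identity activation makes this a linear model, the sketch below collapses the fitted layers into a single weight and bias and compares the result with `predict`. It assumes synthetic data (`X_demo`, `y_demo` are made up for illustration) and relies on the fitted `coefs_` and `intercepts_` attributes of `MLPRegressor`.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Synthetic, already-standardized feature and a roughly linear target (illustrative only)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 1))
y_demo = 0.7 + 0.1 * X_demo[:, 0] + rng.normal(scale=0.02, size=100)

demo_mlp = MLPRegressor(hidden_layer_sizes=(50, 50), solver='sgd',
                        activation='identity', random_state=0)
demo_mlp.fit(X_demo, y_demo)

# With no non-linearity, stacking layers is just repeated matrix multiplication,
# so the whole network reduces to one weight matrix and one bias vector
weight = demo_mlp.coefs_[0]
bias = demo_mlp.intercepts_[0]
for W, b in zip(demo_mlp.coefs_[1:], demo_mlp.intercepts_[1:]):
    weight = weight @ W
    bias = bias @ W + b

linear_pred = (X_demo @ weight + bias).ravel()
matches = np.allclose(linear_pred, demo_mlp.predict(X_demo))
print('Collapsed linear model matches the network:', matches)
print('Effective slope and intercept:', weight.ravel(), bias)
```

Switching to a non-linear activation such as `'relu'` would break this equivalence, which is what lets a network model curved relationships.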

**Prediction**

In [0]:
# Predicting
y_pred = mlp.predict(X_test)  # X_test is already 2D after scaling

**Visualization**

Since we are using only 1 feature, we can easily visualize the results.

In [0]:
plt.scatter(X_test, y_test, color='black')
plt.plot(X_test, y_pred, color='red', linewidth=2)
plt.show()

**Evaluation**


In [0]:
# Our first metric is MAE - Mean absolute error
print('Mean Absolute Error:', metrics.mean_absolute_error(y_test, y_pred))

# We can also use MSE - Mean squared error
print('Mean Squared Error:', metrics.mean_squared_error(y_test, y_pred))  

# Finally, a widely used metric: RMSE - Root mean squared error (same units as the target)
print('Root Mean Squared Error:', np.sqrt(metrics.mean_squared_error(y_test, y_pred)))  
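
For reference, the three metrics above can be reproduced by hand with NumPy. This is a small sketch on made-up values (`y_true_demo` and `y_pred_demo` are hypothetical), compared against the scikit-learn functions used in this notebook.

```python
import numpy as np
from sklearn import metrics

# Made-up true and predicted admission chances, for illustration only
y_true_demo = np.array([0.50, 0.70, 0.90, 0.60])
y_pred_demo = np.array([0.55, 0.65, 0.80, 0.60])

errors = y_true_demo - y_pred_demo

mae = np.mean(np.abs(errors))   # average size of the errors
mse = np.mean(errors ** 2)      # squaring penalizes large errors more heavily
rmse = np.sqrt(mse)             # back in the same units as the target

print('MAE :', mae, metrics.mean_absolute_error(y_true_demo, y_pred_demo))
print('MSE :', mse, metrics.mean_squared_error(y_true_demo, y_pred_demo))
print('RMSE:', rmse)
```

Because the errors are squared before averaging, RMSE is never smaller than MAE, and the gap between them grows when a few predictions are far off.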


#### Using Multiple features

In [0]:
# Separating our target from our features
X = data[['GRE','TOEFL']].values
y = data['admit_chance']

In [0]:
# Split the data
X_train, X_test, y_train,y_test = train_test_split(X,y, test_size=0.2, random_state = 20)

# Initialize the scaler
scaler = StandardScaler()

# Fitting the scaler
scaler.fit(X_train)

# Applying the transformation to the data
X_train = scaler.transform(X_train)

X_test = scaler.transform(X_test)


**Model Training**


In [0]:
mlp = MLPRegressor(hidden_layer_sizes=(50,50), solver='sgd', activation='identity')

# Fitting the model
mlp.fit(X_train,y_train)

MLPRegressor(activation='identity', alpha=0.0001, batch_size='auto', beta_1=0.9,
             beta_2=0.999, early_stopping=False, epsilon=1e-08,
             hidden_layer_sizes=(50, 50), learning_rate='constant',
             learning_rate_init=0.001, max_iter=200, momentum=0.9,
             n_iter_no_change=10, nesterovs_momentum=True, power_t=0.5,
             random_state=None, shuffle=True, solver='sgd', tol=0.0001,
             validation_fraction=0.1, verbose=False, warm_start=False)

**Prediction**

In [0]:
# Making predictions
y_pred = mlp.predict(X_test)

**Model Evaluation**

In [0]:
# Our first metric is MAE - Mean absolute error
print('Mean Absolute Error:', metrics.mean_absolute_error(y_test, y_pred))

# We can also use MSE - Mean squared error
print('Mean Squared Error:', metrics.mean_squared_error(y_test, y_pred))  

# Finally, a widely used metric: RMSE - Root mean squared error (same units as the target)
print('Root Mean Squared Error:', np.sqrt(metrics.mean_squared_error(y_test, y_pred)))  

**Conclusion**

We can conclude that there is a slight improvement when using multiple features. This is reflected by the decrease in the RMSE.

Remember, you can always tune your parameters for better results.
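
One possible way to tune the parameters is a cross-validated grid search. The sketch below is only an illustration under stated assumptions: it uses synthetic two-feature data instead of the admissions dataset, and the grid values for `hidden_layer_sizes` and `alpha` are arbitrary choices, not recommendations.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for two test scores and an admission chance (illustrative only)
rng = np.random.default_rng(20)
X_demo = rng.normal(loc=[315.0, 107.0], scale=[11.0, 6.0], size=(200, 2))
y_demo = (0.7 + 0.005 * (X_demo[:, 0] - 315.0) + 0.005 * (X_demo[:, 1] - 107.0)
          + rng.normal(scale=0.03, size=200))

# Putting the scaler in a pipeline means it is refit on each training fold
pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('mlp', MLPRegressor(solver='sgd', activation='identity', random_state=20)),
])

# Arbitrary example grid: two network shapes and two regularization strengths
param_grid = {
    'mlp__hidden_layer_sizes': [(10,), (50, 50)],
    'mlp__alpha': [1e-4, 1e-2],
}

search = GridSearchCV(pipeline, param_grid, cv=3, scoring='neg_mean_squared_error')
search.fit(X_demo, y_demo)

# best_score_ is the negated MSE, so flip the sign before taking the square root
best_rmse = np.sqrt(-search.best_score_)
print('Best parameters:', search.best_params_)
print('Cross-validated RMSE:', best_rmse)
```

The same pattern should carry over to the real data by passing the unscaled `X` and `y` to `search.fit`, since the pipeline handles the scaling itself.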

### <font color='green'>Challenge</font>

In [0]:
# Use a neural network to predict a person's salary based on their experience
# Dataset url ------> http://bit.ly/salary_dataset