# Deep Learning Regression with Admissions Data

## 1. Project Goal

#### In this project, I will develope a deep learning regression that predicts the probability (0 to 1) that a students will be accepted to graduate schools based on application factors 

In [19]:
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import InputLayer
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam
    

## 2. Dataset 

#### The dataset is from [Kaggle](https://www.kaggle.com/mohansacharya/graduate-admissions?select=Admission_Predict_Ver1.1.csv) containing information about 500 applicants from various universities and their chance of getting admitted

In [2]:
admission = pd.read_csv('admissions_data.csv')

In [3]:
admission

Unnamed: 0,Serial No.,GRE Score,TOEFL Score,University Rating,SOP,LOR,CGPA,Research,Chance of Admit
0,1,337,118,4,4.5,4.5,9.65,1,0.92
1,2,324,107,4,4.0,4.5,8.87,1,0.76
2,3,316,104,3,3.0,3.5,8.00,1,0.72
3,4,322,110,3,3.5,2.5,8.67,1,0.80
4,5,314,103,2,2.0,3.0,8.21,0,0.65
...,...,...,...,...,...,...,...,...,...
495,496,332,108,5,4.5,4.0,9.02,1,0.87
496,497,337,117,5,5.0,5.0,9.87,1,0.96
497,498,330,120,5,4.5,5.0,9.56,1,0.93
498,499,312,103,4,4.0,5.0,8.43,0,0.73


The Dataset includes nine different columns as below:

* Serial No.: Index of each row (1-500)
* GRE Score: GRE test score (out of 340)
* TOEFL Score: TOEFL test score (out of 120)
* University Rating: Evaluated university rating (out of 5)
* SOP: Statement of Purpose Strength (out of 5)
* LOR: Letter of Recommendation Strength (out of 5)
* CGPA: Undergraduate GPA (out of 10)
* Research: Has research experience (either 0 or 1)
* Chance of Admit: Applicant’s chance of being admitted (ranging from 0 to 1)

## 3. Implementing Deep Learning 

### a. Preprocessing Data for learning

 (1) Separating features from labels using array slicing

In [4]:
#drop the first column (serial No.)
admission = admission.drop(columns = ['Serial No.'])

#Choose all columns except for last column for features
features = admission.iloc[:,0:-1]

#Choose the last column for labels
labels = admission.iloc[:,-1]

 (2) Splitting the data into training and test sets (test_size = 0.33)

In [5]:
#split the data into training and test sets 
feature_train,feature_test,label_train,label_test = train_test_split(features,labels,test_size=0.33, random_state =42)

 (3) Standardize the numerical features 

In [6]:
#standardize the numeric columns using Columntransformer
ct = ColumnTransformer([('standardize', StandardScaler(),[True,True,True,True,True,True,True])])
features_train_scale=ct.fit_transform(feature_train)
features_test_scale = ct.fit_transform(feature_test)

### b. Designing a Sequential model 

 (1) Creating input, output and hidden layers  

In [7]:
#create a Sequential object
model = Sequential()

#create input layers
input = InputLayer(input_shape = (features.shape[1],))

#add the input layer
model.add(input)

#add a hidden layer with 200 neurons
model.add(Dense(200,activation = 'relu'))

#add an output layer
model.add(Dense(1))

 (2) Choosing a learning rate hyperparameter using an optimizer: Adam

In [8]:
opt = Adam(learning_rate = .01)
model.compile(loss='mse',metrics=['mae'],optimizer=opt)

### c. Training the Model

In [12]:
#Train the model
model.fit(features_train_scale,label_train,epochs=40,batch_size=1,verbose=1)

Epoch 1/40
Epoch 2/40
Epoch 3/40
Epoch 4/40
Epoch 5/40
Epoch 6/40
Epoch 7/40
Epoch 8/40
Epoch 9/40
Epoch 10/40
Epoch 11/40
Epoch 12/40
Epoch 13/40
Epoch 14/40
Epoch 15/40
Epoch 16/40
Epoch 17/40
Epoch 18/40
Epoch 19/40
Epoch 20/40
Epoch 21/40
Epoch 22/40
Epoch 23/40
Epoch 24/40
Epoch 25/40
Epoch 26/40
Epoch 27/40
Epoch 28/40
Epoch 29/40
Epoch 30/40
Epoch 31/40
Epoch 32/40
Epoch 33/40
Epoch 34/40
Epoch 35/40
Epoch 36/40
Epoch 37/40
Epoch 38/40
Epoch 39/40
Epoch 40/40


<tensorflow.python.keras.callbacks.History at 0x7fd126129310>

In [14]:
#evaluate the model
val_mse, val_mae = model.evaluate(features_test_scale,label_test,verbose=0)

In [16]:
print('MAE: ', val_mae)

MAE:  0.04948580265045166


In [17]:
print('MSE:', val_mse)

MSE: 0.0050109680742025375
