# Deep Learning Regression Model for Predicting an Applicant's Chance of Admission
---

- Author: [Stefanus Bernard Melkisedek](https://www.github.com/stefansphtr)
- Email: [stefanussipahutar@gmail.com](mailto:stefanussipahutar@gmail.com)
- Date: 2024-02-04

## Project Description

This project builds a regression model that predicts an applicant's chance of being admitted to a university from their application profile, including exam scores. The dataset contains nine columns:

1. Serial No. (index)
2. GRE Score (int)
3. TOEFL Score (int)
4. University Rating (int) 
5. SOP (Statement of Purpose) (float)
6. LOR (Letter of Recommendation) (float)
7. CGPA (Cumulative Grade Point Average) (float)
8. Research (int)
9. Chance of Admit (float)

The model is a neural network with two hidden layers. It is trained on the training set and evaluated on the test set using the mean squared error (MSE) and the R-squared score.
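
The two evaluation metrics can be written out by hand to make clear what they measure. The sketch below is only illustrative: the helper names `mean_squared_error` and `r_squared` are hypothetical, and both the targets and the predictions are made-up values, not results from this project.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """1 minus the ratio of residual variation to total variation around the mean."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative values only (assumption): targets and hypothetical predictions
y_true = [0.92, 0.76, 0.72, 0.80, 0.65]
y_pred = [0.90, 0.78, 0.70, 0.80, 0.70]

mse = mean_squared_error(y_true, y_pred)
r2 = r_squared(y_true, y_pred)
print(f"MSE: {mse:.5f}, R-squared: {r2:.4f}")
```

A lower MSE means smaller errors on average, while an R-squared closer to 1 means the predictions explain more of the variation in the target than simply predicting its mean.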

## Prepare the libraries

In [2]:
# Importing libraries for data manipulation and analysis
import numpy as np  # For numerical operations
import pandas as pd  # For data manipulation and analysis
import matplotlib.pyplot as plt  # For data visualization

# Importing TensorFlow and Keras for creating and training the neural network model
import tensorflow as tf  # For machine learning and numerical computation
from tensorflow import keras  # High-level API to build and train models in TensorFlow
from tensorflow.keras.models import Sequential  # For linear stacking of layers
from tensorflow.keras.callbacks import EarlyStopping  # To stop training when a monitored metric has stopped improving
from tensorflow.keras.layers import Dense  # For fully connected layers

# Importing Scikit-learn libraries for data preprocessing and performance metrics
from sklearn.model_selection import train_test_split  # For splitting the data into train and test sets
from sklearn.preprocessing import StandardScaler  # For standardization of features
from sklearn.preprocessing import Normalizer  # For rescaling each sample (row) to unit norm
from sklearn.metrics import r2_score  # For regression performance metrics
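
The two preprocessing classes imported above behave differently, which is easy to overlook. With default settings, `StandardScaler` works per feature (each column gets zero mean and unit variance), whereas `Normalizer` works per sample (each row is rescaled to unit L2 norm). The sketch below reproduces both operations by hand with NumPy on a small hypothetical matrix `X`; it is meant to illustrate the difference, not to show which one this project ends up using.

```python
import numpy as np

# Hypothetical feature matrix (assumption): rows are applicants,
# columns are GRE Score, TOEFL Score, CGPA
X = np.array([
    [337.0, 118.0, 9.65],
    [324.0, 107.0, 8.87],
    [316.0, 104.0, 8.00],
])

# Standardization: operate column-wise, so every feature ends up
# with mean 0 and (population) standard deviation 1
X_standardized = (X - X.mean(axis=0)) / X.std(axis=0)

# Normalization: operate row-wise, so every sample ends up
# with Euclidean (L2) length 1
X_normalized = X / np.linalg.norm(X, axis=1, keepdims=True)

print(X_standardized.round(3))
print(X_normalized.round(3))
```

For features on very different scales, such as GRE scores in the 300s and CGPA below 10, per-feature standardization is usually the more natural choice for a neural network, since row-wise normalization leaves the large-valued columns dominant.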

## Data Wrangling

### Gathering the dataset

The Graduate Admission 2 dataset was obtained from [Kaggle](https://www.kaggle.com/datasets/mohansacharya/graduate-admissions) and is stored in the `data` directory.

Read and store the dataset in a DataFrame.

In [3]:
df_admissions = pd.read_csv('./data/admissions_data.csv')
df_admissions.head()

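| Unnamed: 0 | Serial No. | GRE Score | TOEFL Score | University Rating | SOP | LOR | CGPA | Research | Chance of Admit |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 337 | 118 | 4 | 4.5 | 4.5 | 9.65 | 1 | 0.92 |
| 1 | 2 | 324 | 107 | 4 | 4.0 | 4.5 | 8.87 | 1 | 0.76 |
| 2 | 3 | 316 | 104 | 3 | 3.0 | 3.5 | 8.0 | 1 | 0.72 |
| 3 | 4 | 322 | 110 | 3 | 3.5 | 2.5 | 8.67 | 1 | 0.8 |
| 4 | 5 | 314 | 103 | 2 | 2.0 | 3.0 | 8.21 | 0 | 0.65 |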
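
Once the data is loaded, a typical next step is to separate the predictors from the target. The sketch below rebuilds the five preview rows from an inline string so it runs without the CSV file. It assumes that `Serial No.` is only a row identifier and can be dropped, and it strips whitespace from column names as a precaution. The names `df_sample`, `features`, and `target` are hypothetical and not part of the original notebook.

```python
import io

import pandas as pd

# The five preview rows, inlined so the example is self-contained
csv_text = """Serial No.,GRE Score,TOEFL Score,University Rating,SOP,LOR,CGPA,Research,Chance of Admit
1,337,118,4,4.5,4.5,9.65,1,0.92
2,324,107,4,4.0,4.5,8.87,1,0.76
3,316,104,3,3.0,3.5,8.0,1,0.72
4,322,110,3,3.5,2.5,8.67,1,0.8
5,314,103,2,2.0,3.0,8.21,0,0.65
"""

df_sample = pd.read_csv(io.StringIO(csv_text))

# Precaution: remove stray leading/trailing spaces in column names
df_sample.columns = df_sample.columns.str.strip()

# Assumption: "Serial No." carries no predictive information, so it is dropped
features = df_sample.drop(columns=["Serial No.", "Chance of Admit"])
target = df_sample["Chance of Admit"]

print(features.shape)
print(target.tolist())
```

The same two lines that build `features` and `target` would apply to the full `df_admissions` DataFrame, provided its column names match those shown in the preview.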
