#### Models used in training
* Linear Regression
* XGBoost Regressor
* Neural Network

#### General Steps to Follow
* Defining some useful functions
* Importaing packages and reading the data
* Defining x_train, x_test, y_train, y_test
* Train the model using different algorithms

### Evaluation Functions
* Mean Squared Error
* Square Error
* R2 Score

## 1) Defining some useful functions

In [1]:
def square_error(y_true, y_pred):
    m = y_true.shape[0]
    result = np.sum(np.square(y_true-y_pred))
    result /= (2*m)
    return result

In [58]:
def evaluation(y_train, y_test, y_pred_train, y_pred_test):
    print("Train Data")
    mse = mean_squared_error(y_train, y_pred_train)
    sqe = square_error(y_train, y_pred_train)
    r2 = r2_score(y_train, y_pred_train)
    print("Mean Squared Error:", mse)
    print("Squared Error:",sqe)
    print("R-squared:", r2)
    print("Max Deviation", max(abs(y_pred_train-y_train)))
    print("---------------------------------------------------------")
    
    print("Test Data")
    mse = mean_squared_error(y_test, y_pred_test)
    sqe = square_error(y_test, y_pred_test)
    r2 = r2_score(y_test, y_pred_test)
    print("Mean Squared Error:", mse)
    print("Squared Error:",sqe)
    print("R-squared:", r2)
    print("Max Deviation", max(abs(y_pred_test-y_test)))
    return sqe

In [135]:
def random_search(model,it):
    random_search = RandomizedSearchCV(model, param_distributions=param_dist, n_iter=it, cv=5,n_jobs=-1, scoring='neg_mean_squared_error', random_state=42)
    random_search.fit(x_train, y_train)
    return random_search.best_estimator_

In [108]:
models_path = {
    
    "Linear_Regrission" : "../../data/linear_regression_model.pkl",
    "XGBRegressor" : "../../data/xgboost_model.pkl",
    "Neural_Network" : "../../data/neural_network_model.pkl"
}

In [102]:
def save_model(sqe_new, new_model, x_test, y_test, model_name):
    model_path = models_path[model_name]
    old_model = joblib.load(model_path)
    y_pred_test = old_model.predict(x_test)
    sqe_old = square_error(y_test, y_pred_test)
    print("Old Square Error:", sqe_old)
    print("New Square Error:", sqe_new)

    if sqe_new < sqe_old:
        joblib.dump(new_model,model_path)

## 2) Importaing packages and reading the data

In [2]:
import pandas as pd
import numpy as np
import joblib
from sklearn.metrics import mean_squared_error, r2_score

In [3]:
train_data = pd.read_csv("../../data/train.csv")
test_data = pd.read_csv("../../data/test.csv")

In [4]:
print("Shape of Train Data", train_data.shape)
print("Shape of Test Data", test_data.shape)

Shape of Train Data (800, 18)
Shape of Test Data (200, 18)


In [5]:
train_data.head(2)

Unnamed: 0,average_score,gender_female,gender_male,race/ethnicity_group A,race/ethnicity_group B,race/ethnicity_group C,race/ethnicity_group D,race/ethnicity_group E,parental level of education_associate's degree,parental level of education_bachelor's degree,parental level of education_high school,parental level of education_master's degree,parental level of education_some college,parental level of education_some high school,lunch_free/reduced,lunch_standard,test preparation course_completed,test preparation course_none
0,69.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,1.0
1,77.333333,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0


### ----------------------------------------------------------------------------------------------------------------------------------------------------------

## 3) Defining x_train, x_test, y_train, y_test

In [6]:
x_train = train_data.copy()
y_train = train_data["average_score"].copy()
x_train.drop(["average_score"], axis = 1, inplace = True)


x_test = test_data.copy()
y_test = test_data["average_score"].copy()
x_test.drop(["average_score"], axis = 1, inplace = True)

### ----------------------------------------------------------------------------------------------------------------------------------------------------------

## 4) Linear Regression

### 4.1 LinearRegression()

In [103]:
from sklearn.linear_model import LinearRegression

In [104]:
model = LinearRegression()

In [105]:
model.fit(x_train,y_train)

### Evaluation on train and testdata

In [106]:
y_pred_train = model.predict(x_train)
y_pred_test = model.predict(x_test)
sqe = evaluation(y_train, y_test, y_pred_train, y_pred_test) #sqe here for test set

Train Data
Mean Squared Error: 150.7522482638889
Squared Error: 75.37612413194445
R-squared: 0.2441580701926087
Max Deviation 39.75
---------------------------------------------------------
Test Data
Mean Squared Error: 182.58111111111108
Squared Error: 91.29055555555554
R-squared: 0.1482761796758213
Max Deviation 49.75


In [110]:
save_model(sqe, model, x_test, y_test, "Linear_Regrission")

Old Square Error: 89.81331178190334
New Square Error: 91.29055555555554


https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations


### --------------------------------------------------------------------------------

### 4.2 Ridge()

In [111]:
from sklearn.linear_model import Ridge

In [112]:
model = Ridge(alpha = 10)

In [113]:
model.fit(x_train,y_train)

### Evaluation on train and test data

In [114]:
y_pred_train = model.predict(x_train)
y_pred_test = model.predict(x_test)
sqe = evaluation(y_train, y_test, y_pred_train, y_pred_test) #sqe here for test set

Train Data
Mean Squared Error: 148.8261902963277
Squared Error: 74.41309514816385
R-squared: 0.2538149435585968
Max Deviation 38.36050432494929
---------------------------------------------------------
Test Data
Mean Squared Error: 180.04791123980397
Squared Error: 90.02395561990198
R-squared: 0.16009332033678159
Max Deviation 49.460713007330845


In [115]:
save_model(sqe, model, x_test, y_test, "Linear_Regrission")

Old Square Error: 89.81331178190334
New Square Error: 90.02395561990198


https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations


### --------------------------------------------------------------------------------

### 4.3 Lasso()

In [116]:
from sklearn.linear_model import Lasso

In [117]:
model = Lasso(alpha = 0.0001)

In [118]:
model.fit(x_train,y_train)

  model = cd_fast.enet_coordinate_descent(


### Evaluation on train and test data

In [119]:
y_pred_train = model.predict(x_train)
y_pred_test = model.predict(x_test)
sqe = evaluation(y_train, y_test, y_pred_train, y_pred_test) #sqe here for test set

Train Data
Mean Squared Error: 148.73411155242135
Squared Error: 74.36705577621068
R-squared: 0.2542766081525907
Max Deviation 39.065890338709465
---------------------------------------------------------
Test Data
Mean Squared Error: 179.60499522257035
Squared Error: 89.80249761128518
R-squared: 0.16215948216472342
Max Deviation 49.04319227439025


In [120]:
save_model(sqe, model, x_test, y_test, "Linear_Regrission")

Old Square Error: 89.81331178190334
New Square Error: 89.80249761128518


https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations


### --------------------------------------------------------------------------------

### 4.4 My LinearRegression

In [322]:
#batch_gradient_decent_multi_features

def batch_gradient_decent_multi_features(X,Y,alpha): 
    try:
        num_features = X.shape[1]
    except:
        num_features = 1
        
    num_training_ex = X.shape[0]
    theta = np.zeros(num_features)
    b = 0
    Y_predict = X @ theta + b
    cost_fun = np.sum(np.square(Y_predict - Y)) / (2*num_training_ex)
    
    i = 0
    while i < 100000:
        i += 1
        theta_temp = theta - (alpha/num_training_ex)*(X.T@(Y_predict - Y))
        b_temp = b - (alpha/num_training_ex)*(np.sum(Y_predict - Y))
        Y_predict_temp = X @ theta_temp + b_temp
        temp_cost = np.sum(np.square(Y_predict_temp - Y)) / (2*num_training_ex)
        
        if temp_cost > cost_fun:
            return theta, b, cost_fun, Y_predict
        else:
            cost_fun = temp_cost
            theta = theta_temp
            b = b_temp
            Y_predict = Y_predict_temp
    return theta, b, cost_fun, Y_predict

In [335]:
alpha = 0.0001
theta, b, cost_fun, y_pred_train = batch_gradient_decent_multi_features(x_train, y_train, alpha)

In [None]:
cost_fun #of train data

### Evaluation on train and test data

In [196]:
y_pred_train = x_train@theta+b
y_pred_test = x_test@theta+b
sqe = evaluation(y_train, y_test, y_pred_train, y_pred_test) #sqe here for test set

Mean Squared Error: 149.87672455431854
Squared Error: 74.93836227715927
R-squared: 0.24854777275329887


In [None]:
old_model = joblib.load("../../data/linear_regression_model.pkl")
y_pred_test = old_model.predict(x_test)
sqe_old = square_error(y_test, y_pred_test)
print("Old Square Error:", sqe_old)
print("New Square Error:", sqe)

if sqe < sqe_old:
    joblib.dump(model,"../../data/linear_regression_model.pkl")

### ----------------------------------------------------------------------------------------------------------------------------------------------------------

## 5) XGBoost Regressor

In [122]:
from xgboost import XGBRegressor

In [123]:
model = XGBRegressor()

In [124]:
model.fit(x_train,y_train)

### Evaluation on train and test data

In [125]:
y_pred_train = model.predict(x_train)
y_pred_test = model.predict(x_test)
sqe = evaluation(y_train, y_test, y_pred_train, y_pred_test) #sqe here for test set

Train Data
Mean Squared Error: 118.04742345589871
Squared Error: 59.023711727949355
R-squared: 0.4081335875169847
Max Deviation 34.32858530680339
---------------------------------------------------------
Test Data
Mean Squared Error: 230.39285920885115
Squared Error: 115.19642960442557
R-squared: -0.07476115698165198
Max Deviation 54.73445129394531


In [126]:
save_model(sqe, model, x_test, y_test, "XGBRegressor")

Old Square Error: 115.19642960442557
New Square Error: 115.19642960442557


### Hyperparameter Tuning

In [127]:
from sklearn.model_selection import RandomizedSearchCV, GridSearchCV

In [207]:

# Define hyperparameter search spaces
param_dist = {
 
 "n_estimators"     : [2,5,10, 15,20,30, 50,70],
 "learning_rate"    : [0.001,0.01, 0.05, 0.10, 0.15, 0.20, 0.25, 0.30,0.4,0.5 ] ,
 "max_depth"        : [ 3, 4, 5, 6, 8, 10, 12, 15,17,29,21],
 "min_child_weight" : [ 3, 5, 7,11,13,15,17 ],
 "gamma"            : [ 0.0, 0.1, 0.2 , 0.3, 0.4 ],
 "colsample_bytree" : [ 0.3, 0.4, 0.5 , 0.7 ]
    
}

param_grid = {
    'n_estimators': [10, 50, 100, 200,400,800],
    'max_depth': [None, 10, 20, 30],
    'min_samples_split': [2, 5, 10],
    'min_samples_leaf': [1, 2, 4],
}


In [208]:
best_model = XGBRegressor()
best_model = random_search(best_model,500)
best_model

### Evaluation on train and test data

In [209]:
y_pred_train = best_model.predict(x_train)
y_pred_test = best_model.predict(x_test)
sqe = evaluation(y_train, y_test, y_pred_train, y_pred_test) #sqe here for test set

Train Data
Mean Squared Error: tf.Tensor(144.60245, shape=(), dtype=float32)
Squared Error: 72.30122389737126
R-squared: 0.2749919523274429
Max Deviation 36.194557189941406
---------------------------------------------------------
Test Data
Mean Squared Error: tf.Tensor(186.8168, shape=(), dtype=float32)
Squared Error: 93.40839131468175
R-squared: 0.12851716788533651
Max Deviation 47.171878814697266


In [168]:
save_model(sqe, best_model, x_test, y_test, "XGBRegressor")

Old Square Error: 92.84849808870837
New Square Error: 92.79405205142066


### ----------------------------------------------------------------------------------------------------------------------------------------------------------

## 6) Neural Network

In [132]:
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, BatchNormalization
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.losses import mean_squared_error

2023-09-17 14:39:16.124103: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2023-09-17 14:39:16.160683: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2023-09-17 14:39:16.161495: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.


In [127]:
model = Sequential(
    [
        BatchNormalization(),
#         Dense(8192, activation = 'relu', kernel_initializer='he_normal'),
#         BatchNormalization(),
        Dense(4096, activation = 'relu'),
        BatchNormalization(),
        Dense(2048, activation = 'relu'),
        BatchNormalization(),
        Dense(1024, activation = 'relu'),
        BatchNormalization(),
        Dense(128, activation = 'relu'),
        BatchNormalization(),
        Dense(64, activation = 'relu'),
        BatchNormalization(),
        Dense(1, activation = 'linear'),
    ]

)

In [136]:
model.compile(
    optimizer = Adam(0.00001),
    loss = mean_squared_error,
)

In [137]:
model.fit(x_train, y_train, epochs = 100)

Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100
Epoch 15/100
Epoch 16/100
Epoch 17/100
Epoch 18/100
Epoch 19/100
Epoch 20/100
Epoch 21/100
Epoch 22/100
Epoch 23/100
Epoch 24/100
Epoch 25/100
Epoch 26/100
Epoch 27/100
Epoch 28/100
Epoch 29/100
Epoch 30/100
Epoch 31/100
Epoch 32/100
Epoch 33/100
Epoch 34/100
Epoch 35/100
Epoch 36/100
Epoch 37/100
Epoch 38/100
Epoch 39/100
Epoch 40/100
Epoch 41/100
Epoch 42/100
Epoch 43/100
Epoch 44/100
Epoch 45/100
Epoch 46/100
Epoch 47/100
Epoch 48/100
Epoch 49/100
Epoch 50/100
Epoch 51/100
Epoch 52/100
Epoch 53/100
Epoch 54/100
Epoch 55/100
Epoch 56/100
Epoch 57/100
Epoch 58/100
Epoch 59/100
Epoch 60/100
Epoch 61/100
Epoch 62/100
Epoch 63/100
Epoch 64/100
Epoch 65/100
Epoch 66/100
Epoch 67/100
Epoch 68/100
Epoch 69/100
Epoch 70/100
Epoch 71/100
Epoch 72/100
Epoch 73/100
Epoch 74/100
Epoch 75/100
Epoch 76/100
Epoch 77/100
Epoch 78

<keras.callbacks.History at 0x7f40fc63bfd0>

### Evaluation on train and test data

In [None]:
y_pred_train = model.predict(x_train)
y_pred_test = model.predict(x_test)

m = x_train.shape[0]
y_pred_train = model.predict(x_train)
y_pred_train = y_pred_train.reshape((m,))

m = x_test.shape[0]
y_pred_test = model.predict(x_test)
y_pred_test = y_pred_test.reshape((m,))

sqe = evaluation(y_train, y_test, y_pred_train, y_pred_test) #sqe here for test set

In [144]:
save_model(sqe, model, x_test, y_test, "Neural_Network")