a)

To show that $X^⊤Xβ + λβ = X^⊤y$ holds, we take the derivative of the objective function (loss function) with respect to $β$ and set it equal to zero, which gives:

$∂/∂β [λ/2 ∥β∥^2 + ∥y - Xβ∥^2] = 0$

$2λβ + 2X^⊤Xβ - 2X^⊤y = 0$

On simplifying, we get:

$λβ + X^⊤Xβ - X^⊤y = 0$

Rearranging the terms, we get: 

$X^⊤Xβ + λβ = X^⊤y$

To show that $β = X^⊤α$ where $α = 1/λ (y - Xβ)$, we substitute the value of $α$ in $X^⊤Xβ + λβ = X^⊤y:$

$X^⊤Xβ + λβ = X^⊤y$

$X^⊤Xβ + λβ = X^⊤Xα + X^⊤(y - Xβ)$

$X^⊤Xβ + λβ = X^⊤X(1/λ (y - Xβ)) + X^⊤(y - Xβ)$

Multiplying both sides by $λ$, we get:

$X^⊤Xβλ + λ^2β = X^⊤y - X^⊤Xβ + λ(y - Xβ)$

$X^⊤Xβλ + λ^2β = λy + X^⊤Xβ - λX^⊤β + λy - λX^⊤β$

Simplifying, we get:

$X^⊤Xβλ + λ^2β = 2λy - 2λX^⊤β$

Dividing both sides by $λ$ and simplifying, we get:

$X^⊤Xβ + λβ = X^⊤y$

which is the same as the earlier result. Hence, $β = X^⊤α$ where $α = 1/λ (y - Xβ).$

To find $α$, we substitute the value of $β$ in $α = 1/λ (y - Xβ):$

$α = 1/λ (y - Xβ)$

$α = 1/λ (y - XX^⊤α)$

Multiplying both sides by $λ$, we get:

$λα = y - XX^⊤α$

$λα + XX^⊤α = y$

$(XX^⊤ + λI)α = y$

**$α = (XX^⊤ + λI)^{-1}y$**

This is the desired result. Note that we can replace $X^⊤X$ with a kernel matrix $K$ to get $α = (K + λI)^{-1}y.$

The inference function $⟨β, x⟩$ is given by:

$⟨β, x⟩ = x^⊤β$

Substituting $β = X^⊤α$, we get:

$⟨β, x⟩ = x^⊤X^⊤α$

Using the kernel function idea discussed in class, we can write:

**$⟨β, x⟩ = ∑_i α_i K(x, x_i)$**

This is the inference function in terms of $α$ and kernels. This extension to ridge regression is called kernel ridge regression.

Reading the data set in Data Q2.csv into a pandas dataframe.

In [1]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
water_consumption_unstd = pd.read_csv('https://raw.githubusercontent.com/Samagra06/ML/main/Data_Q2.csv')
water_consumption_unstd.head()


Unnamed: 0,Temperature,Humidity,Wind Speed,Flow,Consumption
0,5.578,93.0,0.082,0.185,5935.17407
1,15.51,64.38,0.085,0.133,6044.657863
2,15.73,64.21,0.084,0.152,6061.944778
3,15.62,65.22,0.083,0.145,6108.043217
4,15.45,67.69,0.083,0.189,6119.567827


Performing standardization

In [2]:
mean = water_consumption_unstd.mean()
std = water_consumption_unstd.std()
water_consumption = (water_consumption_unstd - mean)/ std
water_consumption.head()

Unnamed: 0,Temperature,Humidity,Wind Speed,Flow,Consumption
0,-1.930912,1.404144,-0.64147,-0.664038,-3.448482
1,0.40745,-0.874036,-0.640097,-0.665391,-3.280801
2,0.459247,-0.887568,-0.640555,-0.664897,-3.254325
3,0.433349,-0.807172,-0.641013,-0.665079,-3.183722
4,0.393324,-0.610557,-0.641013,-0.663934,-3.166071


Splitting into test and training

In [3]:
water_consumption_training_initial = water_consumption.sample(frac = 0.8, random_state = 200)
water_consumption_test = water_consumption.drop(water_consumption_training_initial.index)
water_consumption_training_initial.head()

Unnamed: 0,Temperature,Humidity,Wind Speed,Flow,Consumption
674,0.24029,0.679775,-0.641928,-0.665677,0.587856
233,-0.920415,0.958378,1.575988,-0.665183,-0.714162
739,0.473373,0.894698,1.570037,-0.665677,0.689674
865,0.880679,0.854897,1.570495,-0.464559,0.993567
523,0.527523,0.862857,-0.64559,-0.622086,0.180422


In [4]:
water_consumption_train_X = water_consumption_training_initial.drop('Consumption', axis = 1)
water_consumption_train_X.head()

Unnamed: 0,Temperature,Humidity,Wind Speed,Flow
674,0.24029,0.679775,-0.641928,-0.665677
233,-0.920415,0.958378,1.575988,-0.665183
739,0.473373,0.894698,1.570037,-0.665677
865,0.880679,0.854897,1.570495,-0.464559
523,0.527523,0.862857,-0.64559,-0.622086


In [5]:
water_consumption_train_Y = water_consumption_training_initial['Consumption']
water_consumption_train_Y.head()

674    0.587856
233   -0.714162
739    0.689674
865    0.993567
523    0.180422
Name: Consumption, dtype: float64

In [6]:
water_consumption_test_X = water_consumption_test.drop('Consumption', axis = 1)
water_consumption_test_Y = water_consumption_test['Consumption']
water_consumption_test_X.head()

Unnamed: 0,Temperature,Humidity,Wind Speed,Flow
0,-1.930912,1.404144,-0.64147,-0.664038
2,0.459247,-0.887568,-0.640555,-0.664897
6,-1.104056,0.807136,-0.641013,-0.664715
7,-0.531944,0.592214,-0.640555,-0.664506
10,0.39097,0.210129,-0.641928,-0.659405


In [7]:
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import mean_squared_error
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

**Training T1 for kernel ridge regression model and finding best gamma value** 

In [8]:
kr_model = KernelRidge(kernel='rbf')
param_grid = {'gamma': np.logspace(-5, 5, 11)}
cv = 5
kr_grid = GridSearchCV(kr_model, param_grid, cv=cv, scoring='neg_mean_squared_error')
kr_grid.fit(water_consumption_train_X, water_consumption_train_Y)
BestGamma = kr_grid.best_params_['gamma']
print("Best gamma value:", BestGamma)

Best gamma value: 10.0


In [9]:
# from sklearn.metrics import accuracy_score
# K = KernelRidge(kernel='rbf', gamma = BestGamma)
# K.fit(water_consumption_train_X, water_consumption_train_Y)
# y_pred_test = K.predict(water_consumption_test_X)
# mse_test = mean_squared_error(water_consumption_test_Y, y_pred_test)
# rmse_test = np.sqrt(mse_test)
# # calculate the R^2 score
# r2_test = r2_score(water_consumption_test_Y, y_pred_test)
# # print the results
# print("RMSE for test dataset:", rmse_test)
# print("R^2 score for test dataset:", r2_test)

**RMSE and $R^2$ values for train and test dataset**

In [10]:
from sklearn.metrics import mean_squared_error, r2_score
y_pred_train = kr_grid.predict(water_consumption_train_X)
mse_train = mean_squared_error(water_consumption_train_Y, y_pred_train)
rmse_train = np.sqrt(mse_train)
r2_train = r2_score(water_consumption_train_Y, y_pred_train)
print("RMSE for training dataset:", rmse_train)
print("R^2 score for training dataset:", r2_train)

y_pred_test = kr_grid.predict(water_consumption_test_X)
mse_test = mean_squared_error(water_consumption_test_Y, y_pred_test)
rmse_test = np.sqrt(mse_test)
r2_test = r2_score(water_consumption_test_Y, y_pred_test)
print("RMSE for test dataset:", rmse_test)
print("R^2 score for test dataset:", r2_test)

RMSE for training dataset: 0.6241619075497642
R^2 score for training dataset: 0.5990166565932755
RMSE for test dataset: 0.9258472065130882
R^2 score for test dataset: 0.2227860920870176


In [11]:
frame2 = pd.read_csv('https://raw.githubusercontent.com/Samagra06/ML/main/Data_Q2.csv')
frame2.head()

Unnamed: 0,Temperature,Humidity,Wind Speed,Flow,Consumption
0,5.578,93.0,0.082,0.185,5935.17407
1,15.51,64.38,0.085,0.133,6044.657863
2,15.73,64.21,0.084,0.152,6061.944778
3,15.62,65.22,0.083,0.145,6108.043217
4,15.45,67.69,0.083,0.189,6119.567827


**Labeling the dataset on the basis of Consumption values**

In [12]:
frame2["Class"] = frame2["Consumption"].apply(lambda x: 1 if x <= 6500 else (2 if 6500 <= x <= 7000 else (3 if 7000 <= x <= 7500 else (4 if 7500 <= x <= 8000 else (5 if 8000 <= x <= 8500 else (6 if 8500 <= x <= 9000 else 7))))))
frame2

Unnamed: 0,Temperature,Humidity,Wind Speed,Flow,Consumption,Class
0,5.578,93.00,0.082,0.185,5935.174070,1
1,15.510,64.38,0.085,0.133,6044.657863,1
2,15.730,64.21,0.084,0.152,6061.944778,1
3,15.620,65.22,0.083,0.145,6108.043217,1
4,15.450,67.69,0.083,0.189,6119.567827,1
...,...,...,...,...,...,...
995,17.330,42.24,4.917,31.540,9443.855422,7
996,7.010,76.40,4.920,65.890,9449.638554,7
997,14.810,82.30,4.913,0.159,9449.638554,7
998,12.090,77.40,0.073,0.104,9449.638554,7


In [13]:
Classcolumn = frame2['Class']
Classcolumn

0      1
1      1
2      1
3      1
4      1
      ..
995    7
996    7
997    7
998    7
999    7
Name: Class, Length: 1000, dtype: int64

In [14]:
frame2woclass = frame2.drop('Class', axis = 1)
frame2woclass.head()

Unnamed: 0,Temperature,Humidity,Wind Speed,Flow,Consumption
0,5.578,93.0,0.082,0.185,5935.17407
1,15.51,64.38,0.085,0.133,6044.657863
2,15.73,64.21,0.084,0.152,6061.944778
3,15.62,65.22,0.083,0.145,6108.043217
4,15.45,67.69,0.083,0.189,6119.567827


Seperating each class for Standardization

In [15]:
frame2_Class1 = frame2woclass[frame2['Class'] == 1]
frame2_Class2 = frame2woclass[frame2['Class'] == 2]
frame2_Class3 = frame2woclass[frame2['Class'] == 3]
frame2_Class4 = frame2woclass[frame2['Class'] == 4]
frame2_Class5 = frame2woclass[frame2['Class'] == 5]
frame2_Class6 = frame2woclass[frame2['Class'] == 6]
frame2_Class7 = frame2woclass[frame2['Class'] == 7]

In [16]:
print(len(frame2_Class1))
print(len(frame2_Class2))
print(len(frame2_Class3))
print(len(frame2_Class4))
print(len(frame2_Class5))
print(len(frame2_Class6))
print(len(frame2_Class7))

15
27
103
218
280
255
102


**Since its evident there is a class imbalance in the dataset, therefore using a
data augmentation technique for the minority class, known as SMOTE for tackling this issue.** 

In [17]:
frame2_Class1.head()

Unnamed: 0,Temperature,Humidity,Wind Speed,Flow,Consumption
0,5.578,93.0,0.082,0.185,5935.17407
1,15.51,64.38,0.085,0.133,6044.657863
2,15.73,64.21,0.084,0.152,6061.944778
3,15.62,65.22,0.083,0.145,6108.043217
4,15.45,67.69,0.083,0.189,6119.567827


WRONG

In [18]:
def stdardise(dataset):
  mean = dataset.mean()
  std = dataset.std()
  dataset_std = (dataset - mean)/ std
  return dataset_std

In [19]:
frame2_Class1_std = stdardise(frame2_Class1)
frame2_Class2_std = stdardise(frame2_Class2)
frame2_Class3_std = stdardise(frame2_Class3)
frame2_Class4_std = stdardise(frame2_Class4)
frame2_Class5_std = stdardise(frame2_Class5)
frame2_Class6_std = stdardise(frame2_Class6)
frame2_Class7_std = stdardise(frame2_Class7)

In [20]:
frame2_seperate_std = pd.concat([frame2_Class1_std, frame2_Class2_std, frame2_Class3_std, frame2_Class4_std, frame2_Class5_std, frame2_Class6_std, frame2_Class7_std], axis=0, ignore_index=True)
#frame2_seperate_std = np.concatenate((frame2_Class1_std, frame2_Class2_std, frame2_Class3_std, frame2_Class4_std, frame2_Class5_std, frame2_Class6_std, frame2_Class7_std))
frame2_seperate_std.head()

Unnamed: 0,Temperature,Humidity,Wind Speed,Flow,Consumption
0,-1.633251,1.510316,-0.491477,-0.353228,-1.860482
1,1.046563,-1.473479,0.561688,-0.398291,-1.259987
2,1.105922,-1.491202,0.210633,-0.381826,-1.165172
3,1.076242,-1.385904,-0.140422,-0.387892,-0.912332
4,1.030374,-1.128393,-0.140422,-0.349762,-0.849122


In [21]:
frame2_seperate_std['Class'] = Classcolumn
frame2_seperate_std.head()

Unnamed: 0,Temperature,Humidity,Wind Speed,Flow,Consumption,Class
0,-1.633251,1.510316,-0.491477,-0.353228,-1.860482,1
1,1.046563,-1.473479,0.561688,-0.398291,-1.259987,1
2,1.105922,-1.491202,0.210633,-0.381826,-1.165172,1
3,1.076242,-1.385904,-0.140422,-0.387892,-0.912332,1
4,1.030374,-1.128393,-0.140422,-0.349762,-0.849122,1


In [22]:
X = frame2_seperate_std
Y = Classcolumn

In [23]:
from imblearn.over_sampling import RandomOverSampler, SMOTE
from imblearn.under_sampling import RandomUnderSampler
from imblearn.pipeline import make_pipeline

pipeline = make_pipeline(SMOTE(), RandomUnderSampler())
Xresampled, Yresampled = pipeline.fit_resample(X, Y)

In [24]:
print(len(Xresampled))
print(len(Yresampled))

1960
1960


In [25]:
Xresampled['Class'] = Yresampled
frame2_seperate_std = Xresampled


In [26]:
frame2_Class1_resampled = frame2_seperate_std[frame2_seperate_std['Class'] == 1]
frame2_Class2_resampled = frame2_seperate_std[frame2_seperate_std['Class'] == 2]
frame2_Class3_resampled = frame2_seperate_std[frame2_seperate_std['Class'] == 3]
frame2_Class4_resampled = frame2_seperate_std[frame2_seperate_std['Class'] == 4]
frame2_Class5_resampled = frame2_seperate_std[frame2_seperate_std['Class'] == 5]
frame2_Class6_resampled = frame2_seperate_std[frame2_seperate_std['Class'] == 6]
frame2_Class7_resampled = frame2_seperate_std[frame2_seperate_std['Class'] == 7]

In [27]:
print(len(frame2_Class1_resampled))
print(len(frame2_Class2_resampled))
print(len(frame2_Class3_resampled))
print(len(frame2_Class4_resampled))
print(len(frame2_Class5_resampled))
print(len(frame2_Class6_resampled))
print(len(frame2_Class7_resampled))

280
280
280
280
280
280
280


As it can be seen now that the samples in each class are same and therefore the problem of class imbalance is no longer prelevant.

In [28]:
frame2_seperate_std_training_initial = frame2_seperate_std.sample(frac = 0.8, random_state = 200)
frame2_seperate_std_test = frame2_seperate_std.drop(frame2_seperate_std_training_initial.index)
frame2_seperate_std_training_initial.head()

Unnamed: 0,Temperature,Humidity,Wind Speed,Flow,Consumption,Class
228,1.032148,-0.063119,-0.777869,-0.228503,0.601386,1
1212,-0.513013,0.150757,-0.749759,0.06125,-0.513622,5
1706,-1.497911,1.121162,-0.628612,-0.7183,0.988486,7
1193,-1.214943,1.349804,-0.751917,1.515866,-0.98684,5
1933,-0.478265,-0.833873,-0.629982,-0.507987,0.722806,7


In [29]:
frame2_seperate_std_training_initial_wo_consump = frame2_seperate_std_training_initial.drop('Consumption', axis = 1)
frame2_seperate_std_training_initial_wo_consump.head()

Unnamed: 0,Temperature,Humidity,Wind Speed,Flow,Class
228,1.032148,-0.063119,-0.777869,-0.228503,1
1212,-0.513013,0.150757,-0.749759,0.06125,5
1706,-1.497911,1.121162,-0.628612,-0.7183,7
1193,-1.214943,1.349804,-0.751917,1.515866,5
1933,-0.478265,-0.833873,-0.629982,-0.507987,7


In [30]:
frame2_T3_X = frame2_seperate_std_training_initial_wo_consump.drop('Class', axis = 1)
frame2_T3_Y = frame2_seperate_std_training_initial_wo_consump['Class']

In [31]:
frame2_seperate_std_test_wo_consump = frame2_seperate_std_test.drop('Consumption', axis = 1)
frame2_T4_X = frame2_seperate_std_test_wo_consump.drop('Class', axis = 1)
frame2_T4_Y = frame2_seperate_std_test_wo_consump['Class']

In [32]:
from sklearn.svm import LinearSVC
from sklearn.metrics import accuracy_score
from sklearn.model_selection import KFold
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

In [33]:
def kernelSVM(X,Y):
  param_grid = {'gamma': [0.01, 0.1, 1, 10]}
  svc = SVC(kernel='rbf')
  from sklearn.model_selection import GridSearchCV
  grid_search = GridSearchCV(svc, param_grid, cv=5)
  grid_search.fit(X, Y)
  best_params = grid_search.best_params_
  print('Best hyperparameters:', best_params)
  best_svc = SVC(kernel='rbf', gamma=best_params['gamma'])
  best_svc.fit(X, Y)
  return best_params

In [34]:
def KerRidge (X, Y):
  kr_model = KernelRidge(kernel='rbf')
  param_grid = {'gamma': np.logspace(-5, 5, 11)}
  cv = 5
  kr_grid = GridSearchCV(kr_model, param_grid, cv=cv, scoring='neg_mean_squared_error')
  kr_grid.fit(frame2_T3_X, frame2_T3_Y)
  BestGamma = kr_grid.best_params_['gamma']
  print("Best gamma value:", BestGamma)
  return BestGamma

In [35]:
BestGammaX3 = kernelSVM(frame2_T3_X, frame2_T3_Y)
BestGammaX3

Best hyperparameters: {'gamma': 10}


{'gamma': 10}

In [36]:
BestGammaX4 = kernelSVM(frame2_T4_X, frame2_T4_Y)

Best hyperparameters: {'gamma': 10}


In [37]:
def ClassT3(c):
  T3Class = frame2_seperate_std_training_initial[frame2_seperate_std_training_initial['Class'] == c]
  X_T3_Class = T3Class.drop(['Consumption','Class'], axis = 1)
  Y_T3_Class = T3Class['Consumption']
  return X_T3_Class, Y_T3_Class

In [38]:
X_T3_Class1, Y_T3_Class1 = ClassT3(1)
X_T3_Class2, Y_T3_Class2 = ClassT3(2)
X_T3_Class3, Y_T3_Class3 = ClassT3(3)
X_T3_Class4, Y_T3_Class4 = ClassT3(4)
X_T3_Class5, Y_T3_Class5 = ClassT3(5)
X_T3_Class6, Y_T3_Class6 = ClassT3(6)
X_T3_Class7, Y_T3_Class7 = ClassT3(7)

Tuning the gamma parameter for each class.

In [39]:
BestGammaX3_Class1 = KerRidge(X_T3_Class1, Y_T3_Class1)
BestGammaX3_Class2 = KerRidge(X_T3_Class2, Y_T3_Class2)
BestGammaX3_Class3 = KerRidge(X_T3_Class3, Y_T3_Class3)
BestGammaX3_Class4 = KerRidge(X_T3_Class4, Y_T3_Class4)
BestGammaX3_Class5 = KerRidge(X_T3_Class5, Y_T3_Class5)
BestGammaX3_Class6 = KerRidge(X_T3_Class6, Y_T3_Class6)
BestGammaX3_Class7 = KerRidge(X_T3_Class7, Y_T3_Class7)

Best gamma value: 1.0
Best gamma value: 1.0
Best gamma value: 1.0
Best gamma value: 1.0
Best gamma value: 1.0
Best gamma value: 1.0
Best gamma value: 1.0
