# **Assignment - 5**

Ques1. (Based on Step-by-Step Implementation of Ridge Regression using Gradient Descent Optimization)  
Generate a dataset with at least seven highly correlated columns and a target variable. Implement Ridge Regression using Gradient Descent Optimization. Try different values of the learning rate (such as 0.0001, 0.001, 0.01, 0.1, 1, 10) and of the regularization parameter (10^-15, 10^-10, 10^-5, 10^-3, 0, 1, 10, 20).  
Choose the parameters for which the ridge regression cost function is minimum and the R2_score is maximum.
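
For reference, the cost and update rule that the gradient-descent cell below implements can be written out as follows. The notation is an assumption made here for illustration: $m$ is the number of samples, $d$ the number of features, $\theta_0$ the unpenalized intercept, $\lambda$ the regularization parameter, and $\alpha$ the learning rate.

$$
J(\theta) = \frac{1}{2m}\sum_{i=1}^{m}\left(x_i^\top\theta - y_i\right)^2 + \lambda\sum_{j=1}^{d}\theta_j^2
$$

$$
\nabla_\theta J = \frac{1}{m}X^\top\left(X\theta - y\right) + 2\lambda\,\tilde{\theta}, \qquad \tilde{\theta} = (0, \theta_1, \dots, \theta_d)
$$

$$
\theta \leftarrow \theta - \alpha\,\nabla_\theta J
$$

The factor 2 in the penalty gradient appears because the penalty term is $\lambda\sum_j\theta_j^2$ rather than $\frac{\lambda}{2}\sum_j\theta_j^2$.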

In [28]:
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import r2_score
import warnings
warnings.filterwarnings("ignore")

np.random.seed(42)
n = 200
X = np.random.rand(n, 7)
for i in range(1, 7):
    X[:, i] = X[:, 0] + np.random.normal(0, 0.01, n)
y = 3*X[:, 0] + 2*X[:, 1] + np.random.normal(0, 0.1, n)

sc = StandardScaler()
X = sc.fit_transform(X)
X = np.c_[np.ones(n), X]

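def ridge_gd(X, y, lr, lam, iters):
    """Batch gradient descent for ridge regression; X must have a leading column of ones."""
    m, d = X.shape  # m samples, d parameters (intercept + features)
    theta = np.zeros(d)
    for _ in range(iters):
        y_pred = X.dot(theta)
        # Diverged (learning rate too large): report the worst possible scores
        if np.any(np.isnan(y_pred)) or np.any(np.isinf(y_pred)):
            return theta, np.inf, -np.inf
        # Gradient of the squared-error term
        grad = (1/m) * X.T.dot(y_pred - y)
        # Gradient of the L2 penalty; the intercept theta[0] is not penalized
        grad[1:] += 2 * lam * theta[1:]
        theta -= lr * grad
    y_pred = X.dot(theta)
    # Guard against divergence on the very last update
    if not np.all(np.isfinite(y_pred)):
        return theta, np.inf, -np.inf
    # Ridge cost: half the mean squared error plus the L2 penalty
    cost = (1/(2*m)) * np.sum((y_pred - y)**2) + lam * np.sum(theta[1:]**2)
    r2 = r2_score(y, y_pred)
    return theta, cost, r2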

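# Includes the large rates 1 and 10 from the question; divergent runs return cost = inf and are skipped below
lrs = [0.0001, 0.001, 0.01, 0.1, 1, 10]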
lambdas = [1e-15, 1e-10, 1e-5, 1e-3, 0, 1, 10, 20]

best_theta, best_cost, best_r2 = None, np.inf, -np.inf
best_lr, best_lam = None, None

for lr in lrs:
    for lam in lambdas:
        theta, cost, r2 = ridge_gd(X, y, lr, lam, 1000)
        if np.isfinite(cost) and (r2 > best_r2 or (r2 == best_r2 and cost < best_cost)):
            best_theta, best_cost, best_r2 = theta, cost, r2
            best_lr, best_lam = lr, lam

print("Best Parameters :")
print("Learning Rate :", best_lr)
print("Regularization :", best_lam)
print("R2 Score :", round(best_r2, 4))
print("Minimized Cost :", round(best_cost, 6))


Best Parameters :
Learning Rate : 0.1
Regularization : 0
R2 Score : 0.9957
Minimized Cost : 0.00445
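
As a sanity check on the gradient-descent result, the same cost has a closed-form minimizer. The sketch below is an illustration rather than part of the assignment solution: the helper names `ridge_closed_form` and `ridge_gradient`, the synthetic data, and the choice `lam=1e-3` are all assumptions. It solves the normal equations for the cost used above and confirms that the gradient vanishes at the solution.

```python
import numpy as np

def ridge_closed_form(X, y, lam):
    """Solve (X^T X / m + 2*lam*D) theta = X^T y / m, where D skips the intercept."""
    m, d = X.shape
    D = np.eye(d)
    D[0, 0] = 0.0  # the intercept is not penalized
    A = X.T @ X / m + 2 * lam * D
    b = X.T @ y / m
    return np.linalg.solve(A, b)

def ridge_gradient(X, y, theta, lam):
    """Gradient of (1/(2m)) * ||X theta - y||^2 + lam * ||theta[1:]||^2."""
    m = X.shape[0]
    grad = X.T @ (X @ theta - y) / m
    grad[1:] += 2 * lam * theta[1:]
    return grad

# Synthetic data with seven highly correlated columns (illustrative only)
rng = np.random.default_rng(0)
base = rng.random(200)
features = np.column_stack([base + rng.normal(0, 0.01, 200) for _ in range(7)])
features = (features - features.mean(axis=0)) / features.std(axis=0)
X_demo = np.c_[np.ones(200), features]
y_demo = 3 * base + rng.normal(0, 0.1, 200)

theta_star = ridge_closed_form(X_demo, y_demo, lam=1e-3)
grad_norm = np.linalg.norm(ridge_gradient(X_demo, y_demo, theta_star, lam=1e-3))
print("Gradient norm at the closed-form solution:", grad_norm)
```

A gradient-descent run with the same `lam` should approach `theta_star` as the number of iterations grows, which gives a simple way to judge whether 1000 iterations were enough for a given learning rate.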


Ques2. Load the Hitters dataset from the following link  
https://drive.google.com/file/d/1qzCKF6JKKMB0p7ul_lLy8tdmRk3vE_bG/view?usp=sharing  
(a) Pre-process the data (null values, noise, categorical to numerical encoding)  
(b) Separate input and output features and perform scaling  
(c) Fit a Linear, Ridge (use regularization parameter as 0.5748), and LASSO (use regularization parameter as 0.5748) regression function on the dataset.  
(d) Evaluate the performance of each trained model on the test set. Which model performs the best, and why?  

In [1]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.metrics import r2_score
import warnings
warnings.filterwarnings("ignore")

df = pd.read_csv('Hitters.csv')
df = df.dropna(subset=['Salary'])
df = df.fillna(df.median(numeric_only=True))
df = pd.get_dummies(df, drop_first=True)

X = df.drop('Salary', axis=1)
y = df['Salary']

sc = StandardScaler()
X = sc.fit_transform(X)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

lin = LinearRegression().fit(X_train, y_train)
ridge = Ridge(alpha=0.5748).fit(X_train, y_train)
lasso = Lasso(alpha=0.5748).fit(X_train, y_train)

r2_lin = r2_score(y_test, lin.predict(X_test))
r2_ridge = r2_score(y_test, ridge.predict(X_test))
r2_lasso = r2_score(y_test, lasso.predict(X_test))

print("R2 Score:")
print("Linear:", round(r2_lin, 4), ", Ridge:", round(r2_ridge, 4), ", Lasso:", round(r2_lasso, 4))


R2 Score:
Linear: 0.2907 , Ridge: 0.2998 , Lasso: 0.2991
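
The cell above fits the scaler on the full dataset before splitting, so test-set statistics influence the scaling. A minimal sketch of one way to avoid that is shown here, assuming a synthetic stand-in for the Hitters features (`make_regression` with arbitrary sizes), since the CSV is not bundled with this text. Each model is wrapped in a pipeline so that `StandardScaler` is fitted on the training split only, and RMSE is reported alongside R2.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the preprocessed Hitters data (assumption, not the real dataset)
X_demo, y_demo = make_regression(n_samples=250, n_features=19, n_informative=8,
                                 noise=25.0, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.2, random_state=42)

models = {
    "Linear": LinearRegression(),
    "Ridge": Ridge(alpha=0.5748),
    "Lasso": Lasso(alpha=0.5748),
}

scores = {}
for name, model in models.items():
    # The scaler inside the pipeline only ever sees the training split
    pipe = make_pipeline(StandardScaler(), model)
    pipe.fit(X_tr, y_tr)
    pred = pipe.predict(X_te)
    r2 = r2_score(y_te, pred)
    rmse = np.sqrt(mean_squared_error(y_te, pred))
    scores[name] = (r2, rmse)
    print(f"{name}: R2 = {r2:.4f}, RMSE = {rmse:.2f}")
```

On the real data the same loop would take the encoded `X` and `y` from the cell above in place of `X_demo` and `y_demo`; the resulting scores may differ slightly from those printed there.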


Ques3. Cross Validation for Ridge and Lasso Regression  
Explore the Ridge Cross Validation (RidgeCV) and Lasso Cross Validation (LassoCV) functions in Python.   
Implement both on the Boston House Prediction Dataset (load_boston dataset from sklearn.datasets).   

In [3]:
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import RidgeCV, LassoCV
import warnings
warnings.filterwarnings("ignore")

data = pd.read_csv('Housing.csv')
data = data.dropna()

X_data = data.iloc[:, :-1]
y_data = data.iloc[:, -1]

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X_data)

ridge_cv_model = RidgeCV(alphas=[0.1, 1, 10]).fit(X_scaled, y_data)
lasso_cv_model = LassoCV(alphas=[0.1, 1, 10]).fit(X_scaled, y_data)

print("Ridge:", ridge_cv_model.alpha_, ", Lasso:", lasso_cv_model.alpha_)


Ridge: 10.0 , Lasso: 0.1
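
Both selected values sit on the edge of the three-point grid (Ridge at the largest alpha, Lasso at the smallest), which suggests the grid may be too narrow. The sketch below shows one way to search a wider, log-spaced grid with explicit 5-fold cross validation. The synthetic data, the grid bounds, and `cv=5` are assumptions chosen for illustration, not values taken from the assignment.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV, RidgeCV
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the housing data (assumption)
X_demo, y_demo = make_regression(n_samples=300, n_features=12, n_informative=6,
                                 noise=10.0, random_state=0)
X_demo = StandardScaler().fit_transform(X_demo)

# Log-spaced candidates from 1e-3 to 1e3
alphas = np.logspace(-3, 3, 13)

# With cv=None, RidgeCV uses an efficient leave-one-out scheme; cv=5 requests 5-fold CV
ridge_cv = RidgeCV(alphas=alphas, cv=5).fit(X_demo, y_demo)
lasso_cv = LassoCV(alphas=alphas, cv=5, max_iter=10000).fit(X_demo, y_demo)

print("Ridge alpha:", ridge_cv.alpha_)
print("Lasso alpha:", lasso_cv.alpha_)
```

If the chosen alpha still lands on the first or last grid point, extending the grid in that direction is a reasonable next step.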


Ques4. Multiclass Logistic Regression: Implement Multiclass Logistic Regression (step-by-step) on the Iris dataset using the one-vs-rest strategy.

In [4]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import StandardScaler

iris = load_iris()
X = iris.data
y = iris.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

model = LogisticRegression(multi_class='ovr', max_iter=1000).fit(X_train, y_train)

acc = accuracy_score(y_test, model.predict(X_test))
print("Accuracy:", round(acc, 4))


Accuracy: 0.9667
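
The cell above delegates the one-vs-rest logic to scikit-learn (note that newer scikit-learn releases deprecate the `multi_class` argument). Since the question asks for a step-by-step implementation, here is a minimal NumPy sketch of the same strategy: one binary logistic regression per class trained by gradient descent, with prediction by the highest class probability. The helper names (`sigmoid`, `fit_binary`, `fit_ovr`, `predict_ovr`), the learning rate, and the iteration count are assumptions for illustration, and the model is unregularized, so its accuracy need not match the output above.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_binary(X, y, lr=0.1, iters=2000):
    """Gradient descent on the mean log-loss for one binary problem."""
    m, d = X.shape
    w = np.zeros(d)
    for _ in range(iters):
        p = sigmoid(X @ w)
        w -= lr * X.T @ (p - y) / m
    return w

def fit_ovr(X, y, lr=0.1, iters=2000):
    """Train one classifier per class: class c (label 1) against all others (label 0)."""
    classes = np.unique(y)
    W = np.column_stack([fit_binary(X, (y == c).astype(float), lr, iters)
                         for c in classes])
    return classes, W

def predict_ovr(X, classes, W):
    """Pick the class whose binary classifier gives the highest probability."""
    return classes[np.argmax(sigmoid(X @ W), axis=1)]

iris = load_iris()
X_tr, X_te, y_tr, y_te = train_test_split(iris.data, iris.target,
                                          test_size=0.2, random_state=42)

# Scale with training statistics only, then prepend a column of ones for the intercept
scaler = StandardScaler().fit(X_tr)
X_tr_b = np.c_[np.ones(len(X_tr)), scaler.transform(X_tr)]
X_te_b = np.c_[np.ones(len(X_te)), scaler.transform(X_te)]

classes, W = fit_ovr(X_tr_b, y_tr)
ovr_acc = np.mean(predict_ovr(X_te_b, classes, W) == y_te)
print("OvR accuracy (NumPy sketch):", round(ovr_acc, 4))
```

Each column of `W` holds the weights of one "class versus rest" classifier, so the three binary problems are fully independent and could be trained in any order.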
