## Yedihilal Introduction to Machine Learning Summer School 2023

# Assignment 2
#### Due date: 23.59 Sunday, August 13

In [17]:
# Name:Suleyman Asim
# Surname:Gelisgen

In [None]:
# This cell generates data used in Part 1 and 2. Please do not change here.

import numpy as np
np.random.seed(0)

def polynomial(values, coeffs):
    # Coeffs are assumed to be in order 0, 1, ..., n-1
    expanded = np.column_stack([coeffs[i] * (values ** i) for i in range(0, len(coeffs))])
    return np.sum(expanded, axis=-1)

def polynomial_data(coeffs, n_data=100, x_range=[-1, 1], eps=0.1):
    x = np.random.uniform(x_range[0], x_range[1], n_data)
    poly = polynomial(x, coeffs)
    return x.reshape([-1, 1]), np.reshape(poly + eps * np.random.randn(n_data), [-1, 1])


# 1 + 0.5 * x - 0.5 x^2 - 0.2 x^3 - 0.1 x^4
coeffs = [1, 0.5, -0.5, -0.2, -0.1]
X, y = polynomial_data(coeffs, 100, [90, 110], 200)

## Part 1: Linear Regression and Gradient Descent

### Data Splitting

Split the data into training and test sets with a test ratio of 33% and `random_state=0` using the Scikit-learn library.

In [None]:
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=0)  # split into training and test sets


### Data Preprocessing: Scaling

Transform the data to have zero mean and unit variance.

In [None]:
scaler = StandardScaler()  # scaling (zero mean and unit variance)
scaler.fit(X_train)

X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)
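As a sanity check on what the scaling step does, here is a minimal NumPy sketch of the same transformation done by hand. The `*_demo` arrays are made-up stand-ins for `X_train` and `X_test`, not the assignment data. The point it illustrates is that the mean and standard deviation are computed on the training split only and then reused for the test split.

```python
import numpy as np

rng = np.random.default_rng(0)
X_train_demo = rng.uniform(90, 110, size=(67, 1))  # stand-in for X_train
X_test_demo = rng.uniform(90, 110, size=(33, 1))   # stand-in for X_test

# Statistics are estimated on the training split only.
mean = X_train_demo.mean(axis=0)
std = X_train_demo.std(axis=0)  # population std (ddof=0), the same convention StandardScaler uses

X_train_demo_scaled = (X_train_demo - mean) / std
X_test_demo_scaled = (X_test_demo - mean) / std  # reuses the training statistics

print("train mean/std:", X_train_demo_scaled.mean(), X_train_demo_scaled.std())
print("test mean/std: ", X_test_demo_scaled.mean(), X_test_demo_scaled.std())
```

The training split comes out with mean 0 and standard deviation 1 up to rounding, while the test split is only approximately standardized, which is expected.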

## Linear Regression Model Implementation with Gradient Descent

The model function for linear regression, which maps `x` to `y`, is:
$$f_{w,b}(x) = wx + b$$
To train a linear regression model, we want to find the parameters $(w,b)$ that best fit our dataset.
 
## Forward Pass
The forward method computes the linear regression output for the input data X using the current weights and biases.

## Loss (Cost) Function
The loss function is used to evaluate the performance of the model. The compute_loss method computes the loss of the linear regression model using the predicted values and actual values. The loss function is given by:

$$J(w,b) = \frac{1}{2m} \sum_{i=1}^{m}(f_{w,b}(x^{(i)}) - y^{(i)})^2$$

where $m$ is the number of training examples, $x^{(i)}$ is the $i$-th input, $y^{(i)}$ is the corresponding actual output, and $w$ and $b$ are the weight and bias, respectively.
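A small worked example may help make the loss formula concrete. The three data points and the parameter values below are made up for illustration only.

```python
import numpy as np

x_demo = np.array([1.0, 2.0, 3.0])
y_demo = np.array([2.0, 4.0, 6.0])
w_demo, b_demo = 1.0, 0.0

preds_demo = w_demo * x_demo + b_demo      # forward pass: [1, 2, 3]
residuals = preds_demo - y_demo            # [-1, -2, -3]
m = len(x_demo)

# J(w, b) = 1 / (2m) * sum of squared residuals = (1 + 4 + 9) / 6
J = np.sum(residuals ** 2) / (2 * m)
print(J)  # 2.333...
```

Note the factor $\frac{1}{2m}$: this loss is half of the usual mean squared error.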

## Backward Pass
The backward method computes the gradients of the weights and biases using the predicted values and actual values. The gradients are used to update the weights and biases during training.

$$
\frac{\partial J(w,b)}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} (f_{w,b}(x^{(i)}) - y^{(i)})
$$
$$
\frac{\partial J(w,b)}{\partial w} = \frac{1}{m} \sum_{i=1}^{m} (f_{w,b}(x^{(i)}) - y^{(i)})\, x^{(i)}
$$
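One way to gain confidence in the gradient formulas is to compare them with finite differences of the loss. The sketch below does this for the single-feature case. The helper names `loss` and `gradients` and the synthetic data are assumptions made for this illustration and are not part of the assignment code.

```python
import numpy as np

def loss(w, b, x, y):
    # J(w, b) = 1 / (2m) * sum((w * x + b - y)^2)
    return np.sum((w * x + b - y) ** 2) / (2 * len(x))

def gradients(w, b, x, y):
    # Analytic gradients: dJ/dw = mean(error * x), dJ/db = mean(error)
    error = w * x + b - y
    return np.mean(error * x), np.mean(error)

rng = np.random.default_rng(0)
x_demo = rng.normal(size=50)
y_demo = 3.0 * x_demo - 1.0 + 0.1 * rng.normal(size=50)

w0, b0, h = 0.5, 0.2, 1e-6
dw, db = gradients(w0, b0, x_demo, y_demo)

# Central finite differences of the loss
dw_num = (loss(w0 + h, b0, x_demo, y_demo) - loss(w0 - h, b0, x_demo, y_demo)) / (2 * h)
db_num = (loss(w0, b0 + h, x_demo, y_demo) - loss(w0, b0 - h, x_demo, y_demo)) / (2 * h)

print("dJ/dw analytic vs numeric:", dw, dw_num)
print("dJ/db analytic vs numeric:", db, db_num)
```

If the two columns disagree by more than rounding error, the backward pass likely has a missing factor or a shape problem.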

## Training
The fit method trains the linear regression model for the specified number of iterations using the input data X and actual values y. In each iteration it runs the forward pass, computes the cost, runs the backward pass, and updates the weights and biases. Optionally, it plots the cost at each iteration. The parameter update equations are:

$$w \leftarrow w - \alpha \frac{\partial J}{\partial w}$$
$$b \leftarrow b - \alpha \frac{\partial J}{\partial b}$$



In [45]:
class LRwithGradientDecent:
    def __init__(self, lr):
        self.lr = lr
        self.weights = None
        self.bias = None
        self.loss_values = []
        self.X = None  # Input data
        self.y = None  # Actual values

    def initialize_parameters(self):
        # One weight per feature, stored as a column vector so that
        # predictions have the same (m, 1) shape as y.
        n_features = self.X.shape[-1]
        self.weights = np.random.randn(n_features, 1) * np.sqrt(2 / (n_features + 1))
        self.bias = 0.0
        self.loss_values = []

    def forward(self, X):
        # f(x) = Xw + b
        return np.dot(X, self.weights) + self.bias

    def compute_loss(self, preds, y):
        # J(w, b) = 1 / (2m) * sum((f(x) - y)^2)
        loss = ((preds - y) ** 2).sum() / (2 * len(y))
        self.loss_values.append(loss)
        return loss

    def backward(self, preds, y):
        # Gradients of J with respect to the weights and the bias
        error = (preds - y) / len(y)
        self.dW = self.X.T.dot(error)  # shape (n_features, 1)
        self.db = error.sum()

    def update(self):
        # Gradient descent step with the learning rate given at construction
        self.weights -= self.lr * self.dW
        self.bias -= self.lr * self.db

    def fit(self, X, y, n_iter, plot_cost=True):
        # Store X as (m, n_features) and y as (m, 1) to avoid broadcasting surprises
        self.X = np.asarray(X, dtype=float).reshape(len(X), -1)
        self.y = np.asarray(y, dtype=float).reshape(-1, 1)
        self.initialize_parameters()

        for _ in range(n_iter):
            preds = self.forward(self.X)
            self.compute_loss(preds, self.y)  # also records the loss
            self.backward(preds, self.y)
            self.update()

        if plot_cost:
            import matplotlib.pyplot as plt
            plt.plot(range(n_iter), self.loss_values)
            plt.xlabel("Iterations")
            plt.ylabel("Loss")
            plt.title("Cost Function Evolution")
            plt.show()

    def predict(self, X):
        X = np.asarray(X, dtype=float).reshape(len(X), -1)
        return self.forward(X)

model = LRwithGradientDecent(lr=0.01)  # learning rate; the assignment allows at most 0.1


model.fit(X_train_scaled, y_train, n_iter=1000)

### Training Linear Regression with Gradient Descent

Fit a linear regressor using the `LRwithGradientDecent` class on the training dataset.

Try a list of `lr` values (at most 0.1) and a list of `n_iter` values (at most 2000) for training.

In [None]:
# TODO: Your code here
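One possible way to organize the search over `lr` and `n_iter` is a small nested loop that records the final loss for each pair. The sketch below is self-contained: `gd_fit` is a hypothetical single-feature helper and the data are synthetic, so it only illustrates the pattern and is not the assignment solution.

```python
import numpy as np

def gd_fit(x, y, lr, n_iter):
    """Plain gradient descent for f(x) = w * x + b; returns (w, b, final loss)."""
    w, b = 0.0, 0.0
    for _ in range(n_iter):
        error = w * x + b - y
        w -= lr * np.mean(error * x)
        b -= lr * np.mean(error)
    final_loss = np.sum((w * x + b - y) ** 2) / (2 * len(x))
    return w, b, final_loss

rng = np.random.default_rng(0)
x_raw = rng.uniform(90, 110, size=67)
x_demo = (x_raw - x_raw.mean()) / x_raw.std()       # standardized input
y_demo = 5.0 * x_demo + 2.0 + rng.normal(size=67)   # made-up linear target

results = {}
for lr in (0.001, 0.01, 0.1):
    for n_iter in (500, 2000):
        results[(lr, n_iter)] = gd_fit(x_demo, y_demo, lr, n_iter)[2]
        print(f"lr={lr:<6} n_iter={n_iter:<5} final loss={results[(lr, n_iter)]:.4f}")
```

On standardized inputs, a very small learning rate with few iterations tends to stop before convergence, which shows up as a visibly larger final loss.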

### Evaluation

Evaluate the model on the training data and test data

In [12]:
# TODO: Your code here
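For the evaluation step, mean squared error is one reasonable metric. Below is a minimal sketch with made-up numbers. The helper `mse` is an assumption for illustration. It flattens both arrays first, because subtracting an `(m,)` array from an `(m, 1)` array silently broadcasts to an `(m, m)` matrix.

```python
import numpy as np

def mse(y_true, y_pred):
    # Flatten both inputs so (m, 1) and (m,) arrays are compared element by element.
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    return np.mean((y_true - y_pred) ** 2)

y_true_demo = np.array([[3.0], [5.0], [7.0], [9.0]])  # column vector, like y_train
y_pred_demo = np.array([2.5, 5.5, 7.0, 9.5])          # flat vector

mse_demo = mse(y_true_demo, y_pred_demo)
print(mse_demo)      # (0.25 + 0.25 + 0 + 0.25) / 4 = 0.1875
print(mse_demo / 2)  # the loss J defined earlier is half of the MSE
```

Calling the same function on training and test predictions gives the two numbers to compare.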

### Scikit-learn Linear Regression

Train a linear regressor using the sklearn library and compare its performance with your `LRwithGradientDecent` model.

In [None]:
# TODO: Your code here
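As a reference point for the comparison, the sketch below fits `sklearn.linear_model.LinearRegression` on synthetic data and checks it against the least-squares solution from `np.linalg.lstsq`. The data and the `*_demo` names are assumptions for illustration. A well-converged gradient descent model should land close to the same slope and intercept.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(67, 1))
y_demo = 5.0 * X_demo + 2.0 + rng.normal(size=(67, 1))

sk_model = LinearRegression().fit(X_demo, y_demo)

# Least-squares solution of [X, 1] @ [w, b]^T = y
A = np.hstack([X_demo, np.ones_like(X_demo)])
theta, *_ = np.linalg.lstsq(A, y_demo, rcond=None)

sk_w, sk_b = sk_model.coef_.ravel()[0], sk_model.intercept_.ravel()[0]
print("sklearn: w =", sk_w, "b =", sk_b)
print("lstsq:   w =", theta[0, 0], "b =", theta[1, 0])
```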

### Visualization

Plot the training data (scatter plot) and the fitted lines by LRwithGradientDecent and LinearRegression (with different colored line plots) in a single figure.

In [None]:
# TODO: Your code here


Plot the test data (scatter plot) and the fitted lines by LRwithGradientDecent and LinearRegression (with different colored line plots) in a single figure.

In [13]:
# TODO: Your code here
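A possible structure for both figures is one helper that scatters the data and overlays each fitted line on the same axes. In the sketch below, `plot_fits` is a hypothetical helper, and the data and the two `(w, b)` pairs are placeholders rather than fitted results. Evaluating each line on a sorted grid avoids the zig-zag that appears when a line plot is drawn over unsorted `x` values.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_fits(ax, x, y, lines, title):
    """Scatter the data and overlay each (label, w, b) line on the same axes."""
    ax.scatter(x, y, s=15, color="gray", label="data")
    grid = np.linspace(x.min(), x.max(), 100)  # sorted x values for the line plots
    for label, w, b in lines:
        ax.plot(grid, w * grid + b, label=label)
    ax.set_xlabel("x (scaled)")
    ax.set_ylabel("y")
    ax.set_title(title)
    ax.legend()

rng = np.random.default_rng(0)
x_demo = rng.normal(size=67)
y_demo = 5.0 * x_demo + 2.0 + rng.normal(size=67)

fig, ax = plt.subplots()
plot_fits(
    ax, x_demo, y_demo,
    [("gradient descent", 4.9, 2.1), ("scikit-learn", 5.0, 2.0)],  # placeholder parameters
    "Training data",
)
```

In a notebook, the figure renders at the end of the cell. The same helper can be called again with the test split and the title changed.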

## Part 2: Polynomial Regression

Fit 2-degree, 4-degree, 8-degree polynomial regression models using only Scikit-learn library.

In [14]:
from sklearn.preprocessing import PolynomialFeatures
poly2 = PolynomialFeatures(degree=2, include_bias=False)
poly4 = PolynomialFeatures(degree=4, include_bias=False)
poly8 = PolynomialFeatures(degree=8, include_bias=False)
poly_features = poly2.fit_transform(X_train_scaled)  # degree-2 features of the scaled training inputs, shape (n_samples, 2)
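One way to finish this part is to wrap scaling, feature expansion, and the regressor in a scikit-learn pipeline and loop over the three degrees. The sketch below uses synthetic data from a made-up quartic, so the numbers it prints say nothing about the assignment data. Scaling before expansion is a design choice assumed here to keep the eighth-degree terms numerically manageable.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(0)
X_demo = rng.uniform(90, 110, size=(100, 1))
z = (X_demo - 100.0) / 10.0
y_demo = 1 + 0.5 * z - 0.5 * z**2 - 0.2 * z**3 - 0.1 * z**4 + 0.1 * rng.normal(size=(100, 1))

X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.33, random_state=0)

train_mse, test_mse = {}, {}
for degree in (2, 4, 8):
    poly_model = make_pipeline(
        StandardScaler(),
        PolynomialFeatures(degree=degree, include_bias=False),
        LinearRegression(),
    )
    poly_model.fit(X_tr, y_tr)
    train_mse[degree] = mean_squared_error(y_tr, poly_model.predict(X_tr))
    test_mse[degree] = mean_squared_error(y_te, poly_model.predict(X_te))
    print(f"degree={degree}: train MSE={train_mse[degree]:.4f}, test MSE={test_mse[degree]:.4f}")
```

Training error cannot increase as the degree grows, since each model contains the previous one, so the test error is the number to use when choosing the best degree.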

Evaluate the models on the training data and test data, and write a comment on which model is best.

In [16]:
# TODO: Your code here

Visualize the test data (scatter plot) and the predictions of three polynomial models (different colored line plots) in a single figure.

In [15]:
# TODO: Your code here

## Part 3: Regularized Linear Regression


Read 'data.csv'.

Drop the column 'y' from the dataframe to obtain input data X.

Get the column 'y' from the dataframe to obtain target data y.

Split data (X,y) into training and test datasets with the test ratio of 25% and random_state=0 using Scikit-learn library.

Transform the data to have zero mean and unit variance.

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)  
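The split above only matches the instructions if `X` and `y` were first built from `data.csv`. Otherwise it reuses the arrays from Part 1. Here is a minimal sketch of the full sequence, assuming pandas is available. Because the file is not included here, an in-memory CSV with hypothetical feature columns `f1` and `f2` stands in for it. Only the target column name `y` comes from the instructions.

```python
import io
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Stand-in for the file; in the notebook this would be pd.read_csv("data.csv").
csv_text = "f1,f2,y\n1,10,3\n2,9,5\n3,7,8\n4,8,9\n5,4,12\n6,5,13\n7,2,16\n8,1,18\n"
df = pd.read_csv(io.StringIO(csv_text))

X_demo = df.drop(columns=["y"])  # inputs: every column except the target
y_demo = df["y"]                 # target column

X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.25, random_state=0)

scaler_demo = StandardScaler().fit(X_tr)  # statistics from the training split only
X_tr_scaled = scaler_demo.transform(X_tr)
X_te_scaled = scaler_demo.transform(X_te)

print(X_tr_scaled.shape, X_te_scaled.shape)
```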


Train a Linear Regression model.

In [None]:
model = LRwithGradientDecent(lr=0.01)  # lr is a required argument; 0.01 is an assumed starting value

Find the classes for linear regression models with L1 and L2 regularization on https://scikit-learn.org/stable/modules/classes.html#module-sklearn.linear_model and import them from sklearn.
Please read the class descriptions to learn the class parameters and to accomplish this part correctly.

Set the regularization constant for training L1 model to 0.01.

Set the regularization constant for training L2 model to 0.7.

In [None]:
# TODO: Your code here
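In scikit-learn, `Lasso` applies an L1 penalty and `Ridge` applies an L2 penalty, and in both classes the regularization constant is the `alpha` parameter. The sketch below fits both on synthetic data that stand in for the scaled training set. The data and coefficients are made up for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5))
true_coef = np.array([3.0, 0.0, -2.0, 0.0, 1.0])   # made-up coefficients
y_demo = X_demo @ true_coef + 0.5 * rng.normal(size=200)

lasso = Lasso(alpha=0.01).fit(X_demo, y_demo)  # L1-regularized linear regression
ridge = Ridge(alpha=0.7).fit(X_demo, y_demo)   # L2-regularized linear regression

print("Lasso coefficients:", lasso.coef_)
print("Ridge coefficients:", ridge.coef_)
```

Comparing the printed coefficients with `true_coef` gives a feel for how each penalty shrinks the estimates.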

Import the corresponding function from Scikit-learn library for R-squared metric.

Evaluate the performance of the three models based on R-squared and comment on which model is best.

In [None]:
# TODO: Your code here
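The R-squared metric is available as `sklearn.metrics.r2_score`. The short sketch below checks it against the definition $R^2 = 1 - SS_{res}/SS_{tot}$ on made-up numbers. In the notebook the same call would be applied to each model's predictions on the test set.

```python
import numpy as np
from sklearn.metrics import r2_score

y_true_demo = np.array([3.0, 5.0, 7.0, 9.0])
y_pred_demo = np.array([2.5, 5.5, 7.0, 9.5])

ss_res = np.sum((y_true_demo - y_pred_demo) ** 2)          # 0.75
ss_tot = np.sum((y_true_demo - y_true_demo.mean()) ** 2)   # 20.0
r2_manual = 1 - ss_res / ss_tot                            # 0.9625

r2_sklearn = r2_score(y_true_demo, y_pred_demo)  # argument order: (y_true, y_pred)
print(r2_manual, r2_sklearn)
```

A score of 1 means perfect predictions, and a score near 0 means the model does no better than predicting the mean of `y`.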