# Multivariable Linear Regression Model Training Using Momentum-based and Vanilla GD with Batch/mini-Batch/Stochastic Variants:
## Please read and follow the instructions very carefully (Good Luck).
### In this task you should implement a mini-batch momentum based gradient descent optimizer to train a multivariable linear regression model.
### Your optimizer should be able to work as Stochastic/mini-batch/Batch by just adjusting mini-batch size without any code modification.
### Your optimizer should also work as GD without momentum by adjusting the hyperparameter value (think how?) without any code modification.
### Make your implementation as a function.
### The function should return the model parameters and the outputs required for plotting the learning curves.
### Data shuffling and adding the first bias feature (containing ones) must be performed inside the function.
### The optimizer must be able to deal with any mini-batch size.
### A maximum number of epochs must be stated in order to avoid an infinite loop.
### Gradient check and cost convergence check stopping criteria must be implemented.
### You must plot the following learning curves:
#### - Loss vs. iterations (not Epoch).
#### - Loss vs. theta0, loss vs. theta1, loss vs. theta2.
### You must evaluate your model using the r2_score metric and achieve a score of at least 0.8 in any scenario of your choice, and plot all the learning curves.
### You must run (at least) the following scenarios and show results and learning curves for each scenario (Do not write a new implementation for each scenario. Use your implementation and change the input parameter values of the function):
#### - Momentum based with mini-batch size of your choice.
#### - Momentum based with Stochastic Gradient Descent.
#### - Momentum based with Batch Gradient Descent.
#### - Vanilla GD with mini-batch size of your choice.
#### - Vanilla GD with Stochastic Gradient Descent.
#### - Vanilla GD with Batch Gradient Descent.
#### - Any scenario with mini-batch size 32.
### You must use a vectorized implementation, i.e., you should have only two loops: one for epochs and one for iterations (only the optimizer's loops).
### Your function should take the following inputs:
#### Input features, target label, learning rate, momentum term, mini-batch size, max number of epochs, gradient check tolerance, cost convergence check tolerance, and any other argument you think useful.
### The function should initialize the model parameters to zeros.
### Generate a regression dataset with two input features and 500 observations (You can see a sample code below for data generation. Feel free to change any of the values to obtain your own data to achieve the required score).

In [1]:
import numpy as np
import pandas as pd

# Generate random data
np.random.seed(42) 

# Create 500 records
n_samples = 500

# Independent variables (features)
x1 = np.random.uniform(low=2.0, high=5.0, size=n_samples)
x2 = np.random.uniform(low=4.0, high=8.0, size=n_samples)

# Dependent variable (target)
y = 1000 + 50 * x1 - 30 * x2 + np.random.normal(loc=0, scale=10, size=n_samples)

# Create a DataFrame
data = {
    'x1': x1,
    'x2': x2,
    'y': y
}

df = pd.DataFrame(data)

df.head()  


         x1        x2            y
0  3.123620  6.792647   954.178622
1  4.852143  6.144385  1044.922138
2  4.195982  5.238110  1056.457756
3  3.795975  7.255180   978.249228
4  2.468056  6.738925   926.832960
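
Before training with gradient descent, it can help to know roughly what a good fit looks like on this data. The sketch below regenerates the same sample data and fits it with NumPy's least-squares solver as a reference. The names `theta_ref` and `r2_ref` are illustrative and not part of the assignment, and R² is computed by hand with the usual 1 - SS_res / SS_tot definition rather than with `r2_score`.

```python
import numpy as np

# Regenerate the sample data exactly as in the data-generation cell
np.random.seed(42)
n_samples = 500
x1 = np.random.uniform(low=2.0, high=5.0, size=n_samples)
x2 = np.random.uniform(low=4.0, high=8.0, size=n_samples)
y = 1000 + 50 * x1 - 30 * x2 + np.random.normal(loc=0, scale=10, size=n_samples)

# Design matrix with a leading bias column of ones
X = np.column_stack((np.ones(n_samples), x1, x2))

# Closed-form least-squares fit, used only as a reference for the GD result
theta_ref, *_ = np.linalg.lstsq(X, y, rcond=None)

# R^2 computed by hand: 1 - SS_res / SS_tot
y_pred = X @ theta_ref
ss_res = np.sum((y - y_pred) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2_ref = 1 - ss_res / ss_tot

print("Reference thetas:", theta_ref)
print("Reference R^2:", r2_ref)
```

A gradient descent run that has converged should end up close to these reference thetas. A large gap usually points to a learning rate, tolerance, or epoch limit that stops training too early.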


# Your submission must be in .ipynb file format that contains all results and curves.
# Do not submit any link for a notebook.
# Do not forget to click on Hand In after uploading your file.
# Good Luck

In [2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.metrics import r2_score
%matplotlib inline

In [3]:
Data=np.hstack((np.ones((df.shape[0], 1)),df))
np.random.shuffle(Data)
Data

array([[1.00000000e+00, 2.25604239e+00, 7.68999752e+00, 8.90604411e+02],
       [1.00000000e+00, 2.28332888e+00, 6.57599817e+00, 9.10496881e+02],
       [1.00000000e+00, 3.15830791e+00, 6.23856133e+00, 9.70914346e+02],
       ...,
       [1.00000000e+00, 2.35645375e+00, 6.95213447e+00, 9.25542620e+02],
       [1.00000000e+00, 4.43034018e+00, 7.38580917e+00, 1.00390077e+03],
       [1.00000000e+00, 3.10334940e+00, 5.42438690e+00, 9.79632819e+02]])

In [4]:
features=Data[:,:-1]
y_true=Data[:,-1]
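
The two cells above add the bias column and shuffle the data outside the optimizer, while the task asks for both steps to happen inside the function. One possible way to do that is a small helper called at the start of the optimizer. This is a minimal sketch: `add_bias_and_shuffle` and its `seed` argument are hypothetical names, not part of the assignment.

```python
import numpy as np

def add_bias_and_shuffle(X_raw, y, seed=None):
    """Hypothetical helper: prepend a column of ones and shuffle X and y together."""
    X_raw = np.asarray(X_raw, dtype=float)
    # Column vector so that later (h - y) keeps the shape (m, 1)
    y = np.asarray(y, dtype=float).reshape(-1, 1)

    # One shared permutation keeps every row of X paired with its target
    rng = np.random.default_rng(seed)
    perm = rng.permutation(X_raw.shape[0])

    # Bias feature (ones) goes first, so theta0 is the intercept
    X = np.hstack((np.ones((X_raw.shape[0], 1)), X_raw))
    return X[perm], y[perm]

# Tiny demonstration with three observations and two features
X_demo = np.array([[2.0, 4.0], [3.0, 5.0], [4.0, 6.0]])
y_demo = np.array([10.0, 20.0, 30.0])
X_b, y_b = add_bias_and_shuffle(X_demo, y_demo, seed=0)
print(X_b)
print(y_b)
```

If a helper like this is used inside the optimizer, the function should receive the raw two-column feature matrix. Passing a matrix that already has a ones column would add a second, redundant bias feature.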

Function inputs: input features, target label, learning rate, momentum term, mini-batch size, max number of epochs, gradient check tolerance, cost convergence check tolerance.

In [None]:
def Momentum_Based_Multi_Variable_GD(X, y, learning_rate, momentum_term, batch_size, max_epoch, gradient_tolerence, convergence_tolerence):
    # Full-batch version: batch_size is accepted for a uniform signature but is not used here.
    # X is expected to already contain the bias column of ones.
    costs = []
    all_theta = []
    # Reshape y to a column vector so that h - y does not broadcast to an (m, m) matrix
    y = np.asarray(y, dtype=float).reshape(-1, 1)
    # Initialize theta vector with zeros
    thetas = np.zeros((X.shape[1], 1))
    # Initialize velocity vector with zeros
    v_theta = np.zeros((X.shape[1], 1))

    for i in range(max_epoch):
        print(f"****************Iteration {i}*******************")

        # Calculate hypothesis
        h = X @ thetas

        # Store thetas to plot them against the loss later
        all_theta.append(thetas.flatten().tolist())

        # Calculate error vector
        error_vector = h - y

        # Calculate cost function as a scalar
        s = (error_vector.T @ error_vector).item()
        E = s / (2 * len(y))
        costs.append(E)
        print("Cost (J):", E)

        # Calculate gradient vector
        gradient_vector = (1 / len(X)) * (X.T @ error_vector)

        # Calculate norm of the gradient vector
        gradient_vector_norm = np.linalg.norm(gradient_vector)
        print("Gradient Vector Norm:", gradient_vector_norm)

        # Gradient check
        if gradient_vector_norm < gradient_tolerence:
            break

        # Convergence check
        if i > 0 and (abs(costs[i-1] - costs[i]) < convergence_tolerence):
            break

        # Update velocity (momentum_term = 0 reduces this to vanilla GD)
        v_theta = momentum_term * v_theta + (1 - momentum_term) * gradient_vector

        # Update thetas
        thetas = thetas - learning_rate * v_theta
        print("Updated Thetas:", thetas)

    all_theta = np.array(all_theta)
    # Predictions with the final parameters
    h = X @ thetas

    print("*************Training Report************\n")
    print(f"Gradient Descent stopped after {i + 1} iterations\n")
    print("Final Cost (J):", E)
    print("Final Thetas:", thetas)
    print("Final Hypothesis (h(X)):", h)

    return thetas, h, costs, all_theta
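
The velocity update used above has the form v = beta * v + (1 - beta) * gradient. With beta set to 0 the old velocity is discarded and the step becomes the plain gradient descent update, so the same code covers both the momentum-based and the vanilla scenarios. The sketch below checks this on a single step. `momentum_step` is an illustrative helper and the numbers are arbitrary.

```python
import numpy as np

def momentum_step(thetas, v, grad, learning_rate, beta):
    """One update in the exponentially weighted form used in this notebook."""
    v_new = beta * v + (1 - beta) * grad
    thetas_new = thetas - learning_rate * v_new
    return thetas_new, v_new

thetas = np.array([[1.0], [2.0]])
v = np.array([[5.0], [-3.0]])        # arbitrary stale velocity
grad = np.array([[0.5], [0.25]])     # arbitrary gradient
learning_rate = 0.1

# beta = 0: the velocity equals the gradient, i.e. vanilla GD
new_thetas, new_v = momentum_step(thetas, v, grad, learning_rate, beta=0.0)
vanilla_thetas = thetas - learning_rate * grad

print(np.allclose(new_thetas, vanilla_thetas))
```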










In [None]:
def Momentum_Based_Multi_Variable_GD(X, y, learning_rate, momentum_term, batch_size, max_epoch, gradient_tolerence, convergence_tolerence):
    # X is expected to already contain the bias column of ones.
    all_costs = []
    all_theta = []
    # Reshape y to a column vector so that h - y_batch keeps the shape (batch, 1)
    y = np.asarray(y, dtype=float).reshape(-1, 1)
    # Initialize theta vector with zeros
    thetas = np.zeros((X.shape[1], 1))
    # Initialize velocity vector with zeros
    v_t = np.zeros((X.shape[1], 1))
    # Number of mini-batches (iterations) per epoch; the last batch may be smaller
    n_batches = int(np.ceil(len(X) / batch_size))

    for i in range(max_epoch):
        print(f"*********** Epoch {i} ***********************")

        for m in range(0, len(X), batch_size):
            # Store thetas to plot them against the loss later
            all_theta.append(thetas.flatten().tolist())

            x_batch = X[m:m+batch_size]
            y_batch = y[m:m+batch_size]

            # Hypothesis and cost for the current mini-batch
            h = x_batch @ thetas
            error = h - y_batch
            norm_error = np.linalg.norm(error)
            norm_squared = norm_error ** 2
            E = norm_squared / (2 * len(x_batch))
            all_costs.append(E)

            # Gradient for the current mini-batch
            gradient_vector = (x_batch.T @ error) / len(x_batch)

            # Update velocity (momentum_term = 0 reduces this to vanilla GD)
            v_t = momentum_term * v_t + (1 - momentum_term) * gradient_vector

            # Update thetas
            thetas = thetas - learning_rate * v_t

        print("*************Training Report************\n")
        print('j =', E, "\n")
        print('h(x):', h)
        print('Error Vector:\n', error, "\n")
        Gradient_norm = np.linalg.norm(gradient_vector)
        print('Gradient Vector Norm :', Gradient_norm, "\n")

        # Gradient check (uses the gradient of the last mini-batch of the epoch)
        if Gradient_norm < gradient_tolerence:
            break

        # Convergence check: last cost of this epoch vs. last cost of the previous epoch
        if i > 0 and abs(all_costs[-(n_batches + 1)] - all_costs[-1]) < convergence_tolerence:
            break

        print(f"v_t{i} : ", v_t, "\n")
        print("theta_updated :", thetas, "\n")

    all_theta = np.array(all_theta)
    # Predictions for the full data set with the final parameters
    h = X @ thetas

    return thetas, h, all_costs, all_theta
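
The inner loop above steps through the data with `range(0, len(X), batch_size)`, which is what lets a single implementation act as stochastic, mini-batch, or batch gradient descent. The sketch below lists the slice boundaries this produces for a few batch sizes on 500 observations. `batch_slices` is an illustrative helper, not part of the optimizer.

```python
import math

def batch_slices(n_samples, batch_size):
    """Return the (start, stop) boundaries produced by stepping through the data."""
    return [(start, min(start + batch_size, n_samples))
            for start in range(0, n_samples, batch_size)]

n_samples = 500

# batch_size = 1: stochastic GD, one update per observation
sgd = batch_slices(n_samples, 1)

# batch_size = 32: mini-batch GD, the last batch holds the remaining rows
mini = batch_slices(n_samples, 32)

# batch_size = n_samples: batch GD, one update per epoch
full = batch_slices(n_samples, n_samples)

print(len(sgd), len(mini), len(full))
print("Last mini-batch:", mini[-1])
print("Batches per epoch for size 32:", math.ceil(n_samples / 32))
```

Because the last slice is simply shorter when the batch size does not divide the number of observations, dividing by `len(x_batch)` rather than by `batch_size` keeps the cost and gradient correctly averaged for that final batch.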


