# Performance Metrics

## Linear Regression

### Imports

In [1]:
import pandas as pd
import numpy as np

•	pandas (pd): Used for loading, manipulating, and analyzing datasets as DataFrames.
•	numpy (np): A numerical computing library, used here for array handling and vectorized mathematical operations.

### Original Dataset

In [2]:
df = pd.read_csv('dataset.csv')
print("Original Dataset:")
print(df)

Original Dataset:
      Hours Studied  Previous Scores Extracurricular Activities  Sleep Hours  \
0                 7               99                        Yes            9   
1                 4               82                         No            4   
2                 8               51                        Yes            7   
3                 5               52                        Yes            5   
4                 7               75                         No            8   
...             ...              ...                        ...          ...   
9995              1               49                        Yes            4   
9996              7               64                        Yes            8   
9997              6               83                        Yes            8   
9998              9               97                        Yes            7   
9999              7               74                         No            8   

      Sample Question

### Dropping Categorical Column

In [3]:
df = df.drop(columns=['Extracurricular Activities'])

•	df.drop(columns=[...]): Removes the categorical column 'Extracurricular Activities'. Its Yes/No strings cannot enter the regression directly without first being encoded as numbers, so this walkthrough drops the column for simplicity.
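
Dropping the column is the simplest option, but not the only one. As a minimal sketch of an alternative, a two-valued categorical column can be mapped to 0/1 and kept as a feature. The miniature DataFrame `toy` below is hypothetical; only the column names are taken from the dataset above.

```python
import pandas as pd

# Hypothetical miniature frame; only the column names come from the dataset above.
toy = pd.DataFrame({
    "Hours Studied": [7, 4, 8],
    "Extracurricular Activities": ["Yes", "No", "Yes"],
})

# Map the two categories to 1 and 0 so the column becomes numeric.
toy["Extracurricular Activities"] = toy["Extracurricular Activities"].map({"Yes": 1, "No": 0})

print(toy)
```

Whether keeping the encoded column improves the fit is not something this walkthrough measures, so the results below still refer to the model without it.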

### Separating Features and Target

In [4]:
X = df.drop(columns=['Performance Index']).values
y = df['Performance Index'].values

•	X: The independent variables (features) from the dataset, excluding the target 'Performance Index'.              
•	y: The target (dependent variable), 'Performance Index', which we aim to predict.

### Setting up for Gradient Descent

In [5]:
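# Standardize each feature column to mean 0 and standard deviation 1
X = (X - np.mean(X, axis=0)) / np.std(X, axis=0)
# Prepend a column of 1s for the intercept (bias) term
X_b = np.c_[np.ones((len(X), 1)), X]

# Initialize parameters (one per feature, plus the intercept)
theta = np.random.randn(X_b.shape[1], 1)

# Set hyperparameters
learning_rate = 0.01
iterations = 1000
m = len(X)

# Reshape the target into a column vector so it lines up with X_b.dot(theta)
y = y.reshape(-1, 1)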

•	np.mean(X, axis=0): Computes the mean of each feature (column) in X.
•	np.std(X, axis=0): Computes the standard deviation of each feature (column).
•	Standardization: Rescales each feature to a mean of 0 and a standard deviation of 1. Putting the features on a common scale helps gradient descent converge faster.
•	np.ones((len(X), 1)): Creates a column of 1s, representing the intercept (bias) term in the linear regression model.
•	np.c_[]: Concatenates the column of 1s with the standardized features X.
•	np.random.randn(): Randomly initializes the model parameters (theta). There is one parameter for each feature plus the intercept term.
•	y.reshape(-1, 1): Converts y to a column vector of shape (m, 1). Without this, subtracting a 1-D y from the (m, 1) predictions would broadcast to an m × m matrix instead of an element-wise difference.
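
To make the effect of the standardization step concrete, here is a small sketch on a hypothetical toy matrix (`X_toy` is not part of the dataset). It checks that each column ends up with mean 0 and standard deviation 1, and that the intercept column is added in front. Note that this formula assumes no feature is constant, since a standard deviation of 0 would cause a division by zero.

```python
import numpy as np

# Hypothetical toy feature matrix: 4 samples, 2 features on very different scales.
X_toy = np.array([[1.0, 100.0],
                  [2.0, 300.0],
                  [3.0, 500.0],
                  [4.0, 700.0]])

# Same standardization as in the cell above.
X_std = (X_toy - np.mean(X_toy, axis=0)) / np.std(X_toy, axis=0)

# Prepend the column of 1s for the intercept term.
X_b_toy = np.c_[np.ones((len(X_std), 1)), X_std]

print(X_std.mean(axis=0))  # approximately 0 for every column
print(X_std.std(axis=0))   # approximately 1 for every column
print(X_b_toy.shape)       # (4, 3): intercept column plus two features
```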

### Gradient Descent Algorithm

In [6]:
for iteration in range(iterations):
    gradients = (2/m) * X_b.T.dot(X_b.dot(theta) - y)
    theta -= learning_rate * gradients

•	X_b.T.dot(X_b.dot(theta) - y): Multiplies the transposed feature matrix by the residuals (predictions minus targets). Scaled by 2/m, this is the gradient of the mean squared error with respect to theta.
•	theta -= learning_rate * gradients: Updates theta by stepping against the gradient, with the step size set by the learning rate.
•	Loop: Repeats the update `iterations` times to reduce the error between predicted and actual values.
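
One way to sanity-check a gradient descent loop like this is to compare its result with the closed-form least-squares solution. The sketch below does so on hypothetical synthetic data; the seed, the data-generating coefficients, and the names `X_syn`, `y_syn`, and `theta_gd` are all assumptions made for illustration, and `np.linalg.lstsq` is used only as a reference, not as the method of this walkthrough.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, an assumption for reproducibility

# Hypothetical synthetic data: y = 4 + 3*x1 - 2*x2 + small noise.
m = 200
X_syn = rng.normal(size=(m, 2))
y_syn = (4 + 3 * X_syn[:, 0] - 2 * X_syn[:, 1] + 0.1 * rng.normal(size=m)).reshape(-1, 1)
X_b_syn = np.c_[np.ones((m, 1)), X_syn]

# Same update rule as the loop above.
theta_gd = rng.normal(size=(3, 1))
learning_rate = 0.01
for _ in range(1000):
    gradients = (2 / m) * X_b_syn.T.dot(X_b_syn.dot(theta_gd) - y_syn)
    theta_gd -= learning_rate * gradients

# Closed-form least-squares solution for comparison.
theta_ls, *_ = np.linalg.lstsq(X_b_syn, y_syn, rcond=None)

# Largest absolute difference between the two parameter vectors.
print(np.max(np.abs(theta_gd - theta_ls)))
```

If the two parameter vectors differ noticeably, the usual suspects are a learning rate that is too small, too few iterations, or unscaled features.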

### Calculating Predicted Values

In [7]:
predicted_values = X_b.dot(theta)
# Flatten the (m, 1) prediction array to 1-D before storing it as a column
df['Predicted Performance Index'] = predicted_values.ravel()
print("\nDataset with Predicted Performance Index:")
print(df[['Performance Index', 'Predicted Performance Index']])


Dataset with Predicted Performance Index:
      Performance Index  Predicted Performance Index
0                  91.0                    91.532244
1                  65.0                    63.469569
2                  45.0                    44.736195
3                  36.0                    36.241825
4                  66.0                    67.390699
...                 ...                          ...
9995               23.0                    21.296025
9996               58.0                    56.186280
9997               74.0                    72.685938
9998               95.0                    94.054071
9999               64.0                    65.591322

[10000 rows x 2 columns]


•	X_b.dot(theta): Computes the predicted 'Performance Index' for each instance by multiplying the standardized features (plus the intercept column) by the learned parameters theta.
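
Because the model was trained on standardized features, any new sample has to be scaled with the same training mean and standard deviation before it is multiplied by theta. The sketch below illustrates this on hypothetical data; `X_train`, `mu`, `sigma`, and `x_new` are illustrative names, and a least-squares solve stands in for the gradient descent loop to keep the example short.

```python
import numpy as np

# Hypothetical training data: 5 samples, 2 features, exactly linear target.
X_train = np.array([[1.0, 50.0], [2.0, 60.0], [3.0, 55.0], [4.0, 80.0], [5.0, 70.0]])
y_train = (10 + 2 * X_train[:, 0] + 0.5 * X_train[:, 1]).reshape(-1, 1)

# Keep the training statistics; they are part of the fitted model.
mu = X_train.mean(axis=0)
sigma = X_train.std(axis=0)

X_b_train = np.c_[np.ones((len(X_train), 1)), (X_train - mu) / sigma]

# Least squares stands in for the gradient descent loop to keep the sketch short.
theta_fit, *_ = np.linalg.lstsq(X_b_train, y_train, rcond=None)

# A new, unseen sample must be scaled with the *training* mean and std.
x_new = np.array([[6.0, 90.0]])
x_new_b = np.c_[np.ones((1, 1)), (x_new - mu) / sigma]
prediction = x_new_b.dot(theta_fit)

print(prediction[0, 0])  # close to 10 + 2*6 + 0.5*90 = 67
```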

### Calculating Mean Squared Error (MSE)

In [8]:
mse = sum((y[i][0] - predicted_values[i][0]) ** 2 for i in range(len(y))) / len(y)
print(f"\nMean Squared Error (MSE): {mse}")


Mean Squared Error (MSE): 4.245176108662539


### Calculating Root Mean Squared Error (RMSE)

In [9]:
rmse = (sum((y[i][0] - predicted_values[i][0]) ** 2 for i in range(len(y))) / len(y)) ** 0.5
print(f"\nRoot Mean Squared Error (RMSE): {rmse}")


Root Mean Squared Error (RMSE): 2.0603825151322117
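
The MSE and RMSE cells above loop over the rows in pure Python. As a sketch of an equivalent vectorized form, the same quantities can be computed with `np.mean` and `np.sqrt`. The arrays `y_true` and `y_pred` below are hypothetical stand-ins shaped like `y` and `predicted_values`.

```python
import numpy as np

# Hypothetical column vectors shaped like y and predicted_values above.
y_true = np.array([[91.0], [65.0], [45.0], [36.0]])
y_pred = np.array([[90.0], [67.0], [45.0], [33.0]])

errors = y_true - y_pred          # residuals: 1, -2, 0, 3
mse_vec = np.mean(errors ** 2)    # (1 + 4 + 0 + 9) / 4 = 3.5
rmse_vec = np.sqrt(mse_vec)       # square root of the MSE

# Same quantity computed with the explicit loop used in the cells above.
mse_loop = sum((y_true[i][0] - y_pred[i][0]) ** 2 for i in range(len(y_true))) / len(y_true)

print(mse_vec, rmse_vec, mse_loop)
```

RMSE is often easier to interpret than MSE because it is expressed in the same units as the target.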


### Calculating Mean Absolute Error (MAE)

In [10]:
mae = sum(abs(y[i][0] - predicted_values[i][0]) for i in range(len(y))) / len(y)
print(f"\nMean Absolute Error (MAE): {mae}")


Mean Absolute Error (MAE): 1.6375973348488966


•	abs(): Takes the absolute difference between the actual and predicted values for each data point.
•	sum(): Sums up the absolute differences.
•	/ len(y): Divides by the number of data points to get the average absolute error.
•	MAE: Measures prediction error in the units of the target. Because errors are not squared, a single large error influences MAE less than it influences MSE.
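
The difference in outlier sensitivity can be seen on a small example. In this sketch (all values and the helper names `mae` and `mse` are hypothetical), one prediction is made badly wrong and both metrics are recomputed.

```python
import numpy as np

y_true = np.array([10.0, 20.0, 30.0, 40.0])
y_clean = np.array([11.0, 19.0, 31.0, 39.0])    # every error has magnitude 1
y_outlier = np.array([11.0, 19.0, 31.0, 60.0])  # last error has magnitude 20

def mae(a, b):
    """Mean absolute error between two arrays."""
    return np.mean(np.abs(a - b))

def mse(a, b):
    """Mean squared error between two arrays."""
    return np.mean((a - b) ** 2)

print(mae(y_true, y_clean), mse(y_true, y_clean))      # 1.0 1.0
print(mae(y_true, y_outlier), mse(y_true, y_outlier))  # 5.75 100.75
```

The single bad prediction raises MAE from 1.0 to 5.75, while MSE jumps from 1.0 to 100.75.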

### Calculating R-Squared

In [11]:
ss_total = np.sum((y - np.mean(y)) ** 2)
ss_residual = np.sum((y - predicted_values) ** 2)
r_squared = 1 - (ss_residual / ss_total)
print(f"\nR-squared: {r_squared}")


R-squared: 0.9884981216772581


•	ss_total: The total variation in the target variable, computed by summing the squared differences between each actual value and the mean of the actual values.
•	ss_residual: The sum of squared differences between the actual and predicted values, representing the variation that the model does not explain.
•	r_squared: One minus the ratio of ss_residual to ss_total, which is the fraction of the target's variation explained by the model.
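
Two reference points help when reading an R-squared value: a model that predicts every target exactly scores 1, and a model that always predicts the mean of the targets scores 0. The sketch below checks both cases; the helper name `r_squared_score` and the toy values are assumptions made for illustration.

```python
import numpy as np

def r_squared_score(y_true, y_pred):
    """R-squared computed the same way as in the cell above."""
    ss_total = np.sum((y_true - np.mean(y_true)) ** 2)
    ss_residual = np.sum((y_true - y_pred) ** 2)
    return 1 - ss_residual / ss_total

y_true = np.array([[2.0], [4.0], [6.0], [8.0]])
perfect = y_true.copy()                          # predicts every value exactly
mean_only = np.full_like(y_true, y_true.mean())  # always predicts the mean, 5.0

print(r_squared_score(y_true, perfect))    # 1.0
print(r_squared_score(y_true, mean_only))  # 0.0
```

Predictions worse than the mean-only baseline produce a negative value, so R-squared is not bounded below by 0.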

### Calculating Adjusted R-Squared

In [12]:
n = len(y)  # number of observations
p = X_b.shape[1] - 1  # number of predictors, excluding the intercept column
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - p - 1)
print(f"\nAdjusted R-squared: {adjusted_r_squared}")


Adjusted R-squared: 0.9884935186244026


•	n: The number of observations in the dataset, used to compute the degrees of freedom.
•	p: The number of predictors (features) used in the model, excluding the intercept term.
•	adjusted_r_squared: Adjusts R-squared for the number of predictors, penalizing the addition of variables that do not improve the model.
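
To illustrate the penalty, the sketch below wraps the same formula in a hypothetical helper, `adjusted_r_squared_score`, and evaluates it for a fixed R-squared with different predictor counts. The numbers are made up for illustration and are not taken from the dataset.

```python
def adjusted_r_squared_score(r_squared, n, p):
    """Adjusted R-squared for n observations and p predictors (intercept excluded)."""
    if n - p - 1 <= 0:
        raise ValueError("adjusted R-squared requires n > p + 1")
    return 1 - (1 - r_squared) * (n - 1) / (n - p - 1)

# Hypothetical case: the same R-squared of 0.90 on 50 observations.
few = adjusted_r_squared_score(0.90, n=50, p=5)    # 1 - 0.1 * 49 / 44
many = adjusted_r_squared_score(0.90, n=50, p=20)  # 1 - 0.1 * 49 / 29

print(few, many)
```

With the same raw R-squared, the model with 20 predictors receives the lower adjusted score. When n is much larger than p, as in the 10,000-row dataset above, the adjustment is very small, which is why the two printed values above are nearly identical.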