# Performance Metrics

## Logistic Regression

### Imports

In [1]:
import pandas as pd
from sklearn.metrics import mean_squared_log_error, mean_absolute_percentage_error

•	pandas (pd): A data manipulation and analysis library, used here to read and handle the dataset.
•	mean_squared_log_error: A function from sklearn.metrics that calculates the Mean Squared Logarithmic Error (MSLE), a regression metric that compares the logarithms of the true and predicted values and is meant for non-negative targets.
•	mean_absolute_percentage_error: A function from sklearn.metrics that computes the Mean Absolute Percentage Error (MAPE), the average absolute error expressed as a fraction of the true value.

### Original Dataset

In [2]:
data = pd.read_csv('dataset.csv')
print(data)

      Hours Studied  Previous Scores Extracurricular Activities  Sleep Hours  \
0                 7               99                        Yes            9   
1                 4               82                         No            4   
2                 8               51                        Yes            7   
3                 5               52                        Yes            5   
4                 7               75                         No            8   
...             ...              ...                        ...          ...   
9995              1               49                        Yes            4   
9996              7               64                        Yes            8   
9997              6               83                        Yes            8   
9998              9               97                        Yes            7   
9999              7               74                         No            8   

      Sample Question Papers Practiced 

### Mapping Categorical Values

In [3]:
data['Extracurricular Activities'] = data['Extracurricular Activities'].map({'Yes': 1, 'No': 0})
print(data.head())

   Hours Studied  Previous Scores  Extracurricular Activities  Sleep Hours  \
0              7               99                           1            9   
1              4               82                           0            4   
2              8               51                           1            7   
3              5               52                           1            5   
4              7               75                           0            8   

   Sample Question Papers Practiced  Performance Index  
0                                 1               91.0  
1                                 2               65.0  
2                                 2               45.0  
3                                 2               36.0  
4                                 5               66.0  


•	data['Extracurricular Activities'].map(): Converts the categorical values 'Yes' and 'No' in the "Extracurricular Activities" column to 1 and 0, respectively. The conversion is needed because most machine learning algorithms expect numerical input. Any value not listed in the mapping becomes NaN.
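
The NaN behavior of the mapping step can be checked on a small example. This is a minimal sketch: the `activities` Series and its 'Maybe' entry are made up for illustration and do not come from the dataset.

```python
import pandas as pd

# Toy column; the 'Maybe' entry is a hypothetical value used to show the edge case.
activities = pd.Series(['Yes', 'No', 'Maybe', 'Yes'])

# Same mapping as in the notebook cell above.
mapped = activities.map({'Yes': 1, 'No': 0})
print(mapped)

# Count the values the mapping did not cover (they show up as NaN).
n_unmapped = int(mapped.isna().sum())
print(f"Unmapped values: {n_unmapped}")
```

Running a check like this on the real column before modelling would reveal unexpected labels or stray whitespace that silently turn into missing values.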

### Selecting Parameters X and Y

In [4]:
X = data[['Extracurricular Activities']]
y_true = data['Performance Index']

# Create baseline predictions: the mean of y_true repeated for every row
y_pred = [y_true.mean()] * len(y_true)

•	X: Represents the independent variable, i.e., the predictor in the model. Here it is the 'Extracurricular Activities' column that was just converted to binary. Note that X is not used in the metric calculations below, because the baseline prediction ignores the features.
•	y_true: Holds the actual values of the target variable, 'Performance Index', against which the predictions are compared.
•	y_pred: A list of predicted values. Instead of coming from a fitted model, every prediction is the mean of y_true, which serves as a simple baseline.
•	y_true.mean(): Computes the mean of the 'Performance Index' column.
•	len(y_true): Gives the number of rows, so the mean is repeated once per row and y_pred matches the length of y_true.
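
The same mean baseline can also be produced with scikit-learn's `DummyRegressor`. The sketch below is illustrative only: `X_toy` and `y_toy` are small stand-in arrays built from the first five rows shown earlier, not the full dataset.

```python
import numpy as np
from sklearn.dummy import DummyRegressor

# Stand-ins for X and y_true (first five rows of the printed output; illustrative only).
X_toy = np.array([[1], [0], [1], [1], [0]])
y_toy = np.array([91.0, 65.0, 45.0, 36.0, 66.0])

# Manual baseline: repeat the mean once per row, as in the notebook cell.
manual_pred = np.full(len(y_toy), y_toy.mean())

# Equivalent baseline via scikit-learn; this strategy ignores the features.
baseline = DummyRegressor(strategy="mean").fit(X_toy, y_toy)
dummy_pred = baseline.predict(X_toy)

print(manual_pred)
print(dummy_pred)
```

Any real model should score better than this baseline on the chosen metrics; if it does not, the features are adding little.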

### Calculating Mean Squared Logarithmic Error

In [5]:
msle = mean_squared_log_error(y_true, y_pred)
# MSLE is measured in squared log units, so it is reported as a plain number, not a percentage
print(f"The calculated MSLE is: {msle:.4f}")

The calculated MSLE is: 0.1552
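
To make the metric concrete, MSLE can be computed by hand as the mean of the squared differences between log(1 + true value) and log(1 + predicted value). The sketch below checks this against scikit-learn on a toy sample; `y_toy` and `pred_toy` are illustrative stand-ins (the first five Performance Index values and their mean), not the full dataset.

```python
import numpy as np
from sklearn.metrics import mean_squared_log_error

# Illustrative stand-ins for y_true and y_pred.
y_toy = np.array([91.0, 65.0, 45.0, 36.0, 66.0])
pred_toy = np.full(len(y_toy), y_toy.mean())

# MSLE by hand: mean of (log(1 + y) - log(1 + y_hat))^2.
manual_msle = np.mean((np.log1p(y_toy) - np.log1p(pred_toy)) ** 2)

# Reference value from scikit-learn.
sklearn_msle = mean_squared_log_error(y_toy, pred_toy)

print(f"manual:  {manual_msle:.6f}")
print(f"sklearn: {sklearn_msle:.6f}")
```

Because the errors are taken on a log scale, the result reflects relative differences and has no natural percentage interpretation.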


### Calculating Mean Absolute Percentage Error

In [6]:
mape = mean_absolute_percentage_error(y_true, y_pred)
print(f"The calculated MAPE is: {mape:.2f} or {mape * 100:.2f}%")

The calculated MAPE is: 0.38 or 38.26%
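
MAPE can likewise be reproduced by hand as the mean of |true - predicted| / |true|. The sketch below uses the same illustrative toy sample (`y_toy`, `pred_toy`), not the full dataset, and compares the result with scikit-learn.

```python
import numpy as np
from sklearn.metrics import mean_absolute_percentage_error

# Illustrative stand-ins for y_true and y_pred.
y_toy = np.array([91.0, 65.0, 45.0, 36.0, 66.0])
pred_toy = np.full(len(y_toy), y_toy.mean())

# MAPE by hand, as a fraction: mean of |error| / |true value|.
manual_mape = np.mean(np.abs(y_toy - pred_toy) / np.abs(y_toy))

# Reference value from scikit-learn (also returned as a fraction, not a percentage).
sklearn_mape = mean_absolute_percentage_error(y_toy, pred_toy)

print(f"manual:  {manual_mape:.4f} ({manual_mape * 100:.2f}%)")
print(f"sklearn: {sklearn_mape:.4f} ({sklearn_mape * 100:.2f}%)")
```

Since each error is divided by the true value, targets at or near zero can inflate MAPE dramatically, which is worth checking before relying on it.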
