**The Student Performance Dataset** is designed to examine the factors that influence students' academic performance. It consists of 10,000 student records, each containing several predictor variables and a performance index.


***A) Variables:***

**1. Hours Studied:** The total number of hours spent studying by each student.


**2. Previous Scores:** The scores obtained by students in previous tests.

**3. Extracurricular Activities:** Whether the student participates in extracurricular activities (Yes or No).

**4. Sleep Hours:** The average number of hours of sleep the student had per day.

**5. Sample Question Papers Practiced:** The number of sample question papers the student practiced.

***B) Target Variable:***

**Performance Index:** A measure of each student's overall academic performance, rounded to the nearest integer. The index ranges from 10 to 100, with higher values indicating better performance.
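
The properties described above can be checked once the CSV is loaded. The following is a minimal sketch, assuming the column names match the variable names listed above. `check_schema` is a hypothetical helper, and the sample frame only reproduces the first rows shown later in this notebook rather than the full file.

```python
import pandas as pd

# Column names assumed to match the variable list above
EXPECTED_COLUMNS = [
    "Hours Studied",
    "Previous Scores",
    "Extracurricular Activities",
    "Sleep Hours",
    "Sample Question Papers Practiced",
    "Performance Index",
]


def check_schema(df):
    """Hypothetical helper: return a list of problems (empty if none found)."""
    problems = []
    missing = [c for c in EXPECTED_COLUMNS if c not in df.columns]
    if missing:
        problems.append(f"missing columns: {missing}")
        return problems
    # The description states the index lies between 10 and 100 (inclusive)
    if not df["Performance Index"].between(10, 100).all():
        problems.append("Performance Index outside the 10-100 range")
    # The description states this column holds only Yes or No
    if not df["Extracurricular Activities"].isin(["Yes", "No"]).all():
        problems.append("Extracurricular Activities has values other than Yes/No")
    return problems


# Small stand-in frame built from the first rows displayed by data.head()
sample = pd.DataFrame({
    "Hours Studied": [7, 4, 8],
    "Previous Scores": [99, 82, 51],
    "Extracurricular Activities": ["Yes", "No", "Yes"],
    "Sleep Hours": [9, 4, 7],
    "Sample Question Papers Practiced": [1, 2, 2],
    "Performance Index": [91.0, 65.0, 45.0],
})

problems = check_schema(sample)
print(problems)
```

On the real data the same call would be `check_schema(data)` right after the CSV is read.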

In [None]:
from google.colab import drive
drive.mount('/content/drive')


Mounted at /content/drive


In [None]:
!pip install fastapi uvicorn

Collecting fastapi
  Downloading fastapi-0.111.1-py3-none-any.whl (92 kB)
Collecting uvicorn
  Downloading uvicorn-0.30.1-py3-none-any.whl (62 kB)
Collecting starlette<0.38.0,>=0.37.2 (from fastapi)
  Downloading starlette-0.37.2-py3-none-any.whl (71 kB)
Collecting fastapi-cli>=0.0.2 (from fastapi)
  Downloading fastapi_cli-0.0.4-py3-none-any.whl (9.5 kB)
Collecting httpx>=0.23.0 (from fastapi)
  Downloading httpx-0.27.0-py3-none-any.whl (75 kB)
Collecting python-multipart>=0.0.7 (from fastapi)
  Down

In [None]:
# Optional: the pickle module imported below is part of the Python standard library
!pip install pickle-mixin



In [7]:
# Import libraries
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error
import pickle

In [8]:
# Load the dataset
data = pd.read_csv('Student_Performance.csv')

In [9]:
# display the first few rows of the dataset
data.head()

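|   | Hours Studied | Previous Scores | Extracurricular Activities | Sleep Hours | Sample Question Papers Practiced | Performance Index |
|---|---|---|---|---|---|---|
| 0 | 7 | 99 | Yes | 9 | 1 | 91.0 |
| 1 | 4 | 82 | No | 4 | 2 | 65.0 |
| 2 | 8 | 51 | Yes | 7 | 2 | 45.0 |
| 3 | 5 | 52 | Yes | 5 | 2 | 36.0 |
| 4 | 7 | 75 | No | 8 | 5 | 66.0 |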


In [10]:
# Convert 'Extracurricular Activities' to numerical
data['Extracurricular Activities'] = data['Extracurricular Activities'].apply(lambda x: 1 if x == 'Yes' else 0)
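
Note that this encoding maps every value that is not exactly `'Yes'` to 0, including missing or differently spelled entries. The sketch below contrasts it with a stricter mapping that surfaces such values as `NaN` so they can be counted. The `activities` series is made-up illustration data, not taken from the dataset.

```python
import pandas as pd

# Made-up values, including a lowercase entry and a missing one
activities = pd.Series(["Yes", "No", "Yes", "yes", None],
                       name="Extracurricular Activities")

# Same rule as the notebook cell: anything other than 'Yes' becomes 0
lenient = activities.apply(lambda x: 1 if x == "Yes" else 0)

# Stricter alternative: values missing from the mapping become NaN
strict = activities.map({"Yes": 1, "No": 0})
unexpected = int(strict.isna().sum())

print(lenient.tolist())  # [1, 0, 1, 0, 0]
print(unexpected)        # 2
```

If `unexpected` is zero on the real column, the two approaches give the same result.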

In [11]:
# Define features and target variable
X = data[['Hours Studied', 'Previous Scores', 'Extracurricular Activities', 'Sleep Hours', 'Sample Question Papers Practiced']]
y = data['Performance Index']

In [12]:
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
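
With `test_size=0.2` and the 10,000 records mentioned in the dataset description, the split should yield 8,000 training rows and 2,000 test rows, and fixing `random_state` makes it reproducible. A minimal sketch that illustrates both points, using an index array as a stand-in for the real rows:

```python
import numpy as np
from sklearn.model_selection import train_test_split

n_records = 10_000              # record count stated in the dataset description
indices = np.arange(n_records)  # stand-in for the actual rows

train_idx, test_idx = train_test_split(indices, test_size=0.2, random_state=42)
train_again, test_again = train_test_split(indices, test_size=0.2, random_state=42)

print(len(train_idx), len(test_idx))         # 8000 2000
print(np.array_equal(test_idx, test_again))  # True: same seed, same split
```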

In [13]:
# Train the model
lin_model = LinearRegression()
lin_model.fit(X_train, y_train)
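
After fitting, a `LinearRegression` model exposes one coefficient per feature in `coef_` and the constant term in `intercept_`. The sketch below shows how these can be inspected. It uses a small synthetic frame with a made-up, noise-free relationship (the coefficients 3.0, 1.0 and the offset -30.0 are assumptions for illustration), so the values say nothing about the real dataset.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Synthetic features that borrow two of the notebook's column names
X_demo = pd.DataFrame({
    "Hours Studied": rng.integers(1, 10, size=200),
    "Previous Scores": rng.integers(40, 100, size=200),
})
# Made-up linear target without noise, so the fit can recover it exactly
y_demo = 3.0 * X_demo["Hours Studied"] + 1.0 * X_demo["Previous Scores"] - 30.0

demo_model = LinearRegression().fit(X_demo, y_demo)

# Pair each coefficient with its feature name for readability
coefficients = pd.Series(demo_model.coef_, index=X_demo.columns)
print(coefficients.round(3))
print(round(float(demo_model.intercept_), 3))
```

For the model trained in this notebook, the equivalent would be `pd.Series(lin_model.coef_, index=X.columns)`.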

In [14]:
# Evaluate the model
y_pred = lin_model.predict(X_test)
print("Mean Absolute Error:", mean_absolute_error(y_test, y_pred))
print("Mean Squared Error:", mean_squared_error(y_test, y_pred))
print("Root Mean Squared Error:", np.sqrt(mean_squared_error(y_test, y_pred)))
# score() returns the coefficient of determination (R^2); shown here as a percentage
print("Model Score:", lin_model.score(X_test, y_test) * 100)

Mean Absolute Error: 1.6111213463123044
Mean Squared Error: 4.082628398521853
Root Mean Squared Error: 2.0205515085050054
Model Score: 98.89832909573146
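
For reference, the four metrics can be computed directly from the residuals. The following sketch does so with NumPy and compares the results with scikit-learn. The `y_true` values are the first five Performance Index entries shown by `data.head()`, while `y_hat` is made up for illustration and does not come from the model above.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = np.array([91.0, 65.0, 45.0, 36.0, 66.0])  # from data.head()
y_hat = np.array([90.0, 67.0, 44.0, 38.0, 66.0])   # made-up predictions

errors = y_true - y_hat
mae = np.mean(np.abs(errors))  # mean absolute error
mse = np.mean(errors ** 2)     # mean squared error
rmse = np.sqrt(mse)            # same units as the Performance Index
# R^2 = 1 - (residual sum of squares) / (total sum of squares)
r2 = 1.0 - np.sum(errors ** 2) / np.sum((y_true - y_true.mean()) ** 2)

print(np.isclose(mae, mean_absolute_error(y_true, y_hat)))
print(np.isclose(mse, mean_squared_error(y_true, y_hat)))
print(np.isclose(r2, r2_score(y_true, y_hat)))
```

Because RMSE is expressed in the same units as the target, the reported value of about 2.02 means typical prediction errors of roughly two index points.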


In [15]:
# Save the model to a file
with open('lin_model.pkl', 'wb') as file:
    pickle.dump(lin_model, file)
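
A saved model can later be restored with `pickle.load`. The sketch below shows a full round trip on a tiny throwaway model and a temporary file (the data and the file name `demo_model.pkl` are assumptions for illustration), and checks that the restored model predicts the same values as the original.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.linear_model import LinearRegression

# Tiny made-up training set: y = 2 * x
X_small = np.array([[1.0], [2.0], [3.0], [4.0]])
y_small = np.array([2.0, 4.0, 6.0, 8.0])
original = LinearRegression().fit(X_small, y_small)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo_model.pkl")
    # Save the fitted model
    with open(path, "wb") as file:
        pickle.dump(original, file)
    # Load it back, as a separate script or service would
    with open(path, "rb") as file:
        restored = pickle.load(file)

same_predictions = np.allclose(original.predict(X_small),
                               restored.predict(X_small))
print(same_predictions)
```

For the model saved above, loading would use `open('lin_model.pkl', 'rb')` instead. Pickle files should only be loaded from trusted sources, and ideally with the same scikit-learn version that produced them.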