
# Predicting Student Grades Using Linear Regression

This notebook demonstrates how to build and evaluate a **linear regression model** using student performance data.
The objective is to predict a student’s final grade (`G3`) based on several features such as previous grades, study time, and absences.

---
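**Steps covered in this notebook** (after importing the required libraries):
1. Load and explore the dataset.
2. Select features and preprocess the data.
3. Split the data into training and test sets.
4. Train the model and score it on the test set.
5. Inspect the fitted coefficients and intercept.
6. Visualize the results.
7. Save and reload the model using Pickle.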


In [None]:

# Import necessary libraries
import pandas as pd
import numpy as np
import sklearn
# Load model_selection explicitly; `import sklearn` alone does not import submodules
from sklearn import linear_model, model_selection
import matplotlib.pyplot as plt
import pickle
from matplotlib import style



## Step 1: Load and Explore the Dataset

We will use the `student-mat.csv` dataset, which contains information on students' academic performance. The file is semicolon-delimited, so we pass `sep=';'` to `read_csv`.


In [None]:

# Load the dataset
data = pd.read_csv('student-mat.csv', sep=';')

# Display the first few rows
data.head()
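Beyond `head()`, it can help to check the table's shape, column types, and missing values before selecting features. The sketch below runs on a made-up three-row excerpt in the same semicolon-delimited layout (the values are illustrative, not real rows from `student-mat.csv`), and also shows what happens when `sep=';'` is omitted.

```python
import io

import pandas as pd

# Hypothetical excerpt: illustrative values only, not real rows from student-mat.csv
sample_csv = (
    "G1;G2;G3;studytime;failures;absences\n"
    "5;6;6;2;0;6\n"
    "7;8;10;2;3;10\n"
    "15;14;15;3;0;2\n"
)

sample = pd.read_csv(io.StringIO(sample_csv), sep=';')
print(sample.shape)         # (rows, columns)
print(sample.dtypes)        # data type of each column
print(sample.isna().sum())  # number of missing values per column

# Without sep=';' pandas splits on commas, so each row lands in a single column
wrong = pd.read_csv(io.StringIO(sample_csv))
print(wrong.shape)
```

The same three checks can be run on `data` directly once the real file is loaded.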



## Step 2: Feature Selection and Preprocessing

We select relevant features (`G1`, `G2`, `studytime`, `failures`, `absences`) to predict the final grade (`G3`).


In [None]:

# Select relevant columns
data = data[['G1', 'G2', 'G3', 'studytime', 'failures', 'absences']]

# Define the target variable
predict = 'G3'

# Split data into features (X) and target (y)
X = np.array(data.drop([predict], axis=1))
y = np.array(data[predict])
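A quick sanity check after splitting into features and target is to confirm the array shapes and the order of the feature columns, since the coefficients printed later follow that order. The sketch below uses a small made-up DataFrame with the same six columns; `demo`, `feature_names`, `X_demo`, and `y_demo` are names introduced here for illustration.

```python
import pandas as pd

# Made-up rows with the same columns as the notebook's selection (illustrative values)
demo = pd.DataFrame({
    'G1': [5, 7, 15, 6],
    'G2': [6, 8, 14, 10],
    'G3': [6, 10, 15, 10],
    'studytime': [2, 2, 3, 2],
    'failures': [0, 3, 0, 0],
    'absences': [6, 10, 2, 4],
})

target = 'G3'
features = demo.drop(columns=[target])
feature_names = list(features.columns)  # column order of the feature matrix

X_demo = features.to_numpy()
y_demo = demo[target].to_numpy()

print(feature_names)
print(X_demo.shape, y_demo.shape)  # (n_samples, n_features) and (n_samples,)
```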



## Step 3: Train/Test Split

We split the dataset into training and testing sets with a test size of 10%.


In [None]:

# Split the data into training and test sets
x_train, x_test, y_train, y_test = sklearn.model_selection.train_test_split(X, y, test_size=0.1)
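Because `train_test_split` shuffles the rows at random, the test score changes from run to run. If reproducible results are wanted, one option is to pass a fixed `random_state`. The sketch below demonstrates this on synthetic stand-in data (not the student dataset); the seed value 42 is an arbitrary choice.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 20 samples, 5 features (not the student dataset)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 5))
y_demo = rng.normal(size=20)

# Two calls with the same seed produce the same partition
a_train, a_test, _, _ = train_test_split(X_demo, y_demo, test_size=0.1, random_state=42)
b_train, b_test, _, _ = train_test_split(X_demo, y_demo, test_size=0.1, random_state=42)

print(a_train.shape, a_test.shape)     # 90% / 10% of the rows
print(np.array_equal(a_test, b_test))  # identical test rows across calls
```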



## Step 4: Model Training

We train a **linear regression model** on the training data.


In [None]:

# Create and train the linear regression model
model = linear_model.LinearRegression()
model.fit(x_train, y_train)

# Print the R^2 score on the test set
# (for regressors, score() returns the coefficient of determination, not classification accuracy)
r2 = model.score(x_test, y_test)
print(f'R^2 score: {r2:.2f}')
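For a `LinearRegression` model, `score()` returns the coefficient of determination (R²) rather than a classification accuracy. The sketch below, on synthetic data with a made-up linear relationship, recomputes R² by hand and adds two error metrics expressed in the target's own units. The variable names and the 80/20 slice are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Synthetic data: a known linear relationship plus noise (not the student dataset)
rng = np.random.default_rng(0)
X_demo = rng.uniform(0, 20, size=(100, 2))
y_demo = 0.3 * X_demo[:, 0] + 0.7 * X_demo[:, 1] + rng.normal(scale=1.0, size=100)

X_fit, X_eval = X_demo[:80], X_demo[80:]
y_fit, y_eval = y_demo[:80], y_demo[80:]

demo_model = LinearRegression().fit(X_fit, y_fit)
y_pred = demo_model.predict(X_eval)

# R^2 = 1 - SS_res / SS_tot, the quantity score() reports for regressors
ss_res = np.sum((y_eval - y_pred) ** 2)
ss_tot = np.sum((y_eval - y_eval.mean()) ** 2)
r2_manual = 1 - ss_res / ss_tot

mae = mean_absolute_error(y_eval, y_pred)           # average absolute error
rmse = np.sqrt(mean_squared_error(y_eval, y_pred))  # penalizes large errors more

print(f'R^2 (manual): {r2_manual:.3f}')
print(f'R^2 (score):  {demo_model.score(X_eval, y_eval):.3f}')
print(f'MAE: {mae:.3f}  RMSE: {rmse:.3f}')
```

On the student data, MAE and RMSE would be measured in grade points, which is often easier to interpret than R² alone.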



## Step 5: Model Evaluation

We inspect the fitted model by printing its coefficients and intercept. There is one coefficient per feature, in the column order `G1`, `G2`, `studytime`, `failures`, `absences`.


In [None]:

# Print the model's coefficients and intercept
print('Coefficients:', model.coef_)
print('Intercept:', model.intercept_)
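The raw coefficient array is easier to read when each value is paired with its feature name. The sketch below does this on synthetic stand-in data (the generating weights are made up) and checks that a prediction is the intercept plus the weighted sum of the features.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Feature order after dropping G3 from the selected columns
feature_names = ['G1', 'G2', 'studytime', 'failures', 'absences']

# Synthetic stand-in for the training data; the weights below are invented
rng = np.random.default_rng(1)
X_demo = rng.uniform(0, 20, size=(50, len(feature_names)))
true_weights = np.array([0.2, 0.8, 0.0, -0.5, 0.0])
y_demo = X_demo @ true_weights + rng.normal(scale=0.5, size=50)

demo_model = LinearRegression().fit(X_demo, y_demo)

# Label each coefficient with the feature it belongs to
for name, coef in zip(feature_names, demo_model.coef_):
    print(f'{name:>10}: {coef:+.3f}')

# prediction = intercept + sum(coefficient * feature value)
manual = demo_model.intercept_ + X_demo @ demo_model.coef_
print(np.allclose(manual, demo_model.predict(X_demo)))
```

Coefficients are per unit of each feature, so their magnitudes are only directly comparable when the features share a scale.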



## Step 6: Visualization

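We plot the predicted grades against the actual grades for the test set. The closer a point lies to the line where predicted equals actual, the more accurate that prediction.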


In [None]:

# Plot predicted vs actual values
style.use('ggplot')
plt.scatter(y_test, model.predict(x_test))
plt.xlabel('Actual Grades (G3)')
plt.ylabel('Predicted Grades')
plt.show()
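A `y = x` reference line makes the scatter plot easier to judge, since points above it are over-predictions and points below it are under-predictions. The helper below, `plot_predicted_vs_actual`, is a hypothetical function introduced for illustration, and the values passed to it are made up.

```python
import matplotlib.pyplot as plt
import numpy as np


def plot_predicted_vs_actual(y_true, y_pred):
    """Scatter predictions against actual values with a y = x reference line."""
    fig, ax = plt.subplots()
    ax.scatter(y_true, y_pred, alpha=0.7)
    # Span the reference line across the full range of both arrays
    lo = min(np.min(y_true), np.min(y_pred))
    hi = max(np.max(y_true), np.max(y_pred))
    ax.plot([lo, hi], [lo, hi], linestyle='--', color='gray')
    ax.set_xlabel('Actual Grades (G3)')
    ax.set_ylabel('Predicted Grades')
    return ax


# Made-up values for illustration only
ax = plot_predicted_vs_actual(
    np.array([6, 10, 15, 10]),
    np.array([5.5, 9.0, 15.2, 11.1]),
)
```

In the notebook, the same helper could be called with `y_test` and `model.predict(x_test)`; outside a notebook, call `plt.show()` afterwards to display the figure.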



## Step 7: Saving and Reloading the Model

We can use Python's **Pickle** module (`pickle`) to save the trained model to disk and reload it later without retraining.


In [None]:

# Save the model
with open('student_model.pickle', 'wb') as f:
    pickle.dump(model, f)

# Load the model
with open('student_model.pickle', 'rb') as f:
    loaded_model = pickle.load(f)

# Verify that the loaded model gives the same R^2 score on the test set
print(f'Loaded Model R^2 score: {loaded_model.score(x_test, y_test):.2f}')
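A stricter check than comparing scores is to confirm that the reloaded model produces the same predictions as the original. The sketch below does a round trip through a temporary file on synthetic data; the file name `demo_model.pickle` and the variable names are assumptions for illustration.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-in data (not the student dataset)
rng = np.random.default_rng(2)
X_demo = rng.uniform(0, 20, size=(30, 3))
y_demo = X_demo.sum(axis=1) + rng.normal(scale=0.5, size=30)
demo_model = LinearRegression().fit(X_demo, y_demo)

# Save to a temporary directory so the example leaves no files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'demo_model.pickle')
    with open(path, 'wb') as f:
        pickle.dump(demo_model, f)
    with open(path, 'rb') as f:
        restored = pickle.load(f)

# The restored model should reproduce the original predictions
same_predictions = np.allclose(demo_model.predict(X_demo), restored.predict(X_demo))
print(same_predictions)
```

Two caveats apply to pickled models in general: unpickling can execute arbitrary code, so only load files from sources you trust, and a model pickled under one scikit-learn version is not guaranteed to load correctly under another.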
