# Student Grade Predictor using Linear Regression

This project uses Linear Regression to predict student performance. The goal is to forecast students' final grades from their earlier grades and other related factors.

## Overview

The project uses the "student-mat.csv" dataset from the [UCI Machine Learning Repository](https://archive.ics.uci.edu/ml/datasets/Student+Performance), which describes student performance in mathematics. Its features include first-period grade, second-period grade, weekly study time, school, family size, parents' occupations, and more.
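
The file is semicolon-separated, which is why the script later passes `sep=';'` to pandas. The sketch below illustrates this on a hypothetical three-row excerpt; the rows are made up for illustration and only a few of the real columns are shown.

```python
import io

import pandas as pd

# Hypothetical excerpt in the same semicolon-separated layout as student-mat.csv.
# The values are invented; the real file has many more columns and rows.
sample_csv = io.StringIO(
    "school;sex;age;studytime;G1;G2;G3\n"
    "GP;F;18;2;5;6;6\n"
    "GP;M;17;3;14;15;15\n"
    "MS;F;16;1;10;9;10\n"
)

# Without sep=";" pandas would read each line as a single column.
sample = pd.read_csv(sample_csv, sep=";")

print(sample.shape)
print(sample.dtypes)
```

Text columns such as `school` and `sex` load as strings, which is why they need encoding before a regression model can use them.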

## Steps Performed by the Code

The Student Grade Predictor trains a Linear Regression model on every column of the dataset except the final grade (G3), which is the target. An interactive Graphical User Interface (GUI) then lets the user enter three values for a new student: first-period grade (G1), second-period grade (G2), and weekly study time. The remaining features are filled with default values before the model predicts the final grade.

Categorical variables are converted with one-hot encoding before training. The script expects the dataset at `data/student-mat.csv`, relative to the working directory.
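
As a minimal sketch of what one-hot encoding does, the example below applies `pd.get_dummies` with `drop_first=True` to a toy frame. The column names mirror the dataset, but the rows are hypothetical.

```python
import pandas as pd

# Toy frame with two categorical columns and one numeric column (made-up rows).
toy = pd.DataFrame({
    "school": ["GP", "MS", "GP"],
    "Mjob": ["teacher", "health", "other"],
    "G1": [12, 9, 15],
})

# Each categorical column becomes one indicator column per level;
# drop_first=True omits the first (alphabetical) level of each column.
encoded = pd.get_dummies(toy, columns=["school", "Mjob"], drop_first=True)

print(list(encoded.columns))
# ['G1', 'school_MS', 'Mjob_other', 'Mjob_teacher']
```

Dropping one level per column avoids indicator columns that sum to a constant, which would be perfectly collinear with the model's intercept.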

1. **Data Loading:** The code reads the "student-mat.csv" file, which contains the student performance data, using the pandas library. The data is loaded into a DataFrame for further processing.

2. **Data Preprocessing:** The code converts the categorical variables into numerical form using one-hot encoding (`pd.get_dummies` with `drop_first=True`). This transformation is necessary because most machine learning algorithms, including Linear Regression, require numerical inputs. The code does not impute missing values; it assumes the dataset is complete.

3. **Data Splitting:** The data is split into training and testing sets using the `train_test_split()` function from sklearn. This ensures that the model is trained on a subset of the data and evaluated on unseen data to assess its generalization performance.

4. **Model Training:** The Linear Regression model from sklearn is created and trained on the training data using the `fit()` method. The model aims to learn the relationships between the features and the target variable (final grade).

5. **Model Evaluation:** After training, the model's performance is evaluated using the test data. Two common evaluation metrics used are Mean Squared Error (MSE) and R-squared (R2). MSE measures the average squared difference between the predicted and actual grades, while R2 indicates how well the model explains the variance in the target variable.

6. **Example Prediction with GUI:** The code opens a Tkinter window in which the user enters the first-period grade, second-period grade, and weekly study time of a new student. In the dataset, `studytime` is a code from 1 to 4 rather than a number of hours. The model predicts the final grade (G3) from these inputs and displays it in the window.
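
Steps 3 to 5 can be sketched end to end on synthetic data. The data below is a made-up stand-in (two "period grades" and a noisy "final grade"), not the UCI dataset, and the manual formulas are included only to show what the two metrics compute.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in: two grades in 0..20 and a final grade that depends on them.
X = rng.integers(0, 21, size=(200, 2)).astype(float)
y = 0.2 * X[:, 0] + 0.8 * X[:, 1] + rng.normal(0, 1.0, size=200)

# Hold out 20% of the rows for evaluation, as the script does.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

# The same metrics written out from their definitions.
residual_ss = np.sum((y_test - y_pred) ** 2)
total_ss = np.sum((y_test - y_test.mean()) ** 2)
mse_manual = residual_ss / len(y_test)
r2_manual = 1 - residual_ss / total_ss

print(f"MSE: {mse:.3f} (manual: {mse_manual:.3f})")
print(f"R2:  {r2:.3f} (manual: {r2_manual:.3f})")
```

A lower MSE means smaller average squared error, and an R2 close to 1 means the model explains most of the variance in the held-out grades.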

---

## Install the required packages

These packages cover data handling, machine learning, and the desktop GUI. Tkinter opens its own window, so the notebook must run on a machine with a display.

- **pandas**: data manipulation and analysis
- **NumPy**: numerical computations
- **scikit-learn**: machine learning library
- **Tkinter**: GUI toolkit (part of the Python standard library)
- **ttkthemes**: theming extension for Tkinter

In [1]:
# Install the required packages.
# tkinter ships with standard Python installers, so it is not installed via pip;
# the PyPI package named "tk" is an unrelated project.
!pip install pandas numpy scikit-learn ttkthemes
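
After installation, a quick check like the one below can confirm that each module the script imports can be located. `find_missing` is a hypothetical helper written for this sketch; it only checks that a module can be found, not that a display is available for Tkinter.

```python
import importlib.util


def find_missing(modules):
    """Return the names from `modules` that cannot be located for import."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


# Import names used by the predictor script (scikit-learn imports as "sklearn").
required = ["pandas", "numpy", "sklearn", "tkinter", "ttkthemes"]
missing = find_missing(required)

print("Missing modules:", missing if missing else "none")
```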


## Student Grade Predictor Code

The "Student Grade Predictor" code is a Python script that uses machine learning to predict a student's final grade based on their academic performance and study time. It features an interactive GUI for easy input and visualization of the predicted grade.

In [1]:
import pandas as pd
import numpy as np
import tkinter as tk
from tkinter import ttk
from tkinter import messagebox
from sklearn import linear_model
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score
from ttkthemes import ThemedStyle

# Load the dataset (expected under data/, relative to the working directory)
file_path = "data/student-mat.csv"
data = pd.read_csv(file_path, sep=';')

# Data preprocessing: one-hot encode the categorical columns.
# drop_first=True drops one level per column to avoid redundant (collinear) dummies.
data = pd.get_dummies(data, columns=['school', 'sex', 'address', 'famsize', 'Pstatus',
                                     'Mjob', 'Fjob', 'reason', 'guardian', 'schoolsup',
                                     'famsup', 'paid', 'activities', 'nursery', 'higher',
                                     'internet', 'romantic'], drop_first=True)

# Select features and target variable
features = data.drop(columns=['G3'])  # Features: all columns except 'G3' (final grade)
target = data['G3']  # Target variable: 'G3' (final grade)

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(features, target, test_size=0.2, random_state=42)

# Create and train the Linear Regression model
model = linear_model.LinearRegression()
model.fit(X_train, y_train)

# Make predictions on the test set
y_pred = model.predict(X_test)

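# Evaluate the model on the held-out test set and report the metrics
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"Mean Squared Error: {mse:.2f}")
print(f"R-squared: {r2:.2f}")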

# Function to perform the prediction and display the result on the GUI
def predict_grade():
    def predict():
        try:
            G1_input = int(g1_entry.get())
            G2_input = int(g2_entry.get())
            study_time_input = int(studytime_entry.get())

            # The model was trained on every feature column, but the GUI collects
            # only three. Start from the training-set mean of each feature so the
            # other columns keep typical values, then overwrite the three inputs.
            new_student_features = X_train.astype(float).mean().to_frame().T.assign(
                G1=G1_input,                  # First-period grade
                G2=G2_input,                  # Second-period grade
                studytime=study_time_input,   # Weekly study time (dataset code 1-4)
            )

            predicted_grade = model.predict(new_student_features)

            print("Predicted Final Grade for the New Student:", predicted_grade[0])
            predicted_label.config(text=f"Predicted Final Grade: {predicted_grade[0]:.2f}")
        except ValueError:
            print("Error: Please enter valid numeric values for G1, G2, and study time.")
            predicted_label.config(text="Please enter valid numeric values for G1, G2, and study time.")

    # Create the tkinter window
    window = tk.Tk()
    window.title("Student Grade Predictor")

    # Set a fixed window size
    window.geometry("400x300")  # Adjust the size as needed

    # Apply a themed style to the window
    style = ThemedStyle(window)
    style.theme_use("arc")  # You can change the theme here (try "clam", "equilux", etc.)

    # Create and pack a description label
    description_label = ttk.Label(window, text="Welcome to the Student Grade Predictor!\n"
                                               "Please enter the student's information below:")
    description_label.pack(pady=20)

    # Create and pack input fields
    g1_label = ttk.Label(window, text="G1 (first-period grade):")
    g1_label.pack()
    g1_entry = ttk.Entry(window)
    g1_entry.pack()

    g2_label = ttk.Label(window, text="G2 (second-period grade):")
    g2_label.pack()
    g2_entry = ttk.Entry(window)
    g2_entry.pack()

    studytime_label = ttk.Label(window, text="Weekly study time (code 1-4):")
    studytime_label.pack()
    studytime_entry = ttk.Entry(window)
    studytime_entry.pack()

    # Create and pack the Predict button
    predict_button = ttk.Button(window, text="Predict", command=predict)
    predict_button.pack(pady=20)

    # Create and pack the label to display the predicted grade
    predicted_label = ttk.Label(window, text="", background=style.lookup("TLabel", "background"))
    predicted_label.pack()

    # Start the tkinter main loop
    window.mainloop()

# Call the function to predict grade
predict_grade()


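Running the cell without the dataset at `data/student-mat.csv` stops at `pd.read_csv` with the error below. To fix it, download `student-mat.csv` from the UCI repository and place it in a `data/` folder inside the working directory.

FileNotFoundError: [Errno 2] No such file or directory: 'data/student-mat.csv'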
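
The prediction step can also be exercised without opening a window, which makes it easier to test. The sketch below is one possible approach, not the script's own code: `predict_final_grade` is a hypothetical helper, the training data is synthetic, and features the caller does not supply are assumed to take their training-set mean.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)

# Synthetic stand-in for the encoded training features (made-up values).
train_features = pd.DataFrame({
    "age": rng.integers(15, 20, size=100),
    "studytime": rng.integers(1, 5, size=100),
    "G1": rng.integers(0, 21, size=100),
    "G2": rng.integers(0, 21, size=100),
})
target = (
    0.1 * train_features["G1"]
    + 0.9 * train_features["G2"]
    + rng.normal(0, 0.5, size=100)
)

model = LinearRegression().fit(train_features, target)


def predict_final_grade(model, train_features, g1, g2, studytime):
    """Hypothetical helper: predict G3 from three inputs.

    Every other feature is set to its training-set mean (an assumption of
    this sketch), so the single row has the same columns as the training data.
    """
    row = train_features.astype(float).mean().to_frame().T
    row = row.assign(G1=g1, G2=g2, studytime=studytime)
    return float(model.predict(row)[0])


prediction = predict_final_grade(model, train_features, g1=10, g2=12, studytime=2)
print(f"Predicted final grade: {prediction:.2f}")
```

Keeping the prediction logic in a plain function like this lets the GUI callback stay small: it only needs to parse the entry fields and display the returned value.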