🎓 Student Performance Prediction

A machine learning project to predict student academic performance and identify key factors influencing success, helping educational institutions provide proactive support.

🚀 Project Overview

This project addresses the challenge of identifying at-risk students by building predictive models based on demographic, social, and academic data. The goal is to provide educational institutions with a data-driven tool for early intervention. The project involves two main tasks:

- Regression: Predicting a student's final numeric grade.
- Classification: Predicting whether a student will pass or fail.

✨ Key Features

- Data Exploration (EDA): In-depth analysis of student data to uncover initial trends and correlations.
- Data Preprocessing: A complete pipeline for cleaning data, encoding categorical variables, and scaling features.
- Dual-Task Modeling: Implements both regression and classification models to provide a comprehensive performance analysis.
- Performance Evaluation: Uses a wide range of metrics (R², RMSE, Accuracy, Precision, Recall, F1-Score) for robust model assessment.
- Feature Importance Analysis: Identifies the key drivers of academic success from the dataset.

⚙️ Methodology

The project follows a standard machine learning workflow:

Data Understanding: The UCI Student Performance Dataset was used. Initial analysis was performed to understand its structure, quality, and statistical properties.
Exploratory Data Analysis (EDA): Visualizations such as histograms and a correlation heatmap were created to identify relationships between variables, especially their impact on the final grade (G3).
Data Preprocessing:
- A binary pass_fail target column was engineered from the G3 grade.
- Categorical features were converted to a numerical format using one-hot encoding.
- All features were scaled using StandardScaler to prepare the data for modeling.
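
The three preprocessing steps above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the six-row DataFrame is made up, the pass mark of 10 is an assumption (the cutoff is not stated here), and the scaler is fitted on all rows only for brevity.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Tiny synthetic stand-in for the dataset. Column names follow the UCI files,
# but the values are invented for illustration.
df = pd.DataFrame({
    "school": ["GP", "MS", "GP", "MS", "GP", "GP"],
    "sex": ["F", "M", "F", "M", "M", "F"],
    "absences": [0, 4, 10, 2, 6, 1],
    "G1": [12, 8, 15, 9, 11, 14],
    "G2": [13, 7, 16, 9, 10, 15],
    "G3": [13, 6, 17, 9, 10, 15],
})

PASS_MARK = 10  # ASSUMPTION: the README does not state the pass/fail cutoff

# Binary target: 1 = pass, 0 = fail.
df["pass_fail"] = (df["G3"] >= PASS_MARK).astype(int)

# Keep both targets out of the feature matrix to avoid leaking the answer.
features = df.drop(columns=["G3", "pass_fail"])

# One-hot encode the categorical columns (listed explicitly here).
categorical_cols = ["school", "sex"]
features = pd.get_dummies(
    features, columns=categorical_cols, drop_first=True, dtype=float
)

# Standardize every column to zero mean and unit variance.
scaler = StandardScaler()
X = pd.DataFrame(scaler.fit_transform(features), columns=features.columns)

print(X.round(2))
```

In a real run it is usually safer to fit the scaler on the training split only and reuse it on the test split, so that test-set statistics do not influence training.
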
Model Building:
- Regression Task: A Multiple Linear Regression model was trained to predict the final grade.
- Classification Task: Logistic Regression and Decision Tree models were trained to predict the pass/fail outcome.
Model Evaluation: The models were evaluated on an unseen test set (20% of the data) to measure their real-world performance.
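
The model building and evaluation steps could look roughly like the sketch below. It runs on synthetic data rather than the real dataset, so its scores say nothing about the results reported in this README. The pass mark of 10, the `random_state` values, and the tree depth are assumptions, and wrapping the scaler in a pipeline (so it is fitted on the training split only) is a choice made for this sketch.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (NOT the real dataset): three numeric features and
# a 0-20 grade that depends mostly on the first feature.
rng = np.random.default_rng(0)
n = 400
X = rng.normal(size=(n, 3))
g3 = np.clip(
    np.round(10 + 3 * X[:, 0] + X[:, 1] + rng.normal(scale=1.5, size=n)), 0, 20
)
pass_fail = (g3 >= 10).astype(int)  # ASSUMPTION: pass mark of 10

# Hold out 20% of the rows as the unseen test set.
X_train, X_test, g3_train, g3_test, y_train, y_test = train_test_split(
    X, g3, pass_fail, test_size=0.2, random_state=42, stratify=pass_fail
)

# Regression task: predict the numeric final grade.
reg = make_pipeline(StandardScaler(), LinearRegression())
reg.fit(X_train, g3_train)
g3_pred = reg.predict(X_test)
r2 = r2_score(g3_test, g3_pred)
rmse = float(np.sqrt(mean_squared_error(g3_test, g3_pred)))
print(f"Linear Regression: R2={r2:.2f}, RMSE={rmse:.2f}")

# Classification task: predict pass (1) or fail (0).
classifiers = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Decision Tree": DecisionTreeClassifier(max_depth=4, random_state=42),
}
scores = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    scores[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
    }
    print(f"{name}: accuracy={scores[name]['accuracy']:.2f}, "
          f"F1={scores[name]['f1']:.2f}")
```

Stratifying the split on the pass/fail label keeps the class balance similar in the training and test sets, which makes accuracy and F1 easier to compare across runs.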

📊 Results & Key Insights

On the held-out test set, the models achieved the following scores:

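| Model / Task | Metric | Score |
| --- | --- | --- |
| Linear Regression | R² Score | 0.72 |
| Classification Models (Logistic & Decision Tree) | Accuracy | 90% |
| Classification Models (Logistic & Decision Tree) | F1-Score | 0.92 |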

Key Insight: The feature importance analysis showed that a student's second-period grade (G2) is by far the strongest predictor of the final academic outcome, accounting for over 70% of the model's total feature importance. Other important factors include the mother's education level (Medu), student absences, and social habits.
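
One common way to obtain importance shares like the one quoted above is the `feature_importances_` attribute of a fitted scikit-learn decision tree, whose values sum to 1. Whether the notebook computed its figure this way is an assumption. The sketch below uses synthetic data in which the label is deliberately driven by G2, so it only shows the mechanics and does not reproduce the reported number.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (NOT the real dataset). Column names mirror the
# UCI files; the values are random.
rng = np.random.default_rng(1)
n = 300
data = pd.DataFrame({
    "G2": rng.integers(0, 21, size=n),
    "Medu": rng.integers(0, 5, size=n),
    "absences": rng.integers(0, 30, size=n),
})

# Label built mostly from G2 plus a little noise, purely for illustration.
noise = rng.normal(scale=1.0, size=n)
y = ((data["G2"] + noise) >= 10).astype(int)

tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(data, y)

# Impurity-based importances: one share per feature, summing to 1.
importances = pd.Series(
    tree.feature_importances_, index=data.columns
).sort_values(ascending=False)
print(importances)
```

Impurity-based importances can overstate features with many distinct values, so a ranking like this is best read as a rough guide rather than a causal statement.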

🔧 How to Run This Project

To run this project on your local machine, follow these steps:

Clone the repository:

git clone https://github.com/YourUsername/YourRepositoryName.git
cd YourRepositoryName

Create a virtual environment (recommended):

python -m venv venv
source venv/bin/activate  # On Windows, use `venv\Scripts\activate`

Install the required libraries: Create a requirements.txt file with the following content:
```
pandas
numpy
scikit-learn
matplotlib
seaborn
jupyter
```
Then, run the installation command:
```
pip install -r requirements.txt
```
Download the Dataset:
- Download the student-mat.csv file from the UCI repository.
- Place the student-mat.csv file in the root directory of the project.
Launch Jupyter Notebook:
```
jupyter notebook
```
Open the .ipynb notebook file and run the cells.
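
If you prefer to inspect the data outside the notebook first, a small loader along these lines may help. The function name `load_students` is hypothetical. The `sep=";"` argument reflects the semicolon-separated format of the UCI distribution of this file; if your copy has been re-exported with commas, drop that argument.

```python
from pathlib import Path

import pandas as pd


def load_students(path="student-mat.csv"):
    """Load the student CSV, assuming the UCI semicolon-separated format."""
    return pd.read_csv(Path(path), sep=";")
```

Calling `load_students()` from the project root returns a DataFrame, and `load_students().head()` is a quick check that the columns were split correctly.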

🛠️ Technologies Used

| Technology | Description |
| --- | --- |
| Python | Core programming language for the project. |
| Pandas | Data manipulation and analysis library. |
| NumPy | For numerical operations and array handling. |
| Scikit-learn | For building and evaluating machine learning models. |
| Matplotlib & Seaborn | For data visualization and creating plots. |
| Jupyter Notebook / Colab | For interactive development and documentation. |
