Vigneshwaran-tech/Students-Performance-Prediction

Student Performance Prediction Using Machine Learning

A comprehensive Machine Learning project that predicts student academic performance using Linear Regression.


Project Overview

This project uses supervised Machine Learning to predict student final marks based on four key academic indicators:

  • Attendance: Class attendance percentage
  • Study Hours: Daily study time
  • Internal Marks: Continuous assessment scores
  • Assignments: Number of assignments completed

The system helps educators identify struggling students early and provide timely interventions.


Features

  • ML Model using Linear Regression
  • Web Interface (Flask + HTML/CSS/JavaScript)
  • Desktop GUI Application (Tkinter)
  • Advanced Visualizations and Analytics
  • Comprehensive Project Documentation
  • 30+ Viva Interview Questions with Answers

Project Structure

STUDENT PERFORMANCE PREDICTION/
  student_data.csv        # Training dataset
  student_prediction.py   # Core ML model & analysis
  visualizations.py       # Advanced charts & graphs
  app.py                  # Flask web application
  gui.py                  # Desktop GUI application
  templates/
    index.html            # Web interface
  static/
    style.css             # Web styling
    script.js             # Web scripts
  PROJECT_REPORT.md       # Full project documentation
  VIVA_QUESTIONS.md       # Interview Q&A
  README.md               # This file

  Generated files (after running):
    prediction_plot.png
    advanced_analysis.png
    feature_analysis.png

Installation

Prerequisites

  • Python 3.7 or higher
  • pip (Python package manager)

Setup Steps

  1. Clone or download the project, then change into its folder
cd "STUDENT PERFORMANCE PREDICTION"
  2. Install Dependencies
pip install pandas numpy scikit-learn matplotlib flask
  3. Verify Installation
python -c "import pandas, numpy, sklearn, matplotlib, flask; print('All libraries installed!')"
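
The same check can be scripted so that it reports which packages are missing instead of stopping at the first failed import. This is a minimal sketch using only the standard library; the helper name `missing_packages` is made up for this example and is not part of the project.

```python
import importlib.util

# Import names of the packages installed in the step above.
# Note that scikit-learn is imported as "sklearn".
REQUIRED = ["pandas", "numpy", "sklearn", "matplotlib", "flask"]


def missing_packages(names):
    """Return the names that the import system cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All libraries installed!")
```

Any name it reports can be passed to `pip install`, keeping in mind that `sklearn` is installed as `scikit-learn`.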

Usage Guide

Option 1: Command Line Analysis

Run the core ML model and generate basic analysis:

python student_prediction.py

Output:

  • Model evaluation metrics
  • Predictions for sample student
  • Visualization saved as prediction_plot.png

Option 2: Advanced Visualizations

Generate comprehensive charts and analysis:

python visualizations.py

Output:

  • advanced_analysis.png (6 different charts)
  • feature_analysis.png (4 feature impact plots)

Option 3: Web Interface (Recommended)

Launch interactive web application:

python app.py

Then open in browser:

http://localhost:5000

Features:

  • Beautiful responsive interface
  • Real-time predictions
  • Model information display
  • Input validation

Option 4: Desktop GUI

Launch desktop application:

python gui.py

Features:

  • Stand-alone application
  • No web server needed
  • Tkinter-based interface
  • Model statistics display

Model Details

Algorithm: Linear Regression

Why Linear Regression?

  • Simple and interpretable
  • Fast training and predictions
  • Suitable for continuous output (marks)
  • Good baseline for performance comparison

Model Equation

Final Marks = 9.3079 
+ (0.2169 × Attendance)
+ (1.9277 × Study Hours)
+ (0.4169 × Internal Marks)
+ (1.4842 × Assignments)

Performance Metrics

Metric | Value | Interpretation
R² Score | 0.9987 | 99.87% variance explained - EXCELLENT!
MSE | 0.2492 | Mean squared error
RMSE | 0.4992 | Root mean squared error (~0.5 marks)
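
For readers who want to see how these three metrics relate, here is a minimal sketch that computes them from their definitions in plain Python. The function name and the example predictions are invented for illustration; they are not the project's actual test-set values.

```python
import math


def regression_metrics(actual, predicted):
    """Compute R², MSE and RMSE from paired actual and predicted values."""
    n = len(actual)
    mean_actual = sum(actual) / n
    # Residual sum of squares: squared prediction errors.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total sum of squares: squared deviations from the mean of the actual values.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    mse = ss_res / n
    return {"r2": 1 - ss_res / ss_tot, "mse": mse, "rmse": math.sqrt(mse)}


# Actual marks from the sample data, with made-up predictions.
actual = [75, 82, 58, 68, 90]
predicted = [74, 83, 58, 69, 90]
print(regression_metrics(actual, predicted))
```

RMSE is the square root of MSE, which is why the table's 0.2492 and 0.4992 go together: it expresses the typical error in marks rather than in squared marks.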

Feature Importance

Feature | Coefficient | Impact
Internal Marks | 0.8612 | Highest impact (each point +0.86 marks)
Study Hours | 0.4741 | Each hour +0.47 marks
Assignments | 0.2822 | Each assignment +0.28 marks
Attendance | 0.0681 | Each % +0.07 marks

Dataset

Sample Data (student_data.csv)

attendance,study_hours,internal_marks,assignments,final_marks
85,3,70,8,75
90,4,78,9,82
60,2,55,6,58
75,3,65,7,68
95,5,85,10,90
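
To show how this layout maps onto model inputs, the following sketch parses the sample rows with the standard library and separates the four feature columns from the target. The project's scripts may load the file differently (for example with pandas); `load_rows` is a hypothetical helper for this example.

```python
import csv
import io

# The five sample rows shown above, held in memory so the example is self-contained.
SAMPLE_CSV = """attendance,study_hours,internal_marks,assignments,final_marks
85,3,70,8,75
90,4,78,9,82
60,2,55,6,58
75,3,65,7,68
95,5,85,10,90
"""

FEATURE_COLUMNS = ["attendance", "study_hours", "internal_marks", "assignments"]


def load_rows(text):
    """Parse CSV text into a list of dicts with numeric values."""
    reader = csv.DictReader(io.StringIO(text))
    return [{key: float(value) for key, value in row.items()} for row in reader]


rows = load_rows(SAMPLE_CSV)
features = [[row[name] for name in FEATURE_COLUMNS] for row in rows]  # model inputs
targets = [row["final_marks"] for row in rows]                        # values to predict
print(features[0], targets[0])
```

To read the real file instead, pass `open("student_data.csv", newline="").read()` to `load_rows`.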

Data Specifications

  • Records: 60 students (expanded from 10)
  • Features: 4 input variables
  • Target: 1 output variable (final marks)
  • Range: Marks 0-100
  • Training: 48 students (80%)
  • Testing: 12 students (20%)
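
The 80/20 split above can be sketched as a seeded shuffle of row indices. This is an illustration only: the function name and the seed are assumptions, and scikit-learn's `train_test_split` is the usual way to do this in practice.

```python
import random


def train_test_split_indices(n_rows, test_fraction=0.2, seed=42):
    """Shuffle row indices reproducibly and hold out a fraction for testing."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # fixed seed gives the same split every run
    n_test = round(n_rows * test_fraction)
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = train_test_split_indices(60)
print(len(train_idx), len(test_idx))  # prints: 48 12
```

Shuffling before splitting matters when the CSV is ordered in some way (for example by marks), since an unshuffled split would then test on a different kind of student than it trained on.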

Example Predictions

Strong Student

Input:

  • Attendance: 95%
  • Study Hours: 5
  • Internal Marks: 85
  • Assignments: 10

Output:

  • Predicted Marks: 89.2
  • Grade: A+
  • Result: PASS

Weak Student

Input:

  • Attendance: 50%
  • Study Hours: 1
  • Internal Marks: 40
  • Assignments: 4

Output:

  • Predicted Marks: 45.8
  • Grade: F
  • Result: FAIL

Documentation

Project Report

Comprehensive 12-section report including:

  • Executive summary
  • Problem statement
  • Methodology
  • Results & evaluation
  • System architecture
  • Implementation details
  • Advantages & limitations
  • Future enhancements

File: PROJECT_REPORT.md

Viva Interview Questions

30 frequently asked interview questions with detailed answers:

  • Basic ML concepts
  • Project-specific questions
  • Technical implementation questions
  • Real-world impact questions
  • Advanced follow-ups

File: VIVA_QUESTIONS.md


API Reference

Web Interface Endpoints

GET /

  • Returns main prediction interface

POST /predict

  • Input: JSON with student data
  • Output: Predicted marks, grade, result
  • Example:
{
"attendance": 85,
"study_hours": 4,
"internal_marks": 75,
"assignments": 8
}
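
A client for this endpoint might look like the sketch below, which builds the POST request with the standard library. It assumes the app is reachable at http://localhost:5000 as described in the usage guide and that the response body is JSON; the response field names are not documented here, so the example does not rely on them. `build_predict_request` is a name chosen for this example.

```python
import json
import urllib.request


def build_predict_request(student, base_url="http://localhost:5000"):
    """Build a JSON POST request for the /predict endpoint."""
    body = json.dumps(student).encode("utf-8")
    return urllib.request.Request(
        base_url + "/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


student = {
    "attendance": 85,
    "study_hours": 4,
    "internal_marks": 75,
    "assignments": 8,
}
request = build_predict_request(student)

# With the app running (python app.py), send the request like this.
# Decoding the body as JSON is an assumption about the response format.
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```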

GET /model-info

  • Returns model coefficients and performance metrics

Advanced Features

Visualizations Included

  1. Actual vs Predicted - Scatter plot with perfect prediction line
  2. Residual Plot - Shows prediction errors
  3. Feature Importance - Bar chart of feature coefficients
  4. Distribution Analysis - Histogram of marks
  5. Model Metrics - R², MSE, RMSE display
  6. Error Distribution - Bar chart of errors
  7. Feature Impact Analysis - 4 individual feature plots

Performance Considerations

Model Strengths

  • Accurate predictions with consistent data
  • Fast inference time
  • Interpretable coefficients
  • Low overfitting risk: only four coefficients and an intercept are fitted

Model Limitations

  • Small training dataset (48 samples; 60 records in total)
  • Linear assumptions only
  • Limited feature set
  • May not capture non-linear relationships

Improvement Opportunities

  • Expand dataset to 100+ students
  • Add more features (behavior, engagement, etc.)
  • Try ensemble methods (Random Forest, XGBoost)
  • Implement cross-validation
  • Use hyperparameter tuning
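
As an illustration of the cross-validation item, the sketch below generates k-fold train/test index splits in plain Python. It is not part of the project; the function name is hypothetical, and scikit-learn's `KFold` and `cross_val_score` provide the same idea ready-made.

```python
def k_fold_indices(n_rows, k=5):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_rows))
        yield train, test
        start += size


# With 60 records and 5 folds, every record is tested exactly once.
for train, test in k_fold_indices(60, k=5):
    print(len(train), len(test))
```

Each fold trains on 48 records and tests on the other 12, so averaging the metric over the five folds gives a steadier estimate than a single 80/20 split. The rows should be shuffled first if the file is ordered.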

Troubleshooting

Error: ModuleNotFoundError

Solution: Install missing packages

pip install pandas numpy scikit-learn matplotlib flask

Error: Port 5000 already in use

Solution: Use a different port in app.py

app.run(debug=True, port=5001)

Error: File not found (student_data.csv)

Solution: Ensure CSV file is in same directory as Python files

GUI not launching

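Solution: Ensure Tkinter is installed. It ships with the standard Python installers for Windows and macOS; on Debian/Ubuntu it is a separate package (sudo apt install python3-tk)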


Future Enhancements

Phase 1: Data Enhancement

  • Expand dataset to 500+ records
  • Add behavioral features
  • Include test scores
  • Collect multiple semesters

Phase 2: Model Improvement

  • Compare multiple algorithms
  • Implement ensemble methods
  • Add feature engineering
  • Optimize hyperparameters

Phase 3: System Enhancement

  • Real-time dashboard
  • Mobile application
  • Database integration
  • API for third-party access

Phase 4: Intelligence Enhancement

  • Personalized recommendations
  • Automated interventions
  • Natural language feedback
  • Predictive alerts

Contributing

Feel free to:

  • Report bugs and issues
  • Suggest new features
  • Improve documentation
  • Expand dataset
  • Optimize code

License

This project is open-source and available for educational purposes.


Support

For questions or issues:

  1. Check PROJECT_REPORT.md for detailed documentation
  2. Review VIVA_QUESTIONS.md for concept clarification
  3. Examine code comments for implementation details

Quick Start Commands

# Install dependencies
pip install pandas numpy scikit-learn matplotlib flask

# Run analysis
python student_prediction.py

# Generate visualizations
python visualizations.py

# Launch web app
python app.py

# Launch desktop app
python gui.py

Key Takeaways

  • Machine Learning can predict student performance accurately
  • Study hours carry the largest per-unit coefficient in the model equation (about 1.93 marks per additional daily study hour)
  • Early prediction enables timely interventions
  • Data-driven decisions improve educational outcomes
  • This project demonstrates practical ML application in education

One-Line Summary

A machine learning system that predicts students' final marks from attendance, study hours, internal marks, and completed assignments, so that educators can intervene early.


Created: January 28, 2026
Status: Production Ready
Last Updated: January 28, 2026


Happy Learning!
