A comprehensive Machine Learning project that predicts student academic performance using Linear Regression.
This project uses supervised Machine Learning to predict student final marks based on four key academic indicators:
- Attendance: Class attendance percentage
- Study Hours: Daily study time
- Internal Marks: Continuous assessment scores
- Assignments: Number of assignments completed
The system helps educators identify struggling students early and provide timely interventions.
- ML Model using Linear Regression
- Web Interface (Flask + HTML/CSS/JavaScript)
- Desktop GUI Application (Tkinter)
- Advanced Visualizations and Analytics
- Comprehensive Project Documentation
- 30+ Viva Interview Questions with Answers
STUDENT PERFORMANCE PREDICTION/
    student_data.csv          # Training dataset
    student_prediction.py     # Core ML model & analysis
    visualizations.py         # Advanced charts & graphs
    app.py                    # Flask web application
    gui.py                    # Desktop GUI application
    templates/
        index.html            # Web interface
    static/
        style.css             # Web styling
        script.js             # Web scripts
    PROJECT_REPORT.md         # Full project documentation
    VIVA_QUESTIONS.md         # Interview Q&A
    README.md                 # This file
    Generated Files (after running):
        prediction_plot.png
        advanced_analysis.png
        feature_analysis.png
- Python 3.7 or higher
- pip (Python package manager)
- Clone/Download the project:
    cd "STUDENT PERFORMANCE PREDICTION"
- Install Dependencies:
    pip install pandas numpy scikit-learn matplotlib flask
- Verify Installation:
    python -c "import pandas, sklearn, matplotlib; print('All libraries installed!')"

Run the core ML model and generate basic analysis:
    python student_prediction.py
Output:
- Model evaluation metrics
- Predictions for sample student
- Visualization saved as prediction_plot.png

Generate comprehensive charts and analysis:
    python visualizations.py
Output:
- advanced_analysis.png (6 different charts)
- feature_analysis.png (4 feature impact plots)

Launch interactive web application:
    python app.py
Then open in browser:
    http://localhost:5000
Features:
- Beautiful responsive interface
- Real-time predictions
- Model information display
- Input validation
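The input validation mentioned above could look roughly like the following sketch. The function name and the exact limits are assumptions made for illustration (only the percentage range for attendance follows directly from its definition); the checks actually used in `app.py` may differ.

```python
def validate_inputs(attendance, study_hours, internal_marks, assignments):
    """Return a list of error messages; an empty list means the input is acceptable."""
    errors = []
    # Attendance is a percentage, so it must lie in [0, 100].
    if not 0 <= attendance <= 100:
        errors.append("attendance must be between 0 and 100")
    # Assumption: daily study time cannot exceed 24 hours.
    if not 0 <= study_hours <= 24:
        errors.append("study_hours must be between 0 and 24")
    # Assumption: internal marks are scored out of 100.
    if not 0 <= internal_marks <= 100:
        errors.append("internal_marks must be between 0 and 100")
    # Assignments are a count, so they must be a non-negative whole number.
    if assignments < 0 or int(assignments) != assignments:
        errors.append("assignments must be a non-negative whole number")
    return errors


# A well-formed record produces no errors.
problems = validate_inputs(attendance=85, study_hours=4, internal_marks=75, assignments=8)
```

Returning a list of messages rather than raising on the first failure lets an interface report every invalid field at once.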
Launch desktop application:
    python gui.py
Features:
- Stand-alone application
- No web server needed
- Tkinter-based interface
- Model statistics display
Why Linear Regression?
- Simple and interpretable
- Fast training and predictions
- Suitable for continuous output (marks)
- Good baseline for performance comparison
Final Marks = 9.3079
+ (0.2169 × Attendance)
+ (1.9277 × Study Hours)
+ (0.4169 × Internal Marks)
+ (1.4842 × Assignments)
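As a minimal sketch, the equation above can be evaluated directly in Python. The function and constant names are illustrative assumptions, and the coefficients are the rounded values printed in the equation, so results may differ slightly from what the trained model in `student_prediction.py` returns.

```python
# Rounded intercept and coefficients copied from the regression equation above.
INTERCEPT = 9.3079
COEFFICIENTS = {
    "attendance": 0.2169,
    "study_hours": 1.9277,
    "internal_marks": 0.4169,
    "assignments": 1.4842,
}


def predict_final_marks(attendance, study_hours, internal_marks, assignments):
    """Evaluate the linear equation: intercept plus the weighted sum of the features."""
    features = {
        "attendance": attendance,
        "study_hours": study_hours,
        "internal_marks": internal_marks,
        "assignments": assignments,
    }
    return INTERCEPT + sum(COEFFICIENTS[name] * value for name, value in features.items())


# Evaluate the equation for one student record.
marks = predict_final_marks(attendance=95, study_hours=5, internal_marks=85, assignments=10)
print(round(marks, 2))
```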
| Metric | Value | Interpretation |
|---|---|---|
| R² Score | 0.9987 | 99.87% variance explained - EXCELLENT! |
| MSE | 0.2492 | Mean squared error |
| RMSE | 0.4992 | Root mean squared error (~0.5 marks) |
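For reference, the three metrics in the table can be computed from actual and predicted marks as sketched below. This uses plain Python rather than the project's own evaluation code, and the predicted values in the example are made up for illustration, not model output.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute R², MSE and RMSE for paired actual and predicted values."""
    n = len(y_true)
    residuals = [actual - predicted for actual, predicted in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)                    # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((actual - mean_true) ** 2 for actual in y_true)  # total sum of squares
    mse = ss_res / n
    return {"r2": 1 - ss_res / ss_tot, "mse": mse, "rmse": math.sqrt(mse)}


# Illustrative values only: three actual marks and hypothetical predictions.
metrics = regression_metrics([75, 82, 58], [74.5, 82.0, 58.5])

# RMSE is the square root of MSE, which is how 0.2492 and 0.4992 in the table relate.
rmse_from_table_mse = math.sqrt(0.2492)
```

Because RMSE is expressed in the same unit as the target, an RMSE near 0.5 means a typical prediction error of about half a mark.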
| Feature | Coefficient | Impact |
|---|---|---|
| Internal Marks | 0.8612 | Highest impact (each point +0.86 marks) |
| Study Hours | 0.4741 | Each hour +0.47 marks |
| Assignments | 0.2822 | Each assignment +0.28 marks |
| Attendance | 0.0681 | Each % +0.07 marks |
attendance,study_hours,internal_marks,assignments,final_marks
85,3,70,8,75
90,4,78,9,82
60,2,55,6,58
75,3,65,7,68
95,5,85,10,90
- Records: 60 students (expanded from 10)
- Features: 4 input variables
- Target: 1 output variable (final marks)
- Range: Marks 0-100
- Training: 48 students (80%)
- Testing: 12 students (20%)
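The 80/20 split described above can be sketched with the standard library alone. This is a stand-in for illustration: the project may well use scikit-learn's `train_test_split` instead, and the function name and seed here are assumptions.

```python
import random


def split_indices(n_records, test_fraction=0.2, seed=42):
    """Shuffle record indices and split them into (train, test) lists."""
    indices = list(range(n_records))
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(indices)
    n_test = round(n_records * test_fraction)
    return indices[n_test:], indices[:n_test]


# 60 records with a 20% test fraction give 48 training and 12 testing rows.
train_idx, test_idx = split_indices(60)
print(len(train_idx), len(test_idx))
```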
Input:
- Attendance: 95%
- Study Hours: 5
- Internal Marks: 85
- Assignments: 10
Output:
- Predicted Marks: 89.2
- Grade: A+
- Result: PASS
Input:
- Attendance: 50%
- Study Hours: 1
- Internal Marks: 40
- Assignments: 4
Output:
- Predicted Marks: 45.8
- Grade: F
- Result: FAIL
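The step from predicted marks to a grade and a pass/fail result might be sketched as follows. The grade boundaries and the pass mark are hypothetical values chosen only to be consistent with the two examples above; the thresholds used in the project code are not documented here.

```python
# Hypothetical grade boundaries (minimum marks, label), highest first.
GRADE_BOUNDARIES = [(85, "A+"), (75, "A"), (65, "B"), (55, "C"), (50, "D")]
PASS_MARK = 50  # hypothetical pass threshold


def grade_and_result(predicted_marks):
    """Map predicted marks to a (grade, result) pair."""
    # A linear model can predict outside 0-100, so clamp to the valid range first.
    marks = max(0.0, min(100.0, predicted_marks))
    grade = "F"
    for minimum, label in GRADE_BOUNDARIES:
        if marks >= minimum:
            grade = label
            break
    result = "PASS" if marks >= PASS_MARK else "FAIL"
    return grade, result


high = grade_and_result(89.2)
low = grade_and_result(45.8)
```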
Comprehensive 12-section report including:
- Executive summary
- Problem statement
- Methodology
- Results & evaluation
- System architecture
- Implementation details
- Advantages & limitations
- Future enhancements
File: PROJECT_REPORT.md
30 frequently asked interview questions with detailed answers:
- Basic ML concepts
- Project-specific questions
- Technical implementation questions
- Real-world impact questions
- Advanced follow-ups
File: VIVA_QUESTIONS.md
GET /
- Returns main prediction interface
POST /predict
- Input: JSON with student data
- Output: Predicted marks, grade, result
- Example:
{
"attendance": 85,
"study_hours": 4,
"internal_marks": 75,
"assignments": 8
}
GET /model-info
- Returns model coefficients and performance metrics
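A client call to `POST /predict` could be sketched with the standard library as shown below. The helper names are assumptions, the base URL assumes the default address given earlier, and the response is returned as parsed JSON without assuming any particular field names.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # assumes the app runs on the default port


def build_predict_request(student, base_url=BASE_URL):
    """Build a JSON POST request for the /predict endpoint without sending it."""
    body = json.dumps(student).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict_remote(student, base_url=BASE_URL, timeout=5):
    """Send the request and return the decoded JSON response (needs a running server)."""
    request = build_predict_request(student, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request does not contact the server.
request = build_predict_request(
    {"attendance": 85, "study_hours": 4, "internal_marks": 75, "assignments": 8}
)
```

With `python app.py` running, `predict_remote({...})` would return the server's reply as a dictionary.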
- Actual vs Predicted - Scatter plot with perfect prediction line
- Residual Plot - Shows prediction errors
- Feature Importance - Bar chart of feature coefficients
- Distribution Analysis - Histogram of marks
- Model Metrics - R², MSE, RMSE display
- Error Distribution - Bar chart of errors
- Feature Impact Analysis - 4 individual feature plots
- Accurate predictions with consistent data
- Fast inference time
- Interpretable coefficients
- No overfitting issues with small dataset
- Small training dataset (60 records, 48 used for training)
- Linear assumptions only
- Limited feature set
- May not capture non-linear relationships
- Expand dataset to 100+ students
- Add more features (behavior, engagement, etc.)
- Try ensemble methods (Random Forest, XGBoost)
- Implement cross-validation
- Use hyperparameter tuning
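The cross-validation item above could start from a k-fold index splitter like this sketch. It is a plain-Python illustration with an assumed function name; scikit-learn's `KFold` and `cross_val_score` provide the same idea ready-made. Note that this version does not shuffle, so records should be shuffled beforehand if the CSV is ordered.

```python
def k_fold_indices(n_samples, k=5):
    """Return k (train_indices, test_indices) pairs covering every sample exactly once."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train_idx, test_idx))
        start += size
    return folds


# With 60 records and 5 folds, each fold trains on 48 records and tests on 12.
folds = k_fold_indices(60, k=5)
```

Averaging the evaluation metric over all folds gives a steadier estimate than a single 48/12 split, which matters with a dataset this small.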
Solution: Install missing packages
    pip install pandas numpy scikit-learn matplotlib flask
Solution: Use a different port in app.py
    app.run(debug=True, port=5001)
Solution: Ensure CSV file is in same directory as Python files
Solution: Ensure Tkinter is installed (bundled with the standard Windows and macOS Python installers; on Debian/Ubuntu it is the separate python3-tk package)
- Expand dataset to 500+ records
- Add behavioral features
- Include test scores
- Collect multiple semesters
- Compare multiple algorithms
- Implement ensemble methods
- Add feature engineering
- Optimize hyperparameters
- Real-time dashboard
- Mobile application
- Database integration
- API for third-party access
- Personalized recommendations
- Automated interventions
- Natural language feedback
- Predictive alerts
Feel free to:
- Report bugs and issues
- Suggest new features
- Improve documentation
- Expand dataset
- Optimize code
This project is open-source and available for educational purposes.
For questions or issues:
- Check PROJECT_REPORT.md for detailed documentation
- Review VIVA_QUESTIONS.md for concept clarification
- Examine code comments for implementation details
# Install dependencies
pip install pandas numpy scikit-learn matplotlib flask
# Run analysis
python student_prediction.py
# Generate visualizations
python visualizations.py
# Launch web app
python app.py
# Launch desktop app
python gui.py
- Machine Learning can predict student performance accurately
- Study hours have the largest per-unit coefficient in the regression equation (about 1.93 marks per additional daily hour)
- Early prediction enables timely interventions
- Data-driven decisions improve educational outcomes
- This project demonstrates practical ML application in education
A machine learning system that predicts student academic performance using attendance and study data to enable early intervention and improve educational outcomes.
Created: January 28, 2026 | Status: Production Ready | Last Updated: January 28, 2026
Happy Learning!