# Student Data Analysis

## 🔧 Preprocessing
 - Identify and impute missing values for continuous and categorical features.
 - Encode gender and other categorical fields using Label Encoding or One-Hot Encoding.
 - Standardize numerical features using StandardScaler (zero mean, unit variance).
 - Remove outliers in exam_stress_level and part_time_work_hours using a Z-score threshold (for example, |z| > 3).
 - Drop duplicate records if present.
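
The preprocessing steps above could be sketched roughly as follows. This is a minimal sketch, not a reference solution: the DataFrame is a small synthetic stand-in for the real dataset, and the median/mode imputation and the |z| < 3 cutoff are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real dataset (all values are made up).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "gender": rng.choice(["Male", "Female", "Non-binary"], size=n),
    "hours_studied": rng.normal(15, 5, size=n),
    "exam_stress_level": rng.integers(1, 11, size=n).astype(float),
    "part_time_work_hours": rng.normal(10, 4, size=n),
})
df.loc[0, "hours_studied"] = np.nan      # inject a missing numeric value
df.loc[1, "gender"] = np.nan             # inject a missing categorical value
df.loc[2, "part_time_work_hours"] = 80   # inject an obvious outlier
df = pd.concat([df, df.iloc[[3]]], ignore_index=True)  # inject a duplicate row

num_cols = ["hours_studied", "exam_stress_level", "part_time_work_hours"]

# Drop exact duplicate records.
df = df.drop_duplicates().reset_index(drop=True)

# Impute: median for continuous columns, mode for the categorical column.
df[num_cols] = df[num_cols].fillna(df[num_cols].median())
df["gender"] = df["gender"].fillna(df["gender"].mode()[0])

# Keep only rows with |z| < 3 in both outlier-prone columns (assumed cutoff).
outlier_cols = ["exam_stress_level", "part_time_work_hours"]
z = (df[outlier_cols] - df[outlier_cols].mean()) / df[outlier_cols].std()
df = df[(z.abs() < 3).all(axis=1)].reset_index(drop=True)

# One-hot encode gender, then standardize the numeric columns.
df = pd.get_dummies(df, columns=["gender"])
df[num_cols] = StandardScaler().fit_transform(df[num_cols])

print(df.shape)
```

In a real modeling workflow the scaler would normally be fit on the training split only, to avoid leaking test-set statistics.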

## 📊 Exploratory Data Analysis (EDA)
 - Plot histograms of gpa, mental_health_score, and exam_stress_level.
 - Analyze the relationship between hours_studied and gpa using a regression plot.
 - Compare average gpa between students with high and low sleep_quality (for example, split at the median).
 - Plot a correlation heatmap for all numeric variables.
 - Visualize gender distribution with a pie chart.
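
One possible way to produce the plots listed above with plain matplotlib and pandas is sketched below. The data is synthetic and only mimics the column names from the table, and the median split for sleep_quality is an assumption, since the task does not define "high" and "low".

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the real dataset (all values are made up).
rng = np.random.default_rng(1)
n = 300
hours = rng.normal(15, 5, n).clip(0)
df = pd.DataFrame({
    "gender": rng.choice(["Male", "Female", "Non-binary"], n, p=[0.45, 0.45, 0.10]),
    "hours_studied": hours,
    "sleep_quality": rng.integers(1, 11, n),
    "mental_health_score": rng.normal(60, 12, n),
    "exam_stress_level": rng.integers(1, 11, n),
    "gpa": (1.5 + 0.08 * hours + rng.normal(0, 0.4, n)).clip(0, 4),
})

fig, axes = plt.subplots(2, 3, figsize=(15, 8))

# Histograms of the three requested columns.
for ax, col in zip(axes[0], ["gpa", "mental_health_score", "exam_stress_level"]):
    ax.hist(df[col], bins=20)
    ax.set_title(col)

# Regression plot: scatter plus a least-squares line.
slope, intercept = np.polyfit(df["hours_studied"], df["gpa"], 1)
xs = np.linspace(df["hours_studied"].min(), df["hours_studied"].max(), 50)
axes[1, 0].scatter(df["hours_studied"], df["gpa"], s=8, alpha=0.5)
axes[1, 0].plot(xs, slope * xs + intercept, color="red")
axes[1, 0].set(xlabel="hours_studied", ylabel="gpa")

# Correlation heatmap over all numeric columns.
corr = df.select_dtypes("number").corr()
im = axes[1, 1].imshow(corr.to_numpy(), vmin=-1, vmax=1, cmap="coolwarm")
axes[1, 1].set_xticks(range(len(corr)))
axes[1, 1].set_xticklabels(corr.columns, rotation=90)
axes[1, 1].set_yticks(range(len(corr)))
axes[1, 1].set_yticklabels(corr.columns)
fig.colorbar(im, ax=axes[1, 1])

# Gender distribution as a pie chart.
counts = df["gender"].value_counts()
axes[1, 2].pie(counts, labels=counts.index, autopct="%1.0f%%")
fig.tight_layout()  # call fig.savefig(...) or plt.show() to view the figure

# Average GPA for high vs low sleep quality (median split is an assumption).
df["sleep_group"] = np.where(
    df["sleep_quality"] > df["sleep_quality"].median(), "high", "low"
)
gpa_by_sleep = df.groupby("sleep_group")["gpa"].mean()
print(gpa_by_sleep)
```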

## 📈 Regression Models
 - Build a Linear Regression model to predict gpa using the other numeric features (exclude student_id and passed_gpa, which is derived from gpa).
 - Evaluate the model using MAE, RMSE, and R² score.
 - Visualize predicted vs actual GPA with a scatter plot.
 - Fit a second regression model to predict mental_health_score and compare the results: which features are most predictive for each target?
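
A minimal sketch of the regression workflow above is shown here. The features and the formula that generates gpa are synthetic assumptions, so the printed metrics say nothing about the real dataset.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real dataset (all values are made up).
rng = np.random.default_rng(2)
n = 400
X = pd.DataFrame({
    "hours_studied": rng.normal(15, 5, n),
    "sleep_quality": rng.integers(1, 11, n).astype(float),
    "exam_stress_level": rng.integers(1, 11, n).astype(float),
})
gpa = (1.2 + 0.07 * X["hours_studied"] + 0.08 * X["sleep_quality"]
       - 0.03 * X["exam_stress_level"] + rng.normal(0, 0.3, n)).clip(0, 4)

X_train, X_test, y_train, y_test = train_test_split(
    X, gpa, test_size=0.2, random_state=0
)
model = LinearRegression().fit(X_train, y_train)
pred = model.predict(X_test)

# Evaluation metrics named in the task list.
mae = mean_absolute_error(y_test, pred)
rmse = np.sqrt(mean_squared_error(y_test, pred))
r2 = r2_score(y_test, pred)
print(f"MAE={mae:.3f}  RMSE={rmse:.3f}  R2={r2:.3f}")

# Coefficients sorted by magnitude; these are only comparable across
# features when the features share a scale (e.g. after standardization).
coefs = pd.Series(model.coef_, index=X.columns).sort_values(key=abs, ascending=False)
print(coefs)

# Predicted vs actual scatter with a perfect-prediction reference line.
fig, ax = plt.subplots()
ax.scatter(y_test, pred, s=10)
ax.plot([0, 4], [0, 4], linestyle="--", color="gray")
ax.set(xlabel="actual gpa", ylabel="predicted gpa")
```

The same pattern applies to mental_health_score by swapping the target column; comparing the two coefficient tables is one way to approach the "which features are more predictive" question.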

## 🎯 Classification Models
 - Train a Logistic Regression model to classify passed_gpa.
 - Evaluate using precision, recall, accuracy, and F1 score.
 - Train a Decision Tree Classifier to predict healthy_lifestyle.
 - Visualize the tree and discuss feature importance.
 - Use a Random Forest Classifier and compare its F1 score with the Decision Tree.
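
The classification tasks above might be sketched as follows. The data is synthetic; passed_gpa and healthy_lifestyle are derived here using the definitions in the column table, and the gpa formula is an assumption made only so the example runs.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in for the real dataset (all values are made up).
rng = np.random.default_rng(3)
n = 500
X = pd.DataFrame({
    "sleep_quality": rng.integers(1, 11, n),
    "meal_quality": rng.integers(1, 11, n),
    "exercise_frequency": rng.integers(0, 8, n),
    "hours_studied": rng.normal(15, 5, n),
})
gpa = (1.3 + 0.06 * X["hours_studied"] + 0.06 * X["sleep_quality"]
       + rng.normal(0, 0.3, n)).clip(0, 4)
passed = (gpa >= 2.5).astype(int)  # passed_gpa as defined in the table
habits = ["sleep_quality", "meal_quality", "exercise_frequency"]
healthy = (X[habits] > X[habits].median()).all(axis=1).astype(int)


def split(target):
    """Stratified 80/20 split of X against the given target."""
    return train_test_split(X, target, test_size=0.2, random_state=0, stratify=target)


# Logistic Regression for passed_gpa (gpa itself is not used as a feature).
X_tr, X_te, y_tr, y_te = split(passed)
logreg = make_pipeline(StandardScaler(), LogisticRegression()).fit(X_tr, y_tr)
logreg_pred = logreg.predict(X_te)
acc_logreg = accuracy_score(y_te, logreg_pred)
print(classification_report(y_te, logreg_pred))  # precision, recall, F1, accuracy

# Decision Tree vs Random Forest for healthy_lifestyle.
X_tr, X_te, y_tr, y_te = split(healthy)
tree = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
f1_tree = f1_score(y_te, tree.predict(X_te))
f1_forest = f1_score(y_te, forest.predict(X_te))
print(f"F1 tree={f1_tree:.3f}  F1 forest={f1_forest:.3f}")

# Text rendering of the top of the tree, plus feature importances.
print(export_text(tree, feature_names=list(X.columns), max_depth=3))
importances = pd.Series(tree.feature_importances_, index=X.columns)
print(importances.sort_values(ascending=False))
```

Because healthy_lifestyle is a deterministic function of three input columns, a tree can recover it almost exactly; that is worth keeping in mind when discussing feature importance. `sklearn.tree.plot_tree` is an alternative to `export_text` if a graphical rendering is preferred.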

## 🧩 Clustering
 - Cluster students into 3 groups using KMeans on sleep_quality, exam_stress_level, and mental_health_score.
 - Reduce dimensionality with t-SNE and visualize clusters with color-coded points.
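
The clustering steps above can be sketched like this. The data is synthetic, and standardizing before KMeans as well as the perplexity value are assumptions rather than requirements stated in the task.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real dataset (all values are made up).
rng = np.random.default_rng(4)
n = 300
df = pd.DataFrame({
    "sleep_quality": rng.integers(1, 11, n),
    "exam_stress_level": rng.integers(1, 11, n),
    "mental_health_score": rng.normal(60, 12, n),
})

# Put the three features on a common scale so no single one dominates distances.
X = StandardScaler().fit_transform(df)

# Cluster into 3 groups.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Project to 2D with t-SNE for visualization only.
embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)

# Color-coded scatter of the embedding.
fig, ax = plt.subplots()
scatter = ax.scatter(embedding[:, 0], embedding[:, 1], c=labels, cmap="viridis", s=12)
ax.legend(*scatter.legend_elements(), title="cluster")
ax.set(xlabel="t-SNE 1", ylabel="t-SNE 2")
```

The clusters are computed in the original (scaled) feature space and t-SNE is applied afterwards, so distances in the plot should be read qualitatively.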

## ⚙ Support Vector Machine (SVM)
 - Train an SVM to classify passed_gpa using all features (exclude gpa, since passed_gpa is derived from it).
 - Try both linear and RBF kernels and compare their performance metrics.
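
A minimal sketch of the kernel comparison follows. The data and the gpa formula are synthetic assumptions, and gpa is left out of the features because passed_gpa is computed directly from it.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the real dataset (all values are made up).
rng = np.random.default_rng(5)
n = 400
X = pd.DataFrame({
    "hours_studied": rng.normal(15, 5, n),
    "sleep_quality": rng.integers(1, 11, n).astype(float),
    "exam_stress_level": rng.integers(1, 11, n).astype(float),
    "part_time_work_hours": rng.normal(10, 4, n).clip(0),
})
gpa = (1.5 + 0.06 * X["hours_studied"] + 0.06 * X["sleep_quality"]
       - 0.03 * X["exam_stress_level"] + rng.normal(0, 0.3, n)).clip(0, 4)
y = (gpa >= 2.5).astype(int)  # passed_gpa as defined in the table

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Fit one scaled SVM per kernel and collect the same metrics for each.
results = {}
for kernel in ("linear", "rbf"):
    clf = make_pipeline(StandardScaler(), SVC(kernel=kernel)).fit(X_train, y_train)
    pred = clf.predict(X_test)
    results[kernel] = {
        "accuracy": accuracy_score(y_test, pred),
        "f1": f1_score(y_test, pred),
    }

print(pd.DataFrame(results).T)
```

Scaling sits inside the pipeline because SVMs are sensitive to feature scale; tuning C and gamma (for example with GridSearchCV) would be a natural next step for a fairer comparison.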

## 📋 Dataset Columns

| Column Name            | Description                                                    |
|-------------------------|----------------------------------------------------------------|
| student_id              | Unique student identifier                                      |
| age                     | Student's age                                                  |
| gender                  | 'Male', 'Female', 'Non-binary'                                 |
| sleep_quality           | Self-reported sleep quality on a scale of 1–10                 |
| meal_quality            | Self-reported diet rating (1–10)                               |
| exercise_frequency      | Days per week of moderate exercise                             |
| hours_studied           | Average study hours per week                                   |
| part_time_work_hours    | Weekly hours spent on part-time work                           |
| mental_health_score     | Composite score (higher = better mental health)                |
| exam_stress_level       | Stress level before exams (1–10)                               |
| gpa                     | Final GPA (0.0 to 4.0)                                         |
| passed_gpa              | 1 if GPA ≥ 2.5, else 0                                         |
| healthy_lifestyle       | 1 if sleep_quality, meal_quality, and exercise_frequency are all above their medians, else 0 |
| cluster_type            | Pre-labeled cluster from prior segmentation                    |
