# 05. Modelling and evaluation

# Objectives

The purpose of this notebook is to train, evaluate, and interpret machine learning models that predict **CEFR levels** based on learners’ language proficiency scores and engineered features. Specifically, we aim to:

- Train classification models using the processed dataset (numeric + categorical engineered features).  
- Compare model performance across multiple algorithms (e.g., Logistic Regression, Random Forest, Gradient Boosting).  
- Evaluate models using accuracy, precision, recall, F1-score, and confusion matrices to assess classification reliability.  
- Select the best-performing model for deployment and future integration into a personalized recommendation system.  
- Save the trained model and evaluation results for reproducibility and downstream use.  
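
The train-and-compare objectives above can be sketched as follows, assuming scikit-learn. The synthetic six-class data from `make_classification` is only a stand-in for the processed features. The hyperparameters, the fold count, and the macro-averaged F1 scoring are illustrative assumptions, not the notebook's actual settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for the processed features: 6 classes mimic A1-C2.
X, y = make_classification(
    n_samples=300,
    n_features=8,
    n_informative=5,
    n_classes=6,
    n_clusters_per_class=1,
    random_state=0,
)

# Candidate algorithms named in the objectives (settings are assumptions).
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

# Stratified folds keep the class proportions similar in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Mean macro-averaged F1 per model across the folds.
scores = {
    name: cross_val_score(model, X, y, cv=cv, scoring="f1_macro").mean()
    for name, model in candidates.items()
}

# Name of the model with the highest mean score.
best_name = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: macro F1 = {score:.3f}")
print("best:", best_name)
```

Macro averaging gives each CEFR level equal weight regardless of how many learners fall into it, which may be preferable to plain accuracy if some levels are rare.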


# Inputs

- **Processed dataset**: `data/processed/features.csv`  
  - Includes the original skill scores, engineered features (e.g., strongest/weakest skill, profile type), and the encoded CEFR target variable.  
- **Feature matrix (X)**: Scaled numeric features + encoded categorical engineered features.  
- **Target vector (y)**: Encoded CEFR levels (A1–C2).  
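
A minimal sketch of how the feature matrix and target vector might be separated and split, assuming pandas and scikit-learn. The column names (`reading_scaled`, `profile_type_encoded`, `cefr_encoded`) are hypothetical, and the small in-memory frame stands in for the result of `pd.read_csv("data/processed/features.csv")`.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

TARGET_COL = "cefr_encoded"  # hypothetical name for the encoded CEFR column


def split_features_target(df, target_col=TARGET_COL):
    """Return (X, y): every column except the target, and the target itself."""
    X = df.drop(columns=[target_col])
    y = df[target_col]
    return X, y


# Toy stand-in for the processed dataset: 30 rows, 5 per class (0-5 ~ A1-C2).
toy = pd.DataFrame(
    {
        "reading_scaled": [i / 29 for i in range(30)],      # hypothetical feature
        "profile_type_encoded": [i % 3 for i in range(30)],  # hypothetical feature
        "cefr_encoded": [i % 6 for i in range(30)],
    }
)

X, y = split_features_target(toy)

# Stratify on y so the held-out set keeps the class proportions of the data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

print(X_train.shape, X_test.shape)
```

Fixing `random_state` keeps the split reproducible between runs of the notebook.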


# Outputs

- Trained machine learning models (baseline + advanced).  
- Evaluation metrics (accuracy, precision, recall, F1-score, confusion matrix).  
- Visualizations of model performance.  
- Final selected model, serialized (e.g., `model.joblib`) for reuse.  
- Documentation of why the chosen model best supports the **business goal** of automatic learner placement.  
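
The metric and serialization outputs listed above could be produced along these lines, assuming scikit-learn and joblib. The Random Forest is an arbitrary stand-in for whichever model is eventually selected, the data is synthetic, and the temporary directory is used only so the sketch does not write into the project tree.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic six-class data standing in for the processed features.
X, y = make_classification(
    n_samples=300,
    n_features=8,
    n_informative=5,
    n_classes=6,
    n_clusters_per_class=1,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Stand-in for the selected model (an assumption, not the notebook's choice).
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Accuracy, per-class precision/recall/F1, and the confusion matrix.
accuracy = accuracy_score(y_test, y_pred)
report = classification_report(y_test, y_pred, zero_division=0)
cm = confusion_matrix(y_test, y_pred, labels=list(range(6)))

print(f"accuracy = {accuracy:.3f}")
print(report)
print(cm)

# Serialize the fitted model, then reload it to confirm the round trip.
model_path = Path(tempfile.mkdtemp()) / "model.joblib"
joblib.dump(model, model_path)
reloaded = joblib.load(model_path)
```

In the confusion matrix, rows are true levels and columns are predicted levels, so off-diagonal cells close to the diagonal correspond to confusions between adjacent encoded levels.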


# Additional Information

This stage directly addresses the **business requirement**: predicting learners’ CEFR levels to enable **automatic placement** and **personalized learning recommendations**.  
While the model outputs only the predicted CEFR level, the engineered features (skill strengths, weaknesses, balance profiles) provide the contextual insights needed for tailored feedback.  
Evaluating several models against the same metrics and selecting the best one is intended to yield predictions that are accurate, interpretable, and reliable enough for integration into an adaptive learning platform.  

---