This lab demonstrates how to use GitHub Actions to automate the full lifecycle of a machine-learning model, from training, evaluation, and versioning to probability calibration, using the UCI Bank Marketing dataset.
It follows an MLOps-style continuous-integration workflow: each code push automatically retrains and calibrates the model, stores versioned artifacts, and logs metrics for reproducibility.
Problem: Predict whether a bank client will subscribe to a term deposit (yes or no) after being contacted by a marketing campaign.
Goal: Build a supervised machine-learning model that classifies clients into two groups:
- 1 → yes, the client subscribed
- 0 → no, the client did not subscribe
Type: Binary Classification
Algorithm Used: Random Forest Classifier
Metric Focus: Accuracy • F1 Score • Brier Score
Source: UCI Machine Learning Repository – Bank Marketing
File Used: bank.csv
Rows: 4,521
Columns: 17
| Feature | Description | 
|---|---|
| age | Age of the client | 
| job | Type of job (e.g., admin, technician) | 
| marital | Marital status | 
| education | Education level | 
| default | Has credit in default? | 
| balance | Average yearly balance in € | 
| housing | Has a housing loan? | 
| loan | Has a personal loan? | 
| contact | Contact communication type | 
| day, month | Last contact day and month | 
| duration | Last contact duration (seconds) | 
| campaign | Number of contacts during this campaign | 
| pdays, previous, poutcome | Past campaign details | 
| y | Target: subscribed (yes / no) | 
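As a quick sanity check, the raw file can be loaded directly with pandas. Note that the UCI copy of bank.csv is semicolon-delimited; this snippet (not part of the lab scripts) also maps the yes/no target to the 1/0 encoding above:

```python
import pandas as pd

# bank.csv from the UCI archive is semicolon-delimited, not comma-delimited
df = pd.read_csv("bank.csv", sep=";")

# Map the target to the 1/0 encoding used throughout this lab
df["y"] = (df["y"] == "yes").astype(int)

print(df.shape)        # (4521, 17)
print(df["y"].mean())  # positive rate is roughly 0.115
```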
```
GitHub_Lab_LAB2/
├── .github/
│   └── workflows/
│       ├── model_retraining_on_push.yml
│       └── model_calibration_on_push.yml
├── src/
│   ├── train_model.py
│   ├── evaluate_model.py
│   └── calibrate.py
├── models/
│   ├── model_<timestamp>.pkl
│   └── calibrated_model_<timestamp>.pkl
├── metrics/
│   ├── metrics.txt
│   └── calibration_metrics.txt
├── requirements.txt
└── README.md
```
train_model.py:
- Automatically downloads and extracts the UCI Bank Marketing dataset
- Label-encodes categorical features
- Splits the data into training and test sets
- Trains a RandomForestClassifier(n_estimators=200)
- Saves the trained model → models/model_<timestamp>.pkl
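The following is a minimal sketch of what train_model.py might look like. The download URL is an assumption based on the UCI archive's usual layout, and the 80/20 split and random_state are illustrative choices, not necessarily those of the lab:

```python
import io
import os
import pickle
import zipfile
from datetime import datetime

import pandas as pd
import requests
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Download and extract bank.csv (URL assumed from the UCI archive layout)
URL = "https://archive.ics.uci.edu/ml/machine-learning-databases/00222/bank.zip"
archive = zipfile.ZipFile(io.BytesIO(requests.get(URL, timeout=60).content))
df = pd.read_csv(archive.open("bank.csv"), sep=";")

# Label-encode every categorical column, including the yes/no target
for col in df.select_dtypes(include="object").columns:
    df[col] = LabelEncoder().fit_transform(df[col])

X, y = df.drop(columns=["y"]), df["y"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)

# Save a timestamped artifact, e.g. models/model_20250120_143022.pkl
os.makedirs("models", exist_ok=True)
stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
with open(f"models/model_{stamp}.pkl", "wb") as f:
    pickle.dump(model, f)
```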
evaluate_model.py:
- Loads the test data and computes Accuracy and F1 Score
- Writes results → metrics/metrics.txt
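A corresponding sketch of the evaluation step, written as a function because it assumes the caller supplies the same held-out split produced during training (for example, by recreating it with the same random_state):

```python
import os
import pickle

from sklearn.metrics import accuracy_score, f1_score


def evaluate(model_path, X_test, y_test, out_path="metrics/metrics.txt"):
    """Score a pickled model on the held-out split and log the metrics."""
    with open(model_path, "rb") as f:
        model = pickle.load(f)

    y_pred = model.predict(X_test)
    acc = accuracy_score(y_test, y_pred)
    f1 = f1_score(y_test, y_pred)

    os.makedirs(os.path.dirname(out_path), exist_ok=True)
    with open(out_path, "w") as f:
        f.write(f"Accuracy = {acc:.4f}\nF1 = {f1:.4f}\n")
    return acc, f1
```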
calibrate.py:
- Uses CalibratedClassifierCV(method='isotonic')
- Compares Brier Score and F1 before vs. after calibration
- Saves the calibrated model → models/calibrated_model_<timestamp>.pkl
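A sketch of the calibration step under one possible data-splitting strategy: the fitted forest is wrapped with cv="prefit" and calibrated on data it was not trained on. The function name and split handling are illustrative:

```python
import os
import pickle
from datetime import datetime

from sklearn.calibration import CalibratedClassifierCV
from sklearn.metrics import brier_score_loss, f1_score


def calibrate_and_compare(model, X_cal, y_cal, X_test, y_test):
    """Isotonically calibrate a fitted model and log before/after metrics."""
    proba_before = model.predict_proba(X_test)[:, 1]

    # cv="prefit" wraps the already-trained forest; X_cal/y_cal should be
    # data the forest was not trained on
    calibrated = CalibratedClassifierCV(model, method="isotonic", cv="prefit")
    calibrated.fit(X_cal, y_cal)
    proba_after = calibrated.predict_proba(X_test)[:, 1]

    lines = [
        "=== Before Calibration ===",
        f"Brier Score = {brier_score_loss(y_test, proba_before):.4f}",
        f"F1 = {f1_score(y_test, model.predict(X_test)):.4f}",
        "=== After Calibration ===",
        f"Brier Score = {brier_score_loss(y_test, proba_after):.4f}",
        f"F1 = {f1_score(y_test, calibrated.predict(X_test)):.4f}",
    ]
    os.makedirs("metrics", exist_ok=True)
    with open("metrics/calibration_metrics.txt", "w") as f:
        f.write("\n".join(lines) + "\n")

    # Save a timestamped calibrated artifact
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    with open(f"models/calibrated_model_{stamp}.pkl", "wb") as f:
        pickle.dump(calibrated, f)
    return calibrated
```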
Workflow: model_retraining_on_push.yml
Trigger: Automatically runs on every push to the main branch
Steps:
- Checkout repository
- Set up Python environment
- Install dependencies from requirements.txt
- Run train_model.py to train the Random Forest model
- Run evaluate_model.py to compute metrics
- Commit new models and metrics back to the repository
- Push updated artifacts to main
Example Output:
```
models/model_20250120_143022.pkl
metrics/metrics.txt
```
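Putting the steps above together, here is a minimal sketch of what model_retraining_on_push.yml might contain. The action versions, Python version, and commit message are illustrative; [skip ci] in the commit message keeps the artifact push from re-triggering the workflow:

```yaml
name: Model Retraining

on:
  push:
    branches: [main]

permissions:
  contents: write  # required to push artifacts back to the repo

jobs:
  retrain:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: "3.10"

      - name: Install dependencies
        run: pip install -r requirements.txt

      - name: Train model
        run: python src/train_model.py

      - name: Evaluate model
        run: python src/evaluate_model.py

      - name: Commit artifacts
        run: |
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git add models/ metrics/
          git commit -m "Retrain model [skip ci]" || echo "Nothing to commit"
          git push
```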
Workflow: model_calibration_on_push.yml
Trigger: Automatically runs after model retraining completes (see the trigger sketch below)
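Chaining one workflow after another is typically done with GitHub Actions' workflow_run event. A minimal sketch of the trigger, assuming the retraining workflow declares name: Model Retraining:

```yaml
on:
  workflow_run:
    # Must match the `name:` of the retraining workflow
    workflows: ["Model Retraining"]
    types: [completed]
```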
Steps:
- Checkout repository with latest trained model
- Set up Python environment
- Install dependencies
- Run calibrate.py to perform isotonic calibration
- Compare Brier Score before and after calibration
- Save calibrated model and calibration metrics
- Commit and push calibrated artifacts
Example Output:
```
models/calibrated_model_20250120_143445.pkl
metrics/calibration_metrics.txt
```
- Python 3.8 or higher
- Git
- GitHub account (for Actions)
- Clone the repository:
```bash
git clone https://github.com/yourusername/GitHub_Lab_LAB2.git
cd GitHub_Lab_LAB2
```
- Create a virtual environment:
```bash
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```
- Install dependencies:
```bash
pip install -r requirements.txt
```

To run the pipeline locally, execute the scripts in order:
```bash
python src/train_model.py
python src/evaluate_model.py
python src/calibrate.py
```

requirements.txt:
```
pandas>=1.3.0
numpy>=1.21.0
scikit-learn>=1.0.0
requests>=2.26.0
```

Metrics tracked:
- Accuracy: Overall classification accuracy
- F1 Score: Harmonic mean of precision and recall
- Brier Score: Measures calibration quality; lower is better (see the formula below)
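For reference, the Brier score is the mean squared difference between predicted probability and actual outcome:

$$\mathrm{Brier} = \frac{1}{N}\sum_{i=1}^{N}\left(\hat{p}_i - y_i\right)^2$$

where $\hat{p}_i$ is the predicted probability that client $i$ subscribes and $y_i \in \{0, 1\}$ is the true label; 0 is a perfect score.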
Example calibration_metrics.txt:
```
=== Before Calibration ===
Brier Score = 0.0656
F1 = 0.4167
Accuracy = 0.9072

=== After Calibration ===
Brier Score = 0.0621
F1 = 0.4430
Accuracy = 0.9083
```
```mermaid
graph LR
    A[Code Push] --> B[Trigger Retraining Workflow]
    B --> C[Train Model]
    C --> D[Evaluate Model]
    D --> E[Commit Artifacts]
    E --> F[Trigger Calibration Workflow]
    F --> G[Calibrate Model]
    G --> H[Compare Metrics]
    H --> I[Commit Calibrated Model]
    I --> J[Complete]
```
References:
- UCI Bank Marketing Dataset
- scikit-learn Documentation
- GitHub Actions Documentation
- Probability Calibration in scikit-learn