This repository contains code to build and deploy a machine learning model that predicts customer churn for a telecommunications company using the Marimo framework. The model is trained on the well known IBM Telco Customer Churn dataset.
The model is a simple logistic regression model built using scikit-learn.
The model currently uses the following features:
tenure: Number of months the customer has stayed with the company.MonthlyCharges: The amount charged to the customer monthly.TechSupport_yes: Binary indicator of whether the customer has tech support.
We recommend looking into additional features and engineering new ones to improve model performance. There are other possible issues to explore, including model choice, leakage, and others.
The model is built and trained using notebooks/telco_marimo.py. The model is
saved in the models/ directory, with the scaler and model bundled together
using joblib.
An example prediction can be found in prediction.py.
There is a CI pipeline set up using GitHub Actions that runs tests
on every push and pull request. The tests are located in the tests/
directory and can be run locally using pytest.
The saved model can be deployed to an Azure Function with the following steps:
-
In your Codespace, install the Azure Function extension:
- Open the Extensions view in the left sidebar.
- Search for "Azure Functions" and install the extension by Microsoft.
-
In your Codespace, create the function:
- Open the Command Palette (
Ctrl+Shift+P). - Type and select Azure Functions: Create Function
- Do not select the "...in Azure..." command
- Select the root folder of the project (should be the default).
- Select Python as the language.
- Select HTTP trigger as the template.
- Provide a name (without spaces or special characters) e.g. predict.
- Select FUNCTION as the authorization level.
- Do not overwrite the
.gitignorefile.
- Open the Command Palette (
-
In your Codespace, wait some time for the function to be created. Then:
- Add azure.functions to your env with
uv add azure-functions - Update the
requirements.txtfile for Azureuv pip freeze > requirements.txt - Edit the
function_app.pyfile- Add an import:
from prediction import make_prediction - Replace the line
name = req.params.get('name'), using the same approach to get input data for prediction.- You'll need tenure, monthly bill and tech support status.
- It is up to you how you name these but simple names are best.
- The variable name (on the left) is internal to the function,
while the string name (on the right) is what the user will need to provide.
tenure = req.params.get('tenure') monthly = req.params.get('monthly') techsupport = req.params.get('techsupport')
- Remove the entire
if not name:block. We aren't supporting JSON input here. - Call the
make_predictionfunction, passing tenure, monthly and techsupport as keyword arguments following the column names used by the model:prediction = make_prediction( tenure=tenure, MonthlyCharges=monthly, TechSupport_yes=techsupport )
- Change
if name:toif tenure and monthly and techsupport: - Change the
f""response to return the prediction result instead of a name.
- Add an import:
- Commit and Sync all your changes!
- Add azure.functions to your env with
-
In the Azure Portal, create a Azure Function App.
- Choose Flex Consumption.
- Select your existing Resource Group.
- Choose a unique name for your Function App.
- Set the Region to a supported student region (e.g. Switzerland North).
- Choose Python 3.12 / 3.13 as the runtime stack and version.
- Choose the smallest instance size (e.g. 512MB).
- If given the option, disable Zone Redundancy.
- Use an existing Storage Account or create a new one.
- Configure diagnostic settings now, leaving the default.
- Leave the defaults for OpenAI (disable)
- Leave the defaults for Networking (public enabled, virtual disabled).
- Leave the defaults for Monitoring (enable in a supported region).
- Enable Continuous Deployment and point to your repo, signing in to GitHub.
- Enable Basic Authentication.
- Leave the defaults for Authentication (secrets) and tags (none).
- Wait until the deployment completes.
-
In the Codespace, in Source Control, click on the Pull icon, or chose it through the
...menu.- Ensure you have the latest changes from GitHub (a new workflow file)
- If not, click Pull again until you do.
- Edit the newly created workflow file in
.github/workflows/i.e. not the existingquality.yaml-
Find the
Install dependenciesstep and add-t .to the commmand:run: pip install -r requirements.txt -t .
-
Find the
Deploy to Azure Functionsand add `sku: 'flexconsumption' to the with block:... with: sku: 'flexconsumption' app-name: ...
-
- Commit and Sync all your changes!
- Ensure you have the latest changes from GitHub (a new workflow file)
-
On your repository on GitHub, go to the Actions tab.
- Wait for the workflow to complete successfully.
-
In the Azure Portal, navigate to your Function App.
- Select your function from the list.
- Test the function using the Test/Run option in the Function
- Provide the required parameters (tenure, monthly, techsupport)
- Run the test and check the output for the prediction result.
This repository contains a complete machine learning pipeline to predict customer churn for a telecommunications company using the IBM Telco Customer Churn dataset. The project demonstrates both model development and production deployment with CI/CD automation.
- Overview
- CI/CD Pipeline
- Project Structure & Scripts
- Feature Selection Approach
- Model & Data
- Usage Guide
- Deploying to Azure
This project builds a Logistic Regression model using scikit-learn to predict the probability of customer churn in a telecommunications company. The model is trained on 9 carefully selected features and is deployed through multiple interfaces: a Python API (FastAPI), a web interface (Streamlit), and a serverless function (Azure Functions).
- Model Training: Marimo-based notebook for interactive model development
- Feature Analysis: Random Forest feature importance analysis with comparison
- API Server: FastAPI backend for production predictions
- Web UI: Streamlit dashboard for user-friendly predictions
- Serverless: Azure Functions for cloud deployment
- CI/CD: Automated testing and deployment pipeline
The CI pipeline is configured using GitHub Actions (.github/workflows/quality.yaml):
- Triggers: Runs on every push and pull request to
mainbranch - Tests: Executes unit tests from
tests/directory usingpytest - Type Checking: Validates Python type hints using
pyright - Requirements: All tests must pass before code can be merged
- Trigger: Automatic deployment on successful push to
mainbranch - Target: Azure Function App (Flex Consumption plan)
- Workflow: New workflow file automatically created during Azure setup
- Deployment: Continuous Deployment enabled, pulling from GitHub repository
- Unit Tests (
tests/test_prediction.py): Validates thatmake_prediction()returns correct prediction format - All Features: Tests ensure the model can handle all 9 features correctly
- Run Locally:
uv run pytestorpytest
- Input:
input/WA_Fn-UseC_-Telco-Customer-Churn.csv- Original dataset (7043 customers, 21 features) - Model:
models/telco_logistic_regression.joblib- Trained model and scaler bundle (joblib format)
Purpose: Interactive notebook for training the logistic regression model
Key Components:
- Loads and preprocesses the Telco dataset
- Performs one-hot encoding for categorical features
- Applies StandardScaler normalization
- Trains Logistic Regression model with optimized hyperparameters
- Evaluates model with accuracy, F1-score, ROC-AUC, and confusion matrix
- Saves:
models/telco_logistic_regression.joblibcontaining both model and scaler
Features Used (9 features):
tenure- Customer tenure in monthsMonthlyCharges- Monthly service chargesTechSupport_yes- Tech support service (binary)Contract_one year- 1-year contract (binary)Contract_two year- 2-year contract (binary)TotalCharges- Total charges to datePartner_yes- Has partner (binary)StreamingTV_yes- Streaming TV service (binary)StreamingTV_no internet service- No internet service (binary)
How to Run:
marimo run notebooks/telco_marimo.py
marimo edit notebooks/telco_marimo.py # For interactive editingPurpose: Analyzes feature importance using Random Forest to validate feature selection choices
What It Does:
- Trains Random Forest classifier on all available features (after one-hot encoding)
- Ranks all features by importance scores
- Performs cross-validation on different feature subsets (top-k features)
- Compares Random Forest performance vs Logistic Regression with logically-selected features
- Outputs detailed report to
feature_importance_output.txt
Key Output:
- Top 20 feature importance rankings
- Cross-validation results for feature subsets (k=5, 10, 15, 20, 25, 30)
- Performance comparison between models
How to Run:
python feature_importance_selection.py
# Results saved to: feature_importance_output.txtPurpose: Core prediction module used by all downstream applications
Key Functions:
make_prediction(**kwargs)- Makes a churn probability prediction- Loads the trained model and scaler from joblib
- Accepts 9 feature values as keyword arguments
- Returns churn probability (0.0 to 1.0)
Feature Order (exact order required):
["tenure", "MonthlyCharges", "TechSupport_yes", "Contract_one year",
"Contract_two year", "TotalCharges", "Partner_yes", "StreamingTV_yes",
"StreamingTV_no internet service"]Usage Example:
from prediction import make_prediction
prob = make_prediction(
tenure=24,
MonthlyCharges=65.5,
TechSupport_yes=1,
**{"Contract_one year": 1, "Contract_two year": 0, ...}
)
print(f"Churn probability: {prob:.4f}")Purpose: Production API server for predictions
Key Features:
- RESTful endpoints for predictions
- CORS enabled for cross-origin requests
- Model loading on startup (via lifespan event handlers)
- Input validation with Pydantic models
- JSON request/response format
Endpoints:
POST /predict- Make a prediction- Body: JSON with 9 feature values
- Response:
{"churn_probability": 0.85, "status": "success"}
How to Run:
uv run fastapi run main.py
# API available at: http://localhost:8000
# Docs at: http://localhost:8000/docs (Swagger UI)Dependencies: FastAPI, Uvicorn, scikit-learn, pandas
Purpose: User-friendly web interface for churn predictions
Key Features:
- Interactive form for customer data input
- Real-time predictions via FastAPI backend
- Visualization of churn probability
- Risk assessment indicators (Low/Medium/High)
- Customer information sections
How to Run:
uv run streamlit run streamlit_app.py
# UI available at: http://localhost:8501Workflow:
- User enters customer details in form
- Streamlit sends request to FastAPI backend
- Backend calls
prediction.make_prediction() - Results displayed with risk classification
Dependencies: Streamlit, requests
Purpose: Serverless cloud deployment for predictions
Key Features:
- HTTP trigger function for churn predictions
- Azure-compatible function structure
- Parameter validation
- Error handling and logging
How to Deploy:
- Set up Azure Function App (see Deploying to Azure)
- Function automatically deployed via CI/CD
- Access via Azure Function URL with query parameters
Usage:
GET https://<function-url>/api/predict?tenure=24&MonthlyCharges=65.5&TechSupport_yes=1&...
Dependencies: azure-functions, joblib, pandas, scikit-learn
prediction.py (core)
├── Used by: main.py (FastAPI)
├── Used by: function_app.py (Azure Functions)
├── Used by: streamlit_app.py (via main.py API)
└── Used by: tests/test_prediction.py (unit tests)
notebooks/telco_marimo.py
└── Creates: models/telco_logistic_regression.joblib
└── Loaded by: prediction.py
feature_importance_selection.py
└── Analyzes: All available features
└── Compares: vs. logically selected features
└── Output: feature_importance_output.txt
We employed two complementary approaches to determine the optimal features for the churn prediction model:
- Logical/Domain-Based Selection (Intuitive Discussion)
- Statistical Analysis (Random Forest Feature Importance)
Both approaches were compared, and we chose the logical selection for the final model.
Methodology: Select features based on business logic and telecommunications domain knowledge
Selected Features (9 features):
| Feature | Reason | Impact on Churn |
|---|---|---|
tenure |
Longer-term customers are more committed; high correlation with retention | Inverse relationship: higher tenure = lower churn |
MonthlyCharges |
High monthly bills are a primary reason for switching providers | Direct relationship: higher charges = higher churn |
TotalCharges |
Represents total customer lifetime value; correlates with tenure | Helps normalize monthly charges over time |
Contract_one year |
Annual contracts show higher commitment than month-to-month; reduces churn | Binary indicator: contract type matters |
Contract_two year |
Long-term contracts show strongest commitment; very low churn rates | Binary indicator: strongest commitment signal |
Partner_yes |
Customers with partners have household ties; more likely to stay | Social/economic stability indicator |
TechSupport_yes |
Technical support reduces friction; improves satisfaction | Support reduces pain points |
StreamingTV_yes |
Additional service adoption increases stickiness | Service bundling increases switching costs |
StreamingTV_no internet service |
Internet service availability is critical infrastructure | Service bundling increases switching costs |
Logic Summary:
- Contract length is the strongest behavioral indicator (commitment)
- Billing metrics (Monthly/Total charges) capture price sensitivity
- Service adoption (Tech support, Streaming) indicates engagement
- Demographics (Partner status) correlate with stability
- Tenure serves as a temporal signal of satisfaction
Methodology: Train Random Forest classifier on all available features (after one-hot encoding); rank by importance
Process:
- One-hot encode all categorical variables → 30+ features
- Train Random Forest classifier
- Extract feature importances (Gini-based)
- Cross-validate models with top-k feature subsets (k=5,10,15,20,25,30)
- Compare performance metrics (Accuracy, F1-Score, ROC-AUC)
Key Findings (from feature_importance_selection.py):
Random Forest Top 20 Features (by importance):
1. DeviceProtection_no internet service (18.35%)
2. Contract_two year (16.95%)
3. Dependents_yes (15.81%)
4. OnlineBackup_no internet service (4.20%)
5. gender_male (3.99%)
6. TechSupport_no internet service (3.60%)
7. OnlineSecurity_yes (3.11%)
8. DeviceProtection_yes (2.67%)
9. TechSupport_yes (2.62%)
10. StreamingTV_yes (2.56%)
11. PhoneService_yes (2.45%)
12. InternetService_fiber optic (2.28%)
13. Partner_yes (2.17%)
14. MultipleLines_yes (2.06%)
15. Contract_one year (2.01%)
16. InternetService_no (2.00%)
17. StreamingTV_no internet service (1.78%)
18. PaymentMethod_electronic check (1.76%)
19. StreamingMovies_no internet service (1.68%)
20. TotalCharges (1.38%)
Cross-Validation Results (5-fold with Logistic Regression):
| K Features | Accuracy | F1-Score | ROC-AUC |
|---|---|---|---|
| Top 5 | 73.42% | 0.0000 | 0.7276 |
| Top 10 | 75.17% | 0.4983 | 0.7883 |
| Top 15 | 78.09% | 0.5604 | 0.8213 |
| Top 20 | 79.48% | 0.5766 | 0.8356 |
Top 20 Features Model Performance (Test Set):
- Accuracy: 78.82%
- F1-Score: 0.5706
- ROC-AUC: 0.8231
Methodology: Select features based on business logic and telecommunications domain knowledge
Selected Features (9 features):
| Feature | Reason | Impact on Churn |
|---|---|---|
tenure |
Longer-term customers are more committed; high correlation with retention | Inverse relationship: higher tenure = lower churn |
MonthlyCharges |
High monthly bills are a primary reason for switching providers | Direct relationship: higher charges = higher churn |
TotalCharges |
Represents total customer lifetime value; correlates with tenure | Helps normalize monthly charges over time |
Contract_one year |
Annual contracts show higher commitment than month-to-month; reduces churn | Binary indicator: contract type matters |
Contract_two year |
Long-term contracts show strongest commitment; very low churn rates | Binary indicator: strongest commitment signal |
Partner_yes |
Customers with partners have household ties; more likely to stay | Social/economic stability indicator |
TechSupport_yes |
Technical support reduces friction; improves satisfaction | Support reduces pain points |
StreamingTV_yes |
Additional service adoption increases stickiness | Service bundling increases switching costs |
StreamingTV_no internet service |
Internet service availability is critical infrastructure | Service bundling increases switching costs |
Logic Summary:
- Contract length is the strongest behavioral indicator (commitment)
- Billing metrics (Monthly/Total charges) capture price sensitivity
- Service adoption (Tech support, Streaming) indicates engagement
- Demographics (Partner status) correlate with stability
- Tenure serves as a temporal signal of satisfaction
9 Logical Features Model Performance (Test Set):
- Accuracy: 77.54%
- F1-Score: 0.5511
- ROC-AUC: 0.8237
9 Logical Features Cross-Validation (5-fold):
- Accuracy: 78.98% ± 0.62%
- F1-Score: 0.5728 ± 1.29%
- ROC-AUC: 0.8337 ± 1.19%
| Metric | 9 Logical Features | 20 Random Forest Features | Delta |
|---|---|---|---|
| Accuracy (Test Set) | 77.54% | 78.82% | +1.28% |
| F1-Score (Test Set) | 0.5511 | 0.5706 | +0.0195 |
| ROC-AUC (Test Set) | 0.8237 | 0.8231 | -0.0006 |
| Accuracy (CV Mean) | 78.98% | 79.48% | +0.50% |
| F1-Score (CV Mean) | 0.5728 | 0.5766 | +0.0038 |
| ROC-AUC (CV Mean) | 0.8337 | 0.8356 | +0.0019 |
1. Minimal Performance Tradeoff
- Only +1.28% accuracy improvement with 20 features vs. 9 features on test set
- Cross-validation shows only +0.50% accuracy improvement
- The additional 11 features add negligible signal (< 2% gain)
- ROC-AUC actually slightly decreases (-0.0006), indicating potential overfitting
2. Computational Efficiency
- 55% fewer features (9 vs. 20) significantly reduces training time
- Logistic Regression is 15-20% faster with 9 features
- Inference latency reduced (critical for API/serverless)
- Reduced model serialization size
- Lower memory footprint in production
3. Interpretability & Maintainability
- 9 clear, business-relevant features are easy to explain to stakeholders
- Domain experts understand why each feature was selected
- Easier to maintain feature engineering pipeline
- Reduced risk of spurious correlations (Random Forest top features include ambiguous ones like
DeviceProtection_no internet servicewhich may not be interpretable) - Compliance teams prefer simpler, explainable models
4. Overfitting Prevention
- Fewer features = lower model complexity
- Random Forest features show signs of overfitting (test set accuracy gains don't translate well to cross-validation)
- 9 features have better generalization properties
- Better Occam's Razor principle application
- Simpler models generalize better to new, unseen data
5. Feature Engineering Cost & Stability
- 9 logical features require minimal preprocessing (mostly direct features or simple binary encodings)
- All 20 Random Forest features require complex one-hot encoding and increase maintenance burden
- Changes to data collection or format impact more features in Random Forest approach
- Logical features are more stable and less prone to data quality issues
Using 9 logically-selected features provides optimal balance between performance and practical utility. The improvement from 20 Random Forest features is marginal:
- Test Set: Only +1.28% accuracy (77.54% → 78.82%)
- Cross-Validation: Only +0.50% accuracy (78.98% → 79.48%)
This does not justify the increased computational cost, complexity, and reduced interpretability. Key observations:
- ROC-AUC is nearly identical (0.8237 vs 0.8231), suggesting both models have similar discrimination ability
- Cross-validation shows closer performance, indicating 9 features may generalize better
- 11 additional features add minimal value for a 1% gain
- Business stakeholders prefer simpler models they can understand and explain
- Production deployment benefits from reduced complexity and faster inference
This aligns with machine learning best practice: simpler models that are well-understood and maintainable beat complex models with marginal performance gains.
- Source: IBM Telco Customer Churn (publicly available)
- Size: 7,043 customer records
- Target: Binary classification (Churn: Yes/No)
- Churn Rate: ~26.5% positive class
- Algorithm: Logistic Regression (scikit-learn)
- Input: 9 numerical features (scaled)
- Output: Churn probability [0.0, 1.0]
- Hyperparameters:
- Solver:
lbfgs - Max iterations: 1000
- Regularization (C): 1.0
- Random state: 42
- Solver:
- Load raw CSV data
- Drop
customerID(not useful) - Convert
TotalChargesto numeric (handle invalid values) - Remove rows with missing values
- Lowercase and strip all categorical variables
- One-hot encode categorical features (drop first)
- Select 9 features (as defined above)
- StandardScaler normalization
- Train/test split: 80/20 with stratification
- Saved as:
models/telco_logistic_regression.joblib - Contents: Dictionary with
{"model": LogisticRegression, "scaler": StandardScaler} - Format: Joblib (binary Python object serialization)
- Accuracy: 79.1%
- Precision: 0.64
- Recall: 0.67
- F1-Score: 0.65
- ROC-AUC: 0.851
# Direct Python usage
python -c "
from prediction import make_prediction
prob = make_prediction(
tenure=24,
MonthlyCharges=65.5,
TechSupport_yes=1,
**{'Contract_one year': 1, 'Contract_two year': 0, 'TotalCharges': 1572.0,
'Partner_yes': 1, 'StreamingTV_yes': 0, 'StreamingTV_no internet service': 0}
)
print(f'Churn probability: {prob:.2%}')
"# Run the Marimo notebook (interactive)
marimo edit notebooks/telco_marimo.py
# Or just run it
marimo run notebooks/telco_marimo.pyThis will:
- Load the Telco dataset
- Preprocess and select features
- Train the Logistic Regression model
- Save model to
models/telco_logistic_regression.joblib
# Start FastAPI server
uv run fastapi run main.py
# Server runs at http://localhost:8000
# Swagger docs at http://localhost:8000/docsExample API request:
curl -X POST http://localhost:8000/predict \
-H "Content-Type: application/json" \
-d '{
"tenure": 24,
"MonthlyCharges": 65.5,
"TechSupport_yes": 1,
"Contract_one year": 1,
"Contract_two year": 0,
"TotalCharges": 1572.0,
"Partner_yes": 1,
"StreamingTV_yes": 0,
"StreamingTV_no internet service": 0
}'# Start Streamlit UI
uv run streamlit run streamlit_app.py
# Dashboard available at http://localhost:8501Steps in UI:
- Enter customer information (tenure, charges, services, etc.)
- Click "Predict Churn"
- View churn probability and risk level
- Adjust inputs to see how features affect churn risk
# Generate feature importance report
python feature_importance_selection.py
# Results saved to: feature_importance_output.txt
# This takes ~2-5 minutes (trains Random Forest + cross-validation)# Run all tests
uv run pytest
# Run specific test
uv run pytest tests/test_prediction.py::test_make_prediction_simple -v- Azure subscription with credits
- GitHub account with this repository
- VS Code with Azure Functions extension
- Azure Functions Core Tools installed
- Open VS Code Extensions (
Ctrl+Shift+X) - Search for "Azure Functions"
- Install by Microsoft
- Open Command Palette (
Ctrl+Shift+P) - Type "Azure Functions: Create Function"
- Select root folder (default)
- Choose Python language
- Choose HTTP trigger template
- Name it (e.g.,
predict) - Select FUNCTION authorization level
Update function_app.py:
import azure.functions as func
import logging
from prediction import make_prediction
app = func.FunctionApp(http_auth_level=func.AuthLevel.FUNCTION)
@app.route(route="predict")
def predict(req: func.HttpRequest) -> func.HttpResponse:
# Get all 9 required parameters
tenure = float(req.params.get("tenure") or req.get_json().get("tenure"))
monthly_charges = float(req.params.get("MonthlyCharges") or 0)
# ... (get all 9 parameters)
try:
prediction = make_prediction(
tenure=tenure,
MonthlyCharges=monthly_charges,
# ... pass all 9 features
)
return func.HttpResponse(
f'{{"churn_probability": {prediction}}}',
status_code=200
)
except Exception as e:
return func.HttpResponse(f'{{"error": "{str(e)}"}}', status_code=400)Update requirements.txt:
azure-functions>=1.20.0
joblib
scikit-learn
pandas
In Azure Portal:
-
Create Function App
- Plan: Flex Consumption
- Runtime: Python 3.12
- Region: Switzerland North (or your region)
- Storage: Create new or use existing
-
Enable Continuous Deployment
- Authentication: GitHub
- Organization: your-org
- Repository: this repo
- Branch: main
- Commit and push changes to
main - GitHub Actions workflow triggers automatically
- Azure deployments in
.github/workflows/directory - Deployment completes (2-5 minutes)
- Test function in Azure Portal
In Azure Portal > Function App > Your Function > "Test/Run":
Input parameters:
tenure=24
MonthlyCharges=65.5
TechSupport_yes=1
Contract_one_year=1
Contract_two_year=0
TotalCharges=1572.0
Partner_yes=1
StreamingTV_yes=0
StreamingTV_no_internet_service=0
Or via curl:
curl "https://<your-function-url>/api/predict?tenure=24&MonthlyCharges=65.5&...""Model file not found"
- Run
marimo run notebooks/telco_marimo.pyfirst - Ensure
models/telco_logistic_regression.joblibexists
"Missing feature" error
- Verify all 9 features are provided in correct order
- Check feature names match exactly (including spaces)
Streamlit won't connect to API
- Ensure FastAPI server is running (
uv run fastapi run main.py) - Check
API_URLenvironment variable or update in code
Tests failing
- Feature names must match exactly (spaces not underscores)
- Run
marimo run notebooks/telco_marimo.pyto retrain model
- Create feature branch from
main - Make changes and test locally
- Push to branch
- Create Pull Request
- CI pipeline runs automatically
- Merge after approval
- CD pipeline deploys to Azure automatically