🔮 ChurnGuard AI — Customer Churn Prediction & Retention Strategy System

ChurnGuard AI is an end-to-end customer churn prediction and retention intelligence web application built with Streamlit and scikit-learn, with optional LLM integration (OpenAI or Google Gemini).

📌 Features

Module	Description
📥 Data Ingestion	Fetches real-time customer data from the IBM Telco public dataset or any custom API endpoint
🔧 ETL Pipeline	Automated data cleaning, encoding, feature engineering (LTV, engagement score, NPS segments, etc.) and Min-Max scaling
🤖 ML Models	Trains and evaluates Logistic Regression and Random Forest classifiers with 5-fold cross-validation
📊 Dashboard	Interactive KPI cards, churn probability distribution, risk segmentation pie chart, and revenue analysis
💡 Retention Strategies	AI-generated personalised retention playbooks per customer using OpenAI GPT or Google Gemini (with a rich mock fallback)
📤 Power BI Export	One-click CSV export with predictions and risk segments ready for Power BI dashboards
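
The Min-Max scaling step named in the ETL row rescales each numeric feature to the [0, 1] range. A minimal sketch of the idea in plain Python is below. The function name `min_max_scale` and the sample tenure values are illustrative assumptions, not code taken from `etl_pipeline.py`, and the actual module may implement the step differently.

```python
from typing import List


def min_max_scale(values: List[float]) -> List[float]:
    """Rescale values linearly so the minimum maps to 0.0 and the maximum to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; return zeros to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical tenure values in months (not real customer data).
tenure_months = [1, 12, 24, 72]
scaled_tenure = min_max_scale(tenure_months)
```

Scaling matters most for Logistic Regression, whose coefficients are sensitive to feature ranges; tree-based models such as Random Forest are largely unaffected by it.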

🏗️ Project Structure

churn_retention_system/
│
├── app.py                        # Main Streamlit UI (5-tab interface)
├── requirements.txt              # Python dependencies
├── .gitignore
├── data/
│   └── power_bi_export.csv       # Output file for Power BI
│
└── modules/
    ├── __init__.py               # Package exports
    ├── scraper.py                # Real-time data fetcher (IBM Telco + custom API)
    ├── etl_pipeline.py           # Data cleaning & feature engineering
    ├── ml_models.py              # Random Forest & Logistic Regression
    └── llm_strategy.py           # OpenAI / Gemini / Mock strategy generator

🚀 Getting Started

1. Clone the Repository

git clone https://github.com/jaideep005/churn_retention_system.git
cd churn_retention_system

2. Install Dependencies

pip install -r requirements.txt

3. Set Up Environment Variables (Optional)

Create a .env file in the project root for LLM API keys:

GEMINI_API_KEY=your_gemini_api_key_here
OPENAI_API_KEY=your_openai_api_key_here

Note: If no API key is provided, the app automatically uses the built-in mock strategy generator.

4. Run the App

streamlit run app.py

The app will open at http://localhost:8501 in your browser.

🔄 App Workflow

1. Data Ingestion  →  Fetch real IBM Telco data (or upload CSV)
2. ETL Pipeline    →  Clean, encode, and engineer features
3. ML Models       →  Train Logistic Regression + Random Forest
4. Dashboard       →  View KPIs, risk segments, revenue analysis
5. Strategies      →  Generate AI-powered retention plans per customer

🧠 Machine Learning

Two classifiers are trained and compared:

Model	Key Hyperparameters
Logistic Regression	`C=1.0`, `class_weight=balanced`, `max_iter=1000`
Random Forest	`n_estimators=300`, `max_depth=12`, `class_weight=balanced`

Both models are evaluated using:

Accuracy, Precision, Recall, F1 Score, ROC-AUC
5-Fold Stratified Cross-Validation

The best model (by ROC-AUC) is automatically selected to generate full-dataset predictions.
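
The comparison described above can be sketched with scikit-learn as follows. The hyperparameters come from the table in this section; everything else is an assumption made for illustration: the synthetic data from `make_classification`, the `random_state` values, shuffling inside `StratifiedKFold`, and ranking by mean cross-validated ROC-AUC (the app may rank on a held-out test split instead). This is not the code in `ml_models.py`.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic, imbalanced stand-in for the churn dataset (assumption).
X, y = make_classification(
    n_samples=400, n_features=10, weights=[0.75, 0.25], random_state=42
)

# Hyperparameters as listed in the table above.
models = {
    "Logistic Regression": LogisticRegression(
        C=1.0, class_weight="balanced", max_iter=1000
    ),
    "Random Forest": RandomForestClassifier(
        n_estimators=300, max_depth=12, class_weight="balanced", random_state=42
    ),
}

# Stratified folds keep the churn ratio roughly equal in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

# Mean ROC-AUC across the five folds for each model.
mean_auc = {
    name: cross_val_score(model, X, y, cv=cv, scoring="roc_auc").mean()
    for name, model in models.items()
}

# Pick the model with the highest mean ROC-AUC, refit it on all rows,
# and score every row with its churn probability.
best_name = max(mean_auc, key=mean_auc.get)
best_model = models[best_name].fit(X, y)
churn_probability = best_model.predict_proba(X)[:, 1]
```

`class_weight="balanced"` reweights the minority (churned) class, which is why ROC-AUC rather than raw accuracy is a sensible selection criterion on imbalanced churn data.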

🌐 Real-Time Data Source

By default, the scraper pulls from the IBM Telco Customer Churn public dataset:

https://raw.githubusercontent.com/IBM/telco-customer-churn-on-icp4d/master/data/Telco-Customer-Churn.csv

To use a custom API endpoint, pass your own URL and column mapping:

from modules.scraper import CompanyDataScraper

scraper = CompanyDataScraper(
    source_url="https://your-api.com/customers",
    column_map={
        "TimeWithCompany": "tenure_months",
        "MonthlySpend":    "monthly_revenue",
        "DidTheyLeave":    "churn_raw",
    },
    request_headers={"Authorization": "Bearer YOUR_TOKEN"},
)
df = scraper.fetch()
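
To illustrate what a column mapping like the one above might do, here is a small standalone sketch. It assumes the keys of `column_map` are the field names returned by the API and the values are the names the pipeline expects; that direction is an assumption, as is the helper `apply_column_map`, which is hypothetical and not part of `modules.scraper`.

```python
from typing import Dict, List


def apply_column_map(
    records: List[Dict[str, object]], column_map: Dict[str, str]
) -> List[Dict[str, object]]:
    """Rename keys in each record; keys absent from column_map are kept as-is."""
    return [
        {column_map.get(key, key): value for key, value in record.items()}
        for record in records
    ]


# Hypothetical API payload using the source field names from the example above.
payload = [
    {"TimeWithCompany": 14, "MonthlySpend": 70.5, "DidTheyLeave": "Yes"},
    {"TimeWithCompany": 3, "MonthlySpend": 29.9, "DidTheyLeave": "No"},
]
column_map = {
    "TimeWithCompany": "tenure_months",
    "MonthlySpend": "monthly_revenue",
    "DidTheyLeave": "churn_raw",
}
renamed = apply_column_map(payload, column_map)
```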

💡 LLM Retention Strategies

Supports three modes:

Mode	Description
`mock`	Rich rule-based template (default, no API key needed)
`gemini`	Google Gemini 1.5 Flash
`openai`	OpenAI GPT-3.5-Turbo
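
The fallback to `mock` when a key is missing could look roughly like this. The function `resolve_llm_mode` is a hypothetical helper written for this README, not taken from `llm_strategy.py`; it only assumes the two environment variable names shown in the setup step.

```python
import os
from typing import Mapping

# Environment variable that each LLM mode needs (names from the .env example).
REQUIRED_KEY = {"gemini": "GEMINI_API_KEY", "openai": "OPENAI_API_KEY"}


def resolve_llm_mode(requested: str = "mock", env: Mapping[str, str] = os.environ) -> str:
    """Return the requested mode if its API key is set, otherwise 'mock'."""
    key_name = REQUIRED_KEY.get(requested)
    if key_name and env.get(key_name):
        return requested
    # Unknown modes, the explicit 'mock' mode, and missing keys all land here.
    return "mock"
```

For example, `resolve_llm_mode("gemini", {})` returns `"mock"`, while the same call with `GEMINI_API_KEY` set returns `"gemini"`.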

📤 Power BI Integration

After training, click "Export CSV for Power BI" in the Dashboard tab. The exported file includes:

All original customer columns
churn_probability (float 0–1)
predicted_churn (0 or 1)
risk_segment (Low / Medium / High Risk)
ltv_24mo (estimated 24-month lifetime value)
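
As a rough sketch of how the prediction columns relate to one another, the snippet below derives `predicted_churn` and `risk_segment` from `churn_probability` and writes the result as CSV. The thresholds (0.3, 0.6, and 0.5), the exact segment labels, and the `customer_id` field are assumptions chosen for illustration; the cut-offs the app actually uses are not documented here. `ltv_24mo` is omitted because its formula is not given.

```python
import csv
import io
from typing import Dict, List

# Hypothetical cut-offs; the app's real thresholds may differ.
MEDIUM_RISK_THRESHOLD = 0.3
HIGH_RISK_THRESHOLD = 0.6
CHURN_DECISION_THRESHOLD = 0.5


def risk_segment(churn_probability: float) -> str:
    """Bucket a churn probability into one of three risk segments."""
    if churn_probability >= HIGH_RISK_THRESHOLD:
        return "High Risk"
    if churn_probability >= MEDIUM_RISK_THRESHOLD:
        return "Medium Risk"
    return "Low Risk"


def build_export_rows(customers: List[Dict[str, object]]) -> List[Dict[str, object]]:
    """Append predicted_churn and risk_segment to each customer record."""
    rows = []
    for customer in customers:
        p = float(customer["churn_probability"])
        rows.append(
            {
                **customer,
                "predicted_churn": int(p >= CHURN_DECISION_THRESHOLD),
                "risk_segment": risk_segment(p),
            }
        )
    return rows


# Made-up customers for illustration only.
customers = [
    {"customer_id": "A-001", "churn_probability": 0.12},
    {"customer_id": "A-002", "churn_probability": 0.45},
    {"customer_id": "A-003", "churn_probability": 0.81},
]
rows = build_export_rows(customers)

# Write to an in-memory buffer; a real export would write to a file instead.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
```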

📦 Dependencies

streamlit==1.32.0
pandas==2.2.1
numpy==1.26.4
scikit-learn==1.4.1
plotly==5.20.0
openai==1.14.3
google-generativeai==0.4.1
python-dotenv==1.0.1
faker==24.2.0
requests==2.31.0
joblib==1.3.2

👨‍💻 Author

Jaideep — @jaideep005

📄 License

This project is licensed under the MIT License.
