<a href="https://colab.research.google.com/github/sahil9022-crypto/data-science-project-all-/blob/main/customer_segmentation_dyanmic_.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [6]:
# --- Install required packages ---
!pip install gradio pandas numpy scikit-learn plotly

import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
import plotly.express as px
import gradio as gr
import random, string, time

# ----------------------------
# Simulated Customer Data Stream
# ----------------------------
customers = []

def generate_customer_event(batch_size=5):
    """Generate random customer transactions"""
    global customers
    new_data = []
    for _ in range(batch_size):
        customer_id = ''.join(random.choices(string.ascii_lowercase, k=5))
        spend = np.random.randint(100, 5000)   # random spending
        visits = np.random.randint(1, 30)      # number of visits
        new_data.append([customer_id, spend, visits])
    customers.extend(new_data)
    df = pd.DataFrame(customers, columns=["CustomerID", "Spend", "Visits"])
    return df

# ----------------------------
# Clustering + Visualization
# ----------------------------
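def customer_segmentation(polling_interval, batch_size, n_clusters):
    """Simulate a new batch, run clustering, and return plots + table.

    polling_interval is accepted to match the UI inputs but is not used yet:
    the stream only advances when "Update Now" is clicked.
    """
    # gr.Number / gr.Slider may deliver floats; range() and KMeans need ints
    batch_size = max(1, int(batch_size))
    df = generate_customer_event(batch_size)

    # Run KMeans on features; k cannot exceed the number of customers so far
    X = df[["Spend", "Visits"]]
    k = min(int(n_clusters), len(df))
    kmeans = KMeans(n_clusters=k, random_state=42, n_init=10)
    df["Cluster"] = kmeans.fit_predict(X)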

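    # PCA for visualization. With only two input features this centers and
    # rotates the axes rather than reducing dimensionality; it needs >= 2 rows.
    pca = PCA(n_components=2)
    df[["PCA1", "PCA2"]] = pca.fit_transform(X)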

    # Scatter plot (clusters)
    fig1 = px.scatter(
        df, x="PCA1", y="PCA2", color="Cluster", size="Spend",
        hover_data=["CustomerID", "Spend", "Visits"],
        title="Customer Segmentation (PCA Visualization)"
    )

    # Bar chart (Top customers by spend)
    top_customers = df.sort_values("Spend", ascending=False).head(5)
    fig2 = px.bar(
        top_customers, x="CustomerID", y="Spend", color="Cluster",
        title="Top Customers by Spend"
    )

    status = f"Stream updated at {time.strftime('%H:%M:%S')} with {batch_size} new events."

    return fig1, fig2, df[["CustomerID", "Spend", "Visits", "Cluster"]], status


# ----------------------------
# Gradio UI
# ----------------------------
with gr.Blocks(theme=gr.themes.Soft()) as demo:
    gr.Markdown("## 📊 Customer Segmentation Dashboard — Dynamic (Simulated live data)")

    with gr.Row():
        polling_interval = gr.Number(value=5, label="Polling Interval (seconds)")
        # precision=0 makes Gradio pass an int, which range() requires
        batch_size = gr.Number(value=5, precision=0, label="Events per tick (batch size)")
        n_clusters = gr.Slider(2, 8, value=3, step=1, label="Number of clusters (k)")

    with gr.Row():
        start_btn = gr.Button("🔄 Update Now")

    with gr.Row():
        out_plot1 = gr.Plot()
        out_plot2 = gr.Plot()

    with gr.Row():
        out_table = gr.Dataframe(headers=["CustomerID", "Spend", "Visits", "Cluster"])

    status_out = gr.Textbox(label="Status", interactive=False)

    start_btn.click(
        customer_segmentation,
        inputs=[polling_interval, batch_size, n_clusters],
        outputs=[out_plot1, out_plot2, out_table, status_out]
    )

demo.launch(share=True)


Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://346a411b3f19421479.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)




# 📊 Customer Segmentation Dashboard — Dynamic (Simulated Live Data)

## 🔹 Project Overview
This project segments customers by their transaction behavior (total spend and number of visits) and shows the result in an interactive dashboard.  
Instead of loading a static dataset, the dashboard **simulates a stream of customer data**: each click of **Update Now** appends a new random batch, re-runs **K-Means clustering**, and redraws a **PCA scatter plot** and a top-spenders bar chart.

---

## 🚀 Features
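- ✅ **Dynamic Data Simulation** – A new batch of random customers is generated on every update.  
- ✅ **K-Means Clustering** – Groups customers into k segments by spend and visits (k is set with a slider, 2 to 8).  
- ✅ **PCA Visualization** – Projects customers onto two principal components for plotting. With only two input features, this rotates the axes rather than reducing dimensions.  
- ✅ **Interactive Dashboard (Gradio UI)** – Controls for batch size and number of clusters, plus an **Update Now** button.  
- ✅ **Business Insights** – Shows the top five spenders and the cluster assigned to each customer.  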
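
K-Means groups points by Euclidean distance, so when `Spend` (100 to 5000) and `Visits` (1 to 30) are clustered unscaled, `Spend` dominates the result. A minimal sketch of standardizing the features first follows. `make_customers` and `segment_scaled` are hypothetical helpers that are not part of the notebook, and the synthetic data only mimics the notebook's value ranges.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler


def make_customers(n=200, seed=0):
    """Hypothetical stand-in for the simulated stream (same value ranges)."""
    rng = np.random.default_rng(seed)
    return pd.DataFrame({
        "Spend": rng.integers(100, 5000, size=n),
        "Visits": rng.integers(1, 30, size=n),
    })


def segment_scaled(df, n_clusters=3):
    """Standardize Spend and Visits, then cluster with K-Means."""
    X_scaled = StandardScaler().fit_transform(df[["Spend", "Visits"]])
    kmeans = KMeans(n_clusters=n_clusters, random_state=42, n_init=10)
    labels = kmeans.fit_predict(X_scaled)
    return X_scaled, labels


df = make_customers()
X_scaled, labels = segment_scaled(df)
df["Cluster"] = labels

# Per-cluster averages in the original units, for a quick comparison
print(df.groupby("Cluster")[["Spend", "Visits"]].mean().round(1))
```

After scaling, both features have zero mean and unit variance, so a difference of one standard deviation in visits counts as much as one standard deviation in spend.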

---

## 🛠️ Tech Stack
- **Python** → Data processing & ML  
- **Pandas / NumPy** → Data handling  
- **Scikit-learn** → K-Means clustering + PCA  
- **Plotly** → Interactive charts (scatter, bar)  
- **Gradio** → Dashboard UI (easy to deploy from Google Colab)  

---

## 📂 Project Workflow
1. **Data Simulation** → Generate random customers (spend, visits).  
2. **Segmentation** → Apply **K-Means clustering**.  
3. **Projection** → PCA onto two components for the 2D scatter plot.  
4. **Visualization** → Cluster scatterplot + Top spenders bar chart.  
5. **Dashboard UI** → Gradio for interactive user experience.  
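
Step 2 leaves the choice of k to the slider. One common way to sanity-check that choice is the silhouette score. The sketch below tries each k in the slider's range and returns the one with the highest score. `suggest_k` is a hypothetical helper, not part of the notebook, and the three synthetic blobs are made up for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler


def suggest_k(X, k_min=2, k_max=8):
    """Return (best_k, scores) using the mean silhouette score per k."""
    X_scaled = StandardScaler().fit_transform(X)
    # silhouette_score needs at most n_samples - 1 distinct labels
    upper = min(k_max, len(X_scaled) - 1)
    scores = {}
    for k in range(k_min, upper + 1):
        labels = KMeans(n_clusters=k, random_state=42, n_init=10).fit_predict(X_scaled)
        scores[k] = float(silhouette_score(X_scaled, labels))
    best_k = max(scores, key=scores.get)
    return best_k, scores


# Three well-separated synthetic groups of (spend, visits) pairs
rng = np.random.default_rng(0)
centers = [(500, 3), (2500, 15), (4500, 27)]
X = np.vstack([
    np.column_stack([rng.normal(s, 50, 30), rng.normal(v, 1, 30)])
    for s, v in centers
])

best_k, scores = suggest_k(X)
print(best_k, {k: round(v, 2) for k, v in scores.items()})
```

On the notebook's uniformly random data there is no real cluster structure, so the scores will be low and close together; the check is more informative on real transactions.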

---

## 📊 Business Application
Applied to real transaction data, the same pipeline could help businesses (such as **e-commerce, retail, or marketing agencies**):  
- 🎯 Identify **customer segments** (e.g., high spenders, frequent visitors).  
- 📈 Tailor **marketing campaigns** to each cluster.  
- 💰 Improve **ROI** by focusing spend on the segments most likely to respond.  
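
To turn cluster labels into segments such as "high spenders", it helps to summarize each cluster in the original units. A minimal sketch with pandas follows. `profile_clusters` is a hypothetical helper, and the four example rows are invented; it assumes a table with the same columns the dashboard returns.

```python
import pandas as pd


def profile_clusters(df):
    """Summarize each cluster: size, average spend, average visits."""
    return (
        df.groupby("Cluster")
        .agg(
            Customers=("CustomerID", "count"),
            AvgSpend=("Spend", "mean"),
            AvgVisits=("Visits", "mean"),
        )
        .sort_values("AvgSpend", ascending=False)
    )


# Invented rows in the dashboard's output format
example = pd.DataFrame({
    "CustomerID": ["aaaaa", "bbbbb", "ccccc", "ddddd"],
    "Spend": [4500, 4000, 300, 500],
    "Visits": [25, 20, 2, 4],
    "Cluster": [0, 0, 1, 1],
})

profile = profile_clusters(example)
print(profile)
```

The cluster at the top of the resulting table is the highest-spending segment, which is a natural first candidate for a targeted campaign.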

---


## ▶️ How to Run
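1. Open the notebook in [Google Colab](https://colab.research.google.com/) using the **Open In Colab** badge at the top, or paste the code into a new notebook.  
2. Run all cells → the packages are installed and the app is launched.  
3. Gradio prints a public share link (the app is launched with `share=True`).  
4. Click the link → Use the interactive dashboard.  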

---

## 👨‍💻 Author
**Sahil Pawar**  
- 🎓 Engineering Student | Data Scientist | Ethical Hacker | Full Stack Developer  
- 📧 Email: publichacker9999@gmail.com

---

## 🌟 Future Improvements
- 🔗 Connect to **real e-commerce APIs** (e.g., FakeStore API, Shopify API).  
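- ⏱️ Add **auto-refresh streaming** so that the Polling Interval input drives updates (it is currently unused; the stream advances only on a button click).  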
- 📊 Deploy on **Hugging Face Spaces** for permanent free hosting.  
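
One possible route to the auto-refresh item above: recent Gradio releases include a `gr.Timer` component whose `tick` event calls a function at a fixed interval. The sketch below assumes such a version is installed. `tick_status` and `build_demo` are hypothetical names, and `tick_status` only returns a status line where the real app would call `customer_segmentation`.

```python
import time


def tick_status(batch_size):
    """Hypothetical stand-in for customer_segmentation: status line only."""
    return (
        f"Stream updated at {time.strftime('%H:%M:%S')} "
        f"with {int(batch_size)} new events."
    )


def build_demo(interval_seconds=5):
    """Build a Blocks app that refreshes the status on every timer tick."""
    # Imported here so tick_status can be used without Gradio installed
    import gradio as gr

    with gr.Blocks() as demo:
        batch_size = gr.Number(value=5, precision=0, label="Events per tick (batch size)")
        status_out = gr.Textbox(label="Status", interactive=False)

        # Assumption: gr.Timer is available in the installed Gradio version
        timer = gr.Timer(interval_seconds)
        timer.tick(tick_status, inputs=[batch_size], outputs=[status_out])
    return demo


# To try it in a notebook:
# build_demo().launch()
```

The interval is fixed when the app is built in this sketch; wiring it to the Polling Interval input would be a further step.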

---

## 💡 Pitch for Resume / Portfolio
> **Customer Segmentation Dashboard (Dynamic)** – Built an AI-powered marketing analytics dashboard that dynamically simulates live customer transactions and applies **K-Means clustering + PCA visualization**. Designed an interactive Gradio dashboard for actionable insights into customer behavior. Demonstrated real-world marketing applications by identifying top spenders & segmenting customers into meaningful clusters. **Tech Stack:** Python, Pandas, Scikit-learn, Plotly, Gradio.  
