<a href="https://colab.research.google.com/github/hannahbanjo/AssociationOfDataScience/blob/main/ml_dashboard.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Machine Learning Dashboard Workshop

In this workshop we will create an interactive Streamlit dashboard for exploring coffee sales data, training an ML model, and making live predictions.


## What the code does


* Loads coffee sales data from GitHub.

* Displays a dataset overview (first rows, total transactions, unique coffees).

* Visualizes the distribution of a feature chosen from a dropdown.

* Trains a Random Forest model to predict `coffee_name`.

* Shows model performance (accuracy, confusion matrix, classification report).

* Provides real-time prediction: the user enters purchase details, and the model predicts the coffee type.
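
To make the performance metrics listed above concrete, here is a minimal sketch that computes accuracy and a confusion matrix by hand with pandas. The label values are made-up toy data, not rows from the coffee dataset, and the dashboard itself uses scikit-learn for this step.

```python
import pandas as pd

# Toy labels (hypothetical, for illustration only)
y_true = pd.Series(["Latte", "Latte", "Espresso", "Cocoa", "Espresso", "Latte"], name="actual")
y_pred = pd.Series(["Latte", "Espresso", "Espresso", "Cocoa", "Espresso", "Latte"], name="predicted")

# Accuracy: the fraction of predictions that match the true label
accuracy = (y_true == y_pred).mean()

# Confusion matrix: rows are actual classes, columns are predicted classes
confusion = pd.crosstab(y_true, y_pred)

print(f"Accuracy: {accuracy:.2f}")
print(confusion)
```

Each off-diagonal cell counts one kind of mistake. In this toy data, one actual Latte was predicted as Espresso.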



In [9]:
# install dependencies (scikit-learn is imported by app.py, so list it explicitly)
!pip install pyngrok streamlit pandas seaborn matplotlib scikit-learn -q

In [10]:
%%writefile app.py
import pandas as pd
import streamlit as st
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

# Cache the download so the CSV is not re-fetched on every Streamlit rerun
@st.cache_data
def load_data():
    return pd.read_csv("https://raw.githubusercontent.com/hannahbanjo/AssociationOfDataScience/main/Coffee_sales.csv")

df = load_data()

st.title("☕ Coffee Sales ML Dashboard")
st.write("Explore coffee purchase behavior and predict outcomes using ML.")

# -------------------
# Data Exploration
# -------------------
st.header("1. Data Overview")
st.write(df.head())

col1, col2 = st.columns(2)
with col1:
    st.metric("Total Transactions", len(df))
with col2:
    st.metric("Unique Coffees", df["coffee_name"].nunique())

st.subheader("Feature Distributions")
feature = st.selectbox("Choose a feature", df.columns)
fig, ax = plt.subplots()
sns.countplot(data=df, x=feature, ax=ax)
ax.tick_params(axis="x", rotation=45)  # rotate labels on this figure's axes
st.pyplot(fig)

# -------------------
# Model Training
# -------------------
st.header("2. Machine Learning Model")

target = st.selectbox("Select target column to predict", ["coffee_name"])
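# Encode the target as integer codes; label_names maps each code back to its name
y, label_names = pd.factorize(df[target])

# One-hot encode the remaining columns to build the feature matrix
X = pd.get_dummies(df.drop(columns=[target]), drop_first=True)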

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = RandomForestClassifier(random_state=42)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

acc = accuracy_score(y_test, y_pred)
st.success(f"Model Accuracy: {acc:.2f}")

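# Confusion matrix with readable labels
st.subheader("Confusion Matrix")
# Pass every class code explicitly so the matrix and report stay aligned with
# label_names even if a rare coffee is missing from the test split
class_codes = list(range(len(label_names)))
cm = confusion_matrix(y_test, y_pred, labels=class_codes)
fig, ax = plt.subplots()
sns.heatmap(cm, annot=True, fmt="d", cmap="Blues", ax=ax,
            xticklabels=label_names, yticklabels=label_names)
ax.set_xlabel("Predicted")
ax.set_ylabel("Actual")
st.pyplot(fig)

# Classification report with coffee names
st.subheader("Classification Report")
st.text(classification_report(y_test, y_pred, labels=class_codes,
                              target_names=label_names, zero_division=0))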

# -------------------
# Real-Time Prediction
# -------------------
st.header("3. Try Real-Time Prediction")

# Collect the purchase details used as model features. The target
# (coffee_name) is what the model predicts, so it is not an input here.
hour_of_day = st.slider("Hour of purchase", 0, 23, 9)
money = st.number_input("Transaction Amount", min_value=1, max_value=50, value=5)
time_of_day = st.selectbox("Time of Day", df["Time_of_Day"].unique())
weekday = st.selectbox("Weekday", df["Weekday"].unique())
month_name = st.selectbox("Month", df["Month_name"].unique())

input_data = pd.DataFrame(
    {
        "hour_of_day": [hour_of_day],
        "money": [money],
        "Time_of_Day": [time_of_day],
        "Weekday": [weekday],
        "Month_name": [month_name],
    }
)

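# One-hot encode the input and align it to the training columns; any
# training column the form does not supply is filled with 0
input_encoded = pd.get_dummies(input_data)
X_input = input_encoded.reindex(columns=X.columns, fill_value=0)

# predict() returns an integer code; map it back to the coffee name
prediction = label_names[model.predict(X_input)[0]]

st.subheader("Prediction Result")
st.write(f"🎯 The model predicts: **{prediction}**")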


Overwriting app.py
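
The prediction step relies on two encoding details that are easy to miss: a single input row must be aligned to the one-hot columns seen during training, and the model's integer output must be mapped back to a label. The sketch below shows both on a tiny made-up frame. The rows are hypothetical and only the column names mirror the dataset.

```python
import pandas as pd

# Toy training data (hypothetical rows)
train = pd.DataFrame({
    "hour_of_day": [8, 13, 19, 9],
    "Weekday": ["Mon", "Tue", "Mon", "Wed"],
    "coffee_name": ["Latte", "Espresso", "Latte", "Cocoa"],
})

# factorize returns integer codes plus the unique labels they index into
codes, label_names = pd.factorize(train["coffee_name"])

# One-hot encode the features; drop_first removes the first category ("Mon")
X = pd.get_dummies(train.drop(columns=["coffee_name"]), drop_first=True)

# A single new row only produces dummy columns for the values it contains
new_row = pd.DataFrame({"hour_of_day": [10], "Weekday": ["Tue"]})
encoded = pd.get_dummies(new_row)

# reindex adds the missing training columns (filled with 0) and drops extras
aligned = encoded.reindex(columns=X.columns, fill_value=0)

print(list(aligned.columns))
print(label_names[codes[1]])  # decode an integer code back to its label
```

A row with `Weekday` set to `"Mon"` would end up with every `Weekday_*` column at 0, because `"Mon"` is the dropped baseline category.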


## Run Streamlit

In [11]:
!streamlit run app.py --server.port=8501 --server.address=0.0.0.0 &>/dev/null &
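
Because the command above sends all output to `/dev/null` and runs in the background, a startup failure would go unnoticed. One way to check that the server is actually listening is to poll the port. This is a minimal sketch using only the standard library, and `wait_for_port` is a hypothetical helper name, not part of Streamlit or pyngrok.

```python
import socket
import time

def wait_for_port(port, host="127.0.0.1", timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Nothing is listening yet; wait briefly and retry
            time.sleep(interval)
    return False
```

Calling `wait_for_port(8501)` before opening the tunnel returns `True` once Streamlit accepts connections. If it returns `False`, rerunning the Streamlit command without the `&>/dev/null` redirection should show the error.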

## Start ngrok Tunnel

**Getting ngrok token**
1. Go to https://ngrok.com/ and create an account
2. Once you have your account, go to the menu on the left side of the screen
3. Find the "Getting Started" section, and under that click "Your Authtoken"
4. Copy the token and paste it into the cell below, replacing "PASTE_YOUR_AUTHTOKEN"

In [12]:
from pyngrok import ngrok

# Authenticate with ngrok so we can open a public tunnel to the dashboard
ngrok.set_auth_token("PASTE_YOUR_AUTHTOKEN")

# Forward a public URL to the local Streamlit server on port 8501
public_url = ngrok.connect(8501)
print("✅ Streamlit app is live at:", public_url)

✅ Streamlit app is live at: NgrokTunnel: "https://d1a6a240ae32.ngrok-free.app" -> "http://localhost:8501"
