# Seasonal Drink Classifier: Coffee Shop PCA + SVM

Can you teach a computer which **season** a drink belongs to using flavor features?

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
import warnings
warnings.filterwarnings('ignore')

plt.rcParams['figure.figsize'] = (10, 5)
sns.set_theme(style="whitegrid")  # set_theme is the current name; sns.set is an older alias


## Dataset (PROVIDED)

In [None]:
np.random.seed(42)

seasons = ["spring", "summer", "fall", "winter"]

def generate_season_drinks(season, n):
    """Generate loosely season-related drink features with noise."""
    base = {
        "spring": dict(spice=3, temperature=6, flavor=6, fruit=7, color=3, foam=4),
        "summer": dict(spice=1, temperature=2, flavor=5, fruit=8, color=2, foam=2),
        "fall":   dict(spice=8, temperature=8, flavor=8, fruit=3, color=8, foam=7),
        "winter": dict(spice=6, temperature=9, flavor=9, fruit=2, color=9, foam=8),
    }
    params = base[season]

    def clip_norm(mean, std, size):
        return np.clip(np.random.normal(mean, std, size), 0, 10)

    return pd.DataFrame({
        "spice":        clip_norm(params["spice"], 1.2, n),
        "temperature":  clip_norm(params["temperature"], 1.2, n),
        "flavor_notes": clip_norm(params["flavor"], 1.2, n),
        "fruitiness":   clip_norm(params["fruit"], 1.2, n),
        "color_tone":   clip_norm(params["color"], 1.2, n),
        "foaminess":    clip_norm(params["foam"], 1.2, n),
        "season":       [season] * n
    })

# Generate 80 drinks total (20 per season)
drinks = pd.concat([
    generate_season_drinks(s, 20) for s in seasons
], ignore_index=True)

print(drinks.head())
print("\nDataset size:", drinks.shape)


In [None]:
print(f"\n✓ Loaded {len(drinks)} drinks across {drinks['season'].nunique()} seasons\n")


### 💕 Example Drinks (Just for Imagination)

Each row is a **synthetic drink** with flavor features:
- `spice` (0–10) — cinnamon / nutmeg / chai vibes
- `temperature` (0–10) — 0 = iced, 10 = very hot
- `flavor_notes` (0–10) — how rich / sweet / deep the flavor is
- `fruitiness` (0–10) — berries, citrus, etc.
- `color_tone` (0–10) — 0 = very light, 10 = very dark
- `foaminess` (0–10) — foam / whipped cream / latte art

Seasons are labels:
- `spring`, `summer`, `fall`, `winter`

You can imagine drinks like:
- 🍂 **Pumpkin Spice Daydream** → high spice, hot, dark, foamy → probably **fall**
- ☀️ **Berry Sunrise Refresher** → iced, fruity, light, low foam → probably **summer**
- ❄️ **Snowfall Vanilla Latte** → very hot, rich, foamy, dark → probably **winter**
- 🌸 **Blossom Matcha Cooler** → medium temp, a bit fruity, pastel → maybe **spring**

> Our model only sees the **numbers**. Your job: help it learn the seasons!

## PART 1: Prepare Data
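
Standardization rescales each feature to mean 0 and standard deviation 1, so that no single feature dominates distance-based methods such as PCA and SVM. The sketch below shows the idea on a tiny made-up array. The numbers and the names `toy`, `toy_scaler`, and `toy_scaled` are hypothetical and are not part of the drinks dataset.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: 4 rows, 2 features on very different scales
toy = np.array([[1.0, 100.0],
                [2.0, 200.0],
                [3.0, 300.0],
                [4.0, 400.0]])

toy_scaler = StandardScaler()
toy_scaled = toy_scaler.fit_transform(toy)

# After scaling, each column has mean ~0 and standard deviation ~1
print("Column means:", toy_scaled.mean(axis=0).round(6))
print("Column stds: ", toy_scaled.std(axis=0).round(6))
```

In this notebook the same two steps are applied to `drinks[features]` in the cell below.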

In [None]:
features = [
    "spice", "temperature", "flavor_notes",
    "fruitiness", "color_tone", "foaminess"
]

# TODO: Extract features and scale them
X = None  # replace with: drinks[features].values

scaler = None  # replace with: StandardScaler()
X_scaled = None  # replace with: scaler.fit_transform(X)

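# Only report success once the TODOs above have been filled in
if X is None or X_scaled is None:
    print("TODO: fill in X, scaler, and X_scaled above")
else:
    print("\u2713 Data scaled\n")
    print(f"Original data shape: {X.shape}")
    print(f"Scaled data shape: {X_scaled.shape}\n")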


## PART 2: PCA — Visualize Seasonal Flavor Space

We have **6 flavor features**. That’s hard to picture!

**PCA (Principal Component Analysis)** helps by creating new axes:
- **PCA 1** (first principal component): the direction along which the drinks vary the most
- **PCA 2** (second principal component): the direction of second-greatest variation, at a right angle to the first

We will:
1. Run PCA on the **scaled data**.
2. Plot each drink on a **2D PCA scatter plot**, colored by season.

> Question to keep in mind: Do certain seasons form **clusters** in this flavor space?
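
The two steps above can be sketched on made-up data before you try them on the drinks. This is a minimal illustration, assuming three hypothetical features where the first two move together. None of the names or numbers come from the drinks dataset.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical toy data: 3 features, where the first two move together
a = rng.normal(size=200)
toy = np.column_stack([
    a,
    a + 0.1 * rng.normal(size=200),   # nearly a copy of feature 1
    rng.normal(size=200),             # independent noise
])

toy_scaled = StandardScaler().fit_transform(toy)
toy_pca = PCA(n_components=2)
toy_2d = toy_pca.fit_transform(toy_scaled)

print("Projected shape:", toy_2d.shape)
print("Explained variance ratio:", toy_pca.explained_variance_ratio_.round(3))

# The two component directions are perpendicular (dot product ~0)
dot = np.dot(toy_pca.components_[0], toy_pca.components_[1])
print("Dot product of components:", dot.round(6))
```

The explained variance ratio tells you what share of the total variation each new axis captures. If the first two values add up to most of the total, the 2D plot is a fair summary of the full feature space.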

In [None]:
y = drinks["season"].values

# TODO: Create PCA object and transform scaled data
pca = None  # replace with: PCA(n_components=2)
X_pca = None  # replace with: pca.fit_transform(X_scaled)

print("PCA shape:", None if X_pca is None else X_pca.shape)
print("Explained variance ratio:", None if pca is None else pca.explained_variance_ratio_)


In [None]:
# TODO: Scatter plot of PCA components colored by season
plt.figure(figsize=(8, 5))

seasons_unique = drinks["season"].unique()
palette = {
    "spring": "#77dd77",  # green
    "summer": "#ffb347",  # orange
    "fall":   "#c23b22",  # red
    "winter": "#779ecb"   # blue
}

for s in seasons_unique:
    mask = (y == s)
    # Replace None with correct slices of X_pca
    plt.scatter(
        None,  # X_pca[mask, 0]
        None,  # X_pca[mask, 1]
        label=s.capitalize(),
        alpha=0.8,
        s=60,
        color=palette[s]  # one color for the whole group
    )

plt.xlabel("PCA Component 1")
plt.ylabel("PCA Component 2")
plt.title("Seasonal Drinks in PCA Flavor Space")
plt.legend()
plt.tight_layout()
plt.show()


**QUESTION A:** Do you see any **groups** of drinks that look like Spring, Summer, Fall, Winter?

**QUESTION B:** If you had to name the axes, what would you call them?
- e.g. "Cozy vs Refreshing", "Spicy & Hot vs Cool & Light"
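
One way to approach Question B is to look at how strongly each original feature contributes to each component, which scikit-learn stores in the rows of `components_`. Below is a minimal sketch on hypothetical toy data. The feature names `sweet`, `hot`, and `fizzy` are made up for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
toy_features = ["sweet", "hot", "fizzy"]  # hypothetical feature names

# "sweet" and "hot" share a common signal; "fizzy" is independent noise
base = rng.normal(size=100)
toy = pd.DataFrame({
    "sweet": base + 0.2 * rng.normal(size=100),
    "hot":   base + 0.2 * rng.normal(size=100),
    "fizzy": rng.normal(size=100),
})

toy_pca = PCA(n_components=2).fit(StandardScaler().fit_transform(toy[toy_features]))

# One row per component, one column per original feature
loadings = pd.DataFrame(
    toy_pca.components_,
    columns=toy_features,
    index=["PC1", "PC2"],
)
print(loadings.round(2))
```

Features with large absolute values in a row drive that axis. The sign only indicates direction, so it can flip between runs or library versions without changing the interpretation.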

## PART 3: Train an SVM Season Classifier

Now we build a model that **predicts the season** from all six scaled flavor features (not from the 2D PCA projection, which we used only for plotting).

We will:
1. Split data into **train** and **test** sets.
2. Train an **SVM classifier** (Support Vector Machine).
3. Evaluate how well it predicts seasons on the test set.

> SVM idea: it finds boundaries that separate classes (seasons) in feature space.
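
The split, fit, predict, evaluate cycle can be sketched on made-up data first. This is a minimal illustration with two hypothetical, well-separated classes `a` and `b`. The data and variable names are assumptions for the example and are not the drinks dataset.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(2)

# Hypothetical toy data: two well-separated blobs in 2D
class_a = rng.normal(loc=[-2, -2], scale=0.5, size=(50, 2))
class_b = rng.normal(loc=[2, 2], scale=0.5, size=(50, 2))
toy_X = np.vstack([class_a, class_b])
toy_y = np.array(["a"] * 50 + ["b"] * 50)

# stratify keeps the class proportions the same in train and test
Xtr, Xte, ytr, yte = train_test_split(
    toy_X, toy_y, test_size=0.3, random_state=0, stratify=toy_y
)

toy_clf = SVC(kernel="rbf", gamma="scale", C=1.0)
toy_clf.fit(Xtr, ytr)            # learn the boundary from training data
toy_pred = toy_clf.predict(Xte)  # predict on held-out data

print(classification_report(yte, toy_pred))
```

The model must be fitted with `fit` before `predict` is called. Calling `predict` on an unfitted `SVC` raises an error.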

In [None]:
# TODO: Train/test split
X_train, X_test, y_train, y_test = train_test_split(
    None,  # X_scaled
    None,  # y
    test_size=0.3,
    random_state=0,
    stratify=None  # y
)

# TODO: Create SVM classifier
svm_clf = None  # SVC(kernel='rbf', gamma='scale', C=1.0)

# TODO: Fit model on the training set, then predict on the test set
# svm_clf.fit(X_train, y_train)
y_pred = None  # svm_clf.predict(X_test)

print("Classification report:\n")
print(None)  # classification_report(y_test, y_pred)


**QUESTION C:** Which seasons does the model do best on? Which are harder?

**QUESTION D:** Why might some seasons be easier to separate based on these features?
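
A confusion matrix can help with Questions C and D by showing which seasons get mistaken for which. Below is a minimal sketch using hand-written, hypothetical labels. They are made up for illustration and are not real model output.

```python
import pandas as pd
from sklearn.metrics import confusion_matrix

labels = ["spring", "summer", "fall", "winter"]

# Hypothetical true and predicted labels, made up for illustration
toy_true = ["spring", "spring", "summer", "summer", "fall", "fall", "winter", "winter"]
toy_pred = ["spring", "summer", "summer", "summer", "fall", "winter", "winter", "fall"]

cm = confusion_matrix(toy_true, toy_pred, labels=labels)

# Label rows and columns so the table is readable
cm_df = pd.DataFrame(
    cm,
    index=[f"true_{name}" for name in labels],
    columns=[f"pred_{name}" for name in labels],
)
print(cm_df)
```

Rows are true seasons and columns are predictions, so off-diagonal cells mark confusions. In the notebook you would pass `y_test` and `y_pred` in place of the toy lists.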

## PART 4: Try Your Own Drinks 👋🍵

Now invent 2–4 of your *own* drinks and see what season the model predicts.

Examples:
- **Caramel Cloud Dream**
- **Iced Strawberry Matcha Glow**
- **Mocha Midnight Storm**

Think about:
- spice, temperature, flavor_notes, fruitiness, color_tone, foaminess
- What season do *you* think it belongs to?
- Does the model agree with you?
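
New drinks should be scaled with the scaler that was already fitted earlier (`transform`), not with a freshly fitted one (`fit_transform`), so that they land on the same scale the model saw during training. Below is a minimal sketch with hypothetical numbers for a single feature.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical values of one feature that the scaler was fitted on
train_vals = np.array([[0.0], [5.0], [10.0]])
toy_scaler = StandardScaler().fit(train_vals)

# One hypothetical new sample
new_vals = np.array([[10.0]])

# Correct: reuse the fitted scaler's mean and standard deviation
reused = toy_scaler.transform(new_vals)
print("transform:    ", reused.ravel())

# Incorrect for new data: refitting on a single sample collapses it to 0
refit = StandardScaler().fit_transform(new_vals)
print("fit_transform:", refit.ravel())
```

Refitting throws away the reference scale, so a very hot drink and an iced one could end up looking identical to the model.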

In [None]:
# TODO: After you have a fitted scaler and a trained svm_clf, try new drinks
# (PCA is not needed here; the classifier was trained on the scaled features)

new_drinks = pd.DataFrame([
    # Example format (change these!):
    {"name": "Caramel Cloud Dream", "spice": 4, "temperature": 8, "flavor_notes": 9,
     "fruitiness": 2, "color_tone": 7, "foaminess": 9},
    {"name": "Iced Strawberry Matcha Glow", "spice": 2, "temperature": 3, "flavor_notes": 6,
     "fruitiness": 9, "color_tone": 3, "foaminess": 4},
])

feature_cols = ["spice", "temperature", "flavor_notes", "fruitiness", "color_tone", "foaminess"]

# TODO: Scale and predict
X_new = None  # scaler.transform(new_drinks[feature_cols])
pred_seasons = None  # svm_clf.predict(X_new)

new_drinks["predicted_season"] = None  # pred_seasons
print(new_drinks[["name", "predicted_season"]])


**FINAL QUESTION:**

Did the model agree with your intuition about each drink’s season?
- If **yes**, why do you think it worked so well?
- If **no**, what flavor features might you change to convince the model?