### Dishes-per-Hour Prediction Notebook
### Strategy summary (high-level):
### - Parse raw order rows and expand items (dish name + quantity)
### - Build an hourly aggregation table where each row = one hour (timestamp floored to hour)
### and features capture contextual signals (hour-of-day, day-of-week, restaurant-level aggregates,
### weather placeholders, lag features, rolling averages).
### - Targets: a multi-output vector of counts for the top-K most frequent dishes in the dataset.
### (We predict counts per dish for each hour.)
### - Modeling: baseline (historical average), multi-output RandomForestRegressor, and a simple Poisson/GLM
### alternative (if desired).
### - Evaluation: per-dish RMSE, aggregated RMSE, top-K dish recall (did we predict the dishes that actually
### appeared), and hit-rate for quantity thresholds.
###
### Notes / design decisions (defaults):
### - We'll focus on top_k dishes (configurable) to keep the target vector size manageable. You can increase
### top_k later if you have sufficient data and compute.
### - Time-based split will be used for validation (train on earlier hours, test on later hours).
### - Output will include helper functions so you can later switch models easily.

## IMPORTS

In [None]:
import math
import os
from datetime import timedelta

import numpy as np
import pandas as pd

from sklearn.ensemble import RandomForestRegressor
from sklearn.multioutput import MultiOutputRegressor
from sklearn.model_selection import TimeSeriesSplit, GridSearchCV
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import StandardScaler
import joblib

## CONFIGURATION

In [None]:
DATA_PATH = "../data/data.csv"
TOP_K = 50 # number of top dishes to predict (tune as needed)
MIN_HOURS_REQUIRED = 24 # minimal hours required to include in aggregated table
RANDOM_STATE = 42
MODEL_OUTPUT_DIR = "models"
os.makedirs(MODEL_OUTPUT_DIR, exist_ok=True)

## Utility functions: parse "Items in order" into (name, qty) pairs and expand

In [6]:
import re

def parse_order_items(order_str):
	"""Parse a string like '2 x Pizza, 1 x Coke' into [("Pizza", 2), ("Coke", 1)].

	Returns an empty list for missing input or when nothing matches.
	"""
	if pd.isna(order_str):
		return []
	# common patterns: '2 x Pizza', '2x Pizza', '1 x Chicken Burger (no onion)'
	parts = re.findall(r"(\d+)\s*[xX]\s*([^,]+)", order_str)
	out = []
	for qty, name in parts:
		# qty always matches \d+, so int() cannot fail here
		out.append((name.strip(), int(qty)))
	return out

### quick sanity test

In [7]:
assert parse_order_items('2 x Pizza, 1 x Coke') == [('Pizza', 2), ('Coke', 1)]
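A few more inputs are worth checking before relying on the parser. The sketch below restates the parser in a self-contained form (same regular expression, under the hypothetical name `parse_items`) and exercises the patterns listed in its comments. The sample strings are made up for illustration.

```python
import re

import pandas as pd

ITEM_PATTERN = re.compile(r"(\d+)\s*[xX]\s*([^,]+)")

def parse_items(order_str):
    """Return [(dish, qty), ...]; empty list for missing input."""
    if pd.isna(order_str):
        return []
    return [(name.strip(), int(qty)) for qty, name in ITEM_PATTERN.findall(order_str)]

print(parse_items("2x Pizza"))                                 # no spaces around 'x'
print(parse_items("1 x Chicken Burger (no onion), 3 X Coke"))  # modifiers, capital X
print(parse_items(float("nan")))                               # missing value
print(parse_items("Pizza"))                                    # no quantity prefix
print(parse_items("1 x Fish, Chips"))                          # comma inside a dish name
```

The last call shows a limitation of the pattern: a dish name that itself contains a comma is cut at the comma.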

## Load data and basic datetime parsing


In [8]:
df = pd.read_csv(DATA_PATH)

# Show minimal required columns and parse datetime (adjust format as required)
if 'Order Placed At' in df.columns:
	try:
		df['order_datetime'] = pd.to_datetime(df['Order Placed At'], format="%I:%M %p, %B %d %Y")
	except Exception:
		# fallback to general parsing
		df['order_datetime'] = pd.to_datetime(df['Order Placed At'], errors='coerce')
else:
	# attempt generic column discovery
	# parentheses matter: `and` binds tighter than `or`
	possible_cols = [c for c in df.columns if 'order' in c.lower() and ('date' in c.lower() or 'placed' in c.lower())]
	if possible_cols:
		df['order_datetime'] = pd.to_datetime(df[possible_cols[0]], errors='coerce')
	else:
		raise ValueError('Cannot find "Order Placed At" or similar column. Please provide a datetime column.')

# floor to hour for aggregation
df['order_hour'] = df['order_datetime'].dt.floor('h')
print('Loaded rows:', len(df))

Loaded rows: 21321


In [9]:
# * One-hot encode restaurant names for per-restaurant modeling
restaurant_dummies = pd.get_dummies(df['Restaurant name'], prefix='rest')
df = pd.concat([df, restaurant_dummies], axis=1)
REST_NAMES = restaurant_dummies.columns.tolist()  # * keep track of encoded columns
print("Restaurant dummy columns added:", REST_NAMES[:5], "...")


Restaurant dummy columns added: ['rest_Aura Pizzas', 'rest_Dilli Burger Adda', 'rest_Masala Junction', 'rest_Swaad', 'rest_Tandoori Junction'] ...


## Expand items and build per-row item lists (dish name and quantity)

In [10]:
def expand_items_row(order_str):
	parsed = parse_order_items(order_str)
	# return a dict {dish: qty}
	d = {}
	for name, q in parsed:
		d[name] = d.get(name, 0) + q
	return d

# create expanded_items column
if 'Items in order' not in df.columns:
	raise ValueError('Column "Items in order" not found in data. Please rename or provide equivalent column.')

# apply
df['expanded_items'] = df['Items in order'].fillna('').apply(expand_items_row)

# quick check
print(df[['Items in order', 'expanded_items']].head())

                                      Items in order  \
0  1 x Grilled Chicken Jamaican Tender, 1 x Grill...   
1  1 x Peri Peri Fries, 1 x Fried Chicken Angara ...   
2              1 x Bone in Peri Peri Grilled Chicken   
3  1 x Fried Chicken Ghostbuster Tender, 1 x Anga...   
4  1 x Peri Peri Krispers, 1 x Fried Chicken Anga...   

                                      expanded_items  
0  {'Grilled Chicken Jamaican Tender': 1, 'Grille...  
1  {'Peri Peri Fries': 1, 'Fried Chicken Angara T...  
2           {'Bone in Peri Peri Grilled Chicken': 1}  
3  {'Fried Chicken Ghostbuster Tender': 1, 'Angar...  
4  {'Peri Peri Krispers': 1, 'Fried Chicken Angar...  


## Build hourly aggregation table: features + target counts for top-K dishes

In [11]:
# ----------------------------
# Identify top K dishes globally
# ----------------------------
from collections import Counter

all_items = Counter()
for d in df['expanded_items']:
    all_items.update(d)

TOP_K = min(TOP_K, len(all_items))
TOP_DISHES = [name for name, _ in all_items.most_common(TOP_K)]
print(f"Top {TOP_K} dishes sample:", TOP_DISHES[:10])

# ----------------------------
# Create hourly aggregation table
# ----------------------------
hour_index = pd.date_range(start=df['order_hour'].min().floor('D'),
                           end=df['order_hour'].max().ceil('D'), freq='h')
agg = pd.DataFrame(index=hour_index)
agg.index.name = 'order_hour'

# Time features
agg['hour_of_day'] = agg.index.hour
agg['day_of_week'] = agg.index.dayofweek
agg['is_weekend'] = agg['day_of_week'].isin([5,6]).astype(int)

# Initialize per-dish and per-restaurant columns
for dish in TOP_DISHES:
    agg[f'dish__{dish}'] = 0
agg['total_orders'] = 0

for rest_col in REST_NAMES:
    agg[rest_col] = 0

# ----------------------------
# Aggregate counts per hour
# ----------------------------
for hour, group in df.groupby('order_hour'):
    if hour in agg.index:
        # Total orders
        agg.loc[hour, 'total_orders'] = len(group)
        
        # Per-restaurant orders
        for rest_col in REST_NAMES:
            agg.loc[hour, rest_col] = group[rest_col].sum()
        
        # Per-dish counts
        hour_counts = Counter()
        for d in group['expanded_items']:
            hour_counts.update(d)
        for dish in TOP_DISHES:
            agg.loc[hour, f'dish__{dish}'] = hour_counts.get(dish, 0)

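# Drop hours with no orders anywhere in the trailing 24-hour window (e.g. long closures).
# Note: this leaves gaps in the index, and the positional shift()/rolling() calls below span those gaps.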
valid_hours = agg['total_orders'].rolling(window=24, min_periods=1).sum() >= 1
agg = agg[valid_hours]

print('Aggregated hours:', agg.shape[0])


Top 50 dishes sample: ['Bageecha Pizza', 'Chilli Cheese Garlic Bread', 'Bone in Jamaican Grilled Chicken', 'All About Chicken Pizza', 'Makhani Paneer Pizza', 'Margherita Pizza', 'Cheesy Garlic Bread', 'Jamaican Chicken Melt', 'Herbed Potato', 'Tripple Cheese Pizza']
Aggregated hours: 3673


## Feature engineering: lag features, rolling averages, restaurant-level signals


In [12]:
# LAGS = [1, 2, 24] # 1 hour, 2 hours, 24 hours
# WINDOWS = [3, 6, 24]

LAGS = [1, 2, 3, 6, 12, 24]
WINDOWS = [3, 6, 12, 24]


# 1) Lags
lag_dfs = []
for lag in LAGS:
    tmp = pd.DataFrame({
        'total_orders_lag_' + str(lag): agg['total_orders'].shift(lag)
    }, index=agg.index)
    for dish in TOP_DISHES:
        tmp[f'dish__{dish}_lag_{lag}'] = agg[f'dish__{dish}'].shift(lag)
    lag_dfs.append(tmp)

agg = pd.concat([agg] + lag_dfs, axis=1)

# 2) Rolling means (note: each window ends at, and therefore includes, the current hour)
roll_dfs = []
for w in WINDOWS:
    tmp = pd.DataFrame({
        'total_orders_rollmean_' + str(w): agg['total_orders'].rolling(window=w, min_periods=1).mean()
    }, index=agg.index)
    for dish in TOP_DISHES:
        tmp[f'dish__{dish}_rollmean_{w}'] = agg[f'dish__{dish}'].rolling(window=w, min_periods=1).mean()
    roll_dfs.append(tmp)

agg = pd.concat([agg] + roll_dfs, axis=1)
print('After lag/rolling, rows:', agg.shape[0])

After lag/rolling, rows: 3673
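
One caveat about the rolling means above: `rolling(...)` applied directly to a column includes the current hour, so a feature such as `dish__X_rollmean_3` is partly computed from the very count the model is asked to predict. A common remedy, sketched here on a toy series rather than on `agg`, is to shift by one hour before rolling so that each window ends at the previous hour. The series values and the helper name `past_rolling_mean` are illustrative assumptions.

```python
import pandas as pd

def past_rolling_mean(series, window):
    """Rolling mean over the `window` hours strictly before each row."""
    return series.shift(1).rolling(window=window, min_periods=1).mean()

idx = pd.date_range("2025-01-01", periods=6, freq="h")
counts = pd.Series([0, 2, 4, 0, 6, 1], index=idx, name="dish__Example")

features = pd.DataFrame({
    "count": counts,
    # window includes the current hour, i.e. the target itself
    "rollmean_3_leaky": counts.rolling(window=3, min_periods=1).mean(),
    # window covers only earlier hours
    "rollmean_3_past": past_rolling_mean(counts, 3),
})
print(features)
```

The first row of `rollmean_3_past` is `NaN` because no earlier hour exists. Applying the same idea to the notebook would change the reported scores, so the numbers printed here would need to be regenerated.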


## Prepare features (X) and targets (Y)


In [13]:
# Targets: matrix of shape (n_hours, TOP_K) with counts per dish

target_cols = [f'dish__{d}' for d in TOP_DISHES]
feature_cols = [c for c in agg.columns if c not in target_cols]  # exclude targets
# The REST_NAMES columns are already part of agg, so they need no separate append.
# Keeping agg's column order (instead of going through set()) makes X reproducible across runs.

X = agg[feature_cols].copy()
Y = agg[target_cols].copy()

print('X shape:', X.shape)
print('Y shape:', Y.shape)

X shape: (3673, 520)
Y shape: (3673, 50)


## Train/test split (time-based)


In [14]:
train_frac = 0.8
train_size = int(len(X) * train_frac)

X_train = X.iloc[:train_size]
X_test = X.iloc[train_size:]
Y_train = Y.iloc[:train_size]
Y_test = Y.iloc[train_size:]

print('train hours:', X_train.shape[0], 'test hours:', X_test.shape[0])

train hours: 2938 test hours: 735
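
The single 80/20 cut above gives one estimate of out-of-time error. `TimeSeriesSplit`, which is already imported at the top of the notebook, offers a rolling-origin alternative in which every fold trains on earlier rows and validates on later ones. A minimal sketch on a placeholder array follows; the 20 rows and 3 splits are arbitrary choices, not values used elsewhere in this notebook.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X_demo = np.arange(20).reshape(-1, 1)  # stand-in for time-ordered hourly rows

tscv = TimeSeriesSplit(n_splits=3)
folds = list(tscv.split(X_demo))
for fold, (train_idx, val_idx) in enumerate(folds):
    # training rows always precede validation rows
    print(f"fold {fold}: train rows 0..{train_idx[-1]}, "
          f"validate rows {val_idx[0]}..{val_idx[-1]}")
```

Averaging a metric over such folds is less sensitive to one unusual test period than a single split.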


## Baseline model: historical average per hour-of-day (predict mean count for that hour)


In [15]:
baseline_preds = []
for idx, row in X_test.iterrows():
    hour = row['hour_of_day']
    # mean across training hours with same hour_of_day
    mask = X_train['hour_of_day'] == hour
    if mask.sum() == 0:
        baseline_preds.append(Y_train.mean().values)
    else:
        baseline_preds.append(Y_train[mask].mean().values)

baseline_preds = np.vstack(baseline_preds)

# ensure same shape as Y_test
assert baseline_preds.shape == Y_test.shape

baseline_rmse = np.sqrt(mean_squared_error(Y_test.values, baseline_preds))
print('Baseline RMSE (multi-output aggregated):', baseline_rmse)


Baseline RMSE (multi-output aggregated): 0.5180696960354211


## Model: MultiOutput RandomForest

In [16]:
rf = RandomForestRegressor(n_estimators=200, max_depth=10, random_state=RANDOM_STATE, n_jobs=-1)
model = MultiOutputRegressor(rf)

print('Training RandomForest multi-output...')
model.fit(X_train, Y_train.values)

# predict
preds = model.predict(X_test)
rf_rmse = np.sqrt(mean_squared_error(Y_test.values, preds))
print('RandomForest RMSE:', rf_rmse)

# save model and metadata
joblib.dump(
    {'model': model, 'top_dishes': TOP_DISHES, 'feature_cols': feature_cols},
    os.path.join(MODEL_OUTPUT_DIR, 'rf_multioutput_v1.joblib'),
)

Training RandomForest multi-output...
RandomForest RMSE: 0.22654309440345732


['models/rf_multioutput_v1.joblib']

## Evaluation helpers: per-dish RMSE and top-K dish recall


In [17]:

def per_dish_rmse(y_true, y_pred, dish_names):
	res = {}
	for i, dish in enumerate(dish_names):
		res[dish] = math.sqrt(mean_squared_error(y_true[:, i], y_pred[:, i]))
	return res

per_dish = per_dish_rmse(Y_test.values, preds, TOP_DISHES)
# show top 10 worst and best
sorted_items = sorted(per_dish.items(), key=lambda x: x[1], reverse=True)
print('Top 10 worst RMSE dishes:')
for k,v in sorted_items[:10]:
	print(k, round(v,3))

# Top-k recall: for each hour, compare top-k predicted dishes vs actual top-k
K = 5
hits = 0
for i in range(len(preds)):
	pred_topk = np.argsort(-preds[i])[:K]
	true_topk = np.argsort(-Y_test.values[i])[:K]
	# count overlap
	overlap = len(set(pred_topk).intersection(set(true_topk)))
	hits += overlap / K
print(f'Average top-{K} overlap (fraction):', hits / len(preds))

Top 10 worst RMSE dishes:
Tripple Cheese Pizza 0.588
Bageecha Pizza 0.575
Murgh Amritsari Seekh Pizza 0.462
All About Chicken Pizza 0.329
Margherita Pizza 0.329
Jamaican Chicken Melt 0.319
Cheesy Garlic Bread 0.309
Tipsy Tiger Fresh Lime Soda 0.299
Makhani Paneer Pizza 0.298
Bone in Jamaican Grilled Chicken 0.292
Average top-5 overlap (fraction): 0.7428571428571453


## Quick inference function: given a datetime (hour) and optional context row, predict dish counts


In [18]:
# ----------------------------
# 1) Diagnose feature mismatch
# ----------------------------
print("len(feature_cols) (model features):", len(feature_cols))
print("len(agg.columns):", len(agg.columns))

missing_in_agg = list(set(feature_cols) - set(agg.columns))
extra_in_agg = list(set(agg.columns) - set(feature_cols))

print(f"\nFeatures expected by model but NOT present in agg (count {len(missing_in_agg)}):\n", missing_in_agg[:20])
print(f"\nColumns present in agg but not expected by model (count {len(extra_in_agg)}):\n", extra_in_agg[:20])

# ----------------------------
# 2) Quick sanity prediction test
# ----------------------------
print("\n--- Quick model test using a real X_test row ---")
try:
    xt_real = X_test.iloc[0:1]   # one real test row (DataFrame with correct columns)
    print("X_test row shape:", xt_real.shape)
    pred_real = model.predict(xt_real)
    print("Prediction (first 10 dims):", np.round(pred_real[0][:10], 3))
except Exception as e:
    print("Error predicting on X_test row:", e)

# If the above gives non-zero predictions, model is fine and the issue is with the synthetic row building.

# ----------------------------
# 3) Robust inference function
# ----------------------------
# We'll use X_train statistics to fill missing features sensibly.
X_train_means = X_train.mean()

def build_inference_row_from_agg(dt_hour, context_row=None):
    """
    Build a single-row DataFrame with all feature_cols,
    filling missing numeric features with X_train mean or 0.
    """
    base = pd.Series(0, index=feature_cols, dtype=float)

    # Add time features
    base['hour_of_day'] = dt_hour.hour
    base['day_of_week'] = dt_hour.dayofweek
    base['is_weekend'] = int(dt_hour.dayofweek in [5,6])
    if 'month' in base.index:
        base['month'] = dt_hour.month
    if 'day' in base.index:
        base['day'] = dt_hour.day

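    # Fill the remaining features from the X_train mean. Time features are already set above,
    # and a legitimate 0 (midnight, Monday, weekday) must not be overwritten with a mean.
    time_cols = {'hour_of_day', 'day_of_week', 'is_weekend', 'month', 'day'}
    for col in feature_cols:
        if col not in time_cols and col in X_train_means.index:
            base[col] = X_train_means[col]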

    # Apply restaurant context
    if context_row:
        for k,v in context_row.items():
            if k in base.index:
                base[k] = v

    row_df = pd.DataFrame([base])
    return row_df
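

def predict_for_hour_restaurant(dt_hour, restaurant_name=None, top_n=20, round_to_int=True):
    dt_hour = pd.to_datetime(dt_hour).floor('h')

    if restaurant_name:
        # REST_NAMES holds dummy column names such as 'rest_Swaad', so accept
        # either the raw name ('Swaad') or the prefixed column name.
        wanted = {restaurant_name.lower(), f'rest_{restaurant_name}'.lower()}
        context_row = {r: int(r.lower() in wanted) for r in REST_NAMES}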
        x_row = build_inference_row_from_agg(dt_hour, context_row=context_row)
        x_row = x_row.reindex(columns=feature_cols)
        pred = model.predict(x_row).reshape(-1)
        if round_to_int:
            pred = np.round(pred).astype(int)
        return pd.DataFrame({'dish': Y_train.columns, 'predicted_qty': pred}).sort_values('predicted_qty', ascending=False).head(top_n)
    
    else:
        # Predict per restaurant
        results = []
        for rest in REST_NAMES:
            context_row = {r: 1 if r == rest else 0 for r in REST_NAMES}
            x_row = build_inference_row_from_agg(dt_hour, context_row=context_row)
            x_row = x_row.reindex(columns=feature_cols)
            pred = model.predict(x_row).reshape(-1)
            if round_to_int:
                pred = np.round(pred).astype(int)
            df = pd.DataFrame({
                'restaurant': [rest]*len(Y_train.columns),
                'dish': Y_train.columns,
                'predicted_qty': pred
            })
            results.append(df)
        return pd.concat(results).sort_values(['restaurant','predicted_qty'], ascending=[True,False]).reset_index(drop=True)




try:
    real_pred_df = pd.DataFrame({'dish': Y_train.columns, 'predicted_qty': np.round(model.predict(X_test.iloc[0:1])[0]).astype(int)})
    print(real_pred_df.head(10))
except Exception as e:
    print("Error on real X_test prediction:", e)

print("\nPredict using predict_for_hour_restaurant for the next hour (with optional context):")
future_time = agg.index[-1] + pd.Timedelta(hours=1)

len(feature_cols) (model features): 520
len(agg.columns): 570

Features expected by model but NOT present in agg (count 0):
 []

Columns present in agg but not expected by model (count 50):
 ['dish__Fried Chicken Classic Tender', 'dish__Tripple Cheese Pizza', 'dish__Mushroom Pizza', 'dish__Cheesy Garlic Bread', 'dish__Murgh Amritsari Seekh Pide', 'dish__Salted Fries', 'dish__Pepperoni Garlic Bread', 'dish__Just Pepperoni Pizza', 'dish__Peri Peri Chicken Melt', 'dish__Fried Chicken Angara Tender', 'dish__Bageecha Pizza', 'dish__Peri Peri Crisper Fries', 'dish__Grilled Chicken Peri Peri Tender', 'dish__Chicken Pepperoni Pizza', 'dish__Mutton Seekh Pizza', 'dish__Spinach Sumac Pide', 'dish__Bellpepper Onion Pizza', 'dish__Bone in Smoky Bbq Grilled Chicken', 'dish__Just Pepperoni Pide', 'dish__Fried Chicken Strips']

--- Quick model test using a real X_test row ---
X_test row shape: (1, 520)
Prediction (first 10 dims): [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
                                     di

## MultiOutput Gradient Boosting (XGBoost)

In [19]:
from xgboost import XGBRegressor

xgb_model = MultiOutputRegressor(
    XGBRegressor(
        n_estimators=200,
        max_depth=6,
        learning_rate=0.1,
        random_state=42,
        n_jobs=-1,
        objective='reg:squarederror'
    )
)

print('Training XGBoost multi-output...')
xgb_model.fit(X_train, Y_train.values)

xgb_preds = xgb_model.predict(X_test)
xgb_rmse = np.sqrt(mean_squared_error(Y_test.values, xgb_preds))
print('XGBoost RMSE:', xgb_rmse)


Training XGBoost multi-output...
XGBoost RMSE: 0.19361292466225366


## LIGHTGBM

In [20]:
import lightgbm as lgb

lgb_model = MultiOutputRegressor(
    lgb.LGBMRegressor(
        n_estimators=200,
        max_depth=6,
        learning_rate=0.1,
        random_state=RANDOM_STATE,
        n_jobs=-1
    )
)

print('Training LightGBM multi-output...')
lgb_model.fit(X_train, Y_train.values)

lgb_preds = lgb_model.predict(X_test)
lgb_rmse = np.sqrt(mean_squared_error(Y_test.values, lgb_preds))
print('LightGBM RMSE:', lgb_rmse)


Training LightGBM multi-output...
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.013745 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 7216
[LightGBM] [Info] Number of data points in the train set: 2938, number of used features: 518
[LightGBM] [Info] Start training from score 0.981961

In [21]:
comparison = pd.DataFrame({
    'Model': ['Baseline', 'RandomForest', 'XGBoost', 'LightGBM'],
    'RMSE': [baseline_rmse, rf_rmse, xgb_rmse, lgb_rmse]
})
print(comparison)


          Model      RMSE
0      Baseline  0.518070
1  RandomForest  0.226543
2       XGBoost  0.193613
3      LightGBM  0.156673


In [22]:
xgb_per_dish = per_dish_rmse(Y_test.values, xgb_preds, TOP_DISHES)
lgb_per_dish = per_dish_rmse(Y_test.values, lgb_preds, TOP_DISHES)

# quick comparison top 5 worst dishes for each
print('Top 5 worst RMSE dishes - XGBoost:')
sorted_xgb = sorted(xgb_per_dish.items(), key=lambda x: x[1], reverse=True)
for k,v in sorted_xgb[:5]:
    print(k, round(v,3))

print('\nTop 5 worst RMSE dishes - LightGBM:')
sorted_lgb = sorted(lgb_per_dish.items(), key=lambda x: x[1], reverse=True)
for k,v in sorted_lgb[:5]:
    print(k, round(v,3))


Top 5 worst RMSE dishes - XGBoost:
Tripple Cheese Pizza 0.514
Bageecha Pizza 0.428
Murgh Amritsari Seekh Pizza 0.417
Margherita Pizza 0.313
Jamaican Chicken Melt 0.31

Top 5 worst RMSE dishes - LightGBM:
Bageecha Pizza 0.43
Tripple Cheese Pizza 0.361
Murgh Amritsari Seekh Pizza 0.303
Tipsy Tiger Fresh Lime Soda 0.287
Jamaican Chicken Melt 0.241


In [23]:
# ----------------------------
# Unified model diagnostics and inference (simplified)
# ----------------------------
def model_inference_diagnostics(model, X_train, X_test, Y_train, Y_test, agg, feature_cols, top_n=20, topk_recall=5):
    import math
    import pandas as pd
    import numpy as np
    from sklearn.metrics import mean_squared_error

    print("\n=============================")
    print(f"Diagnostics for model: {type(model).__name__}")
    print("=============================\n")

    # Quick test prediction
    print("--- Quick model test using a real X_test row ---")
    try:
        xt_real = X_test.iloc[0:1]
        pred_real = model.predict(xt_real).reshape(-1)
        print("X_test row shape:", xt_real.shape)
        print("Prediction (first 10 dims):", np.round(pred_real[:10], 3))
    except Exception as e:
        print("Error predicting on X_test row:", e)

    # Robust inference row builder
    X_train_means = X_train.mean()
    def build_inference_row(dt_hour, context_row=None):
        dt_hour = pd.to_datetime(dt_hour).floor('h')
        if dt_hour in agg.index:
            base = agg.loc[dt_hour, [c for c in feature_cols if c in agg.columns]].copy()
        else:
            base = agg.iloc[-1][[c for c in feature_cols if c in agg.columns]].copy()
            base['hour_of_day'] = dt_hour.hour
            base['day_of_week'] = dt_hour.dayofweek
            base['is_weekend'] = int(dt_hour.dayofweek in [5,6])
        if context_row:
            for k, v in context_row.items():
                base[k] = v
        row = {col: base[col] if col in base.index else X_train_means.get(col, 0) for col in feature_cols}
        return pd.DataFrame([row]).astype(float)

    def predict_for_hour(dt_hour, context_row=None, top_n=top_n):
        x_row = build_inference_row(dt_hour, context_row)
        x_row = x_row.reindex(columns=feature_cols)
        pred = model.predict(x_row).reshape(-1)
        pred = np.round(pred).astype(int)
        return pd.DataFrame({'dish': Y_train.columns, 'predicted_qty': pred}) \
                 .sort_values('predicted_qty', ascending=False).reset_index(drop=True).head(top_n)

    # Per-dish RMSE
    preds_test = model.predict(X_test)
    per_dish_rmse = {dish: np.sqrt(mean_squared_error(Y_test.values[:,i], preds_test[:,i])) 
                 for i, dish in enumerate(Y_train.columns)}

    sorted_rmse = sorted(per_dish_rmse.items(), key=lambda x: x[1], reverse=True)
    print("\nTop 10 worst RMSE dishes:")
    for dish, rmse in sorted_rmse[:10]:
        print(dish, round(rmse,3))

    # Top-K overlap
    hits = 0
    for i in range(len(preds_test)):
        pred_topk = np.argsort(-preds_test[i])[:topk_recall]
        true_topk = np.argsort(-Y_test.values[i])[:topk_recall]
        hits += len(set(pred_topk).intersection(set(true_topk))) / topk_recall
    print(f"\nAverage top-{topk_recall} overlap (fraction):", hits / len(preds_test))

    # Quick inference for next hour
    future_time = agg.index[-1] + pd.Timedelta(hours=1)
    print("\nPredict for next hour (optional context):")
    print(predict_for_hour(future_time, context_row=None, top_n=top_n))

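# * The restaurant is passed through the context row.
# With restaurant_name=None this predicts for every restaurant;
# pass a restaurant column name such as 'rest_Swaad' to restrict the forecast to one.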
res = predict_for_hour_restaurant(future_time, restaurant_name=None, top_n=20)
print(res)



                    restaurant                                 dish  \
0             rest_Aura Pizzas               dish__Margherita Pizza   
1             rest_Aura Pizzas        dish__All About Chicken Pizza   
2             rest_Aura Pizzas           dish__Makhani Paneer Pizza   
3             rest_Aura Pizzas            dish__Cheesy Garlic Bread   
4             rest_Aura Pizzas          dish__Jamaican Chicken Melt   
..                         ...                                  ...   
295  rest_The Chicken Junction                   dish__Salted Fries   
296  rest_The Chicken Junction             dish__Mutton Seekh Pizza   
297  rest_The Chicken Junction                dish__Grilled Tangdis   
298  rest_The Chicken Junction            dish__Just Pepperoni Pide   
299  rest_The Chicken Junction  dish__Grilled Chicken Angara Tender   

     predicted_qty  
0                2  
1                1  
2                1  
3                1  
4                1  
..             ...  


In [24]:
model_inference_diagnostics(model, X_train, X_test, Y_train, Y_test, agg, feature_cols)       # RandomForest


Diagnostics for model: MultiOutputRegressor

--- Quick model test using a real X_test row ---
X_test row shape: (1, 520)
Prediction (first 10 dims): [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]

Top 10 worst RMSE dishes:
dish__Tripple Cheese Pizza 0.588
dish__Bageecha Pizza 0.575
dish__Murgh Amritsari Seekh Pizza 0.462
dish__All About Chicken Pizza 0.329
dish__Margherita Pizza 0.329
dish__Jamaican Chicken Melt 0.319
dish__Cheesy Garlic Bread 0.309
dish__Tipsy Tiger Fresh Lime Soda 0.299
dish__Makhani Paneer Pizza 0.298
dish__Bone in Jamaican Grilled Chicken 0.292

Average top-5 overlap (fraction): 0.7428571428571453

Predict for next hour (optional context):
                                      dish  predicted_qty
0     dish__Bone in Angara Grilled Chicken              1
1   dish__Bone in Jamaican Grilled Chicken              1
2                dish__Cheesy Garlic Bread              1
3               dish__Tripple Cheese Pizza              1
4        dish__Murgh Amritsari Seekh Pizza             

In [None]:
model_inference_diagnostics(xgb_model, X_train, X_test, Y_train, Y_test, agg, feature_cols)   # XGBoost

In [None]:
model_inference_diagnostics(lgb_model, X_train, X_test, Y_train, Y_test, agg, feature_cols)   # LightGBM

### **1) Quick model test using a real `X_test` row**

```
X_test row shape: (1, 520)
Prediction (first 10 dims): [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
```

* This is a sanity check that the model produces predictions for a real test row.
* `X_test row shape: (1, 520)` confirms the row has 520 features, matching the model’s expected input.
* `Prediction (first 10 dims)` shows the first 10 predicted dish quantities, rounded to three decimals.
* In your first two models, the predictions for this row are all **0**. That alone is not a failure: the row may simply be a quiet hour with no orders, which is common on an hourly grid. It can also mean the model underestimates low-demand items, so compare against the true counts for that row. The last model shows small non-zero predictions (e.g., 0.038), so its raw outputs for such hours are not exactly zero.

---

### **2) Top 10 worst RMSE dishes**

```
dish__Tripple Cheese Pizza 0.58
dish__Bageecha Pizza 0.537
...
```

* RMSE (Root Mean Squared Error) per dish measures **how far the model’s predictions are from the true values**.
* High RMSE → model struggles to predict this dish accurately.
* Example: `dish__Tripple Cheese Pizza 0.58` means the typical prediction error for this dish is about 0.58 orders per hour. RMSE is not a plain average of the errors: squaring weights large misses more heavily.
* This list helps you **identify which dishes need better modeling or more data**.
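
Because the errors are squared before averaging, RMSE is dominated by the occasional large miss. The toy numbers below (invented for illustration) compare it with the mean absolute error on the same predictions.

```python
import numpy as np

y_true = np.array([0, 0, 0, 4])          # a quiet dish with one busy hour
y_pred = np.array([0.0, 0.0, 0.0, 1.0])  # the model misses the spike by 3

errors = y_pred - y_true
rmse = np.sqrt(np.mean(errors ** 2))  # sqrt(9 / 4) = 1.5
mae = np.mean(np.abs(errors))         # 3 / 4 = 0.75
print(f"RMSE={rmse:.2f}, MAE={mae:.2f}")
```

A single missed spike makes the RMSE twice the MAE here, which is why dishes with bursty demand tend to top the "worst RMSE" list.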

---

### **3) Average top-5 overlap (fraction)**

```
Average top-5 overlap (fraction): 0.7646
```

* This measures **how well the model identifies the most-ordered dishes in each hour** (each row is one hour, not one order).
* Calculation:

  1. Take the top 5 predicted dishes (`pred_topk`) per row.
  2. Take the top 5 actual dishes (`true_topk`) per row.
  3. Compute the fraction of overlap.
* `0.7646` → on average, ~76% of the true top-5 dishes also appear in the predicted top-5.
* Higher is better, but read the number with care: in hours with fewer than five ordered dishes, the "true top-5" is padded with zero-count dishes, so ties can inflate the overlap.
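
The overlap score treats zero-count dishes as valid members of the true top-5. A stricter variant, sketched below under the assumption that only dishes with a positive true count should be considered relevant, skips hours with no orders and measures recall against the dishes that were actually ordered. The function name `topk_recall_positive` and the toy arrays are hypothetical.

```python
import numpy as np

def topk_recall_positive(y_true, y_pred, k=5):
    """Mean recall of actually-ordered top-k dishes; hours with no orders are skipped."""
    recalls = []
    for true_row, pred_row in zip(y_true, y_pred):
        ranked_true = np.argsort(-true_row, kind="stable")[:k]
        true_top = [i for i in ranked_true if true_row[i] > 0]  # drop zero-count padding
        if not true_top:
            continue  # nothing was ordered this hour
        pred_top = set(np.argsort(-pred_row, kind="stable")[:k])
        recalls.append(len(pred_top.intersection(true_top)) / len(true_top))
    return float(np.mean(recalls)) if recalls else float("nan")

y_true = np.array([[0, 0, 0, 0], [3, 0, 1, 0], [0, 2, 0, 0]])
y_pred = np.array([[0.0, 0.0, 0.0, 0.0], [2.5, 0.1, 0.0, 0.4], [0.0, 0.3, 0.9, 0.0]])
score = topk_recall_positive(y_true, y_pred, k=2)
print(score)
```

In this toy case the empty first hour is ignored, the second hour recovers one of two ordered dishes, and the third recovers its only ordered dish.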

---

### **4) Predict for next hour**

```
dish__Bone in Jamaican Grilled Chicken              2
dish__Bone in Angara Grilled Chicken              1
...
```

* This shows a **practical inference example**: predicted quantities for the next hour.
* Helps answer the question: “How many orders of each dish should we expect in the next hour?”
* Values are rounded integers because you can’t have fractional dish orders.
* The top few dishes show predicted demand >0; rest are predicted as 0, which is expected for low-demand items.

---

### **Summary**

1. **Quick test predictions**: sanity check; ensures model runs on real data.
2. **Top RMSE dishes**: shows where the model struggles most; useful for debugging and targeted improvement.
3. **Top-K overlap**: shows how well the model predicts the most popular dishes; relevant for operational decision-making.
4. **Next-hour prediction**: practical output; shows predicted dish quantities for immediate planning.

---

💡 **Takeaways from your output**:

* Some models produce zero predictions — may need tuning or more features.
* RMSE is higher for some pizzas and special dishes — these are harder to predict.
* Top-5 overlap is reasonably good (~0.52–0.76) — model captures popular dishes fairly well.
* Next-hour predictions are sparse, mostly zero, but top dishes are identified correctly.

## **What this model does:**

Think of a restaurant with hundreds of dishes. You want to know **how many of each dish will be ordered in the next hour**. This model is like your “smart assistant” that looks at past orders, the time of day, day of week, and trends to **predict hourly demand per dish**.

It doesn’t just guess randomly — it learns patterns, like:

* Customers order more pizza at lunch than breakfast.
* Certain days (weekends, holidays) have spikes in certain dishes.
* If a dish sold a lot in the last few hours, it might continue to sell.

---

## **Input to the model:**

The model needs **one row per hour**, containing:

1. **Time features:** hour of day, day of week, weekend/weekday.
2. **Historical (lag) features:** total orders and per-dish counts from 1, 2, 3, 6, 12, and 24 hours earlier.
3. **Rolling averages:** average total orders and dish counts over the last 3, 6, 12, and 24 hours.
4. **Optional context:** restaurant-level signals (the `rest_*` columns above). Special events and weather are placeholders that this notebook does not yet populate.

The idea: give the model everything that could help it “understand” the demand for that hour.

---

## **Output from the model:**

The model produces a **vector of predicted counts** for the top dishes (you can configure how many, e.g., top 50).

Example:

| Dish             | Predicted Orders Next Hour |
| ---------------- | -------------------------- |
| Pizza Margherita | 3                          |
| Chicken Wings    | 1                          |
| Coke             | 2                          |
| Garlic Bread     | 0                          |

* If predicted = 0 → expect very low/no demand for that hour.
* Rounded to integers, because you can’t sell half a pizza!
* You can also sort by predicted quantity to see the most likely dishes first.
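
Regressors trained with a squared-error objective are not constrained to non-negative outputs, so a raw prediction can come out slightly below zero. A minimal post-processing sketch, with made-up dish names and values and a hypothetical helper `to_forecast_table`, clips at zero before rounding and sorting.

```python
import numpy as np
import pandas as pd

def to_forecast_table(dish_names, raw_pred, top_n=None):
    """Clip negatives to 0, round to whole orders, sort by predicted quantity."""
    qty = np.round(np.clip(raw_pred, 0, None)).astype(int)
    table = pd.DataFrame({"dish": list(dish_names), "predicted_qty": qty})
    table = table.sort_values("predicted_qty", ascending=False).reset_index(drop=True)
    return table if top_n is None else table.head(top_n)

raw = np.array([2.7, -0.2, 0.4, 1.2])  # made-up raw model outputs
forecast = to_forecast_table(
    ["Pizza Margherita", "Garlic Bread", "Coke", "Chicken Wings"], raw
)
print(forecast)
```

Keeping the unrounded values alongside the integers can be useful when summing over a day, since many small fractions otherwise round to zero.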

---

## **How the model works (simple version):**

1. **Step 1: Learn from history** – The model looks at past hours, sees patterns in dish sales.
2. **Step 2: Feature engineering** – It creates “helper numbers” like past counts, rolling averages, weekday/weekend flags.
3. **Step 3: Multi-output prediction** – One call returns counts for **many dishes at once**. Under the hood, `MultiOutputRegressor` fits one independent regressor per dish and exposes them through a single `fit`/`predict` interface.
4. **Step 4: Output predictions** – Gives counts for each dish, ready to use in planning or inventory.

---

## **Scenarios where this is useful:**

* **Kitchen prep:** Stock ingredients only for predicted demand, avoid waste.
* **Staff planning:** Know which hours will be busy for certain dishes.
* **Promotions:** Push dishes likely to sell or slow-moving items.
* **Inventory management:** Adjust purchases based on expected demand for top dishes.

---

## **How to use it (layman steps):**

1. Feed the model a **row with current hour info and recent trends**.
2. Model outputs **predicted counts for top dishes**.
3. Use this output to **prepare, stock, or plan** operations.
4. Optionally, keep adding new data to retrain the model so it **learns new trends**.

---

## **Key metrics to know if it works:**

1. **RMSE per dish:** Lower = better predictions for that dish.
2. **Top-K overlap:** Measures if the model correctly predicts the most popular dishes in that hour.
3. **Practical predictions:** Look at next-hour output and see if it makes sense operationally.

---

### **Pitch version (why this is awesome):**

> Imagine you could **look ahead** and have a data-backed estimate of how many pizzas, burgers, or drinks your customers will order next hour. Less guessing, less overstocking, less last-minute chaos. This model turns historical patterns, time signals, and simple stats into **actionable, hour-by-hour dish forecasts**. It’s like having a data-powered manager who never sleeps.

## **1️⃣ Predicting a whole day**

* The model predicts **per hour**, so to forecast a full day you just **loop over each hour** of the day (00:00 → 23:00).
* For each hour, you create a feature row (hour-of-day, day-of-week, lag features, rolling averages, etc.) and feed it to the model.
* The model outputs a **predicted quantity for each dish per hour**, which you can sum or use hourly for planning.

**Example pseudocode:**

```python
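# predict_for_hour_restaurant is defined earlier in this notebook
future_hours = pd.date_range(start='2025-10-28', end='2025-10-28 23:00', freq='h')
predictions = []
for hr in future_hours:
    df_pred = predict_for_hour_restaurant(hr)  # one forecast table per hour
    df_pred['order_hour'] = hr                 # tag rows with their hour
    predictions.append(df_pred)
# one long table with dish forecasts for all 24 hours
day_forecast = pd.concat(predictions, ignore_index=True)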
```

---

## **2️⃣ What if the restaurant was closed yesterday?**

* The model uses **lag and rolling features** to capture trends from previous hours/days.
* If the restaurant was closed yesterday, **lags for yesterday will be 0**.
* Rolling averages over past 24 hours might be lower than normal, which can **bias predictions downward**.

### **Ways to handle closures:**

1. **Impute missing values:**

   * If the restaurant was closed, fill the missing hours with **historical averages for that hour/day**.
   * This prevents the model from thinking “nobody orders anything = always 0.”

2. **Context override:**

   * You can pass a `context_row` to override certain features for tomorrow.
   * Example: set `total_orders_lag_24 = mean_orders_for_that_hour` instead of 0.

3. **Ignore missing hours in rolling averages:**

   * Mark closed hours as missing (`NaN`) rather than 0; a pandas rolling mean then skips them instead of averaging in zeros.
   * This helps maintain realistic predictions despite a closure.
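
Options 1 and 3 can be sketched in a few lines of pandas. The example assumes a boolean mask of closed hours is available (here called `closed`, which is not a column in this notebook) and uses toy counts. Closed hours are first marked as missing; a rolling mean then skips them, and they can also be imputed with the open-hours mean for the same hour-of-day.

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2025-01-01", periods=48, freq="h")
orders = pd.Series(np.tile(np.arange(24), 2), index=idx, dtype=float)  # toy daily pattern
closed = pd.Series(idx >= "2025-01-02", index=idx)  # assumption: day 2 was closed
orders[closed] = 0.0                                # closures show up as zeros

masked = orders.mask(closed)  # closed hours become NaN instead of 0

# Option 3: rolling mean that ignores closed hours (NaN values are skipped)
roll_with_zeros = orders.rolling(window=24, min_periods=1).mean()
roll_ignoring_closures = masked.rolling(window=24, min_periods=1).mean()

# Option 1: impute closed hours with the open-hours mean for that hour-of-day
hourly_mean = masked.groupby(masked.index.hour).transform("mean")
imputed = masked.fillna(hourly_mean)

# Compare at 06:00 on the closed day (row 30)
print(roll_with_zeros.iloc[30], roll_ignoring_closures.iloc[30], imputed.iloc[30])
```

Once a window contains only closed hours, the NaN-skipping rolling mean itself becomes `NaN`, so a fallback such as the imputed series is still needed for long closures.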

---

### ✅ **Bottom line**

* Yes, predicting the whole day is straightforward — just run the hourly prediction for all 24 hours.
* If the restaurant was closed yesterday, **the model might underestimate demand**, but you can fix it by filling missing hours with **historical averages or context overrides**.