# Warehouse Operations

This notebook is intentionally structured as a **guided analytics lab**.

Scope  
Warehouse execution only (order flow, labor productivity, congestion risk)

Analytics Flow  
Descriptive → Diagnostic → Predictive → Prescriptive


## Hands-On Table (Follow in Order)

| Step | What | Cell ID | What to do in Colab | Why (Purpose) |
|---|---|---|---|---|
| 0A | Load and inspect dataset | `0A` | Run the load cell and inspect `df.head()` and `df.columns` | Understand execution-level fields |
| 0B | Identify variable roles | `0B` | Fill STATE / FLOW / CONSTRAINT lists | Build correct system mental model |
| 1A | Construct warehouse cycle time | `1A` | Set `CYCLE_START` and `CYCLE_END` to correct timestamps | Measure execution speed (Order→Ship) |
| 1B | Daily KPI baseline | `1B` | Run daily aggregation and read KPI outputs | Establish baseline behavior |
| 1C | Baseline dashboard charts | `1C` | Plot daily cycle time, daily throughput, and the cycle-time distribution | Make warehouse KPIs observable |
| 2A | Construct pick labor-hours + pick rate | `2A` | Set `PICK_START` and `PICK_END`, compute `pick_labor_hrs` and `pick_rate` | Build capacity logic from work time |
| 2B | Compute utilization + WIP proxy | `2B` | Define `arrival_rate` and `capacity_rate`, compute utilization & WIP | Diagnose congestion mechanisms |
| 3A | Predictive: demand shock | `3A` | Set `DEMAND_MULTIPLIER` and recompute utilization | Predict stress response |
| 3B | Diagnostic check: scaling pattern | `3B` | Compare baseline vs shocked utilization patterns | Validate linear scaling under fixed capacity |
| 4A | Prescriptive: target utilization | `4A` | Set `TARGET_UTIL` (e.g., 0.85) | Frame decision target |
| 4B | Prescriptive: required staffing | `4B` | Compute required workers and staffing gap | Translate analytics into action |



## Step 0A — Load and Inspect Dataset

**Cell ID:** `0A`

Run the cell. Then scan the column names and 3–5 sample rows.

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

pd.set_option("display.max_columns", 200)
pd.set_option("display.width", 160)

df = pd.read_csv("https://raw.githubusercontent.com/saikisri97/17_Hof_Lecture_Code_Pingo/refs/heads/main/Supply_Chain_Analytics/data/warehouse_ops_orders_class_v2.csv")


# Parse timestamps (safe even if already parsed)
for c in ["Date","Order_Time","Pick_Start","Pick_End","Ship_Time"]:
    if c in df.columns:
        df[c] = pd.to_datetime(df[c], errors="coerce")

display(df.head(5))
print("\nColumns:", list(df.columns))
print("\nRows:", len(df))


Unnamed: 0,Order_ID,Date,Week_Num,SKU,Zone,Order_Time,Pick_Start,Pick_End,Ship_Time,Picks,Workers_On_Shift,Shift_Hours,Demand,Receipt_Qty,On_Hand_Inventory,Backorder_Qty,Is_Promo,Price,On_Order_Qty,Lead_Time_Days
0,WO000001,2024-01-01,1,SKU-INV-001,A,2024-01-01 12:04:34.371709988,2024-01-01 12:22:04.707434858,2024-01-01 12:27:23.456944007,2024-01-01 12:47:37.758316628,40,8,8.12,23.0,83.0,360.0,0.0,,,0.0,7.0
1,WO000002,2024-01-01,1,SKU-INV-001,B,2024-01-01 14:31:33.811356105,2024-01-01 14:47:16.663750715,2024-01-01 14:53:13.985610003,2024-01-01 15:28:40.143464557,14,8,8.12,23.0,83.0,360.0,0.0,,,0.0,7.0
2,WO000003,2024-01-01,1,SKU-INV-001,A,2024-01-01 09:21:23.120473445,2024-01-01 09:45:44.409387040,2024-01-01 09:48:14.409387040,2024-01-01 10:30:22.181918453,14,8,8.12,23.0,83.0,360.0,0.0,,,0.0,7.0
3,WO000004,2024-01-01,1,SKU-INV-001,C,2024-01-01 11:27:53.585366230,2024-01-01 11:34:45.424021159,2024-01-01 11:41:55.565324070,2024-01-01 12:09:04.791376229,27,8,8.12,23.0,83.0,360.0,0.0,,,0.0,7.0
4,WO000005,2024-01-01,1,SKU-INV-001,A,2024-01-01 13:01:37.628705527,2024-01-01 13:12:16.734844509,2024-01-01 13:17:33.431139791,2024-01-01 13:35:00.961321182,27,8,8.12,23.0,83.0,360.0,0.0,,,0.0,7.0



Columns: ['Order_ID', 'Date', 'Week_Num', 'SKU', 'Zone', 'Order_Time', 'Pick_Start', 'Pick_End', 'Ship_Time', 'Picks', 'Workers_On_Shift', 'Shift_Hours', 'Demand', 'Receipt_Qty', 'On_Hand_Inventory', 'Backorder_Qty', 'Is_Promo', 'Price', 'On_Order_Qty', 'Lead_Time_Days']

Rows: 1966


## Step 0B — Identify Variable Roles

**Cell ID:** `0B`

Fill lists with column names from the dataset. Keep it simple and defensible.

In [2]:
# TODO: Fill lists with column names from df.columns
STATE_VARIABLES = []        # quantities describing system state at a point in time
FLOW_VARIABLES = []         # quantities describing movement/processing over time
CONSTRAINT_VARIABLES = []   # capacity/resources/time constraints

STATE_VARIABLES, FLOW_VARIABLES, CONSTRAINT_VARIABLES


([], [], [])
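
One way to keep the lists defensible is to check them against the dataset's columns before moving on. The sketch below uses a hypothetical helper, `check_roles`, which is not part of the lab. It places a single column in each role purely as an illustrative assumption; the full classification remains the exercise. `COLUMNS` is copied from the printed column list in Step 0A.

```python
# Column names as printed in Step 0A.
COLUMNS = ['Order_ID', 'Date', 'Week_Num', 'SKU', 'Zone', 'Order_Time',
           'Pick_Start', 'Pick_End', 'Ship_Time', 'Picks', 'Workers_On_Shift',
           'Shift_Hours', 'Demand', 'Receipt_Qty', 'On_Hand_Inventory',
           'Backorder_Qty', 'Is_Promo', 'Price', 'On_Order_Qty', 'Lead_Time_Days']


def check_roles(columns, **roles):
    """Hypothetical helper: return (unknown names, names listed more than once)."""
    known = set(columns)
    seen, unknown, duplicated = set(), [], []
    for names in roles.values():
        for name in names:
            if name not in known:
                unknown.append(name)      # typo or column not in the dataset
            if name in seen:
                duplicated.append(name)   # same column assigned twice
            seen.add(name)
    return unknown, duplicated


# Illustrative assumption: one example column per role, not a complete answer.
unknown, duplicated = check_roles(
    COLUMNS,
    STATE_VARIABLES=["On_Hand_Inventory"],
    FLOW_VARIABLES=["Picks"],
    CONSTRAINT_VARIABLES=["Workers_On_Shift"],
)
print("Unknown:", unknown, "| Duplicated:", duplicated)
```

Two empty lists mean every name exists in the dataset and no column was assigned twice.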

## Step 1A — Construct Warehouse Cycle Time (Execution)

**Cell ID:** `1A`

Choose the two timestamps that bound warehouse execution lead time (order release → ship).

In [3]:
# TODO: Select correct columns for execution cycle time
CYCLE_START = None
CYCLE_END = None

df["wh_cycle_time_hrs"] = (
    pd.to_datetime(df[CYCLE_END]) - pd.to_datetime(df[CYCLE_START])
).dt.total_seconds() / 3600

# Basic sanity filter
df_kpi = df[df["wh_cycle_time_hrs"].notna() & (df["wh_cycle_time_hrs"] >= 0)].copy()
df_kpi["wh_cycle_time_hrs"].describe()


KeyError: None
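
The `KeyError: None` appears because `CYCLE_START` and `CYCLE_END` are still `None`. As a minimal sketch of the calculation once they are set, the toy example below assumes the execution window runs from `Order_Time` to `Ship_Time`, following the Order→Ship reading in the hands-on table. The two-row `toy` frame and its timestamps are invented for illustration and are not taken from the dataset.

```python
import pandas as pd

# Hypothetical two-order frame; the second row ships "before" it was ordered
# so that the sanity filter has something to remove.
toy = pd.DataFrame({
    "Order_Time": ["2024-01-01 09:00:00", "2024-01-01 10:30:00"],
    "Ship_Time":  ["2024-01-01 10:30:00", "2024-01-01 10:00:00"],
})

# Assumption: Order_Time -> Ship_Time bounds the execution cycle.
CYCLE_START, CYCLE_END = "Order_Time", "Ship_Time"

toy["wh_cycle_time_hrs"] = (
    pd.to_datetime(toy[CYCLE_END]) - pd.to_datetime(toy[CYCLE_START])
).dt.total_seconds() / 3600

# Same sanity filter as the lab cell: drop missing or negative durations.
toy_kpi = toy[toy["wh_cycle_time_hrs"].notna() & (toy["wh_cycle_time_hrs"] >= 0)].copy()
print(toy_kpi["wh_cycle_time_hrs"].tolist())  # [1.5]
```

The first order takes 1.5 hours; the second has a negative duration and is filtered out.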

## Step 1B — Daily Baseline KPI Summary

**Cell ID:** `1B`

Run to create a daily table used by all later steps.

In [None]:
df_kpi["day"] = df_kpi["Date"].dt.date

daily = df_kpi.groupby("day").agg(
    orders=("Order_ID","nunique"),
    total_picks=("Picks","sum"),
    avg_cycle_time_hrs=("wh_cycle_time_hrs","mean"),
    p90_cycle_time_hrs=("wh_cycle_time_hrs", lambda s: np.percentile(s.dropna(), 90)),
    workers=("Workers_On_Shift","median"),
    shift_hours=("Shift_Hours","median")
).reset_index()

display(daily.head(10))
daily.describe()
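
To see what the aggregation produces, here is a small sketch on made-up data. The `toy` frame is hypothetical and the values are chosen so the results are easy to check by hand. The throughput column follows the orders-per-shift-hour definition used later in Step 2B.

```python
import numpy as np
import pandas as pd

# Hypothetical order-level data: three orders on day 1, two on day 2.
toy = pd.DataFrame({
    "day": ["2024-01-01"] * 3 + ["2024-01-02"] * 2,
    "Order_ID": ["A1", "A2", "A3", "B1", "B2"],
    "Picks": [10, 20, 30, 5, 15],
    "wh_cycle_time_hrs": [1.0, 2.0, 3.0, 0.5, 1.5],
    "Shift_Hours": [8.0] * 5,
})

# Named aggregation: one output column per (input column, function) pair.
toy_daily = toy.groupby("day").agg(
    orders=("Order_ID", "nunique"),
    total_picks=("Picks", "sum"),
    avg_cycle_time_hrs=("wh_cycle_time_hrs", "mean"),
    p90_cycle_time_hrs=("wh_cycle_time_hrs", lambda s: np.percentile(s.dropna(), 90)),
    shift_hours=("Shift_Hours", "median"),
).reset_index()

# Throughput in orders per shift hour (same definition as Step 2B).
toy_daily["throughput_orders_per_hr"] = toy_daily["orders"] / toy_daily["shift_hours"]
print(toy_daily)
```

The mean and the 90th percentile are reported side by side because a few slow orders can leave the mean almost unchanged while moving the tail noticeably.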


## Step 1C — Baseline Dashboard Charts

**Cell ID:** `1C`

Common warehouse dashboard views: time series + distribution.

In [None]:
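# TODO: replace each "" below with a column name (x-axis: the day column; y-axis: the KPI named in the title)
# Time series: avg cycle time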
plt.figure()
plt.plot(pd.to_datetime(daily[""]), daily[""], marker="o")
plt.title("Avg Warehouse Cycle Time (hrs) — Daily")
plt.xlabel("Day")
plt.ylabel("Hours")
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

# Time series: orders per hour
# Note: `daily` has no orders-per-hour column yet; derive one first (e.g., orders / shift_hours)
plt.figure()
plt.plot(pd.to_datetime(daily[""]), daily[""], marker="o")
plt.title("Throughput (Orders per Hour) — Daily")
plt.xlabel("Day")
plt.ylabel("Orders/hour")
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

# Distribution: cycle time
plt.figure()
plt.hist(df_kpi[""].dropna(), bins=30)
plt.title("Distribution of Warehouse Cycle Time (hrs)")
plt.xlabel("Hours")
plt.ylabel("Count")
plt.tight_layout()
plt.show()


## Step 2A — Build Capacity Inputs: Pick Labor-Hours and Pick Rate

**Cell ID:** `2A`

Pick labor-hours are computed from pick start/end timestamps. Then pick rate = total picks / pick labor-hours.

In [None]:
# TODO: Select correct pick timestamps (work time spent picking)
PICK_START = None
PICK_END = None

df_kpi["pick_labor_hrs"] = (
    pd.to_datetime(df_kpi[PICK_END]) - pd.to_datetime(df_kpi[PICK_START])
).dt.total_seconds() / 3600

# keep only valid pick durations
df_kpi = df_kpi[df_kpi["pick_labor_hrs"].notna() & (df_kpi["pick_labor_hrs"] >= 0)].copy()

pick_daily = df_kpi.groupby(df_kpi["day"]).agg(
    pick_labor_hrs=("pick_labor_hrs","sum"),
    picks=("Picks","sum")
).reset_index()

pick_daily["pick_rate_picks_per_lh"] = pick_daily["picks"] / pick_daily["pick_labor_hrs"]

display(pick_daily.head(10))
pick_daily["pick_rate_picks_per_lh"].describe()
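
The pick-rate logic can be sketched on a toy frame. This example assumes `Pick_Start` and `Pick_End` are the intended timestamps, which is the natural reading of the column names but still the choice to confirm in the TODO. All rows are invented for illustration.

```python
import pandas as pd

# Hypothetical pick records: two orders on day 1, one on day 2.
toy = pd.DataFrame({
    "day": ["2024-01-01", "2024-01-01", "2024-01-02"],
    "Pick_Start": ["2024-01-01 09:00", "2024-01-01 10:00", "2024-01-02 09:00"],
    "Pick_End":   ["2024-01-01 09:30", "2024-01-01 10:15", "2024-01-02 09:12"],
    "Picks": [30, 15, 18],
})

# Assumption: these two columns bound the time spent picking.
PICK_START, PICK_END = "Pick_Start", "Pick_End"

toy["pick_labor_hrs"] = (
    pd.to_datetime(toy[PICK_END]) - pd.to_datetime(toy[PICK_START])
).dt.total_seconds() / 3600

toy_pick_daily = toy.groupby("day").agg(
    pick_labor_hrs=("pick_labor_hrs", "sum"),
    picks=("Picks", "sum"),
).reset_index()

# Ratio of daily sums, not the mean of per-order rates.
toy_pick_daily["pick_rate_picks_per_lh"] = (
    toy_pick_daily["picks"] / toy_pick_daily["pick_labor_hrs"]
)
print(toy_pick_daily)
```

Day 1 has 45 picks in 0.75 labor-hours (about 60 picks per labor-hour); day 2 has 18 picks in 0.2 labor-hours (about 90). Dividing daily sums weights each order by the time actually spent on it, so very short picks do not dominate the rate.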


## Step 2B — Utilization and WIP Proxy (Congestion KPIs)

**Cell ID:** `2B`

Merge pick rate into daily table. Then compute arrival rate, capacity rate, utilization, and WIP proxy.

In [None]:
# Merge pick-rate into daily table
daily2 = daily.merge(pick_daily[["day","pick_labor_hrs","pick_rate_picks_per_lh"]], on="day", how="left")

# TODO: Define arrival rate and capacity rate (both in picks/hour)
arrival_rate = None
capacity_rate = None

daily2["utilization"] = arrival_rate / capacity_rate

# WIP proxy via Little’s Law: throughput (orders/hour) × cycle time (hours)
daily2["throughput_orders_per_hr"] = daily2["orders"] / daily2["shift_hours"]
daily2["wip_proxy_orders"] = daily2["throughput_orders_per_hr"] * daily2["avg_cycle_time_hrs"]

display(daily2[["day","orders","avg_cycle_time_hrs","pick_rate_picks_per_lh","utilization","wip_proxy_orders"]].head(12))
daily2[["avg_cycle_time_hrs","utilization","wip_proxy_orders"]].describe()
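
A minimal sketch of the two congestion KPIs on made-up numbers. It assumes arrival rate = total picks / shift hours and capacity rate = workers × pick rate, which mirrors how Steps 3A and 3B compute utilization. Treat these definitions as one defensible choice for the TODO rather than the only one. The `toy_daily` values are hypothetical.

```python
import pandas as pd

# Hypothetical daily table with the columns the lab's daily2 would carry.
toy_daily = pd.DataFrame({
    "day": ["2024-01-01", "2024-01-02"],
    "orders": [40, 60],
    "total_picks": [800, 1200],
    "avg_cycle_time_hrs": [0.5, 0.8],
    "workers": [8, 8],
    "shift_hours": [8.0, 8.0],
    "pick_rate_picks_per_lh": [25.0, 25.0],
})

# Assumed definitions, both in picks/hour.
arrival_rate = toy_daily["total_picks"] / toy_daily["shift_hours"]
capacity_rate = toy_daily["workers"] * toy_daily["pick_rate_picks_per_lh"]

toy_daily["utilization"] = arrival_rate / capacity_rate

# Little's Law: WIP = throughput (orders/hour) x cycle time (hours).
toy_daily["throughput_orders_per_hr"] = toy_daily["orders"] / toy_daily["shift_hours"]
toy_daily["wip_proxy_orders"] = (
    toy_daily["throughput_orders_per_hr"] * toy_daily["avg_cycle_time_hrs"]
)
print(toy_daily[["day", "utilization", "wip_proxy_orders"]])
```

With these numbers, utilization rises from 0.50 to 0.75 while the WIP proxy rises from 2.5 to roughly 6 orders, so work in process grows faster than utilization because cycle time lengthens as well.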


## Step 3A — Predictive Scenario: Demand Shock (+15%)

**Cell ID:** `3A`

Apply the demand multiplier and note the impact on utilization under fixed capacity.

In [None]:
DEMAND_MULTIPLIER = 1.15

daily2["arrival_rate_future"] = daily2["total_picks"] * DEMAND_MULTIPLIER / daily2["shift_hours"]
daily2["utilization_future"] = daily2["arrival_rate_future"] / (daily2["workers"] * daily2["pick_rate_picks_per_lh"])

display(daily2[["day","utilization","utilization_future"]].head(12))
daily2[["utilization","utilization_future"]].describe()


## Step 3B — Diagnostic Check: Scaling Pattern

**Cell ID:** `3B`

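Compute the ratio `utilization_future / utilization` and check whether it is approximately constant across days. The cell below then builds Scenario B, a non-uniform weekday workload with an optional single-day shock, for comparison against the baseline.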

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Scenario B: Non-uniform workload (weekday pattern) + optional single-day shock
# Change the weekday multipliers and stress-day controls to observe utilization and staffing impacts downstream.

BASE_UPLIFT = 1.25

weekday_multiplier = {
    0: 1.35,  # Mon
    1: 1.10,
    2: 1.00,
    3: 1.00,
    4: 1.20,  # Fri
    5: 0.85,
    6: 0.75
}

ENABLE_SINGLE_SHOCK_DAY = True
SINGLE_DAY_SHOCK_MULT = 2.20

# Calibration constant (keeps ratios intact, makes staffing levels visible)
VOLUME_SCALE = 3.50

tmp_days = pd.to_datetime(daily2["day"])
daily2["weekday"] = tmp_days.dt.weekday

# Baseline arrival rate (picks/hour)
daily2["arrival_rate"] = daily2["total_picks"] / daily2["shift_hours"]

daily2["demand_multiplier_B"] = BASE_UPLIFT * daily2["weekday"].map(weekday_multiplier).astype(float)
daily2["arrival_rate_future_B"] = daily2["arrival_rate"] * daily2["demand_multiplier_B"] * VOLUME_SCALE

if ENABLE_SINGLE_SHOCK_DAY:
    shock_day = tmp_days.iloc[-5]
    daily2.loc[tmp_days == shock_day, "arrival_rate_future_B"] *= SINGLE_DAY_SHOCK_MULT

daily2["utilization_future_B"] = daily2["arrival_rate_future_B"] / (daily2["workers"] * daily2["pick_rate_picks_per_lh"])

plt.figure(figsize=(10,4))
plt.plot(tmp_days, daily2["utilization"], marker="o", label="Baseline utilization")
plt.plot(tmp_days, daily2["utilization_future_B"], marker="o", label="Scenario B utilization")
plt.title("Utilization — Baseline vs Scenario B (Weekday Peaks + Stress Day)")
plt.xlabel("Day")
plt.ylabel("Utilization")
plt.xticks(rotation=45)
plt.legend()
plt.tight_layout()
plt.show()

display(daily2[["day","weekday","demand_multiplier_B","utilization","utilization_future_B"]].head(12))
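
The ratio check described for this step can be sketched on toy data. The example assumes baseline utilization is defined as (total picks / shift hours) / (workers × pick rate), matching the Step 3A formula; the numbers are invented. Under fixed capacity, a uniform multiplier should scale utilization by exactly that multiplier on every day.

```python
import numpy as np
import pandas as pd

# Hypothetical three-day table with constant capacity.
toy = pd.DataFrame({
    "day": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"]),
    "total_picks": [800, 1000, 1200],
    "shift_hours": [8.0, 8.0, 8.0],
    "workers": [8, 8, 8],
    "pick_rate_picks_per_lh": [25.0, 25.0, 25.0],
})

capacity_rate = toy["workers"] * toy["pick_rate_picks_per_lh"]
toy["utilization"] = (toy["total_picks"] / toy["shift_hours"]) / capacity_rate

# Uniform demand shock, as in Step 3A.
DEMAND_MULTIPLIER = 1.15
toy["utilization_future"] = (
    toy["total_picks"] * DEMAND_MULTIPLIER / toy["shift_hours"]
) / capacity_rate

# Scaling check: the ratio should equal the multiplier on every day.
toy["scaling_ratio"] = toy["utilization_future"] / toy["utilization"]
is_linear = np.allclose(toy["scaling_ratio"], DEMAND_MULTIPLIER)

print(toy[["day", "utilization", "utilization_future", "scaling_ratio"]])
print("Constant ratio equal to the multiplier:", is_linear)
```

Applying the same ratio to `utilization_future_B` would be expected to give a non-constant result, since Scenario B multiplies each day by a weekday-specific factor, `VOLUME_SCALE`, and possibly the single-day shock.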


## Step 4A — Prescriptive: Define Target Utilization

**Cell ID:** `4A`

Target utilization is a policy threshold (e.g., 85%) for stable operations.

In [None]:
TARGET_UTIL = 0.85
TARGET_UTIL

## Step 4B — Prescriptive: Required Staffing to Hit Target

**Cell ID:** `4B`

Compute workers needed to keep utilization at/below target under demand shock.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# ---------------------------------------------
# 4B) Prescriptive Staffing — Scenario B
# Model: demand pressure (arrival_rate_future_B) vs effective execution capacity (workers_effective × pick_rate_effective)
# ---------------------------------------------

# Policy parameter (decision target)
TARGET_UTIL = TARGET_UTIL  # keep as set earlier (e.g., 0.85). Change to test stricter vs looser policy.

# Capacity loss (attendance / availability)
workers_lost_by_weekday = {
    0: 1,  # Mon: absenteeism / onboarding
    1: 0,
    2: 0,
    3: 0,
    4: 1,  # Fri: early shift-outs
    5: 3,  # Sat: skeleton crew
    6: 4   # Sun: minimal crew
}

# Productivity loss (congestion / fatigue / travel distance)
congestion_factor_by_weekday = {
    0: 0.70,  # Mon peak congestion
    1: 0.85,
    2: 1.00,
    3: 1.00,
    4: 0.75,  # Fri congestion + fatigue
    5: 1.05,  # Sat smoother flow
    6: 1.10   # Sun light operations
}

# Stress-test lever (scenario tuning)
ENABLE_PEAK_WEEK = True
PEAK_WEEK_MULT = 3.0    # adjust upward if required staffing is still below available
PEAK_DAYS = 4           # how many last days treated as peak days

# ---------------------------------------------
# 4B1) Effective workers
# ---------------------------------------------
daily2["day"] = pd.to_datetime(daily2["day"])
if "weekday" not in daily2.columns:
    daily2["weekday"] = daily2["day"].dt.weekday

daily2["workers_lost"] = daily2["weekday"].map(workers_lost_by_weekday).astype(float)
daily2["workers_effective"] = (daily2["workers"] - daily2["workers_lost"]).clip(lower=1)

# ---------------------------------------------
# 4B2) Effective pick rate
# ---------------------------------------------
daily2["congestion_factor"] = daily2["weekday"].map(congestion_factor_by_weekday).astype(float)
daily2["pick_rate_effective"] = (daily2["pick_rate_picks_per_lh"] * daily2["congestion_factor"]).clip(lower=1e-6)

# ---------------------------------------------
# 4B3) Peak window multiplier (Scenario B arrival pressure)
# ---------------------------------------------
df_days = pd.to_datetime(daily2["day"])
peak_window = df_days.isin(df_days.sort_values().tail(PEAK_DAYS))

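if ENABLE_PEAK_WEEK:
    # Note: this multiplies in place, so re-running this cell compounds the uplift.
    # Re-run cell 3B first to reset arrival_rate_future_B.
    daily2.loc[peak_window, "arrival_rate_future_B"] *= PEAK_WEEK_MULT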

# ---------------------------------------------
# 4B4) Required staffing (raw + rounded) using effective capacity
# ---------------------------------------------
daily2["workers_required_B_raw"] = daily2["arrival_rate_future_B"] / (TARGET_UTIL * daily2["pick_rate_effective"])
daily2["workers_required_B"] = np.ceil(daily2["workers_required_B_raw"])

daily2["staffing_gap_B_raw"] = daily2["workers_required_B_raw"] - daily2["workers_effective"]
daily2["staffing_gap_B"] = daily2["workers_required_B"] - daily2["workers_effective"]

# Total headcount required (same scale as workers_effective)
daily2["workers_required_total_B"] = daily2["workers_effective"] + daily2["staffing_gap_B"].clip(lower=0)

display(daily2[[
    "day","weekday",
    "workers","workers_lost","workers_effective",
    "pick_rate_picks_per_lh","congestion_factor","pick_rate_effective",
    "arrival_rate_future_B",
    "workers_required_B_raw","workers_required_B",
    "staffing_gap_B_raw","staffing_gap_B",
    "workers_required_total_B"
]].head(12))

# ---------------------------------------------
# 4B5) Dashboards
# ---------------------------------------------
N = 12
dfp = daily2.sort_values("day").tail(N).copy()
x = np.arange(len(dfp))

# View 1: RAW staffing gap (continuous signal)
plt.figure(figsize=(11,4.5))
plt.bar(x, dfp["staffing_gap_B_raw"])
plt.axhline(0)
plt.xticks(x, dfp["day"].dt.strftime("%Y-%m-%d"), rotation=45, ha="right")
plt.ylabel("Staffing Gap (raw workers)")
plt.title("Staffing Gap (RAW) — Scenario B vs Effective Staffing")
plt.tight_layout()
plt.show()

# View 2: Rounded decision gap (headcount reality)
plt.figure(figsize=(11,4.5))
plt.bar(x, dfp["staffing_gap_B"])
plt.axhline(0)
plt.xticks(x, dfp["day"].dt.strftime("%Y-%m-%d"), rotation=45, ha="right")
plt.ylabel("Staffing Gap (rounded workers)")
plt.title("Staffing Gap (ROUNDED) — Scenario B vs Effective Staffing")
plt.tight_layout()
plt.show()

# View 3: Effective vs Total Required (side-by-side) — rounded decision
w = 0.40
plt.figure(figsize=(11,4.5))
plt.bar(x - w/2, dfp["workers_effective"], width=w, label="Effective workers (available)")
plt.bar(x + w/2, dfp["workers_required_total_B"], width=w, label="Total workers required (rounded)")
plt.xticks(x, dfp["day"].dt.strftime("%Y-%m-%d"), rotation=45, ha="right")
plt.ylabel("Workers")
plt.title("Effective vs Total Required Staffing — Scenario B (Rounded Decision)")
plt.legend()
plt.tight_layout()
plt.show()

# Driver view: pick rate degradation (why staffing requirement rises)
plt.figure(figsize=(11,3.8))
plt.plot(dfp["day"], dfp["pick_rate_picks_per_lh"], marker="o", label="Base pick rate")
plt.plot(dfp["day"], dfp["pick_rate_effective"], marker="o", label="Effective pick rate (with congestion)")
plt.xticks(rotation=45, ha="right")
plt.ylabel("Picks / labor hour")
plt.title("Pick Rate Degradation Due to Congestion (Driver of Staffing Requirement)")
plt.legend()
plt.tight_layout()
plt.show()
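
The staffing formula in 4B4 can be checked by hand on a single day. The sketch below uses made-up inputs (none are taken from the dataset) and follows the same rule: required workers = future arrival rate / (target utilization × effective pick rate), rounded up to whole people.

```python
import numpy as np

# Hypothetical single-day inputs.
TARGET_UTIL = 0.85
arrival_rate_future = 450.0   # picks/hour under the scenario
pick_rate_effective = 50.0    # picks per labor-hour after congestion
workers_effective = 8.0       # workers actually available

# Workers needed so that arrival / (workers x pick rate) <= TARGET_UTIL.
workers_required_raw = arrival_rate_future / (TARGET_UTIL * pick_rate_effective)
workers_required = float(np.ceil(workers_required_raw))

# Positive gap = shortfall; negative gap = slack.
staffing_gap = workers_required - workers_effective

# Utilization implied by the rounded headcount.
implied_util = arrival_rate_future / (workers_required * pick_rate_effective)

print(f"raw={workers_required_raw:.2f}, rounded={workers_required:.0f}, "
      f"gap={staffing_gap:.0f}, implied utilization={implied_util:.3f}")
```

Here the raw requirement is about 10.59 workers, which rounds up to 11, a gap of 3 against the 8 available. Rounding up is what keeps implied utilization (about 0.818) at or below the 0.85 target; rounding down to 10 would give 0.90.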
