# AI-Driven Inventory Intelligence System
### KPI Engineering • Risk Signals • Actionable Alerts • Power BI Outputs

This notebook turns the **Gold weekly dataset** into **decision-ready analytics tables**:
- Store/week KPIs (revenue proxy, demand trends)
- Item-level risk scoring (dead-stock, spikes, volatility)
- An alerts table (what to do next)
- Power BI-friendly aggregates and ranks


In [0]:
from pyspark.sql import functions as F
from pyspark.sql.window import Window

gold_table = "workspace.database.gold_weekly_inventory_intelligence_v2"
gold = spark.read.table(gold_table)

gold.select(
    F.count("*").alias("rows"),
    F.countDistinct("store_id").alias("stores"),
    F.countDistinct("item_id").alias("items"),
    F.countDistinct("wm_yr_wk").alias("weeks")
).display()

display(gold.limit(5))


## Step 1 — Safety & Standardization
We cast the numeric columns to `double` and guard each derived ratio so that a zero, negative, or missing denominator yields null instead of a divide-by-zero result.


In [0]:
gold2 = (
    gold
    .withColumn("weekly_sales", F.col("weekly_sales").cast("double"))
    .withColumn("final_sell_price", F.col("final_sell_price").cast("double"))
    .withColumn("lag_price_1", F.col("lag_price_1").cast("double"))
    .withColumn("roll_avg_sales_4", F.col("roll_avg_sales_4").cast("double"))
    .withColumn("roll_std_sales_4", F.col("roll_std_sales_4").cast("double"))
)

# Revenue proxy: weekly sales times final sell price (an estimate, not booked revenue)
gold2 = gold2.withColumn("revenue_proxy", F.col("weekly_sales") * F.col("final_sell_price"))

# Sales relative to the 4-week rolling average; null when the average is missing or not positive
gold2 = gold2.withColumn(
    "sales_vs_rollavg",
    F.when(F.col("roll_avg_sales_4") > 0, F.col("weekly_sales") / F.col("roll_avg_sales_4")).otherwise(F.lit(None))
)

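# Price change vs. lag_price_1 as a fraction (-0.10 = 10% drop); null when the lagged price is missing or not positive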
gold2 = gold2.withColumn(
    "price_change_pct",
    F.when(F.col("lag_price_1") > 0, (F.col("final_sell_price") - F.col("lag_price_1")) / F.col("lag_price_1")).otherwise(F.lit(None))
)
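
The null-guard pattern used for `sales_vs_rollavg` and `price_change_pct` can be checked on plain numbers. The following is a minimal pure-Python sketch, not part of the Spark pipeline; `safe_ratio` and `pct_change` are hypothetical helper names, and the input values are made up for illustration.

```python
from typing import Optional


def safe_ratio(numerator: Optional[float], denominator: Optional[float]) -> Optional[float]:
    """Return numerator / denominator, or None when the denominator is missing or not positive."""
    if numerator is None or denominator is None or denominator <= 0:
        return None
    return numerator / denominator


def pct_change(current: Optional[float], previous: Optional[float]) -> Optional[float]:
    """Relative change from previous to current, guarded the same way as safe_ratio."""
    if current is None or previous is None:
        return None
    return safe_ratio(current - previous, previous)


# Hypothetical values for illustration only.
ratio = safe_ratio(12.0, 8.0)      # sales above the rolling average
no_history = safe_ratio(5.0, 0.0)  # rolling average of zero gives None, not an error
price_move = pct_change(4.0, 5.0)  # price cut from 5.0 to 4.0

print(ratio, no_history, price_move)
```

Returning null rather than 0 keeps "no baseline available" distinguishable from "no change" in downstream filters and visuals.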


## Step 2 — Demand Signals & Risk Flags

We create signals that map to operational decisions:
- **Dead-stock risk** (persistent zero sales)
- **Demand spikes** (unusual surge)
- **Volatility** (unstable demand)
- **Promo-like behavior** (price drop + sales jump)


In [0]:
# Dead-stock / low movement flags
gold2 = gold2.withColumn("is_zero_sales", (F.col("weekly_sales") == 0).cast("int"))

# Spike rule: sales > avg + 2*std (only when std exists)
gold2 = gold2.withColumn(
    "is_spike",
    F.when(
        (F.col("roll_avg_sales_4") > 0) & (F.col("roll_std_sales_4").isNotNull()) &
        (F.col("weekly_sales") > (F.col("roll_avg_sales_4") + 2 * F.col("roll_std_sales_4"))),
        1
    ).otherwise(0)
)

# Volatility: coefficient of variation (std/avg)
gold2 = gold2.withColumn(
    "sales_cv_4w",
    F.when(F.col("roll_avg_sales_4") > 0, F.col("roll_std_sales_4") / F.col("roll_avg_sales_4")).otherwise(F.lit(None))
)

# Promo-ish: price drop >= 10% and sales above rolling avg
gold2 = gold2.withColumn(
    "is_promo_like",
    F.when(
        (F.col("price_change_pct").isNotNull()) &
        (F.col("price_change_pct") <= -0.10) &
        (F.col("roll_avg_sales_4") > 0) &
        (F.col("weekly_sales") > F.col("roll_avg_sales_4")),
        1
    ).otherwise(0)
)
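
To make the three rules concrete, here is a small pure-Python sketch that applies them to hand-picked numbers. It is illustrative only: the function names `spike_flag`, `sales_cv`, and `promo_like_flag` are hypothetical, the inputs are invented, and the rolling average and standard deviation are taken as given (as they are in the Gold table).

```python
from typing import Optional


def spike_flag(sales: Optional[float], avg: Optional[float], std: Optional[float]) -> int:
    """1 when sales > avg + 2 * std, given a positive average and a known std; else 0."""
    if sales is None or avg is None or std is None or avg <= 0:
        return 0
    return int(sales > avg + 2 * std)


def sales_cv(avg: Optional[float], std: Optional[float]) -> Optional[float]:
    """Coefficient of variation (std / avg); None when the average is missing or not positive."""
    if avg is None or std is None or avg <= 0:
        return None
    return std / avg


def promo_like_flag(sales: Optional[float], avg: Optional[float],
                    price_change_pct: Optional[float]) -> int:
    """1 when price fell by at least 10% and sales are above the rolling average; else 0."""
    if sales is None or avg is None or price_change_pct is None or avg <= 0:
        return 0
    return int(price_change_pct <= -0.10 and sales > avg)


# Hypothetical item: rolling average 10 units, rolling std 3 units (threshold = 16).
spike_hit = spike_flag(17.0, 10.0, 3.0)         # above the threshold
spike_edge = spike_flag(16.0, 10.0, 3.0)        # exactly at the threshold: not a spike
cv = sales_cv(10.0, 3.0)
promo_hit = promo_like_flag(12.0, 10.0, -0.15)  # 15% price drop, sales above average
promo_miss = promo_like_flag(12.0, 10.0, -0.05) # price drop too small

print(spike_hit, spike_edge, cv, promo_hit, promo_miss)
```

Note that the spike comparison is strict, so a week that lands exactly on `avg + 2 * std` is not flagged.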


## Step 3 — Latest Week Snapshot & Rankings
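A current-week action list is a natural Power BI view: top items to restock, top dead-stock risks, and biggest spikes. This cell builds the latest-week snapshot and ranks restock candidates per store; dead-stock risks and spike alerts follow in Steps 4 and 5.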


In [0]:
latest_wk = gold2.agg(F.max("wm_yr_wk").alias("max_wk")).collect()[0]["max_wk"]
print("Latest wm_yr_wk:", latest_wk)

current = gold2.filter(F.col("wm_yr_wk") == latest_wk)

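# Restock candidates: top 50 items per store by weekly sales, then revenue proxy.
# sales_vs_rollavg and is_spike are carried as context columns; they do not drive the rank.
# item_id is a final tie-breaker so row_number() is deterministic.
restock_rank_w = Window.partitionBy("store_id").orderBy(
    F.col("weekly_sales").desc(), F.col("revenue_proxy").desc(), F.col("item_id")
)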

restock_candidates = (
    current
    .filter(F.col("weekly_sales") > 0)
    .withColumn("restock_rank", F.row_number().over(restock_rank_w))
    .filter(F.col("restock_rank") <= 50)
    .select(
        "store_id","item_id","dept_id","cat_id","wm_yr_wk",
        "weekly_sales","final_sell_price","revenue_proxy",
        "sales_vs_rollavg","is_spike","is_promo_like","restock_rank"
    )
)

display(restock_candidates.orderBy("store_id","restock_rank"))
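
The ranking above is a "top N per group" pattern: partition by store, order by weekly sales and then revenue proxy, and keep the first N rows. A rough pure-Python equivalent on a handful of invented rows may help when validating the Spark output; `top_n_per_store` is a hypothetical helper and the store and item IDs are placeholders.

```python
from collections import defaultdict
from typing import Dict, List


def top_n_per_store(rows: List[Dict], n: int) -> List[Dict]:
    """Rank items within each store by weekly_sales, then revenue_proxy, keeping the top n."""
    by_store = defaultdict(list)
    for row in rows:
        if row["weekly_sales"] > 0:  # items with no sales are not restock candidates
            by_store[row["store_id"]].append(row)

    ranked = []
    for store_id in sorted(by_store):
        ordered = sorted(
            by_store[store_id],
            key=lambda r: (-r["weekly_sales"], -r["revenue_proxy"]),
        )
        for rank, row in enumerate(ordered[:n], start=1):
            ranked.append({**row, "restock_rank": rank})
    return ranked


# Hypothetical latest-week rows.
rows = [
    {"store_id": "S1", "item_id": "A", "weekly_sales": 5, "revenue_proxy": 10.0},
    {"store_id": "S1", "item_id": "B", "weekly_sales": 9, "revenue_proxy": 18.0},
    {"store_id": "S1", "item_id": "C", "weekly_sales": 0, "revenue_proxy": 0.0},
    {"store_id": "S1", "item_id": "D", "weekly_sales": 5, "revenue_proxy": 15.0},
    {"store_id": "S2", "item_id": "E", "weekly_sales": 3, "revenue_proxy": 6.0},
]

ranked = top_n_per_store(rows, n=2)
summary = [(r["store_id"], r["item_id"], r["restock_rank"]) for r in ranked]
print(summary)
```

Items A and D tie on weekly sales, so the revenue proxy decides which one takes rank 2 in store S1.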


## Step 4 — Dead-Stock Risk Scoring (Simple & Defensible)

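For each store-item pair we take the 8 most recent weekly rows, compute the share of those weeks with zero sales, and bucket the result into HIGH, MED, or LOW risk.
The rules are intentionally simple so that operations teams can audit and trust them.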


In [0]:
w_item = Window.partitionBy("store_id","item_id").orderBy(F.col("wm_yr_wk").desc())

# Inactivity score over the 8 most recent rows per store-item: fraction (0 to 1) of weeks with zero sales
dead_stock = (
    gold2
    .withColumn("rn_desc", F.row_number().over(w_item))
    .filter(F.col("rn_desc") <= 8)
    .groupBy("store_id","item_id","dept_id","cat_id")
    .agg(
        F.mean("is_zero_sales").alias("zero_sales_rate_8w"),
        F.sum("weekly_sales").alias("sales_sum_8w"),
        F.mean("final_sell_price").alias("avg_price_8w")
    )
    .withColumn(
        "dead_stock_risk",
        # Risk increases with persistent zeros and low total movement.
        # With non-negative sales, sales_sum_8w == 0 implies every week in the window was zero.
        F.when((F.col("zero_sales_rate_8w") >= 0.75) & (F.col("sales_sum_8w") == 0), F.lit("HIGH"))
         .when((F.col("zero_sales_rate_8w") >= 0.50) & (F.col("sales_sum_8w") <= 2), F.lit("MED"))
         .otherwise(F.lit("LOW"))
    )
)

display(dead_stock.orderBy(F.col("zero_sales_rate_8w").desc(), F.col("sales_sum_8w").asc()).limit(50))
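
The bucketing rule can be traced by hand on a few sales histories. Below is a minimal pure-Python sketch of the same thresholds; `dead_stock_risk` is a hypothetical function name, the histories are invented, and each list is assumed to be ordered most recent week first.

```python
from typing import Sequence


def dead_stock_risk(recent_sales: Sequence[float]) -> str:
    """Classify an item from its most recent weekly sales (up to 8 weeks, newest first)."""
    if not recent_sales:
        raise ValueError("need at least one week of sales")
    window = list(recent_sales)[:8]
    zero_rate = sum(1 for s in window if s == 0) / len(window)
    total = sum(window)
    if zero_rate >= 0.75 and total == 0:
        return "HIGH"
    if zero_rate >= 0.50 and total <= 2:
        return "MED"
    return "LOW"


# Hypothetical 8-week histories.
never_sold = dead_stock_risk([0, 0, 0, 0, 0, 0, 0, 0])   # no movement at all
trickle = dead_stock_risk([0, 0, 0, 0, 1, 0, 1, 0])      # 6 of 8 weeks zero, 2 units total
one_burst = dead_stock_risk([0, 0, 0, 0, 0, 0, 0, 5])    # 7 of 8 weeks zero, 5 units total
healthy = dead_stock_risk([3, 4, 0, 2, 5, 1, 0, 2])      # mostly selling

print(never_sold, trickle, one_burst, healthy)
```

One consequence worth knowing: under these thresholds an item with seven zero weeks and a single 5-unit week is rated LOW, because both the HIGH and MED rules also cap total units sold.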


## Step 5 — Alerts Table (Actionable)
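This table turns the risk signals into next steps: each latest-week item that triggers a rule gets one alert type, a severity, and a recommended action. When several rules match, the first match wins in the order demand spike, dead stock, promo lift.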


In [0]:
alerts = (
    current.select("store_id","item_id","dept_id","cat_id","wm_yr_wk",
                   "weekly_sales","final_sell_price","revenue_proxy",
                   "is_spike","is_promo_like","sales_vs_rollavg")
    .join(dead_stock.select("store_id","item_id","dead_stock_risk","zero_sales_rate_8w","sales_sum_8w"),
          on=["store_id","item_id"], how="left")
    .withColumn(
        "alert_type",
        F.when(F.col("is_spike") == 1, F.lit("DEMAND_SPIKE"))
         .when(F.col("dead_stock_risk") == "HIGH", F.lit("DEAD_STOCK"))
         .when(F.col("is_promo_like") == 1, F.lit("PROMO_LIFT"))
         .otherwise(F.lit(None))
    )
    .filter(F.col("alert_type").isNotNull())
    .withColumn(
        "severity",
        F.when(F.col("alert_type") == "DEMAND_SPIKE", F.lit("HIGH"))
         .when(F.col("alert_type") == "DEAD_STOCK", F.lit("HIGH"))
         .when(F.col("alert_type") == "PROMO_LIFT", F.lit("MED"))
         .otherwise(F.lit("LOW"))
    )
    .withColumn(
        "recommended_action",
        F.when(F.col("alert_type") == "DEMAND_SPIKE", F.lit("Review replenishment; consider restock or redistribution."))
         .when(F.col("alert_type") == "DEAD_STOCK", F.lit("Investigate item; consider markdown, delist, or reduce orders."))
         .when(F.col("alert_type") == "PROMO_LIFT", F.lit("Confirm promo impact; consider extending or adjusting pricing."))
         .otherwise(F.lit(""))
    )
)

display(alerts.orderBy("store_id","severity").limit(100))
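
Because the alert type is a chained `when`, only the first matching rule is kept for each item. The sketch below mirrors that precedence in plain Python so the behavior can be checked on a few cases; `classify_alert` and `SEVERITY` are hypothetical names, and the inputs are invented.

```python
from typing import Optional

# Severity per alert type, as assigned in the alerts table.
SEVERITY = {"DEMAND_SPIKE": "HIGH", "DEAD_STOCK": "HIGH", "PROMO_LIFT": "MED"}


def classify_alert(is_spike: int, dead_stock_risk: Optional[str],
                   is_promo_like: int) -> Optional[str]:
    """Return the first matching alert type, or None when no rule fires."""
    if is_spike == 1:
        return "DEMAND_SPIKE"
    if dead_stock_risk == "HIGH":
        return "DEAD_STOCK"
    if is_promo_like == 1:
        return "PROMO_LIFT"
    return None


# Hypothetical flag combinations.
all_flags = classify_alert(1, "HIGH", 1)     # spike takes precedence
dead_and_promo = classify_alert(0, "HIGH", 1)
promo_only = classify_alert(0, None, 1)      # None models a miss in the left join
no_alert = classify_alert(0, "LOW", 0)

print(all_flags, dead_and_promo, promo_only, no_alert)
print(SEVERITY[promo_only])
```

If an item should be able to carry more than one alert, one option would be to emit one row per matching rule instead of a single `alert_type` column; that would be a design change rather than something the current table does.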


## Step 6 — Power BI Output Tables (Delta)

We write three BI-friendly Delta tables, each overwritten in full on every run:
- Item-week decision table
- Store-week KPI summary
- Alerts (latest week)


In [0]:
decision_item_week = gold2.select(
    "store_id","item_id","dept_id","cat_id","wm_yr_wk",
    "weekly_sales","final_sell_price","revenue_proxy",
    "event_ratio","snap_ratio",
    "sales_vs_rollavg","price_change_pct","sales_cv_4w",
    "is_spike","is_promo_like"
)

store_week_kpis = (
    decision_item_week
    .groupBy("store_id","wm_yr_wk")
    .agg(
        F.sum("weekly_sales").alias("store_week_sales"),
        F.sum("revenue_proxy").alias("store_week_revenue_proxy"),
        F.mean(F.col("is_spike").cast("double")).alias("spike_rate"),
        F.mean(F.col("is_promo_like").cast("double")).alias("promo_like_rate"),
        F.mean("final_sell_price").alias("avg_price")
    )
)

out_decision = "workspace.database.gold_decision_item_week_v1"
out_storekpi = "workspace.database.gold_store_week_kpis_v1"
out_alerts  = "workspace.database.gold_alerts_latest_week_v1"

(decision_item_week.write.mode("overwrite").option("overwriteSchema","true").format("delta").saveAsTable(out_decision))
(store_week_kpis.write.mode("overwrite").option("overwriteSchema","true").format("delta").saveAsTable(out_storekpi))
(alerts.write.mode("overwrite").option("overwriteSchema","true").format("delta").saveAsTable(out_alerts))

print("✅ Written:", out_decision)
print("✅ Written:", out_storekpi)
print("✅ Written:", out_alerts)
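
As a sanity check on the store-week KPI definitions, the aggregation can be reproduced by hand for one store and week. This is a minimal pure-Python sketch with invented rows and a hypothetical helper name (`store_week_kpis_py`); it assumes no null values, whereas Spark's `sum` and `mean` skip nulls.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def store_week_kpis_py(rows: List[Dict]) -> Dict[Tuple[str, int], Dict[str, float]]:
    """Aggregate item-week rows to one KPI record per (store_id, wm_yr_wk)."""
    groups = defaultdict(list)
    for r in rows:
        groups[(r["store_id"], r["wm_yr_wk"])].append(r)

    out = {}
    for key, g in groups.items():
        n = len(g)
        out[key] = {
            "store_week_sales": sum(r["weekly_sales"] for r in g),
            "store_week_revenue_proxy": sum(r["revenue_proxy"] for r in g),
            "spike_rate": sum(r["is_spike"] for r in g) / n,            # share of items flagged
            "promo_like_rate": sum(r["is_promo_like"] for r in g) / n,  # share of items flagged
            "avg_price": sum(r["final_sell_price"] for r in g) / n,     # unweighted mean
        }
    return out


# Hypothetical rows: four items in one store and week.
rows = [
    {"store_id": "S1", "wm_yr_wk": 11601, "weekly_sales": 2.0, "final_sell_price": 1.0,
     "revenue_proxy": 2.0, "is_spike": 0, "is_promo_like": 0},
    {"store_id": "S1", "wm_yr_wk": 11601, "weekly_sales": 4.0, "final_sell_price": 2.0,
     "revenue_proxy": 8.0, "is_spike": 1, "is_promo_like": 0},
    {"store_id": "S1", "wm_yr_wk": 11601, "weekly_sales": 0.0, "final_sell_price": 3.0,
     "revenue_proxy": 0.0, "is_spike": 0, "is_promo_like": 0},
    {"store_id": "S1", "wm_yr_wk": 11601, "weekly_sales": 6.0, "final_sell_price": 2.0,
     "revenue_proxy": 12.0, "is_spike": 0, "is_promo_like": 1},
]

kpis = store_week_kpis_py(rows)
print(kpis[("S1", 11601)])
```

Note that `avg_price` is an unweighted mean across items, so an item with zero sales counts as much as a best seller; a sales-weighted price would be `store_week_revenue_proxy / store_week_sales` where sales are positive.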
