# Notebook 21 - Simulate Deployment Impact

## Purpose
This notebook estimates how much product waste could have been prevented had the ranked meal box plans been deployed in stores. It matches recipe ingredients to historical waste events by store and product concept, then sums the matched items and their monetary value. Every matched waste event is counted as fully avoided, so the totals are best read as an upper bound on the reduction.

## Objectives
- Load matched recipe–product pairs with store-level deployment plans
- Join with waste logs enriched with concept mappings
- Simulate avoided waste by checking concept overlap between planned and wasted items
- Aggregate results per store and into overall KPIs
- Export store-level waste reduction estimates for reporting

## Inputs
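- recipe_store_ranked.csv - Final ranked recipes per store
- matching_matrix_scored.csv - Filtered recipe-product matches
- recipes_with_variants.csv - Recipe and ingredient rows, used to attach recipe names to matches
- waste_with_concept.csv - Waste logs with mapped product concepts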

## Outputs
- waste_impact_simulated.csv - Estimated waste reduction per store
- Console summary of avoided waste (items + value)
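
The objectives above can be sketched end to end on toy data. This is a minimal illustration, not the notebook's pipeline: the frames `toy_deployed` and `toy_waste` and all their values are made up, and only the column names mirror those used in the cells below.

```python
import pandas as pd

# Toy stand-ins for the real inputs (values are illustrative, not from the data)
toy_deployed = pd.DataFrame({
    "store": [1, 2],
    "recipe": ["Greek Yogurt & Honey", "Greek Yogurt & Honey"],
    "match_term": ["yogurt", "yogurt"],
})
toy_waste = pd.DataFrame({
    "Store": [1, 1, 2, 2],
    "product_concept": ["yogurt", "bread", "yogurt", "yogurt"],
    "Items wasted": [1, 3, 2, 1],
    "Value wasted": [0.79, 2.50, 1.58, 0.79],
})

# A waste event counts as avoidable when both its store and its concept
# appear in the deployment plan
toy_hits = toy_waste.merge(
    toy_deployed,
    left_on=["Store", "product_concept"],
    right_on=["store", "match_term"],
    how="inner",
)

# Sum the matched waste per store
toy_impact = (
    toy_hits.groupby("Store")[["Items wasted", "Value wasted"]]
    .sum()
    .reset_index()
)
print(toy_impact)
```

The bread row drops out because no deployed recipe uses that concept, which is the whole simulation rule: only waste that overlaps a planned ingredient is counted.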


In [17]:
import os
import pandas as pd

# Folders
matching_folder = "matching_scored"
ranking_folder = "recipe_ranking"
waste_folder = "cleaned_data"
variant_folder = "variant_exports"
output_folder = "waste_simulation"
os.makedirs(output_folder, exist_ok=True)

# File paths
ranked_file = os.path.join(ranking_folder, "recipe_store_ranked.csv")
matches_file = os.path.join(matching_folder, "matching_matrix_scored.csv")
recipes_file = os.path.join(variant_folder, "recipes_with_variants.csv")
waste_file = os.path.join(waste_folder, "waste_with_concept.csv")

# Load data
df_ranked = pd.read_csv(ranked_file)
df_matches = pd.read_csv(matches_file)
df_recipes = pd.read_csv(recipes_file)
df_waste = pd.read_csv(waste_file)

print("Loaded:")
print(f"- Ranked recipes: {df_ranked.shape}")
print(f"- Matches: {df_matches.shape}")
print(f"- Recipes: {df_recipes.shape}")
print(f"- Waste: {df_waste.shape}")


Loaded:
- Ranked recipes: (4, 6)
- Matches: (8, 10)
- Recipes: (6, 8)
- Waste: (18382, 16)
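
Because the later cells rely on specific column names, a quick check right after loading can catch a renamed or missing column early. The sketch below is an assumption about how such a check might look: `missing_columns` and the `REQUIRED` lists are hypothetical helpers, with the lists inferred from the columns this notebook selects further down.

```python
import pandas as pd

def missing_columns(df, required):
    """Return the required column names that are absent from df."""
    return [col for col in required if col not in df.columns]

# Hypothetical requirement lists, inferred from the columns used in later cells
REQUIRED = {
    "ranked": ["store", "recipe"],
    "matches": ["row_id", "store", "ingredient",
                "product_article", "product_name", "match_term"],
    "waste": ["Store", "product_concept", "Items wasted", "Value wasted"],
}

# Toy frame standing in for df_waste, deliberately missing one column
toy_waste = pd.DataFrame(columns=["Store", "product_concept", "Items wasted"])
print(missing_columns(toy_waste, REQUIRED["waste"]))  # ['Value wasted']
```

In the notebook itself, the same function could be called on `df_ranked`, `df_matches`, and `df_waste` before any merge is attempted.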


In [18]:
# Ensure row_id exists for merging
if "row_id" not in df_recipes.columns:
    df_recipes = df_recipes.reset_index(drop=False).rename(columns={"index": "row_id"})

# Keep only recipe/ingredient context
df_context = df_recipes[["row_id", "recipe", "ingredient"]].copy()

# Merge context into matches
df_matches = df_matches.merge(df_context, on=["row_id", "ingredient"], how="left")
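
A left merge silently leaves `recipe` empty for any match whose `row_id` and `ingredient` pair is not found in the context table. One way to surface that, sketched here on made-up frames, is to use the `validate` and `indicator` options of `DataFrame.merge`; the toy data and the `unmatched` name are illustrative only.

```python
import pandas as pd

toy_matches = pd.DataFrame({
    "row_id": [0, 1, 5],
    "ingredient": ["yogurt", "honey", "oats"],
})
toy_context = pd.DataFrame({
    "row_id": [0, 1],
    "recipe": ["Greek Yogurt & Honey", "Greek Yogurt & Honey"],
    "ingredient": ["yogurt", "honey"],
})

merged = toy_matches.merge(
    toy_context,
    on=["row_id", "ingredient"],
    how="left",
    validate="many_to_one",  # raises MergeError if context keys are not unique
    indicator=True,          # adds a _merge column: 'both' or 'left_only'
)

# Rows that found no recipe context
unmatched = merged[merged["_merge"] == "left_only"]
print(f"Matches without recipe context: {len(unmatched)}")
```

Rows flagged `left_only` would later fall out of the deployment filter because their `recipe` is missing, so counting them here explains any shrinkage downstream.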


In [19]:
# Extract relevant fields for simulation
df_match_concepts = df_matches[[
    "recipe", "store", "ingredient", "product_article", "product_name", "match_term"
]].copy()

# Filter to store–recipe pairs from ranked deployment plan
deployable_pairs = df_ranked[["store", "recipe"]].drop_duplicates()
df_deployed = df_match_concepts.merge(deployable_pairs, on=["store", "recipe"], how="inner")

# Preview
print("Deployable product matches:", df_deployed.shape)
df_deployed.head()


Deployable product matches: (4, 6)


| | recipe | store | ingredient | product_article | product_name | match_term |
|---|---|---|---|---|---|---|
| 0 | Greek Yogurt & Honey | 1024.0 | yogurt | 438226 | roeryoghurt | yogurt |
| 1 | Greek Yogurt & Honey | 1090.0 | yogurt | 438226 | roeryoghurt | yogurt |
| 2 | Greek Yogurt & Honey | 4255.0 | yogurt | 438226 | roeryoghurt | yogurt |
| 3 | Greek Yogurt & Honey | 3340.0 | yogurt | 438226 | roeryoghurt | yogurt |


In [20]:
# Ensure product_concept in waste is clean lowercase
df_waste["product_concept"] = df_waste["product_concept"].astype(str).str.strip().str.lower()

# Also lowercase match_term (from product/ingredient concepts)
df_deployed["match_term"] = df_deployed["match_term"].astype(str).str.strip().str.lower()
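
One caveat with `astype(str)`: depending on the pandas version, a missing value can become literal text such as `"nan"`, which would then behave like a real concept in the join. A possible alternative, shown as a sketch with a hypothetical `normalize_concept` helper and toy data, is to use the nullable `"string"` dtype and drop rows without a concept before matching.

```python
import numpy as np
import pandas as pd

def normalize_concept(series):
    """Strip and lowercase text while leaving missing values as missing."""
    return series.astype("string").str.strip().str.lower()

# Toy frame standing in for df_waste; the middle row has no mapped concept
toy_waste = pd.DataFrame({
    "product_concept": [" Yogurt ", np.nan, "MILK"],
    "Items wasted": [1, 4, 2],
})

toy_waste["product_concept"] = normalize_concept(toy_waste["product_concept"])

# Rows without a concept cannot be matched meaningfully, so drop them first
toy_clean = toy_waste.dropna(subset=["product_concept"])
print(toy_clean["product_concept"].tolist())  # ['yogurt', 'milk']
```

Dropping the missing rows matters because pandas matches null join keys against each other, so keeping them as missing is not enough on its own.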


In [21]:
# Join on store and concept match
df_simulated = df_waste.merge(
    df_deployed,
    left_on=["Store", "product_concept"],
    right_on=["store", "match_term"],
    how="inner"
)

print("Simulated avoided waste matches:", df_simulated.shape)
df_simulated[["Store", "product_concept", "Items wasted", "Value wasted", "recipe"]].head()


Simulated avoided waste matches: (5, 22)


| | Store | product_concept | Items wasted | Value wasted | recipe |
|---|---|---|---|---|---|
| 0 | 4255 | yogurt | 1 | 1.33 | Greek Yogurt & Honey |
| 1 | 4255 | yogurt | 1 | 0.79 | Greek Yogurt & Honey |
| 2 | 1024 | yogurt | 1 | 0.79 | Greek Yogurt & Honey |
| 3 | 1090 | yogurt | 1 | 0.79 | Greek Yogurt & Honey |
| 4 | 3340 | yogurt | 1 | 0.79 | Greek Yogurt & Honey |
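
With the current plan each store has a single yogurt match, so every waste row appears once. If a store ever had two deployed recipes sharing a concept, the inner join would repeat each waste row and inflate the totals. The sketch below illustrates the effect and one possible guard: reducing the plan to unique store and concept keys before joining. The second recipe name and all toy values are hypothetical.

```python
import pandas as pd

toy_deployed = pd.DataFrame({
    "store": [1024.0, 1024.0],  # floats, as in the deployment preview
    "recipe": ["Greek Yogurt & Honey", "Yogurt Parfait"],  # second recipe is made up
    "match_term": ["yogurt", "yogurt"],
})
toy_waste = pd.DataFrame({
    "Store": [1024],
    "product_concept": ["yogurt"],
    "Items wasted": [1],
    "Value wasted": [0.79],
})

# One key per store and concept; the int64 cast assumes no missing store IDs
keys = (
    toy_deployed[["store", "match_term"]]
    .drop_duplicates()
    .astype({"store": "int64"})
)

# Joining against the full plan repeats the waste row once per recipe
naive = toy_waste.merge(
    toy_deployed,
    left_on=["Store", "product_concept"],
    right_on=["store", "match_term"],
    how="inner",
)
# Joining against unique keys keeps each waste row once
deduped = toy_waste.merge(
    keys,
    left_on=["Store", "product_concept"],
    right_on=["store", "match_term"],
    how="inner",
)

print(len(naive), len(deduped))  # 2 1
```

The trade-off is that the de-duplicated join no longer carries the `recipe` column, so attributing avoided waste to a specific recipe would need a separate rule.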


In [22]:
# The join leaves two store columns: "Store" (from the waste log) and
# "store" (from the deployment plan). Keep the waste-side values under
# a single "store" column.
if "Store" in df_simulated.columns:
    df_simulated = df_simulated.drop(columns=["store"], errors="ignore")
    df_simulated = df_simulated.rename(columns={"Store": "store"})

# Drop any leftover duplicate columns if present
df_simulated = df_simulated.loc[:, ~df_simulated.columns.duplicated()]


In [23]:
# Clean column names for downstream processing
df_simulated = df_simulated.rename(columns={
    "Store": "store",
    "Items wasted": "items_wasted",
    "Value wasted": "value_wasted"
})

# Group by store to compute totals
df_impact = df_simulated.groupby("store").agg({
    "items_wasted": "sum",
    "value_wasted": "sum"
}).reset_index()

# Save result
impact_file = os.path.join(output_folder, "waste_impact_simulated.csv")
df_impact.to_csv(impact_file, index=False)

print("Estimated waste reductions saved to:", impact_file)
df_impact.head()


Estimated waste reductions saved to: waste_simulation\waste_impact_simulated.csv


| | store | items_wasted | value_wasted |
|---|---|---|---|
| 0 | 1024 | 1 | 0.79 |
| 1 | 1090 | 1 | 0.79 |
| 2 | 3340 | 1 | 0.79 |
| 3 | 4255 | 2 | 2.12 |
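
The store-level table can be rolled up into the overall KPIs named in the objectives. The following is a minimal sketch that rebuilds `df_impact` from the four rows shown above so it runs on its own; in the notebook the existing `df_impact` would be used directly. The "upper bound" wording reflects the assumption that every matched waste event is fully avoided.

```python
import pandas as pd

# Store-level totals copied from the output above
df_impact = pd.DataFrame({
    "store": [1024, 1090, 3340, 4255],
    "items_wasted": [1, 1, 1, 2],
    "value_wasted": [0.79, 0.79, 0.79, 2.12],
})

# Overall totals across all stores in the deployment plan
total_items = int(df_impact["items_wasted"].sum())
total_value = round(float(df_impact["value_wasted"].sum()), 2)

print(f"Stores with avoided waste: {df_impact['store'].nunique()}")
print(f"Items avoided (upper bound): {total_items}")
print(f"Value avoided (upper bound): {total_value:.2f}")
```

For these four stores the totals come to 5 items and 4.49 in value, in whatever currency the waste log uses.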
