
# 🧠 M5 Forecasting Audit Notebook  
### A Governance-Driven Assessment of AI Decision-Making

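This notebook presents a structured audit simulation built around the **M5 Forecasting Accuracy dataset**, aligned with AI governance principles and global risk management frameworks. For demonstration, the code cells run on a small synthetic stand-in for a single M5 item; the real data can be loaded by enabling the commented-out `read_csv` call.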

## 📌 Governance Frameworks Referenced
- **EU AI Act** (Articles 13 & 14 – Transparency & Human Oversight)
- **NIST AI Risk Management Framework** (Explainability, Trustworthiness)
- **OECD AI Principles** (Accountability, Robustness)
- **NIST Four Principles of Explainable AI** (Explanation, Meaningful, Explanation Accuracy, Knowledge Limits)

## 🧪 Audit Objectives
- **Variable Impact on Sales:** Verify which variables drive the automated decision-making (ADM) forecasts and whether those relationships are explainable and stable
- **Control and Leverage:** Identify controllable vs. uncontrollable variables and evaluate policy alignment
- **Traceability:** Ensure all ADM decisions can be traced from input to outcome
- **Risk Assessment:** Detect edge-case vulnerabilities or compliance risks from ADM outputs
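
The Control and Leverage objective can be made concrete with a small variable register. The sketch below is illustrative only: the labels assigned to `price` and `dayofweek` are assumptions about who controls each variable, and `snap_CA` is included simply as an example of a feature that has not been classified yet.

```python
# Hypothetical register: the controllability labels are illustrative assumptions,
# not metadata shipped with the M5 dataset.
VARIABLE_REGISTER = {
    "price": "controllable",        # assumed to be set by the retailer
    "dayofweek": "uncontrollable",  # calendar effect
}

def split_by_control(features, register):
    """Group feature names by label; names missing from the register are flagged."""
    groups = {"controllable": [], "uncontrollable": [], "unclassified": []}
    for name in features:
        groups[register.get(name, "unclassified")].append(name)
    return groups

groups = split_by_control(["price", "dayofweek", "snap_CA"], VARIABLE_REGISTER)
print(groups)
```

Routing unknown features to `unclassified` rather than raising an error keeps the audit moving while still surfacing variables that need a policy decision.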

This notebook includes data ingestion, EDA, model training (LightGBM), and explainability overlays using **SHAP, LIME, and DiCE**. Results are exported for audit traceability and stakeholder reporting.

---


In [None]:
# Mount Google Drive and import required libraries
from google.colab import drive
drive.mount('/content/drive')

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import lightgbm as lgb
from sklearn.model_selection import train_test_split


In [None]:
# Load dataset from M5 Forecasting Accuracy files (assumes CSV upload or Kaggle)
# df = pd.read_csv('/content/drive/MyDrive/M5/FOODS_CA_1.csv')
# For demo purposes, create a simplified dummy DataFrame.
# A seeded generator keeps the synthetic data identical across audit reruns.
rng = np.random.default_rng(42)
df = pd.DataFrame({
    'item_id': ['FOODS_1']*100,
    'day': pd.date_range('2020-01-01', periods=100),
    'sales': rng.integers(1, 20, size=100),       # daily unit sales, 1 to 19
    'price': rng.uniform(1.5, 2.5, size=100)      # unit price
})
df['dayofweek'] = df['day'].dt.dayofweek          # Monday=0 ... Sunday=6
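
When switching from the dummy frame to real M5 data, the sales file as distributed on Kaggle is, to my understanding, in wide form: one row per item and one `d_N` column per day. The sketch below reshapes a tiny synthetic stand-in into the long layout used in this notebook. The item IDs and values are made up, and the column names are assumptions about the file being loaded.

```python
import pandas as pd

# Tiny synthetic stand-in for a wide sales table (one row per item, one column per day).
wide = pd.DataFrame({
    "item_id": ["FOODS_1_001", "FOODS_1_002"],
    "store_id": ["CA_1", "CA_1"],
    "d_1": [3, 0],
    "d_2": [1, 2],
    "d_3": [4, 0],
})

# Melt the d_* columns into one (item, day, sales) row per observation.
long = wide.melt(id_vars=["item_id", "store_id"], var_name="d", value_name="sales")

# Turn the "d_N" label into an integer day index for sorting and joins.
long["day_index"] = long["d"].str.replace("d_", "", regex=False).astype(int)
long = long.sort_values(["item_id", "day_index"]).reset_index(drop=True)
print(long)
```

The integer `day_index` can then be joined against a calendar table to recover actual dates and derive features such as `dayofweek`.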


In [None]:
# EDA Example
sns.lineplot(x='day', y='sales', data=df)
plt.title('Sales Over Time')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

In [None]:
# Prepare modeling features
X = df[['price', 'dayofweek']]
y = df['sales']
# Rows are in date order, so shuffle=False holds out the last 20% of days (no look-ahead)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=False)


In [None]:
# Train LightGBM model (fixed seed so audit reruns are reproducible)
model = lgb.LGBMRegressor(random_state=42)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
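
The hold-out predictions are not scored anywhere in the notebook, so an audit reader has no accuracy figure to attach to the explanations. A minimal sketch of two common error metrics follows. The helper name `forecast_errors` is hypothetical and the numbers in the demo call are illustrative, not results from this model.

```python
import math

def forecast_errors(actual, predicted):
    """Return MAE and RMSE for two equal-length sequences of numbers."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("actual and predicted must be non-empty and the same length")
    residuals = [float(a) - float(p) for a, p in zip(actual, predicted)]
    mae = sum(abs(r) for r in residuals) / len(residuals)
    rmse = math.sqrt(sum(r * r for r in residuals) / len(residuals))
    return {"mae": mae, "rmse": rmse}

# Illustrative numbers only, not output from the notebook's model.
errors = forecast_errors([10, 12, 8, 15], [11, 10, 8, 18])
print(errors)
```

In this notebook the call would be `forecast_errors(y_test.tolist(), y_pred.tolist())`.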


In [None]:
# SHAP Explainability
import shap
explainer = shap.Explainer(model, X_train)
shap_values = explainer(X_test)
shap.plots.beeswarm(shap_values)
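
The beeswarm plot is visual only. For the Variable Impact objective it may help to record a numeric ranking as well. One common summary is the mean absolute SHAP value per feature, sketched below in plain Python. The helper name `rank_features_by_mean_abs` is hypothetical and the demo contributions are made up.

```python
def rank_features_by_mean_abs(contributions, feature_names):
    """Rank features by mean absolute contribution, largest first.

    contributions: rows of per-feature attribution values (one row per prediction).
    """
    totals = [0.0] * len(feature_names)
    n_rows = 0
    for row in contributions:
        n_rows += 1
        for j, value in enumerate(row):
            totals[j] += abs(float(value))
    if n_rows == 0:
        raise ValueError("contributions must contain at least one row")
    means = [(name, total / n_rows) for name, total in zip(feature_names, totals)]
    return sorted(means, key=lambda pair: pair[1], reverse=True)

# Made-up attribution values for two predictions and two features.
ranking = rank_features_by_mean_abs([[0.5, -0.1], [-1.5, 0.3]], ["price", "dayofweek"])
print(ranking)
```

With the objects from the cell above, the inputs would be `shap_values.values` and `list(X_test.columns)`.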


In [None]:
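# Export results for audit trail (work on a copy so X_test itself is not mutated)
audit_df = X_test.copy()
audit_df['actual_sales'] = y_test
audit_df['predicted_sales'] = y_pred
audit_df.to_csv('audit_predictions.csv', index=False)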
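
For traceability, an exported file is easier to defend if it ships with a fingerprint showing it has not changed since the run. The sketch below writes a small JSON manifest containing a SHA-256 digest. The `build_manifest` helper, the manifest fields, and the demo file names are all assumptions, not part of any governance framework's required format.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path

def build_manifest(path):
    """Return a dict with the file name, size, SHA-256 digest, and UTC timestamp."""
    data = Path(path).read_bytes()
    return {
        "file": Path(path).name,
        "bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
        "created_utc": datetime.now(timezone.utc).isoformat(),
    }

# Demo on a throwaway file; in the notebook the path would be 'audit_predictions.csv'.
demo = Path("demo_export.csv")
demo.write_bytes(b"price,dayofweek,predicted_sales\n1.99,2,7.5\n")
manifest = build_manifest(demo)
Path("demo_export.manifest.json").write_text(json.dumps(manifest, indent=2))
print(manifest["sha256"])
```

A reviewer can later recompute the digest of the CSV and compare it with the manifest to confirm the file is the one produced by the audited run.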