<div style="background-color:#121212; color:#f5f5f5; padding:35px; border-radius:18px; font-family:Segoe UI, sans-serif;">

<h1 style="color:#F39C12; text-align:center; font-size:46px; font-weight:bold;">
🏬 Retail Analytics & Forecasting Suite
</h1>

<p style="text-align:center; font-size:19px; color:#cccccc; margin-bottom:25px;">
<i>End-to-End Machine Learning Pipeline: From Data Cleaning to Business Strategy</i>
</p>

<hr style="border: 1px solid #F39C12; width:85%; margin-bottom:25px;">

<h2 style="color:#1ABC9C;">🔗 GitHub Repository</h2>
<p>
👉 <a href="https://github.com/ut-si-ch/Retail-Sales-Analytics-Forecasting.git" target="_blank" style="color:#F39C12; font-weight:bold;">
Project Repository Link
</a>
</p>

---

<h2 style="color:#1ABC9C; font-size:18px;">📂 Dataset Overview</h2>
<ul>
  <li><b>Source:</b> Walmart retail sales dataset (merged: stores, departments, holidays, external factors)</li>
  <li><b>Size:</b> ~421,570 rows, 15+ features</li>
  <li><b>Key Columns:</b> <code>Store</code>, <code>Dept</code>, <code>Date</code>, <code>Weekly_Sales</code>, <code>Holiday_Flag</code>, <code>CPI</code>, <code>Unemployment</code>, <code>Fuel_Price</code></li>
</ul>

---

<h2 style="color:#1ABC9C;">🎯 Business Problem</h2>
<div style="background:#1e1e1e; padding:18px; border-radius:12px;font-size:18px; margin-bottom:20px;">
Retailers face challenges in:
<ul>
  <li>🚨 Detecting unusual sales patterns (fraud, stockouts, misreporting)</li>
  <li>🛒 Understanding customer buying patterns across departments</li>
  <li>📈 Forecasting demand accurately for better inventory planning</li>
  <li>🎯 Personalizing promotions & markdowns by store/segment</li>
</ul>
<b>Objective:</b> Build a unified pipeline for <u>anomaly detection</u>, <u>segmentation</u>, <u>forecasting</u>, and <u>basket analysis</u>.
</div>

---

<h2 style="color:#1ABC9C;">🛠️ End-to-End Workflow</h2>
<div style="background:#1e1e1e; padding:20px; font-size:17px; border-radius:12px;">

<h3 style="color:#F39C12;">Step 1: Project Setup</h3>
<p>✅ Set up the environment with Anaconda, installed essential libraries, and created a modular folder structure.</p>

<h3 style="color:#F39C12;">Step 2: Data Ingestion & Exploration</h3>
<p>✅ Loaded sales and external datasets, then checked missing values, correlations, and trends (holiday spikes, markdown impact).</p>

<h3 style="color:#F39C12;">Step 3: Data Preprocessing & Feature Engineering</h3>
<p>✅ Handled null values (especially in the MarkDown columns), encoded categorical features, and created lag features, rolling averages, and holiday/weekend indicators.</p>
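
<p>A minimal sketch of the lag and rolling-average features, assuming pandas and a toy frame with the <code>Store</code>, <code>Dept</code>, <code>Date</code>, and <code>Weekly_Sales</code> columns. The sales values and the output column names (<code>sales_lag_1</code>, <code>sales_roll_mean_2</code>) are hypothetical and may differ from the notebook's.</p>

```python
import pandas as pd

# Toy weekly sales for two stores (values are made up for illustration).
df = pd.DataFrame({
    "Store": [1] * 4 + [2] * 4,
    "Dept": [1] * 8,
    "Date": pd.to_datetime(["2012-01-06", "2012-01-13", "2012-01-20", "2012-01-27"] * 2),
    "Weekly_Sales": [100.0, 120.0, 110.0, 130.0, 200.0, 210.0, 190.0, 220.0],
})

# Sort so that shifting moves backwards in time within each store/department.
df = df.sort_values(["Store", "Dept", "Date"]).reset_index(drop=True)
grouped = df.groupby(["Store", "Dept"])["Weekly_Sales"]

# Lag feature: last week's sales for the same store and department.
df["sales_lag_1"] = grouped.shift(1)

# Rolling mean of the previous 2 weeks. Shifting first keeps the current
# week out of its own window, which avoids leaking the target.
df["sales_roll_mean_2"] = grouped.transform(
    lambda s: s.shift(1).rolling(window=2).mean()
)

print(df)
```

<p>Grouping before shifting matters here: without it, the first week of one store would borrow the last week of another.</p>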

<h3 style="color:#F39C12;">Step 4: Exploratory Data Analysis</h3>
<p>✅ Visualized store/department sales distributions, seasonal decomposition, and the effects of CPI, fuel price, and unemployment, and identified outliers.</p>

<h3 style="color:#F39C12;">Step 5: Anomaly Detection</h3>
<p>✅ Compared Z-score, Isolation Forest, and Autoencoder detectors. The Autoencoder achieved the best recall of the three (≈20%).</p>
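
<p>A small sketch of the two simpler detectors, assuming NumPy and scikit-learn. The data is synthetic with two injected anomalies, and the Z-score threshold of 3 and <code>contamination=0.01</code> are illustrative choices, not the notebook's settings. The Autoencoder is left out to keep the example short.</p>

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic weekly sales with two injected anomalies (illustrative only).
rng = np.random.default_rng(0)
sales = rng.normal(loc=1000.0, scale=50.0, size=200)
sales[[20, 120]] = [5000.0, -2000.0]

# Z-score rule: flag points more than 3 standard deviations from the mean.
z_scores = (sales - sales.mean()) / sales.std()
z_flags = np.abs(z_scores) > 3.0

# Isolation Forest: fit_predict returns -1 for points it considers anomalous.
# contamination is the assumed share of anomalies in the data.
model = IsolationForest(contamination=0.01, random_state=0)
iso_flags = model.fit_predict(sales.reshape(-1, 1)) == -1

print("Z-score flags:", np.flatnonzero(z_flags))
print("Isolation Forest flags:", np.flatnonzero(iso_flags))
```

<p>The Z-score rule assumes roughly bell-shaped data and a single feature, while Isolation Forest makes no distributional assumption and extends to many features.</p>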

<h3 style="color:#F39C12;">Step 6: Customer Segmentation</h3>
<p>✅ Applied K-Means and selected <b>K=10</b> clusters based on inertia, then segmented stores/departments by performance.</p>
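
<p>Inertia falls as K grows, so a common way to choose K is to look for the elbow where the drop flattens out. The sketch below assumes scikit-learn and uses hypothetical per-store features (mean weekly sales and volatility) generated around three made-up centers. It shows only how the inertia curve is built, not the notebook's actual features or result.</p>

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical per-store features: [mean weekly sales, sales volatility].
rng = np.random.default_rng(42)
centers = np.array([[10_000.0, 500.0], [25_000.0, 2_000.0], [40_000.0, 1_000.0]])
features = np.vstack(
    [c + rng.normal(scale=[1_000.0, 100.0], size=(15, 2)) for c in centers]
)

# Scale first so that neither feature dominates the distance computation.
X = StandardScaler().fit_transform(features)

# Fit K-Means for a range of K and record the inertia
# (within-cluster sum of squared distances) for each.
inertias = {}
for k in range(1, 7):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias[k] = km.inertia_

for k, value in inertias.items():
    print(f"K={k}: inertia={value:.2f}")
```

<p>Plotting these values against K makes the elbow easier to spot. A silhouette score is a common cross-check.</p>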

<h3 style="color:#F39C12;">Step 7: Market Basket Analysis</h3>
<p>✅ Applied FP-Growth (faster than Apriori) to discover association rules for cross-selling strategies.</p>
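
<p>Association rules are usually ranked by support, confidence, and lift. The sketch below computes those three metrics by brute force in plain Python on made-up baskets. It is not FP-Growth itself, which finds the frequent itemsets far more efficiently, and the basket contents and the <code>rule_metrics</code> helper are hypothetical.</p>

```python
from itertools import combinations

# Hypothetical baskets of department names (made up for illustration).
transactions = [
    {"Grocery", "Dairy", "Bakery"},
    {"Grocery", "Dairy"},
    {"Grocery", "Bakery"},
    {"Dairy", "Bakery"},
    {"Grocery", "Dairy", "Electronics"},
]


def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def rule_metrics(antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    joint = support(antecedent | consequent)
    confidence = joint / support(antecedent)
    lift = confidence / support(consequent)
    return joint, confidence, lift


# Score every one-item -> one-item rule and keep the reasonably frequent ones.
items = sorted(set().union(*transactions))
for a, b in combinations(items, 2):
    sup, conf, lift_value = rule_metrics({a}, {b})
    if sup >= 0.4:
        print(f"{a} -> {b}: support={sup:.2f}, confidence={conf:.2f}, lift={lift_value:.2f}")
```

<p>A lift above 1 suggests the two items appear together more often than chance would predict, which is the signal a cross-selling rule relies on.</p>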

<h3 style="color:#F39C12;">Step 8: Demand Forecasting</h3>
<p>✅ Compared Linear Regression, Random Forest, Gradient Boosting, and XGBoost.  
Random Forest performed best, with RMSE ≈ 2986 and R² = 0.98.</p>
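
<p>A minimal sketch of how such a comparison can be scored with RMSE and R², assuming scikit-learn. The data is synthetic (a lagged-sales feature plus a holiday flag) and the chronological 80/20 split is an assumption, so the numbers it prints say nothing about the results reported above. Gradient Boosting and XGBoost would slot into the same loop.</p>

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Synthetic features standing in for lagged sales and a holiday flag.
rng = np.random.default_rng(0)
n = 300
lag_1 = rng.normal(20_000.0, 5_000.0, size=n)
holiday = rng.integers(0, 2, size=n)
y = 0.9 * lag_1 + 3_000.0 * holiday + rng.normal(0.0, 500.0, size=n)
X = np.column_stack([lag_1, holiday])

# Chronological split: train on the first 80% of rows, test on the rest.
split = int(n * 0.8)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

models = {
    "Linear Regression": LinearRegression(),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

# Fit each model on the same split and score it with the same metrics.
results = {}
for name, model in models.items():
    pred = model.fit(X_train, y_train).predict(X_test)
    rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
    results[name] = {"rmse": rmse, "r2": float(r2_score(y_test, pred))}
    print(f"{name}: RMSE={rmse:.1f}, R2={results[name]['r2']:.3f}")
```

<p>RMSE is in the same units as <code>Weekly_Sales</code>, so it is easiest to judge relative to the typical sales level of a store or department.</p>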

<h3 style="color:#F39C12;">Step 9: Impact of External Factors</h3>
<p>✅ Regression models on CPI, fuel prices, and unemployment showed strong seasonal and macroeconomic dependencies.</p>

<h3 style="color:#F39C12;">Step 10: Personalization & Strategy</h3>
<p>✅ Store-specific markdown recommendations, inventory optimization, segment-based promotions.</p>

<h3 style="color:#F39C12;">Step 11: Visualization & Reporting</h3>
<p>✅ Produced visualizations for EDA, anomaly trends, and forecasts, and documented the detailed ML pipeline in a Jupyter Notebook.</p>

<h3 style="color:#F39C12;">Step 12: Deliverables</h3>
<p>✅ Predictive models (forecasting + anomalies), customer clusters, basket rules, and business recommendation report.</p>

</div>

---

<h2 style="color:#1ABC9C;">💡 Business Impact</h2>
<div style="background:#1e1e1e; padding:18px; font-size:18px;border-radius:12px;">
<ul>
  <li>📦 <b>Inventory Optimization:</b> Prevent stockouts/overstock using forecasts.</li>
  <li>🎯 <b>Marketing Precision:</b> Cluster-based promotions boost campaign ROI.</li>
  <li>🚨 <b>Risk Mitigation:</b> Early anomaly detection prevents revenue leakage.</li>
  <li>🤝 <b>Revenue Growth:</b> Cross-selling bundles increase basket size.</li>
</ul>
</div>

<p style="text-align:center; font-size:19px; margin-top:25px; color:#F39C12;">
✨ Retail data → Insights → Profits ✨
</p>

</div>


<div style="background-color:#0b0c10; color:#f5f5f5; padding:40px; border-radius:18px; font-family:Segoe UI, sans-serif; text-align:center;">

<h1 style="color:#66FCF1; font-size:42px; font-weight:bold;">
⚡ Let’s Begin the Project Walkthrough
</h1>

<p style="font-size:20px; color:#c5c6c7; margin-top:15px;">
We will now walk step by step through the <b>end-to-end pipeline</b>,  
covering data ingestion, preprocessing, EDA, anomaly detection, segmentation, basket analysis, forecasting, and strategy design.  
</p>

<hr style="border: 1px solid #66FCF1; width:70%; margin:25px auto;">

<p style="margin-top:25px; font-size:18px; color:#66FCF1;">
✨ Ready? Let’s dive into the code and uncover insights from our retail data!
</p>

</div>
