#### Original Markdown
# Task 7 Level 3

**Time Series Breakdown of Walmart Retail Sales**

Dataset: `train.csv` (Walmart Sales Forecasting)

**Covered Topics:** Time series analysis • Trend & seasonality • Visualization over time • Simple forecasting (rolling mean, exponential smoothing)

#### Original Markdown
## 1) Imports & Settings

#### Code Explanation
Import the libraries used for data handling and plotting:

```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
```

#### Original Markdown
## 2) Load Dataset

#### Code Explanation
Read `train.csv` into a DataFrame and preview the first rows:

```python
df = pd.read_csv("train.csv")
df.head()
```

#### Original Markdown
## 3) Basic Cleaning & Type Conversion

#### Code Explanation
Inspect the column types and check that the columns used in this analysis are present:

```python
# Inspect columns
df.info()

# The Kaggle Walmart train.csv typically has: Store, Dept, Date, Weekly_Sales, IsHoliday
# Check that the columns used below are present (type conversion happens in the next cells)
expected_cols = ['Store','Dept','Date','Weekly_Sales']
missing = [c for c in expected_cols if c not in df.columns]
if missing:
    print('Note: Missing expected columns:', missing)
```

#### Code Explanation
Parse the `Date` column as datetime so that time-based operations work:

```python
# Convert date
df['Date'] = pd.to_datetime(df['Date'])
```

#### Code Explanation
Convert `Weekly_Sales` to a numeric type; values that cannot be parsed become `NaN`:

```python
# Ensure numeric
df['Weekly_Sales'] = pd.to_numeric(df['Weekly_Sales'], errors='coerce')
```

#### Code Explanation
Drop rows where `Date` or `Weekly_Sales` is missing:

```python
# Drop rows with missing sales or date
df = df.dropna(subset=['Date','Weekly_Sales']).copy()
```
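To see how many rows the coercion and `dropna` steps would remove, a small check like the one below can help. It is a minimal sketch on made-up rows (the `raw` frame and its values are hypothetical, not taken from `train.csv`), assuming the same `Date` and `Weekly_Sales` column names.

```python
import pandas as pd

# Made-up rows: one unparseable sales value and one missing date
raw = pd.DataFrame({
    "Date": ["2010-02-05", "2010-02-12", None],
    "Weekly_Sales": ["24924.50", "n/a", "41595.55"],
})

# Same conversions as in the notebook
raw["Date"] = pd.to_datetime(raw["Date"])
raw["Weekly_Sales"] = pd.to_numeric(raw["Weekly_Sales"], errors="coerce")

# Count rows before and after dropping incomplete records
before = len(raw)
clean = raw.dropna(subset=["Date", "Weekly_Sales"]).copy()
print(f"Dropped {before - len(clean)} of {before} rows")
```

Reporting the number of dropped rows makes it easier to notice when coercion silently discards more data than expected.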

#### Code Explanation
Sort the rows chronologically and reset the index:

```python
# Sort by date to make time operations safer
df = df.sort_values('Date').reset_index(drop=True)
df.head()
```

#### Original Markdown
## 4) Create Monthly Aggregations

#### Code Explanation
Sum weekly sales into monthly totals: overall, per department (`Dept`), and per store (`Store`):

```python
# Although the dataset is weekly, we will aggregate to monthly totals for smoother trends.
df['YearMonth'] = df['Date'].dt.to_period('M').dt.to_timestamp()

monthly_total = df.groupby('YearMonth', as_index=False)['Weekly_Sales'].sum()
monthly_total.rename(columns={'Weekly_Sales':'Monthly_Sales'}, inplace=True)

# Per product (Dept) & per region (Store) monthly breakdowns
monthly_by_dept = df.groupby(['YearMonth','Dept'], as_index=False)['Weekly_Sales'].sum()
monthly_by_dept.rename(columns={'Weekly_Sales':'Monthly_Sales'}, inplace=True)

monthly_by_store = df.groupby(['YearMonth','Store'], as_index=False)['Weekly_Sales'].sum()
monthly_by_store.rename(columns={'Weekly_Sales':'Monthly_Sales'}, inplace=True)

monthly_total.head()

```
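As a cross-check on the `groupby` aggregation, the same monthly totals can probably be reproduced with `resample`. The sketch below uses a synthetic weekly frame (the `demo` values are made up for illustration) and assumes the same `Date` and `Weekly_Sales` column names. It also counts the weekly observations per month, since a month containing five weekly dates will show a higher total than a month containing four for calendar reasons alone.

```python
import numpy as np
import pandas as pd

# Synthetic weekly data on Fridays (hypothetical values, for illustration only)
dates = pd.date_range("2010-02-05", periods=20, freq="W-FRI")
demo = pd.DataFrame({"Date": dates, "Weekly_Sales": np.arange(1, 21, dtype=float)})

# Approach used in the notebook: label each week with the first day of its month
demo["YearMonth"] = demo["Date"].dt.to_period("M").dt.to_timestamp()
by_group = demo.groupby("YearMonth")["Weekly_Sales"].sum()

# Alternative: resample on a DatetimeIndex with month-start labels
by_resample = demo.set_index("Date")["Weekly_Sales"].resample("MS").sum()

# Number of weekly observations that fall into each month
weeks_per_month = demo.groupby("YearMonth")["Date"].count()

print(pd.DataFrame({"groupby": by_group, "resample": by_resample, "weeks": weeks_per_month}))
```

One difference worth keeping in mind: `resample` emits a row (with a zero sum) for any month that has no data, whereas `groupby` simply omits it.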

#### Original Markdown
## 5) Overall Monthly Sales Trend

#### Code Explanation
Plot total monthly sales over time:

```python
plt.figure()
plt.plot(monthly_total['YearMonth'], monthly_total['Monthly_Sales'])
plt.title('Overall Monthly Sales Trend')
plt.xlabel('Month')
plt.ylabel('Sales')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

```

#### Original Markdown
## 6) Moving Averages (3-Month & 12-Month)

#### Code Explanation
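Compute 3-month and 12-month moving averages and plot them against the monthly totals. Because `min_periods=1`, the earliest points average fewer months than the full window: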

```python
monthly_ma = monthly_total.copy()
monthly_ma['MA_3'] = monthly_ma['Monthly_Sales'].rolling(window=3, min_periods=1).mean()
monthly_ma['MA_12'] = monthly_ma['Monthly_Sales'].rolling(window=12, min_periods=1).mean()

plt.figure()
plt.plot(monthly_ma['YearMonth'], monthly_ma['Monthly_Sales'], label='Monthly Sales')
plt.plot(monthly_ma['YearMonth'], monthly_ma['MA_3'], label='3-Month MA')
plt.plot(monthly_ma['YearMonth'], monthly_ma['MA_12'], label='12-Month MA')
plt.title('Monthly Sales with Moving Averages')
plt.xlabel('Month')
plt.ylabel('Sales')
plt.xticks(rotation=45)
plt.legend()
plt.tight_layout()
plt.show()

```
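The effect of `min_periods=1` on the first few points can be illustrated on a tiny series. This is a sketch with made-up numbers, not Walmart data; `partial` and `strict` are names introduced here for the comparison.

```python
import pandas as pd

sales = pd.Series([10.0, 20.0, 30.0, 40.0, 50.0])  # made-up values

# As in the notebook: average whatever is available, even fewer than 3 points
partial = sales.rolling(window=3, min_periods=1).mean()

# Default behaviour: min_periods equals the window size, so early values are NaN
strict = sales.rolling(window=3).mean()

print(pd.DataFrame({"sales": sales, "min_periods=1": partial, "strict": strict}))
```

With `min_periods=1` the moving-average line starts at the first month, but its early values are not true 3-month or 12-month averages. The strict version leaves those positions empty instead.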

#### Original Markdown
## 7) Seasonal Pattern — Average Sales by Calendar Month

#### Code Explanation
Average the monthly totals by calendar month across years and plot the result as a bar chart:

```python
# Compute average sales for each calendar month across years
monthly_total['Month'] = monthly_total['YearMonth'].dt.month
avg_by_month = monthly_total.groupby('Month', as_index=False)['Monthly_Sales'].mean()

# Bar chart of average sales by month
plt.figure()
plt.bar(avg_by_month['Month'], avg_by_month['Monthly_Sales'])
plt.title('Average Monthly Sales (Seasonality)')
plt.xlabel('Calendar Month (1–12)')
plt.ylabel('Average Sales')
plt.xticks(range(1,13))
plt.tight_layout()
plt.show()

avg_by_month.sort_values('Month').reset_index(drop=True)

```
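When reading the averages by calendar month, it may help to check how many years contribute to each month and to express each average relative to the overall mean. The sketch below assumes the series spans February 2010 through October 2012 (the range usually cited for this dataset; verify it against `df['Date'].min()` and `df['Date'].max()`), and uses random placeholder values. The `seasonal_index` column is one simple definition chosen for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic monthly totals over the assumed date range (values are made up)
idx = pd.date_range("2010-02-01", "2012-10-01", freq="MS")
rng = np.random.default_rng(0)
demo = pd.DataFrame({"YearMonth": idx, "Monthly_Sales": rng.uniform(150, 250, len(idx))})
demo["Month"] = demo["YearMonth"].dt.month

# Mean and number of observations per calendar month
summary = demo.groupby("Month")["Monthly_Sales"].agg(["mean", "count"])

# Ratio of each month's average to the overall average (above 1 = above-average month)
summary["seasonal_index"] = summary["mean"] / demo["Monthly_Sales"].mean()

print(summary)
```

Under that assumed range, January, November, and December are each averaged over two years while the other months are averaged over three, so their bars rest on less data.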

#### Original Markdown
## 8) Breakdown Over Time — Top 5 Products (Depts)

#### Code Explanation
Rank departments by total sales, keep the top five, and plot their monthly sales:

```python
# Compute total sales by Dept and pick top 5
dept_totals = monthly_by_dept.groupby('Dept', as_index=False)['Monthly_Sales'].sum()
top5_depts = dept_totals.sort_values('Monthly_Sales', ascending=False).head(5)['Dept'].tolist()

subset_dept = monthly_by_dept[monthly_by_dept['Dept'].isin(top5_depts)]
pivot_dept = subset_dept.pivot(index='YearMonth', columns='Dept', values='Monthly_Sales').fillna(0)

plt.figure()
for col in pivot_dept.columns:
    plt.plot(pivot_dept.index, pivot_dept[col], label=f'Dept {col}')
plt.title('Top 5 Depts — Monthly Sales Over Time')
plt.xlabel('Month')
plt.ylabel('Sales')
plt.xticks(rotation=45)
plt.legend()
plt.tight_layout()
plt.show()

pivot_dept.tail()

```

#### Original Markdown
## 9) Breakdown Over Time — Top 5 Regions (Stores)

#### Code Explanation
Rank stores by total sales, keep the top five, and plot their monthly sales:

```python
# The train.csv file does not include geographic regions, so we use Store as a proxy for region.
store_totals = monthly_by_store.groupby('Store', as_index=False)['Monthly_Sales'].sum()
top5_stores = store_totals.sort_values('Monthly_Sales', ascending=False).head(5)['Store'].tolist()

subset_store = monthly_by_store[monthly_by_store['Store'].isin(top5_stores)]
pivot_store = subset_store.pivot(index='YearMonth', columns='Store', values='Monthly_Sales').fillna(0)

plt.figure()
for col in pivot_store.columns:
    plt.plot(pivot_store.index, pivot_store[col], label=f'Store {col}')
plt.title('Top 5 Stores — Monthly Sales Over Time')
plt.xlabel('Month')
plt.ylabel('Sales')
plt.xticks(rotation=45)
plt.legend()
plt.tight_layout()
plt.show()

pivot_store.tail()

```

#### Original Markdown
## 10) Simple Forecasting — Rolling Mean & Exponential Smoothing

#### Code Explanation
Build a flat forecast for the next `h = 6` months from the last value of the 12-month rolling mean:

```python
# We'll create a simple 12-month rolling-mean forecast and a Simple Exponential Smoothing (SES) forecast.
# Forecast horizon
h = 6  # months

series = monthly_total.set_index('YearMonth')['Monthly_Sales'].asfreq('MS')  # monthly start frequency

# --- Rolling Mean Forecast (12-month) ---
roll_window = 12
rolling_mean = series.rolling(window=roll_window, min_periods=1).mean()
last_roll = rolling_mean.iloc[-1]
roll_forecast_index = pd.date_range(series.index[-1] + pd.offsets.MonthBegin(), periods=h, freq='MS')
roll_forecast = pd.Series([last_roll]*h, index=roll_forecast_index)


# Display the numeric forecast table
print(f'Rolling Mean Forecast (next {h} months):')
display(roll_forecast.to_frame('Forecast'))
```

#### Code Explanation
Compute simple exponential smoothing (SES) by hand with `alpha = 0.3` and extend the final level as a flat forecast:

```python
# --- Simple Exponential Smoothing (manual) ---
alpha = 0.3
ses = pd.Series(index=series.index, dtype='float64')
level = series.iloc[0]
ses.iloc[0] = level
for t in range(1, len(series)):
    level = alpha * series.iloc[t] + (1 - alpha) * level
    ses.iloc[t] = level
# Forecast future points using the last level (no trend/seasonality)
ses_forecast_index = pd.date_range(series.index[-1] + pd.offsets.MonthBegin(), periods=h, freq='MS')
ses_forecast = pd.Series([level]*h, index=ses_forecast_index)

print(f'\nSimple Exponential Smoothing Forecast (next {h} months):')
display(ses_forecast.to_frame('Forecast'))
```
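The manual loop can be checked against pandas' built-in exponentially weighted mean. With `adjust=False`, `ewm` applies the same recursion, seeded with the first observation. The sketch below compares the two on a short made-up series; `manual` and `vectorized` are names introduced here.

```python
import numpy as np
import pandas as pd

alpha = 0.3
series = pd.Series([100.0, 120.0, 90.0, 110.0, 130.0])  # made-up values

# Manual recursion: level_t = alpha * y_t + (1 - alpha) * level_{t-1}
level = series.iloc[0]
manual = [level]
for y in series.iloc[1:]:
    level = alpha * y + (1 - alpha) * level
    manual.append(level)
manual = pd.Series(manual, index=series.index)

# Vectorized equivalent: adjust=False uses the same recursion and starting value
vectorized = series.ewm(alpha=alpha, adjust=False).mean()

print(pd.DataFrame({"manual": manual, "ewm": vectorized}))
print("Match:", np.allclose(manual, vectorized))
```

The explicit loop is easier to follow step by step, while the `ewm` call is shorter and avoids element-wise assignment.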

#### Code Explanation
Plot the historical series together with both smoothed series and their forecasts:

```python
# Plot history + forecasts
plt.figure()
plt.plot(series.index, series.values, label='Monthly Sales')
plt.plot(rolling_mean.index, rolling_mean.values, label='12-Month Rolling Mean')
plt.plot(roll_forecast.index, roll_forecast.values, label='Rolling Mean Forecast')
plt.plot(ses.index, ses.values, label=f'SES (alpha={alpha})')
plt.plot(ses_forecast.index, ses_forecast.values, label='SES Forecast')
plt.title('Simple Forecasts: Rolling Mean & SES')
plt.xlabel('Month')
plt.ylabel('Sales')
plt.xticks(rotation=45)
plt.legend()
plt.tight_layout()
plt.show()
```
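The forecasts above are not scored against actual values. One way to compare the two baselines is to hold out the last `h` months, fit on the rest, and compute an error metric. The sketch below does this on a synthetic series standing in for `series` (values are random placeholders). The `mae` helper and the choice of mean absolute error are assumptions made for illustration, not part of the original notebook.

```python
import numpy as np
import pandas as pd

# Synthetic monthly series (made-up values) standing in for `series`
idx = pd.date_range("2010-02-01", periods=33, freq="MS")
rng = np.random.default_rng(42)
series = pd.Series(200 + 10 * rng.standard_normal(len(idx)), index=idx)

# Hold out the last h months for evaluation
h = 6
train, test = series.iloc[:-h], series.iloc[-h:]

# Rolling-mean baseline: mean of the last 12 training months, held flat
roll_pred = pd.Series(train.iloc[-12:].mean(), index=test.index)

# SES baseline: last smoothed level of the training data, held flat
alpha = 0.3
ses_level = train.ewm(alpha=alpha, adjust=False).mean().iloc[-1]
ses_pred = pd.Series(ses_level, index=test.index)

def mae(actual, predicted):
    """Mean absolute error between two aligned series."""
    return float(np.mean(np.abs(actual - predicted)))

print({"rolling_mean_MAE": mae(test, roll_pred), "ses_MAE": mae(test, ses_pred)})
```

On the real data, replacing the synthetic `series` with the monthly series built in this section would give a rough indication of which baseline tracks the held-out months more closely.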

#### Original Markdown
## 11) Notes & Conclusions

- The dataset is weekly; weekly sales were summed into monthly totals to analyze trend and seasonality.
- "Region" is represented by `Store` because the train.csv file does not include geographic regions.
- The forecasts shown here are simple baselines: a 12-month rolling mean and SES with `alpha = 0.3`. Both are flat over the 6-month horizon and model neither trend nor seasonality.