## Project Overview
This project analyzes the effectiveness of promotional markdowns on weekly sales performance using historical Walmart sales data.  
The objective is to move beyond simple averages and evaluate whether the *intensity* of markdowns meaningfully impacts sales outcomes.


In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt


In [2]:
df = pd.read_csv("walmart_cleaned.csv")


## Data Preparation
- Parsed transaction dates into datetime format
- Removed indexing artifacts
- Created a binary promotion flag indicating the presence of markdown activity
- Excluded negative and zero sales values for log-based analysis
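
The binary promotion flag can be illustrated on a toy frame. This is a minimal sketch with invented rows, and it assumes that weeks without a markdown are stored as missing values (`NaN`), which the notebook does not confirm. The point is that `DataFrame.sum` skips `NaN` by default, so a row with no recorded markdowns sums to 0 and is flagged as non-promotional.

```python
import numpy as np
import pandas as pd

# Toy rows for illustration only (not taken from walmart_cleaned.csv).
toy = pd.DataFrame({
    "MarkDown1": [np.nan, 500.0, np.nan],
    "MarkDown2": [np.nan, np.nan, 0.0],
})

# sum(axis=1) ignores NaN, so an all-NaN row yields 0 rather than NaN.
toy["Any_Markdown"] = (toy.sum(axis=1) > 0).astype(int)
print(toy["Any_Markdown"].tolist())  # [0, 1, 0]
```

A row whose markdowns sum to zero or less is also flagged 0 under this rule, which may or may not be the intended treatment.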


In [None]:
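print(df.shape)  # (rows, columns); printed explicitly because only the last expression is rendered
df.info()        # column dtypes and non-null counts (prints directly)
df.head()        # first five rows, rendered as the cell output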


In [4]:
df['Date'] = pd.to_datetime(df['Date'])
df = df.drop(columns=['Unnamed: 0'])


## Sales Distribution Considerations
Weekly sales data is highly right-skewed due to occasional extreme promotional spikes.  
To evaluate typical performance rather than rare outliers, a logarithmic transformation was applied to sales values.
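
The effect of the log transformation on skewness can be checked directly. The sketch below uses synthetic lognormal values as a stand-in for `Weekly_Sales`; the distribution and its parameters are assumptions chosen only to produce a right-skewed sample, not properties of the Walmart data.

```python
import numpy as np
import pandas as pd

# Synthetic right-skewed sample (a stand-in, NOT the Walmart data).
rng = np.random.default_rng(0)
sales = pd.Series(rng.lognormal(mean=9.0, sigma=1.2, size=5000), name="Weekly_Sales")

skew_raw = sales.skew()          # strongly positive for right-skewed data
skew_log = np.log(sales).skew()  # close to zero after the transformation
print(f"skew (raw): {skew_raw:.2f}, skew (log): {skew_log:.2f}")
```

The same two calls on `df['Weekly_Sales']` (restricted to positive values) would show how much skew remains in the real data.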


In [5]:
markdown_cols = ['MarkDown1','MarkDown2','MarkDown3','MarkDown4','MarkDown5']
df['Any_Markdown'] = (df[markdown_cols].sum(axis=1) > 0).astype(int)


In [None]:
df.groupby('Any_Markdown')['Weekly_Sales'].agg(
    count='count',
    mean='mean',
    median='median'
)


In [7]:
# Keep strictly positive sales so that the logarithm is defined
df_pos = df[df['Weekly_Sales'] > 0].copy()
df_pos['Log_Weekly_Sales'] = np.log(df_pos['Weekly_Sales'])


In [None]:
df_pos.groupby('Any_Markdown')['Log_Weekly_Sales'].agg(
    count='count',
    mean='mean',
    median='median'
)


## Binary Promotion Analysis
An initial comparison of promotional and non-promotional weeks, using a two-sided Mann-Whitney U test on log-transformed sales, showed no statistically significant difference in typical sales levels.  
This suggests that a simple binary promotion flag is insufficient to capture promotional effects.


In [None]:
from scipy.stats import mannwhitneyu

no_promo = df_pos[df_pos['Any_Markdown'] == 0]['Log_Weekly_Sales']
promo = df_pos[df_pos['Any_Markdown'] == 1]['Log_Weekly_Sales']

stat, p_value = mannwhitneyu(no_promo, promo, alternative='two-sided')
p_value
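
A p-value does not say how large a difference is, so an effect size can be a useful companion to the test above. The sketch below computes the rank-biserial correlation from ranks; the helper name `rank_biserial` is hypothetical and is not part of the original analysis.

```python
import numpy as np
from scipy.stats import rankdata

def rank_biserial(x, y):
    """Rank-biserial correlation for two independent samples.

    Ranges from -1 to 1; positive values mean x tends to exceed y.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n1, n2 = len(x), len(y)
    ranks = rankdata(np.concatenate([x, y]))   # ties receive average ranks
    u1 = ranks[:n1].sum() - n1 * (n1 + 1) / 2  # Mann-Whitney U statistic for x
    return 2 * u1 / (n1 * n2) - 1

# Toy check: every x value is below every y value.
print(rank_biserial([1, 2, 3], [4, 5, 6]))  # -1.0
```

In this notebook it could be called as `rank_biserial(promo, no_promo)`; a value near 0 would be consistent with the non-significant result.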


In [10]:
df_pos['Total_Markdown'] = df_pos[markdown_cols].sum(axis=1)


## Markdown Intensity Analysis
Rather than treating promotions as a binary variable, total markdown spending was aggregated and categorized into intensity levels.  
Weeks without markdown activity were analyzed separately due to their dominance in the dataset.


In [12]:
df_pos['Markdown_Level'] = 'No Markdown'


In [13]:
mask = df_pos['Total_Markdown'] > 0

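# Split weeks with markdowns into terciles of total markdown spend;
# cast the labels to plain strings so they match the existing text column
df_pos.loc[mask, 'Markdown_Level'] = pd.qcut(
    df_pos.loc[mask, 'Total_Markdown'],
    q=3,
    labels=['Low', 'Medium', 'High']
).astype(str)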

In [None]:
df_pos['Markdown_Level'].value_counts()


In [None]:
df_pos.groupby('Markdown_Level')['Log_Weekly_Sales'].agg(
    count='count',
    mean='mean',
    median='median'
)
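
The table above is on the log scale, which is hard to read in business terms. Because the logarithm is monotonic, a difference in median log sales corresponds approximately to a ratio of median sales, so it can be converted to a percentage. The helper name and the input values below are hypothetical and are not results from this notebook.

```python
import numpy as np

def pct_diff_from_log(log_value, log_baseline):
    """Convert a difference in log sales into a percentage difference in sales."""
    return (np.exp(log_value - log_baseline) - 1) * 100

# Hypothetical median log sales, for illustration only.
print(round(pct_diff_from_log(9.1, 9.0), 1))  # 10.5
```

A gap of 0.1 on the log scale therefore corresponds to roughly 10.5% higher sales, not 10%.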


## Key Findings
Markdown intensity shows a statistically significant, non-linear relationship with weekly sales performance (Kruskal–Wallis p < 0.001).

- Low-intensity markdowns underperform baseline (no markdown) weeks
- Medium markdown levels improve sales outcomes
- High-intensity markdowns generate the strongest uplift

This suggests that weak promotions may erode margins without meaningfully increasing demand.


In [None]:
from scipy.stats import kruskal

groups = [
    df_pos[df_pos['Markdown_Level'] == lvl]['Log_Weekly_Sales']
    for lvl in ['No Markdown', 'Low', 'Medium', 'High']
]

stat, p_value = kruskal(*groups)
p_value
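
The Kruskal-Wallis test only indicates that at least one group differs. Statements about specific pairs (for example, Low versus No Markdown) would need pairwise follow-up tests. One possible approach, sketched here on synthetic data, is pairwise Mann-Whitney U tests with a Bonferroni correction. The function name, the demo frame, and the choice of correction are assumptions, not part of the original analysis.

```python
from itertools import combinations

import numpy as np
import pandas as pd
from scipy.stats import mannwhitneyu

def pairwise_mannwhitney(data, group_col, value_col):
    """Two-sided Mann-Whitney U test for every pair of groups,
    with Bonferroni-adjusted p-values."""
    levels = list(data[group_col].unique())
    pairs = list(combinations(levels, 2))
    rows = []
    for a, b in pairs:
        x = data.loc[data[group_col] == a, value_col]
        y = data.loc[data[group_col] == b, value_col]
        _, p = mannwhitneyu(x, y, alternative="two-sided")
        rows.append({
            "group_a": a,
            "group_b": b,
            "p_raw": p,
            # Bonferroni: multiply by the number of comparisons, cap at 1
            "p_bonferroni": min(p * len(pairs), 1.0),
        })
    return pd.DataFrame(rows)

# Synthetic demo (NOT the Walmart data): only the "High" group is shifted.
rng = np.random.default_rng(1)
demo = pd.DataFrame({
    "Markdown_Level": np.repeat(["No Markdown", "Low", "High"], 200),
    "Log_Weekly_Sales": np.concatenate([
        rng.normal(9.0, 1.0, 200),
        rng.normal(9.0, 1.0, 200),
        rng.normal(10.0, 1.0, 200),
    ]),
})
result = pairwise_mannwhitney(demo, "Markdown_Level", "Log_Weekly_Sales")
print(result)
```

On the notebook's data the call would be `pairwise_mannwhitney(df_pos, 'Markdown_Level', 'Log_Weekly_Sales')`.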


## Business Implications

The analysis shows that **not all promotions are equally effective**.  
Low-intensity markdowns fail to outperform non-promotional weeks, suggesting they add cost without meaningfully increasing demand.

In contrast, **medium and high markdown levels are associated with higher sales**, suggesting that customers respond only when price reductions are sufficiently strong.

**Implications for decision-makers:**
- Focus promotional budgets on **fewer, high-impact discount events** rather than frequent minor markdowns  
- Avoid low-level promotions that erode margins without changing purchasing behavior  
- Use markdown depth as a strategic lever, not a routine tactic

This insight supports more efficient promotion planning, margin protection, and demand stimulation.
