### What this notebook contains
- We tackle a price sensitivity problem, answering questions like "If price goes up by a small amount, how much do units sold change?"
- The type of model we have chosen: a log-based linear model, because:

    - percentage change is easier to explain than absolute change
    - coefficients directly show sensitivity
    - results are stable and interpretable
```
log(units) = a + b1⋅log(price) + b2⋅discount + b3⋅log(marketing) + b4⋅week
```
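b1 is the price sensitivity (the price elasticity of demand): the approximate percentage change in units sold for a 1% change in price.

We fit the model with OLS (ordinary least squares). As the name suggests, OLS picks the coefficients that minimize the sum of squared errors between the observed and predicted values of log(units).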
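
To make the least-squares idea concrete, here is a minimal sketch on synthetic data with a single predictor. Everything in it is an assumption for illustration: the "true" intercept and elasticity are made-up values, the data is randomly generated rather than taken from `model_data.csv`, and `fit_line` is a hypothetical helper using the closed-form solution for one predictor (the notebook itself uses `statsmodels` with several predictors).

```python
import math
import random

random.seed(0)

# Made-up "true" parameters for the synthetic data (not from the real dataset).
TRUE_INTERCEPT = 3.0
TRUE_ELASTICITY = -0.3

# Simulate log(price) for prices between 10 and 100, then log(units) with noise.
log_price = [random.uniform(math.log(10), math.log(100)) for _ in range(500)]
log_units = [
    TRUE_INTERCEPT + TRUE_ELASTICITY * x + random.gauss(0, 0.1)
    for x in log_price
]

def fit_line(x, y):
    """Closed-form OLS for one predictor: minimizes the sum of squared residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

est_intercept, est_elasticity = fit_line(log_price, log_units)
print(f"estimated intercept:  {est_intercept:.3f}")
print(f"estimated elasticity: {est_elasticity:.3f}")
```

Because both sides are in logs, the fitted slope should land close to the elasticity used to generate the data, which is the same reading we later give to the `log_price` coefficient.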


In [1]:
import pandas as pd
import numpy as np
import statsmodels.api as sm
from pathlib import Path

DATA_PROCESSED = Path("../data/processed")
OUT_TAB = Path("../outputs/tables")
OUT_TAB.mkdir(parents=True, exist_ok=True)


In [2]:
df = pd.read_csv(DATA_PROCESSED / "model_data.csv")
df.head()


Unnamed: 0,week,store_id,product_id,units_sold,selling_price,category,cost_price,discount_percent,marketing_spend,final_price,revenue,profit
0,1,S01,P001,17.0,51.95,Dairy,31.77,0.0,8047.96,51.95,883.15,343.06
1,1,S01,P002,7.0,38.57,Household,26.05,0.0,8047.96,38.57,269.99,87.64
2,1,S01,P003,25.0,34.52,Beverages,23.56,15.0,8047.96,29.342,733.55,144.55
3,1,S01,P004,29.0,35.43,Snacks,22.08,20.0,8047.96,28.344,821.976,181.656
4,1,S01,P005,19.0,19.55,Fresh,14.62,0.0,8047.96,19.55,371.45,93.67


In [3]:
model_df = df[
    (df["units_sold"] > 0) &
    (df["final_price"] > 0) &
    (df["marketing_spend"] > 0)
].copy()

In [4]:
model_df["log_units"] = np.log(model_df["units_sold"])
model_df["log_price"] = np.log(model_df["final_price"])
model_df["log_marketing"] = np.log(model_df["marketing_spend"])

In [5]:
model_df["week_scaled"] = model_df["week"] / model_df["week"].max()


In [6]:
X = model_df[
    ["log_price", "discount_percent", "log_marketing", "week_scaled"]
]

X = sm.add_constant(X)
y = model_df["log_units"]


In [7]:
model = sm.OLS(y, X).fit()
model.summary()


Dep. Variable:,log_units,R-squared:,0.132
Model:,OLS,Adj. R-squared:,0.131
Method:,Least Squares,F-statistic:,983.4
Date:,"Sat, 27 Dec 2025",Prob (F-statistic):,0.0
Time:,06:01:17,Log-Likelihood:,-23224.0
No. Observations:,25972,AIC:,46460.0
Df Residuals:,25967,BIC:,46500.0
Df Model:,4,,
Covariance Type:,nonrobust,,

,coef,std err,t,P>|t|,[0.025,0.975]
const,2.2862,0.075,30.473,0.000,2.139,2.433
log_price,-0.2659,0.012,-22.862,0.000,-0.289,-0.243
discount_percent,0.0358,0.001,47.738,0.000,0.034,0.037
log_marketing,0.1487,0.007,21.895,0.000,0.135,0.162
week_scaled,0.1633,0.013,12.759,0.000,0.138,0.188

Omnibus:,161.202,Durbin-Watson:,1.887
Prob(Omnibus):,0.0,Jarque-Bera (JB):,161.16
Skew:,-0.182,Prob(JB):,1.01e-35
Kurtosis:,2.873,Cond. No.,210.0


In [9]:
price_sensitivity = model.params["log_price"]
price_sensitivity

# This is the main parameter of the model.
# For every 1% increase in the product's price, the number of units sold decreases by about 0.27%, holding the other variables fixed.


np.float64(-0.2659347505219455)
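
The "1% in, 0.27% out" reading is an approximation that works for small changes. For larger price moves, a log-log model implies that units scale by the price ratio raised to the elasticity. The sketch below applies that formula to the coefficient printed above; `pct_change_in_units` is a hypothetical helper, and the calculation assumes every other input, including `discount_percent`, stays fixed.

```python
# Elasticity copied from the fitted model output above.
PRICE_ELASTICITY = -0.2659347505219455

def pct_change_in_units(pct_change_in_price, elasticity=PRICE_ELASTICITY):
    """Percentage change in predicted units for a percentage change in price.

    In a log-log model, units scale by (new_price / old_price) ** elasticity
    when all other inputs are held fixed.
    """
    ratio = 1 + pct_change_in_price / 100
    return (ratio ** elasticity - 1) * 100

for pct in (1, 5, 10, 20):
    print(f"price +{pct}% -> units {pct_change_in_units(pct):+.2f}%")
```

For a 1% change the result stays very close to the raw coefficient; the gap grows as the price change gets larger.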

In [10]:
# Get model coefficients
coeffs = model.params

coeffs


const               2.286241
log_price          -0.265935
discount_percent    0.035822
log_marketing       0.148710
week_scaled         0.163272
dtype: float64

In [13]:
intercept = coeffs["const"]
b_price = coeffs["log_price"]
b_discount = coeffs["discount_percent"]
b_marketing = coeffs["log_marketing"]
b_week = coeffs["week_scaled"]

equation = f"""
log(units_sold) =
{intercept:.3f}
+ ({b_price:.3f}) * log(final_price)
+ ({b_discount:.3f}) * discount_percent
+ ({b_marketing:.3f}) * log(marketing_spend)
+ ({b_week:.3f}) * week_scaled
"""

print(equation)



log(units_sold) =
2.286
+ (-0.266) * log(final_price)
+ (0.036) * discount_percent
+ (0.149) * log(marketing_spend)
+ (0.163) * week_scaled
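
As a rough illustration of how the printed equation can be used, the sketch below plugs inputs into it and converts the result back from the log scale. `COEFFS` and `predict_units` are hypothetical names, the coefficients are the rounded values printed above, and `week_scaled=0.5` is an arbitrary illustrative value. Exponentiating a predicted log tends to give a typical (median-like) value rather than a mean, so treat the number as indicative only.

```python
import math

# Rounded coefficients from the printed equation above.
COEFFS = {
    "const": 2.286,
    "log_price": -0.266,
    "discount_percent": 0.036,
    "log_marketing": 0.149,
    "week_scaled": 0.163,
}

def predict_units(final_price, discount_percent, marketing_spend, week_scaled):
    """Evaluate the fitted equation and map log(units) back to units."""
    log_units = (
        COEFFS["const"]
        + COEFFS["log_price"] * math.log(final_price)
        + COEFFS["discount_percent"] * discount_percent
        + COEFFS["log_marketing"] * math.log(marketing_spend)
        + COEFFS["week_scaled"] * week_scaled
    )
    return math.exp(log_units)

# Inputs loosely based on the first row of the data; week_scaled is illustrative.
base = predict_units(51.95, 0.0, 8047.96, 0.5)
higher_price = predict_units(51.95 * 1.10, 0.0, 8047.96, 0.5)

print(f"predicted units at base price: {base:.1f}")
print(f"predicted units at +10% price: {higher_price:.1f}")
```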



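# Interpretation

The intercept is 2.286 on the log scale: the predicted log(units_sold) when all predictors are zero.

- If price increases by 1%, units sold drop by ~0.27%, holding the other variables fixed

- Each additional percentage point of discount increases units by ~3.6%, holding the other variables fixed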
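
The discount figure comes from a coefficient on an unlogged variable, so the percentage effect is obtained by exponentiating. A minimal sketch of that conversion follows; `pct_change_from_discount` is a hypothetical helper, and the coefficient is the value shown in the model output. As the model is specified, `log(final_price)` is a separate predictor, so this is the effect of the discount at a fixed final price.

```python
import math

# Coefficient on discount_percent from the fitted model output.
DISCOUNT_COEF = 0.035822

def pct_change_from_discount(extra_discount_points, coef=DISCOUNT_COEF):
    """Percentage change in predicted units for extra discount percentage points.

    With log(units) on the left and an unlogged predictor on the right,
    units scale by exp(coef * change_in_predictor).
    """
    return (math.exp(coef * extra_discount_points) - 1) * 100

for points in (1, 5, 10):
    print(f"+{points} discount points -> units {pct_change_from_discount(points):+.1f}%")
```

For a single percentage point the result is close to 100 times the coefficient, which is why ~3.6% is a fair summary; for larger discounts the compounding becomes noticeable.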


In [14]:
coef_table = pd.DataFrame({
    "variable": model.params.index,
    "coefficient": model.params.values,
    "p_value": model.pvalues.values
})

coef_table

intercept = model.params["const"]
b_price = model.params["log_price"]
b_discount = model.params["discount_percent"]
b_marketing = model.params["log_marketing"]
b_week = model.params["week_scaled"]

equation_text = f"""
Price Sensitivity Model Equation

log(units_sold) =
{intercept:.4f}
+ ({b_price:.4f}) * log(final_price)
+ ({b_discount:.4f}) * discount_percent
+ ({b_marketing:.4f}) * log(marketing_spend)
+ ({b_week:.4f}) * week_scaled

Interpretation:
- Price sensitivity = {b_price:.4f}
  A 1% increase in price changes units sold by approximately {b_price:.2f}%.
"""

print(equation_text)

equation_path = OUT_TAB / "price_sensitivity_equation.txt"
equation_path.write_text(equation_text.strip())

print("Saved price_sensitivity_equation.txt")



Price Sensitivity Model Equation

log(units_sold) =
2.2862
+ (-0.2659) * log(final_price)
+ (0.0358) * discount_percent
+ (0.1487) * log(marketing_spend)
+ (0.1633) * week_scaled

Interpretation:
- Price sensitivity = -0.2659
  A 1% increase in price changes units sold by approximately -0.27%.

Saved price_sensitivity_equation.txt


In [15]:
summary_table = pd.DataFrame({
    "metric": [
        "Price sensitivity (elasticity)",
        "Discount effect",
        "Marketing effect",
        "Model R-squared",
        "Number of observations"
    ],
    "value": [
        round(b_price, 3),
        round(b_discount, 3),
        round(b_marketing, 3),
        round(model.rsquared, 3),
        int(model.nobs)
    ]
})

summary_table


Unnamed: 0,metric,value
0,Price sensitivity (elasticity),-0.266
1,Discount effect,0.036
2,Marketing effect,0.149
3,Model R-squared,0.132
4,Number of observations,25972.0


In [16]:
summary_table.to_csv(
    OUT_TAB / "price_sensitivity_summary_overall.csv",
    index=False
)

print("Saved price_sensitivity_summary_overall.csv")


Saved price_sensitivity_summary_overall.csv
