# Ridge Regression
- How to use ridge regression to control for multicollinearity

## Create Dataset

In [11]:
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression, Ridge

np.random.seed(42)

n_weeks = 156  # ~3 years of weekly data

paid_search = np.random.normal(50_000, 8_000, n_weeks)
paid_social = paid_search * 0.8 + np.random.normal(0, 5_000, n_weeks)

total_marketing = paid_search + paid_social + np.random.normal(0, 2_000, n_weeks)

weekly_sales = (
    2.5 * paid_search +
    1.8 * paid_social +
    np.random.normal(0, 50_000, n_weeks)
)

df = pd.DataFrame({
    "paid_search": paid_search,
    "paid_social": paid_social,
    "total_marketing": total_marketing,
    "weekly_sales": weekly_sales
})

print(df.head())
print("")
df.info()

    paid_search   paid_social  total_marketing   weekly_sales
0  53973.713224  52507.843135    108132.389057  202923.343323
1  48893.885591  41484.273077     92005.177940  168115.493483
2  55181.508305  38188.689158     95981.155077  192940.826389
3  62184.238851  53030.159124    115256.405659  135818.825315
4  48126.773002  33628.010051     83118.688995  105087.797487

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 156 entries, 0 to 155
Data columns (total 4 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   paid_search      156 non-null    float64
 1   paid_social      156 non-null    float64
 2   total_marketing  156 non-null    float64
 3   weekly_sales     156 non-null    float64
dtypes: float64(4)
memory usage: 5.0 KB


## Fit OLS Regression

In [12]:
X = df[["paid_search", "paid_social", "total_marketing"]]
y = df["weekly_sales"]

ols = LinearRegression()
ols.fit(X, y)

ols_coefs = pd.Series(ols.coef_, index=X.columns)
ols_coefs

paid_search        1.834479
paid_social        2.952235
total_marketing   -0.020951
dtype: float64

In [13]:
import statsmodels.api as sm

X_sm = sm.add_constant(X)
ols_sm = sm.OLS(y, X_sm).fit()
print(ols_sm.summary())

                            OLS Regression Results                            
Dep. Variable:           weekly_sales   R-squared:                       0.304
Model:                            OLS   Adj. R-squared:                  0.291
Method:                 Least Squares   F-statistic:                     22.17
Date:                Wed, 07 Jan 2026   Prob (F-statistic):           5.78e-12
Time:                        05:18:55   Log-Likelihood:                -1910.4
No. Observations:                 156   AIC:                             3829.
Df Residuals:                     152   BIC:                             3841.
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
                      coef    std err          t      P>|t|      [0.025      0.975]
-----------------------------------------------------------------------------------
const           -1.273e+04   2.78e+04     

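- `total_marketing` is almost exactly `paid_search + paid_social`, so it is redundant and makes the features multicollinear.
- Its coefficient is slightly negative (-0.02) even though both paid channels have positive coefficients, and the channel coefficients (1.83 and 2.95) are far from the 2.5 and 1.8 used to generate the data.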
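The redundancy can be quantified with variance inflation factors (VIFs). The sketch below is not part of the original notebook: it rebuilds the three features with the same seed and draw order, then uses the identity that the VIF of feature j is the j-th diagonal entry of the inverse feature correlation matrix. The 5-to-10 range often quoted as a warning threshold is a rule of thumb, not a hard cutoff.

```python
import numpy as np
import pandas as pd

# Rebuild the notebook's features with the same seed and draw order.
np.random.seed(42)
n_weeks = 156
paid_search = np.random.normal(50_000, 8_000, n_weeks)
paid_social = paid_search * 0.8 + np.random.normal(0, 5_000, n_weeks)
total_marketing = paid_search + paid_social + np.random.normal(0, 2_000, n_weeks)

X = pd.DataFrame({
    "paid_search": paid_search,
    "paid_social": paid_social,
    "total_marketing": total_marketing,
})

# VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing feature j on the
# other features. This equals the j-th diagonal entry of the inverse of the
# feature correlation matrix.
corr = X.corr().values
vif = pd.Series(np.diag(np.linalg.inv(corr)), index=X.columns)
print(vif.round(1))
```

All three features should show elevated VIFs, with `total_marketing` the highest, since it is built from the other two.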

## Fit Ridge Regression

In [14]:
# SKLearn Ridge Regression
ridge = Ridge(alpha=1.0)
ridge.fit(X, y)

ridge_coefs = pd.Series(ridge.coef_, index=X.columns)
ridge_coefs

paid_search        1.834479
paid_social        2.952235
total_marketing   -0.020951
dtype: float64

In [15]:
# Comparison of coefficients
pd.DataFrame({
    "OLS": ols_coefs,
    "Ridge": ridge_coefs
})

                      OLS     Ridge
paid_search      1.834479  1.834479
paid_social      2.952235  2.952235
total_marketing -0.020951 -0.020951
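The two columns match to six decimals. A likely explanation is that `Ridge` penalizes the coefficients on the raw feature scale, and with spend measured in tens of thousands, `alpha=1.0` is negligible next to the size of the data. The sketch below, which is not from the original notebook, rebuilds the dataset and compares raw-scale fits with fits on features standardized by `StandardScaler`. The value `alpha=10.0` is an arbitrary choice for illustration, not a tuned one.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.preprocessing import StandardScaler

# Rebuild the notebook's dataset with the same seed and draw order.
np.random.seed(42)
n_weeks = 156
paid_search = np.random.normal(50_000, 8_000, n_weeks)
paid_social = paid_search * 0.8 + np.random.normal(0, 5_000, n_weeks)
total_marketing = paid_search + paid_social + np.random.normal(0, 2_000, n_weeks)
weekly_sales = (
    2.5 * paid_search
    + 1.8 * paid_social
    + np.random.normal(0, 50_000, n_weeks)
)

X = pd.DataFrame({
    "paid_search": paid_search,
    "paid_social": paid_social,
    "total_marketing": total_marketing,
})
y = weekly_sales

# Raw scale: the penalty is tiny relative to the feature magnitudes.
ols_raw = LinearRegression().fit(X, y)
ridge_raw = Ridge(alpha=1.0).fit(X, y)

# Standardized scale: every feature has mean 0 and unit variance, so the
# penalty acts on comparable units. alpha=10.0 is an illustrative assumption.
X_std = StandardScaler().fit_transform(X)
ols_std = LinearRegression().fit(X_std, y)
ridge_std = Ridge(alpha=10.0).fit(X_std, y)

comparison = pd.DataFrame({
    "OLS (raw)": ols_raw.coef_,
    "Ridge (raw, alpha=1)": ridge_raw.coef_,
    "OLS (std)": ols_std.coef_,
    "Ridge (std, alpha=10)": ridge_std.coef_,
}, index=X.columns)
print(comparison)
```

Coefficients in the standardized columns are in sales per one standard deviation of spend, so they are not directly comparable to the raw-scale columns. The useful comparison is OLS against ridge within the same scale.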


In [16]:
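# Statsmodels ridge: L1_wt=0 makes the elastic net penalty pure L2 (ridge).
# A scalar alpha is applied to every column of X_sm, including the constant.
ridge_sm = sm.OLS(y, X_sm).fit_regularized(method='elastic_net', L1_wt=0, alpha=1.0)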

In [17]:
# Coefficients from statsmodels ridge
ridge_sm_coefs = pd.Series(ridge_sm.params[1:], index=X.columns)
ridge_sm_coefs

paid_search        1.502673
paid_social        2.826123
total_marketing    0.082210
dtype: float64
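One likely reason these coefficients move while the sklearn ones do not is the intercept. The statsmodels documentation says a scalar `alpha` applies to all variables in the model, which here includes the `const` column, whereas sklearn leaves its intercept unpenalized. The NumPy sketch below is not from the original notebook. It assumes the documented statsmodels objective, `0.5*RSS/n + 0.5*alpha*sum(b**2)` when `L1_wt=0`, and `ridge_fit` is a hypothetical helper written for this illustration.

```python
import numpy as np

# Rebuild the notebook's dataset with the same seed and draw order.
np.random.seed(42)
n = 156
paid_search = np.random.normal(50_000, 8_000, n)
paid_social = paid_search * 0.8 + np.random.normal(0, 5_000, n)
total_marketing = paid_search + paid_social + np.random.normal(0, 2_000, n)
weekly_sales = (
    2.5 * paid_search
    + 1.8 * paid_social
    + np.random.normal(0, 50_000, n)
)

# Design matrix with a leading constant column, like sm.add_constant(X).
X = np.column_stack([np.ones(n), paid_search, paid_social, total_marketing])
y = weekly_sales


def ridge_fit(X, y, alpha):
    """Hypothetical helper: minimize 0.5*RSS/n + 0.5*sum(alpha_j * b_j**2).

    `alpha` holds one penalty weight per column of X.
    """
    alpha = np.asarray(alpha, dtype=float)
    n_obs, n_cols = X.shape
    # Appending sqrt(n*alpha_j) rows turns the penalty into extra squared
    # residuals, so ordinary least squares solves the penalized problem.
    X_aug = np.vstack([X, np.diag(np.sqrt(n_obs * alpha))])
    y_aug = np.concatenate([y, np.zeros(n_cols)])
    coef, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)
    return coef


ols = ridge_fit(X, y, [0.0, 0.0, 0.0, 0.0])            # no penalty
all_penalized = ridge_fit(X, y, [1.0, 1.0, 1.0, 1.0])  # constant penalized too
slopes_only = ridge_fit(X, y, [0.0, 1.0, 1.0, 1.0])    # constant left free

names = ["const", "paid_search", "paid_social", "total_marketing"]
for label, coef in [
    ("OLS", ols),
    ("all penalized", all_penalized),
    ("slopes only", slopes_only),
]:
    print(label, dict(zip(names, np.round(coef, 4))))
```

If the assumption holds, shrinking the intercept forces the slopes to compensate, while penalizing only the slopes leaves them close to OLS at this `alpha`. The statsmodels documentation also describes `alpha` as accepting one weight per coefficient, so passing a weight of 0 for `const` would be the analogous call; that variant is not run here.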

## Learnings
- This example is contrived, but real features sometimes overlap and we cannot always drop one from the model, so we need a method such as ridge to control for the effects of multicollinearity.
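- SKLearn `Ridge` does not standardize features internally; the penalty applies to the coefficients on the raw feature scale. With spend in the tens of thousands, `alpha=1.0` is negligible, which is why the ridge coefficients match OLS to six decimals. Standardize the features yourself (or use a much larger `alpha`) to see a difference from OLS.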
- Statsmodels ridge (`fit_regularized`) reports no p-values: the ridge penalty biases the estimates by design, so the usual OLS inference does not apply. We are only comparing coefficients.
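The bias mentioned in the last point can be written out. This is a sketch under textbook assumptions that the notebook does not state: a linear model $y = X\beta + \varepsilon$ with $E[\varepsilon] = 0$, a fixed design matrix $X$, no separately handled intercept, and a single penalty weight $\lambda$ on the unnormalized objective.

```latex
\hat{\beta}_{\text{ridge}} = (X^\top X + \lambda I)^{-1} X^\top y

E\big[\hat{\beta}_{\text{ridge}}\big]
  = (X^\top X + \lambda I)^{-1} X^\top X \, \beta
  = \beta - \lambda \, (X^\top X + \lambda I)^{-1} \beta
```

For any $\lambda > 0$ the second term is nonzero unless $\beta = 0$, so the expected ridge estimate differs from $\beta$. That is the bias the penalty introduces in exchange for lower variance.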