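We will build a simple dataset where:
- feature 1 (X1) is in the 0-100 range
- feature 2 (X2) is in the 0-5 range

Without scaling, the size of each regression coefficient depends on its feature's range, so the X1 and X2 coefficients cannot be compared directly.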

In [1]:
!pip install numpy 
!pip install pandas 
!pip install scikit-learn



In [2]:
import numpy as np 
import pandas as pd 
from sklearn.model_selection import train_test_split 
from sklearn.preprocessing import StandardScaler 
from sklearn.linear_model import LinearRegression 
from sklearn.metrics import r2_score 

In [5]:
np.random.seed(42)
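# 10 samples: column 0 (X1) spans 0-100, column 1 (X2) spans 0-5
X = np.random.rand(10, 2) * [100, 5]
# Linear target with true weights 3 and 5, plus Gaussian noise (std 10)
y = 3 * X[:, 0] + 5 * X[:, 1] + np.random.randn(10) * 10

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)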

In [6]:
model_no_scaling = LinearRegression()
model_no_scaling.fit(X_train, y_train)
pred_no_scaling=model_no_scaling.predict(X_test)

print("No Scaling R²:", r2_score(y_test, pred_no_scaling))
print("No Scaling Coefficients:", model_no_scaling.coef_)

No Scaling R²: 0.9990554789691793
No Scaling Coefficients: [3.24760954 6.39966372]


In [7]:
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test) 

model_scaled = LinearRegression()
model_scaled.fit(X_train_scaled, y_train)
pred_scaled = model_scaled.predict(X_test_scaled)

print("With Scaling R²:", r2_score(y_test, pred_scaled))
print("With Scaling Coefficients:", model_scaled.coef_)

With Scaling R²: 0.9990554789691793
With Scaling Coefficients: [85.93871816 10.92156427]


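Why scaling matters here
- without scaling --> each coefficient is expressed per raw unit of its feature, so its size depends on the feature's numerical range and the X1 and X2 coefficients cannot be compared directly.
- with scaling --> both features are on an equal footing: each coefficient is the change in y per one standard deviation of the feature, so the coefficients reflect each feature's relative impact.
- R² is identical in both runs (0.9990554789691793): for plain linear regression, scaling changes the coefficients, not the predictions.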
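
The link between the two sets of coefficients printed above can be checked directly. The sketch below rebuilds the same data and split, then tests two things: that each scaled coefficient equals the raw coefficient multiplied by that feature's training standard deviation (`scaler.scale_`), and that both models give the same test predictions. The names `raw_model`, `scaled_model`, `coefs_match` and `preds_match` are made up for this illustration and do not appear in the cells above.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Rebuild the same synthetic data and split as in the cells above.
np.random.seed(42)
X = np.random.rand(10, 2) * [100, 5]
y = 3 * X[:, 0] + 5 * X[:, 1] + np.random.randn(10) * 10
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the scaler on the training data only, then fit one model per representation.
scaler = StandardScaler().fit(X_train)
raw_model = LinearRegression().fit(X_train, y_train)
scaled_model = LinearRegression().fit(scaler.transform(X_train), y_train)

# Standardizing divides each feature by its training std (scaler.scale_),
# so the matching coefficient is multiplied by that same std.
coefs_match = np.allclose(raw_model.coef_ * scaler.scale_, scaled_model.coef_)

# Plain least squares gives the same fitted function either way,
# so the test predictions should agree up to floating-point error.
preds_match = np.allclose(
    raw_model.predict(X_test),
    scaled_model.predict(scaler.transform(X_test)),
)

print("Training std per feature:", scaler.scale_)
print("Scaled coef == raw coef * std:", coefs_match)
print("Predictions identical:", preds_match)
```

This only holds for unregularized linear regression. Models with a penalty term or a distance measure (for example Ridge or k-nearest neighbours) are not covered by this check and may behave differently on scaled and unscaled inputs.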