🧠 **Explanation**  
We call `fit_transform()` on the training data so that the scaler learns the **mean and standard deviation** from the **training set** only and applies them in the same step. These statistics describe the **distribution** the model is trained on.

Then we call `transform()` on the test data to apply the **same scaling**, using the training set's statistics. This puts the test data on the same scale as the training data **without leaking information** from the test set into the training process.
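
As a rough illustration of what "the same scaling" means, the sketch below checks that `transform()` only reuses the statistics stored during fitting. The toy arrays and the names `X_train_demo`, `X_test_demo`, and `demo_scaler` are made up for this example and are not part of the notebook's data.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy data, invented for illustration only
X_train_demo = np.array([[1.0], [2.0], [3.0], [4.0]])
X_test_demo = np.array([[5.0], [6.0]])

demo_scaler = StandardScaler()
demo_scaler.fit(X_train_demo)

# Statistics learned from the training data only
train_mean = demo_scaler.mean_   # per-feature mean
train_std = demo_scaler.scale_   # per-feature standard deviation (population, ddof=0)

# transform() should match applying those training statistics by hand
manual = (X_test_demo - train_mean) / train_std
assert np.allclose(demo_scaler.transform(X_test_demo), manual)
```

Nothing about the test rows enters `mean_` or `scale_`, which is the property the explanation relies on.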

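Fitting the scaler again on the test set would scale the test data with **its own statistics** rather than the ones the model was trained with, and would let information from **unseen data** shape the preprocessing. That is a form of **data leakage** and leads to **unreliable evaluation results**.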
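
One way to see the effect is a small sketch with deliberately shifted test values. The arrays and variable names here are hypothetical and chosen only to exaggerate the difference.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy data, invented for illustration: the test values are shifted upward
X_train_toy = np.array([[0.0], [1.0], [2.0]])
X_test_toy = np.array([[10.0], [11.0], [12.0]])

# Intended usage: statistics come from the training set only
train_fitted = StandardScaler().fit(X_train_toy)
train_scaled = train_fitted.transform(X_train_toy)
test_correct = train_fitted.transform(X_test_toy)

# Mistake: a second scaler fitted on the test set itself
test_refit = StandardScaler().fit_transform(X_test_toy)

print("train-fitted scaler:", test_correct.ravel())
print("refit on test set:  ", test_refit.ravel())
```

In this toy case the refit version is centred on zero and coincides with the scaled training values, so the shift between the two sets disappears. With the training-fitted scaler, the test values stay far from zero and the shift remains visible to the model and to the evaluation.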

In [None]:
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hold out 20% of the data for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

scaler = StandardScaler()
# Fit the scaler on the training data, then transform it
X_train_scaled = scaler.fit_transform(X_train)
# Only transform the test data using the same scaler (do NOT fit again)
X_test_scaled = scaler.transform(X_test)