### Ensuring Consistency Across Training & Inference Datasets: Feature Scaling
**Question**: Load a dataset (e.g., California Housing) and perform feature scaling. Ensure the
same scaling is applied during model inference with new data.

In [2]:
import pandas as pd
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression

# Load the California Housing dataset
housing = fetch_california_housing(as_frame=True)
X = housing.data
y = housing.target

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

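# Feature scaling (standardization): fit the scaler on the training set only,
# then reuse its learned mean and standard deviation for the test set
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)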

# Train a model
model = LinearRegression()
model.fit(X_train_scaled, y_train)

# Simulate inference on new data (must have the same columns, in the same order, as the training data)
new_data = X.iloc[[0]]  # Example: first row of the original dataset, kept as a one-row DataFrame
new_data_scaled = scaler.transform(new_data)
predicted_value = model.predict(new_data_scaled)

print("Predicted value on new data:", predicted_value[0])


Predicted value on new data: 4.151942685752971
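
The cell above keeps `scaler` and `model` in memory, so the same scaling is reused only while the notebook session is alive. One way to carry the fitted scaler into a separate inference process is to bundle it with the model in a scikit-learn `Pipeline` and save the result with `joblib`. The sketch below is a minimal illustration under stated assumptions: it uses synthetic data instead of the housing dataset so that it runs without a download, and the names `X_demo`, `y_demo`, `model_path`, and the file name `scaled_model.joblib` are hypothetical.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the housing data (assumption: 3 numeric features
# on very different scales, 200 rows).
rng = np.random.default_rng(0)
X_demo = rng.normal(loc=[10.0, 200.0, 0.5], scale=[2.0, 50.0, 0.1], size=(200, 3))
y_demo = X_demo @ np.array([0.3, 0.01, 2.0]) + rng.normal(scale=0.1, size=200)

# The pipeline fits the scaler and the regressor together, on training rows only.
pipeline = make_pipeline(StandardScaler(), LinearRegression())
pipeline.fit(X_demo[:160], y_demo[:160])

# Persist the fitted pipeline; the scaler's mean_ and scale_ are saved with it.
model_path = Path(tempfile.mkdtemp()) / "scaled_model.joblib"  # hypothetical location
joblib.dump(pipeline, model_path)

# At inference time, load the artifact and pass raw (unscaled) features.
# The loaded pipeline applies the training-time scaling before predicting.
loaded = joblib.load(model_path)
new_rows = X_demo[160:165]
predictions = loaded.predict(new_rows)
print("Predictions from the reloaded pipeline:", predictions)
```

Because the scaler travels inside the saved pipeline, inference code never calls `fit` or `fit_transform` on new data, which is the usual way scaling drifts between training and serving.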
