**Import necessary libraries:**

In [3]:
import numpy as np
import matplotlib.pyplot as plt
# SGDRegressor: linear regression trained with stochastic gradient descent
from sklearn.linear_model import SGDRegressor
# StandardScaler: z-score normalization
from sklearn.preprocessing import StandardScaler
from lab_utils_multi import load_house_data
from lab_utils_common import dlc
np.set_printoptions(precision=2)
plt.style.use('./deeplearning.mplstyle')

- Load the data set:

In [4]:
X_train, y_train = load_house_data()
X_features = ['size(sqft)', 'bedrooms', 'floors', 'age']

- Normalize the training data:

In [5]:
scaler = StandardScaler()  # z-score scaler: (x - mean) / std for each feature
X_norm = scaler.fit_transform(X_train)
print(f"PtP range without scaling: {np.ptp(X_train, axis=0)}")
print(f"PtP range with scaling   : {np.ptp(X_norm, axis=0)}")

PtP range without scaling: [2.41e+03 4.00e+00 1.00e+00 9.50e+01]
PtP range with scaling   : [5.85 6.14 2.06 3.69]
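
To make the scaling step concrete, the sketch below reproduces z-score normalization by hand and compares the result with `StandardScaler`. The array `X_demo` and the sample `x_new` are made-up stand-ins, not values from `load_house_data()`.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data (size in sqft, bedrooms); not the lab's data set
X_demo = np.array([[2104.0, 5.0],
                   [1416.0, 3.0],
                   [852.0, 2.0],
                   [1534.0, 3.0],
                   [1200.0, 2.0]])

# Z-score by hand: subtract the per-feature mean, divide by the per-feature std
mu = X_demo.mean(axis=0)
sigma = X_demo.std(axis=0)  # population std (ddof=0), which StandardScaler also uses
X_manual = (X_demo - mu) / sigma

demo_scaler = StandardScaler()
X_scaled = demo_scaler.fit_transform(X_demo)
print(np.allclose(X_manual, X_scaled))

# A new sample should be scaled with the statistics learned from the training data
x_new = np.array([[1000.0, 2.0]])
x_new_scaled = demo_scaler.transform(x_new)
print(np.allclose(x_new_scaled, (x_new - demo_scaler.mean_) / demo_scaler.scale_))
```

After scaling, every feature has mean 0 and standard deviation 1, which is why the peak-to-peak ranges above shrink to comparable magnitudes.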


- Create and fit the regression model:

In [6]:
sgdr = SGDRegressor(max_iter=1000)  # max_iter caps the number of passes (epochs) over the data
sgdr.fit(X_norm, y_train)
print(sgdr)
print(f"number of iterations completed: {sgdr.n_iter_}")
print(f"number of weight updates      : {sgdr.t_}")

SGDRegressor()
number of iterations completed: 141
number of weight updates      : 13960.0
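
The two counters are related: the scikit-learn documentation describes `t_` as `n_iter_ * n_samples + 1`. The output above fits that formula for a training set of 99 examples (141 × 99 + 1 = 13960). The sketch below checks the relation on synthetic data; `X_demo`, `y_demo`, and the fixed `random_state` are assumptions made for illustration only.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

# Synthetic regression problem (hypothetical; not the house data)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 3))
y_demo = X_demo @ np.array([3.0, -2.0, 1.0]) + 5.0 + rng.normal(scale=0.1, size=50)

model = SGDRegressor(max_iter=1000, random_state=0)
model.fit(X_demo, y_demo)

n_samples = X_demo.shape[0]
# One weight update per sample per epoch, plus the initial counter value of 1
expected_updates = model.n_iter_ * n_samples + 1
print(model.n_iter_, model.t_, expected_updates)
```

Training usually stops well before `max_iter` because the default `tol` ends the run once the loss stops improving.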


- View parameters:

In [8]:
b_norm = sgdr.intercept_
w_norm = sgdr.coef_
print(f"model parameters-> w: {w_norm}, b:{b_norm}")

model parameters-> w: [110.35 -21.12 -32.54 -38.03], b:[363.17]
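
These parameters define the linear model f(x) = w · x + b on the normalized features. As a minimal sketch on synthetic data (`X_raw`, `y_demo`, and the coefficients used to generate them are invented for illustration), the prediction can be rebuilt from `coef_` and `intercept_` and compared with `predict()`:

```python
import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.preprocessing import StandardScaler

# Hypothetical data shaped like the lab's four features (size, bedrooms, floors, age)
rng = np.random.default_rng(1)
X_raw = rng.uniform(low=[800, 1, 1, 5], high=[3000, 5, 2, 100], size=(60, 4))
y_demo = (0.1 * X_raw[:, 0] + 10 * X_raw[:, 1] - 0.5 * X_raw[:, 3]
          + rng.normal(scale=5, size=60))

X_n = StandardScaler().fit_transform(X_raw)
model = SGDRegressor(max_iter=1000, random_state=1).fit(X_n, y_demo)

w, b = model.coef_, model.intercept_   # w has shape (4,), b has shape (1,)
y_manual = X_n @ w + b                 # f(x) = w . x + b for every row
y_api = model.predict(X_n)
print(np.allclose(y_manual, y_api))
```

Because the model was fit on normalized inputs, the weights apply to scaled features and are not directly comparable with weights learned on raw values.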


- Make predictions:

In [10]:
# make prediction using sgdr.predict()
y_pred_sgd = sgdr.predict(X_norm)
print(f"prediction using sgdr: {y_pred_sgd[:4]}") # check the first four
print(f"actual value         :{y_train[:4]}") # check the first four

prediction using sgdr: [295.15 486.04 389.64 492.2 ]
actual value         :[300.  509.8 394.  540. ]


Conclusion: The first four predictions are within about 9% of the actual values, so the fit looks reasonable on these samples.
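
One way to quantify how close the predictions are is to compute the error per sample. The sketch below uses only the four rounded values printed above, so it is an illustration rather than an evaluation of the full training set.

```python
import numpy as np

# The four values printed above (rounded to two decimals by np.set_printoptions)
y_pred_4 = np.array([295.15, 486.04, 389.64, 492.2])
y_true_4 = np.array([300.0, 509.8, 394.0, 540.0])

abs_err = np.abs(y_true_4 - y_pred_4)   # absolute error per sample
pct_err = 100 * abs_err / y_true_4      # error relative to the actual value, in percent
mae = abs_err.mean()                    # mean absolute error over these four samples

print(f"absolute error: {abs_err}")
print(f"percent error : {np.round(pct_err, 2)}")
print(f"MAE           : {mae:.2f}")
```

The same calculation applied to `y_pred_sgd` and `y_train` would give the error over the whole training set.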

- Plot the prediction vs the target values:

In [None]:
fig, ax = plt.subplots(1, 4, figsize=(12, 4), sharey=True)
for i in range(X_train.shape[1]):
    ax[i].scatter(X_norm[:, i], y_train, label="Actual value")
    ax[i].set_xlabel(X_features[i])
    ax[i].scatter(X_norm[:, i], y_pred_sgd, label="Predicted value")

ax[0].set_ylabel("Price")  # shared y-axis: target and predicted price
ax[0].legend()
plt.show()

What I learned:
- `SGDRegressor` class from `sklearn.linear_model`
- `StandardScaler` class from `sklearn.preprocessing`
- `scaler.fit_transform(X_train)` for normalization
- `sgdr.fit(X_norm, y_train)` for fitting the model to the data
- `sgdr.n_iter_` for the number of iterations (epochs) completed
- `sgdr.t_` for the number of weight updates performed
- `sgdr.intercept_` for checking the intercept/bias
- `sgdr.coef_` for checking the coefficients/weights

| Feature                | `LinearRegression`                            | `SGDRegressor`                                                                  |
| :--------------------- | :-------------------------------------------- | :------------------------------------------------------------------------------ |
| **Optimization**       | Direct least-squares solution (OLS)           | Stochastic gradient descent (iterative)                                         |
| **Solution Guarantee** | Exact least-squares minimum                   | Approximate minimum; accuracy depends on learning rate, tolerance, iterations   |
| **Scalability**        | Less scalable for very large datasets         | Highly scalable for very large datasets                                         |
| **Memory Usage**       | Can be high for many features                 | Lower, suitable for out-of-core learning                                        |
| **Hyperparameters**    | Few (e.g., `fit_intercept`)                   | Many (e.g., `learning_rate`, `eta0`, `alpha`, `max_iter`)                       |
| **Performance**        | Faster for small/medium datasets              | Faster for very large datasets                                                  |
| **Common Use Case**    | Smaller, in-memory datasets, analytical needs | Large datasets, online learning                                                 |
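
The difference between an exact and an approximate solution can be seen by fitting both estimators to the same data. This is a minimal sketch on a synthetic problem; `X_demo`, `y_demo`, `true_w`, and the fixed `random_state` are assumptions for illustration, and the numbers will differ on the house data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, SGDRegressor
from sklearn.preprocessing import StandardScaler

# Synthetic linear data with a little noise (hypothetical; not the lab's data)
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(500, 3))
true_w = np.array([3.0, -2.0, 1.0])
y_demo = X_demo @ true_w + 5.0 + rng.normal(scale=0.1, size=500)

X_n = StandardScaler().fit_transform(X_demo)

ols = LinearRegression().fit(X_n, y_demo)                       # direct least squares
sgd = SGDRegressor(max_iter=1000, random_state=42).fit(X_n, y_demo)  # iterative

print("OLS w:", ols.coef_, "b:", ols.intercept_)
print("SGD w:", sgd.coef_, "b:", sgd.intercept_)
print("max |w_ols - w_sgd|:", np.max(np.abs(ols.coef_ - sgd.coef_)))
```

The SGD weights should land close to the least-squares weights but not match them exactly. Part of the gap comes from early stopping, and part from the small L2 penalty (`alpha=0.0001`) that `SGDRegressor` applies by default.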
