<h1 style="font-family: Trebuchet MS; padding: 12px; font-size: 30px; color: #3a5a40; text-align: center; line-height: 1.25;"><b>Multiple Linear Regression: Understanding the Process</b></h1>

<p style="background-color:#93CDDD; font-family: 'Trebuchet MS'; font-weight: bold; color: #3a5a40; font-size: 40px; text-align: center; border-radius: 50px; padding: 10px;">Table of Contents</p>

- [**1 - Importing Libraries**](#1)
- [**2 - Linear Regression with Multiple Variables**](#2)
    - [**2 - 1 Overview**](#3-1)
    - [**2 - 2 Loading Dataset**](#3-2)
    - [**2 - 3 Exploring the data**](#3-3)
    - [**2 - 4 Feature Normalization**](#3-4)
    - [**2 - 5 Compute Cost**](#3-5)
    - [**2 - 6 Gradient descent**](#3-6)
- [**4 - Linear Regression using Scikit-Learn**](#4)


<a name = 1></a>
<p style="background-color:#93CDDD; font-family: 'Trebuchet MS'; font-weight: bold; color: #3a5a40; font-size: 40px; text-align: center; border-radius: 50px; padding: 10px;">1 - Importing libraries</p>

In [1]:
# Data Analysis and Manipulation Libraries Import
import numpy as np
import pandas as pd

# Data Visualization Libraries Import
from matplotlib import pyplot as plt
import seaborn as sns

# Python Standard Libraries Import
import copy
import math

# Ignore all warnings
import warnings
warnings.filterwarnings('ignore')

In [2]:
# Display Matplotlib plots directly within the notebook
%matplotlib inline
# Set the default figure size and resolution
plt.rcParams['figure.figsize'] = (12, 5)
plt.rcParams['figure.dpi'] = 120

# Apply the 'dark_background' style
plt.style.use('dark_background')


<p style="background-color:#93CDDD; font-family: 'Trebuchet MS'; font-weight: bold; color: #3a5a40; font-size: 40px; text-align: center; border-radius: 50px; padding: 10px;">2 - Linear Regression with Multiple Variables</p>
<a name = 2></a>



<p style="background-color:#93CDDD; font-family: 'Trebuchet MS'; font-weight: bold; color: #3a5a40; font-size: 40px; text-align: center; border-radius: 50px; padding: 10px;">2 - 1 Overview</p>
<a name = 3-1></a>

Imagine you are selling your house and want to determine a fair market price for it. One approach is to gather data on recent home sales and build a model that predicts housing prices. The task here is to predict a house's price from two features: its size and its number of bedrooms.
 
The dataset **ex1data2.txt** provides a training set of housing prices. Each row has three columns: <br>
**the first column is the house's size in square feet,**<br>
**the second column is the number of bedrooms,**<br>
**the third column is the corresponding house price.**


<p style="background-color:#93CDDD; font-family: 'Trebuchet MS'; font-weight: bold; color: #3a5a40; font-size: 40px; text-align: center; border-radius: 50px; padding: 10px;">2 - 2 Loading Dataset</p>
<a name = 3-2></a>

In [3]:
# Read the data
df1 = pd.read_csv('/kaggle/input/coursera-machine-learning-su-ex1/ex1data2.txt', names=['house size', 'num beds','price'])
# Get the training data
X = df1.iloc[:,0:2]  # features: the first and second columns
Y = df1.iloc[:,2]    # target: the third column
m,n = X.shape        # m training examples, n features
df1.head()

| | house size | num beds | price |
|---|---|---|---|
| 0 | 2104 | 3 | 399900 |
| 1 | 1600 | 3 | 329900 |
| 2 | 2400 | 3 | 369000 |
| 3 | 1416 | 2 | 232000 |
| 4 | 3000 | 4 | 539900 |




<p style="background-color:#93CDDD; font-family: 'Trebuchet MS'; font-weight: bold; color: #3a5a40; font-size: 40px; text-align: center; border-radius: 50px; padding: 10px;">2 - 3 Exploring the data</p>
<a name = 3-3></a>

In [4]:
df1.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 47 entries, 0 to 46
Data columns (total 3 columns):
 #   Column      Non-Null Count  Dtype
---  ------      --------------  -----
 0   house size  47 non-null     int64
 1   num beds    47 non-null     int64
 2   price       47 non-null     int64
dtypes: int64(3)
memory usage: 1.2 KB


In [5]:
df1.describe()

| | house size | num beds | price |
|---|---|---|---|
| count | 47.0 | 47.0 | 47.0 |
| mean | 2000.680851 | 3.170213 | 340412.659574 |
| std | 794.702354 | 0.760982 | 125039.899586 |
| min | 852.0 | 1.0 | 169900.0 |
| 25% | 1432.0 | 3.0 | 249900.0 |
| 50% | 1888.0 | 3.0 | 299900.0 |
| 75% | 2269.0 | 4.0 | 384450.0 |
| max | 4478.0 | 5.0 | 699900.0 |


In [6]:
df1.isnull().sum()

house size    0
num beds      0
price         0
dtype: int64

In [7]:
df1.corr()

| | house size | num beds | price |
|---|---|---|---|
| house size | 1.0 | 0.559967 | 0.854988 |
| num beds | 0.559967 | 1.0 | 0.442261 |
| price | 0.854988 | 0.442261 | 1.0 |


**There is a strong positive correlation (about 0.85) between the size of a house and its price. The number of bedrooms correlates more weakly with price (about 0.44).**



<p style="background-color:#93CDDD; font-family: 'Trebuchet MS'; font-weight: bold; color: #3a5a40; font-size: 40px; text-align: center; border-radius: 50px; padding: 10px;">2 - 4 Feature Normalization</p>
<a name = 3-4></a>

**House sizes are on a much larger scale than bedroom counts: sizes range from 852 to 4478 square feet, while bedrooms range from 1 to 5. When features differ this much in scale, it is advisable to scale them first, because this typically speeds up the convergence of gradient descent. Here each feature is z-score normalized: its mean is subtracted and the result is divided by its standard deviation.**

In [8]:
def zscore_normalize_features(X):

    # find the mean of each column/feature
    mu     = np.mean(X, axis=0)                 # mu will have shape (n,)
    # find the standard deviation of each column/feature
    sigma  = np.std(X, axis=0)                  # sigma will have shape (n,)
    # element-wise, subtract mu for that column from each example, divide by std for that column
    X_norm = (X - mu) / sigma      

    return (X_norm, mu, sigma)
X_norm, _, _ = zscore_normalize_features(X)

In [9]:
X_norm.head()

| | house size | num beds |
|---|---|---|
| 0 | 0.131415 | -0.226093 |
| 1 | -0.509641 | -0.226093 |
| 2 | 0.507909 | -0.226093 |
| 3 | -0.743677 | -1.554392 |
| 4 | 1.271071 | 1.102205 |
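A quick way to sanity-check z-score normalization is to confirm that every normalized column has mean 0 and standard deviation 1. The sketch below assumes only the five rows shown by `df1.head()`, not the full dataset, and the names `X_toy`, `col_means` and `col_stds` are illustrative.

```python
import numpy as np

# Illustrative subset: the five rows shown by df1.head() (size in sq ft, bedrooms)
X_toy = np.array([[2104.0, 3.0],
                  [1600.0, 3.0],
                  [2400.0, 3.0],
                  [1416.0, 2.0],
                  [3000.0, 4.0]])

mu = X_toy.mean(axis=0)        # per-column mean
sigma = X_toy.std(axis=0)      # per-column population std (ddof=0, NumPy's default)
X_toy_norm = (X_toy - mu) / sigma

# After normalization each column should have mean ~0 and std ~1
col_means = X_toy_norm.mean(axis=0)
col_stds = X_toy_norm.std(axis=0)
print("column means:", col_means)
print("column stds: ", col_stds)
```

Because only five rows are used, the normalized values here differ from the ones in the table above, which were computed over all 47 rows.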




<p style="background-color:#93CDDD; font-family: 'Trebuchet MS'; font-weight: bold; color: #3a5a40; font-size: 40px; text-align: center; border-radius: 50px; padding: 10px;">2 - 5 Compute the cost function</p>
<a name = 3-5></a>

The equation for the cost function with multiple variables $J(\mathbf{w},b)$ is:
$$J(\mathbf{w},b) = \frac{1}{2m} \sum\limits_{i = 0}^{m-1} (f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)})^2 $$ 
where:
$$ f_{\mathbf{w},b}(\mathbf{x}^{(i)}) = \mathbf{w} \cdot \mathbf{x}^{(i)} + b  $$ 



In [10]:
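def compute_cost(X, Y, w, b):
    # Squared-error cost J(w, b) averaged over all m training examples
    cost = 0.0
    m = len(Y)
    for i in range(m):
        f_wb_i = np.dot(X[i], w) + b            # prediction for example i
        cost = cost + (f_wb_i - Y[i]) ** 2      # accumulate the squared error
    cost = cost / (2 * m)
    return cost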

In [11]:
# Test the cost function with arbitrary initial parameters
w_init = np.array([0.00001, 100000])
b = 0.
X_norm = X_norm.to_numpy()
Y = Y.to_numpy()
cost = compute_cost(X_norm, Y, w_init, b)
cost

65120665930.37282
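The loop in `compute_cost` can also be written with matrix operations, which is usually faster in NumPy. The sketch below is one possible vectorized form of the same formula; the function name `compute_cost_vectorized` and the toy data are assumptions made for illustration.

```python
import numpy as np

def compute_cost_vectorized(X, Y, w, b):
    """Squared-error cost J(w, b) without an explicit Python loop."""
    m = X.shape[0]
    errors = X @ w + b - Y              # prediction errors, shape (m,)
    return np.sum(errors ** 2) / (2 * m)

# Hypothetical toy data, small enough to check by hand
X_toy = np.array([[1.0, 2.0],
                  [2.0, 0.0],
                  [3.0, 1.0]])
Y_toy = np.array([5.0, 4.0, 8.0])
w_toy = np.array([1.0, 1.0])
b_toy = 1.0

# Predictions are 4, 3, 5, so the errors are -1, -1, -3 and J = (1 + 1 + 9) / 6
cost_toy = compute_cost_vectorized(X_toy, Y_toy, w_toy, b_toy)
print(cost_toy)
```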



<p style="background-color:#93CDDD; font-family: 'Trebuchet MS'; font-weight: bold; color: #3a5a40; font-size: 40px; text-align: center; border-radius: 50px; padding: 10px;">2 - 6 Gradient descent function</p>
<a name = 3-6></a>

$$\begin{align*} \text{repeat}&\text{ until convergence:} \; \lbrace \newline\;
& w_j = w_j -  \alpha \frac{\partial J(\mathbf{w},b)}{\partial w_j}\; & \text{for j = 0..n-1}\newline
&b\ \ = b -  \alpha \frac{\partial J(\mathbf{w},b)}{\partial b}  \newline \rbrace
\end{align*}$$

where $n$ is the number of features, the parameters $w_j$ and $b$ are updated simultaneously, and the partial derivatives are:

$$
\begin{align}
\frac{\partial J(\mathbf{w},b)}{\partial w_j}  &= \frac{1}{m} \sum\limits_{i = 0}^{m-1} (f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)})x_{j}^{(i)} \ \\
\frac{\partial J(\mathbf{w},b)}{\partial b}  &= \frac{1}{m} \sum\limits_{i = 0}^{m-1} (f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)}) \
\end{align}
$$

In [12]:
def compute_gradient(X, Y, w, b):
    m, n = X.shape                  # number of examples, number of features
    dj_dw = np.zeros((n,))
    dj_db = 0.

    for i in range(m):
        err = (np.dot(X[i], w) + b) - Y[i]      # prediction error for example i
        for j in range(n):
            dj_dw[j] = dj_dw[j] + err * X[i, j]
        dj_db = dj_db + err
    dj_dw = dj_dw / m
    dj_db = dj_db / m

    return dj_db, dj_dw


In [13]:
#test
gradient = compute_gradient(X_norm,Y,w_init,b)
gradient

(-340412.659574468, array([-49767.41304904,  45291.17824973]))
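A common way to gain confidence in a hand-written gradient is to compare it with a finite-difference approximation of the cost. The sketch below does this for a vectorized form of the gradient formulas above; the function names, the random toy data, and the step size `eps` are assumptions chosen for illustration.

```python
import numpy as np

def cost(X, Y, w, b):
    err = X @ w + b - Y
    return np.sum(err ** 2) / (2 * X.shape[0])

def compute_gradient_vectorized(X, Y, w, b):
    """Gradient of the squared-error cost with respect to b and w."""
    m = X.shape[0]
    err = X @ w + b - Y             # shape (m,)
    dj_dw = X.T @ err / m           # shape (n,)
    dj_db = np.sum(err) / m
    return dj_db, dj_dw

# Hypothetical random data, used only to exercise the check
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(20, 2))
Y_toy = rng.normal(size=20)
w_toy = np.array([0.5, -1.5])
b_toy = 0.3

dj_db, dj_dw = compute_gradient_vectorized(X_toy, Y_toy, w_toy, b_toy)

# Central finite differences: (J(p + eps) - J(p - eps)) / (2 * eps)
eps = 1e-6
num_dw = np.zeros_like(w_toy)
for j in range(w_toy.size):
    step = np.zeros_like(w_toy)
    step[j] = eps
    num_dw[j] = (cost(X_toy, Y_toy, w_toy + step, b_toy)
                 - cost(X_toy, Y_toy, w_toy - step, b_toy)) / (2 * eps)
num_db = (cost(X_toy, Y_toy, w_toy, b_toy + eps)
          - cost(X_toy, Y_toy, w_toy, b_toy - eps)) / (2 * eps)

print("analytic: ", dj_db, dj_dw)
print("numerical:", num_db, num_dw)
```

If the two lines disagree by more than rounding error, the analytic gradient most likely contains a bug.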

In [14]:
def gradient_descent(X, Y, w_in, b_in, cost_function, gradient_function, alpha, num_iters): 

    J_history = []
    w = copy.deepcopy(w_in)  #avoid modifying global w within function
    b = b_in
    
    for i in range(num_iters):

        # Calculate the gradient 
        dj_db,dj_dw = gradient_function(X, Y, w, b) 

        # Update Parameters
        w = w - alpha * dj_dw              
        b = b - alpha * dj_db           
      
        # Save cost J at each iteration (capped to limit memory use)
        if i < 100000:
            J_history.append(cost_function(X, Y, w, b))

        # Print the cost at roughly ten evenly spaced iterations
        if i % math.ceil(num_iters / 10) == 0:
            print(f"Iteration {i:4d}: Cost {J_history[-1]:8.2f}   ")
        
    return w, b, J_history

In [15]:
# Test: settings for the gradient descent run
iterations = 1500
alpha = 1.0e-2

# Run gradient descent 
w_final, b_final, J_hist = gradient_descent(X_norm, Y,w_init,b, compute_cost, compute_gradient, alpha, iterations)

print(f"b, w found by gradient descent: {b_final:.2f}, {w_final}")


Iteration    0: Cost 63922471505.35   
Iteration  150: Cost 6183496655.93   
Iteration  300: Cost 2540810350.88   
Iteration  450: Cost 2146111899.85   
Iteration  600: Cost 2069212113.94   
Iteration  750: Cost 2050115218.76   
Iteration  900: Cost 2045096974.94   
Iteration 1050: Cost 2043763784.17   
Iteration 1200: Cost 2043408875.95   
Iteration 1350: Cost 2043314360.51   
b, w found by gradient descent: 340412.56, [109303.05477637  -6433.61316105]
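The run above ends with `w` close to `[109303.05, -6433.61]`, while `LinearRegression` later in this notebook reports `[109447.80, -6578.35]`. The small gap suggests that gradient descent had not fully converged after 1500 iterations. One way to check convergence is to compare against an exact least-squares solution. The sketch below does this on synthetic data, using `np.linalg.lstsq` as the reference; the data, the larger learning rate, and all variable names are assumptions for illustration only.

```python
import numpy as np

# Synthetic housing-like data (hypothetical coefficients and noise level)
rng = np.random.default_rng(1)
m_toy = 50
X_raw = rng.normal(size=(m_toy, 2)) * np.array([800.0, 0.8]) + np.array([2000.0, 3.0])
Y_toy = (140.0 * X_raw[:, 0] - 9000.0 * X_raw[:, 1] + 90000.0
         + rng.normal(scale=5000.0, size=m_toy))

# z-score normalize the features, as in the notebook
X_toy = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0)

# Vectorized batch gradient descent with simultaneous updates
w = np.zeros(2)
b = 0.0
alpha = 0.1
for _ in range(1000):
    err = X_toy @ w + b - Y_toy
    w = w - alpha * (X_toy.T @ err) / m_toy
    b = b - alpha * err.mean()

# Exact least-squares reference: append a column of ones for the intercept
A = np.column_stack([X_toy, np.ones(m_toy)])
theta, *_ = np.linalg.lstsq(A, Y_toy, rcond=None)
w_exact, b_exact = theta[:2], theta[2]

print("gradient descent:", w, b)
print("least squares:   ", w_exact, b_exact)
```

When the two results agree closely, more iterations would not change the fit in any meaningful way.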


In [16]:
m,_ = X_norm.shape
for i in range(m):
    print(f"prediction: {np.dot(X_norm[i], w_final) + b_final:0.2f}, target value: {Y[i]}")

prediction: 356231.27, target value: 399900
prediction: 286161.88, target value: 329900
prediction: 397383.13, target value: 369000
prediction: 269126.74, target value: 232000
prediction: 472253.32, target value: 539900
prediction: 331141.35, target value: 299900
prediction: 276986.12, target value: 314900
prediction: 262110.28, target value: 198999
prediction: 255576.03, target value: 212000
prediction: 271425.06, target value: 242500
prediction: 324885.15, target value: 239999
prediction: 341772.50, target value: 347000
prediction: 326479.58, target value: 329999
prediction: 669188.83, target value: 699900
prediction: 240005.05, target value: 259900
prediction: 374934.72, target value: 449900
prediction: 255780.19, target value: 299900
prediction: 235556.20, target value: 199900
prediction: 417893.93, target value: 499998
prediction: 476563.14, target value: 599000
prediction: 309379.31, target value: 252900
prediction: 334747.29, target value: 255000
prediction: 286717.98, target va
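The predictions above are for the training rows, which were already normalized. To price a house that is not in the training set, its raw features would first have to be normalized with the training mean and standard deviation, which the earlier call `X_norm, _, _ = zscore_normalize_features(X)` discards. The sketch below rebuilds those statistics from the `df1.describe()` output and uses the parameters printed above; the helper `predict_price` and the example house are hypothetical.

```python
import numpy as np

def predict_price(x_raw, mu, sigma, w, b):
    """Normalize raw features with the training statistics, then apply the model."""
    x_norm = (np.asarray(x_raw, dtype=float) - mu) / sigma
    return float(np.dot(x_norm, w) + b)

m_train = 47
mu = np.array([2000.680851, 3.170213])            # means from df1.describe()
sample_std = np.array([794.702354, 0.760982])     # describe() reports the sample std (ddof=1)
# np.std defaults to ddof=0, so convert the sample std to the population std
sigma = sample_std * np.sqrt((m_train - 1) / m_train)

# Parameters printed by the gradient descent run above
w_final = np.array([109303.05477637, -6433.61316105])
b_final = 340412.56

new_house = [1650, 3]   # hypothetical house: 1650 sq ft, 3 bedrooms
price = predict_price(new_house, mu, sigma, w_final, b_final)
print(f"predicted price: {price:0.2f}")

# Sanity check against the first training row (2104 sq ft, 3 bedrooms)
first_row_price = predict_price([2104, 3], mu, sigma, w_final, b_final)
print(f"first training row: {first_row_price:0.2f}")
```

Applied to the first training row, the helper should reproduce the first prediction listed above (356231.27) up to rounding of the inputs.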



<p style="background-color:#93CDDD; font-family: 'Trebuchet MS'; font-weight: bold; color: #3a5a40; font-size: 40px; text-align: center; border-radius: 50px; padding: 10px;">4 - Linear Regression using Scikit-Learn</p>
<a name = 4></a>

## 1) SGDRegressor

### Normalize the training data

In [17]:
from sklearn.linear_model import SGDRegressor
from sklearn.preprocessing import StandardScaler


scaler = StandardScaler()
X_Scal = scaler.fit_transform(X)

### Create and fit the regression model

In [18]:
sgdr = SGDRegressor(max_iter=1000)
sgdr.fit(X_Scal, Y)
print(f"number of iterations completed: {sgdr.n_iter_}, number of weight updates: {sgdr.t_}")

number of iterations completed: 132, number of weight updates: 6205.0


In [19]:
b_Scal = sgdr.intercept_
w_Scal = sgdr.coef_
print(f"model parameters:                   w: {w_Scal}, b:{b_Scal}")

model parameters:                   w: [108476.80483065  -5609.50867684], b:[340371.06509063]


In [20]:
y_pred_sgd = sgdr.predict(X_Scal)
print(f"{y_pred_sgd}" )

[355894.86288414 286355.14331153 396735.65056965 268418.72879697
 472069.92885843 332024.66027468 277248.75146273 262485.35861696
 256000.50381554 271729.72609982 325815.75674141 341545.39694059
 326368.0771926  668546.813863   240547.23279941 375486.98500757
 255173.067926   236132.01250908 418121.45593602 476347.17351468
 309397.07420166 333543.22807927 286907.04584782 328851.22069101
 602319.3453379  217367.3262752  266762.60327321 414120.5804628
 369140.52375512 429435.87584487 326782.42200971 218471.96717756
 339613.32014868 498423.6928812  308016.89994604 263865.11495768
 236545.9394113  352307.07848336 639710.3242567  356446.34750554
 302636.68604699 374383.59784988 412326.47930497 231164.88968246
 190600.47118    313673.90094302 231578.81658468]


In [21]:
from sklearn.metrics import mean_squared_error
mse = mean_squared_error(Y, y_pred_sgd)
print(f"Mean Squared Error: {mse}")


Mean Squared Error: 4087389750.167361
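This number is not directly comparable with the cost printed by gradient descent, because $J(\mathbf{w},b)$ divides by $2m$ while the mean squared error divides by $m$, so the MSE is $2J$. Doubling the last printed cost gives $2 \times 2043314360.51 \approx 4086628721$, which is close to the MSE above. The sketch below illustrates the relationship on the first five prediction and target pairs from the earlier loop; the function names are illustrative.

```python
import numpy as np

def half_mse_cost(y_true, y_pred):
    """Cost J as defined in this notebook: sum of squared errors / (2m)."""
    return np.sum((y_pred - y_true) ** 2) / (2 * len(y_true))

def mse(y_true, y_pred):
    """Mean squared error: sum of squared errors / m."""
    return np.mean((y_pred - y_true) ** 2)

# First five target/prediction pairs printed by the prediction loop
y_true = np.array([399900.0, 329900.0, 369000.0, 232000.0, 539900.0])
y_pred = np.array([356231.27, 286161.88, 397383.13, 269126.74, 472253.32])

j_val = half_mse_cost(y_true, y_pred)
mse_val = mse(y_true, y_pred)
print(f"J = {j_val:.2f}, MSE = {mse_val:.2f}, MSE / J = {mse_val / j_val:.1f}")
```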


## 2) LinearRegression()

In [22]:
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error

# Scale the features
scaler = StandardScaler()
X_Scal = scaler.fit_transform(X)

# Create and fit the model
lr = LinearRegression()
lr.fit(X_Scal, Y)

b_Scal = lr.intercept_
w_Scal = lr.coef_
print(f"Model parameters: w: {w_Scal}, b: {b_Scal}")

y_pred_lr = lr.predict(X_Scal)
print(y_pred_lr)

mse = mean_squared_error(Y, y_pred_lr)
print(f"Mean Squared Error: {mse}")

Model parameters: w: [109447.79646964  -6578.35485416], b: 340412.6595744681
[356283.1103389  286120.93063401 397489.46984812 269244.1857271
 472277.85514636 330979.02101847 276933.02614885 262037.48402897
 255494.58235014 271364.59918815 324714.54068768 341805.20024107
 326492.02609913 669293.21223209 239902.98686016 374830.38333402
 255879.96102141 235448.2452916  417846.48160547 476593.38604091
 309369.11319496 334951.62386342 286677.77333009 327777.17551607
 604913.37413438 216515.5936252  266353.01492351 415030.01477434
 369647.33504459 430482.39959029 328130.30083656 220070.5644481
 338635.60808944 500087.73659911 306756.36373941 263429.59076914
 235865.87731365 351442.99009906 641418.82407778 355619.31031959
 303768.43288347 374937.34065726 411999.63329673 230436.66102696
 190729.36558116 312464.00137413 230854.29304902]
Mean Squared Error: 4086560101.2056575
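The coefficients above are expressed per standard deviation of each feature, because the model was fit on scaled inputs. To read them in original units (price per square foot, price per bedroom), they can be converted back using the scaler's mean and standard deviation. The sketch below shows the algebra on synthetic data; `unscale_parameters` and every value in it are hypothetical and are not taken from the fitted model.

```python
import numpy as np

def unscale_parameters(w_scaled, b_scaled, mu, sigma):
    """Convert parameters fit on z-scored features to parameters for raw features.

    From w . ((x - mu) / sigma) + b = (w / sigma) . x + (b - sum(w * mu / sigma)).
    """
    w_orig = w_scaled / sigma
    b_orig = b_scaled - np.sum(w_scaled * mu / sigma)
    return w_orig, b_orig

# Synthetic raw features (size in sq ft, bedrooms) and made-up scaled parameters
rng = np.random.default_rng(2)
X_raw = rng.uniform([800.0, 1.0], [4500.0, 5.0], size=(10, 2))
mu = X_raw.mean(axis=0)
sigma = X_raw.std(axis=0)
w_scaled = np.array([100000.0, -5000.0])
b_scaled = 340000.0

w_orig, b_orig = unscale_parameters(w_scaled, b_scaled, mu, sigma)

# Both parameterizations should give the same predictions
pred_scaled = ((X_raw - mu) / sigma) @ w_scaled + b_scaled
pred_orig = X_raw @ w_orig + b_orig
print("per-unit weights:", w_orig, "intercept:", b_orig)
```

With a fitted `StandardScaler`, the attributes `mean_` and `scale_` would play the roles of `mu` and `sigma`.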


<h1 style="font-family: Trebuchet MS; font-size: 20px; color: #52b788; text-align: left; "><b>Thank You</b></h1>
<h1 style="font-family: 'Trebuchet MS'; font-size: 1px; color: #264653; text-align: center; "> <b>Created By: Hassane Skikri</b></h1>