# Gradient Descent for Multiple Linear Regression
## Previous notation
* Parameters $w_{1}, \dotsc, w_{n}$ and $b$
* Model $f_{\vec{w}, b}\left(\vec{x}\right) = \sum_{i=1}^{n}w_{i}x_{i} + b$
* Cost Function $J\left(w_{1}, \dotsc, w_{n}, b\right)$
### Gradient Descent
Repeat until convergence, where $w'_{j}$ and $b'$ denote the parameter values from the previous iteration:
* $w_{j} = w'_{j} - \alpha \frac{\partial}{\partial w_{j}} J\left(w_{1}, \dotsc, w_{n}, b\right)$
* $b = b' - \alpha \frac{\partial}{\partial b} J\left(w_{1}, \dotsc, w_{n}, b\right)$
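The cost function $J$ is not written out here. Assuming the usual squared-error cost over $m$ training examples $\left(\vec{x}^{(k)}, y^{(k)}\right)$ (an assumption, not something stated in this section), the partial derivatives in the update rule would take the following form:

$$
    \begin{align*}
        J\left(w_{1}, \dotsc, w_{n}, b\right) &= \frac{1}{2m} \sum_{k=1}^{m} \left( f_{\vec{w}, b}\left(\vec{x}^{(k)}\right) - y^{(k)} \right)^{2} \\
        \frac{\partial}{\partial w_{j}} J\left(w_{1}, \dotsc, w_{n}, b\right) &= \frac{1}{m} \sum_{k=1}^{m} \left( f_{\vec{w}, b}\left(\vec{x}^{(k)}\right) - y^{(k)} \right) x_{j}^{(k)} \\
        \frac{\partial}{\partial b} J\left(w_{1}, \dotsc, w_{n}, b\right) &= \frac{1}{m} \sum_{k=1}^{m} \left( f_{\vec{w}, b}\left(\vec{x}^{(k)}\right) - y^{(k)} \right)
    \end{align*}
$$

The index $k$ runs over training examples, so it does not clash with the feature index $i$ used in the model.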
## Vector notation
* Parameters $\vec{w} = \left[ w_{1}, \dotsc, w_{n} \right]$ and $b$
* Model $f_{\vec{w}, b}\left(\vec{x}\right) = \vec{w} \cdot \vec{x} + b$
* Cost Function $J\left(\vec{w}, b\right)$
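The two notations describe the same prediction. A minimal sketch of both, assuming NumPy; the function names and the demo values are illustrative and not part of the original notebook:

```python
import numpy as np

def predict_loop(x, w, b):
    """Previous notation: sum w_i * x_i over the n features, then add b."""
    total = 0.0
    for i in range(len(w)):
        total += w[i] * x[i]
    return total + b

def predict_vectorized(x, w, b):
    """Vector notation: dot product of w and x, plus b."""
    return np.dot(w, x) + b

# Hypothetical values, chosen only for illustration
w_demo = np.array([0.5, -1.0, 2.0])
x_demo = np.array([4.0, 3.0, 1.0])
b_demo = 10.0

print(predict_loop(x_demo, w_demo, b_demo))
print(predict_vectorized(x_demo, w_demo, b_demo))
```

Both calls should print the same number; the vectorized form is shorter and lets NumPy do the summation.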
### Gradient Descent
Repeat until convergence:
$$
    \begin{align*}
        w_{j} &= w'_{j} - \alpha \frac{\partial}{\partial w_{j}} J\left(\vec{w}, b\right) \\
        b &= b' - \alpha \frac{\partial}{\partial b} J\left(\vec{w}, b\right)
    \end{align*}
$$
Here $1 \leq j \leq n$, and all parameters are updated simultaneously from the previous values $w'_{j}$ and $b'$.
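The update rule above can be sketched in NumPy as follows. This is one possible implementation, not the notebook's own: it assumes a squared-error cost with a $\frac{1}{2m}$ factor (the section does not define $J$), and the function names, the toy data, and the values of `alpha` and `num_iters` are all illustrative choices.

```python
import numpy as np

def compute_gradient(X, y, w, b):
    """Gradient of the assumed squared-error cost J(w, b) = sum(err^2) / (2m)."""
    m = X.shape[0]
    errors = X @ w + b - y          # f(x) - y for every training example
    dj_dw = X.T @ errors / m        # one partial derivative per w_j
    dj_db = errors.sum() / m
    return dj_dw, dj_db

def gradient_descent(X, y, w_init, b_init, alpha, num_iters):
    """Run a fixed number of simultaneous updates of w and b."""
    w = np.asarray(w_init, dtype=float).copy()
    b = float(b_init)
    for _ in range(num_iters):
        # Both gradients are computed from the previous w and b ...
        dj_dw, dj_db = compute_gradient(X, y, w, b)
        # ... and only then are the parameters overwritten.
        w = w - alpha * dj_dw
        b = b - alpha * dj_db
    return w, b

# Toy data generated from y = 2*x1 + 3*x2 + 1 (hypothetical, for illustration)
X_toy = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]])
y_toy = X_toy @ np.array([2.0, 3.0]) + 1.0

w_fit, b_fit = gradient_descent(X_toy, y_toy, np.zeros(2), 0.0,
                                alpha=0.05, num_iters=5000)
print(w_fit, b_fit)
```

A fixed iteration count stands in for "repeat until convergence" to keep the sketch short; a tolerance on the change in cost would be another reasonable stopping rule.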

In [8]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# Reading the dataset
dataset = pd.read_csv('data_set/second_data_set/Cellphone.csv')

dataset.head(10)


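| | Product_id | Price | Sale | weight | resoloution | ppi | cpu core | cpu freq | internal mem | ram | RearCam | Front_Cam | battery | thickness |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 203 | 2357 | 10 | 135.0 | 5.2 | 424 | 8 | 1.35 | 16.0 | 3.0 | 13.0 | 8.0 | 2610 | 7.4 |
| 1 | 880 | 1749 | 10 | 125.0 | 4.0 | 233 | 2 | 1.3 | 4.0 | 1.0 | 3.15 | 0.0 | 1700 | 9.9 |
| 2 | 40 | 1916 | 10 | 110.0 | 4.7 | 312 | 4 | 1.2 | 8.0 | 1.5 | 13.0 | 5.0 | 2000 | 7.6 |
| 3 | 99 | 1315 | 11 | 118.5 | 4.0 | 233 | 2 | 1.3 | 4.0 | 0.512 | 3.15 | 0.0 | 1400 | 11.0 |
| 4 | 880 | 1749 | 11 | 125.0 | 4.0 | 233 | 2 | 1.3 | 4.0 | 1.0 | 3.15 | 0.0 | 1700 | 9.9 |
| 5 | 947 | 2137 | 12 | 150.0 | 5.5 | 401 | 4 | 2.3 | 16.0 | 2.0 | 16.0 | 8.0 | 2500 | 9.5 |
| 6 | 774 | 1238 | 13 | 134.1 | 4.0 | 233 | 2 | 1.2 | 8.0 | 1.0 | 2.0 | 0.0 | 1560 | 11.7 |
| 7 | 947 | 2137 | 13 | 150.0 | 5.5 | 401 | 4 | 2.3 | 16.0 | 2.0 | 16.0 | 8.0 | 2500 | 9.5 |
| 8 | 99 | 1315 | 14 | 118.5 | 4.0 | 233 | 2 | 1.3 | 4.0 | 0.512 | 3.15 | 0.0 | 1400 | 11.0 |
| 9 | 1103 | 2580 | 15 | 145.0 | 5.1 | 432 | 4 | 2.5 | 16.0 | 2.0 | 16.0 | 2.0 | 2800 | 8.1 |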


In [7]:
# Check Data
dataset.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 161 entries, 0 to 160
Data columns (total 14 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   Product_id    161 non-null    int64  
 1   Price         161 non-null    int64  
 2   Sale          161 non-null    int64  
 3   weight        161 non-null    float64
 4   resoloution   161 non-null    float64
 5   ppi           161 non-null    int64  
 6   cpu core      161 non-null    int64  
 7   cpu freq      161 non-null    float64
 8   internal mem  161 non-null    float64
 9   ram           161 non-null    float64
 10  RearCam       161 non-null    float64
 11  Front_Cam     161 non-null    float64
 12  battery       161 non-null    int64  
 13  thickness     161 non-null    float64
dtypes: float64(8), int64(6)
memory usage: 17.7 KB


In [19]:
x = dataset[['Sale', 'weight', 'resoloution', 'ppi', 'cpu core', 'cpu freq', 'internal mem', 'ram', 'RearCam', 'Front_Cam', 'battery', 'thickness']]
y = dataset['Price']
print(f"x shape: {x.shape}")
print(f"y shape: {y.shape}")

x shape: (161, 12)
y shape: (161,)


In [22]:
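# Initial parameters: bias and one weight per feature, all zero.
# Float dtype is used because gradient updates produce fractional values.
b_init = 0.0
w_init = np.zeros(x.shape[1])
w_init

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])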
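With every parameter set to zero, the model predicts 0 for each phone, so the starting cost depends only on the prices. A rough sketch of that check, assuming a squared-error cost with a $\frac{1}{2m}$ factor (the notebook does not define $J$ at this point) and using only the first three rows of the `dataset.head(10)` output with three of the twelve feature columns; the variable names are illustrative:

```python
import numpy as np

# First three rows of the head() output; columns: Sale, weight, battery
X_small = np.array([
    [10.0, 135.0, 2610.0],
    [10.0, 125.0, 1700.0],
    [10.0, 110.0, 2000.0],
])
y_small = np.array([2357.0, 1749.0, 1916.0])  # Price column

# Zero initial parameters, matching the initialisation above
w0 = np.zeros(X_small.shape[1])
b0 = 0.0

predictions = X_small @ w0 + b0  # all zeros, since w0 and b0 are zero
initial_cost = np.sum((predictions - y_small) ** 2) / (2 * len(y_small))

print(predictions)
print(initial_cost)
```

The large initial cost is expected: it is simply the mean of the squared prices divided by two, and gradient descent should reduce it from there.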