# Solving Linear Regression 

- Training set only; no test points.

In [1]:
import numpy as np
from datetime import datetime
import pickle

In [2]:
# load
with open('data.pickle', 'rb') as f:
    data_load = pickle.load(f)

In [3]:
X, y = data_load

In [4]:
print(X.shape, y.shape)

(3360, 4) (3360,)


### Analytic Solution

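Let $X \in \mathbb{R}^{N \times (d+1)}$ be the ***design matrix*** of the data,
that is, the $i$th **row vector** of $X$ is $\hat{x}^i = (1, x^i)$, where $x^i \in \mathbb{R}^d$ is the $i$th data point and the leading 1 accounts for the bias term.

Let $y \in \mathbb{R}^N$ be the **row vector** of labels, and let $w \in \mathbb{R}^{d+1}$ be the **row vector** of weights (bias first).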

Then the loss function $L(w)$ can be written in vector notation as
$$L(w) = \frac{1}{2N}\sum_{i=1}^N (y_i- w (\hat{x}^i)^\top)^2=\frac{1}{2N}(y - w X^\top) (y - wX^\top)^\top.$$

Since the loss function is convex in $w$, any point where the gradient vanishes is a global minimum, so we differentiate with respect to $w$:

$$\nabla_{w} L(w) = \frac{1}{N}\left(-yX + w(X^\top X)\right)$$

Therefore, if $X^\top X$ is invertible, the analytic optimal solution is
$$ \hat{w} = yX(X^\top X)^{-1}. $$

```np.linalg.solve(A,b)```

- It finds the solution `x` of the linear system `Ax = b`.
- Here `x` is treated as a column vector, so we transpose the equation above.

$$\nabla_{w} L(w) = 0 \quad \Leftrightarrow \quad w(X^\top X)=yX \quad \Leftrightarrow \quad (X^\top X)w^\top = X^\top y^\top$$
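
The equivalence between the row-vector formula and the transposed system can be checked numerically. The following is a minimal sketch on random data; `X_demo` and `y_demo` are hypothetical stand-ins for the design matrix and labels, not the data loaded in this notebook.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the design matrix X and the labels y.
X_demo = rng.normal(size=(50, 3))
y_demo = rng.normal(size=50)

# Row-vector form: w = y X (X^T X)^{-1}, using an explicit inverse.
w_row = y_demo @ X_demo @ np.linalg.inv(X_demo.T @ X_demo)

# Column-vector form: solve (X^T X) w^T = X^T y^T without forming the inverse.
w_col = np.linalg.solve(X_demo.T @ X_demo, X_demo.T @ y_demo)

print(np.allclose(w_row, w_col))
```

Solving the linear system is generally preferred over forming the inverse explicitly, since it avoids one extra source of rounding error.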

In [5]:
w = np.linalg.solve(X.T@X, X.T@y.T)

In [6]:
w

array([ 0.52635202, -4.02713947,  3.6416331 , -6.53016208])

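This is wrong! `X` has no column of ones, so the solution above has only four weights and no bias term. To build the design matrix, we prepend a 1 to every row of `X`.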

In [7]:
Z = np.zeros((X.shape[0], X.shape[1]+1))

In [8]:
for i in range(X.shape[0]):
    temp = list(X[i])
    temp.insert(0, 1)  # list.insert(index, value): insert value at position index
    Z[i] = np.array(temp)

In [9]:
w = np.linalg.solve(Z.T@Z, Z.T@y.T)

In [10]:
w

array([ 9.99759257,  2.96583528, -4.02713947,  0.98530689, -6.99427012])
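
The row-by-row loop used to build `Z` can also be written as a single vectorized call. The sketch below compares the two on a small random matrix; `X_demo`, `Z_loop`, and `Z_vec` are illustrative names, and `np.hstack` is one of several ways to do this.

```python
import numpy as np

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(6, 4))  # hypothetical small feature matrix

# Loop version, mirroring the cells above.
Z_loop = np.zeros((X_demo.shape[0], X_demo.shape[1] + 1))
for i in range(X_demo.shape[0]):
    row = list(X_demo[i])
    row.insert(0, 1)          # prepend the constant 1 for the bias term
    Z_loop[i] = np.array(row)

# Vectorized version: stack a column of ones to the left of the features.
Z_vec = np.hstack([np.ones((X_demo.shape[0], 1)), X_demo])

print(np.array_equal(Z_loop, Z_vec))
```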

### Gradient Descent Method

In [11]:
INPUT_DIM=4
OUTPUT_DIM=1

In [12]:
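def forward(X, weights):
    # Linear model: weights has shape (1, d+1), X has shape (N, d+1), so pred has shape (1, N)
    pred = np.matmul(weights, X.T)
    return pred

def MSE(X, y, pred):
    # Half mean squared error, matching L(w) = 1/(2N) * sum of squared residuals
    N = X.shape[0]
    loss = np.sum((pred-y)**2) / (2*N)
    return loss

def compute_grads(X, y, pred):
    # Gradient of the loss w.r.t. the weights: (1/N) * (-yX + wX^T X), where pred = wX^T
    N     = X.shape[0]
    grads = (1/N)*(-np.matmul(y,X) + np.matmul(pred, X))
    return grads

def update_weights(weights, grads, LR):
    # One gradient descent step; note that this updates the weights array in place
    weights -= LR*grads
    return weights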
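
One way to gain confidence in `compute_grads` is to compare it against a central finite-difference approximation of the loss. The sketch below repeats the helper functions so that it runs on its own; the random inputs `X_chk`, `y_chk`, `w_chk` and the step size `eps` are assumptions chosen for illustration.

```python
import numpy as np

def forward(X, weights):
    return np.matmul(weights, X.T)

def MSE(X, y, pred):
    return np.sum((pred - y) ** 2) / (2 * X.shape[0])

def compute_grads(X, y, pred):
    N = X.shape[0]
    return (1 / N) * (-np.matmul(y, X) + np.matmul(pred, X))

rng = np.random.default_rng(0)
X_chk = rng.normal(size=(8, 5))   # hypothetical batch, bias column not needed for the check
y_chk = rng.normal(size=(1, 8))
w_chk = rng.normal(size=(1, 5))

analytic = compute_grads(X_chk, y_chk, forward(X_chk, w_chk))

# Central differences: perturb one weight at a time by +/- eps.
eps = 1e-6
numeric = np.zeros_like(w_chk)
for j in range(w_chk.shape[1]):
    step = np.zeros_like(w_chk)
    step[0, j] = eps
    loss_plus = MSE(X_chk, y_chk, forward(X_chk, w_chk + step))
    loss_minus = MSE(X_chk, y_chk, forward(X_chk, w_chk - step))
    numeric[0, j] = (loss_plus - loss_minus) / (2 * eps)

max_abs_diff = np.max(np.abs(analytic - numeric))
print(max_abs_diff)
```

Because the loss is quadratic in the weights, the central difference should match the analytic gradient up to floating-point rounding.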

In [18]:
BATCH_SIZE= 30
EPOCHS=100
LR = 0.001

In [19]:
weights= np.random.randn(OUTPUT_DIM,INPUT_DIM+1)

In [20]:
start = datetime.now()

for epoch in range(EPOCHS):
    
    # Shuffle Data
    idx = np.random.permutation(X.shape[0])
    x_temp = Z[idx]
    y_temp = y[idx]
    
    for batch in range(X.shape[0]//BATCH_SIZE):
        batch_X = x_temp[batch*BATCH_SIZE:(batch+1)*BATCH_SIZE]
        batch_y = y_temp[batch*BATCH_SIZE:(batch+1)*BATCH_SIZE].reshape(1,-1)
        
        pred  = forward(batch_X, weights)
        loss  = MSE(batch_X, batch_y, pred)
        grads = compute_grads(batch_X, batch_y, pred)
        
        weights = update_weights(weights, grads, LR)
    
    # The reported loss is that of the last mini-batch only, so it fluctuates from epoch to epoch
    print('EPOCH %d Completed, Loss: %.3f' % (epoch+1, loss))
    
end = datetime.now()
print('Total time:', end-start)

EPOCH 1 Completed, Loss: 58.621
EPOCH 2 Completed, Loss: 51.648
EPOCH 3 Completed, Loss: 24.449
EPOCH 4 Completed, Loss: 9.209
EPOCH 5 Completed, Loss: 16.684
EPOCH 6 Completed, Loss: 8.213
EPOCH 7 Completed, Loss: 11.976
EPOCH 8 Completed, Loss: 12.771
EPOCH 9 Completed, Loss: 7.629
EPOCH 10 Completed, Loss: 8.944
EPOCH 11 Completed, Loss: 7.465
EPOCH 12 Completed, Loss: 8.761
EPOCH 13 Completed, Loss: 6.616
EPOCH 14 Completed, Loss: 5.823
EPOCH 15 Completed, Loss: 6.008
EPOCH 16 Completed, Loss: 5.483
EPOCH 17 Completed, Loss: 3.970
EPOCH 18 Completed, Loss: 8.273
EPOCH 19 Completed, Loss: 4.767
EPOCH 20 Completed, Loss: 4.130
EPOCH 21 Completed, Loss: 3.607
EPOCH 22 Completed, Loss: 3.349
EPOCH 23 Completed, Loss: 3.899
EPOCH 24 Completed, Loss: 2.240
EPOCH 25 Completed, Loss: 2.446
EPOCH 26 Completed, Loss: 4.168
EPOCH 27 Completed, Loss: 2.777
EPOCH 28 Completed, Loss: 2.708
EPOCH 29 Completed, Loss: 3.225
EPOCH 30 Completed, Loss: 4.419
EPOCH 31 Completed, Loss: 2.592
EPOCH 32 Co

In [21]:
weights

array([[ 9.90051296,  2.93835035, -4.02661454,  1.01599308, -6.9884521 ]])
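
These weights agree with the analytic solution to within about 0.1 in every coordinate. The same comparison can be reproduced end to end on synthetic data. The sketch below is a self-contained illustration, not the notebook's experiment: the data, the true weights `w_true`, and the hyperparameters `lr`, `batch_size`, and `epochs` are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic problem with known weights (bias first).
N, d = 300, 2
features = rng.normal(size=(N, d))
Z_syn = np.hstack([np.ones((N, 1)), features])
w_true = np.array([2.0, -1.0, 0.5])
y_syn = Z_syn @ w_true + 0.01 * rng.normal(size=N)

# Analytic solution from the normal equations.
w_exact = np.linalg.solve(Z_syn.T @ Z_syn, Z_syn.T @ y_syn)

# Mini-batch gradient descent with the same update rule as above.
w_gd = np.zeros((1, d + 1))
lr, batch_size, epochs = 0.05, 30, 200
for epoch in range(epochs):
    idx = rng.permutation(N)  # reshuffle the data every epoch
    for b in range(N // batch_size):
        rows = idx[b * batch_size:(b + 1) * batch_size]
        Zb, yb = Z_syn[rows], y_syn[rows].reshape(1, -1)
        pred = w_gd @ Zb.T
        grad = (pred - yb) @ Zb / batch_size
        w_gd -= lr * grad

gap = np.max(np.abs(w_gd.ravel() - w_exact))
print(gap)
```

With a larger learning rate or more epochs, the gap between the two solutions typically shrinks, at the cost of more computation or a higher risk of divergence.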