# Linear Regression with Multiple Outputs

- decision function:
$$\begin{align}
\hat{y}_1 &= b_{1} + w_{11}x_1 + \ldots + w_{1D}x_D\\
\hat{y}_2 &= b_{2} + w_{21}x_1 + \ldots + w_{2D}x_D\\
&\vdots\\
\hat{y}_N &= b_{N} + w_{N1}x_1 + \ldots + w_{ND}x_D
\end{align}$$

- input vector: $$\vec{x} = \begin{bmatrix}
x_1\\
\vdots\\
x_D
\end{bmatrix}$$

- output vector: $$\vec{y} = \begin{bmatrix}
\hat{y}_1\\
\vdots\\
\hat{y}_N
\end{bmatrix}$$

- bias vector: $$\vec{b} = \begin{bmatrix}
b_1\\
\vdots\\
b_N
\end{bmatrix}$$
- weight matrix:
$$W = \begin{bmatrix}
w_{11} & w_{12} & \cdots & w_{1D}\\
w_{21} & w_{22} & \cdots & w_{2D}\\
\vdots & \vdots & \ddots & \vdots\\
w_{N1} & w_{N2} & \cdots & w_{ND}
\end{bmatrix}_{N\times D}$$

where $N$ is the number of outputs and $D$ is the number of input features.

So the decision function in vector form is

$$\vec{y} = W\vec{x} + \vec{b}$$

In [1]:
import torch
import torch.nn as nn
from torch.nn import Linear

torch.manual_seed(1)  # fix the seed so the initial parameters are reproducible

In [2]:
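# Model: one linear layer mapping input_size features to output_size outputs
class LR(nn.Module):
    def __init__(self, input_size, output_size):
        super().__init__()
        self.linear = Linear(input_size, output_size)

    def forward(self, X):
        # X has shape (num_samples, input_size); the result has shape (num_samples, output_size)
        return self.linear(X)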

In [3]:
model = LR(input_size = 2, output_size = 2)

In [4]:
list(model.parameters())

[Parameter containing:
 tensor([[ 0.3643, -0.3121],
         [-0.1371,  0.3319]], requires_grad=True),
 Parameter containing:
 tensor([-0.6657,  0.4241], requires_grad=True)]

In [5]:
x = torch.tensor([[1.0, 2.0]])
yhat = model(x)

In [6]:
yhat

tensor([[-0.9256,  0.9508]], grad_fn=<AddmmBackward0>)
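As a sanity check, the prediction can be reproduced by hand from $\hat{y}_n = b_n + \sum_d w_{nd}x_d$. The sketch below uses plain Python and copies the weight and bias values from the printed parameters, so it only agrees with the model output up to the four decimals shown there.

```python
# Parameters copied from the printed output of model.parameters() (rounded to 4 decimals).
W = [[0.3643, -0.3121],
     [-0.1371, 0.3319]]
b = [-0.6657, 0.4241]
x = [1.0, 2.0]

# y_hat[n] = b[n] + sum over d of W[n][d] * x[d]
y_hat = [b[n] + sum(W[n][d] * x[d] for d in range(len(x)))
         for n in range(len(b))]

print([round(v, 4) for v in y_hat])  # [-0.9256, 0.9508]
```

Row $n$ of $W$ and entry $n$ of $\vec{b}$ together produce output $n$, which is why the weight matrix has shape $N \times D$.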

- Input matrix ($K$ samples, one per column)

$$X = \begin{bmatrix}
x_{11} & x_{12} & \cdots & x_{1K}\\
x_{21} & x_{22} & \cdots & x_{2K}\\
\vdots & \vdots & \ddots & \vdots\\
x_{D1} & x_{D2} & \cdots & x_{DK}
\end{bmatrix}_{D\times K}$$

- Bias matrix (every column equals $\vec{b}$, so $b_{nk} = b_n$)

$$B = \begin{bmatrix}
b_{11} & b_{12} & \cdots & b_{1K}\\
b_{21} & b_{22} & \cdots & b_{2K}\\
\vdots & \vdots & \ddots & \vdots\\
b_{N1} & b_{N2} & \cdots & b_{NK}
\end{bmatrix}_{N \times K}$$

- Output Matrix

$$\hat{Y} = WX + B$$

In [7]:
X = torch.tensor([[1.0, 2.0], [1.0, 2.0], [1.0, 3.0]])
Yhat = model(X)

In [8]:
Yhat

tensor([[-0.9256,  0.9508],
        [-0.9256,  0.9508],
        [-1.2377,  1.2827]], grad_fn=<AddmmBackward0>)
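The formula $\hat{Y} = WX + B$ stores one sample per column, while the tensor passed to the model stores one sample per row. `nn.Linear` therefore computes $XW^\top + \vec{b}$ on a $K \times D$ input. The sketch below illustrates that the two conventions give the same numbers up to a transpose; the names `layer`, `Yhat_rows`, and `Yhat_cols` are introduced here for illustration only.

```python
import torch
from torch import nn

torch.manual_seed(1)

layer = nn.Linear(in_features=2, out_features=2)

# Three samples stored as rows: shape (K, D) = (3, 2).
X = torch.tensor([[1.0, 2.0], [1.0, 2.0], [1.0, 3.0]])

# Row convention used by PyTorch: Yhat = X W^T + b, shape (K, N).
Yhat_rows = X @ layer.weight.T + layer.bias

# Column convention from the formula: Yhat = W X + B, with X of shape (D, K)
# and B holding the bias vector repeated in each of the K columns.
B = layer.bias.unsqueeze(1).expand(-1, X.shape[0])
Yhat_cols = layer.weight @ X.T + B

print(torch.allclose(layer(X), Yhat_rows))     # layer output matches the row form
print(torch.allclose(Yhat_cols.T, Yhat_rows))  # column form is its transpose
```

In practice the bias matrix $B$ is never built explicitly; broadcasting adds the bias vector to every row.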

## Training

- Cost function:

$$l(W, \vec{b}) = \frac{1}{K}\sum_{k = 1}^K\|\vec{y}_k - (W\vec{x}_k + \vec{b})\|^2$$

- Update weights and bias

$$\begin{align}
W^{i + 1} &= W^i - \eta\nabla_W l(W^i, \vec{b}^i)\\
\vec{b}^{i + 1} &= \vec{b}^i - \eta\nabla_{\vec{b}} l(W^i, \vec{b}^i)
\end{align}$$
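The update rule can be checked numerically by taking one gradient step by hand and comparing it with `optim.SGD`. This is a minimal sketch on a hypothetical random batch, not the data used later in this notebook. Note that `nn.MSELoss` with its default reduction averages over all $K \cdot N$ elements, so it differs from the cost above by a constant factor $1/N$; this rescales the gradient but leaves the form of the update unchanged.

```python
import torch
from torch import nn, optim

torch.manual_seed(0)

# Hypothetical toy batch: K = 4 samples, D = 2 features, N = 2 outputs.
X = torch.randn(4, 2)
Y = torch.randn(4, 2)

layer = nn.Linear(2, 2)
eta = 0.1  # learning rate

# Gradients of the mean squared error, computed by autograd.
loss = nn.MSELoss()(layer(X), Y)
loss.backward()

# Manual update: parameter minus learning rate times gradient.
with torch.no_grad():
    W_manual = layer.weight - eta * layer.weight.grad
    b_manual = layer.bias - eta * layer.bias.grad

# The same step taken by the optimizer, which updates the parameters in place.
optimizer = optim.SGD(layer.parameters(), lr=eta)
optimizer.step()

print(torch.allclose(layer.weight, W_manual))
print(torch.allclose(layer.bias, b_manual))
```

Both comparisons should print `True`, since plain SGD without momentum or weight decay applies exactly this rule to every parameter.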

In [9]:
from torch import nn, optim
import torch
from torch.utils.data import Dataset, DataLoader

In [10]:
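# Data: 20 samples with 2 features and 2 targets
class Data2D(Dataset):
    def __init__(self):
        self.x = torch.zeros(20, 2)
        # Both feature columns hold the same values, -1.0 to 0.9 in steps of 0.1
        self.x[:, 0] = torch.arange(-1, 1, 0.1)
        self.x[:, 1] = torch.arange(-1, 1, 0.1)
        # Generating weights in (input, output) layout, the transpose of Linear.weight
        self.w = torch.tensor([[1.0, -1.0], [1.0, -1.0]])
        self.b = torch.tensor([[1.0, -1.0]])
        self.f = torch.mm(self.x, self.w) + self.b
        # Noise has shape (20, 1), so both targets of a sample receive the same noise value
        self.y = self.f + 0.1 * torch.randn((self.x.shape[0], 1))
        self.len = self.x.shape[0]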
        
    def __getitem__(self, index):
        return self.x[index], self.y[index]
    
    def __len__(self):
        return self.len

In [11]:
data_set = Data2D()
criterion = nn.MSELoss()
trainloader = DataLoader(dataset = data_set, batch_size = 1)
model = LR(input_size= 2, output_size = 2)
optimizer = optim.SGD(model.parameters(), lr = 0.1)

In [12]:
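for epoch in range(100):
    # batch_size = 1, so each iteration is one SGD step on a single sample
    for x, y in trainloader:
        yhat = model(x)             # forward pass
        loss = criterion(yhat, y)   # mean squared error
        optimizer.zero_grad()       # clear gradients from the previous step
        loss.backward()             # compute gradients
        optimizer.step()            # update weights and bias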

In [13]:
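# Parameters have no .name attribute set, so use named_parameters() to get the names
for name, p in model.named_parameters():
    if p.requires_grad:
        print(name, p.data)

linear.weight tensor([[ 1.2925,  0.7706],
        [-1.3365, -0.6005]])
linear.bias tensor([ 1.0090, -0.9910])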
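The learned bias is close to the generating bias $[1, -1]$, but the learned weights do not match the generating weights entry by entry. A likely explanation is that both feature columns in `Data2D` are identical, so only the sum of each weight row affects the prediction and the individual entries are not uniquely determined. The sketch below checks this with the printed values; `W_true` is the generating matrix `self.w` transposed into the `(output, input)` layout of `nn.Linear`.

```python
# Learned weight rows, copied from the printed output above.
W_learned = [[1.2925, 0.7706],
             [-1.3365, -0.6005]]

# Generating weights from Data2D, transposed into nn.Linear's (output, input) layout.
W_true = [[1.0, 1.0],
          [-1.0, -1.0]]

# With x_1 == x_2, output n depends only on w_n1 + w_n2.
for learned_row, true_row in zip(W_learned, W_true):
    print(round(sum(learned_row), 4), sum(true_row))
```

The row sums of the learned matrix come out near $2$ and $-2$, matching the generating weights. Using two linearly independent feature columns would make the individual weights recoverable as well.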
