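### Assignment goal: differentiation and backpropagation with PyTorch
In this assignment we implement differentiation and backpropagation by hand, and then use PyTorch's Autograd.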

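### Implementing differentiation and backpropagation with PyTorch

Here we implement a simple two-layer neural network for a regression problem. The loss function is the L2 loss, summed over all elements of the output:

$$
L2\_loss = \sum (y_{pred}-y)^2
$$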
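
As a small illustration of the loss, the sketch below computes the summed squared error by hand and compares it with `torch.nn.functional.mse_loss` using `reduction='sum'`. The tensor shapes and the variable names `loss_manual` and `loss_builtin` are assumptions made for this example only; the training code in this notebook sums the squared residuals in the same way.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical small shapes chosen only for illustration
y_pred = torch.randn(4, 3)
y = torch.randn(4, 3)

# Summed squared error, written out by hand
loss_manual = torch.square(y_pred - y).sum()

# The same quantity via the built-in MSE loss with sum reduction
loss_builtin = F.mse_loss(y_pred, y, reduction='sum')

print(loss_manual.item(), loss_builtin.item())
```

The two values should agree up to floating-point rounding.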

The two-layer neural network is defined as
$$
y_{pred} = \mathrm{ReLU}(XW_1)W_2
$$
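
The gradients needed for backpropagation can be worked out from this definition with the chain rule. The derivation below is a sketch that assumes the loss is the sum of squared residuals over all elements, as in the training code that follows. The symbols $H$, $\delta_1$ and $\delta_2$ are notation introduced here; $\delta_1$ and $\delta_2$ correspond to `delta1` and `delta2` in that code, $\odot$ is the element-wise product and $\mathbb{1}[\cdot]$ is the indicator function.

$$
\begin{aligned}
H &= \mathrm{ReLU}(XW_1), \qquad y_{pred} = HW_2, \qquad L = \sum (y_{pred}-y)^2 \\
\delta_2 &= \frac{\partial L}{\partial y_{pred}} = 2\,(y_{pred}-y) \\
\frac{\partial L}{\partial W_2} &= H^{\top}\delta_2 \\
\delta_1 &= \left(\delta_2 W_2^{\top}\right) \odot \mathbb{1}[XW_1 > 0] \\
\frac{\partial L}{\partial W_1} &= X^{\top}\delta_1
\end{aligned}
$$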

In [1]:
import torch
device = torch.device('cpu')

In [7]:
# N: batch size
# D_in: input dimension
# H: hidden dimension
# D_out: output dimension
N, D_in, H, D_out = 64, 1000, 100, 10

# Randomly generate x and y
###<your code>###
x = torch.randn(N,D_in)
y = torch.randn(N,D_out)

# Initialize the weights W1 and W2
###<your code>###
W1 = torch.randn(D_in,H)
W2 = torch.randn(H,D_out)

# Set the learning rate
learning_rate = 1e-6

# Train for 500 epochs
for t in range(500):
  # Forward pass: compute y_pred
  ###<your code>###
  relu_y = torch.relu(torch.mm(x,W1))
  y_pred = relu_y.mm(W2)
  # Compute the loss
  ###<your code>###
  residual = y_pred - y
  loss = torch.square(residual).sum()
  print(t, loss.item())

  # Backward pass: compute the gradients of the loss with respect to W1 and W2
  ###<your code>###
  delta2 = 2.*residual  # dLoss/dy_pred
  W2_grad = torch.mm(relu_y.T,delta2)  # dLoss/dW2 = relu_y^T @ delta2
  delta1 = torch.mm(delta2,W2.T) * (relu_y>0).float()  # ReLU passes gradient only where it was active
  W1_grad = torch.mm(x.T,delta1)  # dLoss/dW1 = x^T @ delta1


  # Update the parameters
  ###<your code>###
  W2 -= learning_rate * W2_grad
  W1 -= learning_rate * W1_grad


0 37318296.0
1 34578100.0
2 32509304.0
3 26739232.0
4 18453788.0
5 10811010.0
6 5968839.5
7 3411758.25
8 2166712.0
9 1533201.625
10 1177782.75
11 951478.5
12 791653.125
13 669899.9375
14 572962.25
15 493829.9375
16 428089.125
17 372949.71875
18 326344.21875
19 286697.5
20 252788.34375
21 223637.578125
22 198483.203125
23 176780.515625
24 157898.8125
25 141394.59375
26 126911.8125
27 114164.2578125
28 102900.9296875
29 92933.5390625
30 84088.859375
31 76221.1796875
32 69211.359375
33 62945.03125
34 57331.34375
35 52289.82421875
36 47757.66015625
37 43674.9921875
38 39995.36328125
39 36676.59765625
40 33672.2578125
41 30949.060546875
42 28476.416015625
43 26228.703125
44 24181.9921875
45 22317.26953125
46 20616.419921875
47 19061.078125
48 17638.25
49 16335.986328125
50 15142.302734375
51 14046.46484375
52 13040.0732421875
53 12116.05859375
54 11265.435546875
55 10482.1943359375
56 9759.94140625
57 9093.7666015625
58 8478.6611328125
59 7910.494140625
60 7385.01416015625
61 6898.620117187
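
One way to sanity-check hand-derived gradients like the ones above is to compare them with the gradients PyTorch's autograd computes for the same forward pass. The sketch below does this on a smaller problem; the reduced sizes, the fixed seed and the use of `float64` are assumptions made for this illustration, not part of the assignment.

```python
import torch

torch.manual_seed(0)

# Small hypothetical sizes, double precision for a tight comparison
N, D_in, H, D_out = 8, 20, 10, 5
x = torch.randn(N, D_in, dtype=torch.float64)
y = torch.randn(N, D_out, dtype=torch.float64)
W1 = torch.randn(D_in, H, dtype=torch.float64, requires_grad=True)
W2 = torch.randn(H, D_out, dtype=torch.float64, requires_grad=True)

# Forward pass and loss, tracked by autograd
relu_y = torch.relu(x.mm(W1))
y_pred = relu_y.mm(W2)
residual = y_pred - y
loss = torch.square(residual).sum()
loss.backward()

# Hand-derived gradients, computed without graph tracking
with torch.no_grad():
    delta2 = 2.0 * residual
    W2_grad = relu_y.T.mm(delta2)
    delta1 = delta2.mm(W2.T) * (relu_y > 0).to(delta2.dtype)
    W1_grad = x.T.mm(delta1)

# Both comparisons should hold if the manual derivation is correct
print(torch.allclose(W1_grad, W1.grad), torch.allclose(W2_grad, W2.grad))
```

If either comparison fails, the manual backward pass is the first place to look for the mistake.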

### Using PyTorch's Autograd

In [4]:
import torch
device = torch.device('cpu')

In [6]:
# N: batch size
# D_in: input dimension
# H: hidden dimension
# D_out: output dimension
N, D_in, H, D_out = 64, 1000, 100, 10

# Randomly generate x and y
###<your code>###
x = torch.randn(N,D_in)
y = torch.randn(N,D_out)

# Initialize the weights W1 and W2; requires_grad=True tells autograd to track them
###<your code>###
W1 = torch.randn(D_in,H, requires_grad=True)
W2 = torch.randn(H,D_out, requires_grad=True)

# Set the learning rate
learning_rate = 1e-6

# Train for 500 epochs
for t in range(500):
  # Forward pass: compute y_pred
  ###<your code>###
  relu_out = torch.relu(x.mm(W1))
  y_pred = relu_out.mm(W2)

  # Compute the loss
  ###<your code>###
  residual = y_pred - y
  loss = torch.square(residual).sum()
  print(t, loss.item())

  # Backward pass: compute the gradients of the loss with respect to W1 and W2
  ###<your code>###
  loss.backward()

  # Parameter update: the update arithmetic itself should not be recorded for differentiation, so wrap it in torch.no_grad()
  with torch.no_grad():
    # Update the parameters W1 and W2
    ###<your code>###
    W1 -= learning_rate * W1.grad
    W2 -= learning_rate * W2.grad

    # Clear the stored gradients (the parameters have already been updated)
    W1.grad.zero_()
    W2.grad.zero_()

0 32671160.0
1 28789500.0
2 26246660.0
3 21847192.0
4 15993009.0
5 10323800.0
6 6240911.5
7 3755246.5
8 2381314.5
9 1627500.25
10 1197052.75
11 931876.8125
12 754222.9375
13 626089.625
14 528181.5625
15 450477.59375
16 387213.0625
17 334865.1875
18 291030.28125
19 254039.28125
20 222570.390625
21 195732.828125
22 172716.53125
23 152878.703125
24 135711.5625
25 120786.703125
26 107775.7421875
27 96403.8984375
28 86431.5
29 77659.453125
30 69924.4765625
31 63084.23828125
32 57024.5234375
33 51638.14453125
34 46841.37890625
35 42559.3984375
36 38730.61328125
37 35299.9140625
38 32226.205078125
39 29463.533203125
40 26972.41015625
41 24725.1796875
42 22694.42578125
43 20855.3515625
44 19187.880859375
45 17674.982421875
46 16298.80859375
47 15045.3876953125
48 13902.53125
49 12860.001953125
50 11906.197265625
51 11033.375
52 10234.681640625
53 9502.0302734375
54 8828.6044921875
55 8209.1640625
56 7639.3046875
57 7114.53564453125
58 6630.658203125
59 6183.73828125
60 5770.73779296875
61 5388
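
The training loop clears `W1.grad` and `W2.grad` after every update because `backward()` adds new gradients to whatever is already stored in `.grad` instead of overwriting it. The minimal sketch below shows this accumulation on a single scalar; the tensor `w` and the names `first`, `accumulated` and `cleared` are hypothetical and exist only for this illustration.

```python
import torch

w = torch.tensor(2.0, requires_grad=True)

# First backward pass: d(w^2)/dw = 2w = 4
(w * w).backward()
first = w.grad.item()

# Second backward pass without zeroing: the new gradient is added to the old one
(w * w).backward()
accumulated = w.grad.item()

# Reset in place, as the training loop does after each update
w.grad.zero_()
cleared = w.grad.item()

print(first, accumulated, cleared)  # 4.0 8.0 0.0
```

Without the call to `zero_()`, each epoch would step along the sum of all gradients computed so far rather than the current one.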