# 14 Perceptron

### Single-Layer Perceptron
<img src='./images/13.png' style='zoom:50%'>

$E = \dfrac{1}{2}\left(O_0^1 - t\right)^2$  
$\dfrac{\partial E}{\partial w_{j0}} = \left(O_0^1 - t\right) \cdot O_0^1 \cdot \left(1 - O_0^1\right) \cdot x_j^0$

In [1]:
import torch
import torch.nn.functional as F

In [2]:
x = torch.randn(1, 10)                      # one sample with 10 input features
w = torch.randn(1, 10, requires_grad=True)  # weights of the single output node

In [3]:
o = torch.sigmoid(x @ w.t())
o.shape

torch.Size([1, 1])

In [4]:
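# F.mse_loss expects (input, target); the arguments are swapped here,
# which does not change the value because the squared error is symmetric
loss = F.mse_loss(torch.ones(1,1),o)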
loss.shape

torch.Size([])

In [5]:
loss.backward()

In [6]:
w.grad

tensor([[ 0.2691,  0.0391,  0.0429,  0.1579,  0.0942,  0.3219, -0.0146,  0.2591,
          0.0271, -0.1535]])
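
The gradient formula for the single-layer perceptron assumes a sigmoid activation and a squared error with the factor $\frac{1}{2}$. `F.mse_loss` averages the squared error without that factor, so for a single output its gradient is twice the value given by the formula. The following is a minimal sketch that checks the formula against autograd. The hand-written loss `E`, the fixed seed, and the names `t`, `manual_grad`, and `grads_match` are illustrative assumptions and are not part of the original cells.

```python
import torch

torch.manual_seed(0)  # assumption: fixed seed, only for reproducibility

x = torch.randn(1, 10)
w = torch.randn(1, 10, requires_grad=True)
t = torch.ones(1, 1)  # target value

o = torch.sigmoid(x @ w.t())       # output of the single node, shape (1, 1)
E = 0.5 * (o - t).pow(2).sum()     # squared error with the 1/2 factor
E.backward()

# Closed-form gradient: (O - t) * O * (1 - O) * x_j
manual_grad = ((o - t) * o * (1 - o)).detach() * x

grads_match = torch.allclose(w.grad, manual_grad, atol=1e-6)
print(grads_match)
```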

### Multi-Output Perceptron
<img src='./images/14.png' style='zoom:50%'>

$E = \dfrac{1}{2} \sum_k \left(O_k^1 - t_k\right)^2$  
$\dfrac{\partial E}{\partial w_{jk}} = \left(O_k^1 - t_k\right) \cdot O_k^1 \cdot \left(1 - O_k^1\right) \cdot x_j^0$

In [7]:
x = torch.randn(1, 10)                      # one sample with 10 input features
w = torch.randn(2, 10, requires_grad=True)  # one row of weights per output node

In [8]:
o = torch.sigmoid(x @ w.t())
o.shape

torch.Size([1, 2])

In [9]:
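# F.mse_loss expects (input, target); the arguments are swapped here,
# which does not change the value because the squared error is symmetric
loss = F.mse_loss(torch.ones(1,2), o)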
loss

tensor(0.9879, grad_fn=<MeanBackward0>)

In [10]:
loss.backward()

In [11]:
w.grad

tensor([[ 4.7849e-04, -1.1851e-03, -5.4842e-04,  1.3553e-08,  2.2871e-04,
          6.2080e-05,  2.5295e-04, -1.7584e-06, -3.1818e-05,  4.2797e-05],
        [ 1.0083e-02, -2.4974e-02, -1.1557e-02,  2.8562e-07,  4.8196e-03,
          1.3082e-03,  5.3305e-03, -3.7054e-05, -6.7050e-04,  9.0188e-04]])
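
With exactly two outputs, the default `F.mse_loss` divides the summed squared error by 2, so it coincides with $E = \frac{1}{2}\sum_k (O_k - t_k)^2$ and `w.grad` can be compared with the closed-form gradient directly. The sketch below does that comparison. Note that the formula indexes the weight as $w_{jk}$ (input $j$, output $k$), while the code stores it as `w[k, j]`. The fixed seed and the names `t`, `delta`, `manual_grad`, and `grads_match` are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)  # assumption: fixed seed, only for reproducibility

x = torch.randn(1, 10)
w = torch.randn(2, 10, requires_grad=True)
t = torch.ones(1, 2)  # one target per output node

o = torch.sigmoid(x @ w.t())   # shape (1, 2)
loss = F.mse_loss(o, t)        # mean over 2 elements == 1/2 * sum of squares
loss.backward()

# delta_k = (O_k - t_k) * O_k * (1 - O_k), shape (1, 2)
delta = ((o - t) * o * (1 - o)).detach()
# Row k of the gradient is delta_k * x, giving shape (2, 10)
manual_grad = delta.t() @ x

grads_match = torch.allclose(w.grad, manual_grad, atol=1e-6)
print(grads_match)
```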

## Chain Rule

In [12]:
import torch
from torch import autograd

In [13]:
x = torch.tensor(1.)
w1 = torch.tensor(2., requires_grad=True)
b1 = torch.tensor(1.)
w2 = torch.tensor(2., requires_grad=True)
b2 = torch.tensor(1.)

In [14]:
y1 = x * w1 + b1
y2 = y1 * w2 + b2

In [15]:
# retain_graph=True keeps the computation graph alive so it can be differentiated again
dy2_dy1 = autograd.grad(y2, [y1], retain_graph=True)[0]
dy1_dw1 = autograd.grad(y1, [w1], retain_graph=True)[0]
dy2_dw1 = autograd.grad(y2, [w1], retain_graph=True)[0]

In [16]:
dy2_dy1 * dy1_dw1

tensor(2.)

In [17]:
dy2_dw1

tensor(2.)
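
The same result follows by hand: $\frac{\partial y_2}{\partial y_1} = w_2 = 2$ and $\frac{\partial y_1}{\partial w_1} = x = 1$, so $\frac{\partial y_2}{\partial w_1} = w_2 \cdot x = 2$. As a rough sketch, a single `backward()` call applies the chain rule to every leaf tensor that requires a gradient. The names `expected_dw1` and `expected_dw2` are illustrative and do not appear in the original cells.

```python
import torch

x = torch.tensor(1.)
w1 = torch.tensor(2., requires_grad=True)
b1 = torch.tensor(1.)
w2 = torch.tensor(2., requires_grad=True)
b2 = torch.tensor(1.)

y1 = x * w1 + b1    # 1 * 2 + 1 = 3
y2 = y1 * w2 + b2   # 3 * 2 + 1 = 7

# One backward pass fills .grad for both w1 and w2
y2.backward()

expected_dw1 = (w2 * x).detach()  # dy2/dw1 = dy2/dy1 * dy1/dw1 = w2 * x
expected_dw2 = y1.detach()        # dy2/dw2 = y1

print(w1.grad, w2.grad)
```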

### Multi-Layer Perceptron

$\dfrac{\partial E}{\partial W_{ij}} = O_j \cdot \left(1 - O_j\right) \cdot O_i \sum_{k \in K} \left(O_k - t_k\right) \cdot O_k \cdot \left(1 - O_k\right) \cdot W_{jk}$
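
The hidden-layer gradient can be checked numerically in the same way as the earlier cells. The sketch below assumes a two-layer network with sigmoid activations in both layers and the loss $E = \frac{1}{2}\sum_k (O_k - t_k)^2$. The layer sizes, the seed, and the names `delta_k`, `delta_j`, `manual_grad`, and `grads_match` are hypothetical choices made for illustration.

```python
import torch

torch.manual_seed(0)  # assumption: fixed seed, only for reproducibility

n_in, n_hidden, n_out = 4, 3, 2  # hypothetical layer sizes

O_i = torch.randn(n_in)                                  # input layer activations
W_ij = torch.randn(n_in, n_hidden, requires_grad=True)   # input -> hidden weights
W_jk = torch.randn(n_hidden, n_out, requires_grad=True)  # hidden -> output weights
t = torch.ones(n_out)                                    # targets

O_j = torch.sigmoid(O_i @ W_ij)   # hidden activations, shape (n_hidden,)
O_k = torch.sigmoid(O_j @ W_jk)   # output activations, shape (n_out,)
E = 0.5 * (O_k - t).pow(2).sum()
E.backward()

with torch.no_grad():
    # Output-layer term: (O_k - t_k) * O_k * (1 - O_k)
    delta_k = (O_k - t) * O_k * (1 - O_k)
    # Hidden-layer term: O_j * (1 - O_j) * sum_k delta_k * W_jk
    delta_j = O_j * (1 - O_j) * (W_jk @ delta_k)
    # dE/dW_ij = O_i * delta_j for every (i, j) pair
    manual_grad = torch.outer(O_i, delta_j)

grads_match = torch.allclose(W_ij.grad, manual_grad, atol=1e-6)
print(grads_match)
```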