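# This example is a binary classification problem, solved with a neural network that has one hidden layer of four neurons
## Input
x1, x2
## Output
y = x1 + x2 >= 0 ? 1 : 0;
## Implementation
Define a custom module by subclassing `torch.nn.Module` and defining the `forward` function.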

In [1]:
import torch

## Generating the data
Use `torch.randn()` to generate a random tensor drawn from the standard normal distribution, with size $1000\times2$.

In [2]:
x = torch.randn(1000, 2)
x

tensor([[-0.0686,  0.0731],
        [-0.1500, -0.4564],
        [-0.0993, -0.3525],
        ...,
        [ 0.1439,  1.7315],
        [ 0.0517, -0.5053],
        [ 0.4632, -1.8610]])
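
Because `torch.randn()` is unseeded here, the values printed above change from run to run. A minimal sketch, assuming reproducible data is wanted; the seed value `0` and the names `a` and `b` are illustrative and not part of the original notebook:

```python
import torch

# Seeding the global generator makes the next random draws repeatable.
torch.manual_seed(0)
a = torch.randn(1000, 2)

# Re-seeding with the same value reproduces exactly the same tensor.
torch.manual_seed(0)
b = torch.randn(1000, 2)

identical = torch.equal(a, b)   # True: same seed, same draws
sample_mean = a.mean().item()   # close to 0 for a standard normal
sample_std = a.std().item()     # close to 1 for a standard normal
```

The sample mean and standard deviation are only approximately 0 and 1, since they are estimated from 2000 values.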

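Use `torch.sum(input, dim, keepdim=False, dtype=None)` to generate the labels. Its parameters are:
* `input`: the tensor to sum
* `dim`: the dimension to sum over
* `keepdim`: defaults to False; if True, the output has the same number of dimensions as `input` (the summed dimension is kept with size 1)

The labels can also be generated with `y[x.sum(dim=1) >= 0] = 1`.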
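
To make the effect of `keepdim` concrete, here is a small sketch on a hand-written input. The names `x_demo`, `y_demo` and `y_alt` are illustrative only; the last line shows an alternative way to build the labels that is an assumption of this sketch, not the notebook's method:

```python
import torch

# A tiny input whose row sums are easy to check by eye: -1, 1, 2.
x_demo = torch.tensor([[1.0, -2.0],
                       [0.5, 0.5],
                       [-1.0, 3.0]])

row_sum = torch.sum(x_demo, dim=1)                      # shape (3,)
row_sum_keep = torch.sum(x_demo, dim=1, keepdim=True)   # shape (3, 1)

# A 1-D boolean mask of length 3 selects whole rows of a (3, 1) tensor,
# which is why the notebook can index `y` with the unreduced-shape mask.
y_demo = torch.zeros(3, 1)
y_demo[row_sum >= 0] = 1

# Alternative without in-place indexing: with keepdim=True the comparison
# already has the label shape (3, 1).
y_alt = (row_sum_keep >= 0).float()
```

Both constructions give the same `(3, 1)` label tensor.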

In [3]:
y = torch.zeros(1000, 1)
y[torch.sum(x, dim=1) >= 0] = 1
# y[x.sum(dim=1) >= 0] = 1
y

tensor([[1.],
        [0.],
        [0.],
        [0.],
        [1.],
        [1.],
        [1.],
        [1.],
        [1.],
        [1.],
        [1.],
        [0.],
        [1.],
        [0.],
        [0.],
        [1.],
        [0.],
        [1.],
        [1.],
        [1.],
        [1.],
        [1.],
        [1.],
        [0.],
        [1.],
        [0.],
        [1.],
        [1.],
        [1.],
        [1.],
        [0.],
        [0.],
        [0.],
        [1.],
        [1.],
        [1.],
        [0.],
        [0.],
        [1.],
        [1.],
        [0.],
        [1.],
        [0.],
        [0.],
        [1.],
        [0.],
        [0.],
        [0.],
        [0.],
        [1.],
        [1.],
        [1.],
        [0.],
        [0.],
        [0.],
        [1.],
        [1.],
        [0.],
        [0.],
        [0.],
        [1.],
        [1.],
        [0.],
        [0.],
        [1.],
        [1.],
        [1.],
        [1.],
        [1.],
        [0.],
        [1.],
      

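## Building and training the model
### Building the network
Define a custom module by subclassing `torch.nn.Module` and defining the `forward` function.

The same module (for example, a hidden layer) can also be reused through control flow (a loop):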
```python
class TwoLayerNet(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        super(TwoLayerNet, self).__init__()
        self.linear1 = torch.nn.Linear(D_in, H)
        self.h = torch.nn.Linear(H, H)
        self.linear2 = torch.nn.Linear(H, D_out)
        self.relu = torch.nn.ReLU()
        self.sigmoid = torch.nn.Sigmoid()
        
    def forward(self, x):
        relu = self.relu(self.linear1(x))
        for _ in range(3):
            relu = self.relu(self.h(relu))
        y_pred = self.sigmoid(self.linear2(relu))
        return y_pred
```
The parameters of a given layer can be inspected with `list(model.parameters())[index]`. A module that is reused shares its weights across all of its uses.
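
The weight sharing can be checked by counting parameters. The sketch below mirrors the looped network above under a hypothetical name, `SharedHiddenNet`, with an illustrative `repeats` argument; neither is part of the original notebook:

```python
import torch

class SharedHiddenNet(torch.nn.Module):
    """Hypothetical variant of the looped network, for illustration only."""

    def __init__(self, D_in, H, D_out, repeats=3):
        super().__init__()
        self.linear1 = torch.nn.Linear(D_in, H)
        self.h = torch.nn.Linear(H, H)        # applied `repeats` times
        self.linear2 = torch.nn.Linear(H, D_out)
        self.repeats = repeats

    def forward(self, x):
        out = torch.relu(self.linear1(x))
        for _ in range(self.repeats):
            out = torch.relu(self.h(out))     # same weights on every pass
        return torch.sigmoid(self.linear2(out))

net = SharedHiddenNet(2, 4, 1)
params = list(net.parameters())

# Weight and bias for linear1, h, linear2: six tensors in total,
# no matter how many times `h` is applied in forward().
shapes = [tuple(p.shape) for p in params]
n_weights = sum(p.numel() for p in params)

out = net(torch.randn(5, 2))   # forward pass still works, shape (5, 1)
```

Here `params[2]` and `params[3]` are the weight and bias of the shared layer `h`. They appear once in the list even though the layer is applied three times.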

In [4]:
class TwoLayerNet(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        super(TwoLayerNet, self).__init__()
        self.linear1 = torch.nn.Linear(D_in, H)
        self.linear2 = torch.nn.Linear(H, D_out)
        self.relu = torch.nn.ReLU()
        self.sigmoid = torch.nn.Sigmoid()
        
    def forward(self, x):
        relu = self.relu(self.linear1(x))
        y_pred = self.sigmoid(self.linear2(relu))
        return y_pred

### Loss function
The loss function is `torch.nn.BCELoss()`, i.e. Binary Cross Entropy Loss, which suits binary classification with a Sigmoid output activation.
* `input`: Tensor of arbitrary shape, with values in [0, 1]
* `target`: Tensor of the same shape as input
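
As a sketch of what `torch.nn.BCELoss()` computes with its default mean reduction, the loss can be reproduced by hand from the formula $-\big(t\log p + (1-t)\log(1-p)\big)$ averaged over all elements. The tensors `p` and `t` are made-up values for illustration:

```python
import torch

p = torch.tensor([[0.9], [0.2], [0.6]])   # predicted probabilities (Sigmoid outputs)
t = torch.tensor([[1.0], [0.0], [1.0]])   # targets, same shape as the input

loss_builtin = torch.nn.BCELoss()(p, t)

# Element-wise binary cross entropy, then the mean over all elements.
loss_manual = -(t * torch.log(p) + (1 - t) * torch.log(1 - p)).mean()
```

The two values agree up to floating-point rounding.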

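### Optimizer
Use `torch.optim.Adam()` to update the weights from their gradients. Its most important parameters are:
* `params`: the network weights, obtained with `model.parameters()`; this is an iterable
* `lr`: the learning rate, 1e-3 by default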
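
The two parameters can be inspected on a constructed optimizer through its `param_groups` attribute. In this sketch a single `torch.nn.Linear` layer stands in for the model, and `lr` is deliberately left out to show the default:

```python
import torch

layer = torch.nn.Linear(2, 4)                 # stand-in for a model
opt = torch.optim.Adam(layer.parameters())    # lr omitted on purpose

group = opt.param_groups[0]                   # a single group by default
default_lr = group["lr"]                      # learning rate in use
n_tensors = len(group["params"])              # weight and bias of `layer`
```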

In [5]:
model = TwoLayerNet(2, 4, 1)
learning_rate = 1e-3
loss_fn = torch.nn.BCELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
print(model)

TwoLayerNet(
  (linear1): Linear(in_features=2, out_features=4, bias=True)
  (linear2): Linear(in_features=4, out_features=1, bias=True)
  (relu): ReLU()
  (sigmoid): Sigmoid()
)


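### Automatic differentiation
* `optimizer.zero_grad()`: before backpropagation, clears the gradients of all tensors the optimizer will update (these tensors are the model's learnable weights)
* `loss.backward()`: backpropagation; computes the gradient of the loss with respect to the model parameters
* `optimizer.step()`: calls the optimizer's step function, which updates all of its parameters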
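
Gradients need clearing because `backward()` adds to any gradient already stored on a tensor. A minimal sketch on a single hand-made tensor `w` (an illustrative name), where the gradient is cleared by hand instead of through an optimizer:

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)

(w * w).sum().backward()        # d/dw sum(w^2) = 2w
first = w.grad.clone()          # [2., 4.]

(w * w).sum().backward()        # second backward without clearing
accumulated = w.grad.clone()    # [4., 8.]: the new gradient was added

w.grad.zero_()                  # clear the stored gradient by hand
(w * w).sum().backward()
after_reset = w.grad.clone()    # [2., 4.] again
```

Without the clearing step, every training iteration would update the weights with the sum of all gradients computed so far.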

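The three calls `optimizer.zero_grad()`, `loss.backward()` and `optimizer.step()` can also be replaced by the following code:
```python
# Clear the gradients before backpropagation
model.zero_grad()
loss.backward()
# Update the weights with gradient descent.
# Each parameter is a tensor, so its value and gradient can be accessed as before
with torch.no_grad():
    for param in model.parameters():
        param -= learning_rate * param.grad
```
This version uses no optimizer; the weights are updated by hand inside the `torch.no_grad()` context. It performs plain gradient descent, so it is not numerically equivalent to the Adam update used in this example.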
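
As a hedged check of that last point, the hand-written update should match one step of `torch.optim.SGD` without momentum, not one step of Adam. The sketch below compares the two on a tiny stand-in model; the names `model_a`, `model_b`, `x_demo`, `y_demo` and the learning rate `0.1` are assumptions made for illustration:

```python
import copy
import torch

torch.manual_seed(0)
x_demo = torch.randn(8, 2)
y_demo = (x_demo.sum(dim=1, keepdim=True) >= 0).float()

model_a = torch.nn.Sequential(torch.nn.Linear(2, 1), torch.nn.Sigmoid())
model_b = copy.deepcopy(model_a)     # identical initial weights
lr = 0.1
loss_fn = torch.nn.BCELoss()

# (a) Manual update, as in the snippet above.
model_a.zero_grad()
loss_fn(model_a(x_demo), y_demo).backward()
with torch.no_grad():
    for param in model_a.parameters():
        param -= lr * param.grad

# (b) One step of plain SGD through the optimizer interface.
sgd = torch.optim.SGD(model_b.parameters(), lr=lr)
sgd.zero_grad()
loss_fn(model_b(x_demo), y_demo).backward()
sgd.step()

same = all(torch.allclose(pa, pb)
           for pa, pb in zip(model_a.parameters(), model_b.parameters()))
```

`same` being true indicates that the manual loop and `torch.optim.SGD` produce the same weights after one step.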

In [6]:
for t in range(1000):
    y_pred = model(x)
    loss = loss_fn(y_pred, y)
    print(t, loss.item())
    
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

0 0.6625894904136658
1 0.6619126200675964
2 0.6612372398376465
3 0.6605626344680786
4 0.6598884463310242
5 0.6592150926589966
6 0.6585428714752197
7 0.6578705906867981
8 0.6572006344795227
9 0.6565296649932861
10 0.6558598875999451
11 0.6551910042762756
12 0.6545232534408569
13 0.653856635093689
14 0.6531898975372314
15 0.6525243520736694
16 0.6518596410751343
17 0.6511958241462708
18 0.6505322456359863
19 0.6498691439628601
20 0.649207353591919
21 0.6485456824302673
22 0.6478842496871948
23 0.6472238898277283
24 0.6465636491775513
25 0.6459033489227295
26 0.6452438235282898
27 0.6445841789245605
28 0.6439260244369507
29 0.6432669162750244
30 0.6426089406013489
31 0.6419514417648315
32 0.641294002532959
33 0.6406360864639282
34 0.6399791836738586
35 0.6393217444419861
36 0.638664722442627
37 0.6380073428153992
38 0.6373507976531982
39 0.6366932392120361
40 0.6360355615615845
41 0.6353772878646851
42 0.6347181797027588
43 0.6340593695640564
44 0.6333993673324585
45 0.6327386498451233
46

402 0.3330024182796478
403 0.3323242962360382
404 0.33164817094802856
405 0.33097413182258606
406 0.330301433801651
407 0.3296310305595398
408 0.328962504863739
409 0.3282955288887024
410 0.3276306986808777
411 0.32696759700775146
412 0.32630619406700134
413 0.3256465196609497
414 0.3249887228012085
415 0.3243326246738434
416 0.323678582906723
417 0.32302629947662354
418 0.322376012802124
419 0.3217269480228424
420 0.32108035683631897
421 0.3204355537891388
422 0.31979236006736755
423 0.3191511631011963
424 0.31851163506507874
425 0.3178738057613373
426 0.3172377347946167
427 0.31660372018814087
428 0.31597089767456055
429 0.31534063816070557
430 0.3147115409374237
431 0.31408482789993286
432 0.3134599030017853
433 0.3128368854522705
434 0.3122156262397766
435 0.31159624457359314
436 0.3109778165817261
437 0.3103618919849396
438 0.3097473680973053
439 0.3091350495815277
440 0.308523565530777
441 0.3079145550727844
442 0.3073074221611023
443 0.30670166015625
444 0.3060980439186096
445 0

820 0.1687399446964264
821 0.1685335338115692
822 0.1683277189731598
823 0.16812242567539215
824 0.16791750490665436
825 0.16771326959133148
826 0.16750919818878174
827 0.16730597615242004
828 0.16710323095321655
829 0.16690117120742798
830 0.1666993945837021
831 0.1664981245994568
832 0.16629742085933685
833 0.1660972237586975
834 0.16589762270450592
835 0.16569837927818298
836 0.1654997318983078
837 0.1653016060590744
838 0.1651039570569992
839 0.16490687429904938
840 0.16471010446548462
841 0.1645139455795288
842 0.16431806981563568
843 0.16412301361560822
844 0.16392827033996582
845 0.1637340784072876
846 0.16354015469551086
847 0.16334688663482666
848 0.16315411031246185
849 0.16296173632144928
850 0.16276976466178894
851 0.1625782996416092
852 0.16238726675510406
853 0.1621967852115631
854 0.16200664639472961
855 0.1618170440196991
856 0.16162800788879395
857 0.16143926978111267
858 0.16125093400478363
859 0.1610630452632904
860 0.1608758419752121
861 0.16068878769874573
862 0.16

## Creating test samples and testing

In [7]:
x_test = torch.randn(100, 2)
x_test

tensor([[-2.1121, -0.5447],
        [ 1.8574,  1.0035],
        [-0.8979,  0.8991],
        [-0.0488, -3.0063],
        [-0.2345, -1.1774],
        [-0.8026, -0.1944],
        [-0.5708, -0.3262],
        [ 0.3987,  0.4420],
        [ 1.1547,  1.0912],
        [-0.1383, -1.7309],
        [ 0.2125,  0.6295],
        [ 0.7286,  0.0242],
        [ 0.2068, -0.2562],
        [ 1.0151, -0.1818],
        [ 0.5681,  0.1236],
        [-0.8858, -1.4594],
        [-0.5313,  0.4251],
        [-0.0476,  0.4525],
        [-0.4573, -0.7203],
        [ 0.4573,  0.5749],
        [ 0.1977,  0.3708],
        [-0.2543,  1.2179],
        [ 0.6553, -0.5920],
        [ 0.3457, -0.7485],
        [-1.7984,  0.0374],
        [-0.2621,  0.7559],
        [-0.4186,  0.1934],
        [-1.1844,  1.4281],
        [-1.1761, -0.1995],
        [ 0.3758,  0.0328],
        [-0.8888,  0.8787],
        [-1.5340,  0.1700],
        [-0.6640,  0.7141],
        [-1.1452,  0.3050],
        [-1.4752, -1.7205],
        [ 0.7598,  2

In [8]:
y_test = torch.zeros(100, 1)
y_test[torch.sum(x_test, dim=1) >= 0] = 1
# y_test[x_test.sum(dim=1) >= 0] = 1
y_test

tensor([[0.],
        [1.],
        [1.],
        [0.],
        [0.],
        [0.],
        [0.],
        [1.],
        [1.],
        [0.],
        [1.],
        [1.],
        [0.],
        [1.],
        [1.],
        [0.],
        [0.],
        [1.],
        [0.],
        [1.],
        [1.],
        [1.],
        [1.],
        [0.],
        [0.],
        [1.],
        [0.],
        [1.],
        [0.],
        [1.],
        [0.],
        [0.],
        [1.],
        [0.],
        [0.],
        [1.],
        [0.],
        [0.],
        [1.],
        [0.],
        [1.],
        [1.],
        [0.],
        [0.],
        [0.],
        [1.],
        [0.],
        [0.],
        [1.],
        [0.],
        [1.],
        [0.],
        [1.],
        [1.],
        [0.],
        [0.],
        [0.],
        [0.],
        [1.],
        [0.],
        [1.],
        [1.],
        [0.],
        [1.],
        [1.],
        [1.],
        [1.],
        [0.],
        [0.],
        [0.],
        [1.],
      

In [9]:
y_test_pred = torch.zeros(100, 1)
y_test_pred[model(x_test) >= .5] = 1
y_test_pred

tensor([[0.],
        [1.],
        [1.],
        [0.],
        [0.],
        [0.],
        [0.],
        [1.],
        [1.],
        [0.],
        [1.],
        [1.],
        [1.],
        [1.],
        [1.],
        [0.],
        [0.],
        [1.],
        [0.],
        [1.],
        [1.],
        [1.],
        [1.],
        [0.],
        [0.],
        [1.],
        [0.],
        [1.],
        [0.],
        [1.],
        [1.],
        [0.],
        [1.],
        [0.],
        [0.],
        [1.],
        [0.],
        [0.],
        [1.],
        [0.],
        [1.],
        [1.],
        [0.],
        [0.],
        [0.],
        [1.],
        [0.],
        [0.],
        [1.],
        [0.],
        [1.],
        [0.],
        [1.],
        [1.],
        [0.],
        [0.],
        [0.],
        [0.],
        [1.],
        [0.],
        [1.],
        [1.],
        [0.],
        [1.],
        [1.],
        [1.],
        [1.],
        [0.],
        [0.],
        [0.],
        [1.],
      

In [10]:
result = torch.zeros(100, 1)
result[y_test == y_test_pred] = 1
print('accurate: {}'.format(result.sum().item() / 100))

accurate: 0.98
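
The thresholding and accuracy computation in cells 9 and 10 can be folded into one function. This is a sketch only: `accuracy` and `RowSumSigmoid` are hypothetical names, and `RowSumSigmoid` is a stand-in for the trained model so that the block runs on its own:

```python
import torch

def accuracy(model, x, y, threshold=0.5):
    """Fraction of samples whose thresholded prediction equals the label."""
    model.eval()
    with torch.no_grad():                 # no gradients needed for evaluation
        pred = (model(x) >= threshold).float()
    return (pred == y).float().mean().item()

class RowSumSigmoid(torch.nn.Module):
    """Stand-in model: sigmoid(x1 + x2), above 0.5 when x1 + x2 > 0."""

    def forward(self, x):
        return torch.sigmoid(x.sum(dim=1, keepdim=True))

x_demo = torch.tensor([[1.0, 2.0], [-1.0, 0.5], [0.5, 0.25], [-2.0, -1.0]])
y_demo = (x_demo.sum(dim=1, keepdim=True) >= 0).float()
acc = accuracy(RowSumSigmoid(), x_demo, y_demo)
```

With the notebook's own objects, the call would be `accuracy(model, x_test, y_test)`.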
