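# Binary classification with a neural network that has one hidden layer of four neurons
## Input
x1, x2
## Output
y = (x1 + x2 >= 0) ? 1 : 0
## Implementation
Define a custom module that subclasses `torch.nn.Module` and implements a `forward` function.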

In [1]:
import torch

## Generating the data
Use `torch.randn()` to generate a tensor of size $1000\times2$ whose entries are drawn from the standard normal distribution.

In [2]:
x = torch.randn(1000, 2)
x

tensor([[ 0.8834, -0.6287],
        [-0.4028, -1.2012],
        [ 0.1142,  1.8035],
        ...,
        [-0.6573, -0.7571],
        [ 0.3850,  3.3426],
        [-0.3604, -1.3974]])

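Use `torch.sum(input, dim, keepdim=False, *, dtype=None)` to generate the labels. Its parameters are:
* `input`: the tensor to sum
* `dim`: the dimension to sum over
* `keepdim`: defaults to `False`; if `True`, the output keeps the same number of dimensions as `input` (the summed dimension has size 1)

The labels can also be generated with `y[x.sum(dim=1) >= 0] = 1`.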
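To make the effect of `keepdim` concrete, the sketch below compares the output shapes on a small hand-written tensor. The name `x_demo` and its values are illustrative and do not come from the notebook. It also shows that with `keepdim=True` the comparison already has the `(N, 1)` label shape, so no preallocated `zeros` tensor is needed.

```python
import torch

# Illustrative 3x2 input; the values are chosen by hand for this sketch.
x_demo = torch.tensor([[0.5, -1.0],
                       [1.0, 2.0],
                       [-0.25, 0.25]])

row_sum = torch.sum(x_demo, dim=1)                     # shape (3,)
row_sum_keep = torch.sum(x_demo, dim=1, keepdim=True)  # shape (3, 1)

# With keepdim=True the comparison directly yields labels of shape (N, 1).
y_demo = (row_sum_keep >= 0).float()

print(row_sum.shape, row_sum_keep.shape)
print(y_demo)
```

Either approach gives the same labels; the indexing form used in the notebook simply writes them into an existing tensor.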

In [3]:
y = torch.zeros(1000, 1)
y[torch.sum(x, dim=1) >= 0] = 1
# y[x.sum(dim=1) >= 0] = 1
y

tensor([[1.],
        [0.],
        [1.],
        [1.],
        [0.],
        [0.],
        [0.],
        [1.],
        [0.],
        [1.],
        [1.],
        [0.],
        [1.],
        [0.],
        [0.],
        [0.],
        [0.],
        [0.],
        [0.],
        [1.],
        [1.],
        [0.],
        [0.],
        [1.],
        [1.],
        [0.],
        [1.],
        [0.],
        [1.],
        [1.],
        [1.],
        [1.],
        [0.],
        [1.],
        [1.],
        [1.],
        [1.],
        [1.],
        [1.],
        [1.],
        [1.],
        [0.],
        [0.],
        [1.],
        [1.],
        [0.],
        [0.],
        [1.],
        [0.],
        [0.],
        [1.],
        [1.],
        [1.],
        [0.],
        [0.],
        [1.],
        [0.],
        [1.],
        [1.],
        [0.],
        [0.],
        [0.],
        [1.],
        [1.],
        [1.],
        [1.],
        [0.],
        [1.],
        [1.],
        [0.],
        [1.],
      

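## Building and training the model
### Building the network
Define a custom module that subclasses `torch.nn.Module` and implements a `forward` function.

The same module (for example, a hidden layer) can also be reused through control flow such as a loop: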
```python
class TwoLayerNet(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        super(TwoLayerNet, self).__init__()
        self.linear1 = torch.nn.Linear(D_in, H)
        self.h = torch.nn.Linear(H, H)
        self.linear2 = torch.nn.Linear(H, D_out)
        self.relu = torch.nn.ReLU()
        self.sigmoid = torch.nn.Sigmoid()
        
    def forward(self, x):
        relu = self.relu(self.linear1(x))
        for _ in range(3):
            relu = self.relu(self.h(relu))
        y_pred = self.sigmoid(self.linear2(relu))
        return y_pred
```
The parameters of each layer can be inspected with `list(model.parameters())[index]`; a module that is reused shares its weights across every application.
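As a rough check of the weight-sharing claim, the sketch below rebuilds the looped variant under the hypothetical name `SharedHiddenNet` and lists its parameter tensors. Although the hidden layer `self.h` is applied three times in `forward`, it contributes only one weight and one bias.

```python
import torch


class SharedHiddenNet(torch.nn.Module):
    """Looped variant of the network; the class name is hypothetical."""

    def __init__(self, D_in, H, D_out):
        super().__init__()
        self.linear1 = torch.nn.Linear(D_in, H)
        self.h = torch.nn.Linear(H, H)  # applied three times in forward
        self.linear2 = torch.nn.Linear(H, D_out)

    def forward(self, x):
        out = torch.relu(self.linear1(x))
        for _ in range(3):
            out = torch.relu(self.h(out))
        return torch.sigmoid(self.linear2(out))


shared_model = SharedHiddenNet(2, 4, 1)
params = list(shared_model.parameters())

# Three Linear modules give six tensors (one weight and one bias each),
# regardless of how many times self.h is applied.
print(len(params))
for index, p in enumerate(params):
    print(index, tuple(p.shape))
```

Here indices 2 and 3 are the weight and bias of the shared hidden layer.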

In [4]:
class TwoLayerNet(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        super(TwoLayerNet, self).__init__()
        self.linear1 = torch.nn.Linear(D_in, H)
        self.linear2 = torch.nn.Linear(H, D_out)
        self.relu = torch.nn.ReLU()
        self.sigmoid = torch.nn.Sigmoid()
        
    def forward(self, x):
        relu = self.relu(self.linear1(x))
        y_pred = self.sigmoid(self.linear2(relu))
        return y_pred

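### Loss function
The loss function is `torch.nn.BCELoss()`, i.e. binary cross entropy loss. It suits binary classification where the output passes through a Sigmoid activation, because its inputs must be probabilities in $[0, 1]$.
* `input`: Tensor of arbitrary shape
* `target`: Tensor of the same shape as input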
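For reference, the quantity `BCELoss` computes with its default mean reduction can be written out by hand. The sketch below uses plain Python; the helper name `bce` and the sample values are illustrative assumptions, not part of the notebook.

```python
import math


def bce(probs, targets):
    """Mean binary cross entropy: -mean(y*log(p) + (1-y)*log(1-p))."""
    total = 0.0
    for p, y in zip(probs, targets):
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(probs)


# Hand-picked predicted probabilities and labels for illustration.
probs = [0.9, 0.2, 0.6]
targets = [1.0, 0.0, 1.0]
loss_value = bce(probs, targets)
print(loss_value)

# An uninformative prediction of 0.5 costs log(2), about 0.693,
# which is why an untrained model often starts near that value.
print(bce([0.5], [1.0]))
```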

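### Optimizer
Use `torch.optim.Adam()` to update the weights from their gradients. Its main parameters are:
* `params`: the network weights, obtained from `model.parameters()`, which returns an iterable
* `lr`: the learning rate, 1e-3 by default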
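To give a feel for what the `lr` parameter controls, here is a minimal plain-Python sketch of a single Adam update on one scalar parameter. The function name `adam_step` is hypothetical; the default `beta1`, `beta2`, and `eps` values are those documented for `torch.optim.Adam`, and the real optimizer applies the same rule element-wise to tensors.

```python
import math


def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter (illustrative sketch)."""
    m = beta1 * m + (1 - beta1) * grad         # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad  # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)               # bias correction for step t
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# First step (t=1) from param=1.0 with gradient 2.0.
param, m, v = adam_step(param=1.0, grad=2.0, m=0.0, v=0.0, t=1)
print(param)
```

On the first step the update has magnitude close to `lr` whatever the size of the gradient, which is one reason the default of 1e-3 works across many problems.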

In [5]:
model = TwoLayerNet(2, 4, 1)
learning_rate = 1e-3
loss_fn = torch.nn.BCELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
print(model)

TwoLayerNet(
  (linear1): Linear(in_features=2, out_features=4, bias=True)
  (linear2): Linear(in_features=4, out_features=1, bias=True)
  (relu): ReLU()
  (sigmoid): Sigmoid()
)


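### Automatic differentiation
* `optimizer.zero_grad()`: before backpropagation, use the optimizer to zero the gradients of all tensors it updates (the learnable weights of the model)
* `loss.backward()`: backpropagation, which computes the gradient of the loss with respect to the model parameters
* `optimizer.step()`: calls the optimizer's `step` function to update all of its parameters

The three statements above can also be replaced with the following code:
```python
# Zero the gradients before backpropagation
model.zero_grad()
loss.backward()
# Update the weights with gradient descent.
# Each parameter is a tensor, so its value and gradient are accessible as before
with torch.no_grad():
    for param in model.parameters():
        param -= learning_rate * param.grad
```
No optimizer is used here; the weights are updated inside the `torch.no_grad()` context. Note that this is plain gradient descent rather than Adam.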
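The manual loop above performs plain gradient descent, which corresponds to `torch.optim.SGD` with its default settings. The following sketch, assuming a tiny stand-in model and random data (the names `net_manual`, `net_sgd`, `data`, and `target` are illustrative), checks that one manual step and one `SGD` step produce matching weights.

```python
import copy
import torch

torch.manual_seed(0)

# Tiny stand-in model and data; names and sizes are illustrative only.
net_manual = torch.nn.Sequential(torch.nn.Linear(2, 1), torch.nn.Sigmoid())
net_sgd = copy.deepcopy(net_manual)  # identical starting weights
data = torch.randn(8, 2)
target = (data.sum(dim=1, keepdim=True) >= 0).float()
loss_fn = torch.nn.BCELoss()
lr = 1e-2

# Manual update inside torch.no_grad()
net_manual.zero_grad()
loss_fn(net_manual(data), target).backward()
with torch.no_grad():
    for param in net_manual.parameters():
        param -= lr * param.grad

# The same step through torch.optim.SGD
sgd = torch.optim.SGD(net_sgd.parameters(), lr=lr)
sgd.zero_grad()
loss_fn(net_sgd(data), target).backward()
sgd.step()

same = all(
    torch.allclose(a, b)
    for a, b in zip(net_manual.parameters(), net_sgd.parameters())
)
print(same)
```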

In [6]:
for t in range(1000):
    y_pred = model(x)
    loss = loss_fn(y_pred, y)
    print(t, loss.item())
    
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

0 0.7011216878890991
1 0.7003718614578247
2 0.6996227502822876
3 0.6988741159439087
4 0.6981256604194641
5 0.6973768472671509
6 0.6966290473937988
7 0.695881187915802
8 0.6951330304145813
9 0.6943864226341248
10 0.6936401724815369
11 0.6928941011428833
12 0.6921482682228088
13 0.6914020776748657
14 0.6906562447547913
15 0.6899088025093079
16 0.6891614198684692
17 0.6884137392044067
18 0.6876663565635681
19 0.6869195699691772
20 0.6861724257469177
21 0.6854249238967896
22 0.6846781373023987
23 0.6839308142662048
24 0.6831825375556946
25 0.68243408203125
26 0.6816859245300293
27 0.6809367537498474
28 0.6801883578300476
29 0.6794396042823792
30 0.6786895394325256
31 0.6779393553733826
32 0.67718905210495
33 0.67643803358078
34 0.6756865382194519
35 0.6749351024627686
36 0.6741839647293091
37 0.6734322905540466
38 0.6726809740066528
39 0.6719292998313904
40 0.6711776852607727
41 0.6704241037368774
42 0.669671893119812
43 0.6689159274101257
44 0.6681603789329529
45 0.6674028635025024
46 0.6

412 0.31802651286125183
413 0.3172706961631775
414 0.3165174722671509
415 0.31576722860336304
416 0.3150191903114319
417 0.31427374482154846
418 0.3135302662849426
419 0.31278854608535767
420 0.312049925327301
421 0.31131255626678467
422 0.31057819724082947
423 0.3098447620868683
424 0.30911359190940857
425 0.30838480591773987
426 0.3076588809490204
427 0.30693477392196655
428 0.30621209740638733
429 0.30549171566963196
430 0.3047737777233124
431 0.3040580451488495
432 0.3033449351787567
433 0.3026341497898102
434 0.3019256293773651
435 0.3012195825576782
436 0.300516277551651
437 0.2998144030570984
438 0.2991153299808502
439 0.2984186112880707
440 0.2977243661880493
441 0.29703256487846375
442 0.2963434159755707
443 0.2956567704677582
444 0.2949722409248352
445 0.2942904233932495
446 0.29360997676849365
447 0.2929321825504303
448 0.29225724935531616
449 0.29158395528793335
450 0.29091379046440125
451 0.2902452051639557
452 0.28957855701446533
453 0.28891369700431824
454 0.288250863552

872 0.13733457028865814
873 0.137153759598732
874 0.13697347044944763
875 0.1367933601140976
876 0.13661348819732666
877 0.13643446564674377
878 0.13625586032867432
879 0.13607776165008545
880 0.13590015470981598
881 0.1357230395078659
882 0.13554641604423523
883 0.1353701949119568
884 0.13519445061683655
885 0.13501927256584167
886 0.1348445862531662
887 0.1346701681613922
888 0.13449645042419434
889 0.1343231201171875
890 0.13415038585662842
891 0.1339779943227768
892 0.1338060051202774
893 0.13363458216190338
894 0.13346345722675323
895 0.13329289853572845
896 0.13312284648418427
897 0.13295330107212067
898 0.13278429210186005
899 0.1326155811548233
900 0.13244761526584625
901 0.13227994740009308
902 0.13211268186569214
903 0.13194608688354492
904 0.1317797303199768
905 0.13161391019821167
906 0.1314486563205719
907 0.13128365576267242
908 0.13111920654773712
909 0.13095523416996002
910 0.1307917684316635
911 0.1306285858154297
912 0.1304660439491272
913 0.13030393421649933
914 0.13

## Creating test samples and evaluating

In [7]:
x_test = torch.randn(100, 2)
x_test

tensor([[-0.3878,  0.0505],
        [-0.2113,  0.8510],
        [-0.9553,  0.2549],
        [ 0.8998, -0.8956],
        [ 0.4137,  0.9997],
        [ 0.6366,  0.6506],
        [ 1.5470,  0.4194],
        [-0.1384,  1.7171],
        [ 0.0924,  0.5135],
        [-0.3031, -1.0480],
        [ 1.7467, -0.7529],
        [-0.5966,  0.5582],
        [ 1.2856,  2.1306],
        [-0.2376,  0.7875],
        [ 1.5443,  1.3194],
        [-0.5564, -0.9398],
        [-0.2781, -0.7313],
        [ 1.4259,  0.9916],
        [ 0.5230,  1.2651],
        [ 0.0856,  0.0630],
        [ 0.1164, -1.8620],
        [-0.0625,  1.1149],
        [ 0.3281,  0.4835],
        [ 0.4446,  1.6618],
        [-0.5582, -0.6099],
        [ 0.1642,  1.2040],
        [-1.5272,  0.6586],
        [-0.8581,  1.8812],
        [-0.3627,  1.3267],
        [ 0.9115,  0.8063],
        [-0.9719,  1.1970],
        [ 1.0324,  1.0867],
        [ 0.8912,  0.1097],
        [-0.8998, -0.9643],
        [-0.3981, -1.7017],
        [ 1.5249,  0

In [8]:
y_test = torch.zeros(100, 1)
y_test[torch.sum(x_test, dim=1) >= 0] = 1
# y_test[x_test.sum(dim=1) >= 0] = 1
y_test

tensor([[0.],
        [1.],
        [0.],
        [1.],
        [1.],
        [1.],
        [1.],
        [1.],
        [1.],
        [0.],
        [1.],
        [0.],
        [1.],
        [1.],
        [1.],
        [0.],
        [0.],
        [1.],
        [1.],
        [1.],
        [0.],
        [1.],
        [1.],
        [1.],
        [0.],
        [1.],
        [0.],
        [1.],
        [1.],
        [1.],
        [1.],
        [1.],
        [1.],
        [0.],
        [0.],
        [1.],
        [0.],
        [0.],
        [1.],
        [1.],
        [1.],
        [1.],
        [0.],
        [1.],
        [0.],
        [1.],
        [1.],
        [0.],
        [0.],
        [0.],
        [1.],
        [0.],
        [0.],
        [0.],
        [1.],
        [0.],
        [1.],
        [1.],
        [0.],
        [1.],
        [1.],
        [1.],
        [1.],
        [1.],
        [1.],
        [1.],
        [1.],
        [0.],
        [0.],
        [1.],
        [0.],
      

In [9]:
y_test_pred = torch.zeros(100, 1)
y_test_pred[model(x_test) >= .5] = 1
y_test_pred

tensor([[0.],
        [1.],
        [0.],
        [0.],
        [1.],
        [1.],
        [1.],
        [1.],
        [1.],
        [0.],
        [1.],
        [1.],
        [1.],
        [1.],
        [1.],
        [0.],
        [0.],
        [1.],
        [1.],
        [1.],
        [0.],
        [1.],
        [1.],
        [1.],
        [0.],
        [1.],
        [0.],
        [1.],
        [1.],
        [1.],
        [1.],
        [1.],
        [1.],
        [0.],
        [0.],
        [1.],
        [0.],
        [0.],
        [1.],
        [1.],
        [1.],
        [1.],
        [0.],
        [1.],
        [0.],
        [1.],
        [1.],
        [0.],
        [0.],
        [0.],
        [1.],
        [0.],
        [0.],
        [0.],
        [1.],
        [0.],
        [1.],
        [1.],
        [0.],
        [1.],
        [1.],
        [1.],
        [1.],
        [1.],
        [1.],
        [1.],
        [1.],
        [0.],
        [0.],
        [1.],
        [0.],
      

In [10]:
result = torch.zeros(100, 1)
result[y_test == y_test_pred] = 1
print('accuracy: {}'.format(result.sum().item() / 100))

accuracy: 0.98
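The thresholding and accuracy steps can also be written without preallocated `zeros` tensors. The sketch below uses hand-written stand-ins for the model probabilities and true labels (`probs` and `labels` are illustrative, not the notebook's values).

```python
import torch

# Hand-written stand-ins for model(x_test) and y_test.
probs = torch.tensor([[0.9], [0.4], [0.7], [0.2]])
labels = torch.tensor([[1.0], [0.0], [0.0], [0.0]])

# Threshold the probabilities at 0.5 to get hard 0/1 predictions.
preds = (probs >= 0.5).float()

# The mean of the element-wise matches is the accuracy.
accuracy = (preds == labels).float().mean().item()
print(accuracy)
```

When the probabilities come from a real model, evaluating it inside `with torch.no_grad():` avoids building the autograd graph during testing.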
