### Implementing Simple Layers

#### Implementing the Multiplication Layer

In [1]:
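# Implementation of the multiplication layer
class MulLayer(object):
    def __init__(self):
        self.x = None
        self.y = None

    def forward(self, x, y):
        # Cache the inputs; the backward pass needs them
        self.x = x
        self.y = y

        out = self.x * self.y
        return out

    def backward(self, dout):
        # Swap x and y: the gradient w.r.t. x is scaled by y, and vice versa
        dx = dout * self.y
        dy = dout * self.x

        return dx, dy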

![Backpropagation of error](./img/error_backward_apple-1.png)

In [13]:
apple = 100
num = 2
tax = 1.1

mul_apple_layer = MulLayer()
mul_tax_layer = MulLayer()

apple_price = mul_apple_layer.forward(apple,num)
price = mul_tax_layer.forward(apple_price,tax)
print(price)

220.00000000000003


In [16]:
dout = 1
dapple_price,dtax = mul_tax_layer.backward(dout)
dapple,dnum = mul_apple_layer.backward(dapple_price)

print(dapple,dnum,dtax)

2.2 110.00000000000001 200


#### Implementing the Addition Layer

In [22]:
# Implementation of the addition layer
class AddLayer(object):
    def __init__(self):
        pass
    
    def forward(self,x,y):
        return x + y
    
    def backward(self,dout):
        dx = dout
        dy = dout
        return dx,dy

![Backpropagation of error](./img/error_backward_apple_orange-1.png)

In [23]:
# forward
mul_apple_layer = MulLayer()
apple_num = 2
apple = 100
apple_price = mul_apple_layer.forward(apple,apple_num)

mul_orange_layer = MulLayer()
orange_num = 3
orange = 150
orange_price = mul_orange_layer.forward(orange,orange_num)

add_apple_orange_layer = AddLayer()
apple_orange_total_price = add_apple_orange_layer.forward(apple_price,orange_price)

mul_tax_layer = MulLayer()
tax = 1.1

total_price = mul_tax_layer.forward(apple_orange_total_price,tax)
print(total_price)

715.0000000000001


In [31]:
# backward
dout = 1
dtotal_price,dtax = mul_tax_layer.backward(dout)
# print(dtax,dtotal_price) # 650 1.1

dapple_price,dorange_price = add_apple_orange_layer.backward(dtotal_price)
# print(dapple_price,dorange_price) # 1.1 1.1

dapple,dapple_num = mul_apple_layer.backward(dapple_price) # 2.2  110
dorange,dorange_num = mul_orange_layer.backward(dorange_price) # 3.3 165

print(dapple,dapple_num,dorange,dorange_num,dtax)

2.2 110.00000000000001 3.3000000000000003 165.0 650
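
One way to sanity-check the backward pass above is to compare it against central-difference numerical gradients. The sketch below is not part of the original notebook: `total_price` and `numerical_grad` are hypothetical helper names, and the step size `h` is an arbitrary choice.

```python
def total_price(apple, apple_num, orange, orange_num, tax):
    """Forward computation of the apple/orange graph as a plain function."""
    return (apple * apple_num + orange * orange_num) * tax


def numerical_grad(f, args, index, h=1e-4):
    """Central-difference estimate of df/d(args[index])."""
    plus = list(args)
    minus = list(args)
    plus[index] += h
    minus[index] -= h
    return (f(*plus) - f(*minus)) / (2 * h)


args = [100, 2, 150, 3, 1.1]
names = ["dapple", "dapple_num", "dorange", "dorange_num", "dtax"]
grads = {name: numerical_grad(total_price, args, i) for i, name in enumerate(names)}
for name, value in grads.items():
    print(name, round(value, 6))
```

Up to floating-point error, the estimates should agree with the values printed by the layer-based backward pass.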


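### Implementing the Activation Function Layers

#### ReLU Layer

The activation function:

![ReLU function](./img/ReLU_function.png)

Its derivative:

![Derivative of the ReLU function](./img/ReLU_function_1.png)

Computational graph of ReLU:

![ReLU computational graph](./img/ReLU_compute_map.png)

Implementation: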

In [1]:
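class ReLU(object):
    def __init__(self):
        self.mask = None  # boolean NumPy array, set in forward()

    def forward(self, x):
        self.mask = (x <= 0)  # True where the element is <= 0, False elsewhere
        out = x.copy()
        out[self.mask] = 0  # zero out the elements where the mask is True

        return out

    def backward(self, dout):
        dout[self.mask] = 0  # block the gradient where the input was <= 0 (modifies dout in place)
        dx = dout

        return dx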

In [5]:
import numpy as np
a = np.array([-1,0,-2,1,3])
b = (a<=0)

c = a.copy()
print(b)
c[b] = 0
print(c)

[ True  True  True False False]
[0 0 0 1 3]
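
To see the same mask at work in both directions, here is a minimal sketch of a forward and backward pass on a small 2-D array. `ReLUSketch` is a hypothetical stand-alone class written for this illustration. Unlike the cell above, it copies `dout` rather than modifying it in place, which is a design choice of this sketch only.

```python
import numpy as np


class ReLUSketch:
    """Hypothetical stand-alone ReLU layer used only for this illustration."""

    def __init__(self):
        self.mask = None

    def forward(self, x):
        self.mask = (x <= 0)  # remember where the input was non-positive
        out = x.copy()
        out[self.mask] = 0
        return out

    def backward(self, dout):
        dx = dout.copy()  # copy so the caller's array is left untouched
        dx[self.mask] = 0  # no gradient flows through the masked positions
        return dx


relu = ReLUSketch()
x = np.array([[1.0, -0.5], [-2.0, 3.0]])
out = relu.forward(x)
dout = np.ones_like(x)
dx = relu.backward(dout)
print(out)
print(dx)
```

The gradient passes through unchanged where the input was positive and is zero everywhere else.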


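#### Implementing the Sigmoid Layer

![sigmoid function](./img/sigmoid_function.png)

Forward propagation expressed as a computational graph:

![sigmoid forward graph](./img/sigmoid_compute_map_forward.png)

Backpropagation expressed as a computational graph:

![sigmoid backward graph](./img/sigmoid_compute_map_backward.png)

Computational graph of the sigmoid layer (simplified version):

![simplified sigmoid backward graph](./img/sigmoid_compute_map_backward_simple.png)

Implementation: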

In [6]:
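class Sigmoid(object):
    def __init__(self):
        self.out = None  # forward output, reused in backward()

    def forward(self, x):
        self.out = 1 / (1 + np.exp(-x))
        return self.out

    def backward(self, dout):
        # dL/dx = dL/dy * y * (1 - y), where y is the forward output
        dx = dout * self.out * (1 - self.out)
        return dx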
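
The simplified backward rule, `dout * y * (1 - y)`, can be checked numerically. The following is a minimal sketch under stated assumptions: `SigmoidSketch` is a hypothetical stand-alone copy of the layer, and the test inputs and step size `h` are arbitrary.

```python
import numpy as np


class SigmoidSketch:
    """Hypothetical stand-alone sigmoid layer used only for this check."""

    def __init__(self):
        self.out = None

    def forward(self, x):
        self.out = 1 / (1 + np.exp(-x))
        return self.out

    def backward(self, dout):
        return dout * self.out * (1 - self.out)


layer = SigmoidSketch()
x = np.array([-2.0, 0.0, 3.0])
y = layer.forward(x)
analytic = layer.backward(np.ones_like(x))

# Central-difference estimate of the derivative at the same points
h = 1e-5
numeric = (1 / (1 + np.exp(-(x + h))) - 1 / (1 + np.exp(-(x - h)))) / (2 * h)
max_diff = np.max(np.abs(analytic - numeric))
print(max_diff)
```

A very small `max_diff` indicates that the analytic backward pass matches the numerical derivative.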

### Implementing the Affine / softmax Layers

#### Affine Layer

Y = XW + B, implemented as follows:

```py
Y = np.dot(X, W) + B
```

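In a matrix product, the sizes of the corresponding dimensions must match: the number of columns of the left matrix must equal the number of rows of the right matrix.

![Matrix product rule](./img/matrix_dot_rule.png)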
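
As a small illustration of this rule, the sketch below multiplies two arrays whose inner dimensions agree, then tries a pair whose inner dimensions differ. The array shapes are arbitrary choices for the example.

```python
import numpy as np

A = np.ones((2, 3))
B = np.ones((3, 4))
print(np.dot(A, B).shape)  # inner dimensions 3 and 3 match

try:
    np.dot(A, np.ones((2, 4)))  # inner dimensions 3 and 2 do not match
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)
```

NumPy reports the mismatch as a `ValueError` instead of silently producing a result.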

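Backpropagation of error:

![Affine backpropagation of error](./img/Affine_error_backward.png)

**Note that the variables are multidimensional arrays; in the backward pass, the shape of each variable is written beneath it.**

The backward pass of the matrix product (dot) can be derived by arranging the factors so that the corresponding dimensions of each product match.

![Backward pass of the matrix dot product](./img/matrix_dot_backward.png)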
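
The shape-matching argument can be made concrete with a short sketch. It assumes the row-vector convention `Y = np.dot(X, W)` used by the `Affine` class later in this section, for which the gradients are `dX = dY · Wᵀ` and `dW = Xᵀ · dY`. The array sizes and the random seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((1, 2))   # one sample with 2 features
W = rng.random((2, 3))   # maps 2 features to 3 outputs
dY = rng.random((1, 3))  # upstream gradient, same shape as Y = np.dot(X, W)

dX = np.dot(dY, W.T)     # (1, 3) x (3, 2) -> (1, 2), same shape as X
dW = np.dot(X.T, dY)     # (2, 1) x (1, 3) -> (2, 3), same shape as W
print(dX.shape, dW.shape)
```

Each gradient ends up with the same shape as the variable it belongs to, and that constraint is what fixes the order of the factors and where the transposes go.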

#### Batch Version of the Affine Backward Pass

Computational graph of the batch version of Affine:

![Batch Affine backpropagation of error](./img/Affine_error_backward_1.png)

Implementation:

In [None]:
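class Affine(object):
    def __init__(self, W, b):
        self.W = W
        self.b = b
        self.x = None   # input cached for the backward pass
        self.db = None  # gradient w.r.t. the bias
        self.dW = None  # gradient w.r.t. the weights

    def forward(self, x):
        self.x = x
        out = np.dot(self.x, self.W) + self.b  # b is broadcast over the batch
        return out

    def backward(self, dout):
        dx = np.dot(dout, self.W.T)
        self.dW = np.dot(self.x.T, dout)
        self.db = np.sum(dout, axis=0)  # sum the bias gradient over the batch axis

        return dx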

In [9]:
import numpy as np

a = np.array([[1,2,3],[3,4,5]])
b = np.array([[1,2],[3,4],[5,6]])
c = np.array([1,2])
print(a.shape,b.shape,c.shape)

print(np.dot(c,a))

(2, 3) (3, 2) (2,)
[ 7 10 13]


In [14]:
x = np.random.rand(2)
w = np.random.rand(2,3)
b = np.random.rand(3)

print(x,w,b)
print(np.dot(x,w)+b)

[0.26480739 0.931451  ] [[0.13866054 0.66792245 0.33969711]
 [0.57991565 0.80654025 0.61827041]] [0.56606727 0.49242016 0.55381385]
[1.14294862 1.42054368 1.21965675]
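
To tie the batch version together, here is a minimal sketch that runs a batch of four samples through an affine layer and inspects the gradient shapes. `AffineSketch` is a hypothetical stand-alone copy of the layer; the batch size, feature sizes, and all-ones upstream gradient are arbitrary choices.

```python
import numpy as np


class AffineSketch:
    """Hypothetical stand-alone affine layer used only for this illustration."""

    def __init__(self, W, b):
        self.W, self.b = W, b
        self.x = None
        self.dW = None
        self.db = None

    def forward(self, x):
        self.x = x
        return np.dot(x, self.W) + self.b  # b is broadcast across the batch

    def backward(self, dout):
        dx = np.dot(dout, self.W.T)
        self.dW = np.dot(self.x.T, dout)
        self.db = np.sum(dout, axis=0)  # each sample contributes to the bias gradient
        return dx


rng = np.random.default_rng(0)
x = rng.random((4, 2))  # batch of 4 samples, 2 features each
layer = AffineSketch(rng.random((2, 3)), rng.random(3))
out = layer.forward(x)
dx = layer.backward(np.ones_like(out))
print(out.shape, dx.shape, layer.dW.shape, layer.db.shape)
print(layer.db)
```

Because the bias is added to every row in the forward pass, its gradient is the sum of the upstream gradient over the batch axis. With an all-ones upstream gradient, each entry of `db` equals the batch size.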


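### Implementing Backpropagation

#### The Big Picture of Neural Network Learning

1. Premise

A neural network has weights and biases that can be adapted. The process of adjusting these weights and biases to fit the training data is called learning. Learning in a neural network consists of the following four steps:

+ Step 1 (mini-batch)

Randomly select a subset of the training data.

+ Step 2 (compute the gradients)

Compute the gradient of the loss function with respect to each weight parameter.

+ Step 3 (update the parameters)

Update the weight parameters by a small amount, based on the gradient.

+ Step 4 (repeat)

Repeat steps 1, 2, and 3.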
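
The four steps above can be sketched as a short training loop. This is only an illustration under stated assumptions: the toy data, the single affine model, the mean-squared-error loss, and the hyperparameters (`learning_rate`, `batch_size`, the number of iterations) are all hypothetical choices made here, not taken from the original notebook.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: targets follow t = 2*x1 - 3*x2 + 0.5
x_train = rng.random((200, 2))
t_train = x_train @ np.array([[2.0], [-3.0]]) + 0.5

W = np.zeros((2, 1))
b = np.zeros(1)
learning_rate = 0.5
batch_size = 20


def loss_fn(x, t):
    """Mean squared error (halved) of the affine model on (x, t)."""
    return 0.5 * np.mean(np.sum((x @ W + b - t) ** 2, axis=1))


initial_loss = loss_fn(x_train, t_train)

for step in range(1000):
    # Step 1 (mini-batch): pick a random subset of the training data
    idx = rng.choice(x_train.shape[0], batch_size, replace=False)
    x_batch, t_batch = x_train[idx], t_train[idx]

    # Step 2 (compute the gradients): forward pass, then backward pass
    y = x_batch @ W + b
    dy = (y - t_batch) / batch_size  # gradient of the loss w.r.t. y
    dW = x_batch.T @ dy              # affine backward: X^T . dY
    db = np.sum(dy, axis=0)          # bias gradient summed over the batch

    # Step 3 (update the parameters): small step against the gradient
    W -= learning_rate * dW
    b -= learning_rate * db
    # Step 4 (repeat): the loop returns to step 1

final_loss = loss_fn(x_train, t_train)
print(initial_loss, final_loss)
```

In this sketch the backward pass is written inline for a single affine transform; a layered network would instead call each layer's `backward` in reverse order to obtain the same kind of gradients.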

#### Neural Network Implementation Using Backpropagation




