# Stage 2: Express in natural code

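The main goal of this stage is to extend the current DeZero so that it can perform more complex computations. Specifically, we will modify DeZero's core code so that it can handle functions that take multiple inputs and functions that return multiple outputs. We will also extend DeZero so that computations can be written in natural code, for example with operators such as `+` and `*`.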

In [1]:
import numpy as np

## Step 11: Variable-length arguments (forward)

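The functions designed in Stage 1 took a single variable as input and returned a single variable as output. Here we extend DeZero so that it can handle a variable number of inputs and outputs.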

In [2]:
class Variable:
    def __init__(self, data):
        if data is not None:
            if not isinstance(data, np.ndarray):
                raise TypeError('{} is not supported'.format(type(data)))

        self.data = data
        self.grad = None
        self.creator = None

    def set_creator(self, func):
        self.creator = func

    def backward(self):
        if self.grad is None:
            self.grad = np.ones_like(self.data)

        funcs = [self.creator]
        while funcs:
            f = funcs.pop()
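            # NOTE: still the single-input/single-output version from Stage 1;
            # Step 13 updates it to use f.inputs and f.outputs
            x, y = f.input, f.output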
            x.grad = f.backward(y.grad)

            if x.creator is not None:
                funcs.append(x.creator)

In [3]:
def as_array(x):
    if np.isscalar(x):
        return np.array(x)
    return x

class Function:
    def __call__(self, inputs):
        # Collect the inputs' data into a list
        xs = [x.data for x in inputs]  # Get data from Variable
        ys = self.forward(xs)
        outputs = [Variable(as_array(y)) for y in ys]  # Wrap data
        # Set the creator of each output in turn
        for output in outputs:
            output.set_creator(self)
        self.inputs = inputs
        self.outputs = outputs
        return outputs

    def forward(self, xs):
        raise NotImplementedError()

    def backward(self, gys):
        raise NotImplementedError()

In [18]:
class Add(Function):
    def forward(self, xs):
        x0, x1 = xs
        y = x0 + x1
        return (y,)  # return as tuple

In [15]:
xs = [Variable(np.array(2)), Variable(np.array(3))]
f = Add()
ys = f(xs)
print(type(ys), type(ys[0]))  # <class 'list'> <class '__main__.Variable'>
y = ys[0]
print(y.data)  # 5

<class 'list'> <class '__main__.Variable'>
5
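
The example above has two inputs but only one output. As a minimal sketch of the multi-output case, the block below uses the same list-in, list-out calling convention. `SumAndDiff` is a hypothetical function that is not part of DeZero, and the `Variable` here is a stripped-down stand-in without the NumPy type check so that the block runs on its own.

```python
class Variable:
    """Stripped-down stand-in for the notebook's Variable (no NumPy type check)."""
    def __init__(self, data):
        self.data = data
        self.creator = None

    def set_creator(self, func):
        self.creator = func


class Function:
    """Same calling convention as the Step 11 Function: list in, list out."""
    def __call__(self, inputs):
        xs = [x.data for x in inputs]
        ys = self.forward(xs)
        outputs = [Variable(y) for y in ys]
        for output in outputs:
            output.set_creator(self)
        self.inputs = inputs
        self.outputs = outputs
        return outputs

    def forward(self, xs):
        raise NotImplementedError()


class SumAndDiff(Function):  # hypothetical example, not part of DeZero
    def forward(self, xs):
        x0, x1 = xs
        return (x0 + x1, x0 - x1)  # two outputs, returned as a tuple


f = SumAndDiff()
outputs = f([Variable(5), Variable(3)])
values = [out.data for out in outputs]
print(values)  # [8, 2]
```

Every output records the same `Function` instance as its creator, which is what later allows backpropagation to walk from any output back to the inputs.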


## Step 12: Variable-length arguments (improvement)

We now improve the code above so that the Add class is both easier to use and easier to implement.

In [16]:
# 1. Make Add easier to use: pass multiple arguments directly instead of packing them into a list
class Function:
    def __call__(self, *inputs):  # Change inputs to *inputs
        xs = [x.data for x in inputs]  # Extract data from each input
        ys = self.forward(xs)
        outputs = [Variable(as_array(y)) for y in ys]

        for output in outputs:
            output.set_creator(self)
        self.inputs = inputs
        self.outputs = outputs

        return outputs if len(outputs) > 1 else outputs[0]  # Return the element itself if there is only one output

    def forward(self, xs):
        raise NotImplementedError()

    def backward(self, gys):
        raise NotImplementedError()


In [19]:
x0 = Variable(np.array(2))
x1 = Variable(np.array(3))
f = Add()
y = f(x0, x1)
print(y.data)  # 5

5


In [20]:
# 2. Make Add easier to implement: forward receives the two arguments directly and returns the result directly
class Function:
    def __call__(self, *inputs):  # Change inputs to *inputs
        xs = [x.data for x in inputs]  # Extract data from each input
        ys = self.forward(*xs)  # Unpack: * expands the list into separate arguments
        if not isinstance(ys, tuple):  # Wrap a non-tuple result in a tuple
            ys = (ys,)
        outputs = [Variable(as_array(y)) for y in ys]

        for output in outputs:
            output.set_creator(self)
        self.inputs = inputs
        self.outputs = outputs

        return outputs if len(outputs) > 1 else outputs[0]  # Return the element itself if there is only one output

    def forward(self, xs):
        raise NotImplementedError()

    def backward(self, gys):
        raise NotImplementedError()

class Add(Function):
    def forward(self, x0, x1):
        y = x0 + x1
        return y

In [21]:
class Add(Function):
    def forward(self, x0, x1):
        y = x0 + x1
        return y
    
def add(x0, x1):
    return Add()(x0, x1)

In [22]:
x0 = Variable(np.array(2))
x1 = Variable(np.array(3))
y = add(x0, x1)
print(y.data)  # 5

5
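
Both improvements in this step rest on Python's `*` syntax: in a function definition it packs positional arguments into a tuple, and in a call it unpacks a sequence into separate arguments. The sketch below isolates that mechanism with plain numbers. The names `call` and `forward` are stand-ins chosen for illustration, not DeZero code.

```python
def call(*inputs):
    # In a definition, *inputs packs all positional arguments into a tuple
    return inputs


def forward(x0, x1):
    # Receives the two values directly, as Add.forward does after the change
    return x0 + x1


packed = call(2, 3)           # (2, 3)
xs = [x for x in packed]      # [2, 3]
y = forward(*xs)              # in a call, *xs unpacks the list: forward(2, 3)

# Same normalization as in Function.__call__: wrap a bare result in a tuple
ys = y if isinstance(y, tuple) else (y,)
print(packed, y, ys)  # (2, 3) 5 (5,)
```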


## Step 13: Variable-length arguments (backward)

In backpropagation, addition passes the gradient arriving from its output unchanged to each of its inputs.
<center>
<table>
  <tr>
    <td><img src="./res/反向传播_偏导.png" width="400"/></td>
  </tr>
</table>
</center>
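
The rule for addition follows from the chain rule. In the short derivation below, $L$ is an assumed symbol for the final output of the whole computation, and $gy = \partial L / \partial y$ is the gradient that reaches `Add.backward`.

$$
y = x_0 + x_1, \qquad \frac{\partial y}{\partial x_0} = 1, \qquad \frac{\partial y}{\partial x_1} = 1
$$

$$
\frac{\partial L}{\partial x_0} = \frac{\partial L}{\partial y} \cdot \frac{\partial y}{\partial x_0} = gy, \qquad
\frac{\partial L}{\partial x_1} = \frac{\partial L}{\partial y} \cdot \frac{\partial y}{\partial x_1} = gy
$$

Both inputs therefore receive `gy` itself, which is why a backward method for addition can simply return `gy, gy`.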

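The `backward` method of `Variable` is updated in four steps:

1. Collect the `grad` of each output.
2. Compute the gradients with the function's `backward` method.
3. If the return value of `backward` is not a tuple, convert it to a tuple.
4. Pass the gradients to the input variables.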

In [23]:
class Add(Function):
    def forward(self, x0, x1):
        y = x0 + x1
        return y

    def backward(self, gy):  
        return gy, gy

In [24]:
class Variable:
    def __init__(self, data):
        if data is not None:
            if not isinstance(data, np.ndarray):
                raise TypeError('{} is not supported'.format(type(data)))

        self.data = data
        self.grad = None
        self.creator = None

    def set_creator(self, func):
        self.creator = func

    def backward(self):
        if self.grad is None:
            self.grad = np.ones_like(self.data)

        funcs = [self.creator]
        while funcs:
            f = funcs.pop()
            gys = [output.grad for output in f.outputs]  # 1. Collect the grad of each output
            gxs = f.backward(*gys)  # 2. Compute the gradients with backward
            if not isinstance(gxs, tuple):  # 3. Wrap a non-tuple result in a tuple
                gxs = (gxs,)

            for x, gx in zip(f.inputs, gxs):  # 4. Pass the gradients to the input variables
                x.grad = gx
                if x.creator is not None:
                    funcs.append(x.creator)


In [26]:
class Square(Function):
    def forward(self, x):
        y = x ** 2
        return y

    def backward(self, gy):
        x = self.inputs[0].data  # inputs is a tuple; take element 0
        gx = 2 * x * gy
        return gx
    
def square(x):
    return Square()(x)

In [27]:
x = Variable(np.array(2.0))
y = Variable(np.array(3.0))

z = add(square(x), square(y))
z.backward()
print(z.data)  # 13.0
print(x.grad)  # 4.0
print(y.grad)  # 6.0

13.0
4.0
6.0
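
One way to sanity-check these gradients is a central-difference approximation of z = x² + y². The sketch below uses plain Python floats rather than DeZero, and `numerical_grad` is a hypothetical helper written for this check only.

```python
def f(x, y):
    return x ** 2 + y ** 2


def numerical_grad(func, args, index, eps=1e-4):
    """Central-difference estimate of d func / d args[index] (hypothetical helper)."""
    plus = list(args)
    minus = list(args)
    plus[index] += eps
    minus[index] -= eps
    return (func(*plus) - func(*minus)) / (2 * eps)


gx = numerical_grad(f, (2.0, 3.0), 0)  # approximately 4.0
gy = numerical_grad(f, (2.0, 3.0), 1)  # approximately 6.0
print(gx, gy)
```

The estimates should agree with the values of `x.grad` and `y.grad` from backpropagation up to a small floating-point error.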
