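# Overview
In terms of programming paradigm, PaddlePaddle supports both **declarative programming** (static graph) and **imperative programming** (dynamic graph).
- **Static graph mode**: compile first, then execute. The user defines the complete network structure up front; the network is compiled and optimized, and only then executed to produce results.
- **Dynamic graph mode**: interpreted, eager execution. The user does not need to define the complete network in advance; each line of network code produces its result as soon as it runs.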

## Dynamic graph mode
PaddlePaddle currently defaults to dynamic graph mode. To switch static graph mode on or off, call:
- `paddle.enable_static()`
- `paddle.disable_static()`  

In [7]:
# Compare the dynamic and static graphs using x += 10 as the example
import paddle
import numpy as np

# Create a [2, 2] two-dimensional array filled with 1
data = np.ones([2, 2], np.float32)

# In the default dynamic graph mode, convert the data to a Tensor
x = paddle.to_tensor(data)

print("In dynamic graph mode, after calling paddle.to_tensor():")
print("x =", x)

x += 10
print('------------------------------')
print("In dynamic graph mode, after the computation:")
print("x =", x.numpy())

In dynamic graph mode, after calling paddle.to_tensor():
x = Tensor(shape=[2, 2], dtype=float32, place=CUDAPlace(0), stop_gradient=True,
       [[1., 1.],
        [1., 1.]])
------------------------------
In dynamic graph mode, after the computation:
x = [[11. 11.]
 [11. 11.]]


In [8]:
# Enable static graph mode
paddle.enable_static()

# Create the Programs that hold Paddle's static description of the computation graph
main_program = paddle.static.Program()
startup_program = paddle.static.Program()

with paddle.static.program_guard(main_program=main_program, startup_program=startup_program):
    x = paddle.static.data(name='x', shape=[2, 2], dtype='float32')
    print("In static graph mode, after calling paddle.static.data:")
    print("x =", x)
    # In static graph mode, operations are applied to a placeholder Variable,
    # and the user must specify the device to run on
    x += 10
    place = paddle.CPUPlace()
    # Create an executor to run the Program; place selects the device
    exe = paddle.static.Executor(place=place)
    # Run initialization
    exe.run(startup_program)
    # Use the executor to run all recorded operations
    # fetch_list selects which variables' results to return
    # feed passes in the input data
    data_after_run = exe.run(fetch_list=[x], feed={'x': data})
    print("In static graph mode, data after running:", data_after_run)

# Turn static graph mode off and return to dynamic graph mode
paddle.disable_static()

In static graph mode, after calling paddle.static.data:
x = var x : LOD_TENSOR.shape(2, 2).dtype(float32).stop_gradient(True)
In static graph mode, data after running: [array([[11., 11.],
       [11., 11.]], dtype=float32)]




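- A set of Programs usually consists of a startup program (startup_program) and a main program (main_program). The startup program initializes the parameters; the main program holds the network structure and its parameters.
- Used with a `with` statement, paddle.static.program_guard() adds every operator and variable created inside the with block to the main program and startup program. paddle.static.data() creates a Tensor variable in the global block; operators in the computation graph can access it, and it can also act as a placeholder for input data.
- The results show that:
    - In dynamic graph mode, every operation has already been carried out by the time its line runs.
    - In static graph mode, no operation is actually executed while the network is being built; in the example above, printing x shows only the declared type.
- For the APIs specific to PaddlePaddle static graphs, see [paddle.static](https://www.paddlepaddle.org.cn/documentation/docs/zh/api/paddle/static/Overview_cn.html).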
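
The contrast summarized above can be mimicked in plain Python. The sketch below is only a toy: `Placeholder` and `AddNode` are hypothetical classes invented for illustration and are not Paddle APIs. It shows that building a deferred graph computes nothing until values are fed in, whereas the eager version has its result immediately.

```python
# Toy illustration only: these classes are hypothetical and are not Paddle APIs.

class Placeholder:
    """A named input whose value is supplied only at run time."""
    def __init__(self, name):
        self.name = name

    def __add__(self, other):
        # Record the addition instead of computing it.
        return AddNode(self, other)

    def evaluate(self, feed):
        return feed[self.name]

    def __repr__(self):
        return f"Placeholder({self.name!r})"


class AddNode:
    """A recorded 'left + right' operation; nothing is computed until evaluate()."""
    def __init__(self, left, right):
        self.left, self.right = left, right

    def evaluate(self, feed):
        left = self.left.evaluate(feed)
        right = self.right  # a plain Python number in this sketch
        return [[v + right for v in row] for row in left]

    def __repr__(self):
        return f"AddNode({self.left!r}, {self.right!r})"


data = [[1.0, 1.0], [1.0, 1.0]]

# Eager ("dynamic") style: the result exists as soon as the line runs.
eager = [[v + 10 for v in row] for row in data]

# Deferred ("static") style: building the graph computes nothing.
graph = Placeholder("x") + 10
deferred = graph.evaluate({"x": data})  # plays the role of the executor

print(graph)              # describes the computation, holds no values
print(eager == deferred)
```

Printing `graph` shows only the recorded structure, which parallels how printing the static `x` above shows a variable declaration rather than data.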

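### 1. Training a model in dynamic graph mode
1. Define the data reader: read the data and apply preprocessing.
2. Define the model and optimizer: build the neural network structure.
3. Train: configure the optimizer, learning rate, and training parameters.
4. Evaluate and test.
5. Save and load the model.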

In [18]:
# Define the data reader (using the built-in dataset API)
import numpy as np
import paddle

# Define batch_size
BATCH_SIZE = 64
# Subclass the built-in MNIST dataset; DataLoader (below) packs the samples into batches of BATCH_SIZE
class MnistDataset(paddle.vision.datasets.MNIST):
    def __init__(self, mode, return_label=True):
        super(MnistDataset, self).__init__(mode=mode)
        self.return_label = return_label
    
    # Reshape and map pixel values from [0, 255] to [-1, 1]
    def __getitem__(self, idx):
        img = np.reshape(self.images[idx], [1, 28, 28])
        img = img / 255.0 * 2.0 - 1.0
        if self.return_label:
            return img, np.array(self.labels[idx].astype('int64'))
        return img,
    
    def __len__(self):
        return len(self.images)

train_reader = paddle.io.DataLoader(MnistDataset(mode='train'), batch_size=BATCH_SIZE, drop_last=True)
test_reader = paddle.io.DataLoader(MnistDataset(mode='test'), batch_size=BATCH_SIZE, drop_last=True)
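
The pixel scaling used in `__getitem__` above can be checked in isolation. This small sketch uses a hypothetical helper name, `normalize_pixel`, and plain Python numbers instead of image arrays; it only confirms that the formula maps the endpoints of [0, 255] to -1 and 1.

```python
def normalize_pixel(value):
    """Map a pixel value in [0, 255] to [-1, 1], using the same formula as __getitem__."""
    return value / 255.0 * 2.0 - 1.0


for pixel in (0, 127.5, 255):
    print(pixel, "->", normalize_pixel(pixel))
```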

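- In dynamic graph mode, parameters and variables are stored and managed differently from static graph mode.
    - In dynamic graph mode, the learned parameters and intermediate variables of a network **have the same lifetime as the Python objects that hold them**. Put simply, when a Python object's lifetime ends, its storage is released.
- While a model is learning, its parameters are updated continuously, so they must persist for the whole training run; a mechanism is therefore needed to keep the network's parameters from being released.
    - PaddlePaddle's dynamic graph mode manages all parameters through an object-oriented design in which networks inherit from paddle.nn.Layer.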
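
One way to picture this design is a base class that registers every parameter and sub-layer assigned to it, so they stay referenced for as long as the network object lives. The sketch below is an assumption-laden illustration: `Parameter`, `TinyLayer`, `Block`, and `Net` are hypothetical names, and this is not how paddle.nn.Layer is actually implemented.

```python
# Hypothetical sketch: Parameter and TinyLayer are illustrative stand-ins,
# not the implementation of paddle.nn.Layer.

class Parameter:
    def __init__(self, values):
        self.values = list(values)


class TinyLayer:
    def __init__(self):
        # Use object.__setattr__ so the registries exist before our hook runs.
        object.__setattr__(self, "_parameters", {})
        object.__setattr__(self, "_sub_layers", {})

    def __setattr__(self, name, value):
        # Assigning a Parameter or a sub-layer registers it on the owning layer,
        # so it stays referenced for as long as the layer object is alive.
        if isinstance(value, Parameter):
            self._parameters[name] = value
        elif isinstance(value, TinyLayer):
            self._sub_layers[name] = value
        object.__setattr__(self, name, value)

    def parameters(self):
        """Collect this layer's parameters and those of all sub-layers."""
        found = list(self._parameters.values())
        for layer in self._sub_layers.values():
            found.extend(layer.parameters())
        return found


class Block(TinyLayer):
    def __init__(self):
        super().__init__()
        self.weight = Parameter([0.1, 0.2])


class Net(TinyLayer):
    def __init__(self):
        super().__init__()
        self.block = Block()
        self.bias = Parameter([0.0])


net = Net()
print(len(net.parameters()))
```

Because every parameter is reachable from `net`, an optimizer can be handed `net.parameters()` and none of them is released while `net` exists.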

In [19]:
# Define the network structure and optimizer
import paddle
from paddle.nn import Conv2D, MaxPool2D, ReLU

# A network must inherit from paddle.nn.Layer
class SimpleImgConvPool(paddle.nn.Layer):
    # The __init__ constructor initializes variables, parameters, and sub-networks
    # Here it initializes the Conv2D and MaxPool2D sub-networks
    def __init__(self,
                 in_channels, out_channels, filter_size, pool_size, pool_stride,
                 pool_padding=0, conv_stride=1, conv_padding=0, conv_dilation=1,
                 conv_groups=1, weight_attr=None, bias_attr=None):
        super(SimpleImgConvPool, self).__init__()

        # Initialize the Conv2D layer
        self._conv2d = Conv2D(in_channels=in_channels,out_channels=out_channels,
                            kernel_size=filter_size, stride=conv_stride,
                            padding=conv_padding, dilation=conv_dilation,
                            groups=conv_groups, weight_attr=weight_attr,
                            bias_attr=bias_attr)
        # Initialize the ReLU activation
        self._relu = ReLU()

        # Initialize the MaxPool2D layer
        self._pool2d = MaxPool2D(kernel_size=pool_size, stride=pool_stride, padding=pool_padding)

    # forward implements the execution logic of SimpleImgConvPool
    def forward(self, inputs):
        x = self._conv2d(inputs)
        x = self._relu(x)
        x = self._pool2d(x)
        return x

In [20]:
class MNIST(paddle.nn.Layer):
    def __init__(self):
        super(MNIST, self).__init__()
        self._simple_img_conv_pool_1 = SimpleImgConvPool(
            1, 20, 5, 2, 2)
        self._simple_img_conv_pool_2 = SimpleImgConvPool(
            20, 50, 5, 2, 2)
        
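        # self.pool_2_shape is the product of all dimensions except batch_size
        # of the data produced by self._simple_img_conv_pool_2
        self.pool_2_shape = 50 * 4 * 4
        # self.pool_2_shape and self.size define the shape of the self.output_weight parameter
        self.size = 10
        # Define the parameter of the fully connected layer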
        self.output_weight = self.create_parameter([self.pool_2_shape, self.size])

        # Define the metric used to compute accuracy
        self.accuracy = paddle.metric.Accuracy()
    
    # forward implements the execution logic of the MNIST network
    def forward(self, inputs, label=None):
        x = self._simple_img_conv_pool_1(inputs)
        x = self._simple_img_conv_pool_2(x)
        x = paddle.reshape(x, shape=[-1, self.pool_2_shape])
        x = paddle.matmul(x, self.output_weight)
        x = paddle.nn.functional.softmax(x)
        if label is not None:
            # Reset so that only the accuracy of the current batch is returned
            self.accuracy.reset()
            correct = self.accuracy.compute(x, label)
            self.accuracy.update(correct)
            acc = self.accuracy.accumulate()
            return x, acc
        else:
            return x
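
The value `50 * 4 * 4` used for `pool_2_shape` can be derived from the layer settings. The sketch below assumes the standard output-size formula with floor rounding and the arguments passed above (5x5 convolutions with stride 1 and no padding, 2x2 pooling with stride 2); `conv_out_size` is a hypothetical helper, not a Paddle function.

```python
def conv_out_size(size, kernel, stride=1, padding=0, dilation=1):
    """Output length along one spatial dimension of a convolution or pooling window."""
    effective_kernel = dilation * (kernel - 1) + 1
    return (size + 2 * padding - effective_kernel) // stride + 1


size = 28                                        # MNIST images are 28 x 28
size = conv_out_size(size, kernel=5)             # conv 1: 28 -> 24
size = conv_out_size(size, kernel=2, stride=2)   # pool 1: 24 -> 12
size = conv_out_size(size, kernel=5)             # conv 2: 12 -> 8
size = conv_out_size(size, kernel=2, stride=2)   # pool 2: 8 -> 4

channels = 50                                    # out_channels of the second block
print(channels * size * size)                    # 800, i.e. 50 * 4 * 4
```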

In [21]:
import numpy as np
from paddle.optimizer import Adam


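# Create an instance of the MNIST class
mnist = MNIST()
# Use Adam as the optimizer with learning_rate 0.001
# Note: in dynamic graph mode the parameters argument is required; it lists the network parameters to optimize, here all parameters of mnist
adam = Adam(learning_rate=0.001, parameters=mnist.parameters())

# Number of passes over the full training set
epoch_num = 5

# Train for epoch_num epochs
for epoch in range(epoch_num):
    # Read the training data and train
    for batch_id, data in enumerate(train_reader()):
        # The img and label returned by train_reader are already Tensors and can be used directly in dynamic graph mode
        img = data[0]
        label = data[1]

        # Forward pass
        pred, acc = mnist(img, label)

        # Compute the loss
        loss = paddle.nn.functional.cross_entropy(pred, label)
        avg_loss = paddle.mean(loss)
        # Backward pass
        avg_loss.backward()
        # Update the parameters
        adam.step()
        # Clear the gradients of this step, ready for the next iteration and update
        adam.clear_grad()

        # Print the loss and accuracy for this epoch and batch_id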
        if batch_id % 100 == 0:
            print("Epoch {} step {}, Loss = {:}, Accuracy = {:}".format(
                    epoch, batch_id, avg_loss.numpy(), acc))

Epoch 0 step 0, Loss = [2.3319526], Accuracy = 0.125
Epoch 0 step 100, Loss = [1.853616], Accuracy = 0.609375
Epoch 0 step 200, Loss = [1.6881948], Accuracy = 0.765625
Epoch 0 step 300, Loss = [1.6814728], Accuracy = 0.78125
Epoch 0 step 400, Loss = [1.589998], Accuracy = 0.875
Epoch 0 step 500, Loss = [1.6732543], Accuracy = 0.8125
Epoch 0 step 600, Loss = [1.6490974], Accuracy = 0.8125
Epoch 0 step 700, Loss = [1.8125134], Accuracy = 0.640625
Epoch 0 step 800, Loss = [1.6831717], Accuracy = 0.78125
Epoch 0 step 900, Loss = [1.6284146], Accuracy = 0.828125
Epoch 1 step 0, Loss = [1.5831273], Accuracy = 0.875
Epoch 1 step 100, Loss = [1.6316577], Accuracy = 0.828125
Epoch 1 step 200, Loss = [1.5331774], Accuracy = 0.9375
Epoch 1 step 300, Loss = [1.5997106], Accuracy = 0.859375
Epoch 1 step 400, Loss = [1.5070084], Accuracy = 0.953125
Epoch 1 step 500, Loss = [1.5377495], Accuracy = 0.921875
Epoch 1 step 600, Loss = [1.5861411], Accuracy = 0.875
Epoch 1 step 700, Loss = [1.6176636], Ac

In [23]:
# Save the trained model
model_dict = mnist.state_dict()
paddle.save(model_dict, "mnist.pdparams")

In [29]:
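# Evaluation and testing
mnist_eval = MNIST()
# Load the saved model
model_dict = paddle.load("mnist.pdparams")
mnist_eval.set_state_dict(model_dict)
print("checkpoint loaded")

# model.eval()   switches a model to evaluation mode
# model.train()  switches a model to training mode
# Switch to evaluation mode for prediction
mnist_eval.eval()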

acc_set = []
avg_loss_set = []
# Read the test data and evaluate
for batch_id, data in enumerate(test_reader()):
    img = data[0]
    label = data[1]

    # Forward pass
    prediction, acc = mnist_eval(img, label)

    # Compute the loss
    loss = paddle.nn.functional.cross_entropy(prediction, label)
    avg_loss = paddle.mean(loss)

    acc_set.append(float(acc))
    avg_loss_set.append(float(avg_loss.numpy()))

# Print the mean loss and accuracy over all batches
acc_val_mean = np.array(acc_set).mean()
avg_loss_val_mean = np.array(avg_loss_set).mean()
print("Eval avg_loss is: {}, acc is: {}".format(avg_loss_val_mean, acc_val_mean))

checkpoint loaded
Eval avg_loss is: 1.475463283367646, acc is: 0.9859775641025641


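In dynamic graph mode, the model and the optimizer are separate objects, so the model parameters and the optimizer state are stored separately. Saving therefore requires calling state_dict() on the model and on the optimizer individually, and loading likewise has to restore each of them separately.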

In [30]:
# 1. Save the model parameters
paddle.save(mnist.state_dict(), "mnist.pdparams")
# 2. Save the optimizer state
paddle.save(adam.state_dict(), "adam.pdopt")

In [31]:
# 1. Load the model parameters and the optimizer state
model_state = paddle.load("mnist.pdparams")
opt_state = paddle.load("adam.pdopt")
# 2. Restore the model parameters
mnist.set_state_dict(model_state)
# 3. Restore the optimizer state
adam.set_state_dict(opt_state)

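### 2. Multi-GPU training
For tasks with large data volumes or heavy computation, training in parallel on multiple GPUs improves training efficiency. Dynamic graph mode currently supports single-machine, multi-GPU training.  
Dynamic graph multi-GPU training uses the Python standard library module `subprocess` to start a separate Python program on each GPU. Each program runs independently; after every round of gradient computation, all programs synchronize their gradients and then update the training parameters.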
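
The synchronization step described above can be pictured with plain Python lists standing in for the per-GPU processes. This is a conceptual sketch only: averaging is assumed as the synchronization rule (the text only says the gradients are synchronized), and `average_gradients` and `sgd_update` are hypothetical helpers rather than Paddle functions.

```python
# Conceptual sketch only: plain lists stand in for per-GPU processes, and
# averaging is assumed as the synchronization rule for illustration.

def average_gradients(per_worker_grads):
    """Element-wise mean of the gradient vectors computed by each worker."""
    num_workers = len(per_worker_grads)
    return [sum(values) / num_workers for values in zip(*per_worker_grads)]


def sgd_update(params, grads, learning_rate):
    return [p - learning_rate * g for p, g in zip(params, grads)]


# Every worker starts from the same parameters but sees a different batch,
# so each one computes a different local gradient.
params = [1.0, -2.0]
local_grads = [
    [0.5, 1.0],   # worker 0
    [1.5, 3.0],   # worker 1
]

synced = average_gradients(local_grads)

# After synchronization each worker applies the same update,
# which keeps the replicas identical.
replicas = [sgd_update(params, synced, learning_rate=0.5) for _ in local_grads]
print(synced, replicas)
```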

In [35]:
import numpy as np
import paddle
import paddle.distributed as dist
from paddle.optimizer import Adam

# Prepare the multi-GPU environment
dist.init_parallel_env()

epoch_num = 5
BATCH_SIZE = 64
mnist = MNIST()
adam = Adam(learning_rate=0.001, parameters=mnist.parameters())

# Wrap the model for data parallelism
mnist = paddle.DataParallel(mnist)

# Build the reader with paddle.io.DataLoader
# DistributedBatchSampler splits the data across the GPUs
train_sampler = paddle.io.DistributedBatchSampler(MnistDataset(mode='train'),
                                                  batch_size=BATCH_SIZE, 
                                                  drop_last=True)
train_reader = paddle.io.DataLoader(MnistDataset(mode='train'), batch_sampler=train_sampler)
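
To see what "splitting the data across the GPUs" means, the sketch below deals sample indices out to ranks round-robin and then groups each rank's share into batches. It is a simplified assumption for illustration, with a hypothetical `shard_batches` helper, and is not the exact algorithm used by paddle.io.DistributedBatchSampler.

```python
# Hypothetical sketch: a simplified round-robin split, not the exact
# algorithm used by paddle.io.DistributedBatchSampler.

def shard_batches(num_samples, batch_size, num_replicas, rank, drop_last=True):
    """Return the batches of sample indices that one rank would iterate over."""
    # Each rank takes every num_replicas-th sample, starting at its own rank.
    local_indices = list(range(rank, num_samples, num_replicas))
    batches = [local_indices[i:i + batch_size]
               for i in range(0, len(local_indices), batch_size)]
    if drop_last and batches and len(batches[-1]) < batch_size:
        batches.pop()
    return batches


rank0 = shard_batches(num_samples=10, batch_size=2, num_replicas=2, rank=0)
rank1 = shard_batches(num_samples=10, batch_size=2, num_replicas=2, rank=1)
print(rank0)
print(rank1)
```

The two ranks receive disjoint samples, so each process trains on a different part of the dataset within an epoch.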

In [36]:
for epoch in range(epoch_num):
    for batch_id, data in enumerate(train_reader()):
        img = data[0]
        label = data[1]
        label.stop_gradient = True
            
        pred, acc = mnist(img, label)
            
        loss = paddle.nn.functional.cross_entropy(pred, label)
        avg_loss = paddle.mean(loss)
        avg_loss.backward()
        adam.step()
        adam.clear_grad()
        if batch_id % 100 == 0 and batch_id != 0:
            print("Epoch {} step {}, Loss = {:}, Accuracy = {:}".format(
                    epoch, batch_id, avg_loss.numpy(), acc))


Epoch 0 step 100, Loss = [1.7339315], Accuracy = 0.71875
Epoch 0 step 200, Loss = [1.573685], Accuracy = 0.90625
Epoch 0 step 300, Loss = [1.494304], Accuracy = 0.984375
Epoch 0 step 400, Loss = [1.4964607], Accuracy = 0.96875
Epoch 0 step 500, Loss = [1.4984132], Accuracy = 0.96875
Epoch 0 step 600, Loss = [1.4760737], Accuracy = 0.984375
Epoch 0 step 700, Loss = [1.5143863], Accuracy = 0.953125
Epoch 0 step 800, Loss = [1.4996874], Accuracy = 0.96875
Epoch 0 step 900, Loss = [1.506871], Accuracy = 0.953125
Epoch 1 step 100, Loss = [1.4795102], Accuracy = 0.984375
Epoch 1 step 200, Loss = [1.5102603], Accuracy = 0.953125
Epoch 1 step 300, Loss = [1.4787824], Accuracy = 0.984375
Epoch 1 step 400, Loss = [1.4697437], Accuracy = 1.0
Epoch 1 step 500, Loss = [1.4853926], Accuracy = 0.96875
Epoch 1 step 600, Loss = [1.476786], Accuracy = 0.984375
Epoch 1 step 700, Loss = [1.4872568], Accuracy = 0.984375
Epoch 1 step 800, Loss = [1.4995837], Accuracy = 0.96875
Epoch 1 step 900, Loss = [1.47

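When starting multi-process, multi-GPU training of a PaddlePaddle dynamic graph model, you must specify which GPUs to use. For example, to use GPUs 0 and 1, start training with the following command:  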
`$ python -m paddle.distributed.launch --gpus=0,1 --log_dir ./mylog train.py`

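### 3. Model deployment
The dynamic graph has many advantages, but deploying a trained model with C++ is less convenient. For example, a dynamic graph can use Python's native control flow (if/else, switch, for/while), and these constructs need a dedicated mechanism to be mapped to the C++ side before the model can be deployed there.
- If the if/else, switch, and for/while used are independent of the input (both its values and its shape), `@paddle.jit.to_static` can convert the forward dynamic graph model into a static graph model, which can then be saved and used for online C++ inference.
- In addition, the converted static graph model can be used for prediction on the Python side, where it usually performs better than the original dynamic graph.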
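
The condition about input-independent control flow matters because a conversion that only records the operations executed for one sample input would bake in whichever branch that input took. The sketch below illustrates this with a hand-written recorder; `Traced`, `trace`, and `replay` are hypothetical and do not reflect how `paddle.jit.to_static` is implemented.

```python
# Conceptual sketch only: a hand-written "tracer" that records arithmetic,
# meant to show why input-dependent control flow needs special handling.
# It does not reflect how paddle.jit.to_static is implemented.

class Traced:
    """Wraps a number and records every operation applied to it."""
    def __init__(self, value, ops):
        self.value, self.ops = value, ops

    def __mul__(self, k):
        self.ops.append(("mul", k))
        return Traced(self.value * k, self.ops)

    def __add__(self, k):
        self.ops.append(("add", k))
        return Traced(self.value + k, self.ops)


def model(x):
    # The branch depends on the input value.
    if x.value > 0:
        return x * 2
    return x + 100


def trace(fn, example):
    ops = []
    fn(Traced(example, ops))
    return ops


def replay(ops, value):
    for name, k in ops:
        value = value * k if name == "mul" else value + k
    return value


recorded = trace(model, example=3)   # only the "x > 0" branch is recorded
print(recorded)
print(replay(recorded, 5))           # same branch as the example: matches the eager model
print(replay(recorded, -1))          # the eager model would return 99 here
```

When the branch does not depend on the input, the recorded sequence is the same for every input and replaying it is faithful.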

In [37]:
import numpy as np
import paddle
from paddle.optimizer import Adam


class MNIST(paddle.nn.Layer):
    def __init__(self):
        super(MNIST, self).__init__()
        self._simple_img_conv_pool_1 = SimpleImgConvPool(
            1, 20, 5, 2, 2)
        self._simple_img_conv_pool_2 = SimpleImgConvPool(
            20, 50, 5, 2, 2)

        self.pool_2_shape = 50 * 4 * 4
        self.size = 10
        self.output_weight = self.create_parameter(
            [self.pool_2_shape, self.size])

    @paddle.jit.to_static  # Add the decorator to the forward function
    def forward(self, inputs):
        x = self._simple_img_conv_pool_1(inputs)
        x = self._simple_img_conv_pool_2(x)
        x = paddle.reshape(x, shape=[-1, self.pool_2_shape])
        x = paddle.matmul(x, self.output_weight)
        x = paddle.nn.functional.softmax(x)
        return x
    
epoch_num = 5
BATCH_SIZE = 64
mnist = MNIST()
adam = Adam(learning_rate=0.001, parameters=mnist.parameters())
for epoch in range(epoch_num):
    for batch_id, data in enumerate(train_reader()):
        img = data[0]
        label = data[1]
        pred = mnist(img)

        loss = paddle.nn.functional.cross_entropy(pred, label)
        avg_loss = paddle.mean(loss)
        avg_loss.backward()
        adam.step()
        adam.clear_grad()

        if batch_id % 100 == 0:
            print("Epoch {} step {}, Loss = {:}".format(
                    epoch, batch_id, avg_loss.numpy()))
            
# The path argument here is a prefix, not a file name
paddle.jit.save(mnist, "inference/mnist")


Epoch 0 step 0, Loss = [2.351206]
Epoch 0 step 100, Loss = [1.7277366]
Epoch 0 step 200, Loss = [1.6548624]
Epoch 0 step 300, Loss = [1.759743]
Epoch 0 step 400, Loss = [1.6397439]
Epoch 0 step 500, Loss = [1.5950389]
Epoch 0 step 600, Loss = [1.5542469]
Epoch 0 step 700, Loss = [1.634324]
Epoch 0 step 800, Loss = [1.5691906]
Epoch 0 step 900, Loss = [1.5881759]
Epoch 1 step 0, Loss = [1.6015033]
Epoch 1 step 100, Loss = [1.5508146]
Epoch 1 step 200, Loss = [1.5696216]
Epoch 1 step 300, Loss = [1.5691619]
Epoch 1 step 400, Loss = [1.6166158]
Epoch 1 step 500, Loss = [1.5700891]
Epoch 1 step 600, Loss = [1.5532005]
Epoch 1 step 700, Loss = [1.6156965]
Epoch 1 step 800, Loss = [1.5692351]
Epoch 1 step 900, Loss = [1.5078506]
Epoch 2 step 0, Loss = [1.4957088]
Epoch 2 step 100, Loss = [1.4897172]
Epoch 2 step 200, Loss = [1.4915651]
Epoch 2 step 300, Loss = [1.4922736]
Epoch 2 step 400, Loss = [1.4618129]
Epoch 2 step 500, Loss = [1.491091]
Epoch 2 step 600, Loss = [1.4768126]
Epoch 2 ste

In [38]:
# paddle.jit.save stores the static graph model in a form suitable for inference deployment
# To use it later from Python, load the saved model with paddle.jit.load
load_mnist = paddle.jit.load("inference/mnist")

load_mnist.eval()
x = paddle.randn([1, 1, 28, 28], 'float32')
pred = load_mnist(x)

print("Load MNIST predict: {:} ".format(pred.numpy()))

Load MNIST predict: [[1.5050951e-04 5.3357985e-09 1.9456142e-04 1.8036899e-03 1.4148206e-06
  5.3630345e-02 3.0761273e-03 1.5505937e-05 9.2861640e-01 1.2511353e-02]] 


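## Static graph mode
The advantage of a static graph is that all operations and their execution order are fully defined before it runs, so optimizations can be applied using global information, such as fusing adjacent operations to speed things up or reducing the number of intermediate variables. For the same network structure, running as a static graph model therefore often gives better performance and lower memory usage.

### 1. Representing and defining data in a static graph
A static graph also uses variables and constants to represent data. However, before the **executor** is called, a static graph performs no actual operations (this phase is usually called the "network-building phase" or "compile phase"), so no data is read in at that point. A static graph therefore needs a special kind of variable to stand for the input data, usually called a "**placeholder**". In PaddlePaddle, paddle.static.data creates a placeholder; it requires the shape and data type of the Variable, and a dimension that cannot be determined in advance can be set to None.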

In [40]:
import paddle
paddle.enable_static()
x = paddle.static.data(name='x', shape=[3, None], dtype="int64")
print('x =', x)
# Most networks organize data in batches, and the batch size is not known at definition time
batched_x = paddle.static.data(name="batched_x", shape=[None, 3, None], dtype='int64')
print('batched_x =', batched_x)

x = var x : LOD_TENSOR.shape(3, -1).dtype(int64).stop_gradient(True)
batched_x = var batched_x : LOD_TENSOR.shape(-1, 3, -1).dtype(int64).stop_gradient(True)
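
The output above shows None dimensions printed as -1. As a rough mental model, such a dimension accepts any size when data is later fed in. The helper below is a hypothetical illustration of that idea in plain Python, not a Paddle API.

```python
# Hypothetical helper for illustration; not a Paddle API.

def matches_declared_shape(declared, actual):
    """Check a concrete shape against a declared one; None or -1 accepts any size."""
    if len(declared) != len(actual):
        return False
    return all(d is None or d == -1 or d == a for d, a in zip(declared, actual))


print(matches_declared_shape([None, 3, None], (64, 3, 7)))    # batch of 64
print(matches_declared_shape([None, 3, None], (1, 3, 120)))   # batch of 1
print(matches_declared_shape([None, 3, None], (64, 4, 7)))    # fixed dimension differs
```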


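### 2. Building a network with a static graph
In PaddlePaddle, the data-computation APIs are collectively called **Operators, or OPs for short**.  
- Below is a complete example of a static graph network that performs the simple computation "result = a + b":
    - it defines two int64 inputs, a and b,
    - and uses paddle.add (the elementwise_add OP) to add a and b element by element.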

In [42]:
# -*- coding: utf-8 -*- 
"""
Created on 2022/3/7 18:34

Build a network with a static graph

@Author : Jiabin Wang 
"""

import paddle
import numpy

paddle.enable_static()
a = paddle.static.data(name="a", shape=[None, 1], dtype="int64")
b = paddle.static.data(name="b", shape=[None, 1], dtype="int64")

# Build the network; here it consists of a single operation
result = paddle.add(a, b)

place = paddle.CPUPlace()  # Choose the device to run on
exe = paddle.static.Executor(place=place)  # Create the executor
exe.run(paddle.static.default_startup_program())  # Initialize the network parameters

# Read the input data
data_1 = int(input("Please enter an integer: a="))
data_2 = int(input("Please enter an integer: b="))
print('---------------------------------')
x = numpy.array([[data_1]]).astype('int64')
y = numpy.array([[data_2]]).astype('int64')

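# Run the network
outs = exe.run(
    feed={'a': x, 'b': y},  # Feed values for the input variables
    fetch_list=[result]  # Variables whose values should be returned
)

# Print the result; outs[0] has shape [1, 1], so index down to the scalar
print("%d+%d=%d" % (data_1, data_2, outs[0][0][0]))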



Please enter an integer: a=1
Please enter an integer: b=2
---------------------------------
1+2=3


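- In a dynamic graph, Python control flow statements make conditional logic easy. In a static graph, however, no operation is executed during the network-building phase and no intermediate results exist, so Python control flow statements cannot be used for conditions that depend on those results.  
- For this reason, the static graph provides several control flow OPs. Here `paddle.static.nn.while_loop` is used to show how to implement a conditional loop in a static graph.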

In [44]:
# This example increments an integer by 1 in a loop, 10 times, and prints the final count
import paddle

# Define cond, the loop condition of while_loop
def cond(i, ten):
    return i < ten


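# Define body, the loop body of while_loop; while_loop keeps calling it for as long as cond returns True
# With the while_loop OP, the arguments of both cond and body come from the loop_vars argument of while_loop,
# so cond and body must take the same number of arguments. body only needs i,
# but its argument list must still have length 2, so a dummy argument is added as a filler
def body(i, dummy):
    # The computation increments the input argument i, i.e. i = i + 1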
    i = i + 1
    return i, dummy

# Use separate Programs to isolate this code from the cells above
main_program = paddle.static.Program()
startup_program = paddle.static.Program()
with paddle.static.program_guard(main_program, startup_program):
    paddle.enable_static()
    i = paddle.full(shape=[1], fill_value=0, dtype='int64')  # Loop counter
    ten = paddle.full(shape=[1], fill_value=10, dtype='int64')  # Number of iterations
    # while_loop returns a list of tensors with the same length, structure, and types as loop_vars
    out, ten = paddle.static.nn.while_loop(cond, body, [i, ten])

    exe = paddle.static.Executor(place=paddle.CPUPlace())
    res = exe.run(paddle.static.default_main_program(), feed={}, fetch_list=out)
    print(res)  # [array([10], dtype=int64)]
paddle.disable_static()

[array([10], dtype=int64)]
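
The calling convention of `cond`, `body`, and `loop_vars` may be easier to follow in a plain-Python stand-in. The sketch below is an assumption for illustration: it uses integers instead of Tensors and runs the loop immediately rather than adding it to a Program, so it mirrors only the argument threading, not the static graph behavior.

```python
# Plain-Python stand-in for illustration; integers replace Tensors and the
# loop runs immediately instead of being added to a Program.

def while_loop(cond, body, loop_vars):
    """Call body repeatedly while cond holds, threading loop_vars through."""
    loop_vars = list(loop_vars)
    while cond(*loop_vars):
        loop_vars = list(body(*loop_vars))
    return loop_vars


def cond(i, ten):
    return i < ten


def body(i, dummy):
    # dummy is passed through unchanged so that body returns as many
    # values as it receives
    return i + 1, dummy


out, ten = while_loop(cond, body, [0, 10])
print(out, ten)
```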


### 3. Writing a static graph version of a dynamic graph model

In [46]:
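# Dynamic graph: predicting Boston housing prices
import paddle

# Define the network structure; this task uses a linear regression model consisting of a single FC layer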
class LinearRegression(paddle.nn.Layer):
    def __init__(self, input_dim, hidden):
        super(LinearRegression, self).__init__()
        self.linear = paddle.nn.Linear(input_dim, hidden)

    def forward(self, x):
        x = self.linear(x)
        return x


# Data reading for training and prediction; this part is used the same way in dynamic and static graphs
batch_size = 20
train_reader = paddle.io.DataLoader(paddle.text.datasets.UCIHousing(mode='train'), batch_size=batch_size, shuffle=True)
test_reader = paddle.io.DataLoader(paddle.text.datasets.UCIHousing(mode='test'), batch_size=batch_size)

# The Boston housing price task has 13 features
input_feature = 13

# Define the network
model = LinearRegression(input_dim=input_feature, hidden=1)
# Define the optimizer
sgd = paddle.optimizer.SGD(learning_rate=0.001, parameters=model.parameters())

max_epoch_num = 100  # Train for max_epoch_num epochs
for epoch in range(max_epoch_num):
    # Read the training data and train
    for batch_id, data in enumerate(train_reader()):
        x_tensor, y_tensor = data
        # Call the network to run the forward pass
        prediction = model(x_tensor)

        # Compute the loss
        loss = paddle.nn.functional.square_error_cost(prediction, y_tensor)
        avg_loss = paddle.mean(loss)

        if batch_id % 10 == 0 and batch_id != 0:
            print("epoch: {}, batch_id: {}, loss is: {}".format(epoch, batch_id, avg_loss.numpy()))

        # Run the backward pass, then call minimize to apply the gradient update
        avg_loss.backward()
        sgd.minimize(avg_loss)

        # Clear the gradients of this step, ready for the next iteration and update
        model.clear_gradients()

# Training is done; save the trained model
paddle.save(model.state_dict(), 'linear.pdparams')

linear_infer = LinearRegression(input_dim=input_feature, hidden=1)
# Load the previously trained model to prepare for prediction
model_dict = paddle.load("linear.pdparams")
linear_infer.set_state_dict(model_dict)
print("checkpoint loaded.")

# Switch to evaluation mode (as opposed to training mode)
linear_infer.eval()
(infer_x_tensor, infer_y_tensor) = next(test_reader())

infer_result = linear_infer(infer_x_tensor)
print(infer_result.numpy())

print("id: prediction ground_truth")
for idx, val in enumerate(infer_result.numpy()):
    print("%d: %.2f %.2f" % (idx, val, infer_y_tensor.numpy()[idx]))
    

epoch: 0, batch_id: 20, loss is: [379.65326]
epoch: 1, batch_id: 10, loss is: [494.0562]
epoch: 1, batch_id: 20, loss is: [553.76495]
epoch: 2, batch_id: 10, loss is: [661.4332]
epoch: 2, batch_id: 20, loss is: [415.32324]
epoch: 3, batch_id: 10, loss is: [441.50348]
epoch: 3, batch_id: 20, loss is: [430.46228]


epoch: 4, batch_id: 10, loss is: [373.88834]
epoch: 4, batch_id: 20, loss is: [240.96436]
epoch: 5, batch_id: 10, loss is: [391.0603]
epoch: 5, batch_id: 20, loss is: [273.99362]
epoch: 6, batch_id: 10, loss is: [372.81314]
epoch: 6, batch_id: 20, loss is: [610.01404]
epoch: 7, batch_id: 10, loss is: [318.67462]
epoch: 7, batch_id: 20, loss is: [238.03229]
epoch: 8, batch_id: 10, loss is: [552.6242]
epoch: 8, batch_id: 20, loss is: [567.91034]
epoch: 9, batch_id: 10, loss is: [331.6791]
epoch: 9, batch_id: 20, loss is: [138.4335]
epoch: 10, batch_id: 10, loss is: [385.5222]
epoch: 10, batch_id: 20, loss is: [266.11]
epoch: 11, batch_id: 10, loss is: [310.43417]
epoch: 11, batch_id: 20, loss is: [72.60823]
epoch: 12, batch_id: 10, loss is: [468.53107]
epoch: 12, batch_id: 20, loss is: [122.37219]
epoch: 13, batch_id: 10, loss is: [246.44864]
epoch: 13, batch_id: 20, loss is: [158.34715]
epoch: 14, batch_id: 10, loss is: [178.89157]
epoch: 14, batch_id: 20, loss is: [498.45743]
epoch: 1

epoch: 97, batch_id: 10, loss is: [11.765215]
epoch: 97, batch_id: 20, loss is: [89.3209]
epoch: 98, batch_id: 10, loss is: [92.6944]
epoch: 98, batch_id: 20, loss is: [14.764287]
epoch: 99, batch_id: 10, loss is: [31.108477]
epoch: 99, batch_id: 20, loss is: [235.86124]
checkpoint loaded.
[[14.106716]
 [14.193714]
 [13.857725]
 [15.989067]
 [14.674107]
 [15.668662]
 [15.171934]
 [15.025968]
 [12.281435]
 [14.525974]
 [11.457995]
 [13.711352]
 [14.547316]
 [13.685104]
 [13.65205 ]
 [15.145768]
 [15.960554]
 [15.793779]
 [16.109455]
 [14.739093]]
id: prediction ground_truth
0: 14.11 8.50
1: 14.19 5.00
2: 13.86 11.90
3: 15.99 27.90
4: 14.67 17.20
5: 15.67 27.50
6: 15.17 15.00
7: 15.03 17.20
8: 12.28 17.90
9: 14.53 16.30
10: 11.46 7.00
11: 13.71 7.20
12: 14.55 7.50
13: 13.69 10.40
14: 13.65 8.80
15: 15.15 8.40
16: 15.96 16.70
17: 15.79 14.20
18: 16.11 20.80
19: 14.74 13.40

In [56]:
import paddle


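# Define the network structure; this task uses a linear regression model consisting of a single fc layer
def linear_regression_net(input, hidden):
    # Difference 1: most PaddlePaddle APIs work in both modes, but some APIs with mode-specific behavior
    # can only be used in dynamic or in static graph mode and must be chosen accordingly in the model.
    # Also, the dynamic graph follows an object-oriented style while the static graph often uses a procedural style,
    # so even where an API works in both modes, a static-graph-specific API can be more convenient.
    # For example, the static graph counterpart of paddle.nn.Linear is paddle.static.nn.fc.
    # In PaddlePaddle 2.0 and later, all static-graph-specific APIs live under paddle.static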
    out = paddle.static.nn.fc(input, hidden)
    return out


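# Difference 2: PaddlePaddle 2.0 and later default to dynamic graph mode, so static graph mode must be enabled explicitly
paddle.enable_static()
# Difference 3: a static graph needs explicitly defined input variables, i.e. "placeholders". No data is read during
# the network-building phase, so placeholders declare the type, shape, and other properties of the input data.
# The Boston housing price task has 13 features; data is organized in batches whose size may be unknown at definition time, written as None, hence shape=[None, 13]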
main_program = paddle.static.Program()
startup_program = paddle.static.Program()
with paddle.static.program_guard(main_program, startup_program):
    x = paddle.static.data(name='x', shape=[None, 13], dtype='float32')
    # y is the ground truth, a single value, hence shape=[None, 1]
    y = paddle.static.data(name='y', shape=[None, 1], dtype='float32')

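    # Difference 4: reading data for training and prediction is largely the same as in the dynamic graph, but because
    # no data is actually read when the DataLoader is defined in a static graph, feed_list must list the variables to read into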
    batch_size = 20
    train_reader = paddle.io.DataLoader(paddle.text.datasets.UCIHousing(mode='train'), feed_list=[x, y],
                                        batch_size=batch_size, shuffle=True)
    test_reader = paddle.io.DataLoader(paddle.text.datasets.UCIHousing(mode='test'), feed_list=[x], batch_size=batch_size)

    # Call the network to run the forward pass
    prediction = linear_regression_net(x, 1)

    # Compute the loss
    loss = paddle.nn.functional.square_error_cost(input=prediction, label=y)
    avg_loss = paddle.mean(loss)

    # Define the optimizer and call minimize to compute the gradients and add the update operations
    sgd_optimizer = paddle.optimizer.SGD(learning_rate=0.001)
    sgd_optimizer.minimize(avg_loss)

    # Difference 5: a static graph needs an executor to run the network defined above; create it here
    exe = paddle.static.Executor()

    # Difference 6: a static graph must initialize the network explicitly
    exe.run(paddle.static.default_startup_program())

    max_epoch_num = 100  # Train for max_epoch_num epochs
    for epoch in range(max_epoch_num):
        for batch_id, (x_tensor, y_tensor) in enumerate(train_reader()):
            # Difference 7: a static graph runs the computation through the executor's run method; results to retrieve (such as avg_loss) are listed in fetch_list
            avg_loss_value, = exe.run(feed={'x': x_tensor, 'y': y_tensor}, fetch_list=[avg_loss])

            if batch_id % 10 == 0 and batch_id != 0:
                print("epoch: {}, batch_id: {}, loss is: {}".format(epoch, batch_id, avg_loss_value))

    # Difference 8: a static graph uses save_inference_model to save the model for inference
    paddle.static.save_inference_model('./static_linear', [x], [prediction], exe)

infer_exe = paddle.static.Executor()
inference_scope = paddle.static.Scope()
# Run prediction with the trained model
with paddle.static.scope_guard(inference_scope):
    # Difference 9: a static graph uses load_inference_model to load the previously saved model
    [inference_program, feed_target_names, fetch_targets
     ] = paddle.static.load_inference_model('./static_linear', infer_exe)

    # Read one batch of test data
    (infer_x, infer_y) = next(test_reader())

    # Difference 10: prediction in a static graph also goes through the executor's run method, with the loaded inference_program specified
    results = infer_exe.run(
        inference_program,
        feed={feed_target_names[0]: infer_x},
        fetch_list=fetch_targets)

    print("id: prediction ground_truth")
    for idx, val in enumerate(results[0]):
        print("%d: %.2f %.2f" % (idx, val, infer_y.__array__()[idx]))


epoch: 0, batch_id: 10, loss is: [584.5659]
epoch: 0, batch_id: 20, loss is: [576.61444]
epoch: 1, batch_id: 10, loss is: [476.04672]
epoch: 1, batch_id: 20, loss is: [408.51077]
epoch: 2, batch_id: 10, loss is: [567.94775]
epoch: 2, batch_id: 20, loss is: [573.7333]
epoch: 3, batch_id: 10, loss is: [542.2305]
epoch: 3, batch_id: 20, loss is: [438.43756]
epoch: 4, batch_id: 10, loss is: [461.57367]
epoch: 4, batch_id: 20, loss is: [924.50165]
epoch: 5, batch_id: 10, loss is: [488.01862]
epoch: 5, batch_id: 20, loss is: [298.5144]
epoch: 6, batch_id: 10, loss is: [586.33655]
epoch: 6, batch_id: 20, loss is: [205.19876]
epoch: 7, batch_id: 10, loss is: [330.88754]
epoch: 7, batch_id: 20, loss is: [579.54205]
epoch: 8, batch_id: 10, loss is: [223.78882]
epoch: 8, batch_id: 20, loss is: [285.00436]
epoch: 9, batch_id: 10, loss is: [272.37854]
epoch: 9, batch_id: 20, loss is: [218.97125]
epoch: 10, batch_id: 10, loss is: [318.42395]
epoch: 10, batch_id: 20, loss is: [240.31184]
epoch: 11, b

epoch: 91, batch_id: 10, loss is: [27.64689]
epoch: 91, batch_id: 20, loss is: [11.942961]
epoch: 92, batch_id: 10, loss is: [128.92136]
epoch: 92, batch_id: 20, loss is: [15.376593]
epoch: 93, batch_id: 10, loss is: [61.570473]
epoch: 93, batch_id: 20, loss is: [17.13982]
epoch: 94, batch_id: 10, loss is: [62.24692]
epoch: 94, batch_id: 20, loss is: [19.3348]
epoch: 95, batch_id: 10, loss is: [95.80588]
epoch: 95, batch_id: 20, loss is: [13.684561]
epoch: 96, batch_id: 10, loss is: [21.51617]
epoch: 96, batch_id: 20, loss is: [45.71259]
epoch: 97, batch_id: 10, loss is: [20.027863]
epoch: 97, batch_id: 20, loss is: [179.2323]
epoch: 98, batch_id: 10, loss is: [63.03998]
epoch: 98, batch_id: 20, loss is: [85.17096]
epoch: 99, batch_id: 10, loss is: [59.8744]
epoch: 99, batch_id: 20, loss is: [649.7136]
id: prediction ground_truth
0: 14.55 8.50
1: 15.01 5.00
2: 14.28 11.90
3: 16.15 27.90
4: 14.91 17.20
5: 15.77 27.50
6: 15.47 15.00
7: 15.01 17.20
8: 12.46 17.90
9: 14.86 16.30
10: 11.90 