# Deep Probabilistic Programming

## Overview

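Deep learning models have strong fitting capability, while Bayesian theory offers good interpretability. MindSpore Deep Probabilistic Programming (MDP) combines deep learning with Bayesian learning: by treating network weights as distributions and introducing latent-space distributions, it samples from those distributions during forward propagation. This introduces uncertainty into the model, which improves its robustness and interpretability. MDP provides a general-purpose, professional probabilistic programming interface for expert users, and it also lets you write probabilistic programs with the same logic you would use to build a deep learning model, so beginners can get started easily. In addition, it provides a deep probabilistic learning toolbox that extends Bayesian functionality.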

This chapter describes how to apply deep probabilistic programming in MindSpore. Before you begin, make sure that MindSpore 0.7.0-beta or later is correctly installed.

> This example applies to GPU and Ascend environments.

## Environment Preparation

Set the execution mode to graph mode and the device target to GPU.

In [1]:
from mindspore import context

context.set_context(mode=context.GRAPH_MODE, save_graphs=False, device_target="GPU")

## Data Preparation

### Downloading the Dataset

Download the MNIST dataset to the specified location by running the following commands in Jupyter Notebook:

In [2]:
!mkdir -p ./datasets/MNIST_Data/train ./datasets/MNIST_Data/test
!wget -NP ./datasets/MNIST_Data/train https://mindspore-website.obs.myhuaweicloud.com/notebook/datasets/mnist/train-labels-idx1-ubyte --no-check-certificate 
!wget -NP ./datasets/MNIST_Data/train https://mindspore-website.obs.myhuaweicloud.com/notebook/datasets/mnist/train-images-idx3-ubyte --no-check-certificate
!wget -NP ./datasets/MNIST_Data/test https://mindspore-website.obs.myhuaweicloud.com/notebook/datasets/mnist/t10k-labels-idx1-ubyte --no-check-certificate
!wget -NP ./datasets/MNIST_Data/test https://mindspore-website.obs.myhuaweicloud.com/notebook/datasets/mnist/t10k-images-idx3-ubyte --no-check-certificate
!tree ./datasets/MNIST_Data

./datasets/MNIST_Data
├── test
│   ├── t10k-images-idx3-ubyte
│   └── t10k-labels-idx1-ubyte
└── train
    ├── train-images-idx3-ubyte
    └── train-labels-idx1-ubyte

2 directories, 4 files
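
Before building the pipeline, it can be useful to confirm that the downloaded files are intact. The sketch below parses an IDX header, the binary layout the MNIST files use: a big-endian magic number whose low byte gives the number of dimensions, followed by one 32-bit size per dimension. The helper name `read_idx_header` is hypothetical, and the header here is synthetic. For a real check you would pass the first 16 bytes of `train-images-idx3-ubyte`.

```python
import struct


def read_idx_header(raw):
    """Parse an IDX header: big-endian magic number, then one size per dimension."""
    magic, = struct.unpack(">I", raw[:4])
    ndim = magic & 0xFF  # the low byte of the magic number is the dimension count
    dims = struct.unpack(">" + "I" * ndim, raw[4:4 + 4 * ndim])
    return magic, dims


# Synthetic header with the values the training image file is expected to start with.
sample = struct.pack(">IIII", 2051, 60000, 28, 28)
magic, dims = read_idx_header(sample)
```

An image file should report magic number 2051 with three dimensions, and a label file 2049 with one dimension.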


### Defining the Dataset Augmentation Method

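The original MNIST training set consists of 60,000 single-channel digit images of $28\times28$ pixels. The LeNet5 network with Bayesian layers used in this example expects input tensors of shape `(32,1,32,32)`, so a custom `create_dataset` function transforms the original dataset to meet the training requirements. For an explanation of each operation, see the quick start guide [Implementing an Image Classification Application](https://www.mindspore.cn/docs/programming_guide/zh-CN/master/quick_start/quick_start.html).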

In [3]:
import mindspore.dataset.vision.c_transforms as CV
import mindspore.dataset.transforms.c_transforms as C
from mindspore.dataset.vision import Inter
from mindspore import dataset as ds
from mindspore import dtype as mstype

def create_dataset(data_path, batch_size=32, repeat_size=1,
                   num_parallel_workers=1):
    # define dataset
    mnist_ds = ds.MnistDataset(data_path)

    # define the parameters used for resizing, rescaling and normalization
    resize_height, resize_width = 32, 32
    rescale = 1.0 / 255.0
    shift = 0.0
    rescale_nml = 1 / 0.3081
    shift_nml = -1 * 0.1307 / 0.3081

    # according to the parameters, generate the corresponding data enhancement method
    c_trans = [
        CV.Resize((resize_height, resize_width), interpolation=Inter.LINEAR),
        CV.Rescale(rescale_nml, shift_nml),
        CV.Rescale(rescale, shift),
        CV.HWC2CHW()
    ]
    type_cast_op = C.TypeCast(mstype.int32)

    # using map to apply operations to a dataset
    mnist_ds = mnist_ds.map(operations=type_cast_op, input_columns="label", num_parallel_workers=num_parallel_workers)
    mnist_ds = mnist_ds.map(operations=c_trans, input_columns="image", num_parallel_workers=num_parallel_workers)

    
    # process the generated dataset
    buffer_size = 10000
    mnist_ds = mnist_ds.shuffle(buffer_size=buffer_size)
    mnist_ds = mnist_ds.batch(batch_size, drop_remainder=True)
    mnist_ds = mnist_ds.repeat(repeat_size)

    return mnist_ds
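
The effect of `batch` with `drop_remainder=True` can be checked with plain arithmetic. This is a sketch that does not use MindSpore: `batches_per_epoch` is a hypothetical helper, and the test-set size of 10,000 images is the standard MNIST figure rather than something stated in this document.

```python
def batches_per_epoch(num_samples, batch_size, drop_remainder=True):
    """Number of batches one pass over the dataset yields."""
    full, leftover = divmod(num_samples, batch_size)
    return full if drop_remainder or leftover == 0 else full + 1


batch_size = 32
train_steps = batches_per_epoch(60000, batch_size)  # MNIST training set
test_steps = batches_per_epoch(10000, batch_size)   # MNIST test set (size assumed)
test_samples_used = test_steps * batch_size         # samples left after dropping the remainder

# Shape of one image batch after resizing to 32x32 and HWC2CHW: (N, C, H, W).
batch_shape = (batch_size, 1, 32, 32)
```

With these settings one epoch over the training set is 1875 steps, and the 16 test images that do not fill a final batch are dropped.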

## Defining the Deep Neural Network

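In the classic LeNet5 network, data flows through the following stages: convolution 1 -> activation -> pooling -> convolution 2 -> activation -> pooling -> flatten -> fully connected 1 -> fully connected 2 -> fully connected 3.  
This example introduces probabilistic programming by converting two layers, convolution 1 and fully connected 1, into Bayesian layers, yielding a LeNet5 network with Bayesian layers.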
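
The spatial size at each stage determines the input width of the first fully connected layer. The sketch below traces it for a $32\times32$ input, assuming 5x5 convolutions without padding and 2x2 pooling with stride 2, which are the settings used by the network defined in this section. `conv_out` and `pool_out` are hypothetical helper names.

```python
def conv_out(size, kernel, stride=1):
    """Spatial size after a convolution with no padding ('valid' mode)."""
    return (size - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Spatial size after max pooling."""
    return (size - kernel) // stride + 1


size = 32
after_block1 = pool_out(conv_out(size, 5))          # conv1: 32 -> 28, pool: 28 -> 14
after_block2 = pool_out(conv_out(after_block1, 5))  # conv2: 14 -> 10, pool: 10 -> 5

# conv2 has 16 output channels, so flattening yields 16 * 5 * 5 features.
flat_features = 16 * after_block2 * after_block2
```

This is where the `16 * 5 * 5` input size of the first fully connected layer comes from.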

In [4]:
from mindspore.common.initializer import Normal
import mindspore.nn as nn
from mindspore.nn.probability import bnn_layers
import mindspore.ops as ops
from mindspore import dtype as mstype


class BNNLeNet5(nn.Cell):
    def __init__(self, num_class=10):
        super(BNNLeNet5, self).__init__()
        self.num_class = num_class
        self.conv1 = bnn_layers.ConvReparam(1, 6, 5, stride=1, padding=0, has_bias=False, pad_mode="valid")
        self.conv2 = nn.Conv2d(6, 16, 5, pad_mode='valid')
        self.fc1 = bnn_layers.DenseReparam(16 * 5 * 5, 120)
        self.fc2 = nn.Dense(120, 84, weight_init=Normal(0.02))
        self.fc3 = nn.Dense(84, self.num_class)
        self.relu = nn.ReLU()
        self.max_pool2d = nn.MaxPool2d(kernel_size=2, stride=2)
        self.flatten = nn.Flatten()

    def construct(self, x):
        x = self.max_pool2d(self.relu(self.conv1(x)))
        x = self.max_pool2d(self.relu(self.conv2(x)))
        x = self.flatten(x)
        x = self.relu(self.fc1(x))
        x = self.relu(self.fc2(x))
        x = self.fc3(x) 
        return x

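In this example, convolution layer 1 and fully connected layer 1 are replaced with the Bayesian convolution layer `bnn_layers.ConvReparam` and the Bayesian fully connected layer `bnn_layers.DenseReparam`.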
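
Conceptually, a reparameterized Bayesian layer keeps a distribution over each weight and draws a fresh sample on every forward pass. The following is a minimal pure-Python sketch of that idea, not MindSpore's implementation: the class `ToyReparamDense`, the Gaussian posterior, the softplus parameterization, and the initial values are all assumptions made for illustration.

```python
import math
import random


def softplus(x):
    """Map an unconstrained value to a positive standard deviation."""
    return math.log1p(math.exp(x))


class ToyReparamDense:
    """Hypothetical dense layer with Gaussian weights (illustration only)."""

    def __init__(self, in_features, out_features, rng):
        self.rng = rng
        # Variational parameters: one mean and one unconstrained scale per weight.
        self.mu = [[rng.gauss(0.0, 0.1) for _ in range(in_features)]
                   for _ in range(out_features)]
        self.rho = [[-3.0] * in_features for _ in range(out_features)]

    def sample_weights(self):
        # Reparameterization: w = mu + sigma * eps, with eps drawn from N(0, 1).
        return [[m + softplus(r) * self.rng.gauss(0.0, 1.0)
                 for m, r in zip(mu_row, rho_row)]
                for mu_row, rho_row in zip(self.mu, self.rho)]

    def forward(self, x):
        # Each call draws new weights, so repeated calls give different outputs.
        weights = self.sample_weights()
        return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


layer = ToyReparamDense(in_features=4, out_features=2, rng=random.Random(0))
x = [1.0, 2.0, 3.0, 4.0]
first = layer.forward(x)
second = layer.forward(x)
```

Because the weights are resampled, `first` and `second` differ even though the input is identical. The spread across repeated passes is one way to read off predictive uncertainty.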

### Defining the Training Network

Define the training network and train it.

In [5]:
import os

from mindspore import Model
from mindspore.nn import Accuracy
from mindspore.train.callback import ModelCheckpoint, CheckpointConfig, LossMonitor


lr = 0.01
momentum = 0.9
model_path = "./models/ckpt/probability_bnnlenet5/"

# clean old run files
os.system("rm -f {0}*.meta {0}*.ckpt".format(model_path))
network = BNNLeNet5()
criterion = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction="mean")
optimizer = nn.Momentum(network.trainable_params(), lr, momentum)
model = Model(network, criterion, optimizer, metrics={"Accuracy": Accuracy()} )

config_ck = CheckpointConfig(save_checkpoint_steps=1875, keep_checkpoint_max=16)
ckpoint_cb = ModelCheckpoint(prefix="checkpoint_lenet", directory=model_path, config=config_ck)

ds_train_path = "./datasets/MNIST_Data/train/"
train_set = create_dataset(ds_train_path, 32, 1)
model.train(1, train_set, callbacks=[ckpoint_cb, LossMonitor()])

epoch: 1 step: 1, loss is 2.3022718
epoch: 1 step: 2, loss is 2.3022223
epoch: 1 step: 3, loss is 2.3028727
epoch: 1 step: 4, loss is 2.3034232
epoch: 1 step: 5, loss is 2.3019493
epoch: 1 step: 6, loss is 2.3017588
... ...
epoch: 1 step: 1866, loss is 0.097549394
epoch: 1 step: 1867, loss is 0.082386635
epoch: 1 step: 1868, loss is 0.027000971
epoch: 1 step: 1869, loss is 0.026424333
epoch: 1 step: 1870, loss is 0.19351783
epoch: 1 step: 1871, loss is 0.02400064
epoch: 1 step: 1872, loss is 0.3389563
epoch: 1 step: 1873, loss is 0.004886848
epoch: 1 step: 1874, loss is 0.020785151
epoch: 1 step: 1875, loss is 0.33145565
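
The loss at the first steps is close to $\ln 10 \approx 2.3026$, which is what softmax cross-entropy gives when a 10-class model assigns equal probability to every class. A small sketch, independent of MindSpore, with a hypothetical `softmax_cross_entropy` helper:

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy of softmax(logits) against an integer class label."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


# Equal logits for all 10 classes: the model has no preference yet.
uniform_loss = softmax_cross_entropy([0.0] * 10, label=3)

# A confident and correct prediction gives a loss near zero.
confident_loss = softmax_cross_entropy([10.0] + [0.0] * 9, label=0)
```

A starting loss near this value is therefore consistent with an untrained classifier, and the drop over the epoch reflects the network becoming confident on most batches.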


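After training completes, a weight parameter file with the `.ckpt` suffix and a computational graph file with the `.meta` suffix are generated in the corresponding path.  
The directory structure is as follows: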

In [6]:
!tree $model_path

./models/ckpt/probability_bnnlenet5/
├── checkpoint_lenet-1_1875.ckpt
└── checkpoint_lenet-graph.meta

0 directories, 2 files
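
When several checkpoints accumulate (up to `keep_checkpoint_max`), you may want to pick the most recent one. The sketch below parses file names of the form `<prefix>-<epoch>_<step>.ckpt`. That pattern is inferred from the listing above and is an assumption, not a documented guarantee. `parse_checkpoint_name` and `latest_checkpoint` are hypothetical helpers, and the second file name in the example is made up.

```python
import re

CKPT_PATTERN = re.compile(r"^(?P<prefix>.+)-(?P<epoch>\d+)_(?P<step>\d+)\.ckpt$")


def parse_checkpoint_name(name):
    """Split '<prefix>-<epoch>_<step>.ckpt' into (prefix, epoch, step), or return None."""
    match = CKPT_PATTERN.match(name)
    if match is None:
        return None
    return match["prefix"], int(match["epoch"]), int(match["step"])


def latest_checkpoint(names):
    """Return the name with the highest (epoch, step) among parseable names."""
    parsed = [(parse_checkpoint_name(n), n) for n in names]
    candidates = [(info[1:], n) for info, n in parsed if info is not None]
    return max(candidates)[1] if candidates else None


info = parse_checkpoint_name("checkpoint_lenet-1_1875.ckpt")
newest = latest_checkpoint([
    "checkpoint_lenet-1_1875.ckpt",
    "checkpoint_lenet-2_1875.ckpt",  # hypothetical second-epoch file
    "checkpoint_lenet-graph.meta",   # ignored: not a .ckpt file
])
```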


## Verifying Model Accuracy

Load the validation dataset and verify the accuracy of the LeNet5 network with Bayesian layers.

In [7]:
ds_eval_path = "./datasets/MNIST_Data/test/"
test_set = create_dataset(ds_eval_path, 32, 1)
acc = model.eval(test_set)
print(acc)

{'Accuracy': 0.9730568910256411}


The model accuracy is about 0.973, above 0.95, which indicates that the model performs well.
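
The reported value can be related to sample counts. Assuming the metric is the fraction of correctly classified samples, and that `drop_remainder=True` leaves 312 full batches of 32 from a 10,000-image test set (the standard MNIST size, not stated in this document), the arithmetic is:

```python
reported_accuracy = 0.9730568910256411  # value printed by model.eval
evaluated = (10000 // 32) * 32          # samples kept after dropping the partial batch

correct = round(reported_accuracy * evaluated)
misclassified = evaluated - correct
recomputed = correct / evaluated
```

Under these assumptions the result corresponds to 9715 correct predictions out of 9984 evaluated images.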

## Summary

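This example applied deep probabilistic programming to the classic LeNet5 deep neural network. Training the LeNet5 network with Bayesian layers feels very similar to training the original LeNet5. Interested readers can compare the two in terms of convergence speed and stability to see whether the advantages of deep probabilistic programming described in the overview are reflected.  
The most exciting recent applications of deep probabilistic programming are in generative networks such as CVAE and GAN, which make it possible to generate highly realistic data. The next article uses a CVAE network to introduce another application of deep probabilistic programming.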