Networks with a dropout op give inconsistent results across repeated predictions #9144

gavin1332 · 2018-03-16T02:26:36Z

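Experimental setup:

Train a language model (single-layer GRU, with dropout applied to the GRU output) and measure test perplexity (ppl).
Load the model with fluid.io.load_inference_model.
Run prediction sequentially over the saved epochs, starting from different epochs.
Checked the inference network's pb config and confirmed that inference mode is enabled.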

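Conclusions:

For the model with the dropout op, when prediction starts from the same epoch, the results for each evaluated epoch match (see the last two columns of Table 1); when prediction starts from different epochs, the ppl for the same evaluated epoch differs (see the middle three columns of Table 1).
For the model without the dropout op, the results for each evaluated epoch match no matter how prediction is run.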

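Table 1. Results with the dropout op (ppl; columns give the epoch prediction started from, rows give the evaluated epoch)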

predict from	epoch 0	epoch 2	epoch 3	epoch 3 (again)
epoch 0	442.66687
epoch 1	329.35619
epoch 2	279.49527	279.49610
epoch 3	245.50802	245.50782	245.50917	245.50917
epoch 4	221.23342	221.23379	221.23351	221.23351

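Table 2. Results without the dropout op (ppl; columns give the epoch prediction started from, rows give the evaluated epoch)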

predict from	epoch 0	epoch 2	epoch 3
epoch 0	212.51747
epoch 1	163.43292
epoch 2	143.05275	143.05275
epoch 3	135.12422	135.12422	135.12422
epoch 4	132.23307	132.23307	132.23307
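To put a number on the discrepancy, the sketch below computes, for each evaluated epoch, the gap between the largest and smallest ppl reported in the two tables. The values are copied from Tables 1 and 2; the names `table1`, `table2`, and `spread` are illustrative only and are not part of the original experiment.

```python
# ppl values per evaluated epoch, copied from Table 1 (with dropout op).
table1 = {
    "epoch 0": [442.66687],
    "epoch 1": [329.35619],
    "epoch 2": [279.49527, 279.49610],
    "epoch 3": [245.50802, 245.50782, 245.50917, 245.50917],
    "epoch 4": [221.23342, 221.23379, 221.23351, 221.23351],
}

# ppl values per evaluated epoch, copied from Table 2 (without dropout op).
table2 = {
    "epoch 0": [212.51747],
    "epoch 1": [163.43292],
    "epoch 2": [143.05275, 143.05275],
    "epoch 3": [135.12422, 135.12422, 135.12422],
    "epoch 4": [132.23307, 132.23307, 132.23307],
}


def spread(values):
    """Gap between the largest and smallest ppl for one evaluated epoch."""
    return max(values) - min(values)


for name, table in (("with dropout", table1), ("without dropout", table2)):
    for epoch, values in sorted(table.items()):
        gap = spread(values)
        # Relative gap shows how small the mismatch is compared to the ppl itself.
        print(name, epoch, "abs gap = %.5f" % gap, "rel gap = %.2e" % (gap / min(values)))
```

With dropout the gaps are nonzero but small relative to the ppl values; without dropout every gap is exactly zero.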

qingqing01 · 2018-03-16T05:34:04Z

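I looked at the test-phase code of the CPU and GPU dropout implementations and found no problem. To debug further, could you compare whether the dropout input and output are the same on each run?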
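One way to do that comparison is sketched below with plain NumPy. It assumes the dropout input and output have already been fetched as arrays from two runs; `compare_fetches` and the stand-in arrays are hypothetical names for illustration, not part of Paddle.

```python
import numpy as np


def compare_fetches(a, b):
    """Compare two fetched tensors and report how far apart they are."""
    a = np.asarray(a)
    b = np.asarray(b)
    if a.shape != b.shape:
        return {"same_shape": False, "identical": False, "max_abs_diff": None}
    # Compute the difference in float64 so tiny float32 gaps are not lost.
    diff = np.abs(a.astype("float64") - b.astype("float64"))
    max_abs_diff = float(diff.max()) if diff.size else 0.0
    return {
        "same_shape": True,
        "identical": bool(np.array_equal(a, b)),
        "max_abs_diff": max_abs_diff,
    }


# Stand-in arrays playing the role of the dropout output from two runs.
run_a = np.full((2, 3), 0.9, dtype="float32")
run_b = run_a.copy()
run_c = run_a.copy()
run_c[0, 0] = 0.0  # simulate one element that was dropped in only one run

print(compare_fetches(run_a, run_b))
print(compare_fetches(run_a, run_c))
```

If the inputs are identical but the outputs differ, the dropout op itself is the likely source; if the inputs already differ, the cause lies upstream.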

guoshengCS · 2018-03-16T06:01:48Z

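I tested dropout's test mode in isolation: I fed an all-ones input through dropout in test mode and printed the maximum and minimum of the output. The output looks as expected. Test code: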

from __future__ import print_function  # print() behaves the same on Python 2 and 3

import numpy as np

import paddle.v2 as paddle
import paddle.fluid as fluid

data_shape = [64, 32, 512]
is_test = True  # run dropout in test (inference) mode

def program():
    # Build a graph with a single dropout op on a fixed-shape input.
    x = fluid.layers.data(name='x', shape=data_shape, dtype='float32', append_batch_size=False)
    out = fluid.layers.dropout(x, dropout_prob=0.1, is_test=is_test)
    return out


def main():
    # place = fluid.CPUPlace()
    place = fluid.CUDAPlace(0)
    exe = fluid.Executor(place)
    out = program()
    # Feed an all-ones tensor so any randomness in the output is easy to spot.
    data_input = {}
    in_tensor = fluid.LoDTensor()
    in_tensor.set(np.ones(data_shape, dtype="float32"), place)
    data_input['x'] = in_tensor
    # Run the same program 10 times and print the max and min of the output.
    for i in range(10):
        out_ = exe.run(fluid.framework.default_main_program(), feed=data_input, fetch_list=[out])[0]
        print(np.max(out_), np.min(out_))


if __name__ == "__main__":
    main()

Output:

0.9 0.9
0.9 0.9
0.9 0.9
0.9 0.9
0.9 0.9
0.9 0.9
0.9 0.9
0.9 0.9
0.9 0.9
0.9 0.9

Could you also try fetching the dropout input to confirm whether the difference comes from the input?
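The constant 0.9 above is consistent with test-mode dropout applying no random mask and simply scaling the input by 1 - dropout_prob. The NumPy sketch below illustrates that reading and contrasts it with a training-mode mask, whose result depends on the random seed. It is an assumption-based illustration, not Paddle's implementation; `dropout_sketch` and its `rng` parameter are hypothetical.

```python
import numpy as np


def dropout_sketch(x, dropout_prob, is_test, rng=None):
    """Illustrative dropout; an assumption-based sketch, not Paddle's kernel."""
    if is_test:
        # Test mode: no randomness, every element is scaled by (1 - dropout_prob).
        return x * (1.0 - dropout_prob)
    # Training mode: each element is zeroed with probability dropout_prob.
    rng = np.random.default_rng() if rng is None else rng
    mask = rng.random(x.shape) >= dropout_prob
    return x * mask


x = np.ones((4, 8), dtype="float32")

# Two test-mode calls on the same input give the same output.
test_a = dropout_sketch(x, 0.1, is_test=True)
test_b = dropout_sketch(x, 0.1, is_test=True)
print(np.max(test_a), np.min(test_a), np.array_equal(test_a, test_b))

# Training-mode output depends on the generator state passed in.
train_a = dropout_sketch(x, 0.1, is_test=False, rng=np.random.default_rng(0))
train_b = dropout_sketch(x, 0.1, is_test=False, rng=np.random.default_rng(0))
print(np.array_equal(train_a, train_b))
```

Under this reading, test-mode output can only vary between runs if the input varies or if the op is not actually running in test mode.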

pkuyym · 2018-03-19T09:39:27Z

Related #8654

gavin1332 · 2018-03-23T06:28:54Z

After updating to the latest trunk, the problem described in this issue no longer occurs. Closing the issue.

gavin1332 assigned guru4elephant Mar 16, 2018

guru4elephant assigned guoshengCS Mar 16, 2018

qingqing01 added the User (used to mark user issues) label Mar 16, 2018

qingqing01 added the 屯 label Mar 16, 2018

This was referenced Mar 21, 2018

Fix/dropout seed #9304

Closed

"fast hack" #9325

Merged

gavin1332 closed this as completed Mar 23, 2018
