After converting the model to a PyTorch ckpt and loading it, embedding the same ids gives different results #75

randomtutu · 2019-04-03T12:38:19Z

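I got a fairly large improvement on my own dataset, so I'd like to keep modifying the model.

Because Paddle's documentation is sparse, I want to convert the model to PyTorch.

Using the script from #37, I converted the Paddle ckpt to a dict, then loaded the weights into PyTorch via a simple name mapping and tensor conversion.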
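The "name mapping plus tensor conversion" step can be sketched roughly as below. This is only an illustration, not the script from #37 or the poster's own script: `convert_state`, `name_map`, `transpose_names`, and every parameter name in the example are hypothetical, and nested lists stand in for real tensors. Whether a given matrix needs transposing depends on how each framework lays out that layer's weight, which the thread does not spell out, so it is left as an explicit option.

```python
def transpose(matrix):
    """Transpose a 2-D nested list."""
    return [list(row) for row in zip(*matrix)]


def convert_state(paddle_params, name_map, transpose_names=()):
    """Rename parameters and optionally transpose selected 2-D weights.

    paddle_params:   dict of source parameter name -> value (nested lists here)
    name_map:        dict of source name -> target name (hypothetical names)
    transpose_names: source names whose 2-D weights should be transposed
    """
    converted = {}
    for src_name, dst_name in name_map.items():
        if src_name not in paddle_params:
            raise KeyError(f"missing source parameter: {src_name}")
        value = paddle_params[src_name]
        if src_name in transpose_names:
            value = transpose(value)
        converted[dst_name] = value
    return converted


# Hypothetical two-parameter example.
paddle_params = {
    "emb_src": [[0.1, 0.2]],
    "fc_src.w": [[1, 2, 3], [4, 5, 6]],
}
name_map = {"emb_src": "emb_dst.weight", "fc_src.w": "fc_dst.weight"}
state = convert_state(paddle_params, name_map, transpose_names={"fc_src.w"})
print(sorted(state))
```

Raising on a missing source name, rather than skipping it, makes an incomplete mapping visible early instead of leaving a layer silently at its random initialization.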

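But when I tested after loading, the results were very different.

To locate the discrepancy, I ran the forward pass of both frameworks' models on the same data and compared the output of every step.
It turns out that with the same embedding weight, the two frameworks produce different embeddings for the same id.
I haven't yet compared how the two frameworks implement embedding.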
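A step-by-step comparison like the one described can be sketched as follows. It assumes the intermediate outputs of both models have already been dumped into plain dicts keyed by step name; `first_divergence`, the step names, the tolerance, and the numbers are all made up for illustration.

```python
def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two equally shaped nested lists."""
    if isinstance(a, (list, tuple)):
        if len(a) != len(b):
            raise ValueError("shape mismatch")
        return max((max_abs_diff(x, y) for x, y in zip(a, b)), default=0.0)
    return abs(a - b)


def first_divergence(ref, other, names, tol=1e-5):
    """Return (name, diff) for the first step that differs by more than tol, else None."""
    for name in names:
        diff = max_abs_diff(ref[name], other[name])
        if diff > tol:
            return name, diff
    return None


# Hypothetical dumps from the two frameworks, listed in forward-pass order.
paddle_out = {
    "embedding": [[0.1, 0.2], [0.3, 0.4]],
    "layer_norm": [[1.0, -1.0], [0.5, -0.5]],
}
torch_out = {
    "embedding": [[0.1, 0.2], [0.3, 0.4]],
    "layer_norm": [[1.2, -1.2], [0.5, -0.5]],
}
result = first_divergence(paddle_out, torch_out, ["embedding", "layer_norm"])
print(result)
```

Walking the steps in forward order matters: once one step diverges, everything downstream will too, so only the first mismatch points at the cause.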

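My guess is that either the two frameworks implement embedding differently, or the weights saved by this script are slightly off.

If anyone is working on the same thing, please get in touch so we can exchange notes.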

randomtutu · 2019-04-04T07:43:31Z

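The problem is solved. It was mainly caused by the model's LayerNorm eps being different.

The issue in the first post was mainly because I didn't set persistable = True when fetching data out of the scope.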

Mansterteddy · 2019-04-08T03:05:29Z

@randomtutu Could you share your PyTorch conversion script? Also, the vocab.txt files for ERNIE and BERT are different. How did you handle that?

huntzhan · 2019-04-12T06:11:04Z

@randomtutu Quick question: are you using the https://github.com/huggingface/pytorch-pretrained-BERT implementation? Would you mind sharing your script? :)

Mansterteddy · 2019-04-12T07:28:57Z

I've successfully converted it to PyTorch and the results look good. You can refer to huggingface's script for converting TensorFlow to PyTorch.

randomtutu · 2019-04-15T06:32:23Z

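> @randomtutu Quick question: are you using the https://github.com/huggingface/pytorch-pretrained-BERT implementation? Would you mind sharing your script? :)

Follow the approach in the reply just below that one. The script is very simple. I'm on an internal network, so I won't post it for now.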

randomtutu · 2019-04-15T06:33:14Z

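> I've successfully converted it to PyTorch and the results look good. You can refer to huggingface's script for converting TensorFlow to PyTorch.

Hi, could you leave some contact info so we can talk? I'm still doing more work along the lines of this model. WeChat: 563056419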

arlenzhu · 2019-04-21T08:17:53Z

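> The problem is solved. It was mainly caused by the model's LayerNorm eps being different.
>
> The issue in the first post was mainly because I didn't set persistable = True when fetching data out of the scope.

Hi, where exactly should persistable = True be added? I'm not very familiar with the PaddlePaddle framework.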

wq343580510 · 2019-04-25T12:53:43Z

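> The problem is solved. It was mainly caused by the model's LayerNorm eps being different.
>
> The issue in the first post was mainly because I didn't set persistable = True when fetching data out of the scope.

I checked, and the eps looks like 1e-5 in both.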

huntzhan · 2019-04-25T13:32:59Z

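> The problem is solved. It was mainly caused by the model's LayerNorm eps being different.
>
> The issue in the first post was mainly because I didn't set persistable = True when fetching data out of the scope.

Hi @randomtutu, where did you set persistable=True?

I noticed that ernie_encoder.py sets persistable=True. I'm not sure whether that is related to what you did: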

https://github.com/PaddlePaddle/LARK/blob/b9dae026c25602b96adf7ee776ff9f894c912338/ERNIE/ernie_encoder.py#L78-L81

huntzhan · 2019-04-25T13:48:12Z

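> > The problem is solved. It was mainly caused by the model's LayerNorm eps being different.
> > The issue in the first post was mainly because I didn't set persistable = True when fetching data out of the scope.
>
> I checked, and the eps looks like 1e-5 in both.

In pytorch-pretrained-BERT it is 1e-12 😂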

https://github.com/huggingface/pytorch-pretrained-BERT/blob/d76a57b0ba198eee27b3777f57fcabb6aba8b965/pytorch_pretrained_bert/modeling.py#L231-L234

huntzhan · 2019-04-25T15:55:56Z

Thanks @randomtutu. Following his finding, I set the epsilon of every LayerNorm to 1e-5, and the outputs now match PaddlePaddle's to five decimal places.
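To see why the eps value can matter this much, here is a small worked example. It is a plain-Python sketch of layer normalization without the learned scale and shift, not either framework's actual kernel, and the input values are made up. It assumes eps is added to the variance inside the square root, as in the pytorch-pretrained-BERT code linked above.

```python
import math


def layer_norm(x, eps):
    """Normalize a 1-D list to zero mean and unit variance (no learned scale/shift)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two equal-length lists."""
    return max(abs(p - q) for p, q in zip(a, b))


# Hypothetical activations: one vector with a tiny spread, one with a large spread.
x_small = [0.010, 0.012, 0.008, 0.011, 0.009]  # variance around 2e-6
x_large = [1.0, 2.0, 3.0, 4.0, 5.0]            # variance of 2.0

# Compare eps=1e-5 (the value the outputs were matched with) against eps=1e-12.
diff_small_spread = max_abs_diff(layer_norm(x_small, 1e-5), layer_norm(x_small, 1e-12))
diff_large_spread = max_abs_diff(layer_norm(x_large, 1e-5), layer_norm(x_large, 1e-12))

print(diff_small_spread, diff_large_spread)
```

When the variance of the input is much larger than eps, the two settings are nearly indistinguishable. When the variance is comparable to or smaller than 1e-5, eps dominates the denominator and the normalized values change substantially, so how large the mismatch is depends on the activations being normalized.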

wq343580510 · 2019-04-26T11:49:25Z

> Thanks @randomtutu. Following his finding, I set the epsilon of every LayerNorm to 1e-5, and the outputs now match PaddlePaddle's to five decimal places.

So does persistable need to be added or not?

fyubang · 2019-04-27T09:37:52Z

Where do I add persistable on the PyTorch side?

fyubang · 2019-04-27T10:00:02Z

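> Thanks @randomtutu. Following his finding, I set the epsilon of every LayerNorm to 1e-5, and the outputs now match PaddlePaddle's to five decimal places.

If I train with it set to 1e-12 instead, how much worse would the results be?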

huntzhan · 2019-04-27T11:26:16Z

@wq343580510 @fyubang
After talking with @randomtutu, I can confirm that persistable does not need to be added.
After converting to torch, the results in the paper can be reproduced.

fyubang · 2019-04-27T13:20:37Z

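> @wq343580510 @fyubang
> After talking with @randomtutu, I can confirm that persistable does not need to be added.
> After converting to torch, the results in the paper can be reproduced.

Hi, could you add me on WeChat so we can talk? zhaofubang0014. I have a project where I tried both BERT and ERNIE, but ERNIE has never done better than BERT.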

wq343580510 · 2019-04-28T08:39:02Z

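I ran into something odd. A model fine-tuned with the Paddle code is indeed better than BERT, and after converting it to PyTorch the prediction scores match. But when I convert the authors' pretrained model to PyTorch first and then fine-tune, the results are not good. From what I've tried, it doesn't seem to be caused by a different initialization of the final classification layer, and I'm still looking for the actual cause. (One path is fine-tune first, then convert; the other is convert the pretrained model first, then fine-tune with PyTorch code.) Has anyone run into something similar?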

fyubang · 2019-04-28T08:57:57Z

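> I ran into something odd. A model fine-tuned with the Paddle code is indeed better than BERT, and after converting it to PyTorch the prediction scores match. But when I convert the authors' pretrained model to PyTorch first and then fine-tune, the results are not good. From what I've tried, it doesn't seem to be caused by a different initialization of the final classification layer, and I'm still looking for the actual cause. (One path is fine-tune first, then convert; the other is convert the pretrained model first, then fine-tune with PyTorch code.) Has anyone run into something similar?

I convert first and then fine-tune in torch, and I've never gotten particularly good results. (My data pipeline and final metrics are cumbersome, so I've never fine-tuned on PaddlePaddle.) How many points apart are your two cases?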

wq343580510 · 2019-04-28T09:26:43Z

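> > I ran into something odd. A model fine-tuned with the Paddle code is indeed better than BERT, and after converting it to PyTorch the prediction scores match. But when I convert the authors' pretrained model to PyTorch first and then fine-tune, the results are not good. From what I've tried, it doesn't seem to be caused by a different initialization of the final classification layer, and I'm still looking for the actual cause. (One path is fine-tune first, then convert; the other is convert the pretrained model first, then fine-tune with PyTorch code.) Has anyone run into something similar?
>
> I convert first and then fine-tune in torch, and I've never gotten particularly good results. (My data pipeline and final metrics are cumbersome, so I've never fine-tuned on PaddlePaddle.) How many points apart are your two cases?

ERNIE improves AUC on my task by about one point relative to BERT. But converting to PyTorch first and then fine-tuning still has some problem that I'm tracking down. I'd suggest you first get the Paddle code running end to end and check the results.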

huntzhan · 2019-04-28T10:24:30Z

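> I ran into something odd. A model fine-tuned with the Paddle code is indeed better than BERT, and after converting it to PyTorch the prediction scores match. But when I convert the authors' pretrained model to PyTorch first and then fine-tune, the results are not good. From what I've tried, it doesn't seem to be caused by a different initialization of the final classification layer, and I'm still looking for the actual cause. (One path is fine-tune first, then convert; the other is convert the pretrained model first, then fine-tune with PyTorch code.) Has anyone run into something similar?

After converting to PyTorch, fine-tuning on LCQMC gives me results that match the paper, using the parameters in https://github.com/PaddlePaddle/LARK/blob/develop/ERNIE/script/run_lcqmc.sh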

stale · 2020-05-21T13:20:45Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reopen it. Thank you for your contributions.

stale bot added the wontfix label on May 21, 2020

stale bot closed this as completed on May 28, 2020
