Large accuracy gap at deployment #23
Comments
In our measurements, the equivalence of RepVGG before and after conversion is exact to ten decimal places, and this has nothing to do with FLOPs. This looks like a customized model (the final layer has 372 classes). Could you check it the way the first FAQ item describes?
Output:
While implementing RepVGG I observed two phenomena:
one causes a large precision mismatch, while using the initialization below keeps the outputs consistent.
Here is the test code:
Hope this helps.
@zjykzj Thanks 🙏 I'm sure I called eval() before converting. The conversion seems to be just BN fusion plus padding the 1x1 kernel to 3x3; I'll find time to read it again.
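For reference, the BN-fusion half of that conversion can be sketched in NumPy (a minimal sketch under my own assumptions, not the repo's code: a 1x1 conv is modeled as a plain matrix multiply, and helper names like fuse_bn are mine). Padding the 1x1 branch with zeros to a 3x3 kernel before summing the branches is exact and adds no error; BN fusion only introduces float rounding:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "conv" as a matrix multiply y = W @ x, followed by BatchNorm in eval mode.
cin, cout, n = 8, 16, 32
W = rng.standard_normal((cout, cin)).astype(np.float32)
x = rng.standard_normal((cin, n)).astype(np.float32)

# BN parameters; eval mode normalizes with the running statistics.
gamma = rng.standard_normal(cout).astype(np.float32)
beta = rng.standard_normal(cout).astype(np.float32)
mean = rng.standard_normal(cout).astype(np.float32)
var = rng.random(cout).astype(np.float32) + 0.1
eps = 1e-5

def bn_eval(y):
    # y: (cout, n); normalize each output channel with running mean/var.
    return gamma[:, None] * (y - mean[:, None]) / np.sqrt(var + eps)[:, None] + beta[:, None]

def fuse_bn(W, gamma, beta, mean, var, eps):
    # Fold BN into the preceding linear map:
    #   W' = (gamma / std) * W,   b' = beta - gamma * mean / std
    std = np.sqrt(var + eps)
    W_fused = W * (gamma / std)[:, None]
    b_fused = beta - gamma * mean / std
    return W_fused, b_fused

y_ref = bn_eval(W @ x)
W_f, b_f = fuse_bn(W, gamma, beta, mean, var, eps)
y_fused = W_f @ x + b_f[:, None]

# The two paths agree up to tiny float32 rounding error.
print(np.max(np.abs(y_ref - y_fused)))
```

The same algebra applies per output channel of a real 3x3 conv; only the reshape differs.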
In all cases, any model containing BN or dropout should have eval() called before inference; this has nothing to do with RepVGG. @zjykzj, could you give an example that produces a large precision mismatch? No matter how I initialize, and no matter how large the range of the weights, my measured outputs are always exactly identical.
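To illustrate why eval() matters (my own toy example, not code from this repo): in train mode BN normalizes with the current batch's statistics, while in eval mode it uses the accumulated running statistics, so the same input produces different outputs:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8)).astype(np.float32)  # batch of 4, 8 channels
gamma = np.ones(8, dtype=np.float32)
beta = np.zeros(8, dtype=np.float32)
running_mean = np.zeros(8, dtype=np.float32)  # freshly initialized running stats
running_var = np.ones(8, dtype=np.float32)
eps = 1e-5

# Train mode: normalize with the current batch's per-channel statistics.
y_train = gamma * (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps) + beta
# Eval mode: normalize with the running statistics.
y_eval = gamma * (x - running_mean) / np.sqrt(running_var + eps) + beta

# Nonzero: the two modes disagree on the very same input.
print(np.max(np.abs(y_train - y_eval)))
```

Comparing a converted (BN-free) model against a training-mode model therefore mismatches for reasons unrelated to the conversion itself.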
The test code above runs directly on randomly initialized weights, and in that case there is no problem. But once trained weights are loaded, the outputs no longer match.
My implementation and tests were done in my own repo (ZJCV/ZCls). The key code all comes from the author's implementation, so the overall implementation is the same; I will paste the exact model definitions below, and a comparison will make it clear. The tests fall into two parts:
Testing RepVGGBlock
The test code is as follows (see test_repvgg_block.py):
Initialization method 1 uses the following initialization:
Test results:
Initialization method 2 uses the following initialization:
Test results:
Testing RepVGG_B2G4
Similar to the above. Test code:
Initialization method 1 uses the following initialization:
Test results:
Initialization method 2 uses the following initialization:
Test results:
Summary: the observations are similar to @MaeThird's, namely that small models show a small error and large models show a large error.
Our experience is that this level of floating-point error produces no observable difference on any task. As for the comment above that this initialization yields a smaller error: naturally, because gamma went from 1 to 0.01.
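The gamma remark comes down to the scale-dependence of floating-point rounding: absolute error grows with the magnitudes being accumulated, so scaling the weights down by 100x shrinks the measured gap by roughly 100x without the conversion being any more or less exact. A toy NumPy illustration (my own example, not from the thread):

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.standard_normal(10000).astype(np.float32)
x = rng.standard_normal(10000).astype(np.float32)

def abs_error(w, x):
    # The same dot product computed two ways: NumPy's optimized np.dot
    # versus naive sequential float32 accumulation. The difference is
    # pure rounding, analogous to running the fused vs. unfused model.
    fast = float(np.dot(w, x))
    seq = np.float32(0.0)
    for v in w * x:
        seq = np.float32(seq + v)
    return abs(fast - float(seq))

err_gamma_1 = abs_error(w, x)                       # weights at scale 1
err_gamma_001 = abs_error(np.float32(0.01) * w, x)  # weights scaled by 0.01

print(err_gamma_1, err_gamma_001)
```

The smaller absolute gap at scale 0.01 says nothing about one initialization being "more equivalent" than the other.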
Thanks for this great work.
In my use, small models (under 10 MFLOPs) deploy with negligible accuracy loss, but with a large model at 2 GFLOPs the accuracy no longer matches:
LOG:
deploy param: stage0.rbr_reparam.weight torch.Size([64, 1, 3, 3]) -0.048573527
deploy param: stage0.rbr_reparam.bias torch.Size([64]) 0.23182523
deploy param: stage1.0.rbr_reparam.weight torch.Size([128, 64, 3, 3]) -0.0054542203
deploy param: stage1.0.rbr_reparam.bias torch.Size([128]) 1.0140312
deploy param: stage1.1.rbr_reparam.weight torch.Size([128, 64, 3, 3]) 0.0006282824
deploy param: stage1.1.rbr_reparam.bias torch.Size([128]) 0.32761782
deploy param: stage1.2.rbr_reparam.weight torch.Size([128, 128, 3, 3]) 0.0023862773
deploy param: stage1.2.rbr_reparam.bias torch.Size([128]) 0.34976208
deploy param: stage1.3.rbr_reparam.weight torch.Size([128, 64, 3, 3]) -9.027165e-05
deploy param: stage1.3.rbr_reparam.bias torch.Size([128]) 0.0063683093
deploy param: stage2.0.rbr_reparam.weight torch.Size([256, 128, 3, 3]) -8.460902e-05
deploy param: stage2.0.rbr_reparam.bias torch.Size([256]) 0.11033552
deploy param: stage2.1.rbr_reparam.weight torch.Size([256, 128, 3, 3]) -0.00010023986
deploy param: stage2.1.rbr_reparam.bias torch.Size([256]) -0.15826604
deploy param: stage2.2.rbr_reparam.weight torch.Size([256, 256, 3, 3]) -5.3966836e-05
deploy param: stage2.2.rbr_reparam.bias torch.Size([256]) -0.15924689
deploy param: stage2.3.rbr_reparam.weight torch.Size([256, 128, 3, 3]) -6.7551824e-05
deploy param: stage2.3.rbr_reparam.bias torch.Size([256]) -0.37404576
deploy param: stage2.4.rbr_reparam.weight torch.Size([256, 256, 3, 3]) -0.00012947948
deploy param: stage2.4.rbr_reparam.bias torch.Size([256]) -0.6853457
deploy param: stage2.5.rbr_reparam.weight torch.Size([256, 128, 3, 3]) 7.473848e-05
deploy param: stage2.5.rbr_reparam.bias torch.Size([256]) -0.16874048
deploy param: stage3.0.rbr_reparam.weight torch.Size([512, 256, 3, 3]) -0.000433887
deploy param: stage3.0.rbr_reparam.bias torch.Size([512]) 0.18602118
deploy param: stage3.1.rbr_reparam.weight torch.Size([512, 256, 3, 3]) 0.00048246872
deploy param: stage3.1.rbr_reparam.bias torch.Size([512]) -0.7235512
deploy param: stage3.2.rbr_reparam.weight torch.Size([512, 512, 3, 3]) 0.00021061227
deploy param: stage3.2.rbr_reparam.bias torch.Size([512]) -0.5657553
deploy param: stage3.3.rbr_reparam.weight torch.Size([512, 256, 3, 3]) -0.00081703335
deploy param: stage3.3.rbr_reparam.bias torch.Size([512]) -0.37847003
deploy param: stage3.4.rbr_reparam.weight torch.Size([512, 512, 3, 3]) -0.00033185782
deploy param: stage3.4.rbr_reparam.bias torch.Size([512]) -0.57922906
deploy param: stage3.5.rbr_reparam.weight torch.Size([512, 256, 3, 3]) -0.0007206367
deploy param: stage3.5.rbr_reparam.bias torch.Size([512]) -0.56909364
deploy param: stage3.6.rbr_reparam.weight torch.Size([512, 512, 3, 3]) -0.0003344199
deploy param: stage3.6.rbr_reparam.bias torch.Size([512]) -0.5628111
deploy param: stage3.7.rbr_reparam.weight torch.Size([512, 256, 3, 3]) -0.00021987755
deploy param: stage3.7.rbr_reparam.bias torch.Size([512]) -0.34248477
deploy param: stage3.8.rbr_reparam.weight torch.Size([512, 512, 3, 3]) -0.00010127398
deploy param: stage3.8.rbr_reparam.bias torch.Size([512]) -0.5895205
deploy param: stage3.9.rbr_reparam.weight torch.Size([512, 256, 3, 3]) -0.0005824505
deploy param: stage3.9.rbr_reparam.bias torch.Size([512]) -0.37577158
deploy param: stage3.10.rbr_reparam.weight torch.Size([512, 512, 3, 3]) -0.00012262027
deploy param: stage3.10.rbr_reparam.bias torch.Size([512]) -0.6199002
deploy param: stage3.11.rbr_reparam.weight torch.Size([512, 256, 3, 3]) 1.503076e-06
deploy param: stage3.11.rbr_reparam.bias torch.Size([512]) -0.7054796
deploy param: stage3.12.rbr_reparam.weight torch.Size([512, 512, 3, 3]) 0.0006349176
deploy param: stage3.12.rbr_reparam.bias torch.Size([512]) -1.0350925
deploy param: stage3.13.rbr_reparam.weight torch.Size([512, 256, 3, 3]) 0.00037807773
deploy param: stage3.13.rbr_reparam.bias torch.Size([512]) -1.1399512
deploy param: stage3.14.rbr_reparam.weight torch.Size([512, 512, 3, 3]) 0.00025178236
deploy param: stage3.14.rbr_reparam.bias torch.Size([512]) -0.27695537
deploy param: stage3.15.rbr_reparam.weight torch.Size([512, 256, 3, 3]) 0.00074805244
deploy param: stage3.15.rbr_reparam.bias torch.Size([512]) -0.8776718
deploy param: stage4.0.rbr_reparam.weight torch.Size([1024, 512, 3, 3]) -0.00013951868
deploy param: stage4.0.rbr_reparam.bias torch.Size([1024]) 0.021552037
deploy param: linear.weight torch.Size([372, 1024]) 0.0051029953
deploy param: linear.bias torch.Size([372]) 0.17604762
Printing code:
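The printing code is presumably a loop over the converted weights; a minimal stand-in sketch (a hand-built NumPy dict replaces the real state dict so this runs without the model; the names and shapes are copied from the log above):

```python
import numpy as np

# Stand-in for the converted model's weights. The real code would iterate
# over the deploy model's state_dict() instead of this hand-built dict.
rng = np.random.default_rng(0)
deploy_weights = {
    'stage0.rbr_reparam.weight': rng.standard_normal((64, 1, 3, 3)).astype(np.float32),
    'stage0.rbr_reparam.bias': rng.standard_normal(64).astype(np.float32),
}

for name, param in deploy_weights.items():
    # Reproduces the log format: name, shape, mean of the tensor.
    print('deploy param:', name, param.shape, np.mean(param))
```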