[AMP] fix static promote #53439

zhangting2020 · 2023-04-28T02:45:50Z

PR types

Bug fixes

PR changes

Others

Description

fix static promote

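Operators that were placed in unsupported_list because of performance problems are moved to the black list. This ensures that in O2 mode, weights stay in fp32 in only three cases:

The operator does not support fp16
The operator is not under fp16-guard
Special operators such as bn need to keep fp32 weights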
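The three cases above can be read as a single predicate. The following is a minimal sketch under stated assumptions, not the code in fp16_utils.py: `keeps_fp32_weights`, `SPECIAL_FP32_WEIGHT_OPS`, and the `in_fp16_guard` flag are hypothetical names introduced for illustration, and operators are represented as plain type strings.

```python
# Hypothetical set: the description only says "special operators such as bn".
SPECIAL_FP32_WEIGHT_OPS = {"batch_norm"}


def keeps_fp32_weights(op_type, unsupported_ops, use_fp16_guard=False,
                       in_fp16_guard=True):
    """Return True if, under O2, the op's weights should stay in fp32.

    Simplified illustration of the three cases listed in the description.
    """
    # Case 1: the operator has no fp16 support.
    if op_type in unsupported_ops:
        return True
    # Case 2: fp16-guard is in use and the operator is outside it.
    if use_fp16_guard and not in_fp16_guard:
        return True
    # Case 3: special operators keep fp32 weights.
    if op_type in SPECIAL_FP32_WEIGHT_OPS:
        return True
    # Note: membership in the black list is deliberately not checked here.
    return False


print(keeps_fp32_weights("softmax", unsupported_ops=set()))
print(keeps_fp32_weights("batch_norm", unsupported_ops=set()))
```

The black list is not consulted in this sketch, which matches the stated intent that only these three cases keep fp32 weights.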

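In some models, the weight of one operator may later be used by another operator that is in the white list. The weight's name then appears in both keep_fp32_var_names and to_fp16_var_names, which can leave var.dtype inconsistent with the dtype of the data actually stored. Solution: if a var is in keep_fp32_var_names, remove it from to_fp16_var_names.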

This situation occurs in transformer models, and this change fixes the error it caused there.
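A minimal sketch of the reconciliation described above, assuming both collections can be treated as sets of variable names. The helper name `reconcile_cast_sets` and the variable names in the example are hypothetical, not taken from the PR.

```python
def reconcile_cast_sets(keep_fp32_var_names, to_fp16_var_names):
    """Return the names to cast to fp16, excluding any name that must stay fp32."""
    # A name present in both sets stays fp32, so it is dropped from the fp16 set.
    return set(to_fp16_var_names) - set(keep_fp32_var_names)


# Hypothetical example: one weight is shared by an op that keeps fp32
# and by a white-list op that would otherwise cast it to fp16.
keep_fp32 = {"layer_norm_0.w_0", "shared_0.w_0"}
to_fp16 = {"linear_0.w_0", "shared_0.w_0"}

to_fp16 = reconcile_cast_sets(keep_fp32, to_fp16)
print(sorted(to_fp16))
```

Giving the fp32 set priority means a shared weight is never cast, so its declared dtype and its stored data cannot diverge.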

paddle-bot · 2023-04-28T02:45:54Z

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

paddle-bot · 2023-04-28T02:45:56Z

❌ The PR is not created using PR's template. You can refer to this Demo.
Please use PR's template, it helps save our maintainers' time so that more developers get helped.

zhangting2020 · 2023-05-05T07:05:10Z

python/paddle/static/amp/fp16_lists.py

@@ -196,8 +225,6 @@ def _update_list(self):
                elif op_name in self.gray_list:
                    self.gray_list.remove(op_name)
                self.white_list.add(op_name)
-                if op_name in _extra_unsupported_list:
-                    self.unsupported_list.remove(op_name)


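On line 235 of this file, there is actually no need to add operators from the custom black list to the unsupported list, because they have already been added to the black list.
However, python/paddle/distributed/passes/auto_parallel_fp16.py uses AutoMixedPrecisionLists, and parts of its flow do not take black-list operators into account, so deleting line 235 would make the related unit tests fail. It is not yet clear how much impact modifying auto_parallel_fp16.py directly would have, so the line has not been removed for now.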
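To make the trade-off concrete, here is a simplified sketch of how custom lists might be merged into the default ones. It is an illustration under assumptions, not the actual `_update_list` implementation: `apply_custom_lists` is a hypothetical function that works on plain sets of operator names, and the operator names in the usage example are placeholders.

```python
def apply_custom_lists(white, black, gray, unsupported,
                       custom_white=(), custom_black=()):
    """Merge user-supplied white/black lists into copies of the default lists."""
    white, black = set(white), set(black)
    gray, unsupported = set(gray), set(unsupported)

    overlap = set(custom_white) & set(custom_black)
    if overlap:
        raise ValueError(f"ops in both custom lists: {sorted(overlap)}")

    for op_name in custom_white:
        # A custom white-list op leaves the black or gray list.
        black.discard(op_name)
        gray.discard(op_name)
        white.add(op_name)

    for op_name in custom_black:
        # A custom black-list op leaves the white or gray list.
        white.discard(op_name)
        gray.discard(op_name)
        black.add(op_name)
        # The step discussed in the comment above: logically redundant,
        # but kept because another consumer only checks the unsupported list.
        unsupported.add(op_name)

    return white, black, gray, unsupported


w, b, g, u = apply_custom_lists(
    white={"matmul"}, black={"softmax"}, gray={"relu"}, unsupported=set(),
    custom_white=["relu"], custom_black=["matmul"],
)
print(sorted(w), sorted(b), sorted(g), sorted(u))
```

In this sketch, dropping the `unsupported.add` call would leave the black list as the only record of the custom entry, which is the behaviour the comment says auto_parallel_fp16.py does not yet handle.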

zhangting2020 · 2023-05-05T07:07:16Z

python/paddle/static/amp/fp16_utils.py

+            # from black_list and unsupport_list.
+            if op in ['lookup_table', 'lookup_table_v2']:
+                continue
+            if _need_keep_fp32(op, amp_lists.unsupported_list, use_fp16_guard):


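This resolves the error seen in transformer models and also makes dynamic-graph and static-graph behavior consistent when casting model parameter types: operators in the black list still keep fp16 weights, and only operators that do not support fp16 or that are not under use_fp16_guard need to keep fp32 weights.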

Xreki · 2023-05-06T05:08:10Z

It would be best to merge develop into this PR.

Xreki

LGTM

zhangting2020 added 2 commits April 28, 2023 03:49

fix static promote

1ad0510

fix unittest

b27447a

zhangting2020 force-pushed the amp_fix branch from 46e3e28 to b27447a Compare April 28, 2023 08:57

zhangting2020 added 5 commits April 28, 2023 10:24

Revert "fix right value is 0d and index is List/Tensor (PaddlePaddle#…

3da2fab

…53225)" This reverts commit 81c89dd.

Revert "Hack__getitem__ from 0-d to 1-d with FLAGS_set_to_1d (PaddleP…

31b9309

…addle#53358)" This reverts commit 1bd468e.

fix unittest failure

4976d88

Revert "Revert "Hack__getitem__ from 0-d to 1-d with FLAGS_set_to_1d (P…

f70169d

…addlePaddle#53358)"" This reverts commit 31b9309.

Revert "Revert "fix right value is 0d and index is List/Tensor (Paddl…

ac71a7b

…ePaddle#53225)"" This reverts commit 3da2fab.

zhangting2020 commented May 5, 2023

Xreki approved these changes May 8, 2023

Xreki merged commit 2bf6128 into PaddlePaddle:develop May 8, 2023
24 checks passed

niuliling123 pushed a commit to niuliling123/Paddle that referenced this pull request May 9, 2023

[AMP] fix static promote (PaddlePaddle#53439)

a306078

lanxianghit pushed a commit that referenced this pull request May 9, 2023

[AMP] fix static promote (#53439) (#53641)

c27e6d2

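fix static promote: operators that were placed in unsupported_list because of performance problems are moved to the black list, so that in O2 mode weights stay in fp32 in only three cases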
