add fc-residual quantization #46917

sfraczek · 2022-10-11T15:39:55Z

PR types

Performance optimization

PR changes

Others

Describe

Add quantization of FC+residual fused op

paddle-bot · 2022-10-11T15:39:59Z

Your PR has been submitted. Thank you for contributing to this open-source project!
Please watch for the results of the automated CI tests. See the Paddle CI Manual for details.

tsocha · 2022-10-12T08:41:58Z

paddle/fluid/framework/ir/mkldnn/cpu_quantize_squash_pass.cc

+        bool residual_fc = false;

        std::string last_op_input_name =
            FindInputNameByVarName(last_op_op, quant_out->Name());
+        if (last_op_input_name.empty()) {
+          last_op_input_name =
+              FindOutputNameByVarName(last_op_op, quant_out->Name());
+          PADDLE_ENFORCE_EQ(last_op_input_name,
+                            "ResidualData",
+                            platform::errors::InvalidArgument(
+                                "Only the ResidualData output is allowed to be "
+                                "linked to earlier op."));
+          PADDLE_ENFORCE_EQ(last_op_op->Type(),
+                            "fc",
+                            platform::errors::InvalidArgument(
+                                "Only fc operator supports ResidualData output "
+                                "linked as input."));
+          residual_fc = true;
+        }


This logic seems a little odd to me.
Must a residual connection have an empty name?
Are there any other scenarios where the name can be empty?
What if an op in the graph has an empty name? Will this pass throw an exception?

The name is empty because FindInputNameByVarName returns an empty string when the variable is not found among the inputs. In that case we search the outputs instead, because fc is a special case where ResidualData is an output rather than an input.
If this branch is entered and both checks pass, we know we are dealing with an fc op that has residual data (as an output). We then set residual_fc = true, which later determines whether we call SetInput or SetOutput.
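The fallback described in this reply can be sketched outside Paddle as follows. This is only an illustration under stated assumptions: MockOp, FindSlotByVarName, SlotLookup and ResolveSlot are hypothetical names, not Paddle APIs, and standard exceptions stand in for PADDLE_ENFORCE_EQ.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for an operator description:
// slot name -> names of the variables bound to that slot.
using SlotMap = std::map<std::string, std::vector<std::string>>;

struct MockOp {
  std::string type;
  SlotMap inputs;
  SlotMap outputs;
};

// Returns the slot that holds `var_name`, or an empty string if none does.
std::string FindSlotByVarName(const SlotMap& slots,
                              const std::string& var_name) {
  for (const auto& slot : slots) {
    for (const auto& name : slot.second) {
      if (name == var_name) return slot.first;
    }
  }
  return "";
}

struct SlotLookup {
  std::string slot_name;
  bool residual_fc;  // true when the variable was found among the outputs
};

// Looks in the inputs first. Falls back to the outputs only for the
// special case discussed above: the ResidualData slot of an fc op.
SlotLookup ResolveSlot(const MockOp& op, const std::string& var_name) {
  assert(!var_name.empty());  // an empty variable name is a caller error
  std::string slot = FindSlotByVarName(op.inputs, var_name);
  if (!slot.empty()) return {slot, false};

  slot = FindSlotByVarName(op.outputs, var_name);
  if (slot != "ResidualData") {
    throw std::invalid_argument(
        "Only the ResidualData output is allowed to be linked to earlier op.");
  }
  if (op.type != "fc") {
    throw std::invalid_argument(
        "Only fc operator supports ResidualData output linked as input.");
  }
  return {slot, true};
}
```

In this sketch the empty string means only "not found among the inputs". An op whose variable is found in neither map, or in any output slot other than ResidualData, ends in an exception instead of being silently accepted.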

tsocha · 2022-10-12T08:44:21Z

paddle/fluid/framework/ir/mkldnn/cpu_quantize_pass.cc

+  bool residual_fc = input_name == "ResidualData" && op->Op()->Type() == "fc";
+  auto inputs = residual_fc ? op->Op()->OutputNames() : op->Op()->InputNames();
  bool name_found =
      std::find(inputs.begin(), inputs.end(), input_name) != inputs.end();
  PADDLE_ENFORCE_EQ(name_found,
                    true,
                    platform::errors::InvalidArgument(
-                        "Var(%s) isn't the input of the %s operator.",
+                        "Var(%s) isn't the %s of the %s operator.",
                        input_name,
+                        residual_fc ? "output" : "input",


This is a code smell to me. What if we add more ops with residual connections in the future?

It depends on what we agree on with Baidu. Previously it was easy: we assumed ResidualData was an input, because it is linked as an input in the graph and was set as an input in OpDesc. Recently, when FC with residual was added, we were instructed to use Output in OpDesc, but it is impossible to set it as an output in the graph, so it is still linked as an input there. That is the source of the confusion in the code: in the graph pattern detector I had to write custom asserts, and here I had to add some hacks. I don't have a good design for this yet, especially since it may change depending on what Baidu says. I don't know whether there is a good way to make this clear, because it is inherently confusing, so let's call it a POC.

Silv3S · 2022-10-12T10:13:48Z

paddle/fluid/framework/ir/mkldnn/fc_mkldnn_pass.cc

@@ -32,7 +32,7 @@ class Graph;
 namespace {
 void LogEnabledOps(const int counter, const std::string& details) {


This function is redefined in each pass. Some passes log only if the count is > 0, others always log. Do you think we could extract this fuse pass logger to mkldnn_reuse.h and just call it?

I saw one in cpu_quantize_pass.cc, which I copied and modified, but that makes only two examples, so please give me a third, as per the rule of three :D
https://understandlegacycode.com/blog/refactoring-rule-of-three/#use-the-rule-of-three

I mean we have multiple fuse pass counters with logs, e.g. elt_act_mkldnn_fuse_pass.cc:102, fc_elementwise_add_mkldnn_fuse_pass.cc:119, matmul_transpose_reshape_mkldnn_fuse_pass.cc:108, etc.

I think the same code is repeated each time, with only the hardcoded op name changed. It is not particularly related to your PR; I just wanted to gather opinions on whether this would be a good future improvement.

OK, there is some repetition even in non-mkldnn passes, so if we were to add such a method in the future, we could put it in FusePassBase or some other common parent class.
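A minimal sketch of what such a shared helper might look like, assuming a free function that each pass calls with its own name. FormatFuseSummary, its signature and the message wording are hypothetical. The log_when_zero flag covers the two behaviours mentioned above (log only when the count is > 0, or always log).

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical shared helper: builds the one-line summary a fuse pass
// prints when it finishes. Returns an empty string when nothing should
// be logged, so the caller can skip the log call entirely.
std::string FormatFuseSummary(const std::string& pass_name, int counter,
                              bool log_when_zero) {
  assert(counter >= 0);  // a pass cannot affect a negative number of ops
  if (counter == 0 && !log_when_zero) return "";
  std::ostringstream msg;
  msg << pass_name << ": " << counter << " op(s) affected";
  return msg.str();
}
```

A pass would then replace its local LogEnabledOps with a single call and print the result only when it is non-empty.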

sfraczek · 2022-10-12T13:41:12Z

Hi @zhiqiu,
PR #41776 has created some additional confusion about how to write asserts for the residual connection in graph_pattern_detector.cc https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/framework/ir/graph_pattern_detector.cc#L1081-L1094. Also, in quantization (this PR) we now have to look for residual data not just in the inputs but also in the outputs. Can you please share your thoughts on how we should proceed with residual connections?

jakpiase

LGTM, but this logic with ResidualData as an output looks weird to me.

Silv3S

LGTM

Silv3S · 2022-10-13T14:35:45Z

paddle/fluid/framework/ir/mkldnn/cpu_quantize_pass.cc

+                    is_residual_unsigned,
+                    "Scale_in_eltwise");
+    } else {
+      if (!AreScalesPresentForNodes({input, weights})) {


You can first check whether scales are present for input and weights, because that is common to both branches. Then, if with_residual_data is set, check only whether scales are present for the residual_data node.

Call AreScalesPresentForNodes twice instead of using if-else.
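The suggested restructuring can be sketched as below. This is an illustration only: ScaleTable, AreScalesPresent and CanQuantizeFc are hypothetical stand-ins working on plain strings, whereas the real AreScalesPresentForNodes operates on graph nodes.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the pass's scale table: the set of variable
// names for which a quantization scale is available.
using ScaleTable = std::set<std::string>;

// True only if every listed variable has a scale.
bool AreScalesPresent(const ScaleTable& scales,
                      const std::vector<std::string>& names) {
  for (const auto& name : names) {
    if (scales.count(name) == 0) return false;
  }
  return true;
}

// The common check (input and weights) runs first and unconditionally;
// the residual check runs only when the op actually has residual data.
bool CanQuantizeFc(const ScaleTable& scales, const std::string& input,
                   const std::string& weights,
                   const std::string& residual_data,
                   bool with_residual_data) {
  assert(!with_residual_data || !residual_data.empty());
  if (!AreScalesPresent(scales, {input, weights})) return false;
  if (with_residual_data && !AreScalesPresent(scales, {residual_data})) {
    return false;
  }
  return true;
}
```

Compared with an if-else that repeats the input and weights check in both branches, the two sequential calls keep the shared condition in one place.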

zhiqiu · 2022-10-14T03:58:17Z

Hi @zhiqiu, PR #41776 has created some additional confusion about how to write asserts for the residual connection in graph_pattern_detector.cc https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/fluid/framework/ir/graph_pattern_detector.cc#L1081-L1094. Also, in quantization (this PR) we now have to look for residual data not just in the inputs but also in the outputs. Can you please share your thoughts on how we should proceed with residual connections?

Hi, I don't really know the background of this work, but I think it is not a good design for ResidualData to be both an input and an output. Can you explain your comments above in more detail, i.e., being instructed to use Output in OpDesc?

sfraczek · 2022-10-14T10:52:26Z

Hi, I don't really know the background of this work, but I think it is not a good design for ResidualData to be both an input and an output. Can you explain your comments above in more detail, i.e., being instructed to use Output in OpDesc?

I was referring to your suggestion that ResidualData should be made an output instead of an input #40834 (comment). That PR was closed, reopened as a new PR #41776, and merged. We had already been using ResidualData as an input for convolution, and now we have ResidualData as an output for FC. Paddle already has quantization code for the convolution op with ResidualData as an input. Here I am adding quantization of fc with ResidualData as an output.
As you say, it seems wrong to have both input and output, so we want to unify it.
We can either

change CONV's ResidualData to Output, or
change FC's ResidualData to Input.

To me, Input seems better for the following reasons:

In graph_pattern_detector.cc it was easy to write a pattern using the existing assert functions. With residual_data as an output, the pattern became this complicated because it has to check that the variable is an input and an output at the same time, so I added a comment: https://github.com/PaddlePaddle/Paddle/pull/46757/files#diff-3363ff47a1f22a111386eeff77a8572f6618d7d0602ffe71e92ecf8682faa84eR1081-R1094
Netron draws an arrow for the ResidualData connection only when it is an input. When it is an output, no arrow is shown, so at first glance no connection is visible when analyzing the model graph, and you have to click on a node to see whether it has a ResidualData output. We routinely use Netron to inspect models when looking for optimizations.

It is logically a little confusing to me that we have to SetOutput and ir_node_link_to residual_data -> fc at the same time: https://github.com/PaddlePaddle/Paddle/pull/41776/files#diff-9bebe579d9782bcfbf4b11872065b743461e89a36c3924972e98ee7df8824d16R102-R109

Besides the above, a small extra argument for me is that in the quantization code (cpu_quantize_pass.cc) we will not be able to reuse the QuantizeInput method as it is used for all other inputs. We will have to add another method that does mostly the same thing but with output in place of input, perhaps called QuantizeResidualData.

zhiqiu · 2022-10-19T08:49:26Z

Thanks, I understand your views. If Input is already used and it is better, I agree that you can change Output to Input.

sfraczek · 2022-10-26T20:29:28Z

Something is broken about the CI.

yeliang2258 · 2022-11-04T10:13:58Z

@sfraczek Please resolve the merge conflicts, thank you.

yeliang2258 · 2022-11-07T02:27:03Z

@sfraczek Please resolve the merge conflicts, thanks.

sfraczek · 2022-11-15T16:09:33Z

@chenwhql Please approve, because the test that failed is outdated. It uses the old Quant2Int8MkldnnPass, which was replaced by C++ passes. We are planning to rewrite the test in Q1.

chenwhql · 2022-11-16T14:18:17Z

@chenwhql Please approve, because the test that failed is outdated. It uses the old Quant2Int8MkldnnPass, which was replaced by C++ passes. We are planning to rewrite the test in Q1.

@sfraczek I don't understand; approving cannot resolve the test failure.

python/paddle/fluid/contrib/slim/quantization/quant2_int8_mkldnn_pass.py

revert changes to unsupported script

yeliang2258

LGTM

jczaja

LGTM

paddle-bot · 2022-11-21T11:09:02Z

Your PR has been merged into the repository. An official integration test will be conducted later. Stay tuned.

sfraczek commit 8bc4fb8: add fc-residual quantization

paddle-bot bot added contributor External developers status: proposed labels Oct 11, 2022

sfraczek added Intel int8 labels Oct 11, 2022

paddle-bot bot removed the status: proposed label Oct 11, 2022

sfraczek requested review from Silv3S, wozna, jakpiase and tsocha October 11, 2022 15:45

sfraczek added 2 commits October 12, 2022 10:21

sfraczek commit b2ecf93: revert removal of check for use_mkldnn

sfraczek commit 4bf3682: fix bug

tsocha suggested changes Oct 12, 2022

Silv3S reviewed Oct 12, 2022

sfraczek commit 558a3ff: add disable_logs

tsocha previously approved these changes Oct 13, 2022

jakpiase previously approved these changes Oct 13, 2022

Silv3S previously approved these changes Oct 13, 2022

sfraczek commit d25018d: review fix (call twice AreScalesPresntForNodes instead of if-else)

sfraczek dismissed stale reviews from Silv3S, jakpiase, and tsocha via d25018d October 13, 2022 14:55

paddle-bot-old bot added contributor External developers and removed contributor External developers labels Oct 17, 2022

sfraczek commit 8266f35: Merge branch 'develop' into fc-residual-v2

sfraczek added 5 commits October 26, 2022 10:47

sfraczek commit 850bc97: revert fc mkldnn taking residual data

sfraczek commit 4373bb6: format fix

sfraczek commit 8b33535: Merge branch 'develop' into fc-residual-v2

sfraczek commit ac275b2: fix LoDTensor->DenseTensor

sfraczek commit 091c8bd: LoDTensor->DenseTensor

sfraczek commit ee75b76: Merge branch 'PaddlePaddle:develop' into fc-residual-v2

onecatcn requested a review from yeliang2258 November 3, 2022 13:42

sfraczek commit fdb3e9f: Merge branch 'develop' into fc-residual-v2

onecatcn assigned yeliang2258 Nov 14, 2022

sfraczek added 2 commits November 14, 2022 17:33

sfraczek commit 78297a5: output->input

sfraczek commit c2f6860: Merge branch 'develop' into fc-residual-v2

sfraczek requested a review from Silv3S November 15, 2022 16:44

sfraczek commit f2eca68: revert changes to unsupported script

sfraczek mentioned this pull request Nov 17, 2022: Some residualdata fixes #48118 (Merged)

sfraczek commit 4e1a154: remove fc residualdata from output blocklist in cpu_bfloat16_pass.cc

sfraczek requested review from jakpiase and tsocha November 18, 2022 12:18

tsocha approved these changes Nov 18, 2022

yeliang2258 approved these changes Nov 20, 2022

jczaja approved these changes Nov 21, 2022

jczaja merged commit fed0ed3 into PaddlePaddle:develop Nov 21, 2022

paddle-bot bot added the status: accepted label Nov 21, 2022
