-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add fc-residual quantization #46917
add fc-residual quantization #46917
Conversation
你的PR提交成功,感谢你对开源项目的贡献! |
bool residual_fc = false; | ||
|
||
std::string last_op_input_name = | ||
FindInputNameByVarName(last_op_op, quant_out->Name()); | ||
if (last_op_input_name.empty()) { | ||
last_op_input_name = | ||
FindOutputNameByVarName(last_op_op, quant_out->Name()); | ||
PADDLE_ENFORCE_EQ(last_op_input_name, | ||
"ResidualData", | ||
platform::errors::InvalidArgument( | ||
"Only the ResidualData output is allowed to be " | ||
"linked to earlier op.")); | ||
PADDLE_ENFORCE_EQ(last_op_op->Type(), | ||
"fc", | ||
platform::errors::InvalidArgument( | ||
"Only fc operator supports ResidualData output " | ||
"linked as input.")); | ||
residual_fc = true; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic seems a little bit odd for me.
Does residual connection must have an empty name?
Are there any other scenarios where the name can be empty?
What if op will have empty name in graph, will this pass throw an exception?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's empty because FindInputNameByVarName returns empty string if it's not found. In that case we search for name in the outputs instead of inputs because fc is special case when residualData is output instead of input.
If this if is entered and those two checks pass, then we know we are dealing with fc with residual data (in output). And so we set residual_fc= true which later informs decisions whether we should call SetInput or SetOutput
bool residual_fc = input_name == "ResidualData" && op->Op()->Type() == "fc"; | ||
auto inputs = residual_fc ? op->Op()->OutputNames() : op->Op()->InputNames(); | ||
bool name_found = | ||
std::find(inputs.begin(), inputs.end(), input_name) != inputs.end(); | ||
PADDLE_ENFORCE_EQ(name_found, | ||
true, | ||
platform::errors::InvalidArgument( | ||
"Var(%s) isn't the input of the %s operator.", | ||
"Var(%s) isn't the %s of the %s operator.", | ||
input_name, | ||
residual_fc ? "output" : "input", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's stinky for me, what if we will add more ops with residual connection in the future ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It depends on what we will agree on with baidu. Previously it was easy because we assumed that ResidualData will be input because on the graph it's linked as input and in OpDesc it was set as input. But recently when FC with residual was added, we were instructed to use Output in OpDesc but it's impossible to set it as output in graph so it's still linked as input. From this comes this confusion in code both in graph pattern detector I had to write custom asserts and here I had to do some hacks that I don't have a good idea for design yet especially that it might be subject to change based on what baidu will say about this. I don't know if there is a great way to make this clear because it is inherently confusing so let's say it's a POC.
@@ -32,7 +32,7 @@ class Graph; | |||
namespace { | |||
void LogEnabledOps(const int counter, const std::string& details) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method is redefined in each pass. Some log only if count is > 0, others log all. Do you think we could extract this fuse pass logger to mkldnn_reuse.h
and only call it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw one in cpu_quantize_pass.cc from which I copied and modified it but this is only two examples so please give me a third as per the rule of three :D
https://understandlegacycode.com/blog/refactoring-rule-of-three/#use-the-rule-of-three
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean we have multiple fuse pass counters with logs e.g. elt_act_mkldnn_fuse_pass.cc:102, fc_elementwise_add_mkldnn_fuse_pass.cc:119, matmul_transpose_reshape_mkldnn_fuse_pass.cc:108 etc.
I think that same code is repeated each time, with only hardcoded op name being changed. It is not particularly related to your PR, I just wanted to gather opinions if you find it good idea as future improvement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok there is some repetition even in non mkldnn passes so if we were to add such method in future we could put it in FusePassBase or some other common parent class.
Hi @zhiqiu, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but this logic with residualdata as an output looks weird to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
is_residual_unsigned, | ||
"Scale_in_eltwise"); | ||
} else { | ||
if (!AreScalesPresentForNodes({input, weights})) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can check if scales are present for input and weights first, because it's common for both conditions. Then if with_residual_data
check only if scales are present for node residual_data
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok
call twice AreScalesPresntForNodes instead of if-else
Hi, I don't really know the background of this work. But I think it is not a good design that |
I was referring to your suggestion that ResidualData should be made output instead of input #40834 (comment). This PR was closed and reopened as new one PR #41776 and merged. Previously, we had already been using ResidualData as input for convolution and now we have ResidualData for FC as output. In Paddle we have quantization code for convolution op with ResidualData as input. Here I am adding quantization of fc with ResidualData as Output.
To me it seems that Input was better for the following reasons:
It is logically a little confusing to me that we have to SetOutput and ir_node_link_to residual_data -> fc at the same time https://github.com/PaddlePaddle/Paddle/pull/41776/files#diff-9bebe579d9782bcfbf4b11872065b743461e89a36c3924972e98ee7df8824d16R102-R109 Beside the above a small extra argument for me is that In quantization code (cpu_quantize_pass.cc) we will not be able to reuse the QuantizeInput method like it is used for all inputs but we will have to add another method that will be doing mostly the same thing but replace input with output - perhaps called QuantizeResidualData. |
Thanks, I got your views. If Input s already used and it is BETTER, I agree that you can change |
Something is broken about the CI. |
@sfraczek Please resolve the conflict of the code, thank you |
@sfraczek Please resolve the conflict of the code, thanks |
This comment was marked as outdated.
This comment was marked as outdated.
@chenwhql Please approve because the test that failed is outdated. It uses old Quant2Int8MkldnnPass which was replaced by C++ passes. We are planning to rewrite the test in Q1. |
python/paddle/fluid/contrib/slim/quantization/quant2_int8_mkldnn_pass.py
Outdated
Show resolved
Hide resolved
revert changes to unsupported script
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
你的PR已合入Paddle库,请关注后续测试结果。 |
PR types
Performance optimization
PR changes
Others
Describe
Add quantization of FC+residual fused op