[one-optimize] Fuse Mul with FullyConnected layer #13528
This commit introduces fuse_mul_with_fully_connected pass that combines FullyConnected and Mul into one node. ONE-DCO-1.0-Signed-off-by: Jan Iwaszkiewicz <j.iwaszkiewi@samsung.com>
Addresses issue #13515. TODO: add test cases and cover possible edge cases. @jinevening please take a look when you have some time; all feedback is highly appreciated (especially about optimization rules/guidelines in the project, i.e. rules regarding usage of
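For reference, the arithmetic behind the fusion can be written out as below. This is a sketch under assumptions that the PR text does not state: the FullyConnected weights $W$ have shape [out, in], the Mul operand $\gamma$ is a constant that is either a scalar or a vector with one entry per output channel, and there is no fused activation between the FullyConnected and the Mul.

```latex
\begin{aligned}
y &= W x + b \\
z &= \gamma \odot y = \gamma \odot (W x + b) = (\operatorname{diag}(\gamma)\, W)\, x + \gamma \odot b \\
W'_{ij} &= \gamma_i \, W_{ij}, \qquad b'_i = \gamma_i \, b_i
\end{aligned}
```

Under those assumptions, a single FullyConnected with $W'$ and $b'$ produces the same output as the original FullyConnected followed by Mul. When there is no bias, only the weights change.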
@@ -42,6 +42,7 @@
#include "luci/Pass/FuseBatchNormWithDwConvPass.h"
#include "luci/Pass/FuseBatchNormWithTConvPass.h"
#include "luci/Pass/FuseBCQPass.h"
#include "luci/Pass/FuseMulWithFullyConnectedPass.h"
plz move this line below #include "luci/Pass/FuseMulWithDivPass.h"
@@ -41,6 +41,7 @@ class CircleOptimizer final
FuseBatchNormWithConv,
FuseBatchNormWithDwConv,
FuseBatchNormWithTConv,
FuseMulWithFullyConnected,
plz move this line below FuseMulWithDiv,
if (arser.get<bool>("--fuse_mul_with_fully_connected"))
  options->enable(Algorithms::FuseMulWithFullyConnected);
plz move below
if (arser.get<bool>("--fuse_mul_with_div"))
  options->enable(Algorithms::FuseMulWithDiv);
@@ -112,6 +112,8 @@ int entry(int argc, char **argv)
add_switch(arser, "--fuse_mean_with_mean",
"This will fuse two Mean operations when they follow one by one. This will fold them "
"into one operation and merge reduction indices.");
add_switch(arser, "--fuse_mul_with_fully_connected",
plz move below add_switch(arser, "--fuse_mul_with_div",
@@ -0,0 +1,183 @@
/*
plz add FuseMulWithFullyConnectedPass.test.cpp
In the unit tests, count(positive) <= count(negative) must hold, where a negative test is one whose method name ends with _NEG.
RETURN_FALSE_UNLESS(weights->opcode() == luci::CircleOpcode::CIRCLECONST or
                    weights->opcode() == luci::CircleOpcode::CIRCLEOUTPUTEXCLUDE)
weights is always const.
Done in scope of 4c7f5a9
update_values(fused_weights, multiplication);

fc->weights(fused_weights);
Please check all conditions first and update the node last.
If you update a node incrementally like this, fc can end up invalid in some cases. For example, if the bias is non-const, fc's weights will already have been updated when the pass bails out, and that conversion would be wrong.
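The check-first, update-last ordering requested above can be sketched as follows. This is only an illustration: `FcNode` and `fuse_mul` are hypothetical stand-ins built on plain vectors, not luci's node types or the code in this PR.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a FullyConnected node; this is not luci's real node type.
struct FcNode
{
  std::size_t out = 0;        // number of output channels
  std::size_t in = 0;         // number of input features
  std::vector<float> weights; // row-major, out * in elements
  std::vector<float> bias;    // out elements, or empty when there is no bias
  bool bias_is_const = true;  // false models a bias produced by another node
};

// Fuses a constant multiplier into `fc`. Returns false and leaves `fc`
// untouched if any precondition fails.
bool fuse_mul(FcNode &fc, const std::vector<float> &mul)
{
  // Step 1: check every condition before touching the node.
  if (fc.out == 0 || fc.in == 0)
    return false;
  if (fc.weights.size() != fc.out * fc.in)
    return false;
  if (mul.size() != 1 && mul.size() != fc.out)
    return false;
  if (!fc.bias_is_const)
    return false;
  if (!fc.bias.empty() && fc.bias.size() != fc.out)
    return false;

  // Step 2: build the fused values in temporaries.
  std::vector<float> fused_weights = fc.weights;
  std::vector<float> fused_bias = fc.bias;
  for (std::size_t o = 0; o < fc.out; ++o)
  {
    const float m = (mul.size() == 1) ? mul[0] : mul[o];
    for (std::size_t i = 0; i < fc.in; ++i)
      fused_weights[o * fc.in + i] *= m;
    if (!fused_bias.empty())
      fused_bias[o] *= m;
  }

  // Step 3: commit only after everything above succeeded.
  fc.weights = std::move(fused_weights);
  fc.bias = std::move(fused_bias);
  return true;
}
```

Because nothing is written to the node until step 3, a rejected pattern such as a non-const bias leaves the graph exactly as it was.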
Done in cfcb68b
For validation, plz add or edit
add_switch(arser, "--fuse_mul_to_fullyconnected_weights",
           "This will fuse Mul to following FullyConnected weights");
add_switch(arser, "--fuse_mul_with_conv",
           "This will fuse Mul operation with a preceding Conv if possible.");
add_switch(arser, "--fuse_mul_with_div",
           "This will fuse Mul operation with a Div operation whose numerator is const.");
add_switch(arser, "--fuse_mul_with_fully_connected",
- add_switch(arser, "--fuse_mul_with_fully_connected",
+ add_switch(arser, "--fuse_mul_with_fullyconnected",
to follow other options
Got it! Done in scope of d386541
@seanshpark @jinevening I would like to ask for another review of this PR. I really appreciate your help! I have a question about merging it. If approved, should I split it into smaller PRs like "[luci] Fuse Mul with FC"/"[tests] Fuse Mul with FC", or keep it as-is?
yes, please separate
RETURN_FALSE_UNLESS(fused_weights->size<loco::DataType::FLOAT32>() ==
                    weights->size<loco::DataType::FLOAT32>());
Seems unnecessary.
RETURN_FALSE_UNLESS(fused_bias->size<loco::DataType::FLOAT32>() ==
                    const_bias->size<loco::DataType::FLOAT32>());
ditto
inline void update_values(luci::CircleConst *fused_node, luci::CircleConst *multiplication,
                          bool is_weights)
Could you split this into update_weights and update_bias?
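One possible shape for that split is sketched below. It operates on plain `std::vector<float>` buffers rather than `luci::CircleConst`, and the signatures are hypothetical illustrations, not the ones that ended up in the PR. It also assumes the caller has already validated the sizes.

```cpp
#include <cstddef>
#include <vector>

// Scales each output row of row-major [out, in] weights by its multiplier.
// `mul` holds either one value (scalar Mul) or `out` values; the caller is
// expected to have checked that before calling.
inline void update_weights(std::vector<float> &weights, std::size_t out, std::size_t in,
                           const std::vector<float> &mul)
{
  for (std::size_t o = 0; o < out; ++o)
  {
    const float m = (mul.size() == 1) ? mul[0] : mul[o];
    for (std::size_t i = 0; i < in; ++i)
      weights[o * in + i] *= m;
  }
}

// Scales each bias element by the multiplier of its output channel.
inline void update_bias(std::vector<float> &bias, const std::vector<float> &mul)
{
  for (std::size_t o = 0; o < bias.size(); ++o)
    bias[o] *= (mul.size() == 1) ? mul[0] : mul[o];
}
```

Two single-purpose helpers remove the `is_weights` flag, so each call site states which tensor it is updating.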
There are some minor comments, but it looks good overall. Please proceed after reading @seanshpark's comment (#13528 (comment)).
This commit introduces FuseMulWithFullyConnectedPass which will fuse Mul to previous FullyConnected if possible. ONE-DCO-1.0-Signed-off-by: Jan Iwaszkiewicz <j.iwaszkiewi@samsung.com>
This commit adds one-cmd option for FuseMulWithFullyConnectedPass. ONE-DCO-1.0-Signed-off-by: Jan Iwaszkiewicz <j.iwaszkiewi@samsung.com>
This commit adds a circle2circle dredd test for FC + Mul fusion. ONE-DCO-1.0-Signed-off-by: Jan Iwaszkiewicz <j.iwaszkiewi@samsung.com>
@seanshpark I have merged other branches here. Now let's wait for CIs to pass.
@nnfw-bot test nncc-release
This commit extends Net_FullyConnectedMul tflite recipes with 'no bias' case. ONE-DCO-1.0-Signed-off-by: Jan Iwaszkiewicz <j.iwaszkiewi@samsung.com>
All related PRs from #13515 are closed. Closing this draft as well.