Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[one-optimize] Fuse Mul with FullyConnected layer #13528

Closed
wants to merge 48 commits into from

Conversation

jiwaszki
Copy link
Contributor

This commit introduces fuse_mul_with_fully_connected pass that combines FullyConnected and Mul into one node.

ONE-DCO-1.0-Signed-off-by: Jan Iwaszkiewicz j.iwaszkiewi@samsung.com

This commit introduces fuse_mul_with_fully_connected pass that combines FullyConnected and Mul into one node.

ONE-DCO-1.0-Signed-off-by: Jan Iwaszkiewicz <j.iwaszkiewi@samsung.com>
@jiwaszki
Copy link
Contributor Author

Addresses the issue #13515
Changes from PR were tested on custom made model with same dims as mentioned in the issue.

TODO: add testcases and cover possible edge-cases.

@jinevening please take a look when you have some time, all feedback is highly appreciated (especially about optimization rules/guidelines in the project, i.e. rules regarding usage of dynamic_cast/must_cast and returning false/fail checks). Thanks in advance!

@@ -42,6 +42,7 @@
#include "luci/Pass/FuseBatchNormWithDwConvPass.h"
#include "luci/Pass/FuseBatchNormWithTConvPass.h"
#include "luci/Pass/FuseBCQPass.h"
#include "luci/Pass/FuseMulWithFullyConnectedPass.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

plz move this line below #include "luci/Pass/FuseMulWithDivPass.h"

@@ -41,6 +41,7 @@ class CircleOptimizer final
FuseBatchNormWithConv,
FuseBatchNormWithDwConv,
FuseBatchNormWithTConv,
FuseMulWithFullyConnected,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

plz move this line below FuseMulWithDiv,

Comment on lines 317 to 318
if (arser.get<bool>("--fuse_mul_with_fully_connected"))
options->enable(Algorithms::FuseMulWithFullyConnected);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

plz move below

if (arser.get("--fuse_mul_with_div"))
options->enable(Algorithms::FuseMulWithDiv);

@@ -112,6 +112,8 @@ int entry(int argc, char **argv)
add_switch(arser, "--fuse_mean_with_mean",
"This will fuse two Mean operations when they follow one by one. This will fold them "
"into one operation and merge reduction indices.");
add_switch(arser, "--fuse_mul_with_fully_connected",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

plz move below add_switch(arser, "--fuse_mul_with_div",

@@ -0,0 +1,183 @@
/*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

plz add FuseMulWithFullyConnectedPass.test.cpp

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the unit test, you should add count(positive) <= count(negative) where negative method name ends with _NEG

Comment on lines 109 to 110
RETURN_FALSE_UNLESS(weights->opcode() == luci::CircleOpcode::CIRCLECONST or
weights->opcode() == luci::CircleOpcode::CIRCLEOUTPUTEXCLUDE)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

weights is always const.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in scope of 4c7f5a9

Comment on lines 116 to 118
update_values(fused_weights, multiplication);

fc->weights(fused_weights);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check all conditions at first & update at last.

If you incrementally update a node like this, fc can be invalid in some cases. For example, if bias is non-const, fc's weights will be updated, but that conversion would be wrong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in cfcb68b

@seanshpark
Copy link
Contributor

For validation, plz add or edit

  • res/TensorFlowLiteRecipes/Net_FullyConnected_Mul_000 and more for testing the pass
    • test.recipe for source model
    • test.rule for dredd test after the pass
  • compiler/luci-pass-value-py-test/test.lst
    • add Net_FullyConnected_Mul_000 and others with optimize option name
  • compiler/circle2circle-dredd-recipe-test/test.lst
    • add Net_FullyConnected_Mul_000 and others

add_switch(arser, "--fuse_mul_to_fullyconnected_weights",
"This will fuse Mul to following FullyConnected weights");
add_switch(arser, "--fuse_mul_with_conv",
"This will fuse Mul operation with a preceding Conv if possible.");
add_switch(arser, "--fuse_mul_with_div",
"This will fuse Mul operation with a Div operation whose numerator is const.");
add_switch(arser, "--fuse_mul_with_fully_connected",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
add_switch(arser, "--fuse_mul_with_fully_connected",
add_switch(arser, "--fuse_mul_with_fullyconnected",

to follow other options

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it! Done in scope of d386541

@jiwaszki jiwaszki marked this pull request as ready for review August 5, 2024 13:48
@jiwaszki
Copy link
Contributor Author

jiwaszki commented Aug 5, 2024

@seanshpark @jinevening I would ask for another review on this PR. I really appreciate your help!

I have a question regarding possible merging of the PR. If approved, should I split it into smaller ones like "[luci] Fuse Mul with FC"/"[tests] Fuse Mul with FC" or keep it as-is?

@seanshpark
Copy link
Contributor

... should I split it into smaller ones like ...

yes, please separate compiler/* and
it would be better to put single context of change in each PRs

Comment on lines 147 to 148
RETURN_FALSE_UNLESS(fused_weights->size<loco::DataType::FLOAT32>() ==
weights->size<loco::DataType::FLOAT32>());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems unnecessary.

Comment on lines 142 to 143
RETURN_FALSE_UNLESS(fused_bias->size<loco::DataType::FLOAT32>() ==
const_bias->size<loco::DataType::FLOAT32>());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Comment on lines 32 to 33
inline void update_values(luci::CircleConst *fused_node, luci::CircleConst *multiplication,
bool is_weights)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you split this to update_weights and update_bias ?

@jinevening
Copy link
Contributor

There are some minor comments, but looks good in overall. Please proceed after reading @seanshpark 's comment (#13528 (comment)).

This commit introduce FuseMulWithFullyConnectedPass which will fuse Mul to previous FullyConnected if possible.

ONE-DCO-1.0-Signed-off-by: Jan Iwaszkiewicz <j.iwaszkiewi@samsung.com>
This commit adds one-cmd option for FuseMulWithFullyConnectedPass.

ONE-DCO-1.0-Signed-off-by: Jan Iwaszkiewicz <j.iwaszkiewi@samsung.com>
This commit is adding circle2circle dredd test for FC + Mul fusion.

ONE-DCO-1.0-Signed-off-by: Jan Iwaszkiewicz <j.iwaszkiewi@samsung.com>
@jiwaszki
Copy link
Contributor Author

jiwaszki commented Aug 9, 2024

@seanshpark I have merged other branches here. Now let's wait for CIs to pass.

@jiwaszki
Copy link
Contributor Author

jiwaszki commented Aug 9, 2024

@nnfw-bot test nncc-release

This commit extends Net_FullyConnectedMul tflite recipes with 'no bias' case.

ONE-DCO-1.0-Signed-off-by: Jan Iwaszkiewicz <j.iwaszkiewi@samsung.com>
@jiwaszki
Copy link
Contributor Author

All related PRs from #13515 are closed. Closing this draft as well.

@jiwaszki jiwaszki closed this Aug 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants