
Counting ReLU vs HardSwish FLOPs #62

Closed · jahongir7174 opened this issue Mar 16, 2021 · 10 comments

Comments

@jahongir7174

Thank you very much for sharing the source code.
I have a question about the FLOPs counting for ReLU and HardSwish. In the paper, the FLOPs are reported as the same for both activations.
Can you explain this?

@iamhankai
Member

I treat HSwish as zero FLOPs, as MobileNetV3 does.

@jahongir7174
Author

I could not find anything in the paper about counting HardSwish as zero FLOPs; this might be my mistake.
Can you point me to the relevant part of the paper?

This file describes the FLOPS of the models.
I found that the FLOPS of MobileNetV3 is close to the value reported in the paper.

@iamhankai
Member

Sorry, that was misleading. The FLOPS of HSwish is so small compared to that of Conv that it disappears in rounding.

@jahongir7174
Author

jahongir7174 commented Mar 19, 2021

I calculated the parameters and FLOPS of the provided GhostNet below.
Number of parameters: 5171524
ReLU
FLOP per operator type:
0.279741 GFLOP. 98.7183%. Conv
0.002561 GFLOP. 0.903756%. FC
0.000563712 GFLOP. 0.198929%. Add
0.000503328 GFLOP. 0.17762%. Mul
3.936e-06 GFLOP. 0.00138898%. Div
0 GFLOP. 0%. Concat
0 GFLOP. 0%. Relu
0.283373 GFLOP in Total

I replaced ReLU with HardSwish in the provided GhostNet and found a difference of 0.007331 GFLOP.
HardSwish

FLOP per operator type:
0.279741 GFLOP. 96.2288%. Conv
0.00300736 GFLOP. 1.03451%. Add
0.00294697 GFLOP. 1.01374%. Mul
0.002561 GFLOP. 0.880965%. FC
0.00244758 GFLOP. 0.841949%. Div
0 GFLOP. 0%. Concat
0.290704 GFLOP in Total

I counted FLOPS using this.

@iamhankai
Member

I count FLOPS using https://github.com/Lyken17/pytorch-OpCounter and ignore BatchNorm, since it can be fused into Conv during inference.
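For reference, a minimal sketch of that setup with thop (pytorch-OpCounter); the zero-count hook for BatchNorm is my assumption about how to replicate the "ignore BN" choice, not code from this repo:

```python
import torch
import torch.nn as nn
from thop import profile

def zero_ops(module, inputs, output):
    # Count BatchNorm as zero FLOPs: at inference it can be
    # folded into the preceding convolution's weights and bias.
    module.total_ops += torch.DoubleTensor([0])

model = nn.Sequential(  # stand-in for GhostNet
    nn.Conv2d(3, 16, 3, stride=2, padding=1, bias=False),
    nn.BatchNorm2d(16),
    nn.ReLU(inplace=True),
)
x = torch.randn(1, 3, 224, 224)
flops, params = profile(model, inputs=(x,), custom_ops={nn.BatchNorm2d: zero_ops})
print(f"{flops / 1e9:.6f} GFLOPs, {int(params)} params")
```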

@jahongir7174
Author

Can you share a custom op counter for HardSwish?
The counter you shared has no operation counter for HardSwish.

@iamhankai
Member

HardSwish is formulated as x = clip(x + 3, 0, 6) / 6, where the +3 and /6 can be fused into BN and Conv, so the FLOPS of HardSwish is the same as clip.
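To make the fusion concrete: with an affine layer y = Wx + b in front, clip(y + 3, 0, 6) / 6 = clip((W/6)x + (b+3)/6, 0, 1), so the +3 and /6 disappear into the layer's weights and bias and only the clip is left at runtime. A small numerical check of that identity (illustrative code, not from this repo):

```python
import torch

W, b = torch.randn(4, 8), torch.randn(4)
x = torch.randn(8)
y = W @ x + b

# Activation applied explicitly after the affine layer
hs = torch.clamp(y + 3.0, 0.0, 6.0) / 6.0

# Fold +3 and /6 into the affine layer's weights and bias;
# only a clip to [0, 1] remains at runtime
y_fused = (W / 6.0) @ x + (b + 3.0) / 6.0
hs_fused = torch.clamp(y_fused, 0.0, 1.0)

print(torch.allclose(hs, hs_fused, atol=1e-6))  # True
```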

@jahongir7174
Author

Sorry for asking more questions:
is x = clip(x + 3, 0, 6) / 6 HardSwish or HardSigmoid?

@iamhankai
Member

Sorry, it's HardSigmoid. HardSwish is x * HardSigmoid(x). Implementing its operation counter is easy.
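A minimal sketch of such a counter for thop, under the fusion convention discussed above (the +3 and /6 of the inner HardSigmoid fold into the preceding Conv/BN, leaving one clip and one multiply per element; the hook name and the 2-ops-per-element figure are my assumptions, not code from this repo):

```python
import torch
import torch.nn as nn
from thop import profile

def count_hardswish(module, inputs, output):
    # HardSwish(x) = x * clip(x + 3, 0, 6) / 6.
    # Assuming +3 and /6 are fused into the preceding Conv/BN,
    # what remains is one clip and one multiply per element.
    n = output.numel()
    module.total_ops += torch.DoubleTensor([2 * n])

model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.Hardswish())
x = torch.randn(1, 3, 224, 224)
flops, params = profile(model, inputs=(x,), custom_ops={nn.Hardswish: count_hardswish})
print(f"{flops / 1e9:.6f} GFLOPs")
```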

@jahongir7174
Author

Thank you very much for your time.
