
Counting ReLU vs HardSwish FLOPs #62

Closed · jahongir7174 opened this issue Mar 16, 2021 · 10 comments

Comments

@jahongir7174

Thank you very much for sharing the source code.
I have a question about the FLOPs counting for ReLU and HardSwish. In the paper, the FLOPs are reported as the same for both activations.
Can you explain this?

@iamhankai
Member

I treat HSwish as zero FLOPs, as MobileNetV3 does.

@jahongir7174
Author

I could not find anything in the paper about counting HardSwish as zero FLOPs; this might be my mistake.
Can you point me to the relevant part of the paper?

This file describes the FLOPS of the models.
I found that the FLOPS of MobileNetV3 is close to the value reported in the paper.

@iamhankai
Member

Sorry, that was misleading. The FLOPS of HSwish is so small compared to that of Conv that it disappears in rounding.

@jahongir7174
Author

jahongir7174 commented Mar 19, 2021

I calculated the parameters and FLOPS of the provided GhostNet below.
Number of parameters: 5171524
ReLU
FLOP per operator type:
0.279741 GFLOP. 98.7183%. Conv
0.002561 GFLOP. 0.903756%. FC
0.000563712 GFLOP. 0.198929%. Add
0.000503328 GFLOP. 0.17762%. Mul
3.936e-06 GFLOP. 0.00138898%. Div
0 GFLOP. 0%. Concat
0 GFLOP. 0%. Relu
0.283373 GFLOP in Total

I replaced ReLU with HardSwish in the provided GhostNet and found a difference of 0.007331 GFLOP.
HardSwish

FLOP per operator type:
0.279741 GFLOP. 96.2288%. Conv
0.00300736 GFLOP. 1.03451%. Add
0.00294697 GFLOP. 1.01374%. Mul
0.002561 GFLOP. 0.880965%. FC
0.00244758 GFLOP. 0.841949%. Div
0 GFLOP. 0%. Concat
0.290704 GFLOP in Total

I counted FLOPS using this.

@iamhankai
Member

I count FLOPS using https://github.com/Lyken17/pytorch-OpCounter and ignore BatchNorm, since it can be fused into Conv during inference.
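For reference, a minimal sketch of that setup with thop (pytorch-OpCounter); the zero-count hook for BatchNorm is my assumption about how to replicate the "ignore BN" choice, not code from this repo:

```python
import torch
import torch.nn as nn
from thop import profile

def zero_ops(module, inputs, output):
    # Count BatchNorm as zero FLOPs: at inference it can be
    # folded into the preceding convolution's weights and bias.
    module.total_ops += torch.DoubleTensor([0])

model = nn.Sequential(  # stand-in for GhostNet
    nn.Conv2d(3, 16, 3, stride=2, padding=1, bias=False),
    nn.BatchNorm2d(16),
    nn.ReLU(inplace=True),
)
x = torch.randn(1, 3, 224, 224)
flops, params = profile(model, inputs=(x,), custom_ops={nn.BatchNorm2d: zero_ops})
print(f"{flops / 1e9:.6f} GFLOPs, {int(params)} params")
```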

@jahongir7174
Author

Can you share a custom op counter for HardSwish?
The counter you shared has no operation counter for HardSwish.

@iamhankai
Member

HardSwish is formulated as x = clip(x + 3, 0, 6) / 6, where the +3 and /6 can be fused into BN and Conv, so the FLOPS of HardSwish is the same as clip.
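To make the fusion concrete: with an affine layer y = Wx + b in front, clip(y + 3, 0, 6) / 6 = clip((W/6)x + (b+3)/6, 0, 1), so the +3 and /6 disappear into the layer's weights and bias and only the clip is left at runtime. A small numerical check of that identity (illustrative code, not from this repo):

```python
import torch

W, b = torch.randn(4, 8), torch.randn(4)
x = torch.randn(8)
y = W @ x + b

# Activation applied explicitly after the affine layer
hs = torch.clamp(y + 3.0, 0.0, 6.0) / 6.0

# Fold +3 and /6 into the affine layer's weights and bias;
# only a clip to [0, 1] remains at runtime
y_fused = (W / 6.0) @ x + (b + 3.0) / 6.0
hs_fused = torch.clamp(y_fused, 0.0, 1.0)

print(torch.allclose(hs, hs_fused, atol=1e-6))  # True
```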

@jahongir7174
Author

Sorry for asking more questions:
is x = clip(x + 3, 0, 6) / 6 HardSwish or HardSigmoid?

@iamhankai
Member

Sorry, it's HardSigmoid. HardSwish is x * HardSigmoid(x). Implementing its operation counter is easy.
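A minimal sketch of such a counter for thop, under the fusion convention discussed above (the +3 and /6 of the inner HardSigmoid fold into the preceding Conv/BN, leaving one clip and one multiply per element; the hook name and the 2-ops-per-element figure are my assumptions, not code from this repo):

```python
import torch
import torch.nn as nn
from thop import profile

def count_hardswish(module, inputs, output):
    # HardSwish(x) = x * clip(x + 3, 0, 6) / 6.
    # Assuming +3 and /6 are fused into the preceding Conv/BN,
    # what remains is one clip and one multiply per element.
    n = output.numel()
    module.total_ops += torch.DoubleTensor([2 * n])

model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.Hardswish())
x = torch.randn(1, 3, 224, 224)
flops, params = profile(model, inputs=(x,), custom_ops={nn.Hardswish: count_hardswish})
print(f"{flops / 1e9:.6f} GFLOPs")
```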

@jahongir7174
Author

Thank you very much for your time.
