
FLOPs count seems to be off #17

Closed
matthewygf opened this issue Jun 5, 2019 · 17 comments

Comments
@matthewygf

Hi Luke, thanks for your great work!

I am interested in the FLOPs of the models implemented.

I have always been using this to count FLOPs.

And in most cases, for models from torchvision:

  1. ResNet50
  2. VGG19
  3. DenseNet121
  4. DenseNet169
  5. ShuffleNetV2 (2.0x)

they all seem to match the FLOPs counts from their papers.

However, in this case it does not match, and I cannot think of a reason why. I traversed the code down to each module and even added FLOPs counts for both the padding inside the Conv2d block and the Swish activation.

Do you have any idea?

Thanks in advance.

@sdoria

sdoria commented Jun 5, 2019

Hi Matthew,
Could you share your FLOPs results? I am also wondering why EfficientNetB1 takes longer to train (time per epoch) than ResNet50, using FP16.
Thanks.

@matthewygf
Author

matthewygf commented Jun 5, 2019

@sdoria

ResNet50: 4 GFLOPs

ResNet50 has about 6 times the FLOPs of EfficientNetB1 here.

EfficientNetB0: 4.1 MFLOPs; paper reported: 3.9 MFLOPs
EfficientNetB1: 6.2 MFLOPs; paper reported: 7 MFLOPs
EfficientNetB2: 7.1 MFLOPs; paper reported: 1 GFLOPs
EfficientNetB3: 1 GFLOPs; paper reported: 1.8 GFLOPs

Hope it helps.

@matthewygf
Author

@sdoria Actually, I think one reason EfficientNetB1 could be slower than ResNet50 is the control-flow logic in the padding, i.e. deciding whether to pad.
GPU synchronization for divergent paths is costly; that might be the cause.

@sdoria
Copy link

sdoria commented Jun 5, 2019

I assume you are talking about the logic behind 'same' padding in Conv2dSamePadding. I don't think that's the problem. I have an alternative version that sets padding = kernel_size // 2 instead, and it trains at the same speed (time per epoch).
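For reference, the TF-style 'same' padding rule that a dynamic-padding conv computes can be written down in plain Python. This is an illustrative sketch (same_pad_total is not the library's API), and it also shows when the static kernel_size // 2 shortcut is exact:

```python
import math

def same_pad_total(in_size, kernel, stride, dilation=1):
    # Total padding that TF-style 'same' padding needs along one spatial dim,
    # so that out_size == ceil(in_size / stride).
    out = math.ceil(in_size / stride)
    return max((out - 1) * stride + (kernel - 1) * dilation + 1 - in_size, 0)

# For stride 1 and an odd kernel this reduces to 2 * (kernel // 2),
# which is why a static padding = kernel_size // 2 gives the same shapes.
```

With stride > 1 the two can differ by one pixel of padding on one side, which changes shapes only marginally but explains why the static variant is usually a safe drop-in.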

@zhjpqq

zhjpqq commented Jun 12, 2019

My FLOPs calculation code: http://studyai.com/article/a718990b

It is closer to the paper than your method, but still different.

@zhjpqq

zhjpqq commented Jun 15, 2019

For some papers, multiply_adds == True,
while for other papers multiply_adds == False.
The code is here: http://studyai.com/article/a718990b

Under multiply_adds == False, the results are below, which match the paper.

#model  depth  params  GFLOPs
effb0   82L    5.28M   0.393G
effb1   116L   7.79M   0.697G
effb2   116L   9.11M   1.007G
effb3   131L   12.23M  1.851G
effb4   161L   19.34M  4.443G
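The multiply_adds flag typically just decides whether one multiply-accumulate counts as one operation or two. A minimal sketch of a per-layer convolution count under that convention (conv2d_flops is an illustrative helper, not the linked script):

```python
def conv2d_flops(c_in, c_out, kernel, h_out, w_out,
                 groups=1, multiply_adds=False):
    # Each output element needs (c_in / groups) * k * k multiply-accumulates;
    # there are c_out * h_out * w_out output elements.
    macs = (c_in // groups) * kernel * kernel * c_out * h_out * w_out
    return 2 * macs if multiply_adds else macs

# Example: a 3x3 stride-2 stem conv, 3 -> 32 channels on a 224x224 input
# (112x112 output), counted as MACs (multiply_adds=False).
stem_macs = conv2d_flops(3, 32, 3, 112, 112)
```

Summing this over every conv layer (plus the classifier) is essentially what the counting scripts in this thread do, so the True/False choice alone accounts for a clean 2x gap between otherwise identical results.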

@matthewygf
Author

@zhjpqq thank you for the info :)

@maoyichun

@zhjpqq hello, I ran your code but the result is different from yours:
efficientnet-b0
*** Number of layers: 82, conv2d: 81, classifier: 1 ...

  • Number of FLOPs: 0.00802G

*** Number of params: 5.288548 million...

@matkalinowski
Contributor

matkalinowski commented Aug 22, 2020

@zhjpqq Could you share the code you used to get those results here? I am not able to access the site from the link you sent.

EDIT_0:
I probably found the source code that was mentioned earlier, and it is here, but I get the same results as @maoyichun....

EDIT_1:
The reason for the low FLOPs looks to be a missing calculation for conv operations. When running this script with:

  • multiply_adds=False
  • input_size set according to the EfficientNet definition

it works as described.

EDIT_2:
After running more models I found that the error grows in bigger models. The cause looks like an error in the static padding calculation, starting from _depthwise_conv in block 16. Run the code below on model B1 to see what I mean.

dict(model.named_modules())['_blocks.16._depthwise_conv']

I think I fixed it in this pull request: #223

I am also using a different flop_count method that I recommend.

@shawnricecake

For some papers, multiply_adds == True,
while for other papers multiply_adds == False.
The code is here: http://studyai.com/article/a718990b

Under multiply_adds == False, the results are below, which match the paper.

#model  depth  params  GFLOPs
effb0   82L    5.28M   0.393G
effb1   116L   7.79M   0.697G
effb2   116L   9.11M   1.007G
effb3   131L   12.23M  1.851G
effb4   161L   19.34M  4.443G

Hi,
I cannot open the link you gave above.
Can you give me a new one for calculating the FLOPs of EfficientNet?

@matkalinowski
Contributor

@shen494157765 Use this one: https://github.com/facebookresearch/fvcore/blob/ffd5dfff8ee6d5a88939376f208b08022562e789/fvcore/nn/flop_count.py#L28 it should work just fine.

@shawnricecake

@shen494157765 Use this one: https://github.com/facebookresearch/fvcore/blob/ffd5dfff8ee6d5a88939376f208b08022562e789/fvcore/nn/flop_count.py#L28 it should work just fine.

Hi,

Yes, I used the fvcore package, and my code is below:

import torch
from fvcore.nn import flop_count
from efficientnet_pytorch import EfficientNet

model = EfficientNet.from_name('efficientnet-b2')
netinput = torch.randn(1, 3, 224, 224)
final_count, skipped_ops = flop_count(model, (netinput,))
print(final_count)

but the result is

Skipped operation aten::batch_norm 69 time(s)
Skipped operation prim::PythonOp 69 time(s)
Skipped operation aten::adaptive_avg_pool2d 24 time(s)
Skipped operation aten::sigmoid 23 time(s)
Skipped operation aten::mul 39 time(s)
Skipped operation aten::rand 16 time(s)
Skipped operation aten::add 32 time(s)
Skipped operation aten::div 16 time(s)
Skipped operation aten::dropout 1 time(s)
defaultdict(<class 'float'>, {'conv': 0.65755544, 'addmm': 0.001408})

which is different from the result in the paper (EfficientNet-B2).
For the other models (EfficientNet-B0, B1, ..., B7), the results are also different.

Thanks
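One thing that may explain part of the gap: fvcore counts at whatever resolution you feed it, and the 224x224 input in the snippet above is not the 260x260 the paper uses for B2; conv cost scales roughly with spatial area. A back-of-the-envelope rescale of the reported 'conv' count, purely illustrative:

```python
conv_count_224 = 0.65755544      # the 'conv' entry from the fvcore output above
paper_res, used_res = 260, 224   # B2 input resolution: paper vs. the snippet above

# Conv FLOPs grow roughly with the number of output pixels, i.e. with area.
scaled = conv_count_224 * (paper_res / used_res) ** 2
print(round(scaled, 3))  # noticeably closer to the paper's ~1.0G; the
                         # remaining gap includes the skipped ops fvcore listed
```

This is only a rough sanity check, not an exact correction, since stride boundaries and the skipped operations (batch_norm, sigmoid, mul, ...) do not rescale perfectly.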

@matkalinowski
Contributor

There is ongoing work on the PyTorch side. There are multiple packages to calculate FLOPs, but this one is the closest to the original that I was able to find.

Other packages you can try:

@shawnricecake

There is ongoing work on the PyTorch side. There are multiple packages to calculate FLOPs, but this one is the closest to the original that I was able to find.

Other packages you can try:

Hi,

Thanks for your reply. I have tried that 'ongoing' work, and I got results similar to those I got from the fvcore tool (https://github.com/facebookresearch/fvcore/blob/master/fvcore/nn/flop_count.py).

Now I think the only way to calculate the FLOPs of this EfficientNet is to calculate them myself.....

Thanks

@matkalinowski
Contributor

Also @shen494157765, please make sure you are using the newest version of this library. I have proposed a fix (#223) to the architecture recently.

@shawnricecake

Also @shen494157765, please make sure you are using the newest version of this library. I have proposed a fix (#223) to the architecture recently.

Hi,

Yes, I used the latest version when I calculated the FLOPs.
What I did yesterday was calculate the FLOPs of EfficientNet-B0 by myself, i.e. I counted every Conv2d in EfficientNet-B0 one by one. (If you want to see it, I can share it with you.)

The result is that EfficientNet-B0 has 0.821842112 GFLOPs (about 0.41 GMACs), which is close to the 0.39 reported in the paper.

(About GFLOPs vs. GMACs: I think the result in the paper is GMACs, not GFLOPs, but anyway, many papers use the same kind of definition.)

But the other EfficientNets (B1 - B7) have too many layers, so I did not verify them.

For my own experiments, what I need is only part of the layers of EfficientNet, and that is why I need to know how to calculate its FLOPs.

For now, I just calculated it manually for every Conv2d.

If you know of a tool that gets a result similar to the EfficientNet paper, please let me know.

I do not know if you have tried the tool you recommended above (the one with the closest result) on EfficientNet-B6 or B7; that was a real disaster.

Thanks
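On the GMACs-versus-GFLOPs point: the two differ only by the convention that one multiply-accumulate equals two floating-point operations, which is exactly the factor between the manual count above and the paper's number (values copied from the comment; macs_to_flops is an illustrative name):

```python
def macs_to_flops(macs):
    # One multiply-accumulate = 1 multiply + 1 add = 2 FLOPs.
    return 2 * macs

manual_b0_gflops = 0.821842112          # manual conv-by-conv count above
manual_b0_gmacs = manual_b0_gflops / 2  # ~0.411 GMACs, vs. 0.39G in the paper
```

So a counter that reports "FLOPs" with multiply_adds == False is really reporting MACs, and halving (or doubling) is usually all that separates two otherwise matching tools.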

@Monkey-D-Luffy-star

@matthewygf I think your input size may be (3, 224, 224) for all models (efficientnet-b0, efficientnet-b1, etc.), but the input size in the paper varies per model.
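The per-model input resolutions, transcribed from the EfficientNet paper, as a plain dict (the efficientnet_pytorch library also exposes this via EfficientNet.get_image_size(model_name)):

```python
# Input resolution per model variant, as reported in the EfficientNet paper.
PAPER_RESOLUTIONS = {
    'efficientnet-b0': 224, 'efficientnet-b1': 240,
    'efficientnet-b2': 260, 'efficientnet-b3': 300,
    'efficientnet-b4': 380, 'efficientnet-b5': 456,
    'efficientnet-b6': 528, 'efficientnet-b7': 600,
}
```

Feeding each model its own resolution when counting is necessary to reproduce the paper's FLOPs column, since conv cost grows with input area.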
