
the input size of Flops is 256x256? #50

Closed
Sunting78 opened this issue Nov 15, 2021 · 4 comments
Comments

@Sunting78

https://github.com/facebookresearch/detectron2/blob/main/tools/analyze_model.py

Hi Bowen. I calculated the FLOPs and params with this script, but my result does not match your paper.
For maskformer_swin_small_bs16_160k.yaml I get 63M params and 111G FLOPs, while your paper reports 63M params and 79G FLOPs. Is there a problem with my calculation? When the input is resized to 256x256, the result is similar to the paper's.

`python3 analyze_model.py --config-file ./configs/ade20k-150/swin/maskformer_swin_small_bs16_160k.yaml --tasks flop`

Augmentations used in inference: [ResizeShortestEdge(short_edge_length=(512, 512), max_size=2048, sample_style='choice')]
[11/15 13:41:29 detectron2]: Flops table computed from only one input sample:

| module                                        | #parameters or shape   | #flops     |
|:----------------------------------------------|:-----------------------|:-----------|
| model                                         | 63.075M                | 80.909G    |
|  backbone                                     |  48.839M               |  49.38G    |
|   backbone.patch_embed                        |   4.896K               |   83.362M  |
|    backbone.patch_embed.proj                  |    4.704K              |    75.497M |
|    backbone.patch_embed.norm                  |    0.192K              |    7.864M  |
|   backbone.layers                             |   48.831M              |   49.282G  |
|    backbone.layers.0                          |    0.299M              |    4.394G  |
|    backbone.layers.1                          |    1.188M              |    4.367G  |
|    backbone.layers.2                          |    33.16M              |    35.953G |
|    backbone.layers.3.blocks                   |    14.184M             |    4.567G  |
|   backbone.norm0                              |   0.192K               |   7.864M   |
|    backbone.norm0.weight                      |    (96,)               |            |
|    backbone.norm0.bias                        |    (96,)               |            |
|   backbone.norm1                              |   0.384K               |   3.932M   |
|    backbone.norm1.weight                      |    (192,)              |            |
|    backbone.norm1.bias                        |    (192,)              |            |
|   backbone.norm2                              |   0.768K               |   1.966M   |
|    backbone.norm2.weight                      |    (384,)              |            |
|    backbone.norm2.bias                        |    (384,)              |            |
|   backbone.norm3                              |   1.536K               |   0.983M   |
|    backbone.norm3.weight                      |    (768,)              |            |
|    backbone.norm3.bias                        |    (768,)              |            |
|  sem_seg_head                                 |  14.236M               |  27.453G   |
|   sem_seg_head.pixel_decoder                  |   4.305M               |   23.56G   |
|    sem_seg_head.pixel_decoder.adapter_1       |    25.088K             |    0.424G  |
|    sem_seg_head.pixel_decoder.layer_1         |    0.59M               |    9.685G  |
|    sem_seg_head.pixel_decoder.adapter_2       |    49.664K             |    0.207G  |
|    sem_seg_head.pixel_decoder.layer_2         |    0.59M               |    2.421G  |
|    sem_seg_head.pixel_decoder.adapter_3       |    98.816K             |    0.102G  |
|    sem_seg_head.pixel_decoder.layer_3         |    0.59M               |    0.605G  |
|    sem_seg_head.pixel_decoder.layer_4         |    1.77M               |    0.453G  |
|    sem_seg_head.pixel_decoder.mask_features   |    0.59M               |    9.664G  |
|   sem_seg_head.predictor                      |   9.932M               |   3.887G   |
|    sem_seg_head.predictor.transformer.decoder |    9.473M              |    1.179G  |
|    sem_seg_head.predictor.query_embed         |    25.6K               |            |
|    sem_seg_head.predictor.input_proj          |    0.197M              |    50.332M |
|    sem_seg_head.predictor.class_embed         |    38.807K             |    23.194M |
|    sem_seg_head.predictor.mask_embed.layers   |    0.197M              |    0.118G  |
[11/15 13:41:29 detectron2]: Average GFlops for each type of operators:
[('conv', 32.83191595008), ('layer_norm', 0.22296760319999998), ('linear', 67.07614236672), ('matmul', 1.92566500224), ('group_norm', 0.0769406976), ('upsample_nearest2d', 0.00764854272), ('bmm', 0.139984896), ('einsum', 8.959275), ('upsample_bilinear2d', 0.29302461)]
[11/15 13:41:29 detectron2]: Total GFlops: 111.5±12.8
@bowenc0221
Contributor

We calculate FLOPs with the corresponding training crop size: if the table says 512x512, it means we calculate FLOPs with an image of size 512x512.

The augmentation in the config uses [ResizeShortestEdge(short_edge_length=(512, 512), max_size=2048, sample_style='choice')], which only resizes the shorter side to 512; the longer side can be larger than 512.
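
As a quick check, here is a minimal sketch of that behavior (assuming detectron2's transforms API; the 512x683 shape is a hypothetical ADE20K-like image):

```python
import numpy as np
from detectron2.data.transforms import ResizeShortestEdge

# The same augmentation as in the config.
aug = ResizeShortestEdge(short_edge_length=(512, 512), max_size=2048, sample_style="choice")

# Hypothetical validation image: shorter side 512, longer side 683.
image = np.zeros((512, 683, 3), dtype=np.uint8)
tfm = aug.get_transform(image)
print(tfm.new_h, tfm.new_w)  # 512 683: the shorter side becomes 512, the longer side stays larger
```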

You can measure the FLOPs by feeding a dummy image of size 512x512 instead of using ADE20K images.
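
A rough sketch of that measurement, assuming detectron2's FlopCountAnalysis wrapper and following the config-setup pattern in this repo's demo (the config hooks and path are taken from the repository; treat the exact plumbing as an assumption):

```python
import torch
from detectron2.config import get_cfg
from detectron2.modeling import build_model
from detectron2.projects.deeplab import add_deeplab_config
from detectron2.utils.analysis import FlopCountAnalysis
from mask_former import add_mask_former_config  # project config hook, as in demo/demo.py

cfg = get_cfg()
add_deeplab_config(cfg)
add_mask_former_config(cfg)
cfg.merge_from_file("configs/ade20k-150/swin/maskformer_swin_small_bs16_160k.yaml")
model = build_model(cfg)
model.eval()

# detectron2 models take a list of dicts; feed one fixed-size dummy image
# instead of a ResizeShortestEdge-augmented ADE20K image.
inputs = [{"image": torch.zeros(3, 512, 512)}]
with torch.no_grad():
    flops = FlopCountAnalysis(model, inputs)
    print(flops.total() / 1e9, "GFlops")
```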

@Sunting78
Author

Yes, I tried that. When I calculate FLOPs with an image of size 256x256, it matches the FLOPs reported in your paper; 512x512 does not. Could you check it again, please?

@bowenc0221
Contributor

I'm sure the input size is 512x512; here is my output:

[11/30 11:23:08 detectron2]: Flops table computed from only one input sample:
| module                                        | #parameters or shape   | #flops     |
|:----------------------------------------------|:-----------------------|:-----------|
| model                                         | 63.075M                | 81.079G    |
|  backbone                                     |  48.839M               |  49.38G    |
|   backbone.patch_embed                        |   4.896K               |   83.362M  |
|    backbone.patch_embed.proj                  |    4.704K              |    75.497M |
|    backbone.patch_embed.norm                  |    0.192K              |    7.864M  |
|   backbone.layers                             |   48.831M              |   49.282G  |
|    backbone.layers.0                          |    0.299M              |    4.394G  |
|    backbone.layers.1                          |    1.188M              |    4.367G  |
|    backbone.layers.2                          |    33.16M              |    35.953G |
|    backbone.layers.3.blocks                   |    14.184M             |    4.567G  |
|   backbone.norm0                              |   0.192K               |   7.864M   |
|    backbone.norm0.weight                      |    (96,)               |            |
|    backbone.norm0.bias                        |    (96,)               |            |
|   backbone.norm1                              |   0.384K               |   3.932M   |
|    backbone.norm1.weight                      |    (192,)              |            |
|    backbone.norm1.bias                        |    (192,)              |            |
|   backbone.norm2                              |   0.768K               |   1.966M   |
|    backbone.norm2.weight                      |    (384,)              |            |
|    backbone.norm2.bias                        |    (384,)              |            |
|   backbone.norm3                              |   1.536K               |   0.983M   |
|    backbone.norm3.weight                      |    (768,)              |            |
|    backbone.norm3.bias                        |    (768,)              |            |
|  sem_seg_head                                 |  14.236M               |  27.453G   |
|   sem_seg_head.pixel_decoder                  |   4.305M               |   23.56G   |
|    sem_seg_head.pixel_decoder.adapter_1       |    25.088K             |    0.424G  |
|    sem_seg_head.pixel_decoder.layer_1         |    0.59M               |    9.685G  |
|    sem_seg_head.pixel_decoder.adapter_2       |    49.664K             |    0.207G  |
|    sem_seg_head.pixel_decoder.layer_2         |    0.59M               |    2.421G  |
|    sem_seg_head.pixel_decoder.adapter_3       |    98.816K             |    0.102G  |
|    sem_seg_head.pixel_decoder.layer_3         |    0.59M               |    0.605G  |
|    sem_seg_head.pixel_decoder.layer_4         |    1.77M               |    0.453G  |
|    sem_seg_head.pixel_decoder.mask_features   |    0.59M               |    9.664G  |
|   sem_seg_head.predictor                      |   9.932M               |   3.887G   |
|    sem_seg_head.predictor.transformer.decoder |    9.473M              |    1.179G  |
|    sem_seg_head.predictor.query_embed         |    25.6K               |            |
|    sem_seg_head.predictor.input_proj          |    0.197M              |    50.332M |
|    sem_seg_head.predictor.class_embed         |    38.807K             |    23.194M |
|    sem_seg_head.predictor.mask_embed.layers   |    0.197M              |    0.118G  |
[11/30 11:23:08 detectron2]: Average GFlops for each type of operators:
[('conv', 23.630708736), ('layer_norm', 0.16134144), ('linear', 48.940320768), ('matmul', 1.413401472), ('group_norm', 0.05537792), ('upsample_nearest2d', 0.005505024), ('bmm', 0.1093632), ('einsum', 6.4485), ('upsample_bilinear2d', 0.3146752)]
[11/30 11:23:08 detectron2]: Total GFlops: 81.1±0.0

The FLOPs are 81.1G (compared to 79G in the paper); the slight increase is probably due to an update in the fvcore package.

@bowenc0221
Contributor

I have committed the script for calculating FLOPs.

Please use the following command:

`python tools/analyze_model.py --num-inputs 1 --tasks flop --use-fixed-input-size --config-file configs/ade20k-150/swin/maskformer_swin_small_bs16_160k.yaml MODEL.WEIGHTS ""`
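
Presumably, --use-fixed-input-size makes the script count FLOPs on a dummy input at the training crop size (512x512 for this config) instead of on ResizeShortestEdge-augmented validation images, matching the dummy-input sketch above; that would explain why it reproduces the paper's numbers where the default augmentation does not.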
