
Could the author share the code for calculating the model parameters (Params) and the computational complexity (MACs) of the pipeline? #53

Closed
StormArcher opened this issue Feb 29, 2024 · 7 comments · Fixed by #54

Comments

@StormArcher

Could the author share the code for calculating the model parameters (Params) and the computational complexity (MACs) of the pipeline? Thank you very much!

@bokyeong1015
Member

Hi, we've added the code. Please run:

pip install thop==0.1.1.post2209072238
python src/count_macs_params.py
Results:

== CompVis/stable-diffusion-v1-4 | 512x512 img generation ==
[Text Enc] MACs: 6.5G = 6545882112
[Text Enc] Params: 123.1M = 123060480
[U-Net] MACs: 338.7G = 338749194240
[U-Net] Params: 859.5M = 859520964
[Img Dec] MACs: 1240.1G = 1240079532032
[Img Dec] Params: 49.5M = 49490179
[Total] MACs: 1585.4G = 1585374608384
[Total] Params: 1032.1M = 1032071623

== nota-ai/bk-sdm-base | 512x512 img generation ==
[Text Enc] MACs: 6.5G = 6545882112
[Text Enc] Params: 123.1M = 123060480
[U-Net] MACs: 223.8G = 223755632640
[U-Net] Params: 579.4M = 579384964
[Img Dec] MACs: 1240.1G = 1240079532032
[Img Dec] Params: 49.5M = 49490179
[Total] MACs: 1470.4G = 1470381046784
[Total] Params: 751.9M = 751935623

== nota-ai/bk-sdm-small | 512x512 img generation ==
[Text Enc] MACs: 6.5G = 6545882112
[Text Enc] Params: 123.1M = 123060480
[U-Net] MACs: 217.7G = 217727959040
[U-Net] Params: 482.3M = 482346884
[Img Dec] MACs: 1240.1G = 1240079532032
[Img Dec] Params: 49.5M = 49490179
[Total] MACs: 1464.4G = 1464353373184
[Total] Params: 654.9M = 654897543

== nota-ai/bk-sdm-tiny | 512x512 img generation ==
[Text Enc] MACs: 6.5G = 6545882112
[Text Enc] Params: 123.1M = 123060480
[U-Net] MACs: 205.0G = 205035274240
[U-Net] Params: 323.4M = 323384964
[Img Dec] MACs: 1240.1G = 1240079532032
[Img Dec] Params: 49.5M = 49490179
[Total] MACs: 1451.7G = 1451660688384
[Total] Params: 495.9M = 495935623

== runwayml/stable-diffusion-v1-5 | 512x512 img generation ==
[Text Enc] MACs: 6.5G = 6545882112
[Text Enc] Params: 123.1M = 123060480
[U-Net] MACs: 338.7G = 338749194240
[U-Net] Params: 859.5M = 859520964
[Img Dec] MACs: 1240.1G = 1240079532032
[Img Dec] Params: 49.5M = 49490179
[Total] MACs: 1585.4G = 1585374608384
[Total] Params: 1032.1M = 1032071623

== stabilityai/stable-diffusion-2-1-base | 512x512 img generation ==
[Text Enc] MACs: 22.3G = 22299160576
[Text Enc] Params: 340.4M = 340387840
[U-Net] MACs: 339.2G = 339241205760
[U-Net] Params: 865.9M = 865910724
[Img Dec] MACs: 1240.1G = 1240079532032
[Img Dec] Params: 49.5M = 49490179
[Total] MACs: 1601.6G = 1601619898368
[Total] Params: 1255.8M = 1255788743

== stabilityai/stable-diffusion-2-1 | 768x768 img generation ==
[Text Enc] MACs: 22.3G = 22299160576
[Text Enc] Params: 340.4M = 340387840
[U-Net] MACs: 760.8G = 760797839360
[U-Net] Params: 865.9M = 865910724
[Img Dec] MACs: 1240.1G = 1240079532032
[Img Dec] Params: 49.5M = 49490179
[Total] MACs: 2023.2G = 2023176531968
[Total] Params: 1255.8M = 1255788743

@StormArcher
Author

We followed the author's code for testing, but the MACs results differ greatly from those in the paper.
Is there a problem with how we ran the code?

== CompVis/stable-diffusion-v1-4 | 512x512 img generation ==
[Text Enc] MACs: 6.5G = 6545882112
[Text Enc] Params: 123.1M = 123060480
[U-Net] MACs: 0.2G = 232980480
[U-Net] Params: 859.5M = 859520964
[Img Dec] MACs: 1.0G = 981467136
[Img Dec] Params: 49.5M = 49490179
[Total] MACs: 7.8G = 7760329728
[Total] Params: 1032.1M = 1032071623

bokyeong1015 reopened this Mar 1, 2024
@bokyeong1015
Member

We obtained the results below with the refactored and uploaded code, which are identical to those presented in our paper.

== CompVis/stable-diffusion-v1-4 | 512x512 img generation ==
[Text Enc] MACs: 6.5G = 6545882112
[Text Enc] Params: 123.1M = 123060480
[U-Net] MACs: 338.7G = 338749194240
[U-Net] Params: 859.5M = 859520964
[Img Dec] MACs: 1240.1G = 1240079532032
[Img Dec] Params: 49.5M = 49490179
[Total] MACs: 1585.4G = 1585374608384
[Total] Params: 1032.1M = 1032071623

Could you share the output of pip show thop so we can check the version and confirm that you installed thop==0.1.1.post2209072238? Sharing the exact procedure or code you ran would also help us reproduce your issue.

@StormArcher
Author

StormArcher commented Mar 1, 2024

1. I ran your code directly, just for "CompVis/stable-diffusion-v1-4", as follows:
   get_macs_params(model_id="CompVis/stable-diffusion-v1-4", img_size=512, txt_emb_size=768, device=device)

2. The thop version is the same as yours:
   import thop
   print(thop.__version__)
   # 0.1.1

3. Why does the profiling run only a single step?
   batch_size = 1
   dummy_timesteps = torch.zeros(batch_size).to(device)

How do we control the number of sampling steps?

The code is as follows:

# ------------------------------------------------------------------------------------
# Copyright 2024. Nota Inc. All Rights Reserved.
# ------------------------------------------------------------------------------------
import torch
from diffusers import StableDiffusionPipeline
from thop import profile


def count_params(model):
    return sum(p.numel() for p in model.parameters())


def get_macs_params(model_id, img_size=512, txt_emb_size=768, device="cuda", batch_size=1):
    pipeline = StableDiffusionPipeline.from_pretrained(model_id).to(device)
    text_encoder = pipeline.text_encoder
    unet = pipeline.unet
    vae_decoder = pipeline.vae.decoder

    # text encoder
    dummy_input_ids = torch.zeros(batch_size, 77).long().to(device)  # (1, 77)
    macs_txt_enc, _ = profile(text_encoder, inputs=(dummy_input_ids,))
    macs_txt_enc = macs_txt_enc / batch_size
    params_txt_enc = count_params(text_encoder)

    # unet
    dummy_noisy_latents = torch.zeros(batch_size, 4, int(img_size/8), int(img_size/8)).to(device)  # (1, 4, 64, 64) for 512x512
    dummy_timesteps = torch.zeros(batch_size).to(device)  # (1,)
    dummy_text_emb = torch.zeros(batch_size, 77, txt_emb_size).to(device)  # (1, 77, 768)
    macs_unet, _ = profile(unet, inputs=(dummy_noisy_latents, dummy_timesteps, dummy_text_emb))
    macs_unet = macs_unet / batch_size
    params_unet = count_params(unet)

    # image decoder
    dummy_latents = torch.zeros(batch_size, 4, 64, 64).to(device)  # (1, 4, 64, 64)
    macs_img_dec, _ = profile(vae_decoder, inputs=(dummy_latents,))
    macs_img_dec = macs_img_dec / batch_size
    params_img_dec = count_params(vae_decoder)

    # total
    macs_total = macs_txt_enc + macs_unet + macs_img_dec
    params_total = params_txt_enc + params_unet + params_img_dec

    # print
    print(f"== {model_id} | {img_size}x{img_size} img generation ==")
    print(f"  [Text Enc] MACs: {(macs_txt_enc/1e9):.1f}G = {int(macs_txt_enc)}")
    print(f"  [Text Enc] Params: {(params_txt_enc/1e6):.1f}M = {int(params_txt_enc)}")
    print(f"  [U-Net] MACs: {(macs_unet/1e9):.1f}G = {int(macs_unet)}")
    print(f"  [U-Net] Params: {(params_unet/1e6):.1f}M = {int(params_unet)}")
    print(f"  [Img Dec] MACs: {(macs_img_dec/1e9):.1f}G = {int(macs_img_dec)}")
    print(f"  [Img Dec] Params: {(params_img_dec/1e6):.1f}M = {int(params_img_dec)}")
    print(f"  [Total] MACs: {(macs_total/1e9):.1f}G = {int(macs_total)}")
    print(f"  [Total] Params: {(params_total/1e6):.1f}M = {int(params_total)}")


if __name__ == "__main__":
    device = "cuda:0"
    get_macs_params(model_id="CompVis/stable-diffusion-v1-4", img_size=512, txt_emb_size=768, device=device)
    # get_macs_params(model_id="nota-ai/bk-sdm-base", img_size=512, txt_emb_size=768, device=device)
    # get_macs_params(model_id="nota-ai/bk-sdm-small", img_size=512, txt_emb_size=768, device=device)
    # get_macs_params(model_id="nota-ai/bk-sdm-tiny", img_size=512, txt_emb_size=768, device=device)

    # get_macs_params(model_id="runwayml/stable-diffusion-v1-5", img_size=512, txt_emb_size=768, device=device)
    # get_macs_params(model_id="stabilityai/stable-diffusion-2-1-base", img_size=512, txt_emb_size=1024, device=device)
    # get_macs_params(model_id="stabilityai/stable-diffusion-2-1", img_size=768, txt_emb_size=1024, device=device)

@bokyeong1015
Member

bokyeong1015 commented Mar 1, 2024

1 - Thanks for checking.
3 - Do you mean how to control the number of denoising steps? We calculated the MACs for a single denoising step and then multiplied that by the total number of steps, which is 25. dummy_timesteps is a single scalar value for the timestep index.
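
For instance, with the SD-v1.4 numbers above, the rough per-image arithmetic (a minimal sketch, assuming the text encoder and image decoder each run once per image while the U-Net runs once per denoising step) is:

# Rough sketch: MACs for generating one 512x512 image with SD-v1.4,
# assuming 1 text-encoder pass, 25 U-Net steps, and 1 image-decoder pass.
macs_txt_enc = 6_545_882_112          # single text-encoder pass
macs_unet_per_step = 338_749_194_240  # single U-Net (denoising) step
macs_img_dec = 1_240_079_532_032      # single image-decoder pass
num_steps = 25

macs_per_image = macs_txt_enc + num_steps * macs_unet_per_step + macs_img_dec
print(f"{macs_per_image / 1e12:.2f} TMACs per image")  # ~9.72 TMACs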


2 - Umm, when we ran pip install thop==0.1.1.post2209072238 and then your code, we obtained the log below. Please share:

  • the output of pip show thop (0.1.1 seems to have multiple tags)
  • the output of pip show diffusers
  • the full log you've obtained

The log we obtained:

`text_config_dict` is provided which will be used to initialize `CLIPTextConfig`. The value `text_config["id2label"]` will be overriden.
`text_config_dict` is provided which will be used to initialize `CLIPTextConfig`. The value `text_config["bos_token_id"]` will be overriden.
`text_config_dict` is provided which will be used to initialize `CLIPTextConfig`. The value `text_config["eos_token_id"]` will be overriden.
[INFO] Register count_linear() for <class 'torch.nn.modules.linear.Linear'>.
[INFO] Register count_normalization() for <class 'torch.nn.modules.normalization.LayerNorm'>.
[INFO] Register count_convNd() for <class 'torch.nn.modules.conv.Conv2d'>.
[INFO] Register count_linear() for <class 'torch.nn.modules.linear.Linear'>.
[INFO] Register count_normalization() for <class 'torch.nn.modules.normalization.LayerNorm'>.
[INFO] Register zero_ops() for <class 'torch.nn.modules.dropout.Dropout'>.
[INFO] Register count_convNd() for <class 'torch.nn.modules.conv.Conv2d'>.
[INFO] Register zero_ops() for <class 'torch.nn.modules.dropout.Dropout'>.
[INFO] Register count_linear() for <class 'torch.nn.modules.linear.Linear'>.
== CompVis/stable-diffusion-v1-4 | 512x512 img generation ==
  [Text Enc] MACs: 6.5G = 6545882112
  [Text Enc] Params: 123.1M = 123060480
  [U-Net] MACs: 338.7G = 338749194240
  [U-Net] Params: 859.5M = 859520964
  [Img Dec] MACs: 1240.1G = 1240079532032
  [Img Dec] Params: 49.5M = 49490179
  [Total] MACs: 1585.4G = 1585374608384
  [Total] Params: 1032.1M = 1032071623

Output of pip show thop:

Name: thop
Version: 0.1.1.post2209072238
Summary: A tool to count the FLOPs of PyTorch model.
Home-page: https://github.com/Lyken17/pytorch-OpCounter/

Output of pip show diffusers:

Name: diffusers
Version: 0.15.0
Summary: Diffusers
Home-page: https://github.com/huggingface/diffusers

@StormArcher
Author

StormArcher commented Mar 1, 2024

1. I think the author may have forgotten to apply "/8" to img_size for the U-Net input, which is why the U-Net computational complexity comes out to 339G; with "img_size/8" applied, it should match mine (0.2G):

   if img_size/8, it is 0.2G
   if img_size/4, it is 0.9G
   if img_size/2, it is 3.7G

2. I think my versions of thop and diffusers are the same as yours, but the U-Net MACs of my run for "runwayml/stable-diffusion-v1-5" are as follows:

-> the output of pip show thop
Name: thop
Version: 0.1.1.post2209072238
Summary: A tool to count the FLOPs of PyTorch model.
Home-page: https://github.com/Lyken17/pytorch-OpCounter/
Author: Ligeng Zhu
Author-email: ligeng.zhu+github@gmail.com
License: MIT
Location: /opt/conda/lib/python3.8/site-packages
Requires: torch
Required-by:

-> the output of pip show diffusers
Name: diffusers
Version: 0.15.0.dev0
Summary: Diffusers
Home-page: https://github.com/huggingface/diffusers
Author: The HuggingFace team
Author-email: patrick@huggingface.co
License: Apache
Location: /opt/conda/lib/python3.8/site-packages
Editable project location: /home/pansiyuan/.jupyter/diffusers
Requires: filelock, huggingface-hub, importlib-metadata, numpy, Pillow, regex, requests
Required-by:

== CompVis/stable-diffusion-v1-4 | 512x512 img generation ==
[Text Enc] MACs: 6.5G = 6545882112
[Text Enc] Params: 123.1M = 123060480
[U-Net] MACs: 0.2G = 232980480
[U-Net] Params: 859.5M = 859520964
[Img Dec] MACs: 1.0G = 981467136
[Img Dec] Params: 49.5M = 49490179
[Total] MACs: 7.8G = 7760329728
[Total] Params: 1032.1M = 1032071623

@bokyeong1015
Member

I think the author may have forgotten to add “/8” for img_size of input of UNet, so the computational complexity of unet is (339G), if "img_size/8" should be the same as mine (0.2G)

We don't quite follow this point. The division by 8 ("/8") is already applied correctly in our code.

  • If we put img_size=512, the latent input size for the U-Net becomes 1x4x64x64, which is correct.
dummy_noisy_latents = torch.zeros(batch_size, 4, int(img_size/8), int(img_size/8)).to(device)
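
As a quick sanity check (a minimal sketch, separate from count_macs_params.py), diffusers exposes the VAE downsampling factor, so the 512-to-64 latent mapping can be verified directly:

# Verify the factor-8 VAE downsampling used for the U-Net latent input.
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
print(pipe.vae_scale_factor)         # 8 for SD v1.x
print(512 // pipe.vae_scale_factor)  # 64 -> latent spatial size for 512x512 images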

Furthermore, when we changed img_size as you mentioned, we obtained the following results and were not able to reproduce your 0.2G result for img_size/8:

# get_macs_params(model_id="CompVis/stable-diffusion-v1-4", img_size=512, txt_emb_size=768, device=device)

== CompVis/stable-diffusion-v1-4 | 512x512 img generation ==
  [U-Net] MACs: 338.7G = 338749194240
  [U-Net] Params: 859.5M = 859520964
# get_macs_params(model_id="CompVis/stable-diffusion-v1-4", img_size=64, txt_emb_size=768, device=device)

== CompVis/stable-diffusion-v1-4 | 64x64 img generation ==
  [U-Net] MACs: 6.8G = 6773345280
  [U-Net] Params: 859.5M = 859520964

Thanks for checking. Unfortunately, we are not sure we can provide further assistance at this moment, as we were unable to reproduce the issue you described.
