You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I implemented a script to test the acceleration by myself and tested it on 3090GPU, but it could not achieve the acceleration effect of Table 4 in the paper.
import torch, tomesd, random, time
import numpy as np
from diffusers import StableDiffusionPipeline, DDIMScheduler, PNDMScheduler
def seed_everything(seed):
torch.manual_seed(seed)
torch.cuda.manual_seed(seed)
random.seed(seed)
np.random.seed(seed)
seed = 2024
seed_everything(seed)
batch_size = 1
num_inference_steps = 50
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16).to("cuda")
# pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
pipe.scheduler = PNDMScheduler.from_config(pipe.scheduler.config)
prompt = "a photo of an astronaut riding a horse on mars"
generator = torch.Generator().manual_seed(seed)
# warmup gpu for testing time
print('Warm up of the gpu')
for i in range(2):
image = pipe([prompt] * batch_size, num_inference_steps=num_inference_steps)
#-------------------
# origin pipeline
start_time = time.time()
image = pipe([prompt] * batch_size, num_inference_steps=num_inference_steps, generator=generator).images[0]
end_time = time.time()
print("Origin Pipeline: {:.3f} seconds".format(end_time-start_time))
image.save("results/orgin.png")
# Apply ToMe with a 50% merging ratio
generator = torch.Generator().manual_seed(seed)
tomesd.apply_patch(pipe, ratio=0.5) # Can also use pipe.unet in place of pipe here
start_time = time.time()
image = pipe([prompt] * batch_size, num_inference_steps=num_inference_steps, generator=generator).images[0]
end_time = time.time()
print("ToMe: {:.3f} seconds".format(end_time-start_time))
image.save("results/orgin_ToMe.png")
You might not be giving your GPU enough work. Try increasing the batch size and image size and see what happens.
The results in the paper were done using the original stable diffusion repo, not diffusers. Diffusers itself has implemented a bunch of different optimizations that eat into the speed-up of ToMe, but you should still see gains at bigger image sizes.
I implemented a script to test the acceleration by myself and tested it on 3090GPU, but it could not achieve the acceleration effect of Table 4 in the paper.
The output from this script is:
In Table 4, the speedup is almost doubled when r=50. Why is this the case in my test?
Could you give me some advice?
The text was updated successfully, but these errors were encountered: