No speed up for stable-diffusion-x4-upscaler #21

Open
SapirW opened this issue Apr 19, 2023 · 0 comments
Hi,

I am getting little to no speed-up when applying tomesd to the SD2 x4 upscaler model (stable-diffusion-x4-upscaler) through diffusers.
Do you have any ideas about why token merging does not help here?
Times were measured on an A100 with batch size 1 and averaged over 100 repetitions. Input image sizes are 64, 128, and 256.

torch==2.0.0
diffusers==0.15.0

Code:

import torch, tomesd
from diffusers import StableDiffusionUpscalePipeline
from tqdm import tqdm
from PIL import Image
from pprint import pprint
import requests
from io import BytesIO


# Download the low-resolution test image.
url = "https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/sd2-upscale/low_res_cat.png"
response = requests.get(url)
low_res_img = Image.open(BytesIO(response.content)).convert("RGB")


generator = torch.Generator()
repetitions = 100
res_dicts = {}
for size in [64, 128, 256]:
    res_dict = {}
    print(size)
    # Resize from the original image so every size starts from the same source.
    resized_img = low_res_img.resize((size, size))
    for i in tqdm(range(0, 6)):
        ratio = i / 10  # token merging ratios 0.0 to 0.5
        generation_time = 0  # accumulated GPU time in milliseconds
        # Load a fresh pipeline and patch it with the current merging ratio.
        pipe_tomesd = StableDiffusionUpscalePipeline.from_pretrained("stabilityai/stable-diffusion-x4-upscaler", torch_dtype=torch.float16).to("cuda")
        tomesd.apply_patch(pipe_tomesd, ratio=ratio)
        for repetition in tqdm(range(repetitions)):
            generator.manual_seed(171198)
            start = torch.cuda.Event(enable_timing=True)
            end = torch.cuda.Event(enable_timing=True)
            start.record()
            res = pipe_tomesd(prompt="photograph", image=resized_img, generator=generator, output_type="np").images[0]
            end.record()
            torch.cuda.synchronize()
            generation_time += start.elapsed_time(end)
        # elapsed_time() returns milliseconds; store the average in seconds.
        average_seconds = (generation_time / repetitions) * 0.001
        res_dict[f"{ratio}"] = average_seconds
        print(f"Average for ratio {ratio}: {average_seconds}")
        del pipe_tomesd
    res_dicts[size] = res_dict
    pprint(res_dict)
pprint(res_dicts)

Results (average seconds per generation, keyed by input size and merging ratio):

{64: {'0.0': 2.949374296875,
      '0.1': 2.9533927880859374,
      '0.2': 2.9572005322265627,
      '0.3': 2.9545330175781253,
      '0.4': 2.8961071630859374,
      '0.5': 2.8824244775390624},
 128: {'0.0': 3.3331537109375,
       '0.1': 3.3419722558593747,
       '0.2': 3.334708588867188,
       '0.3': 3.3441083691406255,
       '0.4': 3.3288914892578125,
       '0.5': 3.3203551171875003},
 256: {'0.0': 7.9503355078124995,
       '0.1': 7.947029687500001,
       '0.2': 7.94916759765625,
       '0.3': 7.94697734375,
       '0.4': 7.946936533203125,
       '0.5': 7.944877978515625}}
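
To put these numbers in perspective, here is a small sketch that computes the relative speed-up of ratio 0.5 over the unpatched baseline (ratio 0.0) for each input size. The timings are copied from the results above (only the two endpoints are needed); the helper name `relative_speedup_percent` is made up for this illustration and is not part of tomesd or diffusers.

```python
# Average seconds per generation, copied from the results above.
# Only the baseline (ratio 0.0) and the largest ratio (0.5) are used here.
results = {
    64: {"0.0": 2.949374296875, "0.5": 2.8824244775390624},
    128: {"0.0": 3.3331537109375, "0.5": 3.3203551171875003},
    256: {"0.0": 7.9503355078124995, "0.5": 7.944877978515625},
}


def relative_speedup_percent(baseline_seconds, patched_seconds):
    """Return how much faster the patched run is, as a percentage of the baseline."""
    return (baseline_seconds - patched_seconds) / baseline_seconds * 100.0


# Hypothetical helper usage: one percentage per input size.
speedups = {
    size: relative_speedup_percent(times["0.0"], times["0.5"])
    for size, times in results.items()
}

for size, percent in speedups.items():
    print(f"size {size}: {percent:.2f}% faster at ratio 0.5")
```

By this measure the gain at ratio 0.5 is roughly 2.3% for size 64, about 0.4% for size 128, and under 0.1% for size 256.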
