Running this notebook requires a Hugging Face account, an access token, and acceptance of the licenses of the following models before they can be used:
* https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct

The diffusers library has to be installed manually from its Git repository so that HiDreamImagePipeline is available. The kernel must be restarted afterwards.

In [1]:
!pip install git+https://github.com/huggingface/diffusers.git

Collecting git+https://github.com/huggingface/diffusers.git
  Cloning https://github.com/huggingface/diffusers.git to /tmp/pip-req-build-j_d8zfrq
  Running command git clone --filter=blob:none --quiet https://github.com/huggingface/diffusers.git /tmp/pip-req-build-j_d8zfrq
  Resolved https://github.com/huggingface/diffusers.git to commit 6ab62c743183fff206239af931921908ae3ce133
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: diffusers
  Building wheel for diffusers (pyproject.toml) ... done
  Created wheel for diffusers: filename=diffusers-0.34.0.dev0-py3-none-any.whl size=3601997 sha256=59adc167127f1bbf25e959bf6a9c01f32cc79c1b755a5e04fa8d282998820779
  Stored in directory: /tmp/pip-ephem-wheel-cache-yjm6z68h/wheels/d2/5c/5f/16639722ea17ecb73ab461b81718584bac08af2801619786b9
Successfully built diffusers
Installing collected pack

In [2]:
from IPython.core.display import HTML
HTML("<script>Jupyter.notebook.kernel.restart()</script>")
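After the restart it can be useful to confirm that the freshly installed build really exposes the pipeline class before the heavier imports run. The following is a minimal sketch; the helper name `has_symbol` is hypothetical and not part of diffusers.

```python
import importlib


def has_symbol(module_name: str, symbol: str) -> bool:
    """Return True if `module_name` can be imported and exposes `symbol`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The package is not installed in this environment.
        return False
    return hasattr(module, symbol)


# Expected to print True once diffusers was installed from Git
# and the kernel has been restarted.
print(has_symbol("diffusers", "HiDreamImagePipeline"))
```

If this prints False, the kernel is most likely still using an older diffusers release or has not been restarted yet.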

In [3]:
import torch
from huggingface_hub import login
from diffusers import DiffusionPipeline, HiDreamImagePipeline, FluxPipeline, StableDiffusion3Pipeline
from transformers import PreTrainedTokenizerFast, LlamaForCausalLM, BitsAndBytesConfig

Have your HF token ready here.

In [4]:
login()
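For unattended runs, the interactive prompt can be skipped by passing the token to `login` directly. This is a sketch only: it assumes the token is stored in an environment variable named `HF_TOKEN`, and the helpers `read_hf_token` and `login_noninteractive` are hypothetical names introduced for illustration.

```python
import os


def read_hf_token(env=None, var_name="HF_TOKEN"):
    """Return the token stored in `var_name`, or None if it is unset or blank."""
    env = os.environ if env is None else env
    token = env.get(var_name, "").strip()
    return token or None


def login_noninteractive():
    """Log in with the token from the environment, else show the usual prompt."""
    # Imported lazily so that read_hf_token() works without huggingface_hub.
    from huggingface_hub import login

    token = read_hf_token()
    if token is not None:
        login(token=token)
    else:
        login()
```

Calling `login_noninteractive()` in place of `login()` keeps the token out of the notebook itself.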

In [5]:
prompts = [
    "a hyperrealistic photo of a group of musicians playing various instruments in a band, set in a random location",
    "a hyperrealistic photo of a common living room with a variety of music instrument scattered around the room"
]

In [6]:
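import os

def run_model(pipe, model_prompts, model_name, guidance_scale, steps, move_to_gpu=True):
    """Generate one image per prompt and guidance scale and save it to ./eval_files."""
    # The sweep below also runs at guidance_scale - 2, which has to stay positive.
    if guidance_scale <= 2.0:
        raise ValueError('guidance_scale must be greater than 2.0')

    # image.save() does not create missing directories.
    os.makedirs('./eval_files', exist_ok=True)

    # A single seeded generator is shared by all runs of this model.
    generator = torch.Generator("cuda").manual_seed(0)

    try:
        # Pipelines that use CPU offloading are called with move_to_gpu=False.
        if move_to_gpu:
            pipe = pipe.to('cuda')
        g_scales = [guidance_scale-2, guidance_scale, guidance_scale+2]

        for index, prompt in enumerate(model_prompts):
            for gs in g_scales:
                image = pipe(
                    prompt,
                    height=1024,
                    width=1024,
                    guidance_scale=gs,
                    num_inference_steps=steps,
                    max_sequence_length=512,
                    generator=generator,
                ).images[0]
                steps_string = str(steps).replace('.', '-')
                image.save(f"./eval_files/{model_name}_prompt-{index}_guidance-{gs}_steps-{steps_string}.png")
                print('Run successfully')
    finally:
        # Release GPU memory for the next model, even if generation failed.
        print('Moving pipe to cpu and empty cache')
        pipe.to('cpu')
        torch.cuda.empty_cache()
        del pipe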
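The guidance sweep and the file naming used by `run_model` can be reproduced without loading any model, for example to list the files a run is expected to produce. This is a sketch that mirrors the logic above; the helper names `guidance_sweep` and `output_path` are hypothetical.

```python
def guidance_sweep(center, delta=2.0):
    """Return the three guidance scales evaluated around `center`."""
    return [center - delta, center, center + delta]


def output_path(model_name, prompt_index, guidance, steps, out_dir="./eval_files"):
    """Build the file name in the same pattern that run_model uses."""
    steps_string = str(steps).replace(".", "-")
    return (
        f"{out_dir}/{model_name}_prompt-{prompt_index}"
        f"_guidance-{guidance}_steps-{steps_string}.png"
    )


# List the files expected for the first prompt of a FLUX run.
for gs in guidance_sweep(3.5):
    print(output_path("FLUX-1-dev", 0, gs, 50))
```

With two prompts and three guidance scales, each model yields six images.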

In [7]:
# HiDream-ai/HiDream-I1-Full

tokenizer_4 = PreTrainedTokenizerFast.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")
text_encoder_4 = LlamaForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    output_hidden_states=True,
    output_attentions=True,
    torch_dtype=torch.bfloat16,
)

hidream_full_pipe = HiDreamImagePipeline.from_pretrained(
    "HiDream-ai/HiDream-I1-Full",
    tokenizer_4=tokenizer_4,
    text_encoder_4=text_encoder_4,
    torch_dtype=torch.bfloat16,
)

hidream_full_pipe.enable_model_cpu_offload()
hidream_full_pipe.enable_vae_tiling()

run_model(hidream_full_pipe, prompts, "HiDream-I1-Full", 5.0, 50, False)
del hidream_full_pipe




Run successfully
Run successfully
Run successfully
Run successfully
Run successfully
Run successfully
Moving pipe to cpu and empty cache


In [8]:
# black-forest-labs/FLUX.1-dev

flux_pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16)

run_model(flux_pipe, prompts, "FLUX-1-dev", 3.5, 50)
del flux_pipe

You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers

Run successfully
Run successfully
Run successfully
Run successfully
Run successfully
Run successfully
Moving pipe to cpu and empty cache


In [9]:
# stabilityai/stable-diffusion-3.5-large

sd35_pipe = StableDiffusion3Pipeline.from_pretrained("stabilityai/stable-diffusion-3.5-large", torch_dtype=torch.bfloat16)

run_model(sd35_pipe, prompts, "sd35", 3.5, 28)
del sd35_pipe

Run successfully
Run successfully
Run successfully
Run successfully
Run successfully
Run successfully
Moving pipe to cpu and empty cache


In [10]:
class CombinedSDXLPipe:
    """Wrap the SDXL base and refiner so that run_model can call them like one pipeline."""

    def __init__(self, high_noise_frac):
        # Fraction of the denoising schedule handled by the base model.
        self.high_noise_frac = high_noise_frac
        self.base = DiffusionPipeline.from_pretrained(
            "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
        )

        # The refiner reuses the second text encoder and the VAE of the base model.
        self.refiner = DiffusionPipeline.from_pretrained(
            "stabilityai/stable-diffusion-xl-refiner-1.0",
            text_encoder_2=self.base.text_encoder_2,
            vae=self.base.vae,
            torch_dtype=torch.float16,
            use_safetensors=True,
            variant="fp16",
        )

    def to(self, device):
        self.base.to(device)
        self.refiner.to(device)
        return self

    # max_sequence_length is accepted only to match the call in run_model; it is not forwarded.
    def __call__(self, prompt, width, height, guidance_scale, num_inference_steps, max_sequence_length, generator):
        # The base model stops at high_noise_frac and hands over latents.
        image = self.base(
            prompt=prompt,
            width=width,
            height=height,
            guidance_scale=guidance_scale,
            num_inference_steps=num_inference_steps,
            denoising_end=self.high_noise_frac,
            output_type="latent",
            generator=generator
        ).images
        # The refiner continues from the same point and returns the pipeline output.
        return self.refiner(
            prompt=prompt,
            width=width,
            height=height,
            guidance_scale=guidance_scale,
            num_inference_steps=num_inference_steps,
            denoising_start=self.high_noise_frac,
            image=image,
            generator=generator
        )
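The effect of `high_noise_frac` on the step budget can be estimated with simple arithmetic. The sketch below is an approximation, not the calculation diffusers performs internally: the library derives the cutoff from the scheduler's timesteps, so the counts may differ by a step for other settings. The helper name `split_steps` is hypothetical.

```python
def split_steps(num_inference_steps, high_noise_frac):
    """Estimate how many steps the base model and the refiner each run."""
    base_steps = round(num_inference_steps * high_noise_frac)
    refiner_steps = num_inference_steps - base_steps
    return base_steps, refiner_steps


# 40 steps with high_noise_frac = 0.8: the base runs 32 steps, the refiner 8.
print(split_steps(40, 0.8))  # (32, 8)
```

A lower `high_noise_frac` shifts more of the schedule to the refiner.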
        

In [11]:
# stabilityai/stable-diffusion-xl-base-1.0

sdxl_pipe = CombinedSDXLPipe(0.8)
run_model(sdxl_pipe, prompts, "sdxlrefiner", 7.5, 40)
del sdxl_pipe

Run successfully
Run successfully
Run successfully
Run successfully
Run successfully

Pipelines loaded with `dtype=torch.float16` cannot run with `cpu` device. It is not recommended to move them to `cpu` as running them will fail. Please make sure to use an accelerator to run the pipeline in inference, due to the lack of support for`float16` operations on this device in PyTorch. Please, remove the `torch_dtype=torch.float16` argument, or use another device for inference.

Run successfully
Moving pipe to cpu and empty cache

