Running this notebook requires a Hugging Face account, an access token, and acceptance of the licenses of the following models before they can be used.
* https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct

We have to install the diffusers library manually from the Git repository so that HiDreamImagePipeline is available. Afterwards, the kernel must be restarted.

In [22]:
!pip install git+https://github.com/huggingface/diffusers.git

Collecting git+https://github.com/huggingface/diffusers.git
  Cloning https://github.com/huggingface/diffusers.git to /tmp/pip-req-build-xxhh0rf8
  Running command git clone --filter=blob:none --quiet https://github.com/huggingface/diffusers.git /tmp/pip-req-build-xxhh0rf8
  Resolved https://github.com/huggingface/diffusers.git to commit 0dec414d5bf2c7fe77684722b0a97324798bd7b3
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done


In [23]:
!pip install -U bitsandbytes



In [24]:
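# Restart the kernel so the freshly installed packages are picked up.
# Note: this JavaScript call relies on the classic Jupyter Notebook front end
# and may have no effect in other front ends such as JupyterLab.
from IPython.display import HTML
HTML("<script>Jupyter.notebook.kernel.restart()</script>")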
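
After the restart, it can be worth confirming that the installed diffusers build really exposes the pipeline class before the heavier cells run. The following is a minimal sketch using only the standard library; the helper name `has_attribute` is hypothetical and not part of diffusers.

```python
import importlib


def has_attribute(module_name, attribute_name):
    """Return True if the module can be imported and exposes the attribute."""
    try:
        module = importlib.import_module(module_name)
        # getattr also triggers lazily loaded attributes
        getattr(module, attribute_name)
    except (ImportError, AttributeError):
        return False
    return True
```

In this notebook the check would be `has_attribute("diffusers", "HiDreamImagePipeline")`; a result of `False` suggests that the Git install did not take effect or that the kernel was not restarted.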

In [27]:
import torch
from huggingface_hub import login
from diffusers import DiffusionPipeline, HiDreamImagePipeline, FluxPipeline, StableDiffusion3Pipeline
from transformers import PreTrainedTokenizerFast, LlamaForCausalLM, BitsAndBytesConfig

Have your HF token ready here.

In [29]:
login(add_to_git_credential=False)
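
For unattended runs, the interactive prompt can be avoided by reading the token from an environment variable. This is a sketch under assumptions: the variable name `HF_TOKEN` and both helper names are chosen for illustration, and the `login` call is only reached when `login_from_environment` is invoked.

```python
import os


def read_hf_token(env_var="HF_TOKEN"):
    """Return the token stored in the environment variable, or None if unset or empty."""
    token = os.environ.get(env_var, "").strip()
    return token or None


def login_from_environment(env_var="HF_TOKEN"):
    """Log in without the interactive widget; raises if no token is configured."""
    # Imported lazily so read_hf_token works without huggingface_hub installed.
    from huggingface_hub import login

    token = read_hf_token(env_var)
    if token is None:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    login(token=token, add_to_git_credential=False)
```

Keeping the token out of the notebook source also prevents it from being committed by accident.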


In [30]:
eval_image_count = 4

In [31]:
prompts = [
    "photo of a music band playing at a public place",
    "a photo of a music store with different music instruments hanging on the wall or standing on the ground"
]

In [36]:
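import os

def run_model(pipe, image_count, model_prompts, model_name, guidance_scale, steps, move_to_gpu=True):
    """Generate image_count images per prompt and save them under ./eval_files."""
    # One seeded generator is shared by all calls, so the whole run is
    # reproducible while each image still starts from different noise.
    generator = torch.Generator("cuda").manual_seed(0)
    # Make sure the output directory exists before the first save.
    os.makedirs("./eval_files", exist_ok=True)

    try:
        if move_to_gpu:
            pipe = pipe.to('cuda')

        for index, prompt in enumerate(model_prompts):
            for image_counter in range(image_count):
                image = pipe(
                    prompt,
                    height=1024,
                    width=1024,
                    guidance_scale=guidance_scale,
                    num_inference_steps=steps,
                    max_sequence_length=512,
                    generator=generator,
                ).images[0]
                # steps is an integer in every call below, so this replace changes nothing;
                # the dot in the file name comes from guidance_scale.
                steps_string = str(steps).replace('.', '-')
                image.save(f"./eval_files/{model_name}_prompt-{index}_guidance-{guidance_scale}_steps-{steps_string}_{image_counter}.png")
                print('Run successfully')
    finally:
        # Release GPU memory so that the next model fits.
        print('Moving pipe to cpu and empty cache')
        pipe.to('cpu')
        torch.cuda.empty_cache()
        # This only drops the local reference; the caller deletes its own.
        del pipe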
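
The file name written by `run_model` contains the guidance scale, which is a float and therefore brings a dot into the name, while the dot replacement is applied to the integer step count. A hypothetical variant that sanitizes the guidance scale instead might look as follows; `build_output_name` is an illustrative helper, and using it would change the names compared with the files this notebook actually wrote.

```python
def build_output_name(model_name, prompt_index, guidance_scale, steps, image_counter):
    """Build a file name whose only dot is the one before the extension."""
    guidance_string = str(guidance_scale).replace(".", "-")
    return (
        f"{model_name}_prompt-{prompt_index}"
        f"_guidance-{guidance_string}_steps-{steps}_{image_counter}.png"
    )


print(build_output_name("FLUX-1-dev", 0, 3.5, 50, 2))
# FLUX-1-dev_prompt-0_guidance-3-5_steps-50_2.png
```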

In [37]:
# HiDream-ai/HiDream-I1-Full

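# Note: bnb_config is defined here but not passed to any from_pretrained call below,
# so the models are loaded unquantized in bfloat16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,  # Enables 4-bit loading
    quantization_method="fp4",  # Not a recognized argument (see the "Unused kwargs" warning in the output); the quantization type is set via bnb_4bit_quant_type
    device_map="auto"  # Not a recognized argument either; device_map is a from_pretrained argument
)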

tokenizer_4 = PreTrainedTokenizerFast.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")
text_encoder_4 = LlamaForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    output_hidden_states=True,
    output_attentions=True,
    torch_dtype=torch.bfloat16,
)

hidream_full_pipe = HiDreamImagePipeline.from_pretrained(
    "HiDream-ai/HiDream-I1-Full",
    tokenizer_4=tokenizer_4,
    text_encoder_4=text_encoder_4,
    torch_dtype=torch.bfloat16,
)

hidream_full_pipe.enable_model_cpu_offload()
hidream_full_pipe.enable_vae_tiling()

run_model(hidream_full_pipe, eval_image_count, prompts, "HiDream-I1-Full", 5.0, 50, False)
del hidream_full_pipe


Unused kwargs: ['quantization_method', 'device_map']. These kwargs are not used in <class 'transformers.utils.quantization_config.BitsAndBytesConfig'>.


Loading checkpoint shards:   0%|          | 0/4 [00:00<?, ?it/s]

Loading pipeline components...:   0%|          | 0/11 [00:00<?, ?it/s]

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Loading checkpoint shards:   0%|          | 0/7 [00:00<?, ?it/s]



  0%|          | 0/50 [00:00<?, ?it/s]

Run successfully


  0%|          | 0/50 [00:00<?, ?it/s]

Run successfully


  0%|          | 0/50 [00:00<?, ?it/s]

Run successfully


  0%|          | 0/50 [00:00<?, ?it/s]

Run successfully


  0%|          | 0/50 [00:00<?, ?it/s]

Run successfully


  0%|          | 0/50 [00:00<?, ?it/s]

Run successfully


  0%|          | 0/50 [00:00<?, ?it/s]

Run successfully


  0%|          | 0/50 [00:00<?, ?it/s]

Run successfully
Moving pipe to cpu and empty cache
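
The "Unused kwargs" warning above indicates that the quantization settings were ignored. As an untested sketch, the keyword arguments that `BitsAndBytesConfig` does recognize for 4-bit loading could be collected and passed to the text encoder as shown here. The helper names are hypothetical, and whether the quantized encoder works together with `enable_model_cpu_offload()` in this pipeline has not been verified.

```python
def four_bit_config_kwargs(quant_type="fp4", compute_dtype="bfloat16"):
    """Collect keyword arguments that BitsAndBytesConfig accepts for 4-bit loading."""
    if quant_type not in ("fp4", "nf4"):
        raise ValueError("quant_type must be 'fp4' or 'nf4'")
    return {
        "load_in_4bit": True,
        "bnb_4bit_quant_type": quant_type,
        "bnb_4bit_compute_dtype": compute_dtype,
    }


def load_quantized_text_encoder(model_id):
    """Untested sketch: load the Llama text encoder with 4-bit weights."""
    # Imported lazily; needs transformers, bitsandbytes and a CUDA device.
    from transformers import BitsAndBytesConfig, LlamaForCausalLM

    bnb_config = BitsAndBytesConfig(**four_bit_config_kwargs())
    return LlamaForCausalLM.from_pretrained(
        model_id,
        quantization_config=bnb_config,  # this is what actually applies the config
        output_hidden_states=True,
        output_attentions=True,
    )
```

The key difference from the cell above is that the config object is handed to `from_pretrained` through `quantization_config`; merely constructing it has no effect on loading.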


In [38]:
# black-forest-labs/FLUX.1-dev

flux_pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16)

run_model(flux_pipe, eval_image_count, prompts, "FLUX-1-dev", 3.5, 50)
del flux_pipe

Loading pipeline components...:   0%|          | 0/7 [00:00<?, ?it/s]

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers


Loading checkpoint shards:   0%|          | 0/3 [00:00<?, ?it/s]

  0%|          | 0/50 [00:00<?, ?it/s]

Run successfully


  0%|          | 0/50 [00:00<?, ?it/s]

Run successfully


  0%|          | 0/50 [00:00<?, ?it/s]

Run successfully


  0%|          | 0/50 [00:00<?, ?it/s]

Run successfully


  0%|          | 0/50 [00:00<?, ?it/s]

Run successfully


  0%|          | 0/50 [00:00<?, ?it/s]

Run successfully


  0%|          | 0/50 [00:00<?, ?it/s]

Run successfully


  0%|          | 0/50 [00:00<?, ?it/s]

Run successfully
Moving pipe to cpu and empty cache


In [39]:
# stabilityai/stable-diffusion-3.5-large

sd35_pipe = StableDiffusion3Pipeline.from_pretrained("stabilityai/stable-diffusion-3.5-large", torch_dtype=torch.bfloat16)

run_model(sd35_pipe, eval_image_count, prompts, "sd35", 3.5, 28)
del sd35_pipe

Loading pipeline components...:   0%|          | 0/9 [00:00<?, ?it/s]

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

  0%|          | 0/28 [00:00<?, ?it/s]

Run successfully


  0%|          | 0/28 [00:00<?, ?it/s]

Run successfully


  0%|          | 0/28 [00:00<?, ?it/s]

Run successfully


  0%|          | 0/28 [00:00<?, ?it/s]

Run successfully


  0%|          | 0/28 [00:00<?, ?it/s]

Run successfully


  0%|          | 0/28 [00:00<?, ?it/s]

Run successfully


  0%|          | 0/28 [00:00<?, ?it/s]

Run successfully


  0%|          | 0/28 [00:00<?, ?it/s]

Run successfully
Moving pipe to cpu and empty cache


In [40]:
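class CombinedSDXLPipe:
    """Wraps the SDXL base and refiner so that run_model can call them like a single pipeline."""

    def __init__(self, high_noise_frac):
        # Fraction of the denoising schedule handled by the base model;
        # the refiner takes over for the remainder.
        self.high_noise_frac = high_noise_frac
        self.base = DiffusionPipeline.from_pretrained(
            "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
        )

        # The refiner shares the second text encoder and the VAE with the base to save memory.
        self.refiner = DiffusionPipeline.from_pretrained(
            "stabilityai/stable-diffusion-xl-refiner-1.0",
            text_encoder_2=self.base.text_encoder_2,
            vae=self.base.vae,
            torch_dtype=torch.float16,
            use_safetensors=True,
            variant="fp16",
        )

    def to(self, device):
        self.base.to(device)
        self.refiner.to(device)
        return self

    # max_sequence_length is accepted only for signature compatibility with run_model and is ignored.
    def __call__(self, prompt, width, height, guidance_scale, num_inference_steps, max_sequence_length, generator):
        # The base stops early and returns latents instead of a decoded image.
        image = self.base(
            prompt=prompt,
            width=width,
            height=height,
            guidance_scale=guidance_scale,
            num_inference_steps=num_inference_steps,
            denoising_end=self.high_noise_frac,
            output_type="latent",
            generator=generator
        ).images
        # The refiner continues from those latents; its output exposes .images as run_model expects.
        return self.refiner(
            prompt=prompt,
            width=width,
            height=height,
            guidance_scale=guidance_scale,
            num_inference_steps=num_inference_steps,
            denoising_start=self.high_noise_frac,
            image=image,
            generator=generator
        )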
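
The way `high_noise_frac` divides the work between base and refiner can be approximated with simple arithmetic. This is a rough sketch, not the scheduler-based logic diffusers uses internally, and `split_steps` is a hypothetical helper. For 40 steps and a fraction of 0.8 it agrees with the 32-step and 8-step progress bars in this notebook's SDXL output.

```python
def split_steps(num_inference_steps, high_noise_frac):
    """Approximate how many steps the base and the refiner each run."""
    if not 0.0 < high_noise_frac < 1.0:
        raise ValueError("high_noise_frac must lie strictly between 0 and 1")
    base_steps = int(round(num_inference_steps * high_noise_frac))
    refiner_steps = num_inference_steps - base_steps
    return base_steps, refiner_steps


print(split_steps(40, 0.8))  # (32, 8)
```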
        

In [41]:
# stabilityai/stable-diffusion-xl-base-1.0

sdxl_pipe = CombinedSDXLPipe(0.8)
run_model(sdxl_pipe, eval_image_count, prompts, "sdxlrefiner", 7.5, 40)
del sdxl_pipe

Loading pipeline components...:   0%|          | 0/7 [00:00<?, ?it/s]

Loading pipeline components...:   0%|          | 0/5 [00:00<?, ?it/s]

  0%|          | 0/32 [00:00<?, ?it/s]

  0%|          | 0/8 [00:00<?, ?it/s]

Run successfully


  0%|          | 0/32 [00:00<?, ?it/s]

  0%|          | 0/8 [00:00<?, ?it/s]

Run successfully


  0%|          | 0/32 [00:00<?, ?it/s]

  0%|          | 0/8 [00:00<?, ?it/s]

Run successfully


  0%|          | 0/32 [00:00<?, ?it/s]

  0%|          | 0/8 [00:00<?, ?it/s]

Run successfully


  0%|          | 0/32 [00:00<?, ?it/s]

  0%|          | 0/8 [00:00<?, ?it/s]

Run successfully


  0%|          | 0/32 [00:00<?, ?it/s]

  0%|          | 0/8 [00:00<?, ?it/s]

Run successfully


  0%|          | 0/32 [00:00<?, ?it/s]

  0%|          | 0/8 [00:00<?, ?it/s]

Run successfully


  0%|          | 0/32 [00:00<?, ?it/s]

  0%|          | 0/8 [00:00<?, ?it/s]

Pipelines loaded with `dtype=torch.float16` cannot run with `cpu` device. It is not recommended to move them to `cpu` as running them will fail. Please make sure to use an accelerator to run the pipeline in inference, due to the lack of support for`float16` operations on this device in PyTorch. Please, remove the `torch_dtype=torch.float16` argument, or use another device for inference.


Run successfully
Moving pipe to cpu and empty cache


Pipelines loaded with `dtype=torch.float16` cannot run with `cpu` device. It is not recommended to move them to `cpu` as running them will fail. Please make sure to use an accelerator to run the pipeline in inference, du