<a href="https://colab.research.google.com/github/pramit46/LLMTry/blob/main/ImageGeneration/Image_Generation_Using_StableDiffusion_%26_Other_Models.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Installing Roboflow and other dependencies**

We'll use [Roboflow](https://roboflow.com/?ref=studiolab) to upload our generated images for annotation (optionally with the [Roboflow Annotate tool](https://roboflow.com/annotate)).

The `roboflow` [pip package](https://blog.roboflow.com/pip-install-roboflow/) lets us upload our batch of generated images.

In [1]:
%%sh
pip install -q --upgrade pip
pip install -q --upgrade diffusers transformers scipy ftfy huggingface_hub roboflow


ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
gensim 4.3.3 requires scipy<1.14.0,>=1.7.0, but you have scipy 1.14.1 which is incompatible.
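
Because the resolver warning above reports a version conflict, it can help to check which versions actually ended up installed. The following is a minimal sketch using only the standard library; `installed_versions` is a hypothetical helper name, and the package list simply mirrors the `pip install` line above.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    wanted = ["diffusers", "transformers", "scipy", "ftfy", "huggingface_hub", "roboflow"]
    for name, version in installed_versions(wanted).items():
        print(f"{name}: {version or 'not installed'}")
```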


# **Authenticating with the Hugging Face Hub**

We'll be using [Hugging Face](https://huggingface.co/) to pull down our Stable Diffusion model, so we must authenticate using our [Hugging Face Access Token](https://huggingface.co/docs/hub/security-tokens).

When we run the cell below, we enter our token, click login, and are authenticated.

**Note: You don't get a confirmation that the token was accepted once you click login. You can, however, confirm you are authenticated by looking at the SageMaker Studio Lab logs in the terminal at the bottom of your screen.**

In [2]:
import os
from google.colab import userdata

# Expose the Colab secret named "huggingface" to huggingface_hub via HF_TOKEN.
os.environ["HF_TOKEN"] = userdata.get("huggingface")
# Alternative: !huggingface-cli login

In [3]:
from huggingface_hub import notebook_login

# Required to get access to stable diffusion model
notebook_login()

Note: Environment variable`HF_TOKEN` is set and is the current active token independently from the token you've just configured.
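
As the note above indicates, a token in the `HF_TOKEN` environment variable takes effect regardless of the login widget. A small check like the following can confirm that a non-empty token is present before any model download is attempted. This is a sketch only; `resolve_hf_token` is a hypothetical helper, not part of `huggingface_hub`.

```python
import os


def resolve_hf_token(env=None):
    """Return the stripped HF_TOKEN value from `env`, or None if it is missing or empty."""
    env = os.environ if env is None else env
    token = (env.get("HF_TOKEN") or "").strip()
    return token or None


if __name__ == "__main__":
    if resolve_hf_token() is None:
        print("HF_TOKEN is not set; run the login cell or set the variable first.")
    else:
        print("HF_TOKEN is set.")
```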


# **Accepting License Terms**

Before we load this model from the Hugging Face Hub, we must accept the license of the runwayml/stable-diffusion-v1-5 project. You can accept the license by clicking the "Agree and access repository" button on the [model page](https://huggingface.co/runwayml/stable-diffusion-v1-5). Note that the pipeline cell in the next section loads the weights from the `sd-legacy/stable-diffusion-v1-5` repository.

# **Using the Hugging Face StableDiffusionPipeline Class**

Here we create our [Hugging Face Stable Diffusion](https://huggingface.co/docs/diffusers/api/pipelines/stable_diffusion) pipeline and move it to the GPU with CUDA. Hugging Face pipelines are an easy way to use Hugging Face models for [inference](https://huggingface.co/docs/transformers/main_classes/pipelines).

In [None]:
import torch
from diffusers import StableDiffusionPipeline

model_id = "sd-legacy/stable-diffusion-v1-5"

pipeline = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)

#pipeline = StableDiffusionPipeline.from_pretrained(
#    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16, revision="fp16"
#)

pipeline = pipeline.to("cuda")
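
The cell above assumes a CUDA GPU is available. A fallback for machines without one could look like the sketch below. `pick_device_and_dtype` is a hypothetical helper, and pairing the CPU with `float32` is an assumption made here because half precision is generally aimed at GPUs; generation on a CPU will be very slow either way.

```python
def pick_device_and_dtype(cuda_available):
    """Return (device, dtype_name): half precision on CUDA, full precision on CPU."""
    if cuda_available:
        return "cuda", "float16"
    return "cpu", "float32"


# Wiring it into the cell above (assumes torch and diffusers are installed):
#   device, dtype_name = pick_device_and_dtype(torch.cuda.is_available())
#   pipeline = StableDiffusionPipeline.from_pretrained(
#       model_id, torch_dtype=getattr(torch, dtype_name)
#   )
#   pipeline = pipeline.to(device)
```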

# **Creating our Generate Images Function**

After we have created our pipeline, we will create our function to generate images. Here we define some parameters:

* prompt = The prompt used to generate your images
* num_images_to_generate = Total number of images to generate
* num_images_per_prompt = The number of images to generate in one iteration
* guidance_scale = The guidance scale defines how much freedom you want to give the model. Higher guidance scale encourages the model to generate images that are closely linked to the text prompt, usually at the expense of lower image quality. [Guidance scale as defined in Classifier-Free Diffusion Guidance](https://arxiv.org/abs/2207.12598)
* output_dir = This is the location you want to save the images to (The location will be created when creating the images)
* display_images = Defines if you want to display the images inline after creation
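
To make the `guidance_scale` bullet concrete, classifier-free guidance combines an unconditional noise prediction with a prompt-conditioned one. The toy sketch below uses plain Python lists in place of the model's tensors; `apply_guidance` is a hypothetical name chosen for illustration, not a diffusers function.

```python
def apply_guidance(uncond, cond, guidance_scale):
    """Combine predictions as uncond + scale * (cond - uncond), element by element."""
    return [u + guidance_scale * (c - u) for u, c in zip(uncond, cond)]


uncond = [0.0, 1.0]  # toy prediction without the prompt
cond = [1.0, 3.0]    # toy prediction with the prompt

print(apply_guidance(uncond, cond, 1.0))  # [1.0, 3.0]: exactly the conditioned prediction
print(apply_guidance(uncond, cond, 2.0))  # [2.0, 5.0]: pushed further toward the prompt
```

A scale of 1 reproduces the conditioned prediction, and larger values extrapolate past it, which is why high values follow the prompt more closely.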

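You can read more about the parameters associated with the Stable Diffusion pipeline [here](https://huggingface.co/docs/diffusers/api/pipelines/stable_diffusion).

The function runs `num_images_to_generate // num_images_per_prompt` iterations, each producing `num_images_per_prompt` images, and saves every image as it is created. If the total is not a multiple of the batch size, the remainder is not generated.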

In [None]:
import os

from IPython.display import Image, display


def generate_images(
    prompt,
    num_images_to_generate,
    num_images_per_prompt=4,
    guidance_scale=8,
    output_dir="generated_images",
    display_images=False,
):
    """Generate images for `prompt` in batches and save them as PNG files.

    Only full batches are run, so a remainder of
    num_images_to_generate % num_images_per_prompt images is not generated.
    """
    num_iterations = num_images_to_generate // num_images_per_prompt
    os.makedirs(output_dir, exist_ok=True)

    for i in range(num_iterations):
        # One pipeline call produces num_images_per_prompt images.
        images = pipeline(
            prompt, num_images_per_prompt=num_images_per_prompt, guidance_scale=guidance_scale
        )
        for idx, image in enumerate(images.images):
            # Number the files consecutively across batches.
            image_name = f"{output_dir}/image_{(i*num_images_per_prompt)+idx}.png"
            image.save(image_name)
            if display_images:
                display(Image(filename=image_name, width=128, height=128))
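
The batching arithmetic in `generate_images` can be checked without a GPU. The sketch below reproduces only the index calculation; `plan_image_indices` is a hypothetical helper introduced for illustration.

```python
def plan_image_indices(num_images_to_generate, num_images_per_prompt=4):
    """Return one list of file indices per batch, mirroring the loop in generate_images."""
    num_iterations = num_images_to_generate // num_images_per_prompt
    return [
        [i * num_images_per_prompt + idx for idx in range(num_images_per_prompt)]
        for i in range(num_iterations)
    ]


print(plan_image_indices(12, 4))
# [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
print(plan_image_indices(10, 4))
# [[0, 1, 2, 3], [4, 5, 6, 7]]  (the last 2 requested images are skipped)
```

Choosing a total that is a multiple of `num_images_per_prompt`, as the call with 12 images does, avoids the skipped remainder.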

In [4]:
# Generating 1000 images takes 2-3 hours on a SageMaker Studio Lab GPU instance.
# You can adjust the total number of images in the call below.
prompt = "a lion in a jungle killing the other animals."

In [None]:
# generate_images returns nothing; it saves (and optionally displays) the images.
generate_images(prompt, 12, guidance_scale=4, display_images=True)

In [None]:
image = pipeline(prompt).images[0]

image.save(prompt.replace(" ","_")+".png")
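
Building the file name directly from the prompt keeps punctuation in it, so the prompt above produces a name ending in `..png`. A more conservative conversion might look like this sketch; `prompt_to_filename` and its `max_length` default are assumptions made for illustration.

```python
import re


def prompt_to_filename(prompt, extension=".png", max_length=80):
    """Turn a free-text prompt into a conservative file name."""
    # Collapse every run of characters other than letters and digits into one underscore.
    slug = re.sub(r"[^A-Za-z0-9]+", "_", prompt).strip("_")
    # Truncate long prompts and fall back to a fixed name if nothing is left.
    slug = slug[:max_length].rstrip("_") or "image"
    return slug + extension


print(prompt_to_filename("a lion in a jungle killing the other animals."))
# a_lion_in_a_jungle_killing_the_other_animals.png
```

It could then be used as `image.save(prompt_to_filename(prompt))`.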

In [None]:
import torch
from diffusers import StableDiffusion3Pipeline

model_id = "stabilityai/stable-diffusion-3-medium-diffusers"
pipeline = StableDiffusion3Pipeline.from_pretrained(model_id, torch_dtype=torch.float16)

pipeline = pipeline.to("cuda")

In [None]:
image = pipeline(
    prompt,
    negative_prompt="",
    num_inference_steps=28,
    guidance_scale=7.0,
).images[0]

image.save(prompt.replace(" ","_")+".png")

In [7]:
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16)
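# pipe.enable_model_cpu_offload()  # Saves VRAM by offloading parts of the model to the CPU; skip it only if your GPU has enough memory.
# Note: with the line above commented out and no pipe.to("cuda") call, the pipeline stays on the CPU.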


In [None]:
prompt = "A cat holding a sign that says hello world"
image = pipe(
    prompt,
    height=512,
    width=512,
    guidance_scale=3.5,
    num_inference_steps=25,
    max_sequence_length=128,
    generator=torch.Generator("cpu").manual_seed(0)
).images[0]
image.save("flux-dev.png")
