<a href="https://colab.research.google.com/github/aayush1693/Flux-Text-to-Image/blob/main/Flux_Text_to_Image.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Text-to-Image using Flux**
# Introduction
In this notebook, we will use the Flux model from Black Forest Labs to generate images from text prompts. The Flux model is available on Hugging Face and can be accessed via the diffusers library.

Hugging Face models: https://huggingface.co/black-forest-labs

Check out the site: https://blackforestlabs.ai/

Check out GitHub: https://github.com/black-forest-labs/flux

# Prerequisites
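Before we start, ensure you have the necessary libraries installed. We will use diffusers for the pipeline and torch for tensor operations. The Flux text encoders and tokenizers also rely on transformers, accelerate, and sentencepiece, which Colab usually preinstalls but other environments may not.

In [6]:
!pip install -U diffusers transformers accelerate sentencepiece
!pip install torch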
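
To confirm that the installation worked before loading a multi-gigabyte model, the installed versions can be listed. This is a minimal sketch using only the standard library; the package list in `REQUIRED` and the helper name `installed_versions` are assumptions made for illustration, not an official requirement set.

```python
from importlib import metadata

# Packages this notebook is assumed to rely on (illustrative list, not an official one).
REQUIRED = ["diffusers", "torch", "transformers", "accelerate", "sentencepiece"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Print one line per package so missing dependencies are easy to spot.
for name, version in installed_versions(REQUIRED).items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any package reported as `NOT INSTALLED` can be added to the `pip install` cell above.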



# Import Libraries
First, we need to import the required libraries.

In [7]:
import torch
from diffusers import FluxPipeline, DPMSolverMultistepScheduler

# Load the Model
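We will load the Flux model using the FluxPipeline class from the diffusers library. This notebook uses the `black-forest-labs/FLUX.1-dev` checkpoint, which is gated on Hugging Face: accept its license on the model page and log in before downloading.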

In [8]:
import torch
from diffusers import FluxPipeline
from huggingface_hub import notebook_login

# Log in to your Hugging Face account
notebook_login()


In [4]:
# Check and use GPU if available
if torch.cuda.is_available():
    device = "cuda"
else:
    device = "cpu"
    print("Warning: GPU not detected. Using CPU instead. Performance may be slower.")

In [5]:
# Load the model in bfloat16, which halves weight memory relative to float32.
# No scheduler is passed, so the pipeline keeps the flow-matching scheduler
# shipped with the checkpoint (swapping in DPMSolverMultistepScheduler is not
# expected to work with Flux's timestep setup).
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    torch_dtype=torch.bfloat16,
    revision="main",  # Track the 'main' branch of the model repository
).to(device)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


Loading pipeline components...:   0%|          | 0/7 [00:00<?, ?it/s]

You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

OutOfMemoryError: CUDA out of memory. Tried to allocate 90.00 MiB. GPU 0 has a total capacity of 14.75 GiB of which 23.06 MiB is free. Process 138860 has 14.72 GiB memory in use. Of the allocated memory 14.61 GiB is allocated by PyTorch, and 17.07 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
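
The out-of-memory error above is consistent with a rough estimate of the weight sizes. The sketch below is back-of-the-envelope arithmetic, assuming the roughly 12 billion transformer parameters that Black Forest Labs states for FLUX.1 and ignoring the text encoders, the VAE, and activations, so the real footprint is higher. The helper name `weight_footprint_gib` is hypothetical.

```python
GIB = 1024 ** 3  # bytes per GiB


def weight_footprint_gib(num_params, bytes_per_param):
    """Approximate memory needed to hold the weights alone, in GiB."""
    return num_params * bytes_per_param / GIB


FLUX_TRANSFORMER_PARAMS = 12e9  # assumption: ~12B parameters in the Flux transformer
GPU_CAPACITY_GIB = 14.75        # total capacity reported in the error message above

for dtype_name, nbytes in [("float32", 4), ("bfloat16", 2)]:
    need = weight_footprint_gib(FLUX_TRANSFORMER_PARAMS, nbytes)
    verdict = "fits" if need <= GPU_CAPACITY_GIB else "does not fit"
    print(f"{dtype_name}: {need:.1f} GiB of weights, {verdict} in {GPU_CAPACITY_GIB} GiB")
```

Under these assumptions even the bfloat16 weights exceed the GPU's capacity. `diffusers` pipelines provide `enable_model_cpu_offload()` and `enable_sequential_cpu_offload()`, which keep weights in CPU RAM and move them to the GPU only when needed; they are called on the pipeline instead of `.to(device)`. Whether they are sufficient depends on how much system RAM the runtime has.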

# Define the Prompt
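Next, we define the text prompt that we want to convert into an image.
We also seed a random number generator so that the same prompt reproduces the same image. The generator is created on the device selected earlier, so the cell also runs when no GPU is available.

In [6]:
prompt = "A cat holding a sign that says Hello India"

# Seed the generator on the selected device (falls back to CPU without a GPU)
generator = torch.Generator(device).manual_seed(0)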

# Generate the Image
Now, we generate the image using the pipe object. We can specify various parameters such as image dimensions, guidance scale, and the number of inference steps.

In [None]:
# Generate the image with reduced dimensions, steps, and lower guidance scale
image = pipe(
    prompt,
    height=512,
    width=512,
    guidance_scale=2,
    num_inference_steps=25,
    max_sequence_length=512,
    generator=generator
).images[0]
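
When experimenting with other image sizes, it can help to normalize the dimensions before calling the pipeline. The sketch below assumes that Flux expects height and width to be divisible by 16 (an 8x VAE downsampling followed by 2x2 latent patches); the helper names `snap_to_multiple` and `build_generation_kwargs` are hypothetical and not part of `diffusers`.

```python
def snap_to_multiple(value, multiple=16):
    """Round value down to the nearest positive multiple of `multiple`."""
    if value < multiple:
        raise ValueError(f"value must be at least {multiple}, got {value}")
    return (value // multiple) * multiple


def build_generation_kwargs(height, width, guidance_scale=2,
                            num_inference_steps=25, max_sequence_length=512):
    """Collect the generation parameters used above, with snapped dimensions."""
    return {
        "height": snap_to_multiple(height),
        "width": snap_to_multiple(width),
        "guidance_scale": guidance_scale,
        "num_inference_steps": num_inference_steps,
        "max_sequence_length": max_sequence_length,
    }


kwargs = build_generation_kwargs(height=500, width=512)
print(kwargs)

# With a loaded pipeline, the dictionary would be passed as keyword arguments:
# image = pipe(prompt, generator=generator, **kwargs).images[0]
```

Keeping the parameters in one dictionary also makes it easy to log the exact settings alongside each generated image.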

# Display the Image

In [None]:
from IPython.display import display

# Embed the image
display(image)

# Save the Image
Finally, we save the generated image to a file.

In [None]:
image.save("flux-dev.png")
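
Saving every result to the same file name overwrites earlier images. One possible approach, sketched here with a hypothetical helper `output_path` and an assumed `outputs` directory, is to derive the file name from the prompt and the seed:

```python
import re
from pathlib import Path


def output_path(prompt, seed, directory="outputs", max_len=60):
    """Build a file path such as outputs/<prompt-slug>-seed<seed>.png."""
    # Lowercase the prompt and collapse every run of non-alphanumerics into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", prompt.lower()).strip("-")
    # Truncate long prompts and avoid ending the slug on a hyphen.
    slug = slug[:max_len].rstrip("-")
    return Path(directory) / f"{slug}-seed{seed}.png"


path = output_path("A cat holding a sign that says Hello India", seed=0)
print(path)

# With a generated image, the file would be written like this:
# path.parent.mkdir(parents=True, exist_ok=True)
# image.save(path)
```

Including the seed in the name keeps the file traceable to the generator settings that produced it.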

# Conclusion

This notebook demonstrated the process of generating images from text prompts using the Flux model from Black Forest Labs. We explored the necessary steps, including installing required libraries, loading the model, defining prompts, and generating images.

**Key Takeaways:**

* **Flux Model:** The notebook walks through using the Flux model, a text-to-image model, to create images from descriptive text prompts.
* **Hugging Face Integration:** Hugging Face's `diffusers` library simplifies model access and usage.
* **Customization:** Parameters such as image dimensions, guidance scale, and the number of inference steps control the image generation process.
* **Hardware Acceleration:** Running on a GPU is expected to be much faster than running on a CPU, provided the GPU has enough memory to hold the model. The load step above ran out of memory on a GPU with 14.75 GiB of capacity.
* **Image Embedding:** Generated images can be embedded directly in the Colab notebook with `display` for immediate visualization.

**Potential Applications:**

This technology opens up numerous possibilities, including:

* **Creative Content Generation:** Artists and designers can use text-to-image AI to quickly generate ideas and visual concepts.
* **Content Marketing:** Marketers can create unique visuals for their campaigns based on textual descriptions.
* **Prototyping and Design:** Product designers can visualize their ideas by generating images from product descriptions.
* **Education and Research:** Researchers can use text-to-image models to explore the relationship between language and visual representation.

**Challenges and Future Directions:**

While impressive, text-to-image technology is still evolving. Some challenges include:

* **Fine-grained Control:** Achieving precise control over the details of the generated image remains an area for improvement.
* **Bias and Ethical Considerations:** Addressing potential biases in the training data and ensuring responsible use of the technology are critical.
* **Computational Resources:** Generating high-resolution images requires significant computational power.

Despite these challenges, the rapid advancements in this field hold immense promise for future applications.

**Acknowledgments:**

We extend our gratitude to the developers of the Flux model at Black Forest Labs for making this powerful technology accessible. Their contributions to open-source AI research are invaluable. We also thank the Hugging Face team for providing the `diffusers` library, which simplified the integration and usage of the Flux model within this notebook.
