# Setup

In [1]:
!pip install -qU gradio diffusers transformers accelerate

This command uses **pip**, Python's package installer, to install or upgrade four specific Python libraries from within a notebook environment like Jupyter or Google Colab.

---

## Command Breakdown

* **`!`**: This prefix allows you to run a shell command directly in a code cell.
* **`pip install`**: This is the core command to install Python packages.
* **`-q`**: This flag stands for **quiet**. It reduces the amount of text output during installation, showing only essential information like errors.
* **`-U`**: This flag stands for **upgrade**. It ensures that if the packages are already installed, they are upgraded to the latest available versions.

---

## Libraries Being Installed

The command installs the following four libraries, which are commonly used for machine learning and AI applications, particularly with models from Hugging Face:

1.  **`gradio`**: A library for quickly creating simple web-based user interfaces (UIs) for your machine learning models, making them easy to test and share.
2.  **`diffusers`**: Provides tools and pre-trained models for creating images and audio from text descriptions, based on a technique called diffusion.
3.  **`transformers`**: A foundational library that gives access to a vast collection of pre-trained models for tasks like text summarization, translation, and question answering.
4.  **`accelerate`**: A utility that simplifies running PyTorch code on different hardware setups, such as multiple GPUs or TPUs, without needing to write complex code.
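
Because `-U` installs whatever is newest at the time the cell runs, it can help to record which versions were actually picked up. The sketch below uses only the standard library; the helper name `installed_versions` is illustrative and not part of the notebook.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# The four packages named in the install command.
print(installed_versions(["gradio", "diffusers", "transformers", "accelerate"]))
```

A value of `None` for any entry suggests the install cell did not complete in the current environment.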

In [2]:
# Standard library imports
import itertools
import os
from random import sample

# Third-party imports
import gradio as gr
import torch
from diffusers import FluxPipeline
from IPython.display import Image

This code imports various Python libraries and specific functions. The imports are organized into two groups: standard Python libraries and specialized third-party libraries.

---

## Standard Library Imports

These modules are part of Python's built-in collection of tools.

* `import itertools`: Imports a module of iterator building blocks, such as `product`, `combinations`, and `chain`, for looping over combinations of data efficiently.
* `import os`: Imports a module that allows the program to interact with the operating system, for tasks like managing files and directories.
* `from random import sample`: Imports the `sample` function from the `random` module. This function is used to select a specific number of unique, random items from a list or sequence.
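
The cells shown in this notebook never call `itertools` or `sample`, so their intended use is not visible. One plausible use, assumed here rather than taken from the notebook, is building prompt variations and picking a few at random. The prompt fragments are made up for illustration.

```python
import itertools
from random import sample

# Hypothetical prompt fragments, chosen only for illustration.
subjects = ["a lighthouse at dusk", "a red fox in snow"]
styles = ["watercolor", "pixel art", "35mm photo"]

# itertools.product yields every (subject, style) pairing: 2 * 3 = 6 prompts.
prompts = [
    f"{subject}, {style}"
    for subject, style in itertools.product(subjects, styles)
]

# sample picks k distinct items without replacement.
picked = sample(prompts, k=2)

print(len(prompts))
for prompt in picked:
    print(prompt)
```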

---

## Third-Party Imports

These are external libraries that must be installed separately and are commonly used in the field of AI and data science.

* `import gradio as gr`: Imports the **Gradio** library, which is used to build and share simple web-based user interfaces for machine learning models. It's aliased as `gr` for easier use.
* `import torch`: Imports **PyTorch**, a major deep learning framework that provides tools for building and training neural networks, especially with GPU acceleration.
* `from diffusers import FluxPipeline`: Imports the `FluxPipeline` class from the **Diffusers** library. This library, created by Hugging Face, specializes in diffusion models, which are state-of-the-art for generating high-quality images from text descriptions. `FluxPipeline` is a specific tool for running the FLUX image generation model.
* `from IPython.display import Image`: Imports the `Image` class, which is used specifically within environments like Jupyter Notebooks or Google Colab to display images directly in the output of a code cell.

In [3]:
class CFG:
    model = "black-forest-labs/FLUX.1-dev"
    device = 'cuda'
    dtype = torch.bfloat16
    variant = "fp16"
    howmany = 1  # number of images per prompt
    seed = 42
    infsteps = 30

os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"

This code defines a configuration class and sets an environment variable to prepare for running a machine learning model, specifically an image generation model from Hugging Face.

---

## Configuration Class (`CFG`)

The code creates a class named `CFG` to act as a container for all the settings and parameters for the script. This practice keeps all the important variables organized and easy to modify in one place.

* `model`: Specifies the **model identifier** on the Hugging Face Hub. Here, it's set to `"black-forest-labs/FLUX.1-dev"`, which is a specific version of the FLUX image generation model.
* `device`: Sets the computation device to `'cuda'`, which means the program will use an NVIDIA GPU for processing. This is essential for accelerating the performance of deep learning models.
* `dtype`: Sets the **data type** for the model's tensors to `torch.bfloat16`. This 16-bit floating-point format halves memory use relative to 32-bit floats and speeds up computation on supporting hardware. It keeps the exponent range of `float32` but stores fewer mantissa bits, so it trades some precision for range.
* `variant`: Names the `fp16` (16-bit floating point) weight variant. In the cells shown, `CFG.variant` is never passed to `from_pretrained`, so it has no effect on which files are downloaded.
* `howmany`: A custom setting for the number of images per prompt, set to **1**. The `create_image` function shown later does not read it, so the pipeline's default of one image per prompt applies.
* `seed`: Sets the **random seed** to `42`. A fixed seed makes generation repeatable on the same hardware and software, provided the seeded generator is actually passed to the pipeline.
* `infsteps`: Defines the number of **inference steps** as `30`. For diffusion models, this controls how many refinement steps are taken to generate the final image. A higher number can improve quality but takes more time.
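
A settings class only helps if its values reach the call that uses them. As a sketch, the hypothetical helper `pipeline_kwargs` below collects the generation-related settings into keyword arguments. `num_inference_steps` and `num_images_per_prompt` are parameter names accepted by the `FluxPipeline` call. The helper itself and the trimmed-down `CFG` (without the torch-dependent fields) are assumptions made so the example runs on its own.

```python
class CFG:
    # Trimmed copy of the notebook's settings; device and dtype are omitted
    # here so the example does not need torch.
    model = "black-forest-labs/FLUX.1-dev"
    howmany = 1  # number of images per prompt
    seed = 42
    infsteps = 30


def pipeline_kwargs(cfg):
    """Translate config fields into keyword arguments for a pipeline call."""
    return {
        "num_inference_steps": cfg.infsteps,
        "num_images_per_prompt": cfg.howmany,
    }


print(pipeline_kwargs(CFG))
```

Such a dictionary could then be unpacked into the call, for example `pipe(prompt=prompt, **pipeline_kwargs(CFG))`.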

---

## Environment Variable

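* `os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"`: This line sets an **environment variable** for the current process. Setting `HF_HUB_ENABLE_HF_TRANSFER` to `"1"` asks `huggingface_hub` to download files through `hf_transfer`, a Rust-based library that can speed up large downloads considerably. Two caveats apply: the `hf_transfer` package must be installed separately (it is not in the `pip install` command above), and `huggingface_hub` usually reads this variable when it is first imported, so setting it after `diffusers` has been imported may have no effect.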
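
One cautious way to set the flag is to enable it only when the `hf_transfer` package can be imported. The helper name `enable_fast_downloads` is hypothetical. The sketch assumes it is called before any Hugging Face library is imported.

```python
import importlib.util
import os


def enable_fast_downloads(env=os.environ):
    """Set HF_HUB_ENABLE_HF_TRANSFER only when hf_transfer is installed.

    Returns True if the flag was set, False otherwise.
    """
    if importlib.util.find_spec("hf_transfer") is None:
        # Package missing: leave the environment untouched.
        return False
    env["HF_HUB_ENABLE_HF_TRANSFER"] = "1"
    return True


print(enable_fast_downloads())
```

Passing a plain dictionary as `env` makes the helper easy to try without touching the real environment.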

In [4]:
# fix randomness
g = torch.Generator(device = CFG.device).manual_seed(CFG.seed)

This line creates a specific instance of a PyTorch random number generator to ensure that any operation involving randomness is reproducible.

It works in two steps:

1.  **`torch.Generator(device = CFG.device)`**: This creates a **Generator** object. Instead of using a global random state, this object manages its own sequence of random numbers. The `device = CFG.device` part is crucial, as it places the generator on the GPU (`'cuda'`). This ensures that random numbers used in GPU computations are controlled by this specific generator.

2.  **`.manual_seed(CFG.seed)`**: This method sets the starting point for the random number sequence to a specific value, which is `42` as defined in the `CFG` class. By setting this **seed**, the sequence of numbers the generator produces will be exactly the same every time the code is run.

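When this generator `g` is passed to a function that has a random component (for example, as the `generator` argument of a pipeline call, where it seeds the initial noise), the output becomes repeatable across runs on the same hardware and library versions. Note that `create_image` below does not pass `g`, so as written the seed does not affect the generated images.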
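
The behaviour of a dedicated, seeded generator can be illustrated without a GPU. The sketch below swaps in Python's `random.Random` as a stand-in for `torch.Generator`. It is an analogy, not the PyTorch API. It also shows that a generator shared across calls advances its state, so only freshly seeded generators repeat the same values.

```python
import random


def draw(generator, n=3):
    """Draw n numbers from the given generator, advancing its internal state."""
    return [generator.random() for _ in range(n)]


# Two freshly seeded generators produce the same sequence.
fresh_a = draw(random.Random(42))
fresh_b = draw(random.Random(42))

# One shared generator keeps advancing, so consecutive calls differ.
shared = random.Random(42)
first_call = draw(shared)
second_call = draw(shared)

print(fresh_a == fresh_b)
print(first_call == second_call)
```

By the same logic, a single `g` reused across several pipeline calls would give a repeatable sequence of images across runs, but not the same image on every call.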

# Functions

In [5]:
def create_image(prompt):
    # Relies on the global `pipe` created in a later cell.
    image = pipe(
        prompt=prompt,
        guidance_scale=3.5,
        height=768,
        width=1360,
        num_inference_steps=CFG.infsteps,
    ).images[0]
    return image

This code defines a Python function named `create_image` that generates a single image based on a text description. The function takes one argument:
* `prompt`: A string of text that describes the desired image content.

---

## Process

1.  **`pipe(...)`**: The function calls an object named `pipe`, the image generation pipeline (here a `FluxPipeline`). `pipe` is a global variable created in a later cell, so that cell must run before `create_image` is called. The call passes several settings to control the generation process:
    * **`prompt=prompt`**: The user's text description is passed directly to the model.
    * **`guidance_scale=3.5`**: This value controls how strictly the model adheres to the prompt. A moderate value like 3.5 balances creativity with following instructions.
    * **`height=768, width=1360`**: These set the dimensions of the output image to 1360x768 pixels, a widescreen format.
    * **`num_inference_steps=CFG.infsteps`**: It uses the number of inference steps (`30`) defined earlier in the `CFG` configuration class.

2.  **`.images[0]`**: The pipeline returns an output object whose `images` attribute is a list, even when only one image is generated. This code selects the first item from that list.

3.  **`return image`**: The function returns the final generated image object.
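
The steps above can be sketched in a form that runs without downloading a model. This variant takes the pipeline as an explicit argument and forwards an optional `generator`, which is a parameter the `FluxPipeline` call accepts. `fake_pipe` is a hypothetical stand-in that only mimics the shape of the pipeline output and generates nothing.

```python
from types import SimpleNamespace


def create_image(pipe, prompt, steps=30, generator=None):
    """Variant of the notebook function with the pipeline passed in explicitly."""
    result = pipe(
        prompt=prompt,
        guidance_scale=3.5,
        height=768,
        width=1360,
        num_inference_steps=steps,
        generator=generator,  # a seeded generator here makes the noise repeatable
    )
    # The output object exposes a list of images; take the first one.
    return result.images[0]


def fake_pipe(**kwargs):
    """Hypothetical stand-in: returns an object with an `images` list."""
    return SimpleNamespace(images=[f"image for {kwargs['prompt']!r}"])


print(create_image(fake_pipe, "a red fox in snow"))
```

Passing the pipeline in as an argument avoids the hidden dependency on cell order, at the cost of needing a small wrapper when the function is handed to a UI.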

# Application

In [None]:
pipe = FluxPipeline.from_pretrained(CFG.model, torch_dtype = CFG.dtype)
pipe.enable_model_cpu_offload()

This code loads a pre-trained image generation model and then applies a memory-saving optimization to it.

---

## Loading the Pre-trained Model

`pipe = FluxPipeline.from_pretrained(CFG.model, torch_dtype = CFG.dtype)`

This line downloads and initializes the image generation model.

* **`FluxPipeline.from_pretrained(...)`**: A class method from the `diffusers` library that builds a pipeline from already-trained weights.
* **`CFG.model`**: It uses the model identifier (`"black-forest-labs/FLUX.1-dev"`) specified in the `CFG` class to find and download the correct model from the Hugging Face Hub. The repository is gated, so the download typically requires accepting the model license on the Hub and authenticating with an access token.
* **`torch_dtype = CFG.dtype`**: It instructs the library to load the model's parameters using the `torch.bfloat16` data type. This reduces the model's memory footprint and can speed up calculations.

The fully loaded and configured model is then assigned to the variable `pipe`.

---

## Optimizing for Memory

`pipe.enable_model_cpu_offload()`

This line enables an optimization called **model CPU offloading**, which relies on the `accelerate` library. When a pipeline is too large to fit entirely into the GPU's memory (VRAM), the weights stay in main system RAM and each sub-model (such as a text encoder, the transformer, or the VAE) is moved to the GPU only while it is running.

Because only one component occupies the GPU at a time, large models can run on hardware with limited VRAM, at the cost of some extra transfer time.
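
Rough arithmetic shows why both the 16-bit data type and offloading matter. The sketch assumes about 12 billion parameters for the FLUX.1-dev transformer, the figure given on its model card. The text encoders and VAE add to this, and activations are ignored, so treat the result as a lower bound.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Approximate memory needed just to hold the weights, in GiB."""
    return num_params * bytes_per_param / 1024**3


ASSUMED_PARAMS = 12e9  # assumption: ~12B parameters for the transformer alone

bf16_gib = weight_memory_gib(ASSUMED_PARAMS, 2)  # bfloat16 uses 2 bytes per value
fp32_gib = weight_memory_gib(ASSUMED_PARAMS, 4)  # float32 uses 4 bytes per value

print(f"bfloat16: {bf16_gib:.1f} GiB")
print(f"float32:  {fp32_gib:.1f} GiB")
```

Under that assumption, the transformer weights alone come to roughly 22 GiB in `bfloat16`, which is close to or above the VRAM of many single GPUs.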

In [None]:
demo = gr.Interface(fn=create_image, inputs="text", outputs="image")

demo.launch(debug = True)

This code uses the **Gradio** library to create and run a simple web-based user interface for the `create_image` function.

---

## Creating the User Interface

`demo = gr.Interface(fn=create_image, inputs="text", outputs="image")`

This line builds the web interface.

* **`gr.Interface(...)`**: Gradio's high-level class for wrapping a function in a UI.
* **`fn=create_image`**: It specifies that the `create_image` function will be the backend logic. When a user enters something in the UI, this function is the one that gets called.
* **`inputs="text"`**: This tells Gradio to create a **text box** as the input field, so a user can type in their prompt.
* **`outputs="image"`**: This tells Gradio to create an **image component** to display the result returned by the `create_image` function.

The complete, configured UI object is then stored in the `demo` variable.
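
The string shortcuts `"text"` and `"image"` can also be written as explicit components, which allows labels and sizes to be set. The sketch below is one possible configuration, not the notebook's. The helper name `build_demo`, the labels, and the title are made up. Gradio is imported inside the function only so that defining the helper does not require Gradio to be installed.

```python
def build_demo(fn):
    """Build a Gradio interface around `fn` using explicit components."""
    import gradio as gr  # deferred import: only needed when the helper is called

    return gr.Interface(
        fn=fn,
        inputs=gr.Textbox(label="Prompt", lines=2),
        outputs=gr.Image(label="Generated image", type="pil"),
        title="FLUX image generator",  # illustrative title
    )
```

Calling `build_demo(create_image)` would return an object that can be launched in the same way as `demo`.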

---

## Launching the Web App

`demo.launch(debug = True)`

This line starts the web server, making the user interface accessible.

* **`demo.launch()`**: This method activates the UI and generates a local URL. You can open this URL in your web browser to interact with the image generation model.
* **`debug = True`**: This optional parameter keeps the cell running (it blocks the main thread) and, in environments like Google Colab, prints errors from the app in the cell output, which is useful for troubleshooting.