In [None]:
!pip install getdaft --pre --extra-index-url https://pypi.anaconda.org/daft-nightly/simple
!pip install min-dalle torch Pillow

In [None]:
CI = False

In [None]:
# Flip this flag if you want to see the performance of running on CPU vs GPU
USE_GPU = not CI

PARQUET_PATH = "https://huggingface.co/datasets/ChristophSchuhmann/improved_aesthetics_6.5plus/resolve/main/data/train-00000-of-00001-6f24a7497df494ae.parquet"

```{hint}
✨✨✨ **Run this notebook on Google Colab** ✨✨✨

You can [run this notebook yourself with Google Colab](https://colab.research.google.com/github/Eventual-Inc/Daft/blob/main/tutorials/text_to_image/text_to_image_generation.ipynb)!
```

# Generating Images from Text with DALL-E

In this tutorial, we will be using the DALL-E model to generate images from text. We will explore how to use GPUs with Daft to accelerate computations.

To run this tutorial:

1. You will need access to a GPU. If you are on Google Colab, you may switch to a GPU runtime by going to the menu `Runtime -> Change runtime type -> Hardware accelerator -> GPU -> Save`.
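
The GPU requirement above can also be checked in code before any model is loaded. The following is a minimal sketch, assuming PyTorch's `torch.cuda.is_available()` is an acceptable check; `pick_device` and `cuda_is_available` are hypothetical helper names introduced here for illustration and are not part of Daft or min-dalle.

```python
def pick_device(use_gpu: bool, cuda_available: bool) -> str:
    """Return "cuda" only when a GPU is both requested and actually present."""
    return "cuda" if (use_gpu and cuda_available) else "cpu"


def cuda_is_available() -> bool:
    """Ask PyTorch whether CUDA is usable; report False if torch is not installed."""
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()


# Hypothetical usage: fall back to CPU instead of failing later at model load time.
device = pick_device(use_gpu=True, cuda_available=cuda_is_available())
print(f"Model would be loaded on: {device}")
```

Keeping the decision in a small pure function makes the fallback easy to reason about: a GPU is used only when it is both requested and detected.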

Let's get started!

## Setting Up

First, let's load a Parquet file into Daft. This particular file is hosted on Hugging Face at an HTTPS URL.

In [None]:
import daft
daft.context.set_runner_py(use_thread_pool=False)

parquet_df = daft.read_parquet(PARQUET_PATH)

Let's go ahead and `.collect()` this DataFrame. This downloads the Parquet file and materializes the data in memory, so that subsequent operations reuse the in-memory data instead of fetching the file again.
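
To make the idea of materializing once and reusing the result concrete, here is a toy sketch in plain Python. It is not how Daft is implemented; `LazyTable` and `expensive_download` are hypothetical names used only to illustrate the caching behavior described above.

```python
from typing import Callable, List, Optional


class LazyTable:
    """Toy stand-in for a lazy DataFrame: nothing is loaded until collect()."""

    def __init__(self, loader: Callable[[], List[dict]]) -> None:
        self._loader = loader
        self._rows: Optional[List[dict]] = None
        self.load_count = 0  # how many times the expensive loader actually ran

    def collect(self) -> "LazyTable":
        # Run the loader only if the data has not been materialized yet.
        if self._rows is None:
            self._rows = self._loader()
            self.load_count += 1
        return self

    def rows(self) -> List[dict]:
        # Later operations go through collect() and hit the cached rows.
        self.collect()
        assert self._rows is not None
        return self._rows


def expensive_download() -> List[dict]:
    """Hypothetical placeholder for fetching and parsing a remote file."""
    return [{"TEXT": "a cat"}, {"TEXT": "a dog"}]


table = LazyTable(expensive_download)
table.collect()   # first call runs the loader
table.rows()      # served from memory
table.rows()      # served from memory
print(table.load_count)  # 1
```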

In [None]:
parquet_df.collect()

In [None]:
parquet_df = parquet_df.select(parquet_df["URL"], parquet_df["TEXT"], parquet_df["AESTHETIC_SCORE"])

## Downloading Images

Like many image datasets, this one does not store the images themselves in its files; the dataset authors store a URL to each image instead.

Let's use Daft's built-in functionality to download the images and open them as PIL Images, all in just a few lines of code!
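
For intuition, the download step can be pictured as fetching the bytes behind each URL, one row at a time. The sketch below does this with the standard library only. The helper `download_bytes` is hypothetical, and returning `None` on failure is a choice made for this illustration, not a description of how Daft handles failed downloads. Decoding the bytes into an image is left out here.

```python
import urllib.request
from typing import List, Optional


def download_bytes(url: str, timeout: float = 10.0) -> Optional[bytes]:
    """Fetch the raw bytes behind a URL, or return None if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read()
    except (OSError, ValueError):
        # OSError covers network and URL errors; ValueError covers malformed URLs.
        return None


# A data: URL keeps the example self-contained, so no network access is needed.
urls: List[str] = ["data:text/plain;base64,aGVsbG8=", "not-a-valid-url"]
payloads = [download_bytes(u) for u in urls]
print(payloads)  # [b'hello', None]
```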

In [None]:
# Filter for images with longer descriptions
parquet_df_with_long_strings = parquet_df.where(parquet_df["TEXT"].str.length() > 50)

# Download images
images_df = parquet_df_with_long_strings.with_column(
    "image",
    parquet_df_with_long_strings["URL"].url.download().image.decode(),
)

In [None]:
images_df.show(5)

Great! Now we have a pretty good idea of what our dataset looks like.

## Running the Mini DALL-E model on a GPU using Daft UDFs

Let's now run the Mini DALL-E model over the `"TEXT"` column, and generate images for those texts!

Using GPUs with Daft UDFs is simple. Just pass a `ResourceRequest` with `num_gpus=N`, where `N` is the number of GPUs that your UDF is going to use.
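
The UDF in the next cell is written as a class that builds the model in `__init__` and generates images in `__call__`. A plausible motivation is that the expensive model load happens once and is then reused across batches. The plain-Python sketch below illustrates only that pattern; `FakeModel` and `GenerateFromText` are hypothetical stand-ins, and no claim is made here about how Daft schedules or instantiates UDF classes.

```python
from typing import List


class FakeModel:
    """Hypothetical stand-in for an expensive-to-load model such as MinDalle."""

    instances_created = 0  # counts how many times the "model" was loaded

    def __init__(self) -> None:
        FakeModel.instances_created += 1

    def generate(self, prompt: str) -> str:
        return f"<image for {prompt!r}>"


class GenerateFromText:
    """Stateful callable: load the model once, then reuse it for every batch."""

    def __init__(self) -> None:
        self.model = FakeModel()

    def __call__(self, texts: List[str]) -> List[str]:
        # One output per input text, preserving order.
        return [self.model.generate(t) for t in texts]


udf = GenerateFromText()
batches = [["a red fox", "a blue bird"], ["a green hill"]]
results = [udf(batch) for batch in batches]

print(FakeModel.instances_created)  # 1
print(results[1])                   # ["<image for 'a green hill'>"]
```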

In [None]:
import torch
from min_dalle import MinDalle

from daft import ResourceRequest


@daft.udf(return_dtype=daft.DataType.python())
class GenerateImageFromText:
    def __init__(self):
        self.model = MinDalle(
            models_root='./pretrained',
            dtype=torch.float32,
            # Tell the min-dalle library to load the model on GPU or CPU
            device="cuda" if USE_GPU else "cpu",
            is_mega=False,
            is_reusable=True,
        )

    def __call__(self, text_col):
        return [
            self.model.generate_image(
                t,
                seed=-1,
                grid_size=1,
                is_seamless=False,
                temperature=1,
                top_k=256,
                supercondition_factor=32,
            ) for t in text_col.to_pylist()
        ]

resource_request = ResourceRequest(num_gpus=1) if USE_GPU else ResourceRequest()
images_df.with_column(
    "generated_image",
    GenerateImageFromText(images_df["TEXT"]),
    resource_request=resource_request,
).show(1)