# 🍫 Building a ControlNet pipeline for interior design with Fondant

> ⚠️ Please note that this notebook **is not** compatible with **Google Colab**. The tutorial requires starting
> Docker containers, which Google Colab does not support.

This example demonstrates an end-to-end Fondant pipeline that collects and processes data for fine-tuning a [ControlNet](https://github.com/lllyasviel/ControlNet) model, focusing on images related to interior design.


### Pipeline overview


There are 6 components in total:

1. [**Prompt Generation**](components/generate_prompts.py): This component generates a set of seed prompts using a rule-based approach that combines room types, styles, and prefixes, like “{prefix} {room_type}, {style_type} interior design”. As input, it takes a list of room types (bedroom, kitchen, laundry room, ...), a list of room styles (contemporary, minimalist, art deco, ...) and a list of prefixes (comfortable, luxurious, simple). These lists can be easily adapted to other domains. The output of this component is a list of seed prompts.

2. [**Image URL Retrieval**](https://fondant.ai/en/latest/components/hub/#retrieve_from_faiss_by_prompt#description): This component retrieves images from an image dataset based on the seed prompts. The retrieval is based on the CLIP embedding similarity between the prompt sentences and the captions in the image dataset. This component doesn’t return the actual images yet, only the URLs. The next component in the pipeline then downloads these images.
The LAION dataset is currently unavailable. Therefore, we have published a similar dataset based on the [DataComp 12M dataset](https://www.datacomp.ai/).

3. [**Download Images**](https://github.com/ml6team/fondant/tree/main/components/download_images): This component downloads the actual images based on the URLs retrieved by the previous component. It takes in the URLs as input and returns the actual images, along with some metadata (like their height and width).

4. [**Add Captions**](https://github.com/ml6team/fondant/tree/main/components/caption_images): This component captions all images using [BLIP](https://huggingface.co/docs/transformers/model_doc/blip). This model takes in the image and generates a caption that describes the content of the image. This component takes in a Hugging Face model ID, so it can use any [Hugging Face Hub model](https://huggingface.co/models).

5. [**Add Segmentation Maps**](https://github.com/ml6team/fondant/tree/main/components/segment_images): This component segments the images using the [UPerNet](https://huggingface.co/docs/transformers/model_doc/upernet) model. Each segmentation map labels regions of the image with one of 150 possible categories, listed [here](https://huggingface.co/openmmlab/upernet-convnext-small/blob/main/config.json#L110).

6. [**Write to file**](https://github.com/ml6team/fondant/tree/main/components/write_to_file): 
   Write the results to local disk or remote storage.
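
To make step 2 more concrete, here is a minimal sketch of similarity-based retrieval. It is a toy illustration, not the implementation of `retrieve_from_faiss_by_prompt`: the three-dimensional vectors stand in for CLIP embeddings, the URLs are placeholders, and `cosine_similarity` and `top_k_urls` are hypothetical helpers. The actual component queries a FAISS index rather than scanning every vector.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_urls(prompt_embedding, index, k):
    """Return the URLs of the k index entries most similar to the prompt."""
    ranked = sorted(
        index,
        key=lambda entry: cosine_similarity(prompt_embedding, entry["embedding"]),
        reverse=True,
    )
    return [entry["url"] for entry in ranked[:k]]


# Made-up embeddings (assumption): real CLIP embeddings have hundreds of dimensions.
toy_index = [
    {"url": "https://example.com/kitchen.jpg", "embedding": [0.9, 0.1, 0.0]},
    {"url": "https://example.com/beach.jpg", "embedding": [0.0, 0.2, 0.9]},
    {"url": "https://example.com/bedroom.jpg", "embedding": [0.7, 0.6, 0.1]},
]
prompt_embedding = [1.0, 0.2, 0.0]

urls = top_k_urls(prompt_embedding, toy_index, k=2)
print(urls)
```

Only URLs are returned at this stage, which mirrors the split between retrieval (step 2) and download (step 3).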

## Environment

#### This section checks the prerequisites of your environment. Read any errors or warnings carefully.

**Ensure a Python version between 3.8 and 3.11 is available**

In [1]:
import sys
if sys.version_info < (3, 8, 0) or sys.version_info >= (3, 12, 0):
    raise Exception(f"A Python version between 3.8 and 3.11 is required. You are running {sys.version}")

**Check if docker compose is installed and the docker daemon is running**

In [2]:
!docker compose version >/dev/null
!docker info >/dev/null

**Check if GPU is available**

In [1]:
import logging
import subprocess

try:
    subprocess.check_output('nvidia-smi')
    logging.info("Found GPU, using it!")
    number_of_accelerators = 1
    accelerator_name = "GPU"
except Exception:
    logging.warning("We recommend running this pipeline on a GPU, but none could be found, using CPU instead")
    number_of_accelerators = None
    accelerator_name = None

**Make sure Fondant is installed**

In [None]:
!pip install -r ../requirements.txt -q --disable-pip-version-check

## Build the dataset

### Creating a component

The first component is used to initialise the dataset. It is the `generate_prompts` component, which generates seed prompts. We will implement it as a [lightweight component](https://fondant.ai/en/latest/components/lightweight_components/).

In [2]:
import typing as t

import dask.dataframe as dd
import pandas as pd
import pyarrow as pa

from fondant.component import DaskLoadComponent
from fondant.dataset import lightweight_component


@lightweight_component(produces={"prompt": pa.string()})
class GeneratePromptsComponent(DaskLoadComponent):

    interior_styles = [
        "art deco",
        "bauhaus",
        "bouclé",
        "maximalist",
        "brutalist",
        "coastal",
        "minimalist",
        "rustic",
        "hollywood regency",
        "midcentury modern",
        "modern organic",
        "contemporary",
        "modern",
        "scandinavian",
        "eclectic",
        "bohemian",
        "industrial",
        "traditional",
        "transitional",
        "farmhouse",
        "country",
        "asian",
        "mediterranean",
        "southwestern",
    ]

    interior_prefix = [
        "comfortable",
        "luxurious",
        "simple",
    ]

    rooms = [
        "Bathroom",
        "Living room",
        "Hotel room",
        "Lobby",
        "Entrance hall",
        "Kitchen",
        "Family room",
        "Master bedroom",
        "Bedroom",
        "Kids bedroom",
        "Laundry room",
        "Guest room",
        "Home office",
        "Library room",
        "Playroom",
        "Home Theater room",
        "Gym room",
        "Basement room",
        "Garage",
        "Walk-in closet",
        "Pantry",
        "Gaming room",
        "Attic",
        "Sunroom",
        "Storage room",
        "Study room",
        "Dining room",
        "Loft",
        "Studio room",
        "Apartment",
    ]

    def __init__(self, *, n_rows_to_load: t.Optional[int]) -> None:
        """
        Generate a set of initial prompts that will be used to retrieve images from the
        image dataset.

        Args:
            n_rows_to_load: Optional argument that defines the number of rows to load.
                Useful for testing pipeline runs on a small scale
        """
        self.n_rows_to_load = n_rows_to_load

    @staticmethod
    def make_interior_prompt(room: str, prefix: str, style: str) -> str:
        """Generate a prompt for the interior design model.

        Args:
            room: room name
            prefix: prefix for the room
            style: interior style

        Returns:
            prompt for the interior design model
        """
        return f"{prefix.lower()} {room.lower()}, {style.lower()} interior design"

    def load(self) -> dd.DataFrame:
        import itertools

        room_tuples = itertools.product(self.rooms, self.interior_prefix, self.interior_styles)
        prompts = map(lambda x: self.make_interior_prompt(*x), room_tuples)

        pandas_df = pd.DataFrame(prompts, columns=["prompt"])

        if self.n_rows_to_load:
            pandas_df = pandas_df.head(self.n_rows_to_load)

        df = dd.from_pandas(pandas_df, npartitions=1)

        return df
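
The `load` method above relies on `itertools.product` to enumerate every combination of room, prefix, and style. The sketch below reproduces that logic on deliberately tiny lists (the list contents here are illustrative, not the ones used by the component) to show how many prompts result and in which order they are produced.

```python
import itertools

# Tiny illustrative lists; the component above uses much longer ones.
rooms = ["Kitchen", "Bedroom"]
prefixes = ["comfortable", "simple"]
styles = ["rustic", "modern", "coastal"]


def make_interior_prompt(room, prefix, style):
    """Same prompt template as the component above."""
    return f"{prefix.lower()} {room.lower()}, {style.lower()} interior design"


combinations = itertools.product(rooms, prefixes, styles)
prompts = [make_interior_prompt(*combo) for combo in combinations]

print(len(prompts))  # 2 rooms * 2 prefixes * 3 styles = 12
print(prompts[0])    # comfortable kitchen, rustic interior design

# The last list varies fastest, so truncating keeps the first room only.
first_four = prompts[:4]
print(first_four)
```

Because the product iterates over the first list most slowly, a small `n_rows_to_load` yields prompts for the first room type only. That is fine for a smoke test, but worth keeping in mind when judging the variety of a small sample.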

Now we can initialize the dataset, which includes specifying a name for it.

To do so, we use the `Dataset.create()` method. We can pass in arguments to change the behavior of the component. Here we are passing in `n_rows_to_load: 50`, which limits the amount of data that is generated for the purpose of this example.

In [3]:
from pathlib import Path
import pyarrow as pa
from fondant.dataset import Dataset, Resources

prompts = Dataset.create(
    ref=GeneratePromptsComponent,
    arguments={
        "n_rows_to_load": 50 # Set to 10 for small scale testing, set to None to load all rows
    },
    produces={
        "prompt": pa.string()
    },
    dataset_name="controlnet-dataset", # The name of our dataset
)

Now, our dataset workflow consists of a single component that generates prompts.

### Adding more (reusable) components

We can now proceed to add more components. 

We will use components available on the [Fondant Hub](https://fondant.ai/en/latest/components/hub/) and chain them together by calling `.apply()` on the result of the previous step.

In [4]:
image_urls = prompts.apply(
    "retrieve_from_faiss_by_prompt",
    arguments={
        "url_mapping_path": "hf://datasets/fondant-ai/datacomp-small-clip/id_mapping",
        "faiss_index_path": "hf://datasets/fondant-ai/datacomp-small-clip/faiss",
        "num_images": 2
    },
)

images = image_urls.apply(
    "download_images",
    arguments={
        "timeout": 1,
        "retries": 0,
        "image_size": 512,
        "resize_mode": "center_crop",
        "resize_only_if_bigger": False,
        "min_image_size": 0,
        "max_aspect_ratio": 2.5,
    },
)

captions = images.apply(
    "caption_images",
    arguments={
        "model_id": "Salesforce/blip-image-captioning-base",
        "batch_size": 8,
        "max_new_tokens": 50,
    },
    resources=Resources(
        accelerator_number=number_of_accelerators,
        accelerator_name=accelerator_name,
    ),
)

segmentations = captions.apply(
    "segment_images",
    arguments={
        "model_id": "openmmlab/upernet-convnext-small",
        "batch_size": 8,
    },
    resources=Resources(
        accelerator_number=number_of_accelerators,
        accelerator_name=accelerator_name,
    ),
)
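
The `download_images` arguments above include `max_aspect_ratio`, `image_size`, and `resize_mode="center_crop"`. The sketch below shows one plausible reading of those settings: drop images that are too elongated, scale the shorter side to `image_size`, then crop a centered square. This is an assumption made for illustration, not the component's code; `passes_aspect_ratio` and `center_crop_box` are hypothetical helpers, and the component documentation remains the reference for the exact behavior.

```python
def passes_aspect_ratio(width, height, max_aspect_ratio):
    """Keep an image only if its longer side is at most
    max_aspect_ratio times its shorter side (assumed semantics)."""
    return max(width, height) / min(width, height) <= max_aspect_ratio


def center_crop_box(width, height, image_size):
    """Scale so the shorter side equals image_size, then return the scaled
    size and a centered square crop box as (left, top, right, bottom)."""
    scale = image_size / min(width, height)
    scaled_w, scaled_h = round(width * scale), round(height * scale)
    left = (scaled_w - image_size) // 2
    top = (scaled_h - image_size) // 2
    return (scaled_w, scaled_h), (left, top, left + image_size, top + image_size)


print(passes_aspect_ratio(1920, 1080, 2.5))  # ratio is about 1.78, so kept
print(passes_aspect_ratio(3000, 1000, 2.5))  # ratio is 3.0, so dropped

scaled_size, box = center_crop_box(1024, 768, 512)
print(scaled_size, box)
```

Under this reading, every downloaded image ends up as a 512 by 512 square, which is a convenient fixed input size for the captioning and segmentation steps that follow.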

## Optional: writing the dataset to the Hugging Face Hub 

To write the final dataset to HF hub, we will use the `write_to_hf_hub` component from the [Fondant Hub](https://fondant.ai/en/latest/components/hub/).

You'll need a Hugging Face Hub account for this. If you don't have one, you can either create one, or skip this step.

In [None]:
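HF_USER = ""   # Placeholder: fill in your Hugging Face username
HF_TOKEN = ""  # Placeholder: fill in a Hugging Face access token with write permission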

`write_to_hf_hub` is a special type of reusable Fondant component which is **generic**. This means that it can handle different data schemas, but we have to tell it which schema to use.

We do this by adding a `consumes={}` block (see below), which tells the component which fields of the dataset to read and what their types are:

In [None]:
segmentations.write(
    "write_to_hf_hub",
    arguments={
        "username": HF_USER,
        "dataset_name": "fondant-controlnet-dataset",
        "hf_token": HF_TOKEN,
        "image_column_names": ["image"],
    },
    consumes={
        "image": pa.binary(),
        "image_width": pa.int32(),
        "image_height": pa.int32()
    }
)
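
To make the role of `consumes` more concrete, the toy sketch below shows the general idea of a generic step that only reads the fields it was told about. It is not how Fondant implements generic components: `select_consumed_fields` is a hypothetical helper, and plain Python types stand in for the `pyarrow` types used above.

```python
def select_consumed_fields(row, consumes):
    """Return only the fields named in `consumes`, checking presence and type."""
    selected = {}
    for name, expected_type in consumes.items():
        if name not in row:
            raise KeyError(f"missing consumed field: {name}")
        if not isinstance(row[name], expected_type):
            raise TypeError(f"field {name!r} is not of type {expected_type.__name__}")
        selected[name] = row[name]
    return selected


# One illustrative row with more fields than the writer needs.
row = {
    "prompt": "comfortable kitchen, rustic interior design",
    "image": b"\x89PNG",
    "image_width": 512,
    "image_height": 512,
    "caption": "a kitchen with wooden cabinets",
}

# Python types used as stand-ins for pa.binary() and pa.int32() (assumption).
consumes = {"image": bytes, "image_width": int, "image_height": int}

selected = select_consumed_fields(row, consumes)
print(list(selected))
```

Fields that are not listed, such as `prompt` and `caption` in this toy row, are simply left out of what the generic step receives.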

## Materialize the dataset

This pipeline will generate prompts, retrieve matching images from the image dataset, download them, and generate corresponding captions and segmentations. If you added the optional `write_to_hf_hub` component, it will write the resulting dataset to the HF hub.

Fondant provides multiple runners to run our pipeline:
- A Docker runner for local execution
- A Vertex AI runner for managed execution on Google Cloud
- A SageMaker runner for managed execution on AWS
- A Kubeflow Pipelines runner for execution anywhere

Here we will use the `DockerRunner` for local execution, which uses Docker Compose under the hood.

The runner will first build the custom component and download the reusable components from the component hub. Afterwards, you will see the components execute one by one.

In [10]:
!pwd

/home/jupyter/fondant-usecase-controlnet/src


In [None]:
from fondant.dataset.runner import DockerRunner

DockerRunner().run(dataset=segmentations, working_directory="./data-artifacts")

Docker version:
(20, 10, 17)
Starting workflow run...


 caption_images Pulling 
 generatepromptscomponent Pulling 
 download_images Pulling 
 retrieve_from_faiss_by_prompt Pulling 
 segment_images Pulling 
 segment_images Pulled 
 download_images Pulled 
 caption_images Pulled 
 generatepromptscomponent Pulled 
 retrieve_from_faiss_by_prompt Pulled 
 Container controlnet-dataset-generatepromptscomponent-1  Recreate
 Container controlnet-dataset-generatepromptscomponent-1  Recreated
 Container controlnet-dataset-retrieve_from_faiss_by_prompt-1  Recreate
 Container controlnet-dataset-retrieve_from_faiss_by_prompt-1  Recreated
 Container controlnet-dataset-download_images-1  Recreate
 Container controlnet-dataset-download_images-1  Recreated
 Container controlnet-dataset-caption_images-1  Recreate
 Container controlnet-dataset-caption_images-1  Recreated
 Container controlnet-dataset-segment_images-1  Recreate
 Container controlnet-dataset-segment_images-1  Recreated


Attaching to caption_images-1, download_images-1, generatepromptscomponent-1, retrieve_from_faiss_by_prompt-1, segment_images-1


generatepromptscomponent-1       | [2024-04-09 12:40:38,959 | fondant.cli | INFO] Component `GeneratePromptsComponent` found in module main
generatepromptscomponent-1       | [2024-04-09 12:40:38,966 | fondant.component.executor | INFO] Caching is currently temporarily disabled.
generatepromptscomponent-1       | [2024-04-09 12:40:38,966 | fondant.component.executor | INFO] No matching execution for component detected
generatepromptscomponent-1       | [2024-04-09 12:40:38,967 | root | INFO] Executing component
generatepromptscomponent-1       | [2024-04-09 12:40:39,833 | distributed.http.proxy | INFO] To route to workers diagnostics web server please install jupyter-server-proxy: python -m pip install jupyter-server-proxy
generatepromptscomponent-1       | [2024-04-09 12:40:39,886 | distributed.scheduler | INFO] State start
generatepromptscomponent-1       | [2024-04-09 12:40:39,891 | distributed.scheduler | INFO]   Scheduler at:     tcp://127.0.0.1:40335
generatepromptscomponent-1   

generatepromptscomponent-1 exited with code 0


retrieve_from_faiss_by_prompt-1  | [2024-04-09 12:40:59,045 | faiss.loader | INFO] Loading faiss with AVX2 support.
retrieve_from_faiss_by_prompt-1  | [2024-04-09 12:40:59,610 | faiss.loader | INFO] Successfully loaded faiss with AVX2 support.


## Exploring the dataset 

You can also explore the dataset using the Fondant explorer, which lets you visualize your output dataset at each component step.

In [None]:
from fondant.explore import run_explorer_app

run_explorer_app(base_path="./data-artifacts")  # Same directory as the runner's working_directory

To stop the Explorer and continue the notebook, press the stop button at the top of the notebook.

## Creating your own dataset

To create your own dataset, you can update the `generate_prompts` component to generate prompts describing the images you want.

Make the changes you want to the component above and re-execute the notebook.

## Scaling up

If you're happy with your dataset, it's time to scale up. Check [our documentation](https://fondant.ai/en/latest/pipeline/#compiling-and-running-a-pipeline) for more information about the available runners.