## 🍫 Building a ControlNet pipeline for interior design with Fondant

This example demonstrates an end-to-end Fondant pipeline that collects and processes data for fine-tuning a [ControlNet](https://github.com/lllyasviel/ControlNet) model, focusing on images related to interior design.


### Pipeline overview


The pipeline consists of 5 components:

1. [**Prompt Generation**](components/generate_prompts): This component generates a set of seed prompts using a rule-based approach that combines various rooms and styles, like “a photo of a {room_type} in the style of {style_type}”. As input, it takes a list of room types (bedroom, kitchen, laundry room, ...), a list of room styles (contemporary, minimalist, art deco, ...) and a list of prefixes (comfortable, luxurious, simple). These lists can easily be adapted to other domains. The output of this component is a list of seed prompts.

2. [**Image URL Retrieval**](https://github.com/ml6team/fondant/tree/main/components/prompt_based_laion_retrieval): This component retrieves images from the [LAION-5B](https://laion.ai/blog/laion-5b/) dataset based on the seed prompts. The retrieval is based on CLIP embedding similarity between the prompt sentences and the captions in the LAION dataset. This component doesn’t return the actual images yet, only their URLs. The next component in the pipeline downloads the images.

3. [**Download Images**](https://github.com/ml6team/fondant/tree/main/components/download_images): This component downloads the actual images based on the URLs retrieved by the previous component. It takes in the URLs as input and returns the actual images, along with some metadata (like their height and width).

4. [**Add Captions**](https://github.com/ml6team/fondant/tree/main/components/caption_images): This component captions all images using [BLIP](https://huggingface.co/docs/transformers/model_doc/blip). This model takes in the image and generates a caption that describes the content of the image. This component takes in a Hugging Face model ID, so it can use any [Hugging Face Hub model](https://huggingface.co/models).

5. [**Add Segmentation Maps**](https://github.com/ml6team/fondant/tree/main/components/segment_images): This component segments the images using the [UPerNet](https://huggingface.co/docs/transformers/model_doc/upernet) model. Each segmentation map contains segments of 150 possible categories listed [here](https://huggingface.co/openmmlab/upernet-convnext-small/blob/main/config.json#L110).
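
The rule-based prompt generation in step 1 can be pictured as a Cartesian product over the input lists. The sketch below is only an illustration: the short lists, the function name `generate_seed_prompts`, and the position of the prefix inside the template are assumptions, not the component's actual implementation.

```python
from itertools import product

# Illustrative inputs; the real component ships its own, longer lists.
ROOM_TYPES = ["bedroom", "kitchen", "laundry room"]
STYLES = ["contemporary", "minimalist", "art deco"]
PREFIXES = ["comfortable", "luxurious", "simple"]


def generate_seed_prompts(rooms, styles, prefixes):
    """Return one prompt per (prefix, room, style) combination."""
    # Assumption: the prefix is placed directly before the room type.
    return [
        f"a photo of a {prefix} {room} in the style of {style}"
        for prefix, room, style in product(prefixes, rooms, styles)
    ]


prompts = generate_seed_prompts(ROOM_TYPES, STYLES, PREFIXES)
print(len(prompts))   # 3 prefixes x 3 rooms x 3 styles = 27 prompts
print(prompts[0])
```

Swapping in lists for another domain (for example car types and paint finishes) would produce seed prompts for that domain without touching the rest of the logic.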

**Prerequisites:**

- Ensure Python version 3.8 to 3.10 is installed on your system.
- Install and configure Docker on your system.
- Ensure that you have a GPU for running the GPU-based components of the pipeline.


In [47]:
# Setup your environment 
!pip install "fondant[docker]==0.6.2" -q

## Implement the pipeline

First, we need to initialize the pipeline by specifying a name, a description, and a `base_path`. The `base_path` is where the pipeline artifacts and the data generated by the components are stored.

In [48]:
import os 
from pathlib import Path

import fsspec

from fondant.pipeline import ComponentOp, Pipeline
from fondant.compiler import DockerCompiler
from fondant.runner import DockerRunner

# General configs
BASE_PATH = "./data_dir"
N_ROWS_TO_LOAD = 10  # Set to None to load all rows

# Create data directory if it doesn't exist and if it's a local path
if fsspec.core.url_to_fs(BASE_PATH)[0].protocol == ('file', 'local'):
    Path(BASE_PATH).mkdir(parents=True, exist_ok=True)

pipeline = Pipeline(
    pipeline_name="controlnet-pipeline",
    pipeline_description="Pipeline that collects data to train ControlNet",
    base_path=BASE_PATH
)

To start off, we will use the `generate_prompts` component to generate our seed prompts:


In [49]:
generate_prompts_op = ComponentOp(
    component_dir="components/generate_prompts",
    arguments={"n_rows_to_load": N_ROWS_TO_LOAD},
)

pipeline.add_op(generate_prompts_op)

Now, our pipeline consists of a single component that generates the seed prompts. We can proceed to add the other components. All of them are reusable components, and we can initialize them using the `ComponentOp.from_registry(...)` method.

NOTE: The public CLIP service can only handle a few requests at a time. If you run into [timeout issues](https://github.com/rom1504/clip-retrieval/issues/267), you might want to host your own CLIP service following this [guide](https://github.com/rom1504/clip-retrieval/blob/main/docs/laion5B_h14_back.md).
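
For intuition about what the retrieval service does with each prompt, here is a toy sketch of nearest-neighbour search by cosine similarity. The three-dimensional vectors, the `example.com` URLs, and the helper names are made up for illustration; the real service works on CLIP embeddings over the LAION index and is not implemented this way.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_urls(prompt_embedding, index, k=3):
    """Return the URLs of the k index entries most similar to the prompt."""
    ranked = sorted(
        index,
        key=lambda item: cosine_similarity(prompt_embedding, item["embedding"]),
        reverse=True,
    )
    return [item["url"] for item in ranked[:k]]


# Hypothetical index entries: a URL plus a toy embedding.
index = [
    {"url": "https://example.com/a.jpg", "embedding": [0.9, 0.1, 0.0]},
    {"url": "https://example.com/b.jpg", "embedding": [0.0, 1.0, 0.0]},
    {"url": "https://example.com/c.jpg", "embedding": [0.7, 0.7, 0.1]},
]
prompt_embedding = [1.0, 0.0, 0.0]

urls = top_k_urls(prompt_embedding, index, k=2)
print(urls)
```

In this sketch `k` plays a role loosely comparable to the `num_images` argument of the retrieval component: it caps how many URLs come back per prompt.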

In [55]:
laion_retrieval_op = ComponentOp.from_registry(
    name="prompt_based_laion_retrieval",
    arguments={
        "num_images": 3,
        "aesthetic_score": 9,
        "aesthetic_weight": 0.5,
        "url": "https://knn.laion.ai/knn-service"
    },
)
download_images_op = ComponentOp.from_registry(
    name="download_images",
    arguments={
        "timeout": 1,
        "retries": 0,
        "image_size": 512,
        "resize_mode": "center_crop",
        "resize_only_if_bigger": False,
        "min_image_size": 0,
        "max_aspect_ratio": 2.5,
    },
)
caption_images_op = ComponentOp.from_registry(
    name="caption_images",
    arguments={
        "model_id": "Salesforce/blip-image-captioning-base",
        "batch_size": 8,
        "max_new_tokens": 50,
    },
    number_of_accelerators=1,
    accelerator_name="GPU",
)
segment_images_op = ComponentOp.from_registry(
    name="segment_images",
    arguments={
        "model_id": "openmmlab/upernet-convnext-small",
        "batch_size": 8,
    },
    number_of_accelerators=1,
    accelerator_name="GPU",
)
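
The `download_images` arguments above combine a filter (`min_image_size`, `max_aspect_ratio`) with a resize step (`image_size`, `resize_mode="center_crop"`). The following sketch shows one plausible reading of those options in plain Python. It is an assumption about their semantics, not the component's code, and both function names are hypothetical.

```python
def keep_image(width, height, min_image_size=0, max_aspect_ratio=2.5):
    """Return True if the image passes the size and aspect-ratio filters."""
    short_side, long_side = min(width, height), max(width, height)
    if short_side <= 0 or short_side < min_image_size:
        return False
    return long_side / short_side <= max_aspect_ratio


def center_crop_box(width, height, image_size=512):
    """Scale the short side to `image_size`, then crop a centered square.

    Returns the resized (width, height) and the crop box
    (left, top, right, bottom) inside the resized image.
    """
    short_side = min(width, height)
    new_w = round(width * image_size / short_side)
    new_h = round(height * image_size / short_side)
    left = (new_w - image_size) // 2
    top = (new_h - image_size) // 2
    return (new_w, new_h), (left, top, left + image_size, top + image_size)


print(keep_image(1024, 768))       # ratio 1.33, within the limit
print(keep_image(3000, 1000))      # ratio 3.0, exceeds 2.5
print(center_crop_box(1000, 500))  # wide image, cropped horizontally
```

Under this reading, very elongated images are dropped before resizing, so the square crop never has to discard most of a picture.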

Now we can add the components to our pipeline. Note that each `add_op` call declares, through `dependencies`, which step must run before the component being added.

In [57]:
pipeline.add_op(laion_retrieval_op, dependencies=generate_prompts_op)
pipeline.add_op(download_images_op, dependencies=laion_retrieval_op)
pipeline.add_op(caption_images_op, dependencies=download_images_op)
pipeline.add_op(segment_images_op, dependencies=caption_images_op)
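
The dependencies declared above form a directed acyclic graph, and a runner has to execute the steps in an order that respects it. As a rough illustration of that idea (not Fondant's internal code), the sketch below orders the steps with Kahn's algorithm. The step names are illustrative labels derived from the variables above, and `execution_order` is a hypothetical helper.

```python
from collections import deque

# Each step maps to the steps it depends on, mirroring the add_op calls.
dependencies = {
    "generate_prompts": [],
    "laion_retrieval": ["generate_prompts"],
    "download_images": ["laion_retrieval"],
    "caption_images": ["download_images"],
    "segment_images": ["caption_images"],
}


def execution_order(deps):
    """Return the steps in an order that respects every dependency."""
    remaining = {step: len(parents) for step, parents in deps.items()}
    children = {step: [] for step in deps}
    for step, parents in deps.items():
        for parent in parents:
            children[parent].append(step)

    # Start with the steps that have no unmet dependencies.
    ready = deque(step for step, count in remaining.items() if count == 0)
    order = []
    while ready:
        step = ready.popleft()
        order.append(step)
        for child in children[step]:
            remaining[child] -= 1
            if remaining[child] == 0:
                ready.append(child)

    if len(order) != len(deps):
        raise ValueError("dependency cycle detected")
    return order


order = execution_order(dependencies)
print(order)
```

Because this pipeline is a simple chain, the resulting order is the same as the order in which the components were added.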

## Optional: writing the dataset to the HF Hub

To write the final dataset to the HF Hub, we will use the `write_to_hf_hub` component. This is a reusable Fondant component that is **generic**, which means we still need to customize its component specification file: the dataframe schema defined in the `consumes` section has to match the data our pipeline produces.

To achieve this, we can create a `fondant_component.yaml` file in the directory `components/write_to_hub_controlnet` (the `component_dir` used in the next cell) with the following content:

```yaml
name: Write to hub
description: Component that writes a dataset to the hub
image: fndnt/write_to_hf_hub:0.6.2

consumes:
  images:
    fields:
      data:
        type: binary

  captions:
    fields:
      text:
        type: string

  segmentations:
    fields:
      data:
        type: binary

args:
  hf_token:
    description: The hugging face token used to write to the hub
    type: str
  username:
    description: The username under which to upload the dataset
    type: str
  dataset_name:
    description: The name of the dataset to upload
    type: str
  image_column_names:
    description: A list containing the image column names. Used to format the images to the HF Hub format
    type: list
    default: []
  column_name_mapping:
    description: Mapping of the consumed fondant column names to the written hub column names
    type: dict
    default: {}
```
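
The cell that follows passes `"image_column_names": ["images_data"]`, which suggests that consumed columns are addressed as `<subset>_<field>`. Assuming that naming convention holds, the sketch below derives the column names from the `consumes` section and applies a `column_name_mapping`. The mapping values and the helper `flat_column_names` are hypothetical and only illustrate how the two arguments relate.

```python
# The `consumes` section from the specification above, as a plain dict
# of subset name -> field names.
consumes = {
    "images": ["data"],
    "captions": ["text"],
    "segmentations": ["data"],
}

# Hypothetical mapping; the target names are illustrative only.
column_name_mapping = {
    "images_data": "image",
    "captions_text": "caption",
    "segmentations_data": "conditioning_image",
}


def flat_column_names(consumes_spec):
    """Build `<subset>_<field>` column names (assumed naming convention)."""
    return [
        f"{subset}_{field}"
        for subset, fields in consumes_spec.items()
        for field in fields
    ]


columns = flat_column_names(consumes)
# Columns without an entry in the mapping keep their original name.
renamed = [column_name_mapping.get(name, name) for name in columns]
print(columns)
print(renamed)
```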

In [46]:
USERNAME = "<your-hf-username>"  # placeholder: replace with your Hugging Face username
HF_TOKEN = "<your-hf-token>"  # placeholder: replace with a token that has write access

write_to_hub_controlnet = ComponentOp(
    component_dir="components/write_to_hub_controlnet",
    arguments={
        "username": USERNAME,
        "hf_token": HF_TOKEN,
        "dataset_name": "controlnet-interior-design",
        "image_column_names": ["images_data"],
    },
)

In [31]:
pipeline.add_op(write_to_hub_controlnet, dependencies=segment_images_op)

## Execute the pipeline

The pipeline will generate the prompts, retrieve matching images from the LAION dataset, download them, and finally generate the corresponding captions and segmentation maps before writing the dataset to the HF Hub.

We can now execute our pipeline. Fondant provides various executors; in this case we use the `DockerRunner` for local execution, which relies on Docker Compose under the hood.

In [52]:
compiler = DockerCompiler()
runner = DockerRunner()

# Compile the pipeline to a Docker Compose file, then run it locally
compiler.compile(pipeline=pipeline, output_path="docker-compose.yml")
runner.run("docker-compose.yml")

[2023-11-02 11:16:13,398 | fondant.compiler | INFO] Compiling controlnet-pipeline to docker-compose.yml
[2023-11-02 11:16:13,399 | fondant.compiler | INFO] Base path found on local system, setting up ./data_dir as mount volume
[2023-11-02 11:16:13,399 | fondant.pipeline | INFO] Sorting pipeline component graph topologically.
[2023-11-02 11:16:13,401 | fondant.pipeline | INFO] All pipeline component specifications match.
[2023-11-02 11:16:13,402 | fondant.compiler | INFO] Compiling service for generate_prompts
[2023-11-02 11:16:13,402 | fondant.compiler | INFO] Found Dockerfile for generate_prompts, adding build step.
[2023-11-02 11:16:13,403 | fondant.compiler | INFO] Compiling service for laion_retrieval
[2023-11-02 11:16:13,403 | fondant.compiler | INFO] Compiling service for download_images
[2023-11-02 11:16:13,410 | fondant.compiler | INFO] Successfully compiled to docker-compose.yml
 laion_retrieval Pulling 
 download_images Pulling 
 download_images Pulled 
 laion_retrieval Pulle

#1 [generate_prompts internal] load .dockerignore
#1 transferring context: 2B done
#1 DONE 0.0s

#2 [generate_prompts internal] load build definition from Dockerfile
#2 transferring dockerfile: 740B done
#2 DONE 0.0s

#3 [generate_prompts internal] load metadata for docker.io/library/python:3.8-slim
#3 DONE 0.0s

#4 [generate_prompts 1/9] FROM docker.io/library/python:3.8-slim
#4 DONE 0.0s

#5 [generate_prompts internal] load build context
#5 transferring context: 131B done
#5 DONE 0.0s

#6 [generate_prompts 6/9] RUN pip3 install fondant[aws,azure,gcp]@git+https://github.com/ml6team/fondant@main
#6 CACHED

#7 [generate_prompts 4/9] RUN python3 -m pip install --upgrade pip
#7 CACHED

#8 [generate_prompts 8/9] COPY src/ .
#8 CACHED

#9 [generate_prompts 2/9] RUN apt-get update &&     apt-get upgrade -y &&     apt-get install git -y
#9 CACHED

#10 [generate_prompts 7/9] WORKDIR /component/src
#10 CACHED

#11 [generate_prompts 5/9] RUN pip3 install --no-cache-dir -r requirements.txt
#11 CA

 Container controlnet-pipeline-generate_prompts-1  Recreated
 Container controlnet-pipeline-laion_retrieval-1  Recreate
 Container controlnet-pipeline-laion_retrieval-1  Recreated
 Container controlnet-pipeline-download_images-1  Recreate
 Container controlnet-pipeline-download_images-1  Recreated
controlnet-pipeline-generate_prompts-1  | [2023-11-02 10:16:15,919 | fondant.cli | INFO] Component `GeneratePromptsComponent` found in module main
controlnet-pipeline-generate_prompts-1  | [2023-11-02 10:16:15,921 | fondant.executor | INFO] Dask default local mode will be used for further executions.Our current supported options are limited to 'local' and 'default'.
controlnet-pipeline-generate_prompts-1  | [2023-11-02 10:16:15,921 | fondant.executor | INFO] No matching execution for component detected
controlnet-pipeline-generate_prompts-1  | [2023-11-02 10:16:15,922 | root | INFO] Executing component
controlnet-pipeline-generate_prompts-1  | [2023-11-02 10:16:15,947 | root | INFO] Creating 

[########################################] | 100% Completed | 100.63 ms
controlnet-pipeline-generate_prompts-1 exited with code 0


controlnet-pipeline-laion_retrieval-1   | [2023-11-02 10:16:17,574 | fondant.cli | INFO] Component `LAIONRetrievalComponent` found in module main
controlnet-pipeline-laion_retrieval-1   | [2023-11-02 10:16:17,577 | fondant.executor | INFO] Dask default local mode will be used for further executions.Our current supported options are limited to 'local' and 'default'.
controlnet-pipeline-laion_retrieval-1   | [2023-11-02 10:16:17,578 | fondant.executor | INFO] Previous component `generate_prompts` is not cached. Invalidating cache for current and subsequent components
controlnet-pipeline-laion_retrieval-1   | [2023-11-02 10:16:17,578 | fondant.executor | INFO] Caching disabled for the component
controlnet-pipeline-laion_retrieval-1   | [2023-11-02 10:16:17,578 | root | INFO] Executing component
controlnet-pipeline-laion_retrieval-1   | [2023-11-02 10:16:17,597 | fondant.data_io | INFO] Loading subset prompts with fields ['text']...
controlnet-pipeline-laion_retrieval-1   | [2023-11-02 10:

[########################################] | 100% Completed | 1.51 s


controlnet-pipeline-laion_retrieval-1   | [2023-11-02 10:16:19,130 | fondant.executor | INFO] Saving output manifest to /data_dir/controlnet-pipeline/controlnet-pipeline-20231102111613/laion_retrieval/manifest.json
controlnet-pipeline-laion_retrieval-1   | [2023-11-02 10:16:19,130 | fondant.executor | INFO] Writing cache key to /data_dir/controlnet-pipeline/cache/f8f2a4b417651a0faebd6becc808f2d1.txt


controlnet-pipeline-laion_retrieval-1 exited with code 0


controlnet-pipeline-download_images-1   | [2023-11-02 10:16:20,837 | fondant.cli | INFO] Component `DownloadImagesComponent` found in module main
controlnet-pipeline-download_images-1   | [2023-11-02 10:16:20,840 | fondant.executor | INFO] Dask default local mode will be used for further executions.Our current supported options are limited to 'local' and 'default'.
controlnet-pipeline-download_images-1   | [2023-11-02 10:16:20,841 | fondant.executor | INFO] Previous component `laion_retrieval` is not cached. Invalidating cache for current and subsequent components
controlnet-pipeline-download_images-1   | [2023-11-02 10:16:20,841 | fondant.executor | INFO] Caching disabled for the component
controlnet-pipeline-download_images-1   | [2023-11-02 10:16:20,841 | root | INFO] Executing component
controlnet-pipeline-download_images-1   | [2023-11-02 10:16:20,859 | fondant.data_io | INFO] Loading subset images with fields ['url']...
controlnet-pipeline-download_images-1   | [2023-11-02 10:16:

[########################################] | 100% Completed | 4.41 s


controlnet-pipeline-download_images-1   | [2023-11-02 10:16:25,604 | fondant.executor | INFO] Saving output manifest to /data_dir/controlnet-pipeline/controlnet-pipeline-20231102111613/download_images/manifest.json
controlnet-pipeline-download_images-1   | [2023-11-02 10:16:25,604 | fondant.executor | INFO] Writing cache key to /data_dir/controlnet-pipeline/cache/646bfdf5a89355ec4af67b2d28caef2e.txt


controlnet-pipeline-download_images-1 exited with code 0


## Exploring the dataset 

You can also explore the dataset using the Fondant explorer, which lets you visualize the output dataset at each component step.

In [53]:
os.environ['BASE_PATH'] = BASE_PATH

In [54]:
! fondant explore -b $BASE_PATH

[2023-11-02 11:16:35,395 | root | INFO] Using local base path: ./data_dir
[2023-11-02 11:16:35,395 | root | INFO] This directory will be mounted to /artifacts in the container.
[2023-11-02 11:16:35,395 | root | INFO] Running image from registry: fndnt/data_explorer with tag: latest on port: 8501
[2023-11-02 11:16:35,395 | root | INFO] Access the explorer at http://localhost:8501
latest: Pulling from fndnt/data_explorer
Digest: sha256:8f317b795798f24f37cb287355d6223c9cca94eb6f12e3535790d1faa79735ec
Status: Image is up to date for fndnt/data_explorer:latest
^C
