Add interior design ControlNet pipeline readme #150

Merged: 25 commits, May 23, 2023

Conversation

j-branigan
Contributor

First draft done. Feedback on image is particularly welcome.

@j-branigan j-branigan linked an issue May 19, 2023 that may be closed by this pull request
Member

@RobbeSneyders RobbeSneyders left a comment

Thanks @jamesbraniganml6!

  • Can we add a visual example of the data?
  • Can we add a section on how to reuse this pipeline for a different use case? They should only recreate the prompt generation component.
  • The pipeline is currently drawn with parallel captioning and segmentation, which is not the case. For clarity, it might be better to make these sequential.

@RobbeSneyders RobbeSneyders changed the title Add pipeline doc Add interior design ControlNet pipeline readme May 22, 2023
PhilippeMoussalli and others added 11 commits May 22, 2023 18:21
PR that adds missing data types required for defining needed nested data
types for the embedding component

Changes:
* The string values of the enum types have been changed to pyarrow types
to make it easier to define complex schema
* utf8 types defined in the components have been changed to strings to
make them more intuitive

We will need to make more changes in the future to handle different
nested data types as suggested by @GeorgesLorre

https://swagger.io/docs/specification/data-models/data-types/#:~:text=the%20null%20value.-,Arrays,-Arrays%20are%20defined

Enums let us define typed constants, but we would need many of them to
accommodate all the different kinds of nested structures. We might need
to move to dynamically typed data types backed by a dictionary, but this
would require substantial changes to the JSON schemas and the code, so
it is better left for later.
PR that adds the image embedding component. Largely inspired by Niel's
PR #111 (inference and batching with dask).
Added the logo SVGs 🎉
This branch is based on the image-embedding branch, which has a lot of
changes. I suggest merging that PR first, which will give a much
smaller diff here.

This component implements the LAION image retrieval component which uses
CLIP embeddings from the input subset to query the LAION database. It
returns an images subset with URLs, similar to the other prompt based
Clip Retrieval component. These URLs should then be downloaded by the
already-made image-downloading component.

---------

Co-authored-by: Philippe Moussalli <philippe.moussalli95@gmail.com>
This PR contains the Image Cropping component.

The component looks for the most common color in the border. It uses
this color to calculate how much of the image border can be cropped out.
If the crop is not square, it will paste a border on the shortest side
to make it square again.


![d4e35776-3ce1-4157-ac1f-5b2f18ff2ad4](https://github.com/ml6team/fondant/assets/92580873/314ec0d3-3ab6-418e-8051-d9f464496b0e)

![82eeae2d-c63c-42cb-881c-3707971d043c](https://github.com/ml6team/fondant/assets/92580873/6754b418-7922-4744-8ef3-59978b07ee9d)

---------

Co-authored-by: Philippe Moussalli <philippe.moussalli95@gmail.com>
@j-branigan
Contributor Author

j-branigan commented May 22, 2023

> Thanks @jamesbraniganml6!
>
> • Can we add a visual example of the data?
> • Can we add a section on how to reuse this pipeline for a different use case? They should only recreate the prompt generation component.
> • The pipeline is currently drawn with parallel captioning and segmentation, which is not the case. For clarity, it might be better to make these sequential.

Thanks for the feedback @RobbeSneyders.
I've updated the image and added small docs for the components for convenience. On the other two issues:

  • I have added a line in the pipeline README linking to the generate_prompts file. It's very short but before adding more detail I think that the component itself should be updated to make it more general. The function names and prompt generation are very specifically related to interior design. I can do it myself but I'd want to open it as a separate PR if you agree.

  • What did you have in mind as a visual example of the data? Do you mean an actual image from LAION, or a visual representation of the structure of the dataframe?

Collaborator

@GeorgesLorre GeorgesLorre left a comment

Nice, James!

Since we focus on reusability and want to inspire people to use fondant, we could make it more visual by adding more data examples or example pictures (for the resizing, captioning, etc.). But that is something we can still improve outside of this PR.

Collaborator

IMO, the image does not tell much. Maybe add a sentence or two per step to explain how the data is being extended and enriched.

(also database is very vague, maybe call it dataset ready for fine-tuning or something)

## Introduction
This example demonstrates an end-to-end fondant pipeline to collect and process data for the training of a [ControlNet](https://github.com/lllyasviel/ControlNet) model, focusing on images related to interior design.

### What is Controlnet?
Contributor

Suggested change
### What is Controlnet?
### What is ControlNet?


### What is Controlnet?

Controlnet is an image generation model developed by https://arxiv.org/abs/2302.05543 that gives the user more control over the image generation process. It is based on the Stable Diffusion model, which generates images based on a caption and an image. The Controlnet model adds a third input, a conditioning image, that can be used for specifying specific wanted elements in the generated image.
Contributor

Suggested change
Controlnet is an image generation model developed by https://arxiv.org/abs/2302.05543 that gives the user more control over the image generation process. It is based on the Stable Diffusion model, which generates images based on a caption and an image. The Controlnet model adds a third input, a conditioning image, that can be used for specifying specific wanted elements in the generated image.
ControlNet is an image generation model developed by [Zhang et al., 2023](https://arxiv.org/abs/2302.05543) that gives the user more control over the image generation process. It is based on the [Stable Diffusion](https://stability.ai/blog/stable-diffusion-public-release) model, which generates images based on text and an optional image. The ControlNet model adds a third input, a conditioning image, that can be used for specifying specific wanted elements in the generated image.

Comment on lines 40 to 43
Useful links:
* https://github.com/lllyasviel/ControlNet
* https://huggingface.co/docs/diffusers/main/en/api/pipelines/stable_diffusion/controlnet
* https://arxiv.org/abs/2302.05543
Contributor

Suggested change
Useful links:
* https://github.com/lllyasviel/ControlNet
* https://huggingface.co/docs/diffusers/main/en/api/pipelines/stable_diffusion/controlnet
* https://arxiv.org/abs/2302.05543
Useful links:

* https://github.com/lllyasviel/ControlNet
* https://huggingface.co/docs/diffusers/main/en/api/pipelines/stable_diffusion/controlnet
* https://arxiv.org/abs/2302.05543

You may need to add a blank line here for the list to render properly.

Member

@RobbeSneyders RobbeSneyders left a comment

Thanks @ChristiaensBert!

Looks good, left some comments. You'll also have to rebase / merge since some of the components have moved on main.

Comment on lines 119 to 122
1. Building the images for each of the pipeline components
```
bash build_images.sh -c all
```
Member

They should set --namespace and --repo as well, so the images are pushed to their own GitHub container registry.

Contributor

@philippe-ml6 Do you have the full command that they have to use?

Contributor

It's in the bash script's help function

bash build_images.sh --help
Usage: build_images.sh [options]
Options:
  -c, --component <value>  Set the component name. Pass the component folder name to build a certain components or 'all' to build all components in the current directory (required)
  -n, --namespace <value>  Set the namespace (default: ml6team)
  -r, --repo <value>       Set the repo (default: fondant)
  -t, --tag <value>        Set the tag (default: latest)
  -h, --help               Display this help message
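
Putting the flags from the help output together, a full invocation might look like the sketch below. The namespace, repo, and tag values are hypothetical placeholders rather than values from this PR; replace them with your own GitHub user or organisation and repository. The snippet only assembles and prints the command so it can be reviewed first; whether and where the script pushes the images depends on `build_images.sh` itself.

```shell
# Hypothetical values: replace with your own GitHub user/organisation and repository.
NAMESPACE="my-github-org"
REPO="my-fondant-images"
TAG="latest"

# Assemble the command from the flags listed in the help output (-c, -n, -r, -t).
BUILD_CMD="bash build_images.sh --component all --namespace ${NAMESPACE} --repo ${REPO} --tag ${TAG}"

# Print the command for review; run it from the directory that contains build_images.sh.
echo "${BUILD_CMD}"
```

Leaving out `--namespace` and `--repo` falls back to the defaults shown in the help output (`ml6team` and `fondant`).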

# caption_images

### Description
This component captions inputted images using [BLIP](https://huggingface.co/docs/transformers/model_doc/blip).
Member

This component takes a model id as input, so it can use any HF Hub model

Contributor

Is it a database or the hub that we should have at the end?


| Input image | Output image |
|----------------------------------------------------------------|------------------------------------------------------------------|
| ![input image](docs/art/interior_design_controlnet_input1.png) | ![output image](docs/art/interior_design_controlnet_output1.jpg) |
Contributor

Those images are not rendered properly

Member

@RobbeSneyders RobbeSneyders left a comment

Thanks Bert!

Can you remove the images that are not used? I see you added some more in the docs/art folder.

@RobbeSneyders RobbeSneyders merged commit 015fd25 into main May 23, 2023
@RobbeSneyders RobbeSneyders deleted the add_pipeline_doc branch May 23, 2023 13:09
Hakimovich99 pushed a commit that referenced this pull request Oct 16, 2023
First draft done. Feedback on image is particularly welcome.

---------

Co-authored-by: Philippe Moussalli <philippe.moussalli95@gmail.com>
Co-authored-by: khaerensml6 <92426912+khaerensml6@users.noreply.github.com>
Co-authored-by: ChristiaensBert <92580873+ChristiaensBert@users.noreply.github.com>
Co-authored-by: Bert Christiaens <bert.christiaens@ml6.eu>
Successfully merging this pull request may close these issues.

Add ControlNet example README
7 participants