[Diffusers] Add text-guided image to image #223

Merged
Narsil merged 14 commits into main from add_img2img_diffusers on Apr 19, 2023

@patrickvonplaten patrickvonplaten commented Apr 4, 2023

@Narsil @osanseviero would be great to get a review here :-)

@patrickvonplaten patrickvonplaten changed the title from "[Diffusers] Add image to image" to "[Diffusers] Add text-guided image to image" on Apr 4, 2023
@patrickvonplaten
Contributor Author

@Narsil would be amazing to get a quick review here to better see what still needs to be done.

I rebased the PR after https://github.com/huggingface/api-inference-community/pull/230/files was merged.

patrickvonplaten and others added 2 commits April 18, 2023 16:38
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
@Narsil
Contributor

Narsil commented Apr 18, 2023

@patrickvonplaten the model seems incorrect :

{"error":"unknown error","warnings":["There was an inference error: The size of tensor a (148) must match the size of tensor b (37) at non-singleton dimension 3"]}

https://huggingface.co/hf-internal-testing/tiny-controlnet

Did I do something wrong ?

@patrickvonplaten
Contributor Author

> @patrickvonplaten the model seems incorrect :
>
> {"error":"unknown error","warnings":["There was an inference error: The size of tensor a (148) must match the size of tensor b (37) at non-singleton dimension 3"]}
>
> https://huggingface.co/hf-internal-testing/tiny-controlnet
>
> Did I do something wrong ?

I'll check tomorrow!

Member

@osanseviero osanseviero left a comment

Very cool! Thanks a lot 🔥

with open(config_file, "r") as f:
    config_dict = json.load(f)

is_controlnet = config_dict.get("_class_name", None) == "ControlNetModel"
Member

_class_name could be added to the ModelInfo object so you don't need to load the whole config here. Internal PR for that https://github.com/huggingface/moon-landing/pull/6067

Contributor Author

Ok waiting for #6067
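
For readers following the thread, the check under review can be sketched roughly as below. This is a minimal sketch, not the code merged in this PR: the helper names `is_controlnet` and `load_config` are hypothetical, and it assumes the repository's `config.json` has already been downloaded to a local path.

```python
import json
from typing import Any, Dict


def load_config(config_file: str) -> Dict[str, Any]:
    """Read a local config.json (hypothetical helper, not part of the PR)."""
    with open(config_file, "r") as f:
        return json.load(f)


def is_controlnet(config_dict: Dict[str, Any]) -> bool:
    """Return True when the config declares a ControlNetModel.

    Mirrors the `_class_name` comparison shown in the reviewed snippet.
    """
    return config_dict.get("_class_name", None) == "ControlNetModel"
```

Once `_class_name` is exposed on the ModelInfo object, as suggested in the review, the same comparison could presumably run without loading the whole config file.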

- TESTABLE_MODELS: Dict[str, str] = {"text-to-image": "CompVis/ldm-text2im-large-256"}
+ TESTABLE_MODELS: Dict[str, str] = {
+     "text-to-image": "hf-internal-testing/tiny-stable-diffusion-pipe",
+     "image-to-image": "hf-internal-testing/tiny-controlnet",
Member

Would be great to add a second test for image-to-image that is not controlNet. An example of testing multiple models can be seen here, but it will need some slight changes in the tests https://github.com/huggingface/api-inference-community/blob/main/docker_images/speechbrain/tests/test_api.py#L11

Contributor Author

Done
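
One way to cover several models per task, in the spirit of the suggestion above, is to map each task to a list of model ids and iterate over every pair. This is only a sketch under assumptions: the `Dict[str, List[str]]` shape, the helper `iter_test_cases`, and the choice of the pix2pix repository as the second image-to-image model are illustrative and may differ from what the PR actually committed.

```python
from typing import Dict, Iterator, List, Tuple

# Assumed shape: one task can list several models to test.
TESTABLE_MODELS: Dict[str, List[str]] = {
    "text-to-image": ["hf-internal-testing/tiny-stable-diffusion-pipe"],
    "image-to-image": [
        "hf-internal-testing/tiny-controlnet",
        # Assumption: a non-ControlNet model as the second image-to-image case.
        "hf-internal-testing/tiny-stable-diffusion-pix2pix",
    ],
}


def iter_test_cases(models: Dict[str, List[str]]) -> Iterator[Tuple[str, str]]:
    """Yield one (task, model_id) pair per model so each gets its own test run."""
    for task, model_ids in models.items():
        for model_id in model_ids:
            yield task, model_id
```

A test class could then loop over `iter_test_cases(TESTABLE_MODELS)` (for example inside `unittest`'s `subTest`) instead of assuming exactly one model per task.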

docker_images/diffusers/tests/test_api_image_to_image.py (outdated, resolved)
@coyotte508

This comment was marked as resolved.

        **kwargs,
    )
else:
    self.ldm = DiffusionPipeline.from_pretrained(
Contributor Author

@patrickvonplaten patrickvonplaten Apr 19, 2023

This can load any of StableDiffusionInstructPix2PixPipeline, StableDiffusionImg2ImgPipeline, or AltDiffusionImg2ImgPipeline, depending on how the model_index.json is defined.
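
To illustrate the idea of picking the pipeline from `model_index.json`, here is a toy dispatch sketch. It is not diffusers' implementation: the three classes are empty stand-ins defined locally, and `PIPELINES` and `resolve_pipeline_class` are hypothetical names. It only assumes that `model_index.json` carries a `_class_name` entry naming the pipeline.

```python
from typing import Any, Dict, Type


# Empty stand-ins for the real diffusers classes (illustration only).
class StableDiffusionImg2ImgPipeline: ...
class StableDiffusionInstructPix2PixPipeline: ...
class AltDiffusionImg2ImgPipeline: ...


# Hypothetical registry keyed by the class name stored in model_index.json.
PIPELINES: Dict[str, Type] = {
    cls.__name__: cls
    for cls in (
        StableDiffusionImg2ImgPipeline,
        StableDiffusionInstructPix2PixPipeline,
        AltDiffusionImg2ImgPipeline,
    )
}


def resolve_pipeline_class(model_index: Dict[str, Any]) -> Type:
    """Pick the pipeline class named by the `_class_name` entry."""
    class_name = model_index.get("_class_name")
    if class_name not in PIPELINES:
        raise ValueError(f"Unsupported pipeline class: {class_name!r}")
    return PIPELINES[class_name]
```

The practical upshot for the `else:` branch is that one generic loading call can serve several image-to-image model families without task-specific branching.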

@patrickvonplaten
Contributor Author

patrickvonplaten commented Apr 19, 2023

> @patrickvonplaten the model seems incorrect :
>
> {"error":"unknown error","warnings":["There was an inference error: The size of tensor a (148) must match the size of tensor b (37) at non-singleton dimension 3"]}
>
> https://huggingface.co/hf-internal-testing/tiny-controlnet
>
> Did I do something wrong ?

Sorry there was a problem with the config! Should be fixed now:

from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
import numpy
from PIL import Image

array = numpy.random.rand(64, 64, 3) * 255
image = Image.fromarray(array.astype('uint8')).convert('RGB')

model_id = "hf-internal-testing/tiny-stable-diffusion-pipe-no-safety"  # == tags?.base_model

controlnet = ControlNetModel.from_pretrained("hf-internal-testing/tiny-controlnet")

pipeline = StableDiffusionControlNetPipeline.from_pretrained(model_id, controlnet=controlnet)


out_img = pipeline("hey", image).images[0]

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
@patrickvonplaten
Contributor Author

The second main use case (stable diffusion image to image and pix2pix) can be seen in this example:

#!/usr/bin/env python3
from diffusers import StableDiffusionInstructPix2PixPipeline
import numpy
from PIL import Image

array = numpy.random.rand(64, 64, 3) * 255
image = Image.fromarray(array.astype('uint8')).convert('RGB')

model_id = "hf-internal-testing/tiny-stable-diffusion-pix2pix"

pipeline = StableDiffusionInstructPix2PixPipeline.from_pretrained(model_id)

out_img = pipeline("hey", image).images[0]

Member

@osanseviero osanseviero left a comment

Thanks a lot! Approving pending the internal PR and change to model_info

"sentence-similarity",
"fill-mask",
"table-question-answering",
"summarization",
"text2text-generation",
"text-classification",
"text-to-image",
"text-to-speech",
"token-classification",
Member

  • The removed ones should be reverted
  • We should also add image-to-image here

Contributor Author

They are duplicates: "text-to-image" already exists in there. Cleaning up ;-)

@patrickvonplaten
Contributor Author

Ok, things work now! @Narsil was kind enough to show me all the commands!

./manage.py start hf-internal-testing/tiny-controlnet --gpu
curl -X POST --data-binary "@tests/samples/plane.jpg" http://localhost:8000
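
The same request can be sent from Python with only the standard library. This is a rough sketch, assuming the local server started by `manage.py` listens on `http://localhost:8000` as in the command above; `build_image_request` and `post_image` are hypothetical helper names, and the response is simply returned as raw bytes without assuming its format.

```python
import urllib.request


def build_image_request(
    image_bytes: bytes, url: str = "http://localhost:8000"
) -> urllib.request.Request:
    """Build a POST request whose body is the raw image bytes."""
    return urllib.request.Request(url, data=image_bytes, method="POST")


def post_image(path: str, url: str = "http://localhost:8000") -> bytes:
    """Send the image file at `path` and return the raw response body."""
    with open(path, "rb") as f:
        request = build_image_request(f.read(), url)
    with urllib.request.urlopen(request) as response:
        return response.read()
```

With the container running, `post_image("tests/samples/plane.jpg")` would send the same sample file as the curl call.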

@patrickvonplaten
Contributor Author

@Narsil, feel free to merge once ok for you

@Narsil
Contributor

Narsil commented Apr 19, 2023

Tests are failing only because tiny-controlnet is slow to run on CPU. But it's working.

@Narsil Narsil merged commit fcf17c8 into main Apr 19, 2023
@Narsil Narsil deleted the add_img2img_diffusers branch April 19, 2023 12:58
4 participants