Kandinsky_v22_yiyi #3936

yiyixuxu · 2023-07-04T04:34:53Z

🚨🚨🚨 Note: The main author of this PR is @cene555 and the Kandinsky team. For simplicity the original PR was continued here. Thanks a mille for the contribution @cene555 🚨🚨🚨

Authors of this PR:
Arseniy Shakhmatov
Anton Razzhigaev
Aleksandr Nikolich
Igor Pavlov
Andrey Kuznetsov
Denis Dimitrov

finishing up #3903

To-do:

add tests for text2img, img2img, inpaint, prior
test controlnet + prior_emb2emb
add doc

import torch
import numpy as np

from diffusers import KandinskyV22PriorEmb2EmbPipeline, KandinskyV22ControlnetImg2ImgPipeline
from transformers import pipeline
from diffusers.utils import load_image

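def make_hint(image, depth_estimator):
    # Estimate a single-channel depth map for the input image.
    image = depth_estimator(image)['depth']
    image = np.array(image)
    # Repeat the depth channel three times: (H, W) -> (H, W, 3).
    image = image[:, :, None]
    image = np.concatenate([image, image, image], axis=2)
    # Scale to [0, 1] and move channels first: (H, W, 3) -> (3, H, W).
    detected_map = torch.from_numpy(image).float() / 255.0
    hint = detected_map.permute(2, 0, 1)
    return hint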

depth_estimator = pipeline('depth-estimation')

pipe_prior = KandinskyV22PriorEmb2EmbPipeline.from_pretrained('kandinsky-community/kandinsky-2-2-prior', torch_dtype=torch.float16)
pipe_prior = pipe_prior.to("cuda")

pipe = KandinskyV22ControlnetImg2ImgPipeline.from_pretrained('kandinsky-community/kandinsky-2-2-controlnet-depth', torch_dtype=torch.float16)
pipe = pipe.to("cuda")


img = load_image(
    "https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main"
    "/kandinsky/cat.png"
).resize((768, 768))


hint = make_hint(img, depth_estimator).unsqueeze(0).half().to('cuda')

prompt = 'A robot, 4k photo'
negative_prior_prompt = 'lowres, text, error, cropped, worst quality, low quality, jpeg artifacts, ugly, duplicate, morbid, mutilated, out of frame, extra fingers, mutated hands, poorly drawn hands, poorly drawn face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, fused fingers, too many fingers, long neck, username, watermark, signature'

generator = torch.Generator(device='cuda').manual_seed(43)

# run prior pipeline  

img_emb = pipe_prior(prompt=prompt, image=img, strength=0.85, generator=generator)
negative_emb = pipe_prior(prompt=negative_prior_prompt, image=img, strength=1, generator=generator)

# run controlnet img2img pipeline
images = pipe(
    image=img,
    strength=0.5,
    image_embeds=img_emb.image_embeds,
    negative_image_embeds=negative_emb.image_embeds,
    hint=hint,
    num_inference_steps=50,
    generator=generator,
    height=768,
    width=768,
).images

images[0].save("robot_cat.png")

HuggingFaceDocBuilderDev · 2023-07-04T04:41:24Z

The documentation is not available anymore as the PR was closed or merged.

src/diffusers/schedulers/scheduling_unclip.py

tests/pipelines/kandinsky_v22/test_kandinsky.py

yiyixuxu · 2023-07-06T07:00:21Z

@patrickvonplaten

I finished all of your to-dos:

Remove the _decoder suffix from all file names to shorten them
Rename self.vae to self.movq, since a MoVQ is used here again
Add "Copied from" statements wherever it makes sense, and give get_new_h_w a better name, as explained above
Give Kandinsky 2.2 its own section in the docs
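
For readers unfamiliar with the helper mentioned in the list above, the following is a minimal sketch of the kind of rounding a function like get_new_h_w performs: each side is rounded up to a multiple of scale_factor**2 and then divided by scale_factor. The function body, the default scale_factor=8, and the name latent_height_and_width are assumptions made for illustration; none of them is taken from this PR.

```python
def latent_height_and_width(height, width, scale_factor=8):
    """Round each side up to a multiple of scale_factor**2, then divide by scale_factor.

    Hypothetical helper for illustration only; not the code from this PR.
    """
    block = scale_factor ** 2
    # -(-a // b) is ceiling division for positive integers.
    new_height = -(-height // block)
    new_width = -(-width // block)
    return new_height * scale_factor, new_width * scale_factor


print(latent_height_and_width(768, 768))  # (96, 96)
print(latent_height_and_width(770, 500))  # (104, 64)
```

Under these assumptions, a size that is already a multiple of 64 maps to exactly one eighth of itself, while any other size is rounded up first.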

I will send a PR to the repo to change vae -> movq, and then maybe update the model cards.
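
The vae -> movq rename on the repo side could look roughly like the sketch below. It assumes the pipeline's model_index.json maps each component name to a [library, class] pair; the rename_component helper and the sample index contents are hypothetical, and a real change would presumably also need the matching weights subfolder renamed.

```python
import json


def rename_component(index, old, new):
    """Return a copy of a model_index-style dict with one component key renamed."""
    if old not in index:
        raise KeyError(f"component {old!r} not found")
    if new in index:
        raise ValueError(f"component {new!r} already exists")
    # Rebuild the dict so the renamed key keeps its original position.
    return {(new if key == old else key): value for key, value in index.items()}


# Hypothetical sample contents, for illustration only.
sample_index = {
    "_class_name": "KandinskyV22Pipeline",
    "scheduler": ["diffusers", "DDPMScheduler"],
    "unet": ["diffusers", "UNet2DConditionModel"],
    "vae": ["diffusers", "VQModel"],
}

updated = rename_component(sample_index, "vae", "movq")
print(json.dumps(updated, indent=2))
```

The helper returns a new dict instead of mutating its input, so the original index stays available for comparison.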

docs/source/en/api/pipelines/kandinsky.mdx

pcuenca

Very cool!

docs/source/en/api/pipelines/kandinsky.mdx

src/diffusers/pipelines/versatile_diffusion/modeling_text_unet.py

Lime-Cakes · 2023-07-07T06:44:48Z

Thanks for the great work! It seems, though, that the model weights for v2.2 haven't been released yet? "kandinsky-community/kandinsky-2-2-controlnet-depth" can't be found, so the examples can't be run at the moment.

patrickvonplaten · 2023-07-12T19:24:51Z

See: https://huggingface.co/docs/diffusers/v0.18.2/en/api/pipelines/kandinsky#kandinsky-22 - it was open-sourced today!

Lime-Cakes · 2023-07-12T19:38:14Z

> See: https://huggingface.co/docs/diffusers/v0.18.2/en/api/pipelines/kandinsky#kandinsky-22 - it was open-sourced today!

Thanks! I see it now! Looks great.

* Kandinsky2_2
* fix init kandinsky2_2
* kandinsky2_2 fix inpainting
* rename pipelines: remove decoder + 2_2 -> V22
* Update scheduling_unclip.py
* remove text_encoder and tokenizer arguments from doc string
* add test for text2img
* add tests for text2img & img2img
* fix
* add test for inpaint
* add prior tests
* style
* copies
* add controlnet test
* style
* add a test for controlnet_img2img
* update prior_emb2emb api to accept image_embedding or image
* add a test for prior_emb2emb
* style
* remove try except
* example
* fix
* add doc string examples to all kandinsky pipelines
* style
* update doc
* style
* add a top about 2.2
* Apply suggestions from code review
  Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* vae -> movq
* vae -> movq
* style
* fix the #copied from
* remove decoder from file name
* update doc: add a section for kandinsky 2.2
* fix
* fix-copies
* add coped from
* add copies from for prior
* add copies from for prior emb2emb
* copy from for img2img
* copied from for inpaint
* more copied from
* more copies from
* more copies
* remove the yiyi comments
* Apply suggestions from code review
* Self-contained example, pipeline order
* Import prior output instead of redefining.
* Style
* Make VQModel compatible with model offload.
* Fix copies

---------

Co-authored-by: Shahmatov Arseniy <62886550+cene555@users.noreply.github.com>
Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

cene555 and others added 14 commits June 29, 2023 23:20

Kandinsky2_2

c40e286

fix init kandinsky2_2

c082fdd

kandinsky2_2 fix inpainting

392cff0

rename pipelines: remove decoder + 2_2 -> V22

5296322

Update scheduling_unclip.py

8e6134d

remove text_encoder and tokenizer arguments from doc string

5834a82

add test for text2img

62af41a

add tests for text2img & img2img

0603e5a

fix

64b95f4

add test for inpaint

80df9c0

add prior tests

82d76df

style

f72b53d

copies

8d05dbf

Merge remote-tracking branch 'ru-diffusers/main' into kandinsky22-yiyi

bbe07ba

yiyixuxu added 14 commits July 4, 2023 14:55

add controlnet test

374f237

style

365fac5

add a test for controlnet_img2img

cec9160

update prior_emb2emb api to accept image_embedding or image

a27b520

add a test for prior_emb2emb

b4189a1

style

4a5c6ac

remove try except

8fc24e6

example

1480cdc

fix

ce7ea47

add doc string examples to all kandinsky pipelines

935614d

style

4c8c3ca

update doc

e737939

style

883a852

add a top about 2.2

dda70da

patrickvonplaten reviewed Jul 5, 2023

src/diffusers/schedulers/scheduling_unclip.py

update doc: add a section for kandinsky 2.2

30c0c9f

yiyixuxu commented Jul 6, 2023

tests/pipelines/kandinsky_v22/test_kandinsky.py (outdated)

yiyixuxu added 10 commits July 6, 2023 05:30

fix

307de02

fix-copies

80d85d5

add coped from

453fed2

add copies from for prior

6959e60

add copies from for prior emb2emb

7bfe3e7

copy from for img2img

81c5c77

copied from for inpaint

39a49db

more copied from

9586192

more copies from

16440d8

more copies

145ef68

remove the yiyi comments

ff1a204

pcuenca reviewed Jul 6, 2023

docs/source/en/api/pipelines/kandinsky.mdx (outdated)

pcuenca approved these changes Jul 6, 2023

pcuenca added 7 commits July 6, 2023 13:28

Apply suggestions from code review

060488e

Self-contained example, pipeline order

fb3d0bb

Import prior output instead of redefining.

6d5e70d

Style

6a06ed4

Make VQModel compatible with model offload.

00c4981

Merge remote-tracking branch 'origin/main' into kandinsky22-yiyi

1c8fdd9

Fix copies

63e7795

patrickvonplaten merged commit 7462156 into main Jul 6, 2023

patrickvonplaten deleted the kandinsky22-yiyi branch July 6, 2023 13:17
