unCLIP image variation #1781
Conversation
The documentation is not available anymore as the PR was closed or merged.
Force-pushed 241d482 to 54176fc
Force-pushed 32a14e2 to 12833eb
Looks good to me, thanks a lot for adding the pipeline! Would be nice to add `# Copied from ...` comments wherever possible.
Awesome!
@@ -0,0 +1,454 @@
# Copyright 2022 Kakao Brain and The HuggingFace Team. All rights reserved.
Is this so? Or is it just Hugging Face for the code? Just wondering, no idea how those things work!
This is the licensing @patrickvonplaten added to the text to image pipeline. We should probably clarify with him :)
Think it's fine to mention Kakao Brain, since we use their code as a reference when implementing it here.
import argparse

from diffusers import UnCLIPImageVariationPipeline, UnCLIPPipeline
from transformers import CLIPImageProcessor, CLIPVisionModelWithProjection


if __name__ == "__main__":
    parser = argparse.ArgumentParser()

    parser.add_argument("--dump_path", default=None, type=str, required=True, help="Path to the output model.")

    parser.add_argument(
        "--txt2img_unclip",
        default="kakaobrain/karlo-v1-alpha",
        type=str,
        required=False,
        help="The pretrained txt2img unclip.",
    )

    args = parser.parse_args()

    txt2img = UnCLIPPipeline.from_pretrained(args.txt2img_unclip)

    feature_extractor = CLIPImageProcessor()
    image_encoder = CLIPVisionModelWithProjection.from_pretrained("openai/clip-vit-large-patch14")

    img2img = UnCLIPImageVariationPipeline(
        decoder=txt2img.decoder,
        text_encoder=txt2img.text_encoder,
        tokenizer=txt2img.tokenizer,
        text_proj=txt2img.text_proj,
        feature_extractor=feature_extractor,
This is very informative, but I'm not sure we store this kind of script in the repo. The ones in the folder are usually about converting weights from other checkpoints. What do you think @patil-suraj?
Happy to remove and just put in the PR description! lmk @patil-suraj
Yeah, think no need to have this script.
This is better than not having a script at all; I think it's totally fine to leave it here as is. The main purpose of the scripts is really so that the user can convert the checkpoints themselves. Converting directly from the original checkpoint would be better, but for me this is ok as well and definitely better than not having anything.
LGTM!
Also, think it would be nice to add a doc page explaining the unCLIP pipelines. It's the first cascaded pipeline in diffusers, so would be nice to document the different components and how they work.
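As a small illustration of one of those components: the `feature_extractor` in the conversion script above is a default-constructed `CLIPImageProcessor`, which is the first stage of the image-variation cascade. It resizes, center-crops, and normalizes the conditioning image before it reaches the CLIP image encoder. A minimal sketch (the dummy image is illustrative; the printed shape assumes the processor's default 224x224 settings):

```python
from PIL import Image
from transformers import CLIPImageProcessor

# Same default-constructed processor as in the conversion script above.
feature_extractor = CLIPImageProcessor()

# Dummy conditioning image; any RGB size works, the processor handles resizing.
image = Image.new("RGB", (512, 384), color=(128, 64, 32))

inputs = feature_extractor(images=image, return_tensors="np")
print(inputs["pixel_values"].shape)  # (1, 3, 224, 224)
```

The resulting `pixel_values` tensor is what `CLIPVisionModelWithProjection` consumes to produce the image embedding that conditions the decoder.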
Nice looks good to me!
* unCLIP image variation
* remove prior comment re: @pcuenca
* stable diffusion -> unCLIP re: @pcuenca
* add copy froms re: @patil-suraj
Adds an unCLIP image variation pipeline.
Converting the text-to-image pipeline to image variation
I uploaded the pipeline to https://huggingface.co/fusing/karlo-image-variations-diffusers if you want to skip this step.
From the diffusers root directory:
Using the model