Fooocus 2.1.0 Image Prompts (Midjourney Image Prompts) #557
Replies: 79 comments 118 replies
-
Awesome!
-
Using more images leads to worse result quality.
-
Wow ... Fooocus is getting better and better.
-
This looks like an incredible update! Could someone help me use it, please? When I try to put an image in the image prompt and render, I just get an error message:
EDIT: It works fine if I select "PyraCanny" or "CPDS", but if I try to use "Image Prompt" it gives me that ValueError.
-
Hard to say ... for me it works OK.
-
2.1.19: PyraCanny improved a bit
-
@lllyasviel the new release is awesome. Is there any way to do style blending like this? Or like what is done here: https://www.tensorflow.org/tutorials/generative/style_transfer I have tried the latest image prompt but couldn't get similar results.
-
fixed some errors in CPDS in 2.1.24
-
The new features are so cool. The "Image Prompt" one seems a bit like "Revision". I'm enjoying the pose control we get from "PyraCanny" too! Very nice! Thank you for the sweet new features!
-
2.1.24:
-
Hint: you do not need to turn off "Fooocus V2" in most cases. "Fooocus V2" is handled in a different way than text prompts. You do not need to worry about unwanted text being added to your prompts.
-
Hey guys! I am playing around with Fooocus and have tried everything I can think of. I also didn't find any video or thread that could help me. Maybe someone else has already solved the problem or has some ideas on how to get this done. I have images of flat T-shirts/hoodies that are not worn by a person. I want to add them, with the brand logo/name, to a generated person. I have already tried every weight, the inpaint variants, and mixing img2img + inpaint. Sadly nothing worked correctly. The best results come when I use the inpaint variant and inpaint the whole picture (it would be much better if there were an option to use images without a background, but that doesn't work at all because you have to place at least one dot somewhere to start inpainting). So it's pretty hard to cover all the relevant parts of the image with inpaint, and it also takes a lot of time. I hope someone can help me with that :) Kind regards
-
(I hope this is the right thread; otherwise please direct me to the right discussion.)
-
Hello everyone, I hope you are having a nice day. How do I use a custom .safetensor model when rendering an image from an image prompt? Where do I put the .safetensor model? I got the example from civitai.com and I'm not able to make sense of it. An example of the prompt would be: "prompt, prompt, prompt lora:model_v01:0.7"
-
Hello Fooocus community, I hope you are well. Could someone assist in figuring out how to get the engine to render images with the full body of the subject visible, as opposed to the shoulder-crop framing it seems to default to? I've tried declaring "long shot" and "full body" and putting "close up, medium shot, shoulder crop" in the negative prompt, but the only thing that seems to make a difference is changing the resolution to 768x1280, which affects the composition and the pose.
-
Does the image prompt not support webp images?
-
Is there a way to "move a person to another location"? For example, I give an image prompt of a person in a car, and the text prompt is something like "make the person a teacher in a classroom". Every time I try, it produces a different person.
-
The first picture is a single-person photo and the second picture is a multi-person photo. How do I replace a particular face in the second picture with the face from the first picture?
-
Cool, love it. I want to apply to join the Fooocus team. What should I do to apply?
-
Hello everyone. I hope you are well. I am generating images of models using various prompts for what I want, together with a face-swap image prompt. It's quite important that the face is exactly the same as the one I've supplied, but it seems that all the AI can do is a close approximation. Does anyone have any ideas how to overcome this?
-
Connection problem: Hello everyone, I'm using Fooocus and after a couple of hours of using it, it disconnects and doesn't allow me to open the platform for around twelve hours. I'm accessing it through the Fooocus Colab. Can someone help me? Thank you.
-
Wow, what are you doing to me? Never again ComfyUI or Stable. It's crazy what results you get with Fooocus. Please continue with the development. Thank you, thank you!
-
Can you develop this cool AI to run on our local computers? Google Colab's free-tier GPU units run out too quickly. Please help me.
-
overwhelming!
-
Not sure why, but face swap never worked for me. Any suggestions?
-
Fooocus 2.1.0 completes the implementation of image prompts. Because almost all features of Midjourney are included after this release, the version number jumps directly to 2.1.0.
Image Prompt is one of the most important features of Midjourney. Below is the banner from Midjourney:
In Fooocus, it looks like this:
Technically, this feature is based on a mixture of IP-Adapter, a pre-computed negative embedding from the Fooocus team, an attention-hacking algorithm from the Fooocus team, and an adaptive balancing/weighting algorithm from the Fooocus team.
The motivation for these efforts is to match the Midjourney Image Prompt as closely as possible. In other software such as A1111/ComfyUI/InvokeAI, IP-Adapter still has some open problems, such as ignoring text prompts or producing over-burned results when multiple images are used. These problems are solved in Fooocus, so users can enjoy a Midjourney-like Image Prompt experience.
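The exact balancing/weighting algorithm is not spelled out in this post, so the snippet below is only a rough sketch of the general idea: combining several image-prompt embeddings so that the total conditioning strength stays comparable to the single-image case instead of growing with the number of images (which is one plausible reason for the "over-burned" look). The function name and embedding shapes are hypothetical and this is not the Fooocus implementation.

```python
import numpy as np

def combine_image_prompt_embeddings(embeddings):
    """Illustrative only: average several image-prompt embeddings and rescale
    so the combined conditioning strength stays close to the single-image case."""
    stacked = np.stack(embeddings)      # (n_images, tokens, dim)
    mean = stacked.mean(axis=0)         # plain averaging washes out contrast
    # For roughly independent embeddings, averaging shrinks the variance by 1/n,
    # so scaling the mean by sqrt(n) restores it to the single-image level.
    return mean * np.sqrt(len(embeddings))

# Hypothetical usage with three fake CLIP-image embeddings.
embs = [np.random.randn(257, 1280).astype(np.float32) for _ in range(3)]
cond = combine_image_prompt_embeddings(embs)
print(cond.shape)  # (257, 1280)
```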
The detailed differences are in the table below:
Using this method will download 2.5 GB of files the first time!
Example: Single Image Prompt without Text Prompts
(Non-cherrypicked random batch, default parameters, real results should be better if tuned)
(seed 1234, here is the image)
(this example uses default style and Fooocus V2 style)
Example: Single Image Prompt with Text Prompts
Note that mixing text and IP-Adapter is extremely difficult in ComfyUI/A1111. Fooocus does not have this problem.
(Non-cherrypicked random batch, default parameters, real results should be better if tuned)
(this example uses default style and Fooocus V2 style)
Example: Multiple Images without Text Prompts
Note that mixing multiple IP-Adapters is likely to cause lower result quality in ComfyUI/A1111. Using Fooocus can resolve this to some extent.
(Non-cherrypicked random batch, default parameters, real results should be better if tuned)
(this example uses default style and Fooocus V2 style)
Example: Multiple Images with Text Prompts and Even Multiple Styles
This is almost impossible in A1111/ComfyUI, since mixing text and IP-Adapter is extremely difficult there, and mixing multiple IP-Adapters is likely to lower result quality.
(Non-cherrypicked random batch, default parameters, real results should be better if tuned)
This image is too complicated to understand at a glance, so I annotated it here:
So mixing too many things makes them harder to recognize, but everything is there, and it does not fail or cause a quality decrease, unlike ComfyUI/A1111/InvokeAI.
Fooocus Image Prompt (Advanced)
If you check “advanced”, you will be able to use two structure controls:
PyraCanny: A pyramid-based Canny edge control. SDXL uses 1024px images, and at such a high resolution standard Canny tends to miss some image details from time to time. This method detects Canny edges at multiple resolutions and then combines them softly, so that more structures are captured than with plain Canny (see the sketch after this list). The pyramid part is from "Edge Drawing: A combined real-time edge and segment detector". You will download a 350 MB control model when using it.
CPDS: A structure extraction algorithm from "Contrast Preserving Decolorization (CPD)"; "CPDS" means CPD Structure. The control model was modified by the Fooocus team, starting from SAI's depth control-lora. The reason for using this method is its fast speed and download-free preprocessor. Note that only the structure part of images is used; it is not really "decolorization". You will download a 350 MB control model when using it.
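The actual PyraCanny preprocessor follows the Edge Drawing paper cited above, so the following is only a minimal OpenCV sketch of the multi-resolution idea: run Canny at several scales and softly blend the upsampled edge maps so that structures missed at full resolution still show up. The function name, scale factors, thresholds, and file names are hypothetical, not the Fooocus code.

```python
import cv2
import numpy as np

def pyramid_canny(image_bgr, scales=(1.0, 0.75, 0.5, 0.25), low=100, high=200):
    """Illustrative only: Canny at several resolutions, softly averaged."""
    h, w = image_bgr.shape[:2]
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    acc = np.zeros((h, w), dtype=np.float32)
    for s in scales:
        small = cv2.resize(gray, (max(1, int(w * s)), max(1, int(h * s))),
                           interpolation=cv2.INTER_AREA)
        edges = cv2.Canny(small, low, high)
        # Soft combination: upsample each binary edge map and average,
        # instead of a hard logical OR, so the result stays smooth.
        acc += cv2.resize(edges.astype(np.float32), (w, h),
                          interpolation=cv2.INTER_LINEAR)
    acc /= len(scales)
    return np.clip(acc, 0, 255).astype(np.uint8)

# Hypothetical usage: write a PyraCanny-like control map for a reference image.
edge_map = pyramid_canny(cv2.imread("reference.png"))
cv2.imwrite("edges.png", edge_map)
```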
(Non-cherrypicked random batch, default parameters, real results should be better if tuned)
(this example uses default style and Fooocus V2 style)
(this example uses default style and Fooocus V2 style)
(this example uses default style and Fooocus V2 style)
For developers:
In Developer Debug Mode, you can mix upscale/vary/inpaint with all of the above features if you know what you are doing and REALLY need it (the denoising strength can also be set in Developer mode). You can also get the preprocessor result by checking "debug preprocessor".
But keep in mind:
If you happen to get satisfying results in Fooocus after tuning a lot of advanced parameters, try copying your positive prompt, reopening Fooocus, changing nothing, and pasting the prompt. You will find that the results are even better, and that all those tweaks were unnecessary. (The only exception is probably changing the base model in "Advanced".)