Fooocus 2.1.0 Image Prompts (Midjourney Image Prompts) #557
Replies: 79 comments 118 replies
-
Awesome!
-
Using more images leads to worse result quality.
-
Wow ... Fooocus is getting better and better.
-
This looks like an incredible update! Could someone help me use it, please? When I try to put an image in the image prompt and render, I just get an error message:
EDIT: It works fine if I select "PyraCanny" or "CPDS", but if I try to use "Image Prompt" it gives me that ValueError.
-
Hard to say ... for me it works OK.
-
2.1.19: PyraCanny improved a bit
-
@lllyasviel the new release is awesome. Is there any way to do style blending like this? Or like what is done here: https://www.tensorflow.org/tutorials/generative/style_transfer I have tried the latest image prompt but couldn't get similar results.
-
fixed some errors in CPDS in 2.1.24
-
The new features are so cool. The "Image Prompt" one seems a bit like "Revision". I'm enjoying the pose control we get from "PyraCanny" too! Very nice! Thank you for the sweet new features!
-
2.1.24:
-
Hint: you do not need to turn off "Fooocus V2" in most cases. "Fooocus V2" is handled in a different way than text prompts. You do not need to worry about unwanted text being added to your prompts.
-
Hey guys! I am playing around with Fooocus and have tried everything I can think of. I also didn't find any video or thread that could help me. Maybe someone else has already solved the problem or has some ideas on how to get this done. I have images of flat T-shirts/hoodies that are not worn by a person. I want to add them, with the brand logo/name, to a generated person. I have already tried every weight, the inpaint variants, and mixing img2img + inpaint. Sadly nothing worked correctly. The best results come when I use the inpaint variant and inpaint the whole picture (it would be much better if there were an option to use images without a background, but that doesn't work at all because you have to place at least one dot somewhere to start inpainting). So it's pretty hard to cover all the relevant parts of the image with inpaint, and it also takes a lot of time. I hope someone can help me with that :) Kind regards
-
(I hope this is the right thread; otherwise please direct me to the right discussion.)
-
Hello everyone, I hope you are having a nice day. How do I use a custom .safetensor model when rendering an image from an image prompt? Where do I put the .safetensor model? I got the example from civitai.com and I'm not able to make sense of it. An example of the prompt would be: "prompt, prompt, prompt lora:model_v01:0.7"
-
Hello Fooocus community, I hope you are well. Could someone assist in figuring out how to get the engine to render images with the full body of the subject visible, as opposed to the shoulder-crop framing it seems to default to? I've tried declaring "long shot" and "full body" and putting "close up, medium shot, shoulder crop" in the negative prompt, but the only thing that seems to make a difference is changing the resolution to 768x1280, which affects the composition and the pose.
-
Does the image prompt not support webp images?
-
Is there a way to "move a person to another location"? For example, I give an image prompt of a person in a car, and the text prompt is something like "make the person a teacher in a classroom". Every time I try, it produces a different person.
-
The first picture is a single-person photo and the second picture is a multi-person photo. How do I replace a particular face in the second picture with the face from the first picture?
-
Cool, love it. I want to apply to join the Fooocus team. What should I do to apply?
-
Hello everyone. I hope you are well. I am generating images of models using various prompts for what I want, together with a face-swap image prompt. It's quite important that the face is exactly the same as the one I've supplied, but it seems that all the AI can do is a close approximation. Does anyone have any ideas how to overcome this?
-
Connection problem: Hello everyone, I'm using Fooocus and after a couple of hours of using it, it disconnects and doesn't allow me to open the platform for around twelve hours. I'm accessing it through the Fooocus Colab. Can someone help me? Thank you.
-
Wow, what are you doing to me? Never again ComfyUI or Stable. It's crazy what results you get with Fooocus. Please continue with the development. Thank you, thank you!
-
Can you develop this cool AI to run on our local computers? Google Colab's free-tier GPU units run out too quickly. Please help me.
-
overwhelming!
-
Not sure why, but face swap never worked for me. Any suggestions?
-
Fooocus 2.1.0 completes the implementation of image prompts. Because almost all features of Midjourney are included after this release, the version number jumps directly to 2.1.0.
Image Prompt is one of the most important features of Midjourney. Below is the banner from Midjourney:
In Fooocus, it looks like this:
Technically, this feature is based on a mixture of IP-Adapter, a pre-computed negative embedding from the Fooocus team, an attention-hacking algorithm from the Fooocus team, and an adaptive balancing/weighting algorithm from the Fooocus team.
The motivation for these efforts is to match the Midjourney Image Prompt as closely as possible. In other software such as A1111/ComfyUI/InvokeAI, IP-Adapter still has some open problems, such as ignoring text prompts or producing over-burned results when multiple images are used. These problems are solved in Fooocus, so users can enjoy a Midjourney-like Image Prompt experience.
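The exact balancing/weighting algorithm is not spelled out in this post, so the snippet below is only a rough sketch of the general idea: combining several image-prompt embeddings so that the total conditioning strength stays comparable to the single-image case instead of growing with the number of images (which is one plausible reason for the "over-burned" look). The function name and embedding shapes are hypothetical and this is not the Fooocus implementation.

```python
import numpy as np

def combine_image_prompt_embeddings(embeddings):
    """Illustrative only: average several image-prompt embeddings and rescale
    so the combined conditioning strength stays close to the single-image case."""
    stacked = np.stack(embeddings)      # (n_images, tokens, dim)
    mean = stacked.mean(axis=0)         # plain averaging washes out contrast
    # For roughly independent embeddings, averaging shrinks the variance by 1/n,
    # so scaling the mean by sqrt(n) restores it to the single-image level.
    return mean * np.sqrt(len(embeddings))

# Hypothetical usage with three fake CLIP-image embeddings.
embs = [np.random.randn(257, 1280).astype(np.float32) for _ in range(3)]
cond = combine_image_prompt_embeddings(embs)
print(cond.shape)  # (257, 1280)
```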
The detailed differences are in the table below:
Using this method will download 2.5 GB of files the first time!
Example: Single Image Prompt without Text Prompts
(Non-cherrypicked random batch, default parameters, real results should be better if tuned)
(seed 1234, here is the image)
(this example uses default style and Fooocus V2 style)
Example: Single Image Prompt with Text Prompts
Note that mixing text and IP-Adapter is extremely difficult in ComfyUI/A1111. Fooocus does not have this problem.
(Non-cherrypicked random batch, default parameters, real results should be better if tuned)
(this example uses default style and Fooocus V2 style)
Example: Multiple Images without Text Prompts
Note that mixing multiple IP-Adapters is likely to cause lower result quality in ComfyUI/A1111. Using Fooocus can resolve this to some extent.
(Non-cherrypicked random batch, default parameters, real results should be better if tuned)
(this example uses default style and Fooocus V2 style)
Example: Multiple Images with Text Prompts and Even Multiple Styles
This is almost impossible in A1111/ComfyUI, since mixing text and IP-Adapter is extremely difficult there, and mixing multiple IP-Adapters is likely to lower result quality.
(Non-cherrypicked random batch, default parameters, real results should be better if tuned)
This image is too complicated to understand at a glance, so I annotated it here:
So mixing too many things makes them harder to recognize, but everything is there, and it does not fail or cause a quality decrease, unlike ComfyUI/A1111/InvokeAI.
Fooocus Image Prompt (Advanced)
If you check “advanced”, you will be able to use two structure controls:
PyraCanny: A pyramid-based Canny edge control. SDXL uses 1024px images, and at such a high resolution standard Canny tends to miss some image details from time to time. This method detects Canny edges at multiple resolutions and then combines them softly, so that more structures are captured than with plain Canny (see the sketch after this list). The pyramid part is from "Edge Drawing: A combined real-time edge and segment detector". You will download a 350 MB control model when using it.
CPDS: A structure extraction algorithm from "Contrast Preserving Decolorization (CPD)"; "CPDS" means CPD Structure. The control model was modified by the Fooocus team, starting from SAI's depth control-lora. The reason for using this method is its fast speed and download-free preprocessor. Note that only the structure part of images is used; it is not really "decolorization". You will download a 350 MB control model when using it.
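The actual PyraCanny preprocessor follows the Edge Drawing paper cited above, so the following is only a minimal OpenCV sketch of the multi-resolution idea: run Canny at several scales and softly blend the upsampled edge maps so that structures missed at full resolution still show up. The function name, scale factors, thresholds, and file names are hypothetical, not the Fooocus code.

```python
import cv2
import numpy as np

def pyramid_canny(image_bgr, scales=(1.0, 0.75, 0.5, 0.25), low=100, high=200):
    """Illustrative only: Canny at several resolutions, softly averaged."""
    h, w = image_bgr.shape[:2]
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    acc = np.zeros((h, w), dtype=np.float32)
    for s in scales:
        small = cv2.resize(gray, (max(1, int(w * s)), max(1, int(h * s))),
                           interpolation=cv2.INTER_AREA)
        edges = cv2.Canny(small, low, high)
        # Soft combination: upsample each binary edge map and average,
        # instead of a hard logical OR, so the result stays smooth.
        acc += cv2.resize(edges.astype(np.float32), (w, h),
                          interpolation=cv2.INTER_LINEAR)
    acc /= len(scales)
    return np.clip(acc, 0, 255).astype(np.uint8)

# Hypothetical usage: write a PyraCanny-like control map for a reference image.
edge_map = pyramid_canny(cv2.imread("reference.png"))
cv2.imwrite("edges.png", edge_map)
```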
(Non-cherrypicked random batch, default parameters, real results should be better if tuned)
(this example uses default style and Fooocus V2 style)
(this example uses default style and Fooocus V2 style)
(this example uses default style and Fooocus V2 style)
For developers:
In Developer Debug Mode, you can mix upscale/vary/inpaint with all of the above features if you know what you are doing and REALLY need it (the denoising strength can also be set in Developer mode). You can also get the preprocessor result by checking "debug preprocessor".
But keep in mind:
If you happen to get satisfying results in Fooocus after tuning a lot of advanced parameters, try copying your positive prompt, reopening Fooocus, changing nothing, and pasting the prompt. You will find that the results are even better, and that all those tweaks were unnecessary. (The only exception is probably changing the base model in "Advanced".)