
😵‍💫 Face Models Comparison and Suggestions #195

Closed
cubiq opened this issue Jan 3, 2024 · 187 comments
Labels
documentation Improvements or additions to documentation


@cubiq (Owner) commented Jan 3, 2024

⚠️ Preliminary Data ⚠️

Face Models Comparison

I started collecting data about all the face models available for IPAdapter. I'm generating thousands of images and scoring them with a face descriptor model: the descriptor of each generated face is compared against the descriptor of the original reference image. A value of 0 means 100% the same person, 1.0 completely different.
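For reference, a minimal sketch of this kind of scoring, assuming the dlib-based face_recognition package (the descriptor and distance actually used for the tables are described under Methodology below):

# Sketch: score a generated image against the reference face.
# Assumes the `face_recognition` package (dlib descriptors); the actual
# pipeline may differ, see Methodology.
import face_recognition
import numpy as np

def face_distance(reference_path: str, generated_path: str) -> float:
    """Roughly 0.0 = same person, 1.0 = completely different."""
    ref = face_recognition.face_encodings(face_recognition.load_image_file(reference_path))
    gen = face_recognition.face_encodings(face_recognition.load_image_file(generated_path))
    if not ref or not gen:
        raise ValueError("no face detected in one of the images")
    # Euclidean distance between the 128-d descriptors of the first face found
    return float(np.linalg.norm(ref[0] - gen[0]))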

BIAS! Important: please read!

The comparison is meant only as an overall help in choosing the right models. These are just numbers; they do not represent actual image quality, let alone artistic value.

The face descriptor can be skewed by many factors, and a face that is actually very good could still get a poor score for a number of reasons (head position, a weird shadow, ...). Don't take the following data as gospel; you still need to experiment.

Additionally, the images are generated in a single pass of 30 steps. Better results could probably be achieved with a second pass and upscaling, but that would require a lot more time.

I think this data still has value, at least to remove the worst offenders from your tests.

Round 1: skim the data

The first step is to find the best performing checkpoints and IPAdapter face models (and face model combinations). With those established we can move to the second phase, which is running even more data concentrated on the best performers.

These are all the IPAdapter models that I've tested, in random order; the best performers are marked 🏆 and will go to the next round.

  • PlusFace
  • FullFace
  • FaceID
  • 🏆 FaceID + FullFace
  • FaceID + PlusFace
  • FaceID Plus
  • 🏆 FaceID Plus + FaceID
  • 🏆 FaceID Plus + PlusFace
  • FaceID Plus + FullFace
  • FaceID Plus v2 w=0.6
  • FaceID Plus v2 w=1
  • FaceID Plus v2 w=1.5
  • 🏆 FaceID Plus v2 w=2
  • 🏆 FaceID Plus v2 + PlusFace
  • 🏆 FaceID Plus v2 + FullFace
  • 🏆 FaceID Plus v2 + FaceID
  • 🏆 FaceID Plus v2 + FaceIDPlus

These are the checkpoints, in random order; the best performers are marked 🏆.

  • 🏆 Deliberate_v3
  • Reliberate
  • absolutereality_v181
  • dreamshaper_8
  • icbinpICantBelieveIts_seco
  • 🏆 realisticVisionV51_v51VAE
  • realisticVisionV6_B1
  • juggernaut_reborn
  • epicrealism_naturalSin
  • edgeOfRealism_eorV20Fp16BakedVAE
  • 🏆 cyberrealistic_v41BackToBasics
  • 🏆 lifeLikeDiffusionV30

Dreamshaper will be excluded from the photo-realistic models, but I will run it again against other "illustration"-style checkpoints.

The preliminary data is available in a google sheet: https://docs.google.com/spreadsheets/d/1NhOBZbSPmtBY9p52PRFsSYj76XDDc65QjcRIhb8vfIE/edit?usp=sharing

Round 2: Refining the data

In this phase I took the best performers from the previous round and ran more tests. The best results are marked 🏆.

  • 🏆 FaceIDPlusv2 + PlusFace
  • 🏆 FaceIDPlusv2 + FaceIDPlus
  • 🏆 FaceIDPlusv2 + FullFace
  • 🏆 FaceIDPlusv2 + FaceID
  • FaceIDPlusv2 w=2
  • FaceIDPlus + PlusFace
  • 🏆 FaceIDPlus + FaceID
  • FaceID + FullFace

Basically: more embeds, better results.

realisticVisionV51_v51VAE (NOT V6) is overall the best performer, but LifeLikeDiffusion often has the single best result; its average is not as good as Realistic Vision's, but sometimes you get that one result that is really good.

I tested both euclidean and 1-cosine, and the results are surprisingly the same.

Since it seems that more embeddings give better results, I'll also try sending multiple images of the same person to each model. I don't think it will help, but I'm happy to be proven wrong.

The data for round 2 can be found here: https://docs.google.com/spreadsheets/d/1Mi2Pu9T3Hqz3Liq9Fdgs953fOD1f0mieBWUI6AN-kok/edit?usp=sharing

Preliminary SDXL

Combinations tested:

  • SDXL FaceID PlusFace
  • SDXL FaceIDPlusV2 PlusFace
  • 🏆 SDXL FaceIDPlusV2 FaceID

At the moment the best models seem to be:

  • 🏆 Juggernaut XL
  • 🏆 Realism Engine
  • base SDXL
  • ColossusProject
  • Realistic Stock Photo
  • Protovision XL
  • 🏆 TurboVision XL

Predictably, V2+PlusFace is again the best performer. The best average is still .36.

Interestingly TurboVision XL performs very well.

Data: https://docs.google.com/spreadsheets/d/1hjiGB-QnKRYXTS6zTAuacRUfYUodUAdL6vZWTG4HZyc/edit?usp=sharing

Round 3: Testing multiple reference images

Processing...

Round 4: Higher resolution

Upscaling SD1.5 512×512 images is not advisable if you want to keep the likeness as high as possible. Even using low denoise and a high IPAdapter weight, the base checkpoints are simply not good enough to keep the resemblance.

In my tests I lose about .5 likeness after every upscale.

Fortunately you can still upscale SD1.5 models with SDXL FaceID + PlusFace (I used Juggernaut, which is the best performer in the SDXL round). The results are very good. LifeLikeDiffusion and RealisticVision5 are still the best performers.

The average is still around 0.35 (which is lower than I'd like), but sometimes you get very good results (0.27), so it's worth running a few seeds and trying different reference images.
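A minimal sketch of that seed sweep, where generate_image is a placeholder for whatever pipeline you use and face_distance is the scoring helper sketched earlier:

# Illustrative only: try a handful of seeds and keep the best-scoring image.
best_seed, best_score = None, float("inf")
for seed in range(8):
    path = generate_image(seed=seed)  # hypothetical generation call
    score = face_distance("reference.jpg", path)
    if score < best_score:  # lower = closer to the reference
        best_seed, best_score = seed, score
print(f"best seed: {best_seed} at {best_score:.2f}")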

Result data here: https://docs.google.com/spreadsheets/d/1uVWJOcDxaEjRks-Lz0DE9A3DCCFX2qsvdpKi3bCSE2c/edit?usp=sharing

Methodology

I tried many libraries for feature extraction/face detection. In the aggregated results the differences are relatively small, so at the moment I'm using Dlib and euclidean distance. I'm trying to keep the generated images as close as possible in color/position/contrast to the original to minimize skew.

I tried 1-cosine and the results don't differ much from what is presented here, so I take it that the data is pretty solid. I will keep testing and will update if there are any noticeable differences.

All primary embedding weights are set at .8; all secondary weights are set at .4.
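As a side note on why euclidean and 1-cosine agree: for L2-normalized embeddings, ||a - b||^2 = 2 * (1 - cos(a, b)), so the two metrics are monotonically related and rank results the same way. A small sketch of both:

# Both metrics on a pair of embedding vectors.
import numpy as np

def euclidean(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.linalg.norm(a - b))

def one_minus_cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))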

@cubiq cubiq added the documentation Improvements or additions to documentation label Jan 3, 2024
@cubiq cubiq pinned this issue Jan 3, 2024
@xiaohu2015

Which face descriptor did you use?

@cubiq (Owner) commented Jan 3, 2024

I tried a few... we could run an average maybe? Dlib, MTCNN, and RetinaFace are decent and pretty fast. InsightFace seems to be biased, since you trained with it.

@xiaohu2015

Is the metric "1 - cosine similarity"?
In fact, I used another insightface model (not the one used for training) to evaluate.

@cubiq (Owner) commented Jan 3, 2024

> Is the metric "1 - cosine similarity"? In fact, I used another insightface model (not the one used for training) to evaluate.

I tried both euclidean and 1-cos. The numbers are of course different but the result is more or less the same.

This is euc vs 1-cos. The final result doesn't change much.
image

Do you get vastly different results?

@xiaohu2015

> I tried both euclidean and 1-cos. The numbers are of course different but the result is more or less the same. […]

FaceNet?

@cubiq (Owner) commented Jan 3, 2024

Yes, FaceNet. Again, I've tried a few options but the result seems more or less the same. FaceID Plus v2 at weight=2 is always at the top.

Interestingly, FaceIDPlus followed by a second pass with PlusFace or FullFace is also very effective. That makes me think there are more combinations that we haven't explored.

You seem very interested, I'm glad about that. Please feel free to share your experience/ideas if you want.

@xiaohu2015 commented Jan 3, 2024

Yes, I am very interested, because a good metric is important for developing a good model.

You are right; you can also try FaceID + FaceID Plus.

thresholds = {
    "VGG-Face": {"cosine": 0.40, "euclidean": 0.60, "euclidean_l2": 0.86},
    "Facenet": {"cosine": 0.40, "euclidean": 10, "euclidean_l2": 0.80},
    "Facenet512": {"cosine": 0.30, "euclidean": 23.56, "euclidean_l2": 1.04},
    "ArcFace": {"cosine": 0.68, "euclidean": 4.15, "euclidean_l2": 1.13},
    "Dlib": {"cosine": 0.07, "euclidean": 0.6, "euclidean_l2": 0.4},
    "SFace": {"cosine": 0.593, "euclidean": 10.734, "euclidean_l2": 1.055},
    "OpenFace": {"cosine": 0.10, "euclidean": 0.55, "euclidean_l2": 0.55},
    "DeepFace": {"cosine": 0.23, "euclidean": 64, "euclidean_l2": 0.64},
    "DeepID": {"cosine": 0.015, "euclidean": 45, "euclidean_l2": 0.17},
}
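For context, deepface treats these as decision thresholds: a pair counts as the same person when the distance is at or below the threshold for the chosen model and metric. A minimal sketch of that check, assuming the deepface package:

# Sketch: verify two images with deepface.
from deepface import DeepFace

result = DeepFace.verify(
    img1_path="reference.jpg",
    img2_path="generated.jpg",
    model_name="Facenet",
    distance_metric="cosine",
)
# `verified` is essentially: distance <= thresholds["Facenet"]["cosine"]
print(result["distance"], result["verified"])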

@cubiq (Owner) commented Jan 3, 2024

Is that the minimum threshold? You set it very high; almost only FaceID alone scores that low, at least in my testing.

@xiaohu2015

By the way, do you have any ideas or suggestions for improving the results? They might be helpful to me.

@xiaohu2015 commented Jan 3, 2024

> Is that the minimum threshold? You set it very high; almost only FaceID alone scores that low, at least in my testing.

Yes, from the deepface repo.

In fact, I found the face ID embedding is very powerful; I think I should find better training tricks.

@cubiq (Owner) commented Jan 4, 2024

I have tried FaceID Plus v2 + FaceID, and it generally outperforms everything else.

I also tried FaceID Plus v2 at weight=2.5; some checkpoints react well to it, but in general it's not a big difference.

@xiaohu2015

> I have tried FaceID Plus v2 + FaceID, and it generally outperforms everything else.
>
> I also tried FaceID Plus v2 at weight=2.5; some checkpoints react well to it, but in general it's not a big difference.

What do you think of this (multi-image)? https://twitter.com/multimodalart/status/1742575121057841468

@xiaohu2015

SDXL FaceID preview
sdxl_faceid

In my benchmark, the cos similarity is a little better than SD 1.5 FaceID.

@cubiq (Owner) commented Jan 4, 2024

> What do you think of this (multi-image)? https://twitter.com/multimodalart/status/1742575121057841468

I've seen people send multiple images trying to increase the likeness. I'm not convinced it actually works; there's a lot of bias in "face" recognition. I will run some tests; honestly, I think it's laziness. I was able to reach 0.27 likeness with a good combination of IPAdapter models at low resolution.

I think combining two IPAdapter models is more effective than sending multiple images to the same model, but I'll run some tests.

PS: looking forward to the SDXL model!

@cubiq (Owner) commented Jan 4, 2024

@xiaohu2015 do you already have the code for SDXL? So I can update it and we are ready at launch 😄

@xiaohu2015 commented Jan 4, 2024

> @xiaohu2015 do you already have the code for SDXL? So I can update it and we are ready at launch 😄

It's the same as SD 1.5 FaceID: face embedding + LoRA.

But I am not sure if the SDXL version is really better than the SD 1.5 version, because evaluation metrics are often unreliable.

@cubiq (Owner) commented Jan 4, 2024

Okay, I ran more tests: any combination of Plus v2 with any other model is definitely a winner.

These are all good:

  • FaceIDPlusv2 + PlusFace
  • FaceIDPlusv2 + FaceIDPlus
  • FaceIDPlusv2 + FullFace
  • FaceIDPlusv2 + FaceID

The only other NOT v2 combination that seems to be working well is FaceIDPlus+FaceID.

I'll update the first post when I have more data.

PS: I got a 0.26 today at low resolution! Looking forward to doing some high-resolution tests 😄

@xiaohu2015

I will update the SDXL model now; you can also test it.

@xiaohu2015

@cubiq update at https://huggingface.co/h94/IP-Adapter-FaceID#ip-adapter-faceid-sdxl

But you should convert the LoRA part.

@cubiq (Owner) commented Jan 4, 2024

Great, thanks!

I just updated the first post with new info. Data for round 2 is here: https://docs.google.com/spreadsheets/d/1Mi2Pu9T3Hqz3Liq9Fdgs953fOD1f0mieBWUI6AN-kok/edit?usp=sharing

I'll check SDXL later 😄 and run dedicated tests on it too.

@cubiq (Owner) commented Jan 5, 2024

I just had a look at the key structure of the SDXL LoRA and it's a darn mess 😄 Do you have a conversion mapping, @xiaohu2015?

@xiaohu2015 commented Jan 5, 2024

#145 (comment)

I think we can refer to this. You can find a normal SDXL LoRA weight, load it, and print its keys; then you can get diff2ckpt for SDXL.

In a future version, the LoRA should not be needed.
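A quick sketch of the "load it and print its keys" step (file names here are placeholders; use safetensors or torch.load depending on the file format):

# Print the key structure of a LoRA state dict to compare naming schemes.
import torch
from safetensors.torch import load_file

sd = load_file("some_sdxl_lora.safetensors")
# for .bin files: sd = torch.load("some_faceid_lora.bin", map_location="cpu")
for key, tensor in sd.items():
    print(key, tuple(tensor.shape))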

@cubiq (Owner) commented Jan 5, 2024

The structure is pretty different and I couldn't find a relationship at first sight, but I'll take a closer look later. I'm a bit busy this week; I might be able to work on it next Monday.

0.to_q_lora.down.weight
0.to_q_lora.up.weight
0.to_k_lora.down.weight
0.to_k_lora.up.weight
0.to_v_lora.down.weight
0.to_v_lora.up.weight
0.to_out_lora.down.weight
0.to_out_lora.up.weight
1.to_q_lora.down.weight
1.to_q_lora.up.weight
1.to_k_lora.down.weight
1.to_k_lora.up.weight
1.to_v_lora.down.weight
1.to_v_lora.up.weight
1.to_out_lora.down.weight
1.to_out_lora.up.weight
1.to_k_ip.weight
1.to_v_ip.weight
2.to_q_lora.down.weight
2.to_q_lora.up.weight
2.to_k_lora.down.weight
2.to_k_lora.up.weight
2.to_v_lora.down.weight
2.to_v_lora.up.weight
...
139.to_v_ip.weight

On SDXL

lora_unet_input_blocks_1_0_emb_layers_1.alpha
lora_unet_input_blocks_1_0_emb_layers_1.lora_down.weight
lora_unet_input_blocks_1_0_emb_layers_1.lora_up.weight
lora_unet_input_blocks_1_0_in_layers_2.alpha
lora_unet_input_blocks_1_0_in_layers_2.lora_down.weight
lora_unet_input_blocks_1_0_in_layers_2.lora_up.weight
lora_unet_input_blocks_1_0_out_layers_3.alpha
lora_unet_input_blocks_1_0_out_layers_3.lora_down.weight
lora_unet_input_blocks_1_0_out_layers_3.lora_up.weight
lora_unet_input_blocks_2_0_emb_layers_1.alpha
lora_unet_input_blocks_2_0_emb_layers_1.lora_down.weight
lora_unet_input_blocks_2_0_emb_layers_1.lora_up.weight
lora_unet_input_blocks_2_0_in_layers_2.alpha
lora_unet_input_blocks_2_0_in_layers_2.lora_down.weight
lora_unet_input_blocks_2_0_in_layers_2.lora_up.weight
lora_unet_input_blocks_2_0_out_layers_3.alpha
lora_unet_input_blocks_2_0_out_layers_3.lora_down.weight
lora_unet_input_blocks_2_0_out_layers_3.lora_up.weight
lora_unet_input_blocks_3_0_op.alpha
lora_unet_input_blocks_3_0_op.lora_down.weight
lora_unet_input_blocks_3_0_op.lora_up.weight
lora_unet_input_blocks_4_0_emb_layers_1.alpha
lora_unet_input_blocks_4_0_emb_layers_1.lora_down.weight
lora_unet_input_blocks_4_0_emb_layers_1.lora_up.weight
lora_unet_input_blocks_4_0_in_layers_2.alpha
lora_unet_input_blocks_4_0_in_layers_2.lora_down.weight
lora_unet_input_blocks_4_0_in_layers_2.lora_up.weight
lora_unet_input_blocks_4_0_out_layers_3.alpha
...
lora_unet_output_blocks_8_0_skip_connection.lora_up.weight

So it looks a little more complicated than that 😄
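One hypothetical way to eyeball the mismatch is to bucket each state dict's keys by their leading component and compare the profiles (sd_a and sd_b stand in for the two state dicts, loaded as sketched above):

# Illustrative helper: summarize a state dict's naming scheme.
from collections import Counter

def key_profile(state_dict: dict) -> Counter:
    # "0.to_q_lora.down.weight" -> "0"
    # "lora_unet_input_blocks_1_0_emb_layers_1.alpha" -> "lora_unet_input_blocks_1_0_emb_layers_1"
    return Counter(key.split(".")[0] for key in state_dict)

# print(key_profile(sd_a).most_common())
# print(key_profile(sd_b).most_common())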

@xiaohu2015

@laksjdjf can you help?

@xiaohu2015

> The structure is pretty different and I couldn't find a relationship at first sight. […]

OK, I will also upload a LoRA weight next week.

@cubiq (Owner) commented Jan 5, 2024

It seems to be working pretty well together with PlusFace, but the results are a bit random (either very good or very bad). I'll run some stats on that too.

ComfyUI_temp_lffkp_00011_

reference image:
theron

@ultimatech-cn

This is really great work!
I've heard a lot of people complain about likeness with double chins, big faces, glasses, etc. Is there any test for these, or some solution for these face shapes?

@cubiq (Owner) commented Mar 27, 2024

@xiaohu2015 that's very interesting. I tried it with IPAdapter and it kinda works, but only SDXL; any idea if it can be applied to SD15 too?

style_tr12
style_tr11
style_tr10

@StellarBeing25

Hey @xiaohu2015 when will you release the updated IPAdapter Plus Face model you mentioned earlier?

@CapsAdmin

> by the way @xiaohu2015 maybe you are interested in this. Bottom right is the reference. The other images are all generated with the same model (PLUS) at the same weight, but applying the weight differently to the unet blocks. It's pretty fascinating
> weight_types2

> yes, I think different blocks maybe control different contents. https://b-lora.github.io/B-LoRA/ found that the 4th controls content and the 5th controls style

It feels the same as model merging. I've experimented a lot with a modified version of IPAdapter that can change the weight for each layer.

My finding is that blocks around 1-6 affect composition, 6 to 8 have the most impact on the face or subject, 8-10 or so affect the skin, and the remaining ones affect detail; perhaps it's more of a gradient. The 7th layer has the highest impact.

The same goes for the LoRA weights.

SDXL contains a lot more layers, but it's kinda similar. If you reduce the range to be the same as SD15 (meaning one slider affects multiple layers), it's more or less the same.
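A hypothetical sketch of what such a per-block schedule could look like for SD1.5 (the groupings follow the finding above; block_weight and the multipliers are illustrative, and the actual attention patch that consumes the weights is not shown):

# Hypothetical per-block IPAdapter weight schedule based on the groupings above.
def block_weight(block_idx: int, base: float = 0.8) -> float:
    if block_idx <= 6:    # roughly composition
        return base * 0.3
    if block_idx <= 8:    # face / subject; the 7th has the highest impact
        return base * 1.0
    if block_idx <= 10:   # skin
        return base * 0.5
    return base * 0.2     # remaining blocks: fine detail

weights = {i: block_weight(i) for i in range(1, 17)}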

@xiaohu2015 commented Mar 28, 2024

download (1)
download (2)

Added SDXL FaceID Portrait; maybe you can test it.

@cubiq (Owner) commented Mar 28, 2024

SDXL portrait?

@xiaohu2015 commented Mar 28, 2024

> SDXL portrait?

Yes, it's the same as SD 1.5: https://huggingface.co/h94/IP-Adapter-FaceID/blob/main/ip-adapter-faceid-portrait_sdxl.bin. It works better with 5+ face images. No LoRA. For text style, maybe lower the weight.

@xiaohu2015

> Hey @xiaohu2015 when will you release the updated IPAdapter Plus Face model you mentioned earlier?

It has some limitations, hence I will not release it for now.

@cubiq (Owner) commented Mar 28, 2024

> Yes, it's the same as SD 1.5: https://huggingface.co/h94/IP-Adapter-FaceID/blob/main/ip-adapter-faceid-portrait_sdxl.bin. It works better with 5+ face images. No LoRA. For text style, maybe lower the weight.

It works very well; maybe I'll make a comparison with InstantID. For style you really need to lower the weight though. It depends a lot on the checkpoint and the kind of style you want.

@xiaohu2015

> It works very well; maybe I'll make a comparison with InstantID. […]

It should not be as good as InstantID, but it's lighter.

Repository owner deleted a comment from JorgeR81 Mar 28, 2024
Repository owner deleted a comment from JorgeR81 Mar 28, 2024
Repository owner deleted a comment from JorgeR81 Mar 28, 2024
Repository owner deleted a comment from julien-blanchon Mar 28, 2024
@cubiq (Owner) commented Mar 28, 2024

Please don't use this thread for chit-chat; open another one if you want.

@JorgeR81 commented Mar 28, 2024

@cubiq Sorry about that. I didn't mean to deviate from the conversation.

But you also deleted my first post, with a question about FaceID.

Will it be possible to use embeds for FaceID?

I can use a batch, but I'd like to set the weight for each image, like we do with embeds.

@cubiq (Owner) commented Mar 28, 2024

Yes, it's on my to-do list.

@JorgeR81 commented Apr 2, 2024

Using FaceID + InstantID together may not improve face likeness ...

But it gives interesting results when you're just generating new faces from a batch of random images.

test1

@mockinbirdy

IP-Adapter-FaceID-Portrait supports up to 5 images.

But currently there is no way to add a bunch of images and process them through Portrait SDXL in ComfyUI. Could that be implemented? Preferably up to 5 images.

@cubiq (Owner) commented Apr 3, 2024

You can send as many images as you want with an image batch node.

@xiaohu2015

@cubiq very amazing work! https://www.youtube.com/watch?v=b6TbdBJBI4Q&t=2s

@JorgeR81 commented Apr 3, 2024

#195 (comment)

> Bottom right is the reference. The other images are all generated with the same model (PLUS) at the same weight, but applying the weight differently to the unet blocks. It's pretty fascinating

Is this the [weight_type] option on the new node?

wt

@JorgeR81 commented Apr 5, 2024

I'm having great results with Juggernaut XL 9 lightning at 6 to 8 steps.
https://civitai.com/models/133005/juggernaut-xl

These are the same settings as in the post above:
Same 5 Image Batch + InstantID + FaceID

But now at higher resolution (896x1344 and 768x1152 instead of 512x768).
So I tried a lightning model for faster generations, but quality also seems to improve, at least with my current settings.

768 x 1152 image.jpg
896 x 1344 image.jpg
768 x 1152 ( with ControlNet ) image.jpg
768 x 1152 ( with ControlNet ) ( 6 steps vs 8 steps ) image.jpg
896 x 1344 ( with ControlNet ) ( 6 steps vs 8 steps ) image.jpg

@JorgeR81 commented Apr 5, 2024

Here is the same three-way comparison of FaceID + InstantID, with Juggernaut XL 9 lightning, at high resolution.
I'm not really testing face likeness; I just used a batch of random images from different characters.
I find the results more interesting when I use them together, but if I could use only one, I would prefer FaceID for this use case.

768 x 1152 ( 8 steps ) image.jpg
896 x 1344 ( 6 steps ) image.jpg

@mockinbirdy

@JorgeR81 hi, mind sharing your workflow .json, if possible?

@JorgeR81 commented Apr 7, 2024

I've made a simplified version of the workflow, without custom nodes from other suites, but it has all the nodes needed to create the images I shared.
You just need to load the reference images you like.

Let me know if it's working correctly.
InstantID_FaceID.json

It still has ControlNet preprocessor nodes, but you can delete them if you don't want to install them.
https://github.com/Fannovel16/comfyui_controlnet_aux

This is set for SDXL Lightning.
For other checkpoints, you can use 20 steps and CFG 6.5 or 7.

You can bypass FaceID, InstantID, or both (see workflow image).
FaceID seems to be better at blending different character styles, but I also use InstantID together with FaceID because sometimes I find the results more interesting.

If you want to use InstantID, you need to use an SDXL checkpoint.

The workflow is also in the preview image.

image.jpg

@JorgeR81 commented Apr 7, 2024

I've also made a version of this workflow without InstantID, and with IPAdapter embeds, after the FaceID nodes.

FaceID_IPA_embeds.json

The workflow is also in the preview image.


@JorgeR81 commented Apr 7, 2024

After this fix, I updated the workflows I just shared here.
Now ControlNet is used after InstantID ( not before ), because performance doesn't seem to be affected anymore.

cubiq/ComfyUI_InstantID#121

@cubiq (Owner) commented Apr 11, 2024

I'm closing this one and moving the discussion here

@cubiq cubiq closed this as completed Apr 11, 2024
@cubiq cubiq unpinned this issue Apr 11, 2024