Ideal settings for face PLUS body training on current version #1173

User1231300 · 2022-12-27T15:35:10Z

Hello,

First of all, thank you for what you are doing for free. This topic is not meant to be a complaint, and I hope that is clear.

I would like to open a thread and hopefully get good answers, from the community and from TheLastBen, about the ideal settings for training the model on a single subject with the best possible results:

Assuming we have access to unlimited images and unlimited time
Wanting to train on both face shots and face-plus-body shots of the subject. This can be achieved by including various angles and distances from the subject (showing more or less of the body in different images).

Thank you to everyone who will contribute.

iqddd · 2022-12-28T06:08:41Z

I have the same question. There are three kinds of images: face only, body only, and body and face. In most "body and face" cases, the body is cropped due to the 512px square limit. In rare cases, part of the face is cropped.
In most of the images, the body is standing upright,
but some of the images can be described as "lying on" or "posing".

TheLastBen · 2022-12-28T07:51:02Z

The ideal settings are to stay below 15 instance images and make sure they are diverse. You can reach good results in fewer than 800 steps, which takes less than 15 minutes, so you can comfortably try different settings until you get the desired result.

When you want to resume training, try reducing the learning rate slightly to concentrate on the small details of the picture.

tpcdaz · 2022-12-28T10:25:29Z

I personally tried the new settings, as I only use DreamBooth for faces, and the only way I get good results is by using the previous settings: 3000 steps for around 20+ photos, with a 2e-6 learning rate for both the UNet and the text encoder. Training with 10 / 15 / 20 images at 800 steps or fewer gives me questionable results, and although people say "just keep adding to the training", not many people can do that because Colab has limits. Even though I am a premium user, the 15-minute training suddenly takes an hour because of all the minor tweaks you have to make to get it looking any good. So if I were you, I would use 3000 steps for 20+ images and a 2e-6 learning rate for both the UNet and the text encoder. It takes around 45 minutes on a standard Colab GPU or 20 minutes on a premium Colab GPU, and it will look GREAT the first time with no tweaking.

juan9999 · 2022-12-29T20:10:06Z

Yeah, I had less than ideal results with the latest settings and have had to lower the learning rate.

I have premium Colab. Are you quoting total session time or just training time? I am cheap and trying to calculate the total time for the job using a premium GPU vs. not.

kozka · 2022-12-30T12:39:37Z

The fast_DreamBooth-Old-Method always worked for me the first time, and I haven't gotten that quality in my models since. :(
Now I have to try a thousand times to get something similar, and it still doesn't come out.

TheLastBen · 2022-12-30T17:39:09Z

@kozka set the learning rate to 2e-6 for both unet and text_enc and raise the unet steps to 3000; this is exactly like before.

LIQUIDMIND111 · 2022-12-31T00:18:13Z

Since they removed the OLD method, NONE of my face results have been good; only styles work.

I have no problems with styles, but for faces I have paid for 2 months of Colab Pro and have NEVER gotten a good ckpt file no matter what I do, and I have been using this since October.

ALL WAS GOOD with PRIOR images and the old method. Then, after the renaming of INSTANCE IMAGES was introduced, everything was OK if you followed the instructions. But now that the OLD page has been removed, this NEW page only works for me for styles, where the quality is NICE.

But for the past 2 weeks, all the models I make of a person look ugly, and it is very hard to get the settings right.

TheLastBen · 2022-12-31T12:20:13Z

Send me 10 of your instance images and I will train the model for you to prove that it works.

Ekaitza1985 · 2023-01-01T02:19:25Z

Hello,
@TheLastBen I tried 3000 steps with 2e-6 on both and 450 text encoder steps, and I get amazing results with my face. But if I write, for example, "a beautiful portrait of will smith", the result looks like Will Smith but with my complexion and some of my features. Another problem I found: if I write a prompt asking for an "illustration" or "digital painting", the model ignores it and produces a realistic photograph, and I don't know why. On the same installation, without even restarting Stable Diffusion, SD 2.1 768px gives me a drawing as I ask. I can train the model again if you want (I can share it via my Google Drive). It would be nice to find my mistake. I spent 1 week trying by myself without any results before asking here, and I don't know where I am going wrong.
Thanks in advance for your time and your work here.

TheLastBen · 2023-01-01T08:43:53Z

Mixing your face with other faces is a common issue with deep learning models, called overfitting.

If you want your face to be stylized as a painting, you need to reduce the text encoder steps to 250 and its learning rate to 1e-6.

Ekaitza1985 · 2023-01-01T10:45:35Z

@TheLastBen thanks for your tip. I will try it now and report the results.

Ekaitza1985 · 2023-01-01T16:07:39Z

Hello again @TheLastBen,
With your suggestions (3000 UNet steps at 2e-6, and 250 text encoder steps at 1e-6) I get better results on "portrait of will smith" and "portrait of my_token". But now,
if I prompt:
Will Smith, d & d, fantasy, intricate, elegant, highly detailed, digital painting, artstation, concept art, matte, sharp focus, illustration, hearthstone, art by artgerm and greg rutkowski and alphonse mucha, 8k.
With negative:
deformed, cripple, ugly, additional arms, additional legs, additional head, two heads, multiple people, group of people
Euler A @ 50 steps, 768px
I get amazing results!!
And if I prompt the same but with my token:
28101985KylarKyray, d & d, fantasy, intricate, elegant, highly detailed, digital painting, artstation, concept art, matte, sharp focus, illustration, hearthstone, art by artgerm and greg rutkowski and alphonse mucha, 8k.
and the same params,
I get awesome results, but of me as a woman.

It seems that the class "person" conflicts with the fact that I am male? Or is something wrong in the prompt? I can't understand why Will Smith is treated as a man by default, while for me I need to specify my token as a man, and so on.

Thank you so much for this help! I am so happy to have some light on this ^^!

LIQUIDMIND111 · 2023-01-01T16:43:51Z

@Ekaitza1985 did you use the new CAPTION OPTION and the regularization images section too?

LIQUIDMIND111 · 2023-01-01T16:45:17Z

@TheLastBen is this caption section optional too? Are the regularization images like the CLASS images in the old notebook?

LIQUIDMIND111 · 2023-01-01T16:46:25Z

@Ekaitza1985 also, how many INSTANCE images did you use for these new results?

Ekaitza1985 · 2023-01-01T17:50:00Z

Hello @LIQUIDMIND111
These were my params:
28101985KylarKyray (22 pics, resized on my PC and not in Colab). As a starting point you could try num_photo*100 steps, and then increase the training by 300 or 500 steps.

UNet_Training_Steps: 3000
UNet_Learning_Rate: 2e-6
Text_Encoder_Training_Steps: 250
Text_Encoder_Learning_Rate: 1e-6
External Cap OFF
Style Training OFF
RES 768
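
The rule of thumb above (about 100 UNet steps per instance photo, then adding 300 or 500 steps if the result still looks undertrained) could be sketched in Python as follows. The function name, its parameters, and the number of retry rounds are hypothetical, chosen only for illustration; they are not settings of the notebook.

```python
def unet_step_schedule(num_photos, increment=300, rounds=3, steps_per_photo=100):
    """Return candidate UNet step counts: a baseline plus a few increments.

    Assumption: baseline = num_photos * steps_per_photo, as suggested in the
    thread; `rounds` (how many times to add `increment`) is illustrative.
    """
    if num_photos <= 0:
        raise ValueError("num_photos must be positive")
    baseline = num_photos * steps_per_photo
    return [baseline + i * increment for i in range(rounds + 1)]


# 22 photos, increasing by 300 steps each round
print(unet_step_schedule(22))  # [2200, 2500, 2800, 3100]
```

With 22 photos the baseline is 2200 steps, so the 3000 steps listed above sit between the second and third increments of this schedule.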

LIQUIDMIND111 · 2023-01-03T01:21:37Z

Thanks mate, I will try soon.

kozka · 2023-01-05T18:24:34Z

I have been testing many models and many configurations,
and I think that 23 photos, 2300 UNet steps at 1e-5 and 1600 text encoder steps at 1e-6
It has given me very good results the first time, with

Ekaitza1985 · 2023-01-07T22:48:44Z

Thank you @kozka, I will try your params and write here if I get good results too.

Ekaitza1985 · 2023-01-08T15:27:50Z

Hello again @kozka, with 1600 text encoder steps I get really bad results.

kozka · 2023-01-08T15:56:10Z

OK, I guess the photos I used had something to do with it.

Ekaitza1985 · 2023-01-08T16:00:10Z

@kozka I really don't know... I am using the same photos that I used to create an SD 1.5 model successfully, but resized to 768. No params work well and I don't know why :S.

With your working model, could you do a test for me?

A beautiful portrait of Will Smith, award winning photography
negative: blurry, black and white, disfigured, malformed, kitch
Euler a, 30 steps

and tell me whether the generated picture is Will Smith but resembling your trained subject, or 100% Will Smith.
That would be great.
Thanks

kozka · 2023-01-08T21:01:50Z

I think the best approach is to use the normal 1.5 model to generate an image of Will Smith next to someone, and then put your face on the other person with inpainting, using your personal trained model, or something like that. When I tried to train 2 models at the same time, sometimes the images came out well and many other times they didn't.
When you train your model on images of only 1 person and then try to generate Will Smith, without having also trained on him, it will mix the faces.
I think you could do one of these two things:
1) The easiest: use model 1.5 to generate an image of Will Smith next to someone, and then replace that person's face using your trained model.
2) More difficult: train the model with photos of the person you are training and also photos of Will Smith.

* I have tried way 1: I took a picture of Will Smith with someone and then used my model in inpainting to change only the face.
It didn't turn out very well, but it's the first thing that came out quickly.

Ekaitza1985 · 2023-01-08T21:09:47Z

I didn't consider that it will mix the faces.
So, @kozka, I'll change my question to: how can I know if my model is consistent? If I try only, for example, "a portrait of TOKEN", I get a photo that looks older than the subject is, or really ugly; and if I add negative prompts or use "a beautiful portrait of TOKEN", I get an image much more beautiful than the subject. So I am confused on this point.

kozka · 2023-01-08T21:45:01Z

I'm not an expert.
When I create a model, the first thing I do is prompt "photo token" and see what comes out, so that when I retrain it I can compare and see whether something improves. If "photo token" is very bad and does not look like the trained subject, it is because it has had too little training or the selected photos are not the best.
But if you prompt "a beautiful portrait of TOKEN" and the result is more handsome, it is because you asked for it to be more beautiful... The important thing is that it resembles the trained subject. It will always make small variations that may not convince you: out of 100 photos, maybe only 50 will be very similar and only 10 remarkable and beautiful.
If you train a model a lot you will only get selfies,
and if you overtrain a model you will only get the same photos you used to train it.

Ekaitza1985 · 2023-01-08T23:33:51Z

Hello again @kozka!!!
First of all, thanks for helping me with that last message!
I did some tests with the models that I created over the last few days.
With the LR that @LIQUIDMIND111 mentioned before, I did:
1500 steps and 150 text encoder steps = I tried "a photo of token" with no negative prompts, and out of 10 photos 0 are similar to me
2000 steps and 150 text encoder steps = I tried "a photo of token" with no negative prompts, and out of 10 photos 3, maybe 4, are similar to me
2200 steps and 1600 text encoder steps = I tried "a photo of token" with no negative prompts, and out of 10 photos 3, maybe 4, are similar to me

So, if I am not wrong, I need some more UNet training steps to get a better and more accurate model, right?
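
To compare runs like the three above, it may help to write the counts down as a hit rate per configuration. This is only a minimal sketch: the field names are hypothetical, and "3 maybe 4" is counted conservatively as 3, which is an assumption.

```python
# Results reported above: out of 10 generated photos, how many resembled the subject.
# Assumption: "3 maybe 4" is recorded as 3.
runs = [
    {"unet_steps": 1500, "text_steps": 150, "similar": 0, "total": 10},
    {"unet_steps": 2000, "text_steps": 150, "similar": 3, "total": 10},
    {"unet_steps": 2200, "text_steps": 1600, "similar": 3, "total": 10},
]


def hit_rate(run):
    """Fraction of generated images judged similar to the subject."""
    return run["similar"] / run["total"]


# max() keeps the first run on ties, so the cheaper 2000-step run is preferred here.
best = max(runs, key=hit_rate)

for run in runs:
    print(f'{run["unet_steps"]} UNet / {run["text_steps"]} text: {hit_rate(run):.0%}')
print("best so far:", best["unet_steps"], "UNet steps")
```

Ten samples per run is a small number, so differences of one or two images between configurations should be read with caution.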

kozka · 2023-01-09T08:48:13Z

You are right.
Either add more steps, going 200 at a time and checking whether it improves,
or change the input photos and crop them better: 3 of the face, 3 of the shoulders, 3 half-length, 1 full-length, give or take; it is not an exact science. But if the photos do not show the face well, or it is too far away, or the background is too chaotic, it may cost you more. The best are neutral backgrounds, such as a blank wall, without many things or people.
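
As a rough illustration of the 3 / 3 / 3 / 1 framing mix suggested above, the sketch below scales that ratio to a different number of instance images. The function name and the rounding rule (a largest-remainder allocation) are assumptions made for the example; the thread itself only gives the ten-image mix and says it is not an exact science.

```python
# Framing mix suggested in the thread for ten images.
BASE_MIX = {"face": 3, "shoulders": 3, "half_length": 3, "full_length": 1}


def scale_mix(total_images, base=BASE_MIX):
    """Scale the base framing mix to `total_images`, keeping the same ratio.

    Uses a largest-remainder allocation (an assumption): floor each share,
    then give leftover images to the categories with the largest fractions.
    """
    base_total = sum(base.values())
    counts = {k: (v * total_images) // base_total for k, v in base.items()}
    leftover = total_images - sum(counts.values())
    # sorted() is stable, so ties keep the order of BASE_MIX.
    by_fraction = sorted(
        base, key=lambda k: (base[k] * total_images) % base_total, reverse=True
    )
    for name in by_fraction[:leftover]:
        counts[name] += 1
    return counts


print(scale_mix(10))  # {'face': 3, 'shoulders': 3, 'half_length': 3, 'full_length': 1}
print(scale_mix(22))  # {'face': 7, 'shoulders': 7, 'half_length': 6, 'full_length': 2}
```

The result is only a starting point for choosing photos; image quality and clean backgrounds matter more than hitting the exact counts.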
