Appeal for the Separation of SD 1.5 from SDXL #1401

XT-404 · 2023-08-17T19:33:19Z

Considering the critical situation of SD 1.5 content creators, which has been severely impacted since the SDXL update, shattering any feasible Lora or CP designs,

We are requesting that SD 1.5 be separated from SDXL in order to continue designing and creating our CPs or Loras.

Many of us, including myself, have invested significant amounts of money to passionately create quality Checkpoints and Loras.

I now find myself completely handicapped and unable to design even a functional and worthy Lora or Checkpoint.

Numerous individuals in the community, whether in France or the United States, are suffering due to the forced installation of SDXL, which has destroyed our ability to design and enjoy our creations.

The fact that we cannot roll back since all the commits are obsolete further exacerbates the situation.

I appeal to intelligence, logic, and reason to rescue SD 1.5 from this SDXL nightmare, in the interest of the community that supports SD 1.5 and has no interest in SDXL.

I understand that it will require effort, undoubtedly, but please realize that people like me, who have invested over €6,000 in equipment for significant projects, are now stuck and technically unemployed due to this SDXL implementation.

Thank you for considering my request. I also urge the entire community to support this message so that SD 1.5 can be revived and no longer remain in its current state.

Also, thank you for the effort and work invested, but please, separate SD 1.5 from SDXL, for the sake of all those who support you, believe in you, and hope for a repaired and functional SD 1.5 to return.

Thank you in advance.

Best regards,

wendythethird · 2023-08-17T20:02:17Z

A little more effort and Khtulu will rule the world!

XT-404 · 2023-08-17T21:23:38Z

bmaltais · 2023-08-17T21:24:38Z

Let's agree on the last good commit and I will create an SD1.5 branch from it. Then we can work out why it will not run properly, as in theory it should. It might come down to the gradio version used for the UI.

That release will not see any further development, but it will keep what used to work in SD1.5 functional.

Obviously, if people contribute PRs for the 1.5 branch I will merge them if they make sense...

XT-404 · 2023-08-17T21:55:52Z

@bmaltais

I agree with you on this point.

The problem is to determine which version was functioning correctly.
Some say version 21.8, others 21.7.
I would have liked to run the tests myself. However, when I try, it is impossible because the setup fails, even if I modify the requirements or the .sh file you pointed me to.
Either way, I suggest we go for a 21.7 version close to 21.8, take the bull by the horns to make it work, and keep it in an isolated corner without updates. I believe that would be ideal for you.

I will give you feedback tomorrow. My colleague and I will check the logs from the period when everything ran perfectly, and we will confirm whether the version we have in mind is indeed this one:

https://github.com/bmaltais/kohya_ss/releases/tag/v21.7.6

I will come back tomorrow to inform you about this without fail.

Thank you for your response.

XT-404 · 2023-08-18T18:44:12Z

Hello @bmaltais ,

I'm reaching out to you again, as indicated, to provide you with the version that has given me slight results so far.

I've tested the following versions: from version 21.6.5 to version v21.7.8. The only one that yielded a result of about 60% is this version. https://github.com/bmaltais/kohya_ss/releases/tag/v21.7.8

However, there are clearly numerous illogical anomalies. I conducted the following tests:

20 identical images.
Settings with batch 1, epochs 10, repetitions 5 > total of 1000 steps: this configuration has always given me excellent results for 20 images.
However, this time, they're completely distorted, Ktullu-style.
On the other hand, if I use batch 2, epochs 10, repetitions 10 > total of 1000 steps: still identical, I achieve a satisfactory result of 60%.
This is entirely illogical, considering that I'm not changing either the checkpoint or the image.
I performed all tests in the same manner across all versions.
Currently, the only one standing out is v21.7.8.
Now, the task is to understand why this version isn't working as before, why batch 1 and batch 2, configured with the same steps, yield completely different results, and especially why all the images come out in a "plastic deformed monster" mode, except under batch 2 in the v21.7.8 version.
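
For anyone comparing the two configurations above, here is a minimal sketch of the step arithmetic. It assumes the usual count of images × repeats × epochs ÷ batch size, with no regularization images and no gradient accumulation. The function names are illustrative and not part of kohya_ss. Both runs land on 1000 optimizer steps, but the batch 2 run pushes twice as many images through the model, so equal step counts do not mean equivalent training.

```python
import math


def total_steps(num_images, repeats, epochs, batch_size):
    """Optimizer steps under the assumed images * repeats * epochs / batch formula."""
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch * epochs


def samples_seen(num_images, repeats, epochs):
    """How many images pass through the model over the whole run."""
    return num_images * repeats * epochs


# The two configurations described in the comment above (20 images each).
run_a = dict(num_images=20, repeats=5, epochs=10, batch_size=1)
run_b = dict(num_images=20, repeats=10, epochs=10, batch_size=2)

steps_a = total_steps(**run_a)
steps_b = total_steps(**run_b)
seen_a = samples_seen(run_a["num_images"], run_a["repeats"], run_a["epochs"])
seen_b = samples_seen(run_b["num_images"], run_b["repeats"], run_b["epochs"])

print(f"batch 1: {steps_a} steps, {seen_a} images seen")
print(f"batch 2: {steps_b} steps, {seen_b} images seen")
```

This does not explain the distorted outputs by itself, but it may help when deciding which of the two runs is the fair comparison against older results.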

Thank you for your feedback. I'm available for any tests or assistance that I can provide.

Best regards.

bmaltais · 2023-08-18T19:15:55Z

Well, if the version works to generate the models, I can't really do much about the results. If you find a version that produces the results you used to have, I can create a branch from it... but personally I have never had issues with the models produced by any of the versions... so it is hard to troubleshoot. :-(

But for now I can create the sd2.5 branch from the 21.7.8 release so it can be used as a foundation.

XT-404 · 2023-08-18T19:17:12Z

@bmaltais how is it an SD 2.5 branch?

bmaltais · 2023-08-18T19:21:11Z

Typo. I have published the code in the sd15 branch. I also updated the gradio release so it does not cause issues with browsers.

XT-404 · 2023-08-19T08:09:31Z

> Well, if the version work to generate the models I can't really do much about the results. If you find a version that produce the results you used to have I can create a branch from it... but personally aI have never had issues with the models produced from any of the versions... so it is hard to troubleshoot. :-(
>
> But for now I can create the sd2.5 branch from the 21.7.8 release so it can be used as a fundation.

@bmaltais
The version works to carry out training, certainly without any error messages or major anomalies to report. However, it only functions at 60%. The model is not stable, and the batch system is completely disorganized. We can apply all possible parameters to try to rectify the situation, whether it's in the configuration, the model used, the photos, but the anomaly remains consistent: a plastic-like effect, deformations, etc.

bmaltais · 2023-08-19T08:30:58Z

If you find the release that works, let me know. The code in a release is locked and should produce consistent results. Driver updates, on the other hand, have been known to cause training variations. Is it possible that you are now using newer drivers than the ones you had a few months ago?

XT-404 · 2023-08-19T08:43:54Z

@bmaltais
To be completely honest, I have no idea. All I know is that when I used this version 21.7.8 before, I got magnificent results with batch 1, epochs 10, repeats 5, or with batch 2, epochs 10, repeats 10. Now I get monsters. I pushed the training to repeats 20 and then 30; the result is either the same or completely burnt out. I am not running the tests alone; there are two of us, and despite all our testing, nothing is good. The only people getting good results are those who never updated Kohya_ss and are still managing to produce beautiful Loras or CPs.

XT-404 · 2023-08-19T08:59:26Z

@bmaltais

Version v21.5.11

Can you please tell me how to modify this version so I can run tests on it? There are two different gradio versions. Thank you:

ftfy==6.1.1
gradio==3.28.1; sys_platform != 'darwin'
gradio==3.23.0; sys_platform == 'darwin'
lion-pytorch==0.0.6
opencv-python==4.7.0.68
pytorch-lightning==1.9.0

XT-404 · 2023-08-19T09:26:20Z

I modified the versions so that it installs, and that part is OK.
Now I'm running tests to check. I'm also going to go back to the basic graphics driver. I have a 4090, and I'm going to see if I can install the Studio driver.

bitsandbytes==0.35.0; sys_platform == 'win32'
bitsandbytes==0.38.1; (sys_platform == "darwin" or sys_platform == "linux")
dadaptation==1.5
diffusers[torch]==0.10.2
easygui==0.98.3
einops==0.6.0
ftfy==6.1.1
gradio==3.36.1; sys_platform != 'darwin'
gradio==3.23.0; sys_platform == 'darwin'
lion-pytorch==0.0.6
opencv-python==4.7.0.68
pytorch-lightning==1.9.0
safetensors==0.2.6
tensorboard==2.10.1 ; sys_platform != 'darwin'
tensorboard==2.12.1 ; sys_platform == 'darwin'
tk==0.1.0
toml==0.10.2
transformers==4.26.0
voluptuous==0.13.1
wandb==0.15.0

# for BLIP captioning
fairscale==0.4.13
requests==2.28.2
timm==0.6.12
# tensorflow<2.11
huggingface-hub>=0.14.0; sys_platform != 'darwin'
huggingface-hub==0.13.0; sys_platform == 'darwin'
tensorflow==2.10.1; sys_platform != 'darwin'
# For locon support
lycoris_lora==0.1.4
# for kohya_ss library

bmaltais · 2023-08-20T17:36:34Z

@XT-404 Let me know how this config goes.

XT-404 · 2023-08-20T17:43:12Z

@bmaltais
I am currently testing almost all old and current versions.

I must have run around 150 to 200 tests since yesterday.

Currently, 4 versions stand out.

I am trying to reach 75 to 80% image fidelity with different parameters,

whether with 20 images / 30 / 40 / 50 / 100 / 1000,
and the same for repetitions, etc.

I am doing my best to find the best of the 4 remaining.

XT-404 · 2023-08-21T17:13:25Z

@bmaltais

After two whole days of hard testing of configurations, etc., I finally found a version that reaches 80% with parameters that are not too extreme. However, cuDNN and bitsandbytes should not be installed; without those two, the program works.
The functional version, currently at an 80% success rate, is version 21.5.11.
The changes to the PyTorch 1 requirements file needed for it to work are as follows.

accelerate==0.18.0
albumentations==1.3.0
altair==4.2.2

Remove the leading * from the lines below:
*# https://github.com/bmaltais/bitsandbytes-windows-webui/raw/main/bitsandbytes-0.38.1-py3-none-any.whl; sys_platform == 'win32'
*# This next line is not an error but rather there to properly catch if the url based bitsandbytes was properly installed by the line above...

bitsandbytes==0.35.0; sys_platform == 'win32'
bitsandbytes==0.38.1; (sys_platform == "darwin" or sys_platform == "linux")
dadaptation==1.5
diffusers[torch]==0.10.2
easygui==0.98.3
einops==0.6.0
ftfy==6.1.1
gradio==3.36.1; sys_platform != 'darwin'
gradio==3.23.0; sys_platform == 'darwin'
lion-pytorch==0.0.6
opencv-python==4.7.0.68
pytorch-lightning==1.9.0
safetensors==0.2.6
tensorboard==2.10.1 ; sys_platform != 'darwin'
tensorboard==2.12.1 ; sys_platform == 'darwin'
tk==0.1.0
toml==0.10.2
transformers==4.26.0
voluptuous==0.13.1
wandb==0.15.0
*# for BLIP captioning
fairscale==0.4.13
requests==2.28.2
timm==0.6.12
*# tensorflow<2.11
huggingface-hub>=0.14.0; sys_platform != 'darwin'
huggingface-hub==0.13.0; sys_platform == 'darwin'
tensorflow==2.10.1; sys_platform != 'darwin'
*# For locon support
lycoris_lora==0.1.4
*# for kohya_ss library
.

This time I can confirm that this version is viable at 80%,

whether with 20 images, 50, 100, or 1000.

All that remains is to work on the code to reach a success rate of 95 to 100%, and all will be perfect ;)

PS: the graphics drivers had no impact on training, I used the old driver versions and the new ones, it did not change the training percentage.

XT-404 · 2023-08-21T17:22:38Z

Now it is up to the talented people and Python developers in the community to contribute and help make our dear friend Kohya_ss 21.5.11 perfect for this training ^^.
If I had the knowledge, I would gladly have supported and helped, but I am not at the level of @bmaltais.

Loadus · 2023-08-22T16:02:16Z

Adding a sidenote here that I also experienced complete breakage of anything 1.5 training when SDXL stuff was added but I managed to solve it by re-installing CUDA 11.8 (also noting that current display driver is 531.61). Non-functioning or wobbly LoRAs were a problem for weeks but this was the thing that 'repaired' the training. Took several hours to debug, that for some (whatever) reason, xformers was borking the entire training - if I trained without it, everything was more or less correct. I traced it back to CUDA not 'connecting' to the training session at all (if that is even a good way to describe it).

After re-installing CUDA 11.8, training speed increased tremendously (going from ~4.5s/it ---> ~2.13s/it), so that was a further indication that something was borked badly.

Not sure if it will help anyone else, just thought I'd mention this.

07:23:01-369448 INFO Version: v21.8.7

07:23:01-390391 INFO nVidia toolkit detected
07:23:08-767791 INFO Torch 2.0.1+cu118
07:23:08-819921 INFO Torch backend: nVidia CUDA 11.8 cuDNN 8700
07:23:08-827899 INFO Torch detected GPU: NVIDIA GeForce RTX 3060 VRAM 12287 Arch (8, 6) Cores 28

bmaltais · 2023-08-22T16:55:26Z

Thank you @Loadus for sharing your experience... So, in summary, folks with issues should re-install CUDA 11.8 and make sure they use NVIDIA drivers 531.61.

XT-404 · 2023-08-22T17:01:50Z

@bmaltais
I will apply and test the information provided by @Loadus.
I will post the analysis and results here.

bmaltais · 2023-08-25T14:32:04Z

Direct link to 531.61 drivers:

Game Ready Driver Download Link: https://us.download.nvidia.com/Windows/531.61/531.61-desktop-win10-win11-64bit-international-dch-whql.exe

Studio Driver Download Link: https://us.download.nvidia.com/Windows/531.61/531.61-notebook-win10-win11-64bit-international-nsd-dch-whql.exe

XT-404 · 2023-08-25T18:58:42Z

@bmaltais
After testing on different subjects: realistic, manga, comics, anime, style.
The training method has changed and no longer works on 1000 steps but on 3000 steps.
On the other hand, strangely enough, everything works except the realistic one.
If I input realistic training, I get images that are extremely difficult for the indicated model to achieve.
If I input anime, manga, comics training, the rendering is perfect.
The anomaly with realistic training is present in all CP tests,
whether on Civitai CPs dedicated to realism or on personal CPs designed for that purpose.

I am of course using the indicated CUDA 11.8 and NVIDIA 531.61 drivers.

Sniper199999 · 2023-09-13T09:31:19Z

> @bmaltais After testing on different subjects: realistic, manga, comics, anime, style. The training method has changed and no longer works on 1000 steps but on 3000 steps. On the other hand, strangely enough, everything works except the realistic one. If I input realistic training, I get images that are extremely difficult for the indicated model to achieve. If I input anime, manga, comics training, the rendering is perfect. The anomaly with the realistic is present in all CP tests, whether it's on CivicAI's CP dedicated to realism or on personal CPs designed for that purpose.
>
> I of course use the indicated driver CUDA 11.8 and NVidia 531.61 drivers.

great findings by @XT-404 and @Loadus. Have you figured out why the steps have been increased to 3000 from 1000 and why the realistic images are hard to achieve?

XT-404 · 2023-09-13T16:09:17Z

> > @bmaltais After testing on different subjects: realistic, manga, comics, anime, style. The training method has changed and no longer works on 1000 steps but on 3000 steps. On the other hand, strangely enough, everything works except the realistic one. If I input realistic training, I get images that are extremely difficult for the indicated model to achieve. If I input anime, manga, comics training, the rendering is perfect. The anomaly with the realistic is present in all CP tests, whether it's on CivicAI's CP dedicated to realism or on personal CPs designed for that purpose.
> > I of course use the indicated driver CUDA 11.8 and NVidia 531.61 drivers.
>
> great findings by @XT-404 and @Loadus. Have you figured out why the steps have been increased to 3000 from 1000 and why the realistic images are hard to achieve?

@Sniper199999

"After several weeks of intensive testing,

Training on realism does not work at all. For a reason I can't understand, training on manga/comics/BD/drawing/3D/2D works perfectly in LORA.

Realism, on the other hand, is completely shattered. Why did the required training steps jump from 1000 to 3000? No idea. I tried to get as close as possible to a working setup with 20 images, and only 3000 steps works. Below that I get under-training, and above that the training burns (I'm on a 4090). I don't have a slowness problem, and the training remains at 2.5 or 2 without significant loss.

However, regarding the design of Checkpoints, it's not even worth mentioning: nothing works. I can put any type, all the CPs made from 3000 steps to 10K steps and others come out in confetti mode or completely blown up.

The only thing currently working on my side, whether on this version or version 21.8.8, is the creation of Lora manga, comics, bd, and nothing else.

I tried a series over several days of parameter modification, installation of old drivers, etc., to no avail.

Many people have given up on the idea of designing Lora or CP given the disastrous results obtained.

Personally, I'm not giving up, but I'm also tired of these utterly disastrous results and that nothing is found to set things right.

Being forced to use SDXL while many people refuse this version is really a punishment for us."

tornado73 · 2023-09-13T18:32:36Z

For information, the last version that trains realism correctly on the AMD 6000 line of cards is 21.5.8; everything above that is horror :-)
The forced move to SDXL, which is designed exclusively for new-generation cards with a large amount of memory, left many enthusiasts behind.
I ran training on my card with the latest version after editing the dependencies, but the results are terrible with every setting I tried.
It turns out that AMD has been thrown overboard :-)

oo7male · 2023-10-10T14:57:51Z

Totally agree, it's literally impossible to get consistent LoRA results with recent versions of kohya_ss. The only version that has been working pretty well for me is the Colab version https://colab.research.google.com/github/hollowstrawberry/kohya-colab/blob/main/Lora_Trainer.ipynb which in turn uses https://github.com/kohya-ss/sd-scripts. Both commits e6ad3cb and 9a67e0d seem to work fine. Just sharing in case anyone wants to try.

AIEXAAA · 2023-10-16T14:28:46Z

> Totally agree it's literally impossible to get consistent LoRA results with recent versions of kohya_ss. The only version that has been working pretty good for me is the collab version https://colab.research.google.com/github/hollowstrawberry/kohya-colab/blob/main/Lora_Trainer.ipynb which in turn uses https://github.com/kohya-ss/sd-scripts. Both commit e6ad3cb and 9a67e0d seems to work fine. Just sharing if anyone needs to try.

You can try this method, it works for me regardless of the version:
kohya-ss/sd-scripts#855 (comment)

Please note that due to version updates, the line numbers may not be consistent, but the modified code is consistent

WilliamKappler · 2023-11-28T00:29:31Z

Sorry, I have been away... and last time I was here, missed this entire issue somehow.

Previously, I spent a lot of time looking into this issue and managed to find a way to reliably reproduce the "old" LORA behavior as described here: #1291 (comment) - though the results are not exactly the same as I got previously.

I can't agree with some of the comments about this only being a problem for realism. I've had all these issues trying to train and retrain a cartoony LORA, but perhaps that is because I am using NAI. Maybe I don't know what I am doing and have less margin of error.

Another observation I had about this matter is that the newer Kohya gives more reliable, but worse results. The old one gives much less stable results, but some of them are high quality. Put another way: 'new' is almost all poor quality images, 'old' is mostly awful but some great images.

heartlocket · 2024-01-01T20:37:57Z

Is Kohya_ss effectively over for non SDXL creators? has there been a fork or a dedicated project since then? I am curious how people are making loras now

bmaltais · 2024-01-02T21:24:41Z

Kohya_ss, the author of the sd-scripts code base I use in this repo, is not maintaining an sd1.5 branch… so I guess this is pretty much the end of the sd1.5-only code base.

His code should support both sd1.5 and SDXL but some of the new modules required may not produce the same sd1.5 results it used to.

I suggest you raise this concern directly with him on his sd-scripts repo.

XT-404 · 2024-01-02T22:03:30Z

Hello everyone,

It's been a while since I've posted in this topic, which I created due to multiple anomalies related to the Kohya_ss script. I'd like to clarify, as bmaltais mentioned, that he is not the original author of this script. Instead, he uses the independently developed Kohya_ss script.

Since my last post, I have achieved a lot. After several months of testing, I've noticed significant changes with the integration of SDXL into the SD1.5 Kohya_ss script. I've chosen to focus on version v21.8.10, which allows me, in 90% of cases, to create various types of Lora, in terms of style or concept. However, one issue persists since the addition of SDXL: the realism of characters, known or not, with any training checkpoint model.

To overcome this challenge, I developed a specific Checkpoint that excludes images of the Cartoon/2D/3D/ANIME/MANGA/2.5D type. By training the lora with realistic images, they are transformed into BD/COMICS/Cartoon versions, etc. The Checkpoint I created then transforms these 2D/anime images back into realistic versions. To date, this is the only method I've found to achieve pure realism with functional lora images.

I've also experimented with other training systems that have yielded similar results to Bmaltais's code. These systems all seem to be based on the same developer, the creator of the Kohya_ss script.

Currently, I am not aware of any ongoing project aiming to develop a script similar to Bmaltais's for SD1.5 users who prefer to stay on this version. Unless a talented developer like Bmaltais embarks on such a project, it seems that the only alternative is to stick to functional older versions and block updates.

Best regards,
XT404

AIEXAAA · 2024-01-03T11:23:56Z

> Currently, I am not aware of any ongoing project aiming to develop a script similar to Bmaltais's for SD1.5 users who prefer to stay on this version. Unless a talented developer like Bmaltais embarks on such a project, it seems that the only alternative is to stick to functional older versions and block updates.
>
> Best regards, XT404

Have you tried the modification method I mentioned before?

The latest version of Kohya_ss has basically fixed the SD1.5 problem, and the only remaining issue is the reproducibility of the loss function.

Kohya_ss corrected the SD1.5 problem, but at the same time modified some references and loaded the VAE into xformers. As a result, subsequent versions can still train, but the loss function is different from before. To restore the exact same loss function, just follow my modification method.

The evidence lies in the fact that I trained with the old version of Kohya_ss, trained with the latest version of Kohya_ss, and made the code modifications I mentioned. The lora trained by both under the same seed are almost identical in action and appearance.

XT-404 · 2024-01-04T11:28:16Z

> Have you tried the modification method I mentioned earlier?
>
> The latest version of Kohya_ss essentially resolved the SD1.5 issue, and the only remaining problem is the reproducibility of the loss function.
>
> Since Kohya_ss fixed the SD1.5 issue but at the same time altered some references and loaded VAE into xformers, the subsequent versions could still be trained, but the loss function is different than before. To exactly restore the same loss function, simply follow my modification method.
>
> The proof lies in the fact that I trained with the old version of Kohya_ss, with the latest version of Kohya_ss, and made the code changes I mentioned. The LORAs formed by both under the same seed are almost identical in action and appearance.

Hello @AIEXAAA

After applying the suggested method in the comment on the GitHub topic (kohya-ss/sd-scripts#855 (comment)), I encountered several technical difficulties.

1. Modification of the library\model_util.py file: Changing the code to initialize the loss values of the SD1.5 training seems to affect the results. By replacing the initial code block with the suggested one, the initial values become identical. However, this did not resolve the main problem.
2. Modification of the train_network.py file: Removing the following lines, intended for compatibility with PyTorch 2.0.0 and memory efficiency, resulted in the failure of the training launch:
```
if torch.__version__ >= "2.0.0":
    vae.set_use_memory_efficient_attention_xformers(args.xformers)
```
After restoring these lines, the problem persisted.

In conclusion, despite following the instructions scrupulously and checking for potential manipulation errors, the modified script does not function correctly. Reinstalling the script in its original version restored its operation, but the problem of realism remains unresolved.

I remain open to any further suggestions or assistance to rectify these issues.

AIEXAAA · 2024-01-04T12:13:06Z

> I remain open to any further suggestions or assistance to rectify these issues.

It might be a translation issue, I’m somewhat unclear about your response.

Are you saying that when you make modifications according to the second point, your program throws an error?

If so, the most likely reason is that your version of Kohya_ss is not up-to-date. In one version of Kohya_ss, when the aforementioned two lines of code are removed, the GPU’s RAM usage becomes huge, leading to an error. The latest version of Kohya_ss has already fixed this.

If it’s not a program error, but you still can’t reproduce lora after the modification, then this is beyond what I can explain.

XT-404 · 2024-01-04T12:17:53Z

@AIEXAAA
I am not on the latest version of kohya_ss; I use version v21.8.10.
And indeed, if I delete the lines:

  if torch.__version__ >= "2.0.0":
      vae.set_use_memory_efficient_attention_xformers(args.xformers)

the training crashes at launch, unless of course I restore them as they were originally.

AIEXAAA · 2024-01-04T12:35:07Z

> launching the training crashes automatically unless of course I reinstall it as originally

I dare not make a definitive statement here, but as you can see from the code, if the PyTorch version is too low, it will not load. Therefore, even if you remove this section of code, there should be no problem. Because after removal, it’s as if your PyTorch version is too low.

So, I’m puzzled by your results.

Additionally, changing

  if torch.__version__ >= "2.0.0":

to

  if torch.__version__ <= "2.0.0":

actually has the same effect. This way, you don’t need to reinstall it, and if it can’t run, you can directly change it back.
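
To make the comparison above concrete, here is a small sketch of how that string comparison behaves for a few version strings. The helper names are hypothetical and only mimic the quoted `if` line; nothing here calls sd-scripts or PyTorch. It also shows a numeric comparison as an alternative, because comparing version strings character by character gives a wrong answer for something like "10.0.0".

```python
import re


def version_tuple(version):
    """Parse the leading numeric part, e.g. '2.0.1+cu118' -> (2, 0, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"no numeric release found in {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def branch_runs(torch_version, flipped=False):
    """Mimic the quoted string comparison; flipped=True uses <= instead of >=."""
    if flipped:
        return torch_version <= "2.0.0"
    return torch_version >= "2.0.0"


for candidate in ["1.13.1+cu117", "2.0.1+cu118", "10.0.0"]:
    print(
        candidate,
        "original:", branch_runs(candidate),
        "flipped:", branch_runs(candidate, flipped=True),
        "numeric >= 2.0.0:", version_tuple(candidate) >= (2, 0, 0),
    )
```

With the "2.0.1+cu118" build shown in the logs in this thread, the flipped comparison is false, so the xformers call is skipped, which matches the suggestion. A bare "2.0.0" string would still satisfy `<=`, so deleting or commenting out the two lines is the less ambiguous way to run the test.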

OriginLive · 2024-01-06T16:34:03Z

Spent 3 days trying to train a 1.5 checkpoint, only to find out it doesn't work on 1.5

3 days wasted,
thanks obmaltisama

XT-404 · 2024-01-07T10:58:03Z

> Spent 3 days trying to train a 1.5 checkpoint, only to find out it doesn't work on 1.5
>
> 3 days wasted, thanks obmaltisama

Greetings @OriginLive
What version of Kohya_ss are you running? For checkpoint training, the version I indicated works. It works differently than it did before all the modifications and the SDXL implementation, but it does work if you take the time to do things properly, with an efficient and correct configuration. Of course that requires tests, analysis, and a clean dataset.

OriginLive · 2024-01-07T11:52:17Z

I was running latest. What version are you suggesting to use? What needs to be done? I've tried a 27 release before sdxl was mentioned but i can't use ui there with the current python version

XT-404 · 2024-01-07T12:11:53Z

> I was running latest. What version are you suggesting to use? What needs to be done? I've tried a 27 release before sdxl was mentioned but i can't use ui there with the current python version

@OriginLive,

For old versions from before SDXL was added, there are modifications to be made in a Python script file that @bmaltais stated earlier in the topic.
The version I currently use to create Loras and Checkpoints is 21.8.10.
It is 95% functional.

The missing part is direct realism, which does not work with any type of training, configuration, or checkpoint.

OriginLive · 2024-01-07T15:19:03Z

What changes, there's drivers mentioned and all sorts of stuff like a different branch? Could you help out a bit more, i'm trying to get 1.5 working

XT-404 · 2024-01-07T15:41:15Z

> What changes, there's drivers mentioned and all sorts of stuff like a different branch? Could you help out a bit more, i'm trying to get 1.5 working

@OriginLive
The version I am currently using is 21.8.10,
available at the following link: https://github.com/bmaltais/kohya_ss/releases/tag/v21.8.10
Download link in Zip format here: https://github.com/bmaltais/kohya_ss/archive/refs/tags/v21.8.10.zip

Simply install this version without applying any updates, then use it as usual.
On older versions of Kohya_ss, training generally ran for 1000 steps to get a clean, correct result.
Now, with the same number of images, training needs to run for a minimum of 2800 steps to get something correct and clean.
To improve training significantly, you can use regularization images directly related to what you want to train.
Of course, there is no need to provide captions for the regularization images.
Regarding batch size, always use batch 1; batches 2, 3, and 4 are completely out of service.
Here is an example of the training setup I use to obtain viable, functional Loras or checkpoints:

In the settings section, I proceed as follows:

Training batch size: 1
Epoch: 10
Max train: none
Max train steps: none
Save every epoch: 1
Caption extension: .txt
Mixed precision: BF16 (I have a 4090)
Save precision: BF16
number of CPU: 2
seed: 1111
cache latents: active
cache latents to disk: disabled
LR scheduler: Constant
Optimizer: AdamW
LR scheduler extra arguments: None
Optimizer extra arguments: None
Learning rate: 0.0001
LR warmup: 0
LR number of cycles: None
LR power: None
Max resolution: 512x512
Stop text encoder training: 0
Enable buckets: active
Minimum bucket resolution: 256
Maximum bucket resolution: 2048
Text encoder learning rate: 0.00001
Unet learning rate: 0.0001
Network rank (dimension): 256
Network Alpha: 256
You can also set both to 128 / 128.
Both work very well; precision is finer at 256, but the file is larger (for a Lora).

That's a setting I've done on my side and it works perfectly, except as said for realism where the training fails at 99.8%.
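
As a rough aid for the step counts mentioned in this comment, the sketch below works out how many repeats per image are needed to reach a target number of steps with batch 1 and 10 epochs. It assumes steps = images × repeats × epochs ÷ batch size, with regularization images ignored; the function name is made up for illustration and is not a kohya_ss option.

```python
import math


def repeats_for_target(num_images, epochs, batch_size, target_steps):
    """Smallest whole number of repeats that reaches at least target_steps."""
    return math.ceil(target_steps * batch_size / (num_images * epochs))


# 20 images, batch 1, 10 epochs, as in the settings above.
old_repeats = repeats_for_target(20, 10, 1, 1000)
new_repeats = repeats_for_target(20, 10, 1, 2800)

print(f"1000 steps needs {old_repeats} repeats per image")
print(f"2800 steps needs {new_repeats} repeats per image")
```

In the kohya folder layout the repeat count is normally the number prefix of the image folder name, so under these assumptions the change is from a `5_` prefix to a `14_` prefix for a 20-image set.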

OriginLive · 2024-01-07T18:54:55Z

Done this, but am still getting picassio on checkpoints:

XT-404 · 2024-01-07T19:05:38Z

@OriginLive
can you provide me with the following information:
what version of cuda are you using?
and what driver do you use for your graphics card?

XT-404 · 2024-01-07T19:07:30Z

Go DL and install : https://developer.nvidia.com/cuda-11-6-0-download-archive
and install driver Nvidia : 531.61 https://www.nvidia.fr/download/driverResults.aspx/204335/fr

OriginLive · 2024-01-07T19:26:07Z

```
=============================================================

Modules installed outside the virtual environment were found.
This can cause issues. Please review the installed modules.

You can uninstall all local modules with:
deactivate
pip freeze > uninstall.txt
pip uninstall -y -r uninstall.txt

=============================================================

20:25:07-691269 INFO Version: v21.8.10

20:25:07-694779 INFO nVidia toolkit detected
20:25:08-780964 INFO Torch 2.0.1+cu118
20:25:08-794772 INFO Torch backend: nVidia CUDA 11.8 cuDNN 8700
20:25:08-795774 INFO Torch detected GPU: NVIDIA GeForce RTX 3090 VRAM 24576 Arch (8, 6) Cores 82
20:25:08-796772 INFO Verifying modules instalation status from requirements_windows_torch2.txt...
20:25:08-802773 INFO Verifying modules instalation status from requirements.txt...
no language
20:25:11-156841 INFO headless: False
20:25:11-162841 INFO Load CSS...
Running on local URL: http://127.0.0.1:7860
```

Let me try with an old driver.

OriginLive · 2024-01-07T19:42:37Z

I installed the old drivers, but it still says 11.8 for CUDA :/ even though 11.6 and the old drivers were both installed.

OriginLive · 2024-01-07T19:53:04Z

still

XT-404 · 2024-01-07T20:11:58Z

@OriginLive
version 11.8 must be removed

OriginLive · 2024-01-07T20:22:40Z

> @OriginLive version 11.8 must be removed

I cannot; I do not see it in the list of available programs. Maybe it's part of PyTorch?

edit: https://discord.gg/ySHHDKkhat
I'm available on the SD training Discord.
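
On the "part of PyTorch" guess: the log above reports `Torch 2.0.1+cu118`, and as far as I understand, pip wheels with a `+cuXYZ` tag ship their own CUDA runtime libraries, so the reported CUDA version follows the wheel rather than the system toolkit. A minimal sketch that reads the version from such a tag (the helper name `bundled_cuda` is made up for this example):

```python
import re
from typing import Optional


def bundled_cuda(torch_version: str) -> Optional[str]:
    """Return the CUDA version encoded in a tag like '2.0.1+cu118', or None if absent."""
    match = re.search(r"\+cu(\d+)$", torch_version)
    if match is None:
        return None  # e.g. a CPU-only build such as '2.0.1+cpu'
    digits = match.group(1)
    # The last digit is the minor version: '118' -> '11.8', '121' -> '12.1'.
    return f"{digits[:-1]}.{digits[-1]}"


print(bundled_cuda("2.0.1+cu118"))

# With torch installed in the venv, `torch.version.cuda` reports the same information.
```

If that is what is happening here, uninstalling the system CUDA 11.8 toolkit would not change the log line; only installing a different torch build inside the virtual environment would.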

OriginLive · 2024-01-07T21:42:00Z

Got it to run on 11.6 by finally choosing PyTorch 1 (PyTorch 2 has 11.7 as its earliest option).

Despite that, I still get Picasso :(

OriginLive · 2024-01-08T12:50:25Z

OK, I think I figured it out. I was using 1e-4 as the learning rate for Dreambooth, but 1e-6 or 1e-5 works better and doesn't produce the mosaic above. Fine-tuning wants a lower LR than LoRA since there are a lot more weights that need adjusting, and a higher LR would be much more destructive. I think 🤔

It works fine for me on the latest CUDA, drivers, and kohya_ss.
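
To summarize the learning rates reported in this thread, here is a small sketch that checks a chosen value against them. The ranges are only what people here said worked for them, not recommended defaults, and the names `REPORTED_LR` and `lr_in_reported_range` are made up for this example.

```python
# Learning rates reported as working in this thread (low, high), not official defaults.
REPORTED_LR = {
    "lora_unet": (1e-4, 1e-4),          # Unet learning rate: 0.0001
    "lora_text_encoder": (1e-5, 1e-5),  # Text encoder learning rate: 0.00001
    "dreambooth": (1e-6, 1e-5),         # full fine-tuning, lower than LoRA
}


def lr_in_reported_range(method: str, lr: float) -> bool:
    """True if lr falls inside the range reported for the given training method."""
    low, high = REPORTED_LR[method]
    return low <= lr <= high


# The setting that produced the mosaic: a LoRA-sized LR used for Dreambooth.
print(lr_in_reported_range("dreambooth", 1e-4))
print(lr_in_reported_range("dreambooth", 1e-5))
```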

bmaltais pinned this issue Aug 25, 2023

bmaltais closed this as completed Jan 29, 2024

bmaltais unpinned this issue Feb 17, 2024
