How To Install And Use Kohya LoRA GUI / Web UI on RunPod IO With Stable Diffusion & Automatic1111 #249
FurkanGozukara
announced in
Tutorials
Full tutorial: https://www.youtube.com/watch?v=3uzCNrQao3o
How to install the popular Kohya SS LoRA GUI on RunPod IO pods and train in the cloud as seamlessly as on your own PC, then use the Automatic1111 Web UI to generate images with your trained LoRA files. Everything is explained step by step, and a GitHub file with the necessary commands is provided. If you want to use Kohya's Stable Diffusion trainers on RunPod, this tutorial is for you.
Source GitHub File⤵️
https://github.com/FurkanGozukara/Stable-Diffusion/blob/main/Tutorials/How-To-Install-Kohya-LoRA-Web-UI-On-RunPod.md
Auto Installer Script⤵️
https://www.patreon.com/posts/84898806
Sign up RunPod⤵️
https://bit.ly/RunPodIO
Our Discord server⤵️
https://bit.ly/SECoursesDiscord
If I have been of assistance to you and you would like to show your support for my work, please consider becoming a patron on Patreon 🥰⤵️
https://www.patreon.com/SECourses
Technology & Science: News, Tips, Tutorials, Tricks, Best Applications, Guides, Reviews⤵️
https://www.youtube.com/playlist?list=PL_pbwdIyffsnkay6X91BWb9rrfLATUMr3
Playlist of StableDiffusion Tutorials, Automatic1111 and Google Colab Guides, DreamBooth, Textual Inversion / Embedding, LoRA, AI Upscaling, Pix2Pix, Img2Img⤵️
https://www.youtube.com/playlist?list=PL_pbwdIyffsmclLl0O144nQRnezKlNdx3
00:00:00 Introduction to how to install Kohya GUI on RunPod tutorial
00:00:20 Pick which RunPod server and template
00:01:20 Starting installation of Kohya LoRA on RunPod
00:03:42 How to start Kohya Web GUI after installation
00:04:16 How to download models on RunPod and start Kohya LoRA training
00:05:36 LoRA training parameters
00:06:57 Starting Kohya LoRA training on RunPod
00:07:46 Where are Kohya LoRA training checkpoints saved
00:08:05 How to use LoRA saved checkpoints on RunPod
00:08:29 How to use LoRA checkpoints in Automatic1111 Web UI
00:09:12 Noticing a very crucial mistake during training
00:10:59 Testing different checkpoints after fixing the previous training mistake
00:11:36 How to understand model overtraining
00:12:28 How to fix overtraining problem
Title: Install Kohya GUI on RunPod for LoRA Training: Step-by-Step Tutorial
Description:
Welcome to my comprehensive guide on how to install Kohya GUI on RunPod for LoRA training. I take you through each step and explain it clearly so you can follow along with ease. This tutorial will help you set up a powerful training environment using an RTX 3090 GPU on a RunPod pod with 30GB RAM.
In this video, we will:
Deploy a Community Cloud pod with a specific template.
Edit template overrides and set the container disk.
Connect to JupyterLab and clone a GitHub repository.
Generate a new virtual environment and activate it.
Install Kohya on RunPod and handle common errors.
Set up and start the Kohya web UI on RunPod.
Run a quick demonstration of LoRA training on the Realistic Vision model.
Troubleshoot common errors during the training process.
Optimize the training process and improve training quality.
Navigate through our GitHub repository for further learning.
Remember, if you're unfamiliar with how to use Kohya or RunPod, I've included links to excellent tutorials in the video description.
Whether you're just getting started with Kohya, RunPod, or LoRA training, or looking to enhance your existing skills, this tutorial offers valuable insights.
Don't forget to like, share, and subscribe for more tutorials like this!
#StableDiffusion #Kohya #RunPod #LoRATraining #Tutorial #MachineLearning #AI
Video Transcription
00:00:00 Greetings everyone. In this video I will show you how to install Kohya GUI on RunPod to do
00:00:07 LoRA training. I have been asked about this many times. Sorry for the delay. Hopefully I will
00:00:12 explain it today. So this is the start screen of RunPod IO. Let's go to the
00:00:18 community cloud. I will use an RTX 3090, which is a very powerful GPU. Also, this pod has 30GB of
00:00:26 RAM. Click deploy. Select a template here: select the web automatic template. This is really important.
00:00:32 Currently it is 6.0.1; when you are watching this tutorial, it may be higher. Then go to
00:00:39 edit template overrides. Make the container disk 10GB. You can set the volume disk as large as
00:00:44 you want. Set overrides and click continue. Click deploy. Okay, the container has started. Let's
00:00:51 connect with JupyterLab. I use a GitHub repository to keep the descriptions and commands used
00:00:58 in my tutorials. The link to this file will be in the description. Every command is written here. If
00:01:04 you don't know how to use Kohya, I have an excellent tutorial here. You can click this link and watch
00:01:09 it. Also, if you don't know how to use RunPod, I have another excellent tutorial here. You can
00:01:14 click and watch it. So the commands are ready here: Kohya LoRA GUI on RunPod. The first thing is,
00:01:20 we will clone the repository with this command. Select it and copy it. Then, in the JupyterLab terminal
00:01:26 in the workspace, let's clone the repository like this. The repository is cloned. Then this command.
00:01:33 Now we are inside Kohya SS. Then we will create a new virtual environment in this folder with
00:01:40 this command. The virtual environment is created inside the Kohya SS folder here. Then we will execute
00:01:46 this command. Now the new virtual environment is activated. Now we will run this command. This
00:01:53 won't affect our Stable Diffusion installation on our RunPod. So this is a very convenient way
00:01:59 to install Kohya on RunPod. Okay, on the first try we got an error, apparently because the
00:02:07 download of a file failed. So I will repeat the operation: I will
00:02:13 rerun the command to be sure. To rerun it, I just did this while the virtual environment
00:02:19 is activated. Okay, this time we didn't get the previous error. However, we got the tkinter
00:02:26 error. This is the most common error that you have been encountering. I have a solution for it. While the
00:02:32 virtual environment is activated, you don't need to run this again. However, if you start a new terminal,
00:02:38 you need to run it. Then just copy this command and, while the virtual environment is activated, paste
00:02:43 it like this. Then copy this command. This will install tkinter. It is installed. And finally
00:02:51 we will install the latest torch. Copy it and, while the virtual environment is activated, install. This is really
00:02:58 important. The Kohya SS virtual environment needs to be activated while executing all
00:03:06 of these commands. So if your virtual environment is activated, you don't need to run this once
00:03:11 again. The torch installation is pretty fast. It is installing version 2.0.1, which is the latest
00:03:18 official version. This is being installed in our Kohya virtual environment. This won't affect our
00:03:24 Stable Diffusion installation. By default Kohya installs 1.12, which runs pretty slowly on
00:03:31 the newest GPUs. Okay, the installation has been completed. You can ignore this message because
00:03:37 we won't use xformers while training. It just slows us down. Then we will start the Kohya web
00:03:43 UI. Copy this. To start it, you don't need to have the virtual environment activated. Actually,
00:03:48 it is preferable not to activate it. Open a new terminal inside Kohya SS like this. Just copy and
00:03:55 paste it. It will automatically activate the virtual environment, and it will also give you
00:04:00 a Gradio link like this. Open it, and the Kohya GUI web UI has started on RunPod and is ready to use.
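The installation steps above depend on the Kohya SS virtual environment being active, so that torch lands in that environment and not in the Stable Diffusion installation. A minimal Python sketch of how one might check this before running any install command is shown below. The function names are illustrative and not part of Kohya or RunPod; the sketch only reports which interpreter is active and which torch version, if any, it can see.

```python
import sys
from importlib import metadata
from typing import Optional


def in_virtualenv() -> bool:
    """Return True when this interpreter runs inside a venv.

    Inside a venv, sys.prefix points at the venv folder while
    sys.base_prefix still points at the base Python installation.
    """
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def report() -> str:
    """Summarize which environment a pip install would currently target."""
    env = "virtual environment" if in_virtualenv() else "system Python"
    torch_version = installed_version("torch") or "not installed"
    return f"interpreter: {sys.prefix} ({env}); torch: {torch_version}"


if __name__ == "__main__":
    print(report())
```

Run from the terminal where the Kohya environment was activated, the report should mention a virtual environment; if it says system Python, the activation step was skipped in that terminal.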
00:04:07 As I said, if you don't know how to use Kohya for training, you can watch this amazing tutorial. I
00:04:13 will do a quick demonstration of training. I will use the Realistic Vision full model. To
00:04:18 download it, just copy this. Run it inside the Stable Diffusion models folder so we can use it with our
00:04:24 Automatic1111 web UI. I will also download the best VAE file from this link. It will go into the
00:04:30 VAE folder. You can also download the Realistic Vision version 2 classification images from
00:04:36 this post on our Patreon. I will use these training images, the same as in the last video.
00:04:43 I have uploaded them here. The classification images are ready as well. In the Kohya web UI,
00:04:51 obviously these icons won't work because we are on RunPod. Therefore we need to copy and paste the model
00:04:58 path ourselves, or you can use the automatic models from this dropdown. So I will get the path of the
00:05:05 model from Stable Diffusion, Realistic Vision. Copy path, paste it here. Put a slash at the beginning
00:05:12 of it and our model is ready. Then, as shown in the previous video, ohwx man. I will set the training
00:05:20 images directory manually, which is here: copy path. Let's also set the regularization images. Copy
00:05:27 path like this, and the destination directory will be test1. Prepare training data. Okay,
00:05:33 test1 appeared here. Copy info to folders tab. Okay, everything is copied. In the
00:05:39 training parameters I will leave everything at default, only network rank 256. These are the
00:05:46 best settings that I have found. In the advanced tab, now this is important: don't use xformers,
00:05:52 uncheck it. And finally, let's also save our configuration. For saving, open a notepad
00:06:01 file, type workspace/kohya_test1.json. It will be saved here. Copy and paste it here. Save, and
00:06:10 you will see the kohya_test1.json file is generated. From there you can load it by typing this:
00:06:18 type it here and click load, and it will load the settings. Okay, everything is ready. Let's train the
00:06:23 model, and we will see the entire training in here. By the way, we forgot to set the number of
00:06:30 epochs. Therefore, I will kill and restart: shut down all of the terminals. Okay, let's go back to the
00:06:37 Kohya folder. Open a new terminal and start the web UI with this command once again, like this. Open
00:06:44 the new link. Let's copy the path of our saved configuration file like this, put a slash at the beginning of it, and
00:06:51 click load. The settings are loaded. Let's also set the epochs to 14 and save every one epoch. Save
00:06:58 and click train. Okay, training started. You see there are some error and warning messages.
00:07:05 These are fine. It is working very well. The important thing is: do not use xformers. Okay,
00:07:12 it has started, and the speed you are seeing is 5.4 it per second with batch size one. I can also
00:07:20 increase the batch size. Currently GPU memory usage is only 60% because the Automatic1111 web UI is also
00:07:28 running at the same time on the same GPU. It is also using some VRAM; you see I have opened
00:07:34 it. The first checkpoint is already saved, the second checkpoint is already saved. Now it is processing the
00:07:40 third checkpoint at 5.15 it per second. Okay, the entire training is done in only three minutes and
00:07:48 53 seconds. The files are generated inside the test1 folder, inside model, and here are our checkpoints.
00:07:57 Let's also save the last checkpoint as 14. Then I will select all while holding left Shift,
00:08:05 cut, and move them into my Stable Diffusion web UI folder, inside models, inside the LoRA folder.
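The cut-and-paste step just described can also be scripted. The sketch below is only an illustration under stated assumptions: the function name is made up, the `.safetensors` extension assumes the trainer's save format was left at that setting, and the folder paths in the usage note are guesses at a typical pod layout, not paths confirmed by the video.

```python
import shutil
from pathlib import Path


def move_lora_checkpoints(src_dir, dst_dir, pattern="*.safetensors"):
    """Move every file matching `pattern` from src_dir into dst_dir.

    Returns the list of moved file names. Files that already exist in
    dst_dir are skipped so an earlier training run is never overwritten.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    moved = []
    for ckpt in sorted(src.glob(pattern)):
        target = dst / ckpt.name
        if target.exists():
            continue  # keep the existing file, leave the new one in place
        shutil.move(str(ckpt), str(target))
        moved.append(ckpt.name)
    return moved
```

A call such as `move_lora_checkpoints("/workspace/test1/model", "/workspace/stable-diffusion-webui/models/Lora")` would mirror the manual move, assuming those are the actual folders on your pod; a refresh in the web UI is still needed afterwards.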
00:08:12 Paste here, and all are pasted. Let's refresh our web UI. Refresh the models folder. Let's pick Realistic
00:08:20 Vision. Okay, Realistic Vision is selected. Click show/hide extra networks. In here click LoRA, then click
00:08:27 refresh. Okay, the LoRA checkpoints arrived. Let's pick the last checkpoint and see our result: photo of ohwx
00:08:35 man. Generate, and here is our picture. It doesn't look very good. We need to do some beautifying
00:08:41 and also some checkpoint comparison. So I will go to the tutorials on my GitHub page. I will go to
00:08:48 the generate studio quality realistic photos tutorial. In here I have some prompts. I will copy the negative
00:08:54 prompt as well, and let's say DPM SDE Karras, 30 steps, CFG scale 5. Let's try again. Okay,
00:09:04 still not looking very good, so let's try different checkpoints. Interestingly, the results are not
00:09:11 very good. I have found the reason: I uploaded only one training image, and the model was trained
00:09:19 on this single image. How did I notice it? I noticed it from these processing messages
00:09:27 displayed on the command line interface. You see it says that the 40 ohwx man folder contains one image
00:09:36 file. You may also encounter such a problem, so be careful. Now I will repeat the training and see
00:09:42 what happens. Nothing else is different; I will only change the model output name to
00:09:49 test3, save, hit train. This time it will take more time because previously it was training on only a single
00:09:56 image. Now the training. Okay, it is still seeing only a single image. Oh I see, because we need to
00:10:03 update this folder as well. Don't forget that. So let's kill this too. So I go to the test1 folder.
00:10:09 Go to the image folder in here. I will upload the training images into this folder; otherwise
00:10:16 it won't take effect. Let's go back to Kohya SS and restart. Okay, test4, save, train, and you
00:10:24 see now it has found 13 images. Correct. The number of steps is correct; the total number of steps and other
00:10:31 things are correct. Of course this time it will take 13 times longer. I am not deleting any
00:10:37 of these parts of the video because you may also encounter such problems. You may also make the same
00:10:44 mistakes. This is how you debug your mistake and fix it. Training started
00:10:51 and this time it is taking like 48 minutes. It has been 10 epochs since the training started.
00:10:57 I think this is enough for testing purposes and demonstration. So I will terminate this terminal.
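The mistake debugged above (a training folder that held one image when 13 were expected) can be caught before training starts. The following is a minimal sketch, assuming the prepared image folder contains subfolders named in Kohya's `<repeats>_<name>` style, such as `40_ohwx man`. The function name and the list of image extensions are illustrative choices, not something taken from the video.

```python
import re
from pathlib import Path

# Extensions counted as training images (an assumption; adjust as needed).
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp", ".bmp"}
# Matches folder names like "40_ohwx man": repeats, underscore, concept name.
FOLDER_RE = re.compile(r"^(\d+)_(.+)$")


def summarize_dataset(img_dir):
    """Return [(folder_name, repeats, image_count)] for each dataset subfolder."""
    rows = []
    for sub in sorted(Path(img_dir).iterdir()):
        match = FOLDER_RE.match(sub.name)
        if not sub.is_dir() or match is None:
            continue  # ignore files and folders without a repeats prefix
        count = sum(1 for f in sub.iterdir() if f.suffix.lower() in IMAGE_EXTS)
        rows.append((sub.name, int(match.group(1)), count))
    return rows
```

Printing the result for the image folder inside test1 before pressing train would have shown a count of 1 for the `40_ohwx man` folder, which is the same information the trainer printed on the command line.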
00:11:04 The model files are saved inside the test1 folder, inside model, with the name test4. So I will cut
00:11:13 them and paste them into the LoRA folder. Let's connect to our Stable Diffusion web UI. Let's load
00:11:20 the last prompt. This time we will use the new LoRA. To do that, let's click show/hide extra
00:11:28 networks, LoRA, refresh. Okay, the test4 LoRA has arrived. Let's look at checkpoint 6. Okay, it
00:11:36 looks memorized, overtrained, because there is no stylization. Let's look at a lower checkpoint.
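Comparing checkpoints like this means generating the same prompt with a different LoRA each time. A small sketch of building those prompts is given below. It assumes the Automatic1111 `<lora:name:weight>` prompt syntax and assumes per-epoch files are named like `test4-000002`, which is how Kohya usually names intermediate saves; check the actual file names in your LoRA folder, since the helper names here are hypothetical.

```python
def lora_tag(name, weight=1.0):
    """Build the prompt tag that activates a LoRA in the Automatic1111 web UI."""
    return f"<lora:{name}:{weight:g}>"


def checkpoint_prompts(base_prompt, output_name, epochs, weight=1.0):
    """Return {epoch: prompt} so each epoch checkpoint can be tested in turn.

    Assumes intermediate files are named '<output_name>-<epoch as 6 digits>'.
    """
    prompts = {}
    for epoch in epochs:
        name = f"{output_name}-{epoch:06d}"
        prompts[epoch] = f"{base_prompt} {lora_tag(name, weight)}"
    return prompts


if __name__ == "__main__":
    for epoch, prompt in checkpoint_prompts("photo of ohwx man", "test4", [2, 3, 6]).items():
        print(epoch, prompt)
```

Keeping the seed, sampler, steps, and CFG scale fixed while only the LoRA tag changes makes it easier to see at which epoch the backgrounds and clothing start repeating.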
00:11:43 With checkpoint 2, we are able to get somewhat okay results. However, this is still not very
00:11:51 good. I know the reason: I have repeating backgrounds and the same clothing in my training
00:11:59 images, and when I check the generated images I see that it is generating almost the same backgrounds
00:12:05 in the images, which means it has memorized them. Another thing is, even if I use checkpoint 3, you
00:12:14 see it is the same place as in the training images. That means this model is already overtrained at
00:12:21 checkpoint 3; checkpoint 2 is also already overtrained, I think. Therefore, what we need to do is:
00:12:29 we need to have a better training data set. First of all, this is really important. Another thing is we
00:12:36 need to reduce the number of repeats, because with 40 repeats we are not able to save
00:12:44 checkpoints frequently enough for so little training data. It is saving a checkpoint after every 40
00:12:51 multiplied by 13 steps, that is, 520 steps for every checkpoint. Therefore,
00:12:58 we can reduce this to 20 and have more frequent, more fine-grained checkpoints. Other than that,
00:13:07 network rank 128 may work better on unix. Maybe we can try other optimizers; you see,
00:13:15 there are so many optimizers, but improving our training data set is the number one thing that
00:13:21 will improve our training quality. That is all for today. The link to this page will be in
00:13:27 the description and also in the pinned comment of the video. Everything you need is written here:
00:13:33 I didn't compare realistic vision half model versus full model so you can test both of them.
00:13:39 I will also add the full model link here. If you support us on Patreon I would appreciate that very
00:13:44 much. Also, on our channel, we have amazing other Stable Diffusion related videos as well. Just go
00:13:51 to the playlist, you will see our Stable Diffusion playlist. All of the Stable Diffusion related
00:13:57 videos are in here. Check it out! Also, please support us on Patreon and by joining our YouTube
00:14:03 channel. I would appreciate those very much. And if you star our repository, fork it, and watch it,
00:14:10 I would appreciate that too. You will find a lot of useful stuff in our GitHub repository. You will
00:14:17 find tutorials, other useful readme files, and in our GitHub page all of our Stable Diffusion
00:14:25 tutorials are listed like you are seeing right now. Neatly organized with their thumbnails,
00:14:31 their titles so you can check out these links and see which one of them you want to learn. Hopefully
00:14:38 see you in another amazing video tutorial. And don't forget to join our Discord channel.
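The checkpoint spacing discussed around 00:12:44 is simple arithmetic: with one checkpoint saved per epoch, the gap between checkpoints is the number of training images multiplied by the number of repeats, divided by the batch size. The sketch below works through the figures from the video (13 images, 40 repeats, batch size one). It deliberately ignores regularization images, which may increase the step count the trainer actually reports; the function names are illustrative.

```python
import math


def steps_per_epoch(num_images, repeats, batch_size=1):
    """Steps in one epoch: every image is seen `repeats` times."""
    return math.ceil(num_images * repeats / batch_size)


def checkpoint_steps(num_images, repeats, epochs, batch_size=1):
    """Cumulative step counts at which a per-epoch checkpoint is saved."""
    per_epoch = steps_per_epoch(num_images, repeats, batch_size)
    return [per_epoch * epoch for epoch in range(1, epochs + 1)]


if __name__ == "__main__":
    print(steps_per_epoch(13, 40))      # 520 steps between checkpoints
    print(steps_per_epoch(13, 20))      # 260 steps between checkpoints
    print(checkpoint_steps(13, 20, 3))  # [260, 520, 780]
```

Halving the repeats from 40 to 20 halves the gap between saved checkpoints, which gives more candidates to test before the model starts memorizing the training images.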