# Stable Diffusion XL Fine-Tuning with Kohya SS

*This solution creates all the necessary components to get you started quickly with fine-tuning Stable Diffusion XL with a custom dataset, using a custom training container that leverages Kohya SS to do the fine-tuning. Stable Diffusion allows you to generate images from text prompts. The training is coordinated with a SageMaker pipeline and a SageMaker Training job. This solution automates many of the tedious tasks you must do to set up the necessary infrastructure to run your training. You will use this Notebook to set up the solution. For a general overview of the solution components, see the README file.*

## Step One - Create the necessary resources through CloudFormation

This solution is deployed with a CloudFormation template (template.yml), located in this project directory. You can create the stack either through the AWS console or with the CLI command below. To use a specific version of Kohya SS, update the "KOHYA_SS_VERSION" environment variable in the template.yml file.

**To run the CloudFormation template via the AWS console, follow the steps below:**

1. Navigate to the CloudFormation console and click "Create Stack".
2. Select "Upload a template file" and click "Choose file". Select the template.yml file located in this project directory and click "Next".
3. Enter a stack name. Modify the resource names if required, or leave the defaults. Click "Next". On the next page, click "Next" again.
4. Scroll to the bottom of the page. In the "Capabilities and transforms" section, acknowledge the three checkbox items to confirm potential IAM updates.
5. Click "Submit" to create the stack.

**To run the CloudFormation template with all defaults via the AWS CLI v2, run the next command:**

In [None]:
!aws cloudformation create-stack --stack-name kohya-ss-fine-tuning-stack --template-body file://template.yml --capabilities "CAPABILITY_IAM" "CAPABILITY_NAMED_IAM" "CAPABILITY_AUTO_EXPAND" --parameters ParameterKey=TrainingS3BucketNamePrefix,ParameterValue=kohya-ss-fine-tuning ParameterKey=TrainingContainerRepositoryName,ParameterValue=kohya-ss-fine-tuning ParameterKey=TrainingContainerCodeRepositoryName,ParameterValue=kohya-ss-fine-tuning-container-image ParameterKey=TrainingContainerBuildProjectName,ParameterValue=kohya-ss-fine-tuning-build-container ParameterKey=TrainingPipelineName,ParameterValue=kohya-ss-fine-tuning-pipeline
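
The same stack could also be created from Python. The following is a minimal sketch, assuming boto3 is installed and AWS credentials are configured. The helper names "build_stack_parameters" and "create_stack" are hypothetical; the stack name, capabilities, parameter keys, and default values are copied from the CLI command above.

```python
# Parameter keys and default values copied from the CLI command in this step.
DEFAULT_PARAMETERS = {
    "TrainingS3BucketNamePrefix": "kohya-ss-fine-tuning",
    "TrainingContainerRepositoryName": "kohya-ss-fine-tuning",
    "TrainingContainerCodeRepositoryName": "kohya-ss-fine-tuning-container-image",
    "TrainingContainerBuildProjectName": "kohya-ss-fine-tuning-build-container",
    "TrainingPipelineName": "kohya-ss-fine-tuning-pipeline",
}


def build_stack_parameters(overrides=None):
    """Return the parameter list in the shape that create_stack expects."""
    values = {**DEFAULT_PARAMETERS, **(overrides or {})}
    return [{"ParameterKey": key, "ParameterValue": value} for key, value in values.items()]


def create_stack(stack_name="kohya-ss-fine-tuning-stack", template_path="template.yml", overrides=None):
    """Create the stack and wait until creation completes (needs AWS credentials)."""
    import boto3  # imported lazily so build_stack_parameters works without boto3

    cloudformation = boto3.client("cloudformation")
    with open(template_path) as template_file:
        template_body = template_file.read()
    cloudformation.create_stack(
        StackName=stack_name,
        TemplateBody=template_body,
        Capabilities=["CAPABILITY_IAM", "CAPABILITY_NAMED_IAM", "CAPABILITY_AUTO_EXPAND"],
        Parameters=build_stack_parameters(overrides),
    )
    # Block until CloudFormation reports that the stack was created.
    cloudformation.get_waiter("stack_create_complete").wait(StackName=stack_name)
```

Calling "create_stack()" with no arguments mirrors the CLI command; passing "overrides" (for example a different "TrainingPipelineName") changes individual resource names.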

## Step Two - Upload the fine-tuning configuration file and your custom images to the S3 bucket

The next step is to upload the following to the S3 Bucket that was created as part of Step One:

- **Kohya SS SDXL Configuration File:** this .toml file defines the fine-tuning configuration parameters (instead of using the Kohya GUI).
- **Custom Image Assets:** the set of images you provide for the fine-tuning process, which you will upload to the S3 Bucket.

The structure of the S3 Bucket is intended to be the following:

    bucket/0001-dataset/kohya-sdxl-config.toml
    bucket/0001-dataset/<asset-folder-name>/        (images and caption files go here)
    bucket/0002-dataset/kohya-sdxl-config.toml
    bucket/0002-dataset/<asset-folder-name>/        (images and caption files go here)
    ...

- The <asset-folder-name> must be named properly for the fine-tuning to be successful. The required format is described later in this step, where you upload your assets.
- Each "xxxx-dataset" prefix can hold a separate dataset with its own config file.
- Do not change the "kohya-sdxl-config.toml" file name. If you change it, you must also change the file name in the "train" file.
- The SageMaker Training job downloads the config file and the asset folder during the training process.

**Keep in mind that the name you choose for "xxxx-dataset" is the same value you will specify as a parameter when launching the SageMaker Pipeline.**
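
To illustrate the layout above, the sketch below builds the S3 keys for one dataset prefix. The helper name "dataset_keys" and the image and caption file names are hypothetical, and the asset folder name is borrowed from the example given later in this step. Only the "kohya-sdxl-config.toml" file name and the prefix structure come from this notebook.

```python
import posixpath

# Fixed config file name expected by the "train" file.
CONFIG_FILE_NAME = "kohya-sdxl-config.toml"


def dataset_keys(dataset_prefix, asset_folder_name, file_names):
    """Map a dataset prefix and local file names to the S3 keys in the layout."""
    return {
        "config": posixpath.join(dataset_prefix, CONFIG_FILE_NAME),
        "assets": [
            posixpath.join(dataset_prefix, asset_folder_name, name)
            for name in file_names
        ],
    }


# Hypothetical file names, used only to show the resulting keys.
layout = dataset_keys("0001-dataset", "30_dwjz_man", ["img01.png", "img01.txt"])
print(layout["config"])
for key in layout["assets"]:
    print(key)
```

The "0001-dataset" value passed here is the same value you would later supply, together with the bucket name, when launching the pipeline.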

**To upload the config file to the S3 Bucket, run the next command after you confirm the bucket name:**

In [None]:
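# Set the variables needed for the remaining steps. Update these only if you have made changes to this solution that would have changed these parameter values.
# Each "!" command runs in its own subshell, so the values are exported as environment variables that later "!" commands can read.
import os
import boto3

# The base path of the local code that was cloned from the git repo in the README
os.environ["local_code_base_path"] = "amazon-sagemaker-examples/use-cases/text-to-image-fine-tuning"
os.environ["codecommit_repo_name"] = "kohya-ss-fine-tuning-container-image"
# The bucket name is the bucket prefix from Step One followed by the AWS account ID (looked up through STS if AWS_ACCOUNT_ID is not set)
aws_account_id = os.environ.get("AWS_ACCOUNT_ID") or boto3.client("sts").get_caller_identity()["Account"]
os.environ["s3_training_bucket"] = f"kohya-ss-fine-tuning-{aws_account_id}"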

In [None]:
# ************** IMPORTANT NOTE: Make sure the S3 bucket below matches the S3 bucket name that was created in Step One **************

# Before uploading this config file, update your desired parameter values. Refer to the Appendix of this notebook for configuration notes. At a minimum, you will want to change the output_name parameter, which is the name of the output model.
!aws s3 cp ~/${local_code_base_path}/config/kohya-sdxl-config.toml "s3://${s3_training_bucket}/0001-dataset/"

Next, upload your custom image assets to the same S3 Bucket.

The <asset-folder-name> must be named properly, according to the Kohya SS guidelines. This naming convention defines the number of repetitions, the trigger word, the class name, etc. For example, 30_dwjz_man specifies 30 repetitions, the trigger prompt word "dwjz", and the class name "man". Name this prefix in S3 according to your requirements, and manually upload your images to this prefix directory. You must upload your assets before continuing with the next steps.
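
As a quick sanity check before uploading, a folder name can be split into its parts. The sketch below assumes the underscore-separated form shown in the example ("<repetitions>_<trigger word>_<class name>"); the function and type names are hypothetical, and the Kohya SS documentation remains the authority on the accepted formats.

```python
from typing import NamedTuple


class AssetFolder(NamedTuple):
    repeats: int
    trigger_word: str
    class_name: str


def parse_asset_folder_name(name):
    """Split '<repeats>_<trigger>_<class>' into its parts (format assumed from the example)."""
    repeats, separator, rest = name.partition("_")
    if not separator or not repeats.isdecimal():
        raise ValueError(f"expected a name starting with '<repeats>_', got {name!r}")
    trigger_word, _, class_name = rest.partition("_")
    if not trigger_word:
        raise ValueError(f"missing trigger word in {name!r}")
    return AssetFolder(int(repeats), trigger_word, class_name)


print(parse_asset_folder_name("30_dwjz_man"))
# AssetFolder(repeats=30, trigger_word='dwjz', class_name='man')
```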

To become more familiar with Kohya SS fine-tuning, see the references at https://github.com/bmaltais/kohya_ss. Fine-tuning has many variables, and there is currently no single accepted pattern for generating great results. As a starting point, make sure the training runs for enough steps, the assets have good resolution, and the dataset contains enough images.
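
To get a feel for what "enough steps" means, the step count can be roughly estimated from the dataset size. The sketch below uses a common approximation (images multiplied by repetitions, divided by batch size, multiplied by epochs) and ignores details such as resolution bucketing, regularization images, and gradient accumulation, so treat the result as an estimate only. The function name and the sample numbers are hypothetical.

```python
import math


def estimate_training_steps(num_images, repeats, epochs, batch_size=1):
    """Approximate the total number of optimizer steps for a training run."""
    if min(num_images, repeats, epochs, batch_size) <= 0:
        raise ValueError("all arguments must be positive")
    # Each epoch sees every image `repeats` times, grouped into batches.
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch * epochs


# 20 images in a "30_..." folder, 10 epochs, batch size 4 (illustrative values).
print(estimate_training_steps(num_images=20, repeats=30, epochs=10, batch_size=4))  # 1500
```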

## Step Three - Upload the necessary code to the CodeCommit repository

The code required for this solution is in the "code" directory of this project. These files should be uploaded to the CodeCommit repository that was created by the CloudFormation template. This repository contains the code required to build the custom training container. Any update to the code in this repository triggers a build of the container image, which is then pushed to ECR (i.e. through an EventBridge rule). Running the commands below kicks off the process that creates the training container image.

- The "buildspec.yml" file creates the container image by leveraging the GitHub repository for Kohya SS, and pushes the training image to ECR
- The "Dockerfile" file is used to override the Dockerfile in the Kohya SS project, enabling it for use with SageMaker Training
- The "train" file calls the Kohya SS program to do the fine-tuning, and is invoked when the SageMaker Training job kicks off

**To copy these files to the CodeCommit repository, run the next commands:**

In [None]:
# Initial configuration for accessing CodeCommit. Reference: https://docs.aws.amazon.com/codecommit/latest/userguide/setting-up-git-remote-codecommit.html

# Install latest pip version
!curl -O https://bootstrap.pypa.io/get-pip.py
!python3 get-pip.py --user

# Install git-remote-codecommit
!pip install git-remote-codecommit

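# Clone the CodeCommit repository
# ************** IMPORTANT NOTE: Make sure the region and the repository name below match the CodeCommit repository name that was created in Step One **************
!git clone codecommit::us-west-2://${codecommit_repo_name} ~/${codecommit_repo_name}
# Change into the cloned repository (the path is resolved in Python so that the environment variable is expanded)
import os
os.chdir(os.path.expanduser(os.path.expandvars("~/${codecommit_repo_name}")))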

# Copy the "code" directory files from the sagemaker-examples repository to the local git repository 
# We will commit the changes to this local git repository to the CodeCommit repository in our AWS account
!cp -r ~/${local_code_base_path}/code/* ~/${codecommit_repo_name}
!git add .
!git -c user.name=Author -c user.email=author@example.com commit -m "initial commit"
# Push the current local branch (whatever its name) to the remote "main" branch
!git push origin HEAD:main

## Step Four - Initiate the SageMaker Pipeline to start training

Note: If you are running through this Notebook for the first time, you must ensure the previous step has finished uploading the container image to ECR before you continue.
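
One way to confirm that the image has arrived is to poll ECR. The sketch below is a minimal example, assuming a boto3 ECR client is passed in; the function names are hypothetical, and the image tag to look for depends on how "buildspec.yml" tags the image, so it is left as a parameter.

```python
import time


def image_tag_exists(ecr_client, repository_name, image_tag):
    """Return True if the repository holds an image with the given tag."""
    kwargs = {"repositoryName": repository_name}
    while True:
        page = ecr_client.list_images(**kwargs)
        if any(image.get("imageTag") == image_tag for image in page.get("imageIds", [])):
            return True
        next_token = page.get("nextToken")
        if not next_token:
            return False
        kwargs["nextToken"] = next_token  # fetch the next page of results


def wait_for_image(ecr_client, repository_name, image_tag, poll_seconds=30, timeout_seconds=3600):
    """Poll until the tagged image appears, or raise TimeoutError."""
    deadline = time.monotonic() + timeout_seconds
    while not image_tag_exists(ecr_client, repository_name, image_tag):
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{repository_name}:{image_tag} not found in time")
        time.sleep(poll_seconds)
```

With the default names from Step One, a call might look like wait_for_image(boto3.client("ecr"), "kohya-ss-fine-tuning", "latest"), where "latest" is an assumed tag that you should replace with the tag your build actually produces.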

To run a SageMaker pipeline, navigate to SageMaker Studio and follow the steps below:

1. In the left navigation pane, click "Pipelines".
2. Navigate to the pipeline named "kohya-ss-fine-tuning-pipeline" and click it. If you changed the default name in Step One, select that one instead.
3. Click "Create execution". Then enter a name for the execution.
4. Update the parameter values if necessary, and click "Start" to execute the pipeline.

Parameters:

- InputS3DatasetLocation: the S3 prefix containing the training resources (e.g. kohya-ss-fine-tuning-<aws-account-id>/0001-dataset)
- OutputS3ModelLocation: where the resulting model will be output (e.g. kohya-ss-fine-tuning-<aws-account-id>/model-outputs)
- TrainingDockerImage: the latest ECR image tag
- TrainingInstanceType: the instance type to run the training on
- TrainingVolumeSizeInGB: the volume size of the training instance
- MaxTrainingRuntimeInSeconds: the maximum time the training is allowed to run

For training that requires many epochs/steps, also consider increasing MaxTrainingRuntimeInSeconds (currently set to 24 hours). You might also consider different instance types and volume sizes if your use case requires it.
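
The pipeline can also be started programmatically instead of through SageMaker Studio. The following is a minimal sketch, assuming a boto3 SageMaker client is passed in. The helper names and the display name are hypothetical; the parameter names, pipeline name, and example locations are taken from this notebook. Parameters that are not supplied keep the defaults defined in the pipeline.

```python
def build_pipeline_parameters(account_id, dataset_prefix="0001-dataset",
                              bucket_prefix="kohya-ss-fine-tuning", overrides=None):
    """Build the parameter list for a pipeline execution."""
    bucket = f"{bucket_prefix}-{account_id}"
    values = {
        "InputS3DatasetLocation": f"{bucket}/{dataset_prefix}",
        "OutputS3ModelLocation": f"{bucket}/model-outputs",
    }
    values.update(overrides or {})
    # Pipeline parameter values are passed as strings.
    return [{"Name": name, "Value": str(value)} for name, value in values.items()]


def start_training(sagemaker_client, parameters,
                   pipeline_name="kohya-ss-fine-tuning-pipeline",
                   display_name="fine-tuning-run"):
    """Start the pipeline and return the ARN of the new execution."""
    response = sagemaker_client.start_pipeline_execution(
        PipelineName=pipeline_name,
        PipelineExecutionDisplayName=display_name,
        PipelineParameters=parameters,
    )
    return response["PipelineExecutionArn"]
```

For a long run, the overrides could include, for example, {"MaxTrainingRuntimeInSeconds": 172800} to allow 48 hours; that value is purely illustrative.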

## Appendix

#### The Kohya configuration .toml file
This file contains the config values that are passed to the Kohya program for training. If you change the config file name, you must also change it in the "train" file.

The configuration format is not specific to Stable Diffusion XL and can be applied to other pre-trained models. However, if you modify the config file for another model, also change the entrypoint script in the "train" file, which currently points to "sdxl_train_network.py".

The configuration contained in this sample repository is one possible configuration for SDXL. Some parameters are commented out because they are either optional for SDXL or do not apply to SDXL. There is currently no consensus on optimal parameter values, so you will need to try different permutations of the configuration and compare the output models.

To give you some initial direction, try modifying these hyperparameters first:
- learning_rate
- text_encoder_lr
- unet_lr
- optimizer_type
- network_dim

Please note that some config parameters rely on underlying hardware/GPU type (e.g. mixed_precision=bf16). You must ensure that your training instance has the proper hardware configuration.
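
Because the hyperparameters listed above interact, it can help to enumerate the combinations you intend to compare before launching any runs. The sketch below generates every combination from a small search space. The candidate values are placeholders for illustration only, not recommendations, and the function name is hypothetical.

```python
from itertools import product

# Placeholder candidate values; replace them with the values you want to compare.
SEARCH_SPACE = {
    "learning_rate": [1e-4, 5e-5],
    "network_dim": [32, 64],
    "optimizer_type": ["AdamW8bit", "Adafactor"],
}


def config_variants(search_space):
    """Yield one dict of hyperparameter overrides per combination."""
    names = list(search_space)
    for values in product(*(search_space[name] for name in names)):
        yield dict(zip(names, values))


variants = list(config_variants(SEARCH_SPACE))
print(len(variants))  # 2 * 2 * 2 = 8 combinations
print(variants[0])
```

Since each "xxxx-dataset" prefix can hold its own config file, one option is to write each variant into a separate prefix and run the pipeline once per prefix.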


#### Kohya enhancements
The following enhancements could be made to this solution. They are not currently enabled.
- Sampling. Support could be added for sampling, which outputs images at regular intervals during the training process. Sampling is controlled by the "sample_*" parameters in the configuration file.
- Regularization. Support could be added for regularization images, stored in a specific directory and exposed through an additional data channel that SageMaker Training can leverage. The directory is specified by the "reg_data_dir" parameter in the configuration file.
- Captions. Support could be added for auto-generating caption files for the images before training. Currently, you must manually add caption files to the S3 directory.


#### Inference
TODO: this solution trains a LoRA; the steps for running inference with it still need to be listed here.


#### Model output upload time
The resulting fine-tuned model is likely to be a few GB, which takes time to upload to S3 once the training has completed. Enhancements may be made to reduce this upload time in the future.


#### CloudFormation template enhancements
Consider restricting the permissions of the following roles. They currently have unrestricted Administrator permissions.

- SageMakerServiceRole
- PipelineExecutionRole



#TODO: captions, and image naming convention