## Nova Canvas Fine-Tuning for Character Consistency

This notebook demonstrates how to fine-tune Amazon Nova Canvas to create character-consistent storyboards using images from the animated short film "Picchu".

## Introduction

In this notebook, we'll walk through the process of fine-tuning Amazon Nova Canvas to maintain visual consistency for specific characters (Mayu and her mom) across multiple generated images. This approach allows for more precise control over character appearance than prompt engineering alone.

### Prerequisites
- AWS account with access to Amazon Bedrock
- Appropriate IAM permissions for Bedrock, S3, and related services
- This notebook must be run in the `us-east-1` AWS region

### What You'll Learn
- How to prepare training data from existing character images
- How to configure and run a fine-tuning job on Amazon Nova Canvas
- How to generate character-consistent storyboard frames with your fine-tuned model

You can watch the original "Picchu" animated short film here: [Picchu on YouTube](https://www.youtube.com/watch?v=XfyJbkRV_Eo)

### Setup

First, let's install the required dependencies for this notebook. These packages will help us with image processing, AWS interactions, and visualization.

In [None]:
%pip install -r requirements.txt

### Initialize

Now we'll set up our AWS environment by initializing the necessary clients and defining key variables. This includes:
- Setting up boto3 clients for Bedrock, S3, IAM, and STS
- Defining our S3 bucket and prefix for storing training data
- Setting the base model ID for fine-tuning
- Creating a directory for our training images

In [None]:
%pip install pillow

In [None]:
import boto3
import sagemaker
from sagemaker.utils import name_from_base
import time
import json
from image_processing import process_folders, upload_to_s3
import os
import shutil

sess = sagemaker.Session()
role = sagemaker.get_execution_role()

bucket = sess.default_bucket() # Set a default S3 bucket or use your own
prefix = "picchu-canvas/images"


# Initialize Boto3 Clients
bedrock = boto3.client('bedrock')
bedrock_runtime = boto3.client('bedrock-runtime')
s3 = boto3.client('s3')
iam_client = boto3.client('iam')
sts_client = boto3.client('sts')

# Account and region info
session = boto3.session.Session()
region = session.region_name
account_id = sts_client.get_caller_identity()["Account"]

# Base model id for fine-tuning
model_id = 'amazon.nova-canvas-v1:0'

image_dir = "picchu_images"
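
Because the prerequisites require `us-east-1`, a quick guard right after initialization can catch a misconfigured session early. The following is a minimal sketch; `check_region` and `REQUIRED_REGION` are illustrative names that are not part of the original notebook.

```python
REQUIRED_REGION = "us-east-1"  # taken from the prerequisites of this notebook


def check_region(current_region, required=REQUIRED_REGION):
    """Return True when the session region matches the required region."""
    return current_region == required


# Dry run with sample values; in the notebook, pass the `region` variable instead.
for candidate in ("us-east-1", "eu-west-1", None):
    verdict = "ok" if check_region(candidate) else "not supported by this walkthrough"
    print(f"{candidate}: {verdict}")
```

In the notebook itself, something like `assert check_region(region)` would stop execution before any resources are created in the wrong region.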

In [None]:
def create_or_replace_folder(folder_path):
    if os.path.exists(folder_path):
        shutil.rmtree(folder_path)  # Remove the existing folder and its contents
    os.makedirs(folder_path)         # Create a new, empty folder

create_or_replace_folder(image_dir)

## Download images

In this step, we'll download a pre-prepared set of images from the "Picchu" animated short film. These images will serve as our training data for fine-tuning the Nova Canvas model to consistently generate the main character Mayu and her mom.

In [None]:
!wget --no-check-certificate https://ws-assets-prod-iad-r-iad-ed304a55c2ca1aee.s3.us-east-1.amazonaws.com/3c3519c9-93dc-404d-87f7-4d9bde05f265/picchu_images.zip

!unzip picchu_images.zip

### Prepare the images for fine-tuning

Now we'll process the downloaded images and prepare them for fine-tuning. The `process_folders` function will:
1. Upload the images to our S3 bucket
2. Generate a manifest file that pairs each image with a descriptive caption

The manifest file follows the format required for Amazon Nova Canvas fine-tuning.

For more information on manifest file requirements, see the [Amazon Bedrock documentation on fine-tuning](https://docs.aws.amazon.com/bedrock/latest/userguide/custom-models.html).

In [None]:
image_dirs = [image_dir]  # assumption: process_folders takes a list of image folders
updated_data = process_folders(image_dirs, bucket, prefix)

In [None]:
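# Manifest name is derived from the prefix, e.g. "picchu_manifest.jsonl"
output_file = f'{prefix.split("-")[0]}_manifest.jsonl'
with open(output_file, 'w') as f:
    for item in updated_data:
        # Drop the 'id' field and write one JSON object per line
        item_filtered = {key: value for key, value in item.items() if key != 'id'}
        f.write(json.dumps(item_filtered) + '\n')
print(f"Manifest written to {output_file}")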

### Preview the manifest file

Let's examine the first few entries of our manifest file to understand its structure. Each line contains a JSON object with:
- `image-ref`: The S3 path to the image
- `caption`: A detailed description of the image that helps the model learn the character's appearance and style

In [None]:
!head -n 5 {output_file}
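
Beyond eyeballing the first few lines, the structure described above can be checked programmatically. The sketch below is a hypothetical helper (`validate_manifest` is not part of the notebook's `image_processing` module) and only checks the two fields named in this section; it does not attempt to enforce any other service-side limits. The demo bucket name and caption are placeholders.

```python
import json
import os
import tempfile


def validate_manifest(path):
    """Return a list of (line_number, problem) tuples; an empty list means no problems found."""
    problems = []
    with open(path, encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            if not line.strip():
                problems.append((line_number, "blank line"))
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                problems.append((line_number, "not valid JSON"))
                continue
            if not isinstance(record, dict):
                problems.append((line_number, "not a JSON object"))
                continue
            image_ref = record.get("image-ref", "")
            if not isinstance(image_ref, str) or not image_ref.startswith("s3://"):
                problems.append((line_number, "image-ref is not an s3:// URI"))
            caption = record.get("caption", "")
            if not isinstance(caption, str) or not caption.strip():
                problems.append((line_number, "caption is missing or empty"))
    return problems


# Dry run on a throwaway file: the second record deliberately lacks a caption.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo_manifest.jsonl")
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write(json.dumps({"image-ref": "s3://example-bucket/a.png", "caption": "placeholder caption"}) + "\n")
        f.write(json.dumps({"image-ref": "s3://example-bucket/b.png"}) + "\n")
    demo_problems = validate_manifest(demo_path)

print(demo_problems)
```

In the notebook, calling `validate_manifest(output_file)` before uploading would surface malformed entries before a fine-tuning job is submitted.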

### Upload manifest to S3

Now we'll upload our manifest file to S3. This file will be used by the fine-tuning job to locate and process our training images along with their captions.

In [None]:
training_path = upload_to_s3(output_file, bucket, prefix.replace("images", "manifests"))
training_path
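
The `upload_to_s3` helper comes from the notebook's `image_processing` module, whose source is not shown here. As a rough idea of what such a helper might do, the sketch below uploads one file with the boto3 S3 client's `upload_file` method and returns the resulting S3 URI. The function name `upload_file_sketch` and the recording stand-in client are assumptions for illustration, not the actual implementation.

```python
import os


def upload_file_sketch(s3_client, local_path, bucket, key_prefix):
    """Upload local_path under key_prefix and return its s3:// URI."""
    key = f"{key_prefix.rstrip('/')}/{os.path.basename(local_path)}"
    s3_client.upload_file(local_path, bucket, key)  # boto3: upload_file(Filename, Bucket, Key)
    return f"s3://{bucket}/{key}"


class RecordingS3Client:
    """Stand-in client that records calls so the sketch can be dry-run without AWS access."""

    def __init__(self):
        self.calls = []

    def upload_file(self, Filename, Bucket, Key):
        self.calls.append((Filename, Bucket, Key))


fake_s3 = RecordingS3Client()
demo_uri = upload_file_sketch(fake_s3, "picchu_manifest.jsonl", "example-bucket", "picchu-canvas/manifests")
print(demo_uri)
```

With a real client, the same call would take the `s3` client created during initialization in place of the stand-in.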

## Train Custom Model Using Bedrock

Now we'll begin the process of fine-tuning the Amazon Nova Canvas model using our prepared training data. This involves several steps:
1. Creating the necessary IAM roles and policies
2. Configuring the fine-tuning job parameters
3. Submitting the job to Amazon Bedrock
4. Monitoring the job progress
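
As a preview of step 4, monitoring usually amounts to polling the job status until it reaches a terminal state. The sketch below assumes a job has already been submitted and that its ARN is known; `wait_for_customization_job` and the fake client are illustrative names, while `get_model_customization_job` is the boto3 Bedrock call it relies on. The ARN in the dry run is a placeholder.

```python
import time

# Statuses after which the job will not change any further.
TERMINAL_STATUSES = {"Completed", "Failed", "Stopped"}


def wait_for_customization_job(bedrock_client, job_arn, poll_seconds=60, sleep=time.sleep):
    """Poll the job until it reaches a terminal status and return that status."""
    while True:
        response = bedrock_client.get_model_customization_job(jobIdentifier=job_arn)
        status = response["status"]
        print(f"Job status: {status}")
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)


class FakeBedrockClient:
    """Stand-in that replays a fixed sequence of statuses for a dry run."""

    def __init__(self, statuses):
        self._statuses = iter(statuses)

    def get_model_customization_job(self, jobIdentifier):
        return {"status": next(self._statuses)}


demo_status = wait_for_customization_job(
    FakeBedrockClient(["InProgress", "InProgress", "Completed"]),
    "arn:aws:bedrock:us-east-1:111122223333:model-customization-job/example",
    sleep=lambda seconds: None,  # skip real waiting in the dry run
)
```

With the real `bedrock` client, the default `sleep` and a polling interval of a minute or more keep the number of API calls low while the job runs.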

### Fine-tuning job preparation: creating the required role and policies

We will now prepare the necessary IAM role for the fine-tune job. This includes creating the policies required to run customization jobs with Amazon Bedrock.

### Create Trust relationship
This JSON object defines the trust relationship that allows the Bedrock service to assume a role for interacting with the other required AWS services. The conditions restrict role assumption to a specific account ID and to Bedrock model customization jobs (the `model-customization-job` resource type in the source ARN).
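
The next cell writes this policy as an f-string, where every literal brace has to be doubled. As an alternative sketch, the same document can be built as a Python dictionary and serialized with `json.dumps`, which avoids the escaping. `build_trust_policy` is an illustrative helper name, and the account ID in the dry run is a placeholder.

```python
import json


def build_trust_policy(account_id, region):
    """Return the trust policy as a dict, scoped to one account and to customization jobs."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "bedrock.amazonaws.com"},
                "Action": "sts:AssumeRole",
                "Condition": {
                    "StringEquals": {"aws:SourceAccount": account_id},
                    "ArnEquals": {
                        "aws:SourceArn": f"arn:aws:bedrock:{region}:{account_id}:model-customization-job/*"
                    },
                },
            }
        ],
    }


demo_policy_json = json.dumps(build_trust_policy("111122223333", "us-east-1"), indent=4)
print(demo_policy_json)
```

Either form yields the same JSON; the dictionary version simply makes it harder to produce an unbalanced brace.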

In [None]:
# Trust policy: lets Bedrock assume the role, restricted to this account and to model customization jobs.
# Doubled braces ({{ and }}) are literal braces inside the f-string.
ROLE_DOC = f"""{{
    "Version": "2012-10-17",
    "Statement": [
        {{
            "Effect": "Allow",
            "Principal": {{
                "Service": "bedrock.amazonaws.com"
            }},
            "Action": "sts:AssumeRole",
            "Condition": {{
                "StringEquals": {{
                    "aws:SourceAccount": "{account_id}"
                }},
                "ArnEquals": {{
                    "aws:SourceArn": "arn:aws:bedrock:{region}:{account_id}:model-customization-job/*"
                }}
            }}
        }}
    ]
}}
"""

### Create S3 access policy

This JSON object defines the permissions for the role we want Bedrock to assume. It allows access to the S3 bucket that contains our fine-tuning datasets and permits specific bucket and object operations necessary for the fine-tuning process.

In [None]:
ACCESS_POLICY_DOC = f"""{{
    "Version": "2012-10-17",
    "Statement": [
        {{
            "Effect": "Allow",
            "Action": [
                "s3:AbortMultipartUpload",
                "s3:DeleteObject",
                "s3:PutObject",
                "s3:GetObject",
                "s3:GetBucketAcl",
                "s3:GetBucketNotification",
                "s3:ListBucket",
                "s3:PutBucketNotification"
            ],
            "Resource": [
                "arn:aws:s3:::{bucket}",
                "arn:aws:s3:::{bucket}/*"
            ]
        }}
    ]
}}"""
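
With both documents defined, the remaining work is to create the role, create the policy, and attach one to the other. The sketch below shows one way this might look using the IAM client's `create_role`, `create_policy`, and `attach_role_policy` calls. The helper name, the role and policy names, and the fake client are assumptions for illustration; the notebook may structure this differently.

```python
def create_customization_role(iam, role_name, policy_name, trust_policy_json, access_policy_json):
    """Create the role, create the S3 access policy, attach it, and return the role ARN."""
    role = iam.create_role(RoleName=role_name, AssumeRolePolicyDocument=trust_policy_json)
    policy = iam.create_policy(PolicyName=policy_name, PolicyDocument=access_policy_json)
    iam.attach_role_policy(RoleName=role_name, PolicyArn=policy["Policy"]["Arn"])
    return role["Role"]["Arn"]


class FakeIamClient:
    """Stand-in that records calls so the sketch can be dry-run without AWS access."""

    def __init__(self):
        self.calls = []

    def create_role(self, RoleName, AssumeRolePolicyDocument):
        self.calls.append("create_role")
        return {"Role": {"Arn": f"arn:aws:iam::111122223333:role/{RoleName}"}}

    def create_policy(self, PolicyName, PolicyDocument):
        self.calls.append("create_policy")
        return {"Policy": {"Arn": f"arn:aws:iam::111122223333:policy/{PolicyName}"}}

    def attach_role_policy(self, RoleName, PolicyArn):
        self.calls.append("attach_role_policy")


fake_iam = FakeIamClient()
demo_role_arn = create_customization_role(
    fake_iam, "ExampleCanvasFineTuneRole", "ExampleCanvasFineTuneS3Access", "{}", "{}"
)
print(demo_role_arn)
```

In the notebook, the call would take `iam_client`, `ROLE_DOC`, and `ACCESS_POLICY_DOC` instead of the placeholders. IAM changes can take a few seconds to propagate, so a short pause before submitting the fine-tuning job may be needed.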