# Local Fine-Tuning of a Vid-LLM for Improved Performance

In the previous lab, you performed fine-tuning using a remote SageMaker training job. If your pipeline works, this is a very cost-effective way to utilize GPU resources.
However, when running a remote training job, you sometimes wait for quite some time, only to discover that the job failed because of a small code issue.

To achieve faster developer iterations, we will learn how to fine-tune a Video-Language Model (Vid-LLM) locally using SageMaker local mode.
We'll use the Qwen2-VL-2B-Instruct model and apply the LoRA (Low-Rank Adaptation) fine-tuning technique.

In this lab you will learn how to:
1. Set up the Docker environment
2. Download the training dataset: we'll use the [LLaVA-Video-small-swift](https://huggingface.co/datasets/malterei/LLaVA-Video-small-swift) dataset for fine-tuning
3. Create `config.yaml` with the necessary settings for local GPU execution
4. Define the fine-tuning function using the [MS-SWIFT framework](https://github.com/modelscope/ms-swift)
5. Create and run the SageMaker pipeline in local mode

After completing this lab, you will have:

* A fine-tuned Vid-LLM model optimized for your specific use case.
* Understanding of local fine-tuning process using SageMaker local mode.
* Experience with MS-SWIFT framework for video-language models.


## 1. Set Up Docker Environment

First, we'll install Docker to handle our containerized workloads and 
validate the Docker installation by running the `docker version` command.


In [None]:
%%bash

# see https://docs.docker.com/engine/install/ubuntu/#install-using-the-repository
sudo apt-get update
sudo apt-get install -y ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

# Add the repository to Apt sources:
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update

## Currently only Docker version 20.10.X is supported in Studio: see https://docs.aws.amazon.com/sagemaker/latest/dg/studio-updated-local.html
# pick the latest patch from:
# apt-cache madison docker-ce | awk '{ print $3 }' | grep -i 20.10
VERSION_STRING=5:20.10.24~3-0~ubuntu-jammy
sudo apt-get install docker-ce-cli=$VERSION_STRING docker-compose-plugin -y

# validate the Docker Client is able to access Docker Server at [unix:///docker/proxy.sock]
docker version
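
The version constraint noted in the install cell can also be checked from Python before starting the pipeline. The following is a minimal sketch: the helper names `is_supported_docker_version` and `docker_client_version` are hypothetical, and the `20.10.` prefix is taken from the comment in the cell above.

```python
import shutil
import subprocess

def is_supported_docker_version(version, required_prefix="20.10."):
    """Return True if a version string such as '20.10.24' belongs to the required series."""
    return version.strip().startswith(required_prefix)

def docker_client_version():
    """Return the Docker client version string, or None if it cannot be determined."""
    if shutil.which("docker") is None:
        return None
    result = subprocess.run(
        ["docker", "version", "--format", "{{.Client.Version}}"],
        capture_output=True,
        text=True,
    )
    # Only stdout is inspected; an empty result is treated as "unknown"
    return result.stdout.strip() or None

if __name__ == "__main__":
    version = docker_client_version()
    if version is None:
        print("Docker client version could not be determined")
    else:
        print(version, "supported:", is_supported_docker_version(version))
```

If the check reports an unsupported version, re-run the install cell with a `VERSION_STRING` from the 20.10 series.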

## 2. Download Training Dataset

We'll use the [LLaVA-Video-small-swift](https://huggingface.co/datasets/malterei/LLaVA-Video-small-swift) dataset for fine-tuning.

In [None]:
import os
from sagemaker.session import Session
from huggingface_hub import snapshot_download

default_bucket_name = Session().default_bucket()
dataset_dir = "./data/"
dataset_dir_s3 = f"s3://{default_bucket_name}/mydataset/"

file_path = snapshot_download(
    repo_id="malterei/LLaVA-Video-small-swift",
    repo_type="dataset",
    local_dir=dataset_dir
)
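
The fine-tuning function defined later copies its data from `dataset_dir_s3`, so the downloaded files need to be present under that S3 prefix before the pipeline runs. The cell above does not show an upload step, so the sketch below is an assumption about how it could be done: it checks for the two file names used as defaults later in this lab (`train.jsonl`, `validation.jsonl`) and only builds, without running, an `aws s3 cp` command. Both helper names are hypothetical.

```python
from pathlib import Path

def missing_dataset_files(dataset_dir, expected=("train.jsonl", "validation.jsonl")):
    """Return the expected file names that are absent from dataset_dir."""
    root = Path(dataset_dir)
    return [name for name in expected if not (root / name).is_file()]

def build_upload_command(local_dir, s3_uri):
    """Build (but do not run) an `aws s3 cp` command copying local_dir to s3_uri."""
    return ["aws", "s3", "cp", str(local_dir), s3_uri, "--recursive"]

if __name__ == "__main__":
    # "./data/" matches dataset_dir above; the bucket name is a placeholder
    print("missing files:", missing_dataset_files("./data/"))
    print(" ".join(build_upload_command("./data/", "s3://<bucket>/mydataset/")))
```

In the notebook, the command could be executed with `subprocess.run(build_upload_command(dataset_dir, dataset_dir_s3), check=True)` once no files are reported missing.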

## 3. Configure SageMaker Environment

Set up the configuration for local GPU execution:

* The cell below creates a YAML configuration file named `config.yaml` for the SageMaker remote function.
* `ImageUri:` The Docker image to use, e.g. SageMaker Distribution 1.11: `public.ecr.aws/sagemaker/sagemaker-distribution:1.11-gpu`.
* `InstanceType:` We use `local_gpu` so that local mode runs with GPU enabled.
* `Dependencies:` We use `./requirements.txt` to install additional packages.
* `IncludeLocalWorkDir:` Includes the local working directory in the job.
* `PreExecutionCommands:` Sets the permissions for the `/opt/ml/model` directory and installs `packaging`. Add any additional commands you might need as preparation.
* `CustomFileFilter:` File patterns to ignore when uploading the local working directory for remote execution.


In [None]:
config_yaml = f"""
SchemaVersion: '1.0'
SageMaker:
  PythonSDK:
    Modules:
      RemoteFunction:
        # role arn is not required if in SageMaker Notebook instance or SageMaker Studio
        # Uncomment the following line and replace with the right execution role if in a local IDE
        # RoleArn: <replace the role arn here>
        # S3RootUri: <replace with bucket prefix>
        ImageUri: "public.ecr.aws/sagemaker/sagemaker-distribution:1.11-gpu"
        InstanceType: local_gpu
        Dependencies: ./requirements.txt
        IncludeLocalWorkDir: true
        PreExecutionCommands:
        - "sudo chmod -R 777 /opt/ml/model"
        - "pip install packaging"
        CustomFileFilter:
          IgnoreNamePatterns:
          - "data/*"
          - "output/*"
          - "accelerate/*"
          - "container/*"
          - "ms-swift/*"
          - "models/*"
          - "*.ipynb"
          - "__pycache__"

"""

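# Use a context manager so the file handle is closed and the content is flushed to disk
with open('config.yaml', 'w') as f:
    print(config_yaml, file=f)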
print(config_yaml)

This cell sets the environment variable `SAGEMAKER_USER_CONFIG_OVERRIDE` to the current working directory. The SageMaker SDK uses this variable to locate the `config.yaml` file created in the previous cell.

In [None]:
os.environ["SAGEMAKER_USER_CONFIG_OVERRIDE"] = os.getcwd()
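
Before starting a run, it can help to confirm that the override actually points at an existing configuration file. The following is a minimal sketch; `resolve_config_path` is a hypothetical helper, and it assumes the override may name either a directory containing `config.yaml` or the file itself.

```python
import os

def resolve_config_path(env=None, filename="config.yaml"):
    """Return the config file implied by SAGEMAKER_USER_CONFIG_OVERRIDE, or None if unset or missing."""
    env = os.environ if env is None else env
    override = env.get("SAGEMAKER_USER_CONFIG_OVERRIDE")
    if not override:
        return None
    # Assumption: a directory override means "<dir>/config.yaml"; otherwise treat it as a file path
    candidate = os.path.join(override, filename) if os.path.isdir(override) else override
    return candidate if os.path.isfile(candidate) else None

if __name__ == "__main__":
    print("config file in use:", resolve_config_path())
```

A result of `None` suggests that the variable is unset or that `config.yaml` was written to a different directory.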

## 4. Define Fine-tuning Function

Create the fine-tuning function using the [MS-SWIFT framework](https://github.com/modelscope/ms-swift):

Key parameters:

  * **Model**: Qwen2-VL-2B-Instruct
  * **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
  * **NFRAMES**: The number of frames sampled from each video as model input.
  * **MAX_PIXELS**: The maximum resolution per frame in pixels. The demo below uses 78400 (100x28x28); a larger value such as 313600 (400x28x28) increases the resolution.
  * **GPU Configuration**: `CUDA_VISIBLE_DEVICES` determines which GPU devices will be used for training.

For additional parameters see the [ms-swift cli documentation](https://github.com/modelscope/ms-swift/blob/main/docs/source_en/Instruction/Command-line-parameters.md).

This is a simplified version of the fine-tuning process for demonstration purposes. For production use cases, you might want to adjust parameters like frame count and resolution.
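
The `MAX_PIXELS` values in this lab are written as multiples of 28x28 pixels. As a small worked example, the sketch below computes the value from a number of 28x28 units and formats the environment settings as strings; the helper names `max_pixels_for` and `video_env` are hypothetical.

```python
UNIT_SIDE = 28  # side length in pixels of the 28x28 unit used in this lab

def max_pixels_for(units, unit_side=UNIT_SIDE):
    """Return the MAX_PIXELS value for a given number of 28x28 units."""
    return units * unit_side * unit_side

def video_env(nframes, units):
    """Return the video-related environment settings as strings."""
    return {"NFRAMES": str(nframes), "MAX_PIXELS": str(max_pixels_for(units))}

if __name__ == "__main__":
    print(max_pixels_for(100))  # demo setting: 78400
    print(max_pixels_for(400))  # higher resolution: 313600
    print(video_env(nframes=2, units=100))
```

Raising either the frame count or the number of units increases the amount of visual input per sample, and with it the memory needed during training.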

This pipeline requires at least a `g5.2xlarge` instance to run in local mode.

In [None]:
import json
import os
import subprocess

def fine_tune_video(training_data_s3, train_data_path="train.jsonl", validation_data_path="validation.jsonl"):
    from swift.llm.utils import SftArguments
    from swift.llm.sft import llm_sft, get_sft_main

    # Copy the training data from S3 into the container's working directory
    dataset_dir = "."
    os.makedirs(dataset_dir, exist_ok=True)
    # check=True raises CalledProcessError if the copy fails, instead of continuing without data
    subprocess.run(['aws', 's3', 'cp', training_data_s3, dataset_dir, '--recursive'], check=True)
    train_data_local_path = os.path.join(dataset_dir, train_data_path)
    validation_data_local_path = os.path.join(dataset_dir, validation_data_path)

    # Build the ms-swift entry point for supervised fine-tuning
    sft_main = get_sft_main(SftArguments, llm_sft)
    
    os.environ["NFRAMES"]=json.dumps(2) # The number of frames for video used for model inputs
    os.environ["MAX_PIXELS"]=json.dumps(78400) # Resolution setting, 100*28*28 for demo purposes only, should be increased for production
    os.environ["CUDA_VISIBLE_DEVICES"]="0" # devices to be used
    os.environ["NPROC_PER_NODE"]="1"

    # Configure fine-tuning arguments
    argv = [
        '--model_type', 'qwen2-vl-2b-instruct',
        '--model_id_or_path', 'Qwen/Qwen2-VL-2B-Instruct', 
        '--sft_type', 'lora', 
        '--output_dir', '/opt/ml/model' ,
        '--max_length', '1048',
        '--dataset', train_data_local_path, 
        '--val_dataset', validation_data_local_path]
    
    sft_main(argv)
    return "done"
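
When experimenting with several parameter combinations, it can be convenient to keep the settings in a dictionary and derive the flat argument list from it. The following is a minimal sketch of that idea; `build_argv` is a hypothetical helper and not part of ms-swift, and the keys shown are the flags already used in the cell above.

```python
def build_argv(params):
    """Flatten a {name: value} mapping into ['--name', 'value', ...] in insertion order."""
    argv = []
    for name, value in params.items():
        argv += [f"--{name}", str(value)]
    return argv

if __name__ == "__main__":
    params = {
        "model_type": "qwen2-vl-2b-instruct",
        "sft_type": "lora",
        "output_dir": "/opt/ml/model",
        "max_length": 1048,
    }
    print(build_argv(params))
```

The resulting list has the same shape as the `argv` list passed to `sft_main` above.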

## 5. Create and Run Pipeline

The next cell defines a function that:
- Sets up a SageMaker training pipeline
- Configures training steps using the previously defined fine-tuning function
- Handles pipeline execution in local mode


In [None]:
from sagemaker import get_execution_role
from sagemaker.workflow.function_step import step
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.pipeline_definition_config import PipelineDefinitionConfig
from sagemaker.workflow.pipeline_context import LocalPipelineSession

def run_pipeline(local_mode=True):

    train_result = step(fine_tune_video, name="finetune")(dataset_dir_s3)
    steps = [train_result]
    
    role = get_execution_role()
    local_pipeline_session = LocalPipelineSession()
    more_params = {}
    if local_mode:
        more_params["sagemaker_session"] = local_pipeline_session 
    
    pipeline = Pipeline(
        name="projectz",
        parameters=[],
        steps=steps,
        pipeline_definition_config=PipelineDefinitionConfig(use_custom_job_prefix=True),        
        **more_params
    )

    pipeline.upsert(role_arn=role)
    pipeline.start()

Execute the pipeline in local mode by calling `run_pipeline(local_mode=True)`.

In [None]:
run_pipeline(local_mode=True)

Observe the pipeline execution in local mode, which runs with Docker and Docker Compose.

### Troubleshooting
Error: `9olvadyii5-sagemaker-local  | RuntimeError: DataLoader worker (pid(s) 335) exited unexpectedly`.

If you see this error, it means the following:

* We managed to start the execution of a SageMaker pipeline locally using Docker and Docker Compose.
  * The local container to run the pipeline was set up correctly
  * The training dataset was downloaded correctly
  * The model was downloaded correctly
  * Fine-tuning started loading the dataset
* Loading the dataset failed due to insufficient memory. Make sure that you use at least a `g5.2xlarge` instance so that there is enough CPU memory for the dataset.
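
To rule out the memory cause before re-running, you can check how much CPU memory is visible to the notebook. The following is a minimal sketch that relies on `os.sysconf`, which is available on Linux. The helper names are hypothetical, and the 30 GiB threshold is an assumption chosen to sit just under the 32 GiB of a `g5.2xlarge`; container memory limits are not taken into account.

```python
import os

def total_memory_gib():
    """Return the total physical memory in GiB, or None if it cannot be determined."""
    try:
        page_size = os.sysconf("SC_PAGE_SIZE")
        page_count = os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        return None
    if page_size <= 0 or page_count <= 0:
        return None
    return page_size * page_count / 1024**3

def has_enough_memory(total_gib, required_gib=30.0):
    """Return True if the reported memory meets the assumed threshold."""
    return total_gib is not None and total_gib >= required_gib

if __name__ == "__main__":
    total = total_memory_gib()
    print("total memory (GiB):", total)
    print("enough for this lab (assumed threshold):", has_enough_memory(total))
```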