# PEFT Meta-Llama3.1-8B on Dolphin Dataset

This example shows how to perform parameter-efficient fine-tuning (PEFT) of the [Meta-Llama3.1-8B](https://huggingface.co/meta-llama/Llama-3.1-8B) model on the [dolphin](https://huggingface.co/datasets/cognitivecomputations/dolphin) dataset using [NeMo](https://github.com/NVIDIA/NeMo) [Megatron-LM](https://github.com/NVIDIA/Megatron-LM). In this notebook, we use a [Kubeflow Pipeline (v2)](https://www.kubeflow.org/docs/components/pipelines/v2/introduction/) to run the end-to-end PEFT workflow.

## Launch a Kubeflow Notebook server

We need to run this notebook in a [Kubeflow Notebooks](https://www.kubeflow.org/docs/components/notebooks/overview/) JupyterLab notebook server. [Access Kubeflow Central Dashboard](https://github.com/aws-samples/amazon-eks-machine-learning-with-terraform-and-kubeflow/tree/master/README.md#access-kubeflow-central-dashboard-optional), and follow the [Quickstart Guide](https://www.kubeflow.org/docs/components/notebooks/quickstart-guide/) to create an instance of the default JupyterLab notebook server. Connect to the launched notebook server from within the Kubeflow Central Dashboard. Clone this git repository under the home directory on the notebook server, and open this notebook.

## Persistent volumes

Amazon EFS and FSx for Lustre persistent volumes are mounted at `~/pv-efs` and `~/pv-fsx`, respectively, within the notebook server, and at `/efs` and `/fsx` within the job runtime environment.
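
Because the same volumes appear under different mount points, a path seen on the notebook server has to be translated before it is used in a job config. The following is a minimal sketch of that translation, based only on the mount points stated above. The helper name `to_job_path` and the example home directory are hypothetical and not part of this repository.

```python
from pathlib import Path, PurePosixPath

# Mount points as stated in this notebook: directory under the notebook
# server's home -> mount point inside the job runtime environment.
NOTEBOOK_TO_JOB_MOUNTS = {
    "pv-efs": "/efs",
    "pv-fsx": "/fsx",
}


def to_job_path(notebook_path, home=None):
    """Translate a path under ~/pv-efs or ~/pv-fsx to its job-side path.

    Hypothetical helper for illustration only.
    """
    home = PurePosixPath(home) if home is not None else PurePosixPath(Path.home().as_posix())
    text = str(notebook_path)
    # Expand a leading "~" against the chosen home directory.
    if text == "~" or text.startswith("~/"):
        text = str(home) + text[1:]
    try:
        relative = PurePosixPath(text).relative_to(home)
    except ValueError:
        raise ValueError(f"{notebook_path} is not under the home directory {home}") from None
    if not relative.parts or relative.parts[0] not in NOTEBOOK_TO_JOB_MOUNTS:
        raise ValueError(f"{notebook_path} is not on a persistent volume")
    # Swap the notebook-side mount directory for the job-side mount point.
    return str(PurePosixPath(NOTEBOOK_TO_JOB_MOUNTS[relative.parts[0]], *relative.parts[1:]))


# The home directory below is an assumed example value.
print(to_job_path("~/pv-fsx/models/llama", home="/home/jovyan"))  # /fsx/models/llama
```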

## Implicitly defined environment variables

The following variables are implicitly defined by the [pytorch-distributed](../../../charts/machine-learning/training/pytorchjob-distributed/Chart.yaml) Helm chart for use with [Torch distributed run](https://github.com/pytorch/pytorch/blob/main/torch/distributed/run.py):

1. `PET_NNODES`: Maps to `nnodes`
2. `PET_NPROC_PER_NODE`: Maps to `nproc_per_node`
3. `PET_NODE_RANK`: Maps to `node_rank`
4. `PET_MASTER_ADDR`: Maps to `master_addr`
5. `PET_MASTER_PORT`: Maps to `master_port`
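
To make the mapping concrete, the sketch below builds the command-line flags that correspond to whichever of these variables are set. It is illustrative only: Torch distributed run reads the `PET_*` variables itself, so job scripts do not normally need code like this, and the helper name `torchrun_flags` is hypothetical.

```python
import os

# Environment variable -> equivalent Torch distributed run flag,
# following the list above.
PET_TO_FLAG = {
    "PET_NNODES": "--nnodes",
    "PET_NPROC_PER_NODE": "--nproc_per_node",
    "PET_NODE_RANK": "--node_rank",
    "PET_MASTER_ADDR": "--master_addr",
    "PET_MASTER_PORT": "--master_port",
}


def torchrun_flags(env=None):
    """Return the flags equivalent to the PET_* variables present in `env`.

    Hypothetical helper for illustration; defaults to the process environment.
    """
    env = os.environ if env is None else env
    flags = []
    for var, flag in PET_TO_FLAG.items():
        value = env.get(var)
        if value is not None:
            flags.append(f"{flag}={value}")
    return flags


# Example with assumed values for a 2-node job with 8 processes per node.
print(torchrun_flags({"PET_NNODES": "2", "PET_NPROC_PER_NODE": "8"}))
# ['--nnodes=2', '--nproc_per_node=8']
```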

## Create Kubeflow Pipelines Client

We start by creating a client for Kubeflow Pipelines. Since we are running this notebook in a JupyterLab notebook server inside the Kubeflow platform, we can discover the Kubeflow Pipelines endpoint automatically, as shown below.

In [None]:
import kfp

client = kfp.Client()
client

Next, we get our Kubernetes namespace.

In [None]:
ns = client.get_user_namespace()
print(f"user namespace: {ns}")

## PEFT Workflow

Below we define the steps in the PEFT workflow. Each step in the workflow is defined as a Helm chart config. The sequential list of Helm chart configs defines the complete workflow.

In each Helm chart config, the Helm chart `release_name` must be unique among the Helm charts installed within the user namespace. The `repo_url` field specifies the Git repository URL for the Helm chart, and the `path` field specifies the chart's relative path within that Git repository. The `values` field specifies the Helm chart values used by the Helm chart.
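
A quick structural check can catch a malformed config before a pipeline run is submitted. The sketch below is one possible check, not part of the pipeline itself. It assumes the required keys are exactly the ones set by the configs in this notebook, and that `timeout` is a Go-style duration string such as `5m0s` or `30m0s` (the format Helm uses for its timeout option). The helper name `check_chart_config` and the example values are hypothetical.

```python
import re

# Assumption: these are the keys set by every chart config in this notebook.
REQUIRED_KEYS = ("release_name", "namespace", "repo_url", "path", "timeout", "values")

# Assumption: timeouts are Go-style durations, e.g. "5m0s", "30m0s", "90s".
_DURATION = re.compile(r"^(?:\d+(?:\.\d+)?(?:ms|h|m|s))+$")


def check_chart_config(config):
    """Return a list of problems found in a chart config (empty if none)."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in config]
    timeout = config.get("timeout")
    if timeout is not None and not _DURATION.match(str(timeout)):
        problems.append(f"timeout is not a duration string: {timeout!r}")
    return problems


# Hypothetical config used only to exercise the check.
example = {
    "release_name": "demo",
    "namespace": "ns",
    "repo_url": "https://example.com/repo.git",
    "path": "charts/demo",
    "timeout": "30m0s",
    "values": {},
}
print(check_chart_config(example))  # []
```

Running the check over each config just before submitting the run gives earlier feedback than waiting for a chart installation to fail.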


### Step 0: Specify Docker Image

This notebook uses a custom Docker container image for [NeMo](https://github.com/NVIDIA/NeMo.git) [Megatron-LM](https://github.com/NVIDIA/Megatron-LM.git). See the README file in this folder for details on how to build and push the Docker container image. The image must be built on a separate build machine, not on this notebook server. Be sure to set the `image` variable below to the Amazon ECR URI of your Docker image.

In [None]:
image = ''
assert image, "Docker image is required"

### Step 1: Download Hugging Face Pre-trained Model

Below, we define the Helm chart config for downloading the Hugging Face pre-trained model. Set `hf_token` to your Hugging Face access token, which is required to access the gated model.

In [None]:
import yaml

release_name = "nemo-llama31-8b-peft-dolphin"

hf_model_id = "meta-llama/Llama-3.1-8B"
hf_token = ''
assert hf_token, "Hugging Face Token is required to access the gated model"

hf_download_config = {
    "release_name": release_name,
    "namespace": ns,
    "repo_url": "https://github.com/aws-samples/amazon-eks-machine-learning-with-terraform-and-kubeflow.git",
    "path": "charts/machine-learning/model-prep/hf-snapshot",
    "timeout": "5m0s"
}

hf_download_config["values"] = {
    "env": [
        {"name":"HF_MODEL_ID","value":hf_model_id},
        {"name":"HF_TOKEN","value":hf_token}
    ]
}
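
A token pasted into a notebook cell can easily end up saved in the notebook file. One possible alternative, sketched here, is to read the token from an environment variable on the notebook server and fall back to a hidden prompt. The helper name `resolve_hf_token` and the use of an `HF_TOKEN` variable on the notebook server are assumptions for illustration, not something this repository provides.

```python
import os
from getpass import getpass


def resolve_hf_token(env_var="HF_TOKEN", prompt=getpass):
    """Return a non-empty token from the environment, else prompt for it.

    Hypothetical helper; `env_var` names an assumed environment variable.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        # getpass does not echo the typed value into the notebook output.
        token = prompt("Hugging Face token: ").strip()
    if not token:
        raise ValueError("Hugging Face token is required to access the gated model")
    return token
```

With this helper, the assignment in the cell above could become `hf_token = resolve_hf_token()`, leaving the rest of the config unchanged.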

### Step 2: Convert Hugging Face Checkpoint to NeMo Checkpoint

Below we define the Helm chart config for converting the Hugging Face pre-trained model checkpoint to a NeMo checkpoint.

In [None]:

hf_to_nemo_config = {
    "release_name": release_name,
    "namespace": ns,
    "repo_url": "https://github.com/aws-samples/amazon-eks-machine-learning-with-terraform-and-kubeflow.git",
    "path": "charts/machine-learning/data-prep/data-process",
    "timeout": "30m0s"
}

with open("hf_to_nemo.yaml") as file:
    hf_to_nemo_config["values"] = yaml.safe_load(file)
    hf_to_nemo_config["values"]["image"] = image

### Step 3: Preprocess Dolphin Dataset

Below we define the Helm chart config for preprocessing the Hugging Face Dolphin dataset into the format required by NeMo.

In [None]:

preprocess_config = {
    "release_name": release_name,
    "namespace": ns,
    "repo_url": "https://github.com/aws-samples/amazon-eks-machine-learning-with-terraform-and-kubeflow.git",
    "path": "charts/machine-learning/data-prep/data-process",
    "timeout": "30m0s"
}

with open("preprocess.yaml") as file:
    preprocess_config["values"] = yaml.safe_load(file)
    preprocess_config["values"]["image"] = image

### Step 4: PEFT Fine-tuning

Below we define the Helm Chart config for PEFT fine-tuning.

In [None]:

peft_config = {
    "release_name": release_name,
    "namespace": ns,
    "repo_url": "https://github.com/aws-samples/amazon-eks-machine-learning-with-terraform-and-kubeflow.git",
    "path": "charts/machine-learning/training/pytorchjob-distributed",
    "timeout": "30m0s"
}

with open("peft.yaml") as file:
    peft_config["values"] = yaml.safe_load(file)
    peft_config["values"]["image"] = image
    peft_config["values"]["hf_token"] = hf_token

### Step 5: Evaluate

Below we define the Helm chart config for evaluating the PEFT fine-tuned model.

In [None]:
peft_eval_config = {
    "release_name": release_name,
    "namespace": ns,
    "repo_url": "https://github.com/aws-samples/amazon-eks-machine-learning-with-terraform-and-kubeflow.git",
    "path": "charts/machine-learning/training/pytorchjob-distributed",
    "timeout": "30m0s"
}

with open("peft_eval.yaml") as file:
    peft_eval_config["values"] = yaml.safe_load(file)
    peft_eval_config["values"]["image"] = image
    peft_eval_config["values"]["hf_token"] = hf_token

### Step 6: Merge PEFT Model to Base Model

Below we define the Helm chart config for merging the PEFT model weights into the base model.

In [None]:
merge_peft_config = {
    "release_name": release_name,
    "namespace": ns,
    "repo_url": "https://github.com/aws-samples/amazon-eks-machine-learning-with-terraform-and-kubeflow.git",
    "path": "charts/machine-learning/data-prep/data-process",
    "timeout": "30m0s"
}

with open("merge_peft.yaml") as file:
    merge_peft_config["values"] = yaml.safe_load(file)
    merge_peft_config["values"]["image"] = image
    merge_peft_config["values"]["hf_token"] = hf_token

### Step 7: Convert NeMo Checkpoint to Hugging Face Checkpoint

Finally, we define the Helm chart config for converting the NeMo checkpoint to a Hugging Face checkpoint.

In [None]:
nemo_to_hf_config = {
    "release_name": release_name,
    "namespace": ns,
    "repo_url": "https://github.com/aws-samples/amazon-eks-machine-learning-with-terraform-and-kubeflow.git",
    "path": "charts/machine-learning/data-prep/data-process",
    "timeout": "30m0s"
}

with open("nemo_to_hf.yaml") as file:
    nemo_to_hf_config["values"] = yaml.safe_load(file)
    nemo_to_hf_config["values"]["hf_token"] = hf_token
    nemo_to_hf_config["values"]["image"] = image

## Create a New Kubeflow Experiment 

Next, we create a new [Kubeflow Experiment](https://www.kubeflow.org/docs/components/pipelines/v1/concepts/experiment/).

In [None]:
exp_name = "peft-llama31_8b-dolphin"
exp_desc = "PEFT Llama 3.1 8B on dolphin dataset"
exp = client.create_experiment(name=exp_name, description=exp_desc, namespace=ns)
exp

## Run the Pipeline in the Experiment

To run the pipeline, we pass the list of Helm chart configs as the `chart_configs` entry of `arguments`, as shown below.

In [None]:
from datetime import datetime

# Timestamp suffix keeps run names unique across repeated runs.
ts = datetime.now().strftime("%Y-%m-%d-%H-%M-%S")
run_name = f"{exp_name}-run-{ts}"

# Pipeline package path, relative to this notebook's directory.
pipeline_package = "../../../../kfp/pipelines/packages/helm_charts_pipeline.yaml"

# The chart configs are listed in the order in which the steps must run.
pipeline_run = client.create_run_from_pipeline_package(
    pipeline_file=pipeline_package,
    arguments={
        "chart_configs": [
            hf_download_config,
            hf_to_nemo_config,
            preprocess_config,
            peft_config,
            peft_eval_config,
            merge_peft_config,
            nemo_to_hf_config
        ]
    },
    run_name=run_name,
    experiment_name=exp_name,
    namespace=ns,
    enable_caching=False,
    service_account='default'
)
pipeline_run

## What happens during the Pipeline Run

You can check the Kubeflow Pipeline Run logs by following the link in the output above.

During the Pipeline Run, the Helm charts in the `chart_configs` list are installed sequentially, and each installed chart is monitored until it completes successfully or fails. If any chart in the list fails, the Pipeline Run ends in failure. Otherwise, once all the charts in the list complete successfully, the Pipeline Run concludes successfully.
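
The control flow described above can be sketched in a few lines. This is only an illustration of the sequential, stop-at-first-failure behavior, not the pipeline's actual implementation. The function `run_charts_sequentially` and the `install_and_wait` callable are hypothetical stand-ins for whatever the pipeline does to install a chart and monitor it.

```python
def run_charts_sequentially(chart_configs, install_and_wait):
    """Install charts in order and stop at the first failure.

    `install_and_wait` is a hypothetical callable that installs one chart,
    blocks until it finishes, and returns True on success, False on failure.
    Returns (succeeded, number_of_charts_completed).
    """
    completed = 0
    for config in chart_configs:
        if not install_and_wait(config):
            # A failed chart ends the run; later charts are never installed.
            return False, completed
        completed += 1
    return True, completed


# Simulated run with made-up steps in which the second chart fails.
steps = [{"path": "a"}, {"path": "b"}, {"path": "c"}]
ok, done = run_charts_sequentially(steps, lambda cfg: cfg["path"] != "b")
print(ok, done)  # False 1
```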