# Introduction

In this tutorial, we'll show how to tune several flavors of Llama. For simplicity, we'll use Polaris as the compute platform.

**NOTE:** This tutorial builds on our [Finetuning Tutorial](https://github.com/openlema/lema/blob/main/notebooks/LeMa%20-%20Finetuning%20Tutorial.ipynb). We recommend starting there to get a thorough understanding of how tuning works in our library.

# Prerequisites

This tutorial assumes:
- You have a valid ALCF account with access to Polaris
- You're familiar with our tuning flow
- You're familiar with how to launch lema workflows on Polaris. [Here's a relevant tutorial](https://github.com/openlema/lema/blob/main/notebooks/LeMa%20-%20Deploying%20a%20Job.ipynb)
- You've signed [Llama's agreement on HuggingFace](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B)

# Tuning Llama

We currently have out-of-the-box tuning jobs configured for the following flavors of Llama:

- Llama3.1 8b LoRA: [configs/lema/jobs/polaris/llama8b_lora.yaml](https://github.com/openlema/lema/blob/main/configs/lema/jobs/polaris/llama8b_lora.yaml) ✨
- Llama3.1 8b SFT: [configs/lema/jobs/polaris/llama8b_sft.yaml](https://github.com/openlema/lema/blob/main/configs/lema/jobs/polaris/llama8b_sft.yaml) ✨
- Llama3.1 70b LoRA: [configs/lema/jobs/polaris/llama70b_lora.yaml](https://github.com/openlema/lema/blob/main/configs/lema/jobs/polaris/llama70b_lora.yaml) ✨
- Llama3.1 70b SFT – COMING SOON! 🚀

By default, our tuning jobs use the [`yahma/alpaca-cleaned`](https://huggingface.co/datasets/yahma/alpaca-cleaned) dataset, which is set in the job configs above. We strongly suggest adjusting the dataset and the other parameters in these configs as needed for your specific run.

Before running the job, ensure you've signed Llama's agreement on HuggingFace and have obtained your [HF_TOKEN](https://huggingface.co/docs/huggingface_hub/en/package_reference/environment_variables#hftoken). We'll pass it to our job by appending `envs.HF_TOKEN=$HF_TOKEN` to our launcher command. The token is not strictly required at the moment, because our script copies the model from Polaris' Eagle file system, but you will need it to pull the model directly from HuggingFace.
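Since a missing token only surfaces once the job tries to reach HuggingFace, it can help to check for it before launching. Below is a minimal sketch of such a check; `check_hf_token` is a hypothetical helper name used for illustration and is not part of lema.

```shell
# Hypothetical helper (not part of lema): fail early if HF_TOKEN is unset or empty.
check_hf_token() {
  if [ -z "${HF_TOKEN:-}" ]; then
    echo "HF_TOKEN is not set. Export it before launching the job." >&2
    return 1
  fi
  echo "HF_TOKEN is set."
}
```

Run `check_hf_token` in the same shell session you launch from, and only proceed if it succeeds.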

The example command below uses the `preemptable` queue, which supports up to 10 nodes and up to 3 days of runtime. When debugging, prefer the `debug-scaling` queue, which usually has shorter wait times. If you do, you must also set the `walltime` in your chosen YAML config to 1 hour or less. For more details on available queues, see [ALCF's documentation](https://docs.alcf.anl.gov/polaris/running-jobs/).

```shell
# Replace with your desired config. We use Llama 8B LoRA below.
LEMA_CONFIG_PATH="configs/lema/jobs/polaris/llama8b_lora.yaml"
# If using debug-scaling queue, make sure that the walltime is <= 1 hr.
POLARIS_QUEUE="preemptable"
# Assumes ALCF_USER and HF_TOKEN are already set in your environment.
lema-launch -p "$LEMA_CONFIG_PATH" -c "$POLARIS_QUEUE.$ALCF_USER" envs.HF_TOKEN="$HF_TOKEN" user="$ALCF_USER"
```
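If you switch to the `debug-scaling` queue, a quick sanity check on the walltime can save a rejected submission. The sketch below assumes the walltime is written in PBS-style `HH:MM:SS` form; check your config for the actual format before relying on it. `walltime_within_hour` is a hypothetical helper, not part of lema.

```shell
# Hypothetical helper (not part of lema): succeeds if an HH:MM:SS walltime is <= 1 hour.
# Assumes PBS-style HH:MM:SS input; other formats are not handled.
walltime_within_hour() {
  # awk parses "08" as decimal 8, which avoids the shell treating leading zeros as octal.
  printf '%s\n' "$1" | awk -F: '{ exit !($1 * 3600 + $2 * 60 + $3 <= 3600) }'
}
```

For example, `walltime_within_hour "00:30:00"` succeeds, while `walltime_within_hour "72:00:00"` returns a non-zero status. To locate the current value in your config, `grep -n walltime "$LEMA_CONFIG_PATH"` is usually enough.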