meta-llama · wukaixingxp · Jun 5, 2024 · Jun 5, 2024 · Jun 10, 2024 · Jun 11, 2024
diff --git a/.github/scripts/spellcheck_conf/wordlist.txt b/.github/scripts/spellcheck_conf/wordlist.txt
@@ -1350,4 +1350,10 @@ SalesBot
 Weaviate
 MediaGen
 SDXL
-SVD
+SVD
+QLoRA
+composable
+hackable
+safetensor
+PyPI
+torchtune
diff --git a/recipes/finetuning/torchtune.md b/recipes/finetuning/torchtune.md
@@ -0,0 +1,130 @@
+# torchtune recipe for Meta Llama 3 finetuning
+
+torchtune is a PyTorch-native library for easily authoring, fine-tuning and experimenting with LLMs.
+
+torchtune provides:
+
+- Native-PyTorch implementations of popular LLMs using composable and modular building blocks
+- Easy-to-use and hackable training recipes for popular fine-tuning techniques (LoRA, QLoRA) - no trainers, no frameworks, just PyTorch!
+- YAML configs for easily configuring training, evaluation, quantization or inference recipes
+- Built-in support for many popular dataset formats and prompt templates to help you quickly get started with training
+
+Here we will show the how to use torchtune to easily finetune meta Llama 3. For more usage details, please check the [torchtune repo](https://github.com/pytorch/torchtune/tree/main).
+
+## Installation
+
+**Step 1:** [Install PyTorch](https://pytorch.org/get-started/locally/). torchtune is tested with the latest stable PyTorch release as well as the preview nightly version.
+
+**Step 2:** The latest stable version of torchtune is hosted on PyPI and can be downloaded with the following command:
+
+```bash
+pip install torchtune
+```
+
+To confirm that the package is installed correctly, you can run the following command:
+
+```bash
+tune --help
+```
+
+And should see the following output:
+
+```bash
+usage: tune [-h] {ls,cp,download,run,validate} ...
+
+Welcome to the torchtune CLI!
+
+options:
+  -h, --help            show this help message and exit
+
+...
+```
+
+You can also install the latest and greatest torchtune has to offer by [installing a nightly build](https://pytorch.org/torchtune/main/install.html).
+
+
+
+---
+
+
+
+
+## Meta Llama3 fine-tuning steps
+
+torchtune supports fine-tuning for the Llama3 8B and 70B size models. It currently support LoRA, QLoRA and full fine-tune on a single GPU as well as LoRA and full fine-tune on multiple devices for the 8B model, and LoRA on multiple devices for the 70B model. For all the details, take a look at the [tutorial](https://pytorch.org/torchtune/main/tutorials/llama3.html).
+
+
+**Note**: The Llama3 LoRA and QLoRA configs default to the instruct fine-tuned models.
+This is because not all special token embeddings are initialized in the base 8B and 70B models.
+
+In the initial experiments for Llama3-8B, QLoRA has a peak allocated memory of ``~9GB`` while LoRA on a single GPU has a peak allocated memory of ``~19GB``. To get started, you can use the default configs to kick off training.
+
+
+### torchtune repo download
+We need to clone the torchtune repo to get the configs.
+
+```bash
+git clone git@github.com:pytorch/torchtune.git
+cd torchtune/recipes/configs/
+```
+
+### Model download
+Select which model you want to download, in this example we want to use meta-llama/Meta-Llama-3-8B.
+```bash
+tune download meta-llama/Meta-Llama-3-8B \
+--output-dir /tmp/Meta-Llama-3-8B \
+--hf-token <HF_TOKEN> \
+```
+
+
+> Tip: Set your environment variable `HF_TOKEN` or pass in `--hf-token` to the command in order to validate your access.
+You can find your token at https://huggingface.co/settings/tokens
+
+### Single GPU finetune
+
+LoRA 8B
+
+```bash
+tune run lora_finetune_single_device --config llama3/8B_lora_single_device
+```
+
+QLoRA 8B
+
+```bash
+tune run lora_finetune_single_device --config llama3/8B_qlora_single_device
+```
+
+Full 8B
+
+```bash
+tune run full_finetune_single_device --config llama3/8B_full_single_device
+```
+
+### Multi GPU finetune
+
+Full 8B
+
+```bash
+tune run --nproc_per_node 4 full_finetune_distributed --config llama3/8B_full
+```
+
+LoRA 8B
+
+```bash
+tune run --nproc_per_node 2 lora_finetune_distributed --config llama3/8B_lora
+```
+
+LoRA 70B
+
+Note that the download command for the Meta-Llama3 70B model slightly differs from download commands for the 8B models. This is because it use the HuggingFace [safetensor](https://huggingface.co/docs/safetensors/en/index) model format to load the model. To download the 70B model, run
+```bash
+tune download meta-llama/Meta-Llama-3-70b --hf-token <> --output-dir /tmp/Meta-Llama-3-70b --ignore-patterns "original/consolidated*"
+```
+
+Then, a finetune can be kicked off:
+
+```bash
+tune run --nproc_per_node 8 lora_finetune_distributed --config recipes/configs/llama3/70B_lora.yaml
+```
+
+You can find a full list of all the Llama3 configs [here](https://github.com/pytorch/torchtune/tree/main/recipes/configs/llama3).