adding model builders for code-llama2 7b, 13b, and 70b #847

Merged · 15 commits · Apr 26, 2024

Changes from all commits
1 change: 1 addition & 0 deletions README.md
@@ -46,6 +46,7 @@ torchtune currently supports the following models.
|-----------------------------------------------|-----------|
| [Llama3](https://llama.meta.com/llama3) | 8B, 70B [[models](torchtune/models/llama3/_model_builders.py), [configs](recipes/configs/llama3/)] |
| [Llama2](https://llama.meta.com/llama2/) | 7B, 13B, 70B [[models](torchtune/models/llama2/_model_builders.py), [configs](recipes/configs/llama2/)] |
| [Code-Llama2](https://huggingface.co/codellama) | 7B, 13B, 70B [[model](torchtune/models/code_llama2/_model_builders.py), [configs](recipes/configs/code_llama2/)] |
| [Mistral](https://huggingface.co/mistralai) | 7B [[model](torchtune/models/mistral/_model_builders.py), [configs](recipes/configs/mistral/)] |
| [Gemma](https://huggingface.co/collections/google/gemma-release-65d5efbccdbb8c4202ec078b) | 2B [[model](torchtune/models/gemma/_model_builders.py), [configs](recipes/configs/gemma/)] |

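For orientation, here is a minimal sketch of what the new entry points added by this PR look like from Python. The argument-free call is assumed to mirror the existing llama2_7b builder, and the meta-device construction is only a way to avoid allocating real 7B weights for illustration.

```python
# Minimal usage sketch (not part of the diff): instantiate the new
# Code-Llama2 7B builder. Constructing under the meta device avoids
# allocating the full set of weights; the argument-free call is assumed
# to mirror torchtune's existing llama2_7b builder.
import torch

from torchtune.models.code_llama2 import code_llama2_7b

with torch.device("meta"):
    model = code_llama2_7b()

print(sum(p.numel() for p in model.parameters()))  # roughly 6.7B parameters
```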
79 changes: 79 additions & 0 deletions recipes/configs/code_llama2/7B_full_low_memory.yaml
@@ -0,0 +1,79 @@
# Config for single device full finetuning in full_finetune_single_device.py
# using a Code-Llama2 7B model
#
# This config assumes that you've run the following command before launching
# this run:
#   tune download codellama/CodeLlama-7b-hf --output-dir /tmp/CodeLlama-7b-hf
#
# The default config uses an optimizer from bitsandbytes. If you do not have it
# installed, you can install it with
#   pip install bitsandbytes
#
# To launch on a single device, run the following command from root:
#   tune run full_finetune_single_device --config code_llama2/7B_full_low_memory
#
# You can add specific overrides through the command line. For example, to
# override the checkpointer directory while launching training, you can run:
#   tune run full_finetune_single_device --config code_llama2/7B_full_low_memory checkpointer.checkpoint_dir=<YOUR_CHECKPOINT_DIR>
#
# This config works only for training on a single device.


# Tokenizer
tokenizer:
  _component_: torchtune.models.llama2.llama2_tokenizer
  path: /tmp/CodeLlama-7b-hf/tokenizer.model

# Dataset
dataset:
  _component_: torchtune.datasets.alpaca_dataset
  train_on_input: True
seed: null
shuffle: True

# Model Arguments
model:
  _component_: torchtune.models.code_llama2.code_llama2_7b

checkpointer:
  _component_: torchtune.utils.FullModelHFCheckpointer
  checkpoint_dir: /tmp/CodeLlama-7b-hf
  checkpoint_files: [
    pytorch_model-00001-of-00003.bin,
    pytorch_model-00002-of-00003.bin,
    pytorch_model-00003-of-00003.bin
  ]
  recipe_checkpoint: null
  output_dir: /tmp/CodeLlama-7b-hf
  model_type: LLAMA2
resume_from_checkpoint: False

# Fine-tuning arguments
batch_size: 2
epochs: 3
optimizer:
  _component_: bitsandbytes.optim.PagedAdamW
  lr: 2e-5
optimizer_in_bwd: True
loss:
  _component_: torch.nn.CrossEntropyLoss
max_steps_per_epoch: null
gradient_accumulation_steps: 1
compile: False

# Training environment
device: cuda

# Memory management
enable_activation_checkpointing: True

# Reduced precision
dtype: bf16

# Logging
metric_logger:
  _component_: torchtune.utils.metric_logging.DiskLogger
  log_dir: ${output_dir}
output_dir: /tmp/code_llama2_finetune
log_every_n_steps: 1
Contributor comment:

Also please add log_peak_memory_stats: False in these configs. It won't error out without it, but right now we do a safe check on the config inside the recipe, which we'd eventually like to remove (keeping configs as the source of truth).

log_peak_memory_stats: False
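For context, the "safe check on the config inside the recipe" could look roughly like the snippet below: read the flag with a fallback when the key is missing. The helper name and the use of OmegaConf's get are assumptions for illustration, not the actual torchtune recipe code.

```python
# Hypothetical illustration of a recipe-side defensive default for a missing
# config key; the real torchtune recipe may implement this differently.
from omegaconf import DictConfig, OmegaConf


def resolve_log_peak_memory_stats(cfg: DictConfig) -> bool:
    # Fall back to False when the key is absent from the YAML config.
    return bool(cfg.get("log_peak_memory_stats", False))


cfg = OmegaConf.create({"log_every_n_steps": 1})
print(resolve_log_peak_memory_stats(cfg))  # False, since the key is missing
```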
89 changes: 89 additions & 0 deletions recipes/configs/code_llama2/7B_lora_single_device.yaml
@@ -0,0 +1,89 @@
# Config for single device LoRA finetuning in lora_finetune_single_device.py
# using a Code-Llama2 7B model
#
# This config assumes that you've run the following command before launching
# this run:
#   tune download codellama/CodeLlama-7b-hf --output-dir /tmp/CodeLlama-7b-hf
#
# To launch on a single device, run the following command from root:
#   tune run lora_finetune_single_device --config code_llama2/7B_lora_single_device
#
# You can add specific overrides through the command line. For example, to
# override the checkpointer directory while launching training, you can run:
#   tune run lora_finetune_single_device --config code_llama2/7B_lora_single_device checkpointer.checkpoint_dir=<YOUR_CHECKPOINT_DIR>
#
# This config works only for training on a single device.

# Model Arguments
model:
  _component_: torchtune.models.code_llama2.lora_code_llama2_7b
  lora_attn_modules: ['q_proj', 'v_proj']
  apply_lora_to_mlp: False
  apply_lora_to_output: False
  lora_rank: 8
  lora_alpha: 16

# Tokenizer
tokenizer:
  _component_: torchtune.models.llama2.llama2_tokenizer
  path: /tmp/CodeLlama-7b-hf/tokenizer.model

# Dataset
dataset:
  _component_: torchtune.datasets.alpaca_cleaned_dataset
  train_on_input: True
seed: null
shuffle: True

checkpointer:
  _component_: torchtune.utils.FullModelHFCheckpointer
  checkpoint_dir: /tmp/CodeLlama-7b-hf
  checkpoint_files: [
    pytorch_model-00001-of-00003.bin,
    pytorch_model-00002-of-00003.bin,
    pytorch_model-00003-of-00003.bin
  ]
  adapter_checkpoint: null
  recipe_checkpoint: null
  output_dir: /tmp/CodeLlama-7b-hf
  model_type: LLAMA2

# Fine-tuning arguments
batch_size: 2
epochs: 1
max_steps_per_epoch: null
gradient_accumulation_steps: 64
compile: False

optimizer:
  _component_: torch.optim.AdamW
  weight_decay: 0.01
  lr: 3e-4

lr_scheduler:
  _component_: torchtune.modules.get_cosine_schedule_with_warmup
  num_warmup_steps: 100

loss:
  _component_: torch.nn.CrossEntropyLoss

# Training environment
device: cuda
enable_activation_checkpointing: True
dtype: bf16

# Logging
metric_logger:
  _component_: torchtune.utils.metric_logging.DiskLogger
  log_dir: ${output_dir}
output_dir: /tmp/lora_code_llama2_finetune_output
log_every_n_steps: 1
log_peak_memory_stats: False

profiler:
  _component_: torchtune.utils.profiler
  enabled: False
  output_dir: ${output_dir}/torchtune_perf_tracing.json
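For reference, the model section of this config corresponds to calling the new LoRA builder directly; the sketch below mirrors those hyperparameters in Python and assumes the builder accepts the same keyword arguments shown in the YAML (as the existing lora_llama2_7b builder does).

```python
# Sketch of what the `model` section above resolves to; keyword arguments
# mirror the YAML and are assumed to match the builder's signature.
import torch

from torchtune.models.code_llama2 import lora_code_llama2_7b

with torch.device("meta"):  # avoid materializing 7B weights for illustration
    model = lora_code_llama2_7b(
        lora_attn_modules=["q_proj", "v_proj"],
        apply_lora_to_mlp=False,
        apply_lora_to_output=False,
        lora_rank=8,
        lora_alpha=16,
    )
```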
92 changes: 92 additions & 0 deletions recipes/configs/code_llama2/7B_qlora_single_device.yaml
@@ -0,0 +1,92 @@
# Config for single device QLoRA finetuning in lora_finetune_single_device.py
# using a Code-Llama2 7B model
#
# This config assumes that you've run the following command before launching
# this run:
#   tune download codellama/CodeLlama-7b-hf --output-dir /tmp/CodeLlama-7b-hf
#
# To launch on a single device, run the following command from root:
#   tune run lora_finetune_single_device --config code_llama2/7B_qlora_single_device
#
# You can add specific overrides through the command line. For example, to
# override the checkpointer directory while launching training, you can run:
#   tune run lora_finetune_single_device --config code_llama2/7B_qlora_single_device checkpointer.checkpoint_dir=<YOUR_CHECKPOINT_DIR>
#
# This config works only for training on a single device.

# Model Arguments
model:
  _component_: torchtune.models.code_llama2.qlora_code_llama2_7b
  lora_attn_modules: ['q_proj', 'v_proj']
  apply_lora_to_mlp: False
  apply_lora_to_output: False
  lora_rank: 8
  lora_alpha: 16

# Tokenizer
tokenizer:
  _component_: torchtune.models.llama2.llama2_tokenizer
  path: /tmp/CodeLlama-7b-hf/tokenizer.model

# Dataset
dataset:
  _component_: torchtune.datasets.alpaca_cleaned_dataset
  train_on_input: True
seed: null
shuffle: True

checkpointer:
  _component_: torchtune.utils.FullModelHFCheckpointer
  checkpoint_dir: /tmp/CodeLlama-7b-hf
  checkpoint_files: [
    pytorch_model-00001-of-00003.bin,
    pytorch_model-00002-of-00003.bin,
    pytorch_model-00003-of-00003.bin
  ]
  adapter_checkpoint: null
  recipe_checkpoint: null
  output_dir: /tmp/CodeLlama-7b-hf
  model_type: LLAMA2
resume_from_checkpoint: False

# Fine-tuning arguments and training
batch_size: 2
epochs: 1
max_steps_per_epoch: null
gradient_accumulation_steps: 64
compile: False

optimizer:
  _component_: torch.optim.AdamW
  weight_decay: 0.01
  lr: 3e-4

lr_scheduler:
  _component_: torchtune.modules.get_cosine_schedule_with_warmup
  num_warmup_steps: 100

loss:
  _component_: torch.nn.CrossEntropyLoss

# Training environment
device: cuda
enable_activation_checkpointing: True
dtype: bf16

# Logging
metric_logger:
  _component_: torchtune.utils.metric_logging.DiskLogger
  log_dir: ${output_dir}
output_dir: /tmp/qlora_code_llama2_finetune_output
log_every_n_steps: 1
log_peak_memory_stats: False

# Showcase the usage of the PyTorch profiler
# Set enabled to False as it's only needed for debugging training
profiler:
  _component_: torchtune.utils.profiler
  enabled: False
  output_dir: ${output_dir}/torchtune_perf_tracing.json
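Throughout these configs, _component_ names a dotted import path that the recipe instantiates with the sibling keys passed as keyword arguments. The snippet below is a rough, self-contained sketch of that convention, not torchtune's actual config instantiation code.

```python
# Rough sketch of the `_component_` convention used in these YAML configs:
# import the dotted path and call it with the remaining keys as kwargs.
# This is an illustration, not torchtune's actual instantiation utility.
import importlib
from typing import Any, Dict


def instantiate(node: Dict[str, Any]) -> Any:
    module_path, _, attr = node["_component_"].rpartition(".")
    component = getattr(importlib.import_module(module_path), attr)
    kwargs = {k: v for k, v in node.items() if k != "_component_"}
    return component(**kwargs)


# e.g. the loss section of the config above:
loss_cfg = {"_component_": "torch.nn.CrossEntropyLoss"}
loss_fn = instantiate(loss_cfg)
print(type(loss_fn).__name__)  # CrossEntropyLoss
```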
12 changes: 12 additions & 0 deletions torchtune/_recipe_registry.py
@@ -31,6 +31,10 @@ class Recipe:
name="llama2/7B_full_low_memory",
file_path="llama2/7B_full_low_memory.yaml",
),
Config(
name="code_llama2/7B_full_low_memory",
file_path="code_llama2/7B_full_low_memory.yaml",
),
Config(
name="llama3/8B_full_single_device",
file_path="llama3/8B_full_single_device.yaml",
@@ -66,6 +70,14 @@ class Recipe:
name="llama2/7B_qlora_single_device",
file_path="llama2/7B_qlora_single_device.yaml",
),
Config(
name="code_llama2/7B_lora_single_device",
file_path="code_llama2/7B_lora_single_device.yaml",
),
Config(
name="code_llama2/7B_qlora_single_device",
file_path="code_llama2/7B_qlora_single_device.yaml",
),
Config(
name="llama3/8B_lora_single_device",
file_path="llama3/8B_lora_single_device.yaml",
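The registry entries above map a user-facing config name (what you pass to tune run ... --config <name>) to a YAML file under recipes/configs/. A stripped-down sketch of that record shape, with field names taken from the diff and everything else assumed, is:

```python
# Stripped-down sketch of the registry entry shape implied by the diff above;
# the real torchtune Config (and enclosing Recipe) dataclasses may carry
# additional fields.
from dataclasses import dataclass


@dataclass
class Config:
    name: str       # value passed as `--config <name>` to `tune run`
    file_path: str  # YAML path relative to recipes/configs/


new_configs = [
    Config("code_llama2/7B_full_low_memory", "code_llama2/7B_full_low_memory.yaml"),
    Config("code_llama2/7B_lora_single_device", "code_llama2/7B_lora_single_device.yaml"),
    Config("code_llama2/7B_qlora_single_device", "code_llama2/7B_qlora_single_device.yaml"),
]
```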
8 changes: 7 additions & 1 deletion torchtune/models/__init__.py
@@ -4,4 +4,10 @@
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree.

from torchtune.models import convert_weights, gemma, llama2, mistral  # noqa
from torchtune.models import (  # noqa
    code_llama2,
    convert_weights,
    gemma,
    llama2,
    mistral,
)
27 changes: 27 additions & 0 deletions torchtune/models/code_llama2/__init__.py
@@ -0,0 +1,27 @@
# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree.

from ._model_builders import (  # noqa
    code_llama2_13b,
    code_llama2_70b,
    code_llama2_7b,
    lora_code_llama2_13b,
    lora_code_llama2_70b,
    lora_code_llama2_7b,
    qlora_code_llama2_13b,
    qlora_code_llama2_7b,
)

__all__ = [
    "code_llama2_13b",
    "code_llama2_70b",
    "code_llama2_7b",
    "lora_code_llama2_13b",
    "lora_code_llama2_70b",
    "lora_code_llama2_7b",
    "qlora_code_llama2_13b",
    "qlora_code_llama2_7b",
]