# 🦎 LazyAxolotl

> 🗣️ [Large Language Model Course](https://github.com/mlabonne/llm-course)

❤️ Created by [@maximelabonne](https://twitter.com/maximelabonne).

* This notebook lets you **fine-tune your LLMs** using [Axolotl](https://github.com/OpenAccess-AI-Collective/axolotl) and [Runpod](https://runpod.io?ref=9nvk2srl) (please consider using my [referral link](https://runpod.io?ref=9nvk2srl)).
* It can also use [LLM AutoEval](https://github.com/mlabonne/llm-autoeval) to automatically evaluate the trained model using [Nous' benchmark suite](https://huggingface.co/spaces/mlabonne/Yet_Another_LLM_Leaderboard).
* You can find Axolotl YAML configurations (SFT or DPO) on [GitHub](https://github.com/OpenAccess-AI-Collective/axolotl/tree/main/examples) or [Hugging Face](https://huggingface.co/models?other=axolotl&sort=likes).

In [None]:
MODEL = "AlphaMonarch-7B"
yaml_config = """
base_model: mlabonne/NeuralMonarch-7B
model_type: MistralForCausalLM
tokenizer_type: LlamaTokenizer
is_mistral_derived_model: true

load_in_8bit: false
load_in_4bit: true
strict: false

rl: dpo
chat_template: chatml
datasets:
  - path: mlabonne/chatml-OpenHermes2.5-dpo-binarized-alpha
    split: train
    type: chatml.intel
dataset_prepared_path:
val_set_size: 0.01
output_dir: ./out

adapter: qlora
lora_model_dir:

sequence_len: 1800
sample_packing: false
pad_to_sequence_len: false

lora_r: 16
lora_alpha: 16
lora_dropout: 0.05
lora_target_linear: true
lora_fan_in_fan_out:
lora_target_modules:

wandb_project: axolotl
wandb_entity:
wandb_watch:
wandb_name:
wandb_log_model:

gradient_accumulation_steps: 8
micro_batch_size: 1
num_epochs: 1
optimizer: paged_adamw_32bit
lr_scheduler: cosine
learning_rate: 5e-7

train_on_inputs: false
group_by_length: false
bf16: true
fp16: false
tf32: true

gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_steps: 100
evals_per_epoch: 1
eval_table_size:
eval_table_max_new_tokens: 128
save_steps: 1080
max_steps: 1080
debug:
deepspeed:
weight_decay: 0.0
fsdp:
fsdp_config:
special_tokens:
"""

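The batch-related settings in this config combine into an effective batch size and a total number of samples consumed during training. The sketch below works this out in plain Python, assuming standard data-parallel training where each GPU processes `micro_batch_size` samples per forward pass. `training_budget` is a hypothetical helper for illustration, not part of Axolotl.

```python
def training_budget(micro_batch_size, gradient_accumulation_steps, max_steps,
                    warmup_steps, num_gpus=1):
    """Summarize how much data a run consumes (hypothetical helper)."""
    # Samples contributing to each optimizer update across all GPUs.
    effective_batch = micro_batch_size * gradient_accumulation_steps * num_gpus
    return {
        "effective_batch_size": effective_batch,
        # Samples consumed if the run reaches max_steps.
        "samples_seen": effective_batch * max_steps,
        # Share of the run spent ramping up the learning rate.
        "warmup_fraction": warmup_steps / max_steps,
    }


# Values copied from the config above, with a single GPU.
budget = training_budget(
    micro_batch_size=1,
    gradient_accumulation_steps=8,
    max_steps=1080,
    warmup_steps=100,
)
print(budget)
```

With these values, each optimizer update uses 8 samples, so 1,080 steps cover 8,640 samples, and warmup takes roughly 9% of the run.
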
In [None]:
!pip install -qqq runpod --progress-bar off

import requests
import runpod
import yaml
from google.colab import userdata

def upload_to_github_gist(text, gist_name, gh_token):
    """Upload `text` as a secret gist and return its raw URL, or None on failure."""
    # Create the gist content
    gist_content = {
        "public": False,  # set to True if you want it to be a public gist
        "files": {
            gist_name: {  # use a .txt extension in gist_name for plain text
                "content": text
            }
        },
    }

    # Headers for the request
    headers = {
        "Authorization": f"token {gh_token}",
        "Accept": "application/vnd.github.v3+json",
    }

    # Make the request
    response = requests.post(
        "https://api.github.com/gists", headers=headers, json=gist_content
    )

    if response.status_code == 201:
        gist_data = response.json()
        raw_url = gist_data['files'][gist_name]['raw_url']
        print(f"Uploaded gist successfully! Raw URL: {raw_url}")
        return raw_url
    else:
        print(
            f"Failed to upload gist. Status code: {response.status_code}. Response: {response.text}"
        )
        return None

# @title ## Training parameters
GPU = "NVIDIA GeForce RTX 3090" # @param ["NVIDIA A100 80GB PCIe", "NVIDIA A100-SXM4-80GB", "NVIDIA A30", "NVIDIA A40", "NVIDIA GeForce RTX 3070", "NVIDIA GeForce RTX 3080", "NVIDIA GeForce RTX 3080 Ti", "NVIDIA GeForce RTX 3090", "NVIDIA GeForce RTX 3090 Ti", "NVIDIA GeForce RTX 4070 Ti", "NVIDIA GeForce RTX 4080", "NVIDIA GeForce RTX 4090", "NVIDIA H100 80GB HBM3", "NVIDIA H100 PCIe", "NVIDIA L4", "NVIDIA L40", "NVIDIA RTX 4000 Ada Generation", "NVIDIA RTX 4000 SFF Ada Generation", "NVIDIA RTX 5000 Ada Generation", "NVIDIA RTX 6000 Ada Generation", "NVIDIA RTX A2000", "NVIDIA RTX A4000", "NVIDIA RTX A4500", "NVIDIA RTX A5000", "NVIDIA RTX A6000", "Tesla V100-FHHL-16GB", "Tesla V100-PCIE-16GB", "Tesla V100-SXM2-16GB", "Tesla V100-SXM2-32GB"]
NUMBER_OF_GPUS = 1 # @param {type:"slider", min:1, max:8, step:1}
CONTAINER_DISK = 75 # @param {type:"slider", min:50, max:500, step:25}
CLOUD_TYPE = "COMMUNITY" # @param ["COMMUNITY", "SECURE"]
SCRIPT = "https://gist.githubusercontent.com/mlabonne/9d85e1bb8fc3efe8649b677845c83bdb/raw" # @param {type:"string"}
LLM_AUTOEVAL = True # @param {type:"boolean"}
DEBUG = False # @param {type:"boolean"}

# @markdown ---

# @markdown ## Tokens
# @markdown Enter the names of your tokens as saved in the Secrets tab.
USERNAME = "mlabonne" # @param {type:"string"}
RUNPOD_TOKEN = "runpod" # @param {type:"string"}
HUGGING_FACE_TOKEN = "HF_TOKEN" # @param {type:"string"}
WANDB_TOKEN = "wandb" # @param {type:"string"}
GITHUB_TOKEN = "github" # @param {type:"string"}

# Environment variables
runpod.api_key = userdata.get(RUNPOD_TOKEN)
WANDB_API_KEY = userdata.get(WANDB_TOKEN)
HF_TOKEN = userdata.get(HUGGING_FACE_TOKEN)
GITHUB_API_TOKEN = userdata.get(GITHUB_TOKEN)

# Make sure it's a valid YAML file
config = yaml.safe_load(yaml_config)

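# Upload the YAML file to GitHub and stop if the upload failed
gist_url = upload_to_github_gist(yaml_config, "config.yaml", GITHUB_API_TOKEN)
if gist_url is None:
    raise RuntimeError("Could not upload config.yaml as a gist; see the error above.")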

# Summary
base_model = config.get('base_model', 'Unknown model')
dataset_info = []
datasets = config.get('datasets', [])
for dataset in datasets:
    path = dataset.get('path', 'Unknown path')
    dtype = dataset.get('type', 'Unknown type')
    dataset_info.append(f"{path} ({dtype})")
datasets_summary = ', '.join(dataset_info)
print(f"This run trains {base_model} on {datasets_summary}.")

# Create a pod
pod = runpod.create_pod(
    name=f"LazyAxolotl - {MODEL}",
    image_name="winglian/axolotl-cloud:main-latest",
    gpu_type_id=GPU,
    cloud_type=CLOUD_TYPE,
    gpu_count=NUMBER_OF_GPUS,
    volume_in_gb=0,
    container_disk_in_gb=CONTAINER_DISK,
    template_id="eul6o46pab",
    env={
        "HF_TOKEN": HF_TOKEN,
        "SCRIPT": SCRIPT,
        "WANDB_API_KEY": WANDB_API_KEY,
        "GIST_URL": gist_url,
        "MODEL": MODEL,
        "BASE_MODEL": config['base_model'],
        "USERNAME": USERNAME,
        "LLM_AUTOEVAL": LLM_AUTOEVAL,
        "BENCHMARK": "nous",
        "GITHUB_API_TOKEN": GITHUB_API_TOKEN,
        "TRUST_REMOTE_CODE": True,
        "DEBUG": DEBUG,
    }
)

print("https://www.runpod.io/console/pods")

Uploaded gist successfully! Raw URL: https://gist.githubusercontent.com/mlabonne/fea8dc7f067cb0543a8cad974fe1b4e4/raw/c040f19bc5f13e46e2fa4493dbde111910df7ba3/config.yaml
This runs trains mlabonne/NeuralMonarch-7B on mlabonne/chatml-OpenHermes2.5-dpo-binarized-alpha (chatml.intel).
https://www.runpod.io/console/pods
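
Environment variables always reach the pod as strings, so the booleans passed in `env` above (`LLM_AUTOEVAL`, `TRUST_REMOTE_CODE`, `DEBUG`) end up as text. Here is a minimal sketch of making that conversion explicit and catching missing values before a pod is created. `normalize_env` is a hypothetical helper, not part of the `runpod` package, and rendering booleans as `"True"`/`"False"` is an assumption about what the startup script expects.

```python
def normalize_env(env):
    """Return a copy of `env` with every value converted to a string.

    Hypothetical helper: raises early if any value is None (for example,
    after a failed gist upload) instead of sending the text "None" to the pod.
    """
    missing = [key for key, value in env.items() if value is None]
    if missing:
        raise ValueError(f"Missing values for: {', '.join(sorted(missing))}")
    # str(True) -> "True", str(False) -> "False"; strings pass through unchanged.
    return {key: str(value) for key, value in env.items()}


example = normalize_env(
    {"MODEL": "AlphaMonarch-7B", "LLM_AUTOEVAL": True, "DEBUG": False}
)
print(example)
```

The result could be passed as the `env` argument of the `runpod.create_pod` call, so that a missing value fails in the notebook and not on a running (and billed) pod.
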
