<a href="https://colab.research.google.com/github/edgarbc/LLM_optimizer/blob/main/mastering_LLMs_workshop2024/axolotl_example.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Axolotl example

Adapted from https://github.com/OpenAccess-AI-Collective/axolotl/blob/main/examples/colab-notebooks/colab-axolotl-example.ipynb

May 2024.

---

Before starting, switch the Colab runtime to a T4 GPU (Runtime > Change runtime type).

In [1]:
import torch
# Check that a GPU is available; a T4 (free tier) is enough to run this notebook
assert torch.cuda.is_available(), "No GPU found: switch the runtime type to GPU"
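
To confirm which GPU the runtime actually attached, the sketch below queries `nvidia-smi` from Python. It assumes the `nvidia-smi` binary is on the PATH, as it normally is on GPU runtimes. The helper name `list_gpus` is made up for this example, and it returns an empty list when no driver is found.

```python
import shutil
import subprocess


def list_gpus():
    """Return one 'name, total memory' string per visible NVIDIA GPU."""
    # Without the driver tools there is nothing to query.
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


print(list_gpus())
```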

## 1. Install Axolotl and dependencies

In [2]:
!pip install torch=="2.1.2"
!pip install -e git+https://github.com/OpenAccess-AI-Collective/axolotl#egg=axolotl
!pip install flash-attn=="2.5.0"
!pip install deepspeed=="0.13.1"
!pip install mlflow=="2.13.0"

Obtaining axolotl from git+https://github.com/OpenAccess-AI-Collective/axolotl#egg=axolotl
  Updating ./src/axolotl clone
  Running command git fetch -q --tags
  Running command git reset --hard -q 05b0bd08d229ee28cd3f11098d5b178f2ce441b6
  Preparing metadata (setup.py) ... done
Collecting fschat@ git+https://github.com/lm-sys/FastChat.git@27a05b04a35510afb1d767ae7e5990cbd278f8fe (from axolotl)
  Using cached fschat-0.2.36-py3-none-any.whl
Installing collected packages: axolotl
  Attempting uninstall: axolotl
    Found existing installation: axolotl 0.4.1
    Uninstalling axolotl-0.4.1:
      Successfully uninstalled axolotl-0.4.1
  Running setup.py develop for axolotl
Successfully installed axolotl-0.4.1
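
The install cell pins four packages, so it can help to check what actually ended up in the environment before moving on. The sketch below uses the standard library's `importlib.metadata`; the helper name `installed_versions` is hypothetical, and a missing package is reported as `None` rather than raising.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Versions pinned in the install cell above.
pins = {
    "torch": "2.1.2",
    "flash-attn": "2.5.0",
    "deepspeed": "0.13.1",
    "mlflow": "2.13.0",
}

for name, found in installed_versions(pins).items():
    print(f"{name}: installed={found}, pinned={pins[name]}")
```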

## 2. Define and load the config file (yaml)

This step is the core of Axolotl and the source of its power: the whole fine-tuning run is described by a single YAML config, so you can change parameters (base model, dataset, LoRA settings, learning rate) while the rest of the pipeline stays the same.

In [3]:
import yaml

# Your YAML string
yaml_string = """
base_model: TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T
model_type: LlamaForCausalLM
tokenizer_type: LlamaTokenizer

load_in_8bit: false
load_in_4bit: true
strict: false

datasets:
  - path: mhenrichsen/alpaca_2k_test
    type: alpaca
dataset_prepared_path:
val_set_size: 0.05
output_dir: ./outputs/qlora-out

adapter: qlora
lora_model_dir:

sequence_len: 4096
sample_packing: true
eval_sample_packing: false
pad_to_sequence_len: true

lora_r: 32
lora_alpha: 16
lora_dropout: 0.05
lora_target_modules:
lora_target_linear: true
lora_fan_in_fan_out:

wandb_project:
wandb_entity:
wandb_watch:
wandb_name:
wandb_log_model:

gradient_accumulation_steps: 4
micro_batch_size: 2
num_epochs: 4
optimizer: paged_adamw_32bit
lr_scheduler: cosine
learning_rate: 0.0002

train_on_inputs: false
group_by_length: false
bf16: auto
fp16:
tf32: false

gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_steps: 10
evals_per_epoch: 4
saves_per_epoch: 1
debug:
deepspeed:
weight_decay: 0.0
fsdp:
fsdp_config:
special_tokens:

"""

# Convert the YAML string to a Python dictionary
yaml_dict = yaml.safe_load(yaml_string)

# Specify your file path
file_path = 'test_axolotl.yaml'

# Write the YAML file
with open(file_path, 'w') as file:
    yaml.dump(yaml_dict, file)
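
Because the config is just a dictionary once loaded, a second experiment can be derived from it by overriding a few keys. The sketch below shows one way to do that; `make_variant` is a hypothetical helper, not part of Axolotl, and the small `base` dictionary is a stand-in for the full config above. Rejecting unknown keys is a design choice of this sketch to catch typos in parameter names.

```python
import copy


def make_variant(base_config, **overrides):
    """Return a copy of base_config with some existing keys overridden."""
    unknown = sorted(set(overrides) - set(base_config))
    if unknown:
        raise KeyError(f"unknown config keys: {unknown}")
    variant = copy.deepcopy(base_config)  # leave the base config untouched
    variant.update(overrides)
    return variant


# Stand-in for a few keys of the config above.
base = {
    "learning_rate": 0.0002,
    "num_epochs": 4,
    "lora_r": 32,
    "output_dir": "./outputs/qlora-out",
}

variant = make_variant(
    base,
    learning_rate=0.0001,
    output_dir="./outputs/qlora-out-lr1e-4",  # hypothetical directory name
)
print(variant)
```

In the notebook, the same helper could be given `yaml_dict` and the result written to a new file with `yaml.dump`, exactly as the base config is written here.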

## 3. Training

Run training with the config file written above.

In [4]:
# The leading ! makes the notebook run the line as a shell command
!accelerate launch -m axolotl.cli.train /content/test_axolotl.yaml

The following values were not passed to `accelerate launch` and had defaults used instead:
	`--num_processes` was set to a value of `1`
	`--num_machines` was set to a value of `1`
	`--mixed_precision` was set to a value of `'no'`
	`--dynamo_backend` was set to a value of `'no'`
[2024-06-02 17:40:00,946] [INFO] [datasets.<module>:58] [PID:3843] PyTorch version 2.1.2 available.
[2024-06-02 17:40:00,947] [INFO] [datasets.<module>:70] [PID:3843] Polars version 0.20.2 available.
[2024-06-02 17:40:00,948] [INFO] [datasets.<module>:105] [PID:3843] TensorFlow version 2.15.0 available.
[2024-06-02 17:40:00,949] [INFO] [datasets.<module>:118] [PID:3843] JAX version 0.4.26 available.
2024-06-02 17:40:02.556679: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-06-02 17:40:02.556728: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to regist
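
The launch log reports `--num_processes` set to `1`. Combined with `micro_batch_size`, `gradient_accumulation_steps` and `sequence_len` from the config, that gives the effective batch size per optimizer step. The arithmetic below is a sketch of that calculation. The token count is an upper bound that assumes every packed sequence is filled to `sequence_len`, and the LoRA factor assumes the standard `alpha / r` scaling.

```python
# Values copied from the config and from the accelerate launch log.
micro_batch_size = 2
gradient_accumulation_steps = 4
num_processes = 1
sequence_len = 4096
lora_r = 32
lora_alpha = 16

# Sequences that contribute to each optimizer step.
effective_batch_size = micro_batch_size * gradient_accumulation_steps * num_processes

# Upper bound on tokens per optimizer step (sequences padded or packed to sequence_len).
max_tokens_per_step = effective_batch_size * sequence_len

# Scaling applied to the LoRA update under the standard alpha / r convention.
lora_scaling = lora_alpha / lora_r

print(effective_batch_size)  # 8
print(max_tokens_per_step)   # 32768
print(lora_scaling)          # 0.5
```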

## 4. Gradio for inference interaction

We load the fine-tuned adapter and launch a Gradio interface to interact with the model.

In [None]:
# The leading ! makes the notebook run the line as a shell command
!accelerate launch -m axolotl.cli.inference /content/test_axolotl.yaml \
    --lora_model_dir="./outputs/qlora-out" --gradio
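
Inference only works if training actually saved an adapter, so a quick check before launching can save a confusing error. The sketch below assumes the adapter was saved in the usual PEFT layout, which includes an `adapter_config.json` file, and that it lives in the `output_dir` set in the config (`./outputs/qlora-out`). The helper name `adapter_is_present` is hypothetical.

```python
from pathlib import Path


def adapter_is_present(output_dir):
    """Return True if output_dir contains a saved adapter config file."""
    return (Path(output_dir) / "adapter_config.json").is_file()


# output_dir as set in the config above.
print(adapter_is_present("./outputs/qlora-out"))
```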