# Fine-tuning TinyLlama on Elon Musk's Tweets Using Axolotl

This notebook uses [axolotl](https://github.com/OpenAccess-AI-Collective/axolotl) to fine-tune the [TinyLlama](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0) model on a dataset of tweets written by Elon Musk.

We use Weights & Biases to monitor the fine-tuning run, so that training and evaluation metrics can be tracked as the model adapts to the tweet data.

We use the following config file, which is available on GitHub:
https://github.com/Skower/mlpops/blob/d676a2755426f0f94ee03a3649ba8c6c6f2d1d4e/model-finetuning/TinyLlamusk.yml

```
base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
base_model_config: TinyLlama/TinyLlama-1.1B-Chat-v1.0
model_type: LlamaForCausalLM
tokenizer_type: LlamaTokenizer
is_llama_derived_model: true
hub_model_id: TinyLlamusk

load_in_8bit: false
load_in_4bit: true
strict: false

datasets:
  - path: lcama/elon-tweets
    type: completion
dataset_prepared_path: last_run_prepared
val_set_size: 0.02
output_dir: ./qlora-out

adapter: qlora
lora_model_dir:

sequence_len: 2048
sample_packing: true

lora_r: 32
lora_alpha: 16
lora_dropout: 0.05
lora_target_modules:
lora_target_linear: true
lora_fan_in_fan_out:

wandb_project: axolotl-tinyllama
wandb_entity:
wandb_watch:
wandb_run_id:
wandb_log_model:

gradient_accumulation_steps: 1
micro_batch_size: 10
num_epochs: 3
optimizer: paged_adamw_32bit
lr_scheduler: cosine
learning_rate: 0.0002

train_on_inputs: false
group_by_length: false
bf16: true
fp16: false
tf32: false

gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_steps: 100
eval_steps: 0.01
save_strategy: epoch
save_steps:
debug:
deepspeed:
weight_decay: 0.0
fsdp:
fsdp_config:
special_tokens:
    bos_token: "<s>"
    eos_token: "</s>"
    unk_token: "<unk>"
```
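A few quantities implied by this config are easy to work out by hand. The sketch below copies the relevant values from the config and assumes a single training process (`num_gpus = 1` is an assumption of this example, not a config key). The LoRA scaling factor is taken as `lora_alpha / lora_r`, which is the standard LoRA convention. Because `sample_packing` is enabled, the token count per optimizer step is an upper bound rather than an exact figure.

```python
# Values copied from the config above.
config = {
    "lora_r": 32,
    "lora_alpha": 16,
    "micro_batch_size": 10,
    "gradient_accumulation_steps": 1,
    "sequence_len": 2048,
}

# Assumption for this sketch: training runs in a single process on one GPU.
num_gpus = 1

# Standard LoRA convention: the low-rank update is scaled by alpha / r.
lora_scaling = config["lora_alpha"] / config["lora_r"]

# Number of sequences contributing to each optimizer step.
effective_batch = (
    config["micro_batch_size"] * config["gradient_accumulation_steps"] * num_gpus
)

# Upper bound on tokens per optimizer step (packed sequences fill at most sequence_len).
max_tokens_per_step = effective_batch * config["sequence_len"]

print(f"LoRA scaling: {lora_scaling}")
print(f"Effective batch size: {effective_batch}")
print(f"Max tokens per step: {max_tokens_per_step}")
```

With `lora_alpha` at half of `lora_r`, the adapter update is scaled by 0.5, and each optimizer step sees 10 packed sequences of at most 2048 tokens.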

In [1]:
!pip3 install -U -qqq torch torchvision torchaudio

In [2]:
!git clone https://github.com/OpenAccess-AI-Collective/axolotl

fatal: destination path 'axolotl' already exists and is not an empty directory.


In [3]:
%cd axolotl

/content/axolotl


In [4]:
!pip3 install packaging



In [5]:
!pip3 install -e '.[flash-attn,deepspeed]'

Obtaining file:///content/axolotl
  Preparing metadata (setup.py) ... done
Collecting transformers@ git+https://github.com/huggingface/transformers.git@3cefac1d974db5e2825a0cb2b842883a628be7a0 (from axolotl==0.3.0)
  Using cached transformers-4.37.0.dev0-py3-none-any.whl
Collecting accelerate@ git+https://github.com/huggingface/accelerate.git@0d2280dadc6a93413a5496613b7fdda3a4d2551b (from axolotl==0.3.0)
  Using cached accelerate-0.25.0.dev0-py3-none-any.whl
Collecting torch>=1.0.0 (from bert-score==0.3.13->axolotl==0.3.0)
  Using cached torch-2.0.1-cp310-cp310-manylinux1_x86_64.whl (619.9 MB)
Collecting triton==2.0.0 (from torch>=1.0.0->bert-score==0.3.13->axolotl==0.3.0)
  Using cached triton-2.0.0-1-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (63.3 MB)
Installing collected packages: triton, torch, axolotl
  Attempting uninstall: triton
    Found existing installation: triton 2.1.0
    Uninstalling triton-2.1.0:
      Successfully uninstalled triton-2.1.0
 

In [6]:
!pip3 install -U git+https://github.com/huggingface/peft.git

Collecting git+https://github.com/huggingface/peft.git
  Cloning https://github.com/huggingface/peft.git to /tmp/pip-req-build-2ci0ev00
  Running command git clone --filter=blob:none --quiet https://github.com/huggingface/peft.git /tmp/pip-req-build-2ci0ev00
  Resolved https://github.com/huggingface/peft.git to commit ebbff4023ad276cbcb2466fd7e99be7d3ae0ae11
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: peft
  Building wheel for peft (pyproject.toml) ... done
  Created wheel for peft: filename=peft-0.7.2.dev0-py3-none-any.whl size=183138 sha256=433fbfc0deec77ab1799a7288cd5b71bfe9fe2cfc500321f4232c6e86e981e71
  Stored in directory: /tmp/pip-ephem-wheel-cache-0gj0pzty/wheels/d7/c7/de/1368fac8590e1b103ddc2ec2a28ad51d83aded1a3830e8a087
Successfully built peft
Installing collected packages: peft
  Attempting uninstal

In [14]:
!pip install -U flash-attn --no-build-isolation

Collecting flash-attn
  Downloading flash_attn-2.4.2.tar.gz (2.4 MB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: flash-attn
  Building wheel for flash-attn (setup.py) ... done
  Created wheel for flash-attn: filename=flash_attn-2.4.2-cp310-cp310-linux_x86_64.whl size=113930372 sha256=2c7ddc942e0715ef4a7ab62e3404b519a7ac040b3b6eae8fedcdc08a36ced786
  Stored in directory: /root/.cache/pip/wheels/9d/cf/7f/d14555553b5b30698dae0a4159fdd058157e7021cec565ecaa
Successfully built flash-attn
Installing collected packages: flash-attn
  Attempting uninstall: flash-attn
    Found existing installation: flash-attn 2.3.3
    Uninstalling flash-attn-2.3.3:
      Successfully uninstalled flash-attn-2.3.3
Successfully installed flash-attn-2.4.2


In [11]:
!wget https://github.com/Skower/mlpops/blob/d676a2755426f0f94ee03a3649ba8c6c6f2d1d4e/model-finetuning/TinyLlamusk.yml

--2024-01-20 19:33:29--  https://github.com/Skower/mlpops/blob/d676a2755426f0f94ee03a3649ba8c6c6f2d1d4e/model-finetuning/TinyLlamusk.yml
Resolving github.com (github.com)... 192.30.255.112
Connecting to github.com (github.com)|192.30.255.112|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 10995 (11K) [text/plain]
Saving to: ‘TinyLlamusk.yml’


2024-01-20 19:33:29 (1.34 MB/s) - ‘TinyLlamusk.yml’ saved [10995/10995]
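The URL fetched here is a GitHub `blob` page URL. Such URLs usually point at the rendered web page rather than the file itself, so it is worth checking that the saved file is really YAML. The following is a minimal sketch of two hypothetical helpers (`to_raw_url` and `looks_like_html` are names invented for this example): one derives the `raw.githubusercontent.com` URL from a blob URL, the other is a rough check for an HTML download. It assumes the ref in the URL contains no slashes, which holds for a commit SHA.

```python
from urllib.parse import urlparse


def to_raw_url(blob_url: str) -> str:
    """Convert a github.com blob URL into its raw.githubusercontent.com form."""
    parsed = urlparse(blob_url)
    if parsed.netloc != "github.com":
        raise ValueError(f"not a github.com URL: {blob_url}")
    # Expected path layout: owner / repo / "blob" / ref / path/to/file
    parts = parsed.path.strip("/").split("/")
    if len(parts) < 5 or parts[2] != "blob":
        raise ValueError(f"not a blob URL: {blob_url}")
    owner, repo, _, ref, *file_path = parts
    return "https://raw.githubusercontent.com/" + "/".join(
        [owner, repo, ref, *file_path]
    )


def looks_like_html(text: str) -> bool:
    """Rough heuristic: does the text start like an HTML document?"""
    head = text.lstrip()[:200].lower()
    return head.startswith("<!doctype html") or head.startswith("<html")


blob = (
    "https://github.com/Skower/mlpops/blob/"
    "d676a2755426f0f94ee03a3649ba8c6c6f2d1d4e/model-finetuning/TinyLlamusk.yml"
)
print(to_raw_url(blob))
```

If `looks_like_html(open("TinyLlamusk.yml").read())` returns `True`, downloading the printed raw URL instead should give the plain YAML file.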



In [8]:
%env HUGGING_FACE_HUB_TOKEN=hf_***REDACTED***

env: HUGGING_FACE_HUB_TOKEN=hf_***REDACTED***


In [9]:
%env WANDB_API_KEY=***REDACTED***

env: WANDB_API_KEY=***REDACTED***
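Credentials written directly into a notebook cell end up in the saved file and its output. One alternative, sketched below, is to set the variables outside the notebook and only verify their presence before training. The helper name `require_env` is hypothetical and introduced for this example; the two variable names are the ones used in the cells above.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, failing loudly if it is unset."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before launching training")
    return value


# Usage (uncomment once the variables are set outside the notebook):
# hf_token = require_env("HUGGING_FACE_HUB_TOKEN")
# wandb_key = require_env("WANDB_API_KEY")
```

Failing early this way avoids discovering a missing token only when the upload or the logging step runs.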


In [19]:
!accelerate launch -m axolotl.cli.train TinyLlamusk.yml

The following values were not passed to `accelerate launch` and had defaults used instead:
	`--num_processes` was set to a value of `1`
	`--num_machines` was set to a value of `1`
	`--mixed_precision` was set to a value of `'no'`
	`--dynamo_backend` was set to a value of `'no'`
2024-01-20 19:55:10.660521: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-01-20 19:55:10.660576: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-01-20 19:55:10.662017: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
[2024-01-20 19:55:13,625] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_

In [20]:
!wget https://gist.githubusercontent.com/mlabonne/a3542b0519708b8871d0703c938bba9f/raw/60abc5afc07f9d843bc23d56f4e0b7ab072c4a62/merge_peft.py

--2024-01-20 20:02:18--  https://gist.githubusercontent.com/mlabonne/a3542b0519708b8871d0703c938bba9f/raw/60abc5afc07f9d843bc23d56f4e0b7ab072c4a62/merge_peft.py
Resolving gist.githubusercontent.com (gist.githubusercontent.com)... 185.199.110.133, 185.199.111.133, 185.199.109.133, ...
Connecting to gist.githubusercontent.com (gist.githubusercontent.com)|185.199.110.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1502 (1.5K) [text/plain]
Saving to: ‘merge_peft.py’


2024-01-20 20:02:18 (27.8 MB/s) - ‘merge_peft.py’ saved [1502/1502]



In [21]:
!python merge_peft.py --base_model=TinyLlama/TinyLlama-1.1B-Chat-v1.0 --peft_model=./qlora-out --hub_id=TinyLlamusk

[1/5] Loading base model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
[2/5] Loading adapter: ./qlora-out
[3/5] Merge base model and adapter
[4/5] Saving model and tokenizer in TinyLlamusk
[5/5] Uploading to Hugging Face Hub: TinyLlamusk
model.safetensors: 100% 2.20G/2.20G [00:57<00:00, 38.0MB/s]
Merged model uploaded to Hugging Face Hub!
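The merge step logged above (load base model, load adapter, merge, save) can be approximated with the `peft` and `transformers` libraries. The sketch below is not the contents of `merge_peft.py`; it is an illustration under stated assumptions. The function name `merge_adapter` is hypothetical, half precision (`torch.float16`) is an assumption, and the Hub upload is left out.

```python
def merge_adapter(base_model_id: str, adapter_dir: str, out_dir: str):
    """Merge a LoRA adapter into its base model and save the result locally."""
    # Heavy imports are kept inside the function so defining it is cheap.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Load the base model in half precision (assumption of this sketch).
    base = AutoModelForCausalLM.from_pretrained(
        base_model_id, torch_dtype=torch.float16
    )
    # Attach the trained adapter weights to the base model.
    model = PeftModel.from_pretrained(base, adapter_dir)
    # Fold the low-rank updates into the base weights and drop the adapter wrappers.
    merged = model.merge_and_unload()

    # Save the merged model together with the base tokenizer.
    merged.save_pretrained(out_dir)
    AutoTokenizer.from_pretrained(base_model_id).save_pretrained(out_dir)
    return merged


# Usage, mirroring the arguments of the command above:
# merge_adapter("TinyLlama/TinyLlama-1.1B-Chat-v1.0", "./qlora-out", "TinyLlamusk")
```

The merged model no longer depends on `peft` at inference time and can be loaded with `AutoModelForCausalLM.from_pretrained` like any other checkpoint.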
