# Fine-tuning TinyLlama on Elon Musk's Tweets Using Axolotl

This notebook uses [axolotl](https://github.com/OpenAccess-AI-Collective/axolotl) to fine-tune the [TinyLlama](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0) model on a dataset of tweets written by Elon Musk.

Throughout the project, we use Weights & Biases to monitor fine-tuning, so that we can track and analyze the model's performance as it adapts to the tweet data.

We use the following config file, which is available on GitHub:
https://github.com/Skower/mlpops/blob/d676a2755426f0f94ee03a3649ba8c6c6f2d1d4e/model-finetuning/TinyLlamusk.yml

```yaml
base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
model_type: LlamaForCausalLM
tokenizer_type: LlamaTokenizer
is_llama_derived_model: true
hub_model_id: Pytiny

load_in_8bit: false
load_in_4bit: true
strict: false

datasets:
    - path: mlabonne/Evol-Instruct-Python-1k
      type: alpaca
dataset_prepared_path:
val_set_size: 0.02
output_dir: ./qlora-out

adapter: qlora
lora_model_dir:

sequence_len: 2048
sample_packing: true
pad_to_sequence_len: true

lora_r: 32
lora_alpha: 16
lora_dropout: 0.05
lora_target_modules:
lora_target_linear: true
lora_fan_in_fan_out:

wandb_project: axolotl-pytiny
wandb_entity:
wandb_watch:
wandb_name:
wandb_log_model:

gradient_accumulation_steps: 2
micro_batch_size: 1
num_epochs: 4
optimizer: paged_adamw_32bit
lr_scheduler: cosine
learning_rate: 0.0002

train_on_inputs: false
group_by_length: false
bf16: false
fp16: true
tf32: false

gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: false

warmup_steps: 10
evals_per_epoch: 2
saves_per_epoch: 1
debug:
deepspeed:
weight_decay: 0.0
fsdp:
fsdp_config:
special_tokens:
    bos_token: "<s>"
    eos_token: "</s>"
    unk_token: "<unk>"
```
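A few quantities implied by this config are easy to misread, so here is a minimal sketch that derives them. It assumes a single training process and the conventional LoRA scaling factor `lora_alpha / lora_r`; the function names are illustrative and are not part of axolotl.

```python
# Values copied from the YAML config above.
config = {
    "micro_batch_size": 1,
    "gradient_accumulation_steps": 2,
    "sequence_len": 2048,
    "lora_r": 32,
    "lora_alpha": 16,
}


def effective_batch_size(cfg, num_processes=1):
    """Sequences per optimizer step; num_processes=1 is an assumption."""
    return cfg["micro_batch_size"] * cfg["gradient_accumulation_steps"] * num_processes


def max_tokens_per_step(cfg, num_processes=1):
    """Upper bound on tokens per optimizer step when every sequence is full length."""
    return effective_batch_size(cfg, num_processes) * cfg["sequence_len"]


def lora_scale(cfg):
    """Factor applied to the low-rank update in the standard LoRA formulation."""
    return cfg["lora_alpha"] / cfg["lora_r"]


print(effective_batch_size(config))  # 2
print(max_tokens_per_step(config))   # 4096
print(lora_scale(config))            # 0.5
```

With `sample_packing` and `pad_to_sequence_len` both enabled, each of the two sequences in a step is filled to 2048 tokens, so the upper bound is what the optimizer typically sees.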

In [1]:
!pip3 install -U -qqq torch torchvision torchaudio

In [2]:
!git clone https://github.com/OpenAccess-AI-Collective/axolotl

Cloning into 'axolotl'...
remote: Enumerating objects: 9596, done.
remote: Counting objects: 100% (2452/2452), done.
remote: Compressing objects: 100% (395/395), done.
remote: Total 9596 (delta 2262), reused 2104 (delta 2038), pack-reused 7144
Receiving objects: 100% (9596/9596), 3.17 MiB | 14.70 MiB/s, done.
Resolving deltas: 100% (6263/6263), done.


In [3]:
%cd axolotl

/content/axolotl


In [4]:
!pip3 install packaging



In [5]:
!pip3 install -e '.[flash-attn,deepspeed]'

Obtaining file:///content/axolotl
  Preparing metadata (setup.py) ... done
Collecting transformers@ git+https://github.com/huggingface/transformers.git@3cefac1d974db5e2825a0cb2b842883a628be7a0 (from axolotl==0.3.0)
  Cloning https://github.com/huggingface/transformers.git (to revision 3cefac1d974db5e2825a0cb2b842883a628be7a0) to /tmp/pip-install-k2rxmz3q/transformers_955bf95d6de549cfac7d9f4625103655
  Running command git clone --filter=blob:none --quiet https://github.com/huggingface/transformers.git /tmp/pip-install-k2rxmz3q/transformers_955bf95d6de549cfac7d9f4625103655
  Running command git rev-parse -q --verify 'sha^3cefac1d974db5e2825a0cb2b842883a628be7a0'
  Running command git fetch -q https://github.com/huggingface/transformers.git 3cefac1d974db5e2825a0cb2b842883a628be7a0
  Running command git checkout -q 3cefac1d974db5e2825a0cb2b842883a628be7a0
  Resolved https://github.com/huggingface/transformers.git to commit 3cefac1d974db5e2825a0cb2b842883a628be7a0
  Installing b

In [6]:
!pip3 install -U git+https://github.com/huggingface/peft.git

Collecting git+https://github.com/huggingface/peft.git
  Cloning https://github.com/huggingface/peft.git to /tmp/pip-req-build-askj8y2l
  Running command git clone --filter=blob:none --quiet https://github.com/huggingface/peft.git /tmp/pip-req-build-askj8y2l
  Resolved https://github.com/huggingface/peft.git to commit ebbff4023ad276cbcb2466fd7e99be7d3ae0ae11
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: peft
  Building wheel for peft (pyproject.toml) ... done
  Created wheel for peft: filename=peft-0.7.2.dev0-py3-none-any.whl size=183138 sha256=3d87626910ac698571f5e0237b4f9700ea73ed08a8b088f00f2320e30fc12d38
  Stored in directory: /tmp/pip-ephem-wheel-cache-mu6yujqn/wheels/d7/c7/de/1368fac8590e1b103ddc2ec2a28ad51d83aded1a3830e8a087
Successfully built peft
Installing collected packages: peft
  Attempting uninstal

In [7]:
!pip install -U flash-attn --no-build-isolation

Collecting flash-attn
  Downloading flash_attn-2.4.2.tar.gz (2.4 MB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: flash-attn
  Building wheel for flash-attn (setup.py) ... done
  Created wheel for flash-attn: filename=flash_attn-2.4.2-cp310-cp310-linux_x86_64.whl size=113930372 sha256=2c7ddc942e0715ef4a7ab62e3404b519a7ac040b3b6eae8fedcdc08a36ced786
  Stored in directory: /root/.cache/pip/wheels/9d/cf/7f/d14555553b5b30698dae0a4159fdd058157e7021cec565ecaa
Successfully built flash-attn
Installing collected packages: flash-attn
  Attempting uninstall: flash-attn
    Found existing installation: flash-attn 2.3.3
    Uninstalling flash-attn-2.3.3:
      Successfully uninstalled flash-attn-2.3.3
Successfully installed flash-attn-2.4.2


In [9]:
!wget https://raw.githubusercontent.com/Skower/mlpops/908893f707d2b28b0bcc7dd1bc501b759a52df64/model-finetuning/pytiny.yml
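A GitHub `/blob/` URL serves an HTML page rather than the file itself, so fetching one with `wget` does not yield a usable YAML config. The sketch below rewrites a blob URL into its `raw.githubusercontent.com` form with standard-library string handling. `blob_to_raw` is a hypothetical helper name, and the rewrite assumes the usual `github.com/<owner>/<repo>/blob/<ref>/<path>` layout.

```python
from urllib.parse import urlparse


def blob_to_raw(url: str) -> str:
    """Convert a github.com blob URL to the matching raw.githubusercontent.com URL."""
    parsed = urlparse(url)
    parts = parsed.path.strip("/").split("/")
    # Expected path layout: <owner>/<repo>/blob/<ref>/<path...>
    if parsed.netloc != "github.com" or len(parts) < 5 or parts[2] != "blob":
        raise ValueError(f"not a GitHub blob URL: {url}")
    owner, repo, _, ref, *path = parts
    return "https://raw.githubusercontent.com/" + "/".join([owner, repo, ref, *path])


blob_url = (
    "https://github.com/Skower/mlpops/blob/"
    "908893f707d2b28b0bcc7dd1bc501b759a52df64/model-finetuning/pytiny.yml"
)
print(blob_to_raw(blob_url))
```

The resulting URL can be passed to `!wget` so that `pytiny.yml` lands in the working directory as plain YAML.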


In [11]:
%env HUGGING_FACE_HUB_TOKEN=<YOUR_KEY_HERE>

env: HUGGING_FACE_HUB_TOKEN=<YOUR_KEY_HERE>


In [12]:
%env WANDB_API_KEY=<YOUR_KEY_HERE>

env: WANDB_API_KEY=<YOUR_KEY_HERE>
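Because cell outputs are saved with the notebook, echoing a token through `%env` can leak it when the file is shared. One possible alternative, sketched here, is to prompt for the value at run time so it never appears in the cell source or output. `set_secret_env` is a hypothetical helper name; `getpass.getpass` and `os.environ` are from the standard library.

```python
import os
from getpass import getpass


def set_secret_env(name, prompt=getpass):
    """Set os.environ[name] from a hidden prompt unless it is already set."""
    if not os.environ.get(name):
        value = prompt(f"{name}: ").strip()
        if not value:
            raise ValueError(f"{name} must not be empty")
        os.environ[name] = value
    # Report only that the variable is set, never its value.
    return f"{name} is set ({len(os.environ[name])} characters)"


# In the notebook, call it once per secret:
# set_secret_env("HUGGING_FACE_HUB_TOKEN")
# set_secret_env("WANDB_API_KEY")
```

Child processes started with `!` inherit `os.environ`, so a later `accelerate launch` cell should see the same variables.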


In [17]:
!accelerate launch -m axolotl.cli.train pytiny.yml

The following values were not passed to `accelerate launch` and had defaults used instead:
	`--num_processes` was set to a value of `1`
	`--num_machines` was set to a value of `1`
	`--mixed_precision` was set to a value of `'no'`
	`--dynamo_backend` was set to a value of `'no'`
2024-01-21 11:55:09.547232: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-01-21 11:55:09.547295: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-01-21 11:55:09.549614: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
[2024-01-21 11:55:12,875] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_
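The config requests `lr_scheduler: cosine` with `warmup_steps: 10` and `learning_rate: 0.0002`. As a rough illustration of the shape this produces, the sketch below implements a linear warmup followed by a half-cosine decay to zero, which is the common form of this schedule. The total of 200 steps is a made-up value for illustration only; the real step count depends on the packed dataset size and is not shown in this notebook.

```python
import math


def lr_at(step, total_steps, base_lr=2e-4, warmup_steps=10):
    """Learning rate at a given optimizer step: linear warmup, then cosine decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


TOTAL_STEPS = 200  # hypothetical; not taken from the training log

for step in (0, 5, 10, 105, 200):
    print(step, f"{lr_at(step, TOTAL_STEPS):.2e}")
```

The rate peaks at the configured value once warmup ends at step 10, falls to half of it midway through the remaining steps, and reaches zero at the final step.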

In [18]:
!wget https://gist.githubusercontent.com/mlabonne/a3542b0519708b8871d0703c938bba9f/raw/60abc5afc07f9d843bc23d56f4e0b7ab072c4a62/merge_peft.py

--2024-01-21 14:25:53--  https://gist.githubusercontent.com/mlabonne/a3542b0519708b8871d0703c938bba9f/raw/60abc5afc07f9d843bc23d56f4e0b7ab072c4a62/merge_peft.py
Resolving gist.githubusercontent.com (gist.githubusercontent.com)... 185.199.111.133, 185.199.109.133, 185.199.108.133, ...
Connecting to gist.githubusercontent.com (gist.githubusercontent.com)|185.199.111.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1502 (1.5K) [text/plain]
Saving to: ‘merge_peft.py.1’


2024-01-21 14:25:53 (28.8 MB/s) - ‘merge_peft.py.1’ saved [1502/1502]



In [19]:
!python merge_peft.py --base_model=TinyLlama/TinyLlama-1.1B-Chat-v1.0 --peft_model=./qlora-out --hub_id=Pytiny

[1/5] Loading base model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
[2/5] Loading adapter: ./qlora-out
[3/5] Merge base model and adapter
[4/5] Saving model and tokenizer in Pytiny
[5/5] Uploading to Hugging Face Hub: Pytiny
model.safetensors: 100% 2.20G/2.20G [00:44<00:00, 49.3MB/s]
Merged model uploaded to Hugging Face Hub!
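Merging an adapter folds the low-rank update back into the base weights, so the result can be loaded without PEFT. The arithmetic for a single linear layer is sketched below in plain Python, assuming the standard LoRA formulation W' = W + (alpha / r) * B * A. It is a toy illustration with made-up matrices and hypothetical helper names, not the contents of `merge_peft.py`.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def merge_lora(weight, lora_a, lora_b, alpha, r):
    """Return weight + (alpha / r) * (lora_b @ lora_a)."""
    scale = alpha / r
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy rank-1 example; alpha=0.5 with r=1 gives the same 0.5 scale
# as lora_alpha: 16 with lora_r: 32 in the config.
W = [[1.0, 0.0], [0.0, 1.0]]  # frozen base weight (out x in)
A = [[1.0, 2.0]]              # LoRA A (r x in)
B = [[2.0], [4.0]]            # LoRA B (out x r)

merged = merge_lora(W, A, B, alpha=0.5, r=1)
print(merged)
```

After the merge, the layer is an ordinary dense weight matrix again, which is why the uploaded `model.safetensors` is a full-size checkpoint rather than a small adapter file.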
