<a href="https://colab.research.google.com/github/mr-cri-spy/LLM-Playground/blob/main/LLM_Training.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Set up the development environment to utilize GPU resources.
Understand and install specific library versions directly from a repository.
Become familiar with YAML configuration for training setups.
Execute a basic training session for a language model using the Axolotl library.

In [None]:
!pip install torch==2.4.0



In [3]:
import torch
# Check that a GPU is available; a T4 (Colab free tier) is enough to run this notebook
assert torch.cuda.is_available(), "No CUDA GPU detected"
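
Once the assertion passes, it can help to know which GPU the runtime was actually given. The following is a minimal sketch that relies only on standard PyTorch calls; the helper name `describe_gpu` is made up for this example, and the Colab menu path in the message is an assumption about the current Colab interface.

```python
import importlib.util


def describe_gpu() -> str:
    """Return a one-line description of the first CUDA device, or the reason none was found."""
    # Avoid an ImportError so the helper also works before torch is installed
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed"

    import torch

    if not torch.cuda.is_available():
        # Assumed Colab menu path; adjust for other environments
        return "No CUDA device visible (in Colab: Runtime > Change runtime type > GPU)"

    props = torch.cuda.get_device_properties(0)
    memory_gib = props.total_memory / 1024**3
    return f"{props.name}, {memory_gib:.1f} GiB"


print(describe_gpu())
```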

Install the Axolotl library directly from GitHub, pinned to a specific commit.

In [4]:
%pip install -e 'git+https://github.com/axolotl-ai-cloud/axolotl.git@78b42a3fe13c49e317bc116b9999c30e070322cc#egg=axolotl' # ensures the same version we used in the course

DEPRECATION: Loading egg at /usr/local/lib/python3.12/dist-packages/hqq_aten-0.0.0-py3.12-linux-x86_64.egg is deprecated. pip 24.3 will enforce this behaviour change. A possible replacement is to use pip for package installation. Discussion can be found at https://github.com/pypa/pip/issues/12330
Obtaining axolotl from git+https://github.com/axolotl-ai-cloud/axolotl.git@78b42a3fe13c49e317bc116b9999c30e070322cc#egg=axolotl
  Skipping because already up-to-date.
  Preparing metadata (setup.py) ... done
Collecting transformers@ git+https://github.com/huggingface/transformers.git@026a173a64372e9602a16523b8fae9de4b0ff428 (from axolotl)
  Cloning https://github.com/huggingface/transformers.git (to revision 026a173a64372e9602a16523b8fae9de4b0ff428) to /tmp/pip-install-7533_1i0/transformers_43cfb910f93642eb969844b078c60fde
  Running command git clone --filter=blob:none --quiet https://github.com/huggingface/transformers.git /tmp/pip-install-7533_1i0/transformers_4
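
The requirement string passed to pip in this cell packs three pieces of information into one URL: the version-control system, the repository, and the exact commit. The sketch below splits it apart with plain string operations, purely to illustrate the layout. `split_vcs_requirement` is a hypothetical helper, not part of pip, and it assumes the string always carries both an `@revision` and an `#egg=` fragment.

```python
def split_vcs_requirement(requirement: str) -> dict:
    """Split a 'vcs+url@revision#egg=name' string into its parts.

    Illustrative only: assumes a revision and an egg fragment are both present.
    """
    url_part, _, fragment = requirement.partition("#")
    vcs, _, location = url_part.partition("+")
    # The revision follows the last '@' in the location
    repo_url, _, revision = location.rpartition("@")
    return {
        "vcs": vcs,
        "repo_url": repo_url,
        "revision": revision,
        "egg": fragment.partition("=")[2],
    }


parts = split_vcs_requirement(
    "git+https://github.com/axolotl-ai-cloud/axolotl.git"
    "@78b42a3fe13c49e317bc116b9999c30e070322cc#egg=axolotl"
)
for key, value in parts.items():
    print(f"{key}: {value}")
```

Pinning the full commit hash, rather than a branch name, is what keeps the install reproducible.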

# Configuration Setup
Create a YAML configuration that defines the training run. The file covers the model, tokenizer, dataset, and training settings, and is sized to run on a free-tier GPU.

In [None]:
import yaml

# Raw string: keeps "\n" as written so the YAML parser, not Python, turns it into a newline
train_config = r"""
# model params
base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
model_type: LlamaForCausalLM
tokenizer_type: LlamaTokenizer


# dataset params
datasets:
  - path: jaydenccc/AI_Storyteller_Dataset
    type:
      system_prompt: ""
      field_system: system
      field_instruction: synopsis
      field_output: short_story
      format: "<|user|>\n {instruction} </s>\n<|assistant|>"
      no_input_format: "<|user|> {instruction} </s>\n<|assistant|>"

output_dir: ./models/TinyLlama_Storyteller

# sequence length and precision
sequence_length: 1024
bf16: auto
tf32: false

# training params
batch_size: 4
micro_batch_size: 4
num_epochs: 2
optimizer: adamw_bnb_8bit
learning_rate: 0.0002

logging_steps: 1
"""

# Convert the YAML string to a Python dictionary
yaml_dict = yaml.safe_load(train_config)


# Write the YAML file
with open("basic_train.yml", 'w') as file:
    yaml.dump(yaml_dict, file)
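
To see what a training prompt looks like under this configuration, the sketch below fills the `format` template by hand. It assumes the template is filled the way Python's `str.format` would fill it, with `{instruction}` taken from the dataset's `synopsis` column. The sample synopsis is invented for illustration, and Axolotl's internal handling may differ in detail.

```python
# Template written as in the `format` key of the config;
# in this Python string each "\n" is a real newline
PROMPT_FORMAT = "<|user|>\n {instruction} </s>\n<|assistant|>"


def render_prompt(synopsis: str, template: str = PROMPT_FORMAT) -> str:
    """Fill the prompt template with one synopsis (hypothetical helper)."""
    return template.format(instruction=synopsis)


# Invented sample text, not taken from the dataset
example = render_prompt("A lighthouse keeper finds a message in a bottle.")
print(example)
```

The model is then trained to continue this prompt with the text from the `short_story` column.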


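The two batch settings in the configuration interact. As a rough sketch, assuming Axolotl derives gradient accumulation as `batch_size // micro_batch_size` on a single GPU (an assumption about the library's behavior, worth checking against its documentation), the values above mean each optimizer step uses a single micro-batch of 4 examples. The dataset size below is a placeholder rather than the real size of the dataset, and the estimate ignores sample packing and any validation split.

```python
import math


def training_steps(num_examples: int, batch_size: int,
                   micro_batch_size: int, num_epochs: int) -> dict:
    """Estimate optimizer steps for a single-GPU run (hypothetical helper)."""
    # Assumed relationship between the two batch settings
    grad_accum = max(1, batch_size // micro_batch_size)
    examples_per_step = micro_batch_size * grad_accum
    steps_per_epoch = math.ceil(num_examples / examples_per_step)
    return {
        "gradient_accumulation_steps": grad_accum,
        "steps_per_epoch": steps_per_epoch,
        "total_steps": steps_per_epoch * num_epochs,
    }


# num_examples=100 is a placeholder value
plan = training_steps(num_examples=100, batch_size=4,
                      micro_batch_size=4, num_epochs=2)
print(plan)
```

With `logging_steps: 1`, a run of this shape would log once per optimizer step.
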
Launch training with the accelerate launch command. With the configuration above, the run is small enough for a free-tier GPU such as a T4.

In [6]:
!accelerate launch -m axolotl.cli.train basic_train.yml

The following values were not passed to `accelerate launch` and had defaults used instead:
	`--num_processes` was set to a value of `1`
	`--num_machines` was set to a value of `1`
	`--mixed_precision` was set to a value of `'no'`
	`--dynamo_backend` was set to a value of `'no'`
2025-08-19 19:48:55.997141: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:467] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1755632936.287969   19752 cuda_dnn.cc:8579] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1755632936.365267   19752 cuda_blas.cc:1407] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
W0000 00:00:1755632936.948279   19752 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target