# LLM Finetuning using AutoTrain Advanced

In this notebook, we fine-tune the `meta-llama/Llama-3.2-1B-Instruct` model using AutoTrain Advanced.
You can swap in any model compatible with Hugging Face Transformers, and any dataset that follows the expected format.
For dataset formatting, see the [docs](https://huggingface.co/docs/autotrain/index).
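
As a rough illustration of the chat-style format used by the dataset in this notebook, each row carries a `messages` column holding a list of role/content dictionaries. The sample row and the `is_chat_record` helper below are hypothetical and not part of AutoTrain; they only sketch the expected shape, so check the docs for the authoritative format.

```python
# Hypothetical example row; the text is made up for illustration.
sample_record = {
    "messages": [
        {"role": "user", "content": "What is AutoTrain?"},
        {"role": "assistant", "content": "A tool for training models with little code."},
    ]
}


def is_chat_record(record, column="messages"):
    """Return True if record[column] is a non-empty list of role/content dicts."""
    messages = record.get(column)
    if not isinstance(messages, list) or not messages:
        return False
    return all(
        isinstance(m, dict)
        and isinstance(m.get("role"), str)
        and isinstance(m.get("content"), str)
        for m in messages
    )


print(is_chat_record(sample_record))
```

A quick check like this on a few rows can catch a wrong column name before a training run starts.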

In [3]:
from autotrain.params import LLMTrainingParams
from autotrain.project import AutoTrainProject

In [4]:
params = LLMTrainingParams(
    model="meta-llama/Llama-3.2-1B-Instruct",
    data_path="HuggingFaceH4/no_robots", # path to the dataset on huggingface hub
    chat_template="tokenizer", # using the chat template defined in the model's tokenizer
    text_column="messages", # the column in the dataset that contains the text
    train_split="train",
    trainer="sft", # using the SFT trainer, choose from sft, default, orpo, dpo and reward
    epochs=1,
    batch_size=1,
    lr=1e-5,
    peft=True, # training LoRA using PEFT
    quantization="int4", # using int4 quantization
    target_modules="all-linear",
    padding="right",
    optimizer="paged_adamw_8bit",
    scheduler="cosine",
    gradient_accumulation=8,
    mixed_precision="bf16",
    merge_adapter=True,
    project_name="autotrain-llama32-1b-finetune",
    log="tensorboard",
    push_to_hub=True,
    username="Jahirrrr",
    token="YOUR_HF_TOKEN", # placeholder: never hardcode a real access token in a shared notebook
)
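
With `batch_size=1` and `gradient_accumulation=8`, gradients from several small batches are accumulated before each optimizer step. The arithmetic below is a small sketch of the resulting effective batch size; `num_devices = 1` is an assumption about the setup, not something stated in this notebook.

```python
# Values copied from the LLMTrainingParams call above.
batch_size = 1
gradient_accumulation = 8
num_devices = 1  # assumption: a single training process

# Number of examples that contribute to each optimizer step.
effective_batch_size = batch_size * gradient_accumulation * num_devices
print(effective_batch_size)
```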

In [8]:
!autotrain llm \
   --train \
   --project_name demo-proj-test-syalalalalala \
   --model openai-community/gpt2 \
   --data_path minhalvp/islamqa \
   --peft \
   --batch-size 2 \
   --epochs 1 \
   --trainer sft \
   --model-max-length 512 \
   --text-column Question \
   --push_to_hub \
   --username="Jahirrrr" \
   --token="$HF_TOKEN"


INFO     | 2024-12-22 07:49:36 | autotrain.cli.run_llm:run:136 - Running LLM
INFO     | 2024-12-22 07:49:36 | autotrain.backends.local:create:20 - Starting local training...
INFO     | 2024-12-22 07:49:36 | autotrain.commands:launch_command:514 - ['accelerate', 'launch', '--num_machines', '1', '--num_processes', '1', '--mixed_precision', 'no', '-m', 'autotrain.trainers.clm', '--training_config', 'demo-proj-test-syalalalalala/training_params.json']
INFO     | 2024-12-22 07:49:36 | autotrain.commands:launch_command:515 - {'model': 'openai-community/gpt2', 'project_name': 'demo-proj-test-syalalalalala', 'data_path': 'minhalvp/islamqa', 'train_split': 'train', 'valid_split': None, 'add_eos_token': True, 'block_size': 1024, 'model_max_length': 512, 'padding': 'right', 'trainer': 'sft', 'use

In [9]:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "Jahirrrr/demo-proj-test-syalalalalala"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",
    torch_dtype='auto'
).eval()

# Prompt content: "hi"
messages = [
    {"role": "user", "content": "hi"}
]

input_ids = tokenizer.apply_chat_template(conversation=messages, tokenize=True, add_generation_prompt=True, return_tensors='pt')
output_ids = model.generate(input_ids.to(model.device))  # follow device_map="auto" instead of hardcoding 'cuda'
response = tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True)

# Print only the newly generated tokens; the prompt tokens were sliced off above
print(response)


The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.
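
One way to act on this message without pasting a token into a cell is to read it from an environment variable. The `get_hf_token` helper below is a hypothetical sketch, not part of AutoTrain or `huggingface_hub`; it assumes the token has been exported as `HF_TOKEN` before the notebook starts.

```python
import os


def get_hf_token(env_var: str = "HF_TOKEN") -> str:
    """Return the Hugging Face token stored in env_var, or raise if it is missing."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Environment variable {env_var} is not set; "
            "create a token at https://huggingface.co/settings/tokens"
        )
    return token
```

The returned value can then be passed wherever a token is required, for example `token=get_hf_token()`.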


The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.



<|im_end|>user

<|im_start|>assistant


In [13]:
!autotrain llm --help

usage: autotrain <command> [<args>] llm [-h] [--train] [--deploy] [--inference]
                                        [--backend BACKEND] [--model MODEL]
                                        [--project-name PROJECT_NAME] [--data-path DATA_PATH]
                                        [--train-split TRAIN_SPLIT] [--valid-split VALID_SPLIT]
                                        [--add-eos-token] [--model-max-length MODEL_MAX_LENGTH]
                                        [--padding PADDING] [--trainer TRAINER]
                                        [--use-flash-attention-2] [--log LOG]
                                        [--disable-gradient-checkpointing]
                                        [--logging-steps LOGGING_STEPS]
                                        [--eval-strategy EVAL_STRATEGY]
                                        [--save-total-limit SAVE_TOTAL_LIMIT]
                                        [--auto-find-batch-size]
                                      

In [15]:
import os

from autotrain.params import LLMTrainingParams
from autotrain.project import AutoTrainProject


params = LLMTrainingParams(
    model="openai-community/gpt2",
    data_path="HuggingFaceH4/no_robots",
    chat_template="tokenizer",
    text_column="messages",
    train_split="train",
    trainer="sft",
    epochs=1,
    batch_size=1,
    lr=1e-5,
    peft=False,
    quantization="int4",
    target_modules="all-linear",
    padding="right",
    optimizer="paged_adamw_8bit",
    scheduler="cosine",
    gradient_accumulation=8,
    mixed_precision="bf16",
    merge_adapter=True,
    project_name="autotrain-llama32-1b-finetune",
    log="tensorboard",
    push_to_hub=True,
    username="Jahirrrr",
    token=os.environ.get("HF_TOKEN"),  # read the token from the environment instead of hardcoding it
)


backend = "local"
project = AutoTrainProject(params=params, backend=backend, process=True)
project.create()

INFO     | 2024-12-22 07:33:46 | autotrain.backends.local:create:20 - Starting local training...
INFO     | 2024-12-22 07:33:46 | autotrain.commands:launch_command:514 - ['accelerate', 'launch', '--cpu', '-m', 'autotrain.trainers.clm', '--training_config', 'autotrain-llama32-1b-finetune/training_params.json', '-m', 'autotrain.trainers.clm', '--training_config', 'autotrain-llama32-1b-finetune/training_params.json', '-m', 'autotrain.trainers.clm', '--training_config', 'autotrain-llama32-1b-finetune/training_params.json']
INFO     | 2024-12-22 07:33:46 | autotrain.commands:launch_command:515 - {'model': 'openai-community/gpt2', 'project_name': 'autotrain-llama32-1b-finetune', 'data_path': 'HuggingFaceH4/no_robots', 'train_split': 'train', 'valid_split': None, 'add_eos_token': True, 'block_size': -1, 'model_max_length': 2048, 'padding': 'ri

7868

If your dataset is stored locally in CSV or JSONL format (JSONL is preferred), make the following changes to `params`:

```python
params = LLMTrainingParams(
    data_path="data/",  # folder that contains train.jsonl or train.csv
    text_column="text",  # column in the CSV/JSONL file that holds the text
    train_split="train",  # filename without the extension
    # ... keep the remaining parameters as in the earlier cell
)
```
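
For reference, a file matching those settings could be produced along the following lines. This is a minimal sketch: the `write_jsonl` helper and the example texts are hypothetical, and it assumes one JSON object per line with a single `text` field.

```python
import json
from pathlib import Path


def write_jsonl(records, folder="data", split="train"):
    """Write one JSON object per line to <folder>/<split>.jsonl and return the path."""
    out_dir = Path(folder)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / f"{split}.jsonl"
    with out_path.open("w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return out_path


# Made-up examples; each row exposes the "text" column named in text_column.
examples = [
    {"text": "First training example."},
    {"text": "Second training example."},
]
path = write_jsonl(examples)  # creates data/train.jsonl
print(path)
```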

In [5]:
# this will train the model locally
project = AutoTrainProject(params=params, backend="local", process=True)
project.create()

INFO     | 2024-12-22 07:06:10 | autotrain.backends.local:create:20 - Starting local training...
INFO     | 2024-12-22 07:06:10 | autotrain.commands:launch_command:514 - ['accelerate', 'launch', '--cpu', '-m', 'autotrain.trainers.clm', '--training_config', 'autotrain-llama32-1b-finetune/training_params.json']
INFO     | 2024-12-22 07:06:10 | autotrain.commands:launch_command:515 - {'model': 'meta-llama/Llama-3.2-1B-Instruct', 'project_name': 'autotrain-llama32-1b-finetune', 'data_path': 'HuggingFaceH4/no_robots', 'train_split': 'train', 'valid_split': None, 'add_eos_token': True, 'block_size': -1, 'model_max_length': 2048, 'padding': 'right', 'trainer': 'sft', 'use_flash_attention_2': False, 'log': 'tensorboard', 'disable_gradient_checkpointing': False, 'logging_steps': -1, 'eval_strategy': 'epoch', 'save_total_limit': 1, 'auto_find_bat

1215