# Finetuning a language model

Let's see how to finetune a language model to generate character backstories using the Hugging Face `Trainer` with wandb integration. We'll use a tiny language model (`TinyStories-33M`) due to resource constraints, but the lessons you learn here should also apply to larger models!

In [1]:
from transformers import AutoTokenizer
from datasets import load_dataset
from transformers import AutoModelForCausalLM
from transformers import Trainer, TrainingArguments
import transformers
transformers.set_seed(42)

import wandb


RuntimeError: CUDA error: the launch timed out and was terminated
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

In [2]:
import os
os.environ['CUDA_LAUNCH_BLOCKING'] = "1"

In [3]:
# In Codespaces environments wandb cannot detect the notebook name, so we set it manually
os.environ['WANDB_NOTEBOOK_NAME'] = '05_train_llm_starter.ipynb'

In [4]:
# Never hardcode an API key in a notebook. wandb.login() reads the
# WANDB_API_KEY environment variable, or prompts for the key if it is unset.
wandb.login()

[34m[1mwandb[0m: Currently logged in as: [33mfelix-ml[0m ([33mfml-team[0m). Use [1m`wandb login --relogin`[0m to force relogin
[34m[1mwandb[0m: Appending key for api.wandb.ai to your netrc file: /home/felix/.netrc


True

In [5]:
model_checkpoint = "roneneldan/TinyStories-33M"

### Preparing data

We'll start by loading a dataset of Dungeons and Dragons character biographies from the Hugging Face Hub.

> You may see some warnings here; that is expected.

In [6]:
ds = load_dataset('MohamedRashad/characters_backstories')

In [7]:
# Let's take a look at one example
ds["train"][400]

{'text': 'Generate Backstory based on following information\nCharacter Name: Dewin \nCharacter Race: Halfling\nCharacter Class: Sorcerer bard\n\nOutput:\n',
 'target': 'Dewin thought he was a wizard, but it turned out it was the draconic blood in his veins that brought him eldritch power.  Music classes in wizarding college taught him yet another use for his power, and when he was expelled he took up adventuring'}
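
The `text` field above follows a fixed template, with each attribute on its own line. If you write prompts by hand later, it may help to build them with the same layout the model sees during finetuning. The helper below is only a sketch: `build_prompt` and `PROMPT_TEMPLATE` are hypothetical names, not part of the dataset or any library, and the template string is copied from the example above.

```python
# Hypothetical helper: rebuilds the prompt layout seen in the dataset's `text` field.
PROMPT_TEMPLATE = (
    "Generate Backstory based on following information\n"
    "Character Name: {name}\n"
    "Character Race: {race}\n"
    "Character Class: {char_class}\n"
    "\n"
    "Output:\n"
)


def build_prompt(name: str, race: str, char_class: str) -> str:
    """Fill the template with one character's attributes."""
    return PROMPT_TEMPLATE.format(name=name, race=race, char_class=char_class)


prompt = build_prompt("Dewin", "Halfling", "Sorcerer bard")
print(prompt)
```

Keeping the newlines identical to the training data means a handwritten prompt is tokenized the same way as the examples the model was finetuned on.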

In [21]:
type(ds["train"][400]["text"])

str

In [8]:
# count the number of words in the dataset
total_words = 0
for example in ds["train"]:
    total_words += len(example["text"].split())
    total_words += len(example["target"].split())
print(f"Total number of words in the dataset: {total_words}")


Total number of words in the dataset: 594817


In [9]:
# As this dataset has no validation split, we will create one
ds = ds["train"].train_test_split(test_size=0.2, seed=42)

In [24]:
# We'll create a tokenizer from model checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint, use_fast=False)

# We'll need padding to have same length sequences in a batch
tokenizer.pad_token = tokenizer.eos_token

# Define a tokenization function that first concatenates text and target
def tokenize_function(example):
    merged = example["text"] + " " + example["target"]
    batch = tokenizer(merged, padding='max_length', truncation=True, max_length=128)
    batch["labels"] = batch["input_ids"].copy()
    return batch

# Apply it on our dataset, and remove the text columns
tokenized_datasets = ds.map(tokenize_function, remove_columns=["text", "target"])

Map: 100%|██████████| 1857/1857 [00:02<00:00, 775.15 examples/s]
Map: 100%|██████████| 465/465 [00:00<00:00, 658.00 examples/s]

In [22]:
tokenized_datasets

DatasetDict({
    train: Dataset({
        features: ['input_ids', 'attention_mask', 'labels'],
        num_rows: 1857
    })
    test: Dataset({
        features: ['input_ids', 'attention_mask', 'labels'],
        num_rows: 465
    })
})
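
The row counts above follow from the 80/20 split: 1857 + 465 = 2322 examples in total. The sketch below reproduces the arithmetic, assuming the test share is rounded up and the remainder goes to training. That rounding rule is an assumption made for illustration, and `split_sizes` is a hypothetical helper, not a `datasets` function.

```python
import math


def split_sizes(n_samples: int, test_size: float) -> tuple:
    """Return (n_train, n_test), assuming the test share is rounded up."""
    n_test = math.ceil(test_size * n_samples)
    n_train = n_samples - n_test
    return n_train, n_test


# 1857 + 465 = 2322 examples were in the original train split
print(split_sizes(1857 + 465, 0.2))  # (1857, 465)
```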

In [11]:
# Let's check out one prepared example
print(tokenizer.decode(tokenized_datasets["train"][900]['input_ids']))

Generate Backstory based on following information
Character Name: Mr. Gale
Character Race: Half-orc
Character Class: Cleric

Output:
 Growing up the only half-orc in a small rural town was rough. His mother didn't survive childbirth and so was raised in a church in a high mountain pass, his attention was always drawn by airships passing through, and dreams of an escape. Leaving to strike out on his own as early as he could he made a living for most of his life as an airship sailor, and occasionally a pirate. A single storm visits him throughout his life, marking every major
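
Every example is padded to 128 tokens and `labels` is a straight copy of `input_ids`, so the loss is also computed on padding positions. One common option, not used in this notebook, is to set those label positions to `-100`, which PyTorch's cross-entropy loss ignores by default. The pad token here is the EOS token, so the sketch masks by `attention_mask` rather than by token id. It works on plain Python lists with made-up token ids, and `mask_padding_labels` is a hypothetical helper name.

```python
IGNORE_INDEX = -100  # default ignore_index of PyTorch's cross-entropy loss


def mask_padding_labels(input_ids, attention_mask, ignore_index=IGNORE_INDEX):
    """Copy input_ids into labels, replacing padded positions with ignore_index."""
    return [
        token if keep == 1 else ignore_index
        for token, keep in zip(input_ids, attention_mask)
    ]


# Made-up ids: three real tokens followed by two padding tokens
example = {
    "input_ids": [101, 202, 303, 50256, 50256],
    "attention_mask": [1, 1, 1, 0, 0],
}
example["labels"] = mask_padding_labels(example["input_ids"], example["attention_mask"])
print(example["labels"])  # [101, 202, 303, -100, -100]
```

Inside `tokenize_function`, the same idea would apply to `batch["input_ids"]` and `batch["attention_mask"]` when building `batch["labels"]`.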


### Training
Let's finetune a pretrained language model on our dataset using HF Transformers and their wandb integration. 

In [12]:
# We will train a causal (autoregressive) language model from a pretrained checkpoint
model = AutoModelForCausalLM.from_pretrained(model_checkpoint);

In [13]:
# Start a new wandb run
run = wandb.init(project='fhtw_llm_tuning', job_type="training", anonymous="allow")

In [14]:
# Simple test script to check torch, then transformers, then accelerate

# import torch
# import transformers
# from accelerate import Accelerator

# # Test torch
# x = torch.rand(5, 3)
# print(f"torch test passed. Tensor: \n{x}")

# # Test transformers
# model = transformers.BertModel.from_pretrained('bert-base-uncased')
# print(f"transformers test passed. Model architecture: \n{model}")

# # Test accelerate
# accelerator = Accelerator()
# print(f"accelerate test passed. Device: {accelerator.device}")

# print("All tests passed.")



In [15]:
import torch

# If CUDA is available, use it
if torch.cuda.is_available():
    device = torch.device('cuda')
    torch.backends.cudnn.benchmark = True
else:
    device = torch.device('cpu')


In [16]:
# Define training arguments
model_name = model_checkpoint.split("/")[-1]
training_args = TrainingArguments(
    f"{model_name}-finetuned-characters-backstories",
    report_to="wandb", # we need one line to track experiments in wandb
    num_train_epochs=1,
    logging_steps=1,
    evaluation_strategy="epoch",
    learning_rate=1e-4,
    weight_decay=0.01,
    no_cuda=False, # False lets Trainer use the GPU if available; set to True to force CPU use (newer transformers versions rename this to `use_cpu`)
)
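
With `logging_steps=1`, a loss value is logged to wandb at every optimizer step, so it helps to know roughly how many steps one epoch contains. The sketch assumes a single device, no gradient accumulation, and the `TrainingArguments` default `per_device_train_batch_size` of 8. None of these are set explicitly above, and `steps_per_epoch` is a hypothetical helper.

```python
import math


def steps_per_epoch(num_examples: int, batch_size: int = 8) -> int:
    """Number of batches per epoch when the last partial batch is kept."""
    return math.ceil(num_examples / batch_size)


# 1857 training examples after the split
print(steps_per_epoch(1857))  # 233
```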

In [17]:
# We'll use HF Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["test"],
)

In [18]:
# Let's train!
trainer.train()

Epoch,Training Loss,Validation Loss


RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)`

In [None]:
transformers.logging.set_verbosity_error() # suppress tokenizer warnings

prefix = "Generate Backstory based on following information Character Name: "

prompts = [
    "Frogger Character Race: Aarakocra Character Class: Ranger Output: ",
    "Smarty Character Race: Aasimar Character Class: Cleric Output: ",
    "Volcano Character Race: Android Character Class: Paladin Output: ",
]

table = wandb.Table(columns=["prompt", "generation", "input_length", "output_length"])

for prompt in prompts:
    input_ids = tokenizer.encode(prefix + prompt, return_tensors="pt")
    input_ids = input_ids.to(device)
    output = model.generate(input_ids, do_sample=True, max_new_tokens=50, top_p=0.3)
    output_text = tokenizer.decode(output[0], skip_special_tokens=True)
    # input_ids has shape (1, seq_len), so len(input_ids) would always be 1
    table.add_data(prefix + prompt, output_text, input_ids.shape[1], len(output[0]))

    
wandb.log({'tiny_generations': table})
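
The `generate` call above samples with `top_p=0.3` (nucleus sampling): at each step, only the most likely tokens whose cumulative probability reaches `top_p` remain candidates. The sketch below shows the idea on a made-up four-token distribution in plain Python. It is a simplified illustration with a hypothetical `top_p_filter` function, not the Transformers implementation.

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose cumulative
    probability reaches top_p, and renormalize their probabilities."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    kept, cumulative = [], 0.0
    for index, prob in ranked:
        kept.append((index, prob))
        cumulative += prob
        if cumulative >= top_p:
            break
    total = sum(prob for _, prob in kept)
    return {index: prob / total for index, prob in kept}


probs = [0.5, 0.3, 0.15, 0.05]  # made-up next-token distribution
print(top_p_filter(probs, 0.3))  # only the most likely token survives
print(top_p_filter(probs, 0.9))  # the three most likely tokens survive
```

With a value as low as 0.3, a confident model will often keep a single candidate, so sampling behaves almost greedily. Raising `top_p` admits more candidates and more varied generations.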

In [None]:
wandb.finish()