<a href="https://colab.research.google.com/github/NiclasFenton-Wiegleb/schlager-lyrics-bot/blob/main/7b_Schlager_bot.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 7b Schlager Bot Model

The goal of this experiment is to fine-tune a model that writes lyrics for Schlager songs (a popular music genre in German-speaking countries) from a single verse given as input. To improve on our baseline model, we use a more powerful base model: Llama 2 from Meta (in the code below, the German-adapted checkpoint `LeoLM/leo-hessianai-7b`). We use the 7-billion-parameter variant because of GPU memory constraints.

In addition, unlike the baseline, we fine-tune with Hugging Face's `SFTTrainer` (from the `trl` library) and QLoRA (see:...) to make the most of the available GPU resources.

The implemented process is based on this tutorial:

_____________

## Prepare Dataset

Before we can train the model, we need to prepare the dataset and define a formatting function that is applied to each sample during fine-tuning.

In [None]:
#Install dependencies

!pip install "transformers==4.34.0" "datasets==2.13.0" "peft==0.5.0" "accelerate==0.23.0" "bitsandbytes==0.41.1" "trl==0.4.7" "safetensors>=0.3.1" "huggingface_hub" --upgrade

In [None]:
#Login to huggingface so we can push the trained model to the hub later

from huggingface_hub import notebook_login

notebook_login()


In [None]:
#Import Libraries

from datasets import load_dataset
from random import randrange
import pandas as pd
from datasets.dataset_dict import DatasetDict, Dataset
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, prepare_model_for_kbit_training, get_peft_model
from transformers import TrainingArguments
from trl import SFTTrainer
import os
from peft import AutoPeftModelForCausalLM

In [None]:
#Import data

df_lyrics = pd.read_csv('schlager_songs_v2.csv')
df_lyrics['lyrics'][10]

'{"status"false,"reason""Unexpected error occurred (no quota cost) Please try again later"}'

In [None]:
#Clean and prepare data to be turned into dataset

#Error message that appears in place of the lyrics for some rows
API_ERROR = '{"status"false,"reason""Unexpected error occurred (no quota cost) Please try again later"}'

context=[]
instruction=[]

for verse_1, lyrics in zip(df_lyrics["verse_1"], df_lyrics["lyrics"]):
  #Skip rows with a missing first verse (NaN) or with the error message as lyrics
  if not isinstance(verse_1, str) or lyrics == API_ERROR:
    continue
  #Instruction: the first verse; context: the full lyrics
  instruction.append(verse_1)
  context.append(lyrics)

df_dataset = pd.DataFrame({'instruction': instruction, 'context': context})

df_dataset.head(15)

| | instruction | context |
|---|---|---|
| 0 | Hey Hey Hey (Hey) In einer Kneipe bei Bier un... | Hey\n Hey\n Hey\n (Hey) In einer Kneipe bei B... |
| 1 | Jetzt geht es los, los, los | Jetzt geht es los, los, los\n Die Krüge hoch,... |
| 2 | Du bist die schönste, schönste, schönste Frau... | Du bist die schönste, schönste, schönste Frau... |
| 3 | Wie jedes Jahr Malle wir sind wieder da Ole O... | Wie jedes Jahr\n Malle wir sind wieder da\n O... |
| 4 | Der Wind is kalt und rau | Der Wind is kalt und rau\n Und mer wisse janz... |
| 5 | Oh oh oh yeoh Oh oh oh yeoh Oh oh oh yeoh Oh ... | Oh oh oh yeoh\n Oh oh oh yeoh\n Oh oh oh yeoh... |
| 6 | Ich hab Muskeln und 'nen durchtrainierten Bauch | Ich hab Muskeln und 'nen durchtrainierten Bau... |
| 7 | Wir sind wieder da Wie jedes Jahr Hier ist An... | Wir sind wieder da\n Wie jedes Jahr\n Hier is... |
| 8 | Oh Baby, komm doch her zu mir | Oh Baby, komm doch her zu mir\n Es ist mir eg... |
| 9 | Du bist der geilste Ort der Welt | Du bist der geilste Ort der Welt\n Bist unser... |


In [None]:
#Turn dataframe into dataset

dataset_lyrics = Dataset.from_pandas(df_dataset)
dataset_lyrics

Dataset({
    features: ['instruction', 'context'],
    num_rows: 1046
})

In [None]:
#Check sample from dataset

dataset_lyrics[randrange(len(dataset_lyrics))]

{'instruction': ' Vorbei ist vorbei Ich kann dich nicht halten Vorbei ist vorbei Was weg ist Ist weg',
 'context': " Vorbei ist vorbei\n Ich kann dich nicht halten\n Vorbei ist vorbei\n Was weg ist\n Ist weg\n Einfach weg\n Habe Liebe gesucht\n Ich will's nicht bestreiten\n Aber Einsamkeit tut weh\n Ich will noch mehr\n Ich geb nicht auf\n Ich will noch mehr\n Ich nehm alles in Kauf\n Du hast mich jedes mal belogen\n Aber keine war wie du\n Vorbei ist vorbei\n Das weisst du doch selber\n Vorbei ist vorbei\n Jetzt bin ich allein\n So allein\n Habe Fehler gemacht\n Ich will's nicht beschreiben\n Aber Einsamkeit tut weh\n Ich will noch mehr\n Ich geb nicht auf\n Ich will noch mehr\n Ich nehm alles in Kauf\n Du hast mich jedes mal belogen\n Aber keine war wie du\n"}

In [None]:
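#Create the formatting function that wraps each sample in the prompt template
#(the same template, spelled identically, is reused at inference time)

def format_instruction(sample):
    return f"""### Instruction: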
Benuzte den gegebenen Input um ein Schlager Lied zu schreiben.

### Input:
{sample['instruction']}

### Response:
{sample['context']}
"""

In [None]:
#Test formatting function

print(format_instruction(dataset_lyrics[randrange(len(dataset_lyrics))]))

### Instruction:
Benuzte den gegebenen Input um ein Schlager Lied zu schreiben.

### Input:
 Schön war die Zeit, als wir verliebt und glücklich waren

### Response:
 Schön war die Zeit, als wir verliebt und glücklich waren
 Noch jung und unerfahren, fast wie im Rausch flog jede Stunde dahin
 Schön war die Zeit, was hilft es uns, ihr nachzutrauern?
 Sie kann nicht ewig dauern, was auch geschehen ist
 Keinen Tag brauchten wir je zu bereuen
 Nein, wir haben uns nichts zu verzeihen
 Auch wenn jeder von uns heut eig'ne Wege geht
 Schön war die Zeit, das Paradies, das wir besessen
 Ich werd' es nie vergessen, wir glaubten, dass uns keiner daraus vertreibt
 Wird es auch alt, dein Bild ist tief in mir geblieben
 Ich werd es ewig lieben, was auch geschehen ist
 Keinen Tag brauchten wir je zu bereuen
 Nein, wir haben uns nichts zu verzeihen
 Auch wenn jeder von uns heut eig'ne Wege geht
 Und hätte ich die Wahl, ich würd' alles genau so wieder machen
 Noch einmal mit dir weinen und lachen
 Durch 
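At inference time the same template is needed with the response section left empty, so that the model continues from there. A minimal sketch, assuming a hypothetical helper `build_prompt` and constant `PROMPT_TEMPLATE` (neither is part of the original notebook); the instruction text is copied verbatim from `format_instruction`, including its spelling, because the prompt should match what the model saw during training:

```python
# Hypothetical helper: mirrors the training template from format_instruction,
# but leaves the response section empty so the model has to complete it.
PROMPT_TEMPLATE = """### Instruction:
Benuzte den gegebenen Input um ein Schlager Lied zu schreiben.

### Input:
{verse}

### Response:
"""


def build_prompt(verse: str) -> str:
    """Wrap a verse in the prompt template used during fine-tuning."""
    return PROMPT_TEMPLATE.format(verse=verse)


print(build_prompt("Schön war die Zeit"))
```

Keeping the template in one place avoids small differences between the training and inference prompts.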

_________________________________
## Train the Model

Now, we can set the hyperparameters and start the training process.

In [None]:
#Set whether to use flash attention (requires an Ampere or newer GPU)

use_flash_attention = False

In [None]:
#Load base model and tokenizer

#Hugging Face model id (German fine-tuned Llama 2 instance)
model_id = "LeoLM/leo-hessianai-7b"

#BitsAndBytesConfig for 4-bit (NF4) quantisation
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

#Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    use_cache=False,
    use_flash_attention_2=use_flash_attention,
    device_map="auto",
)
model.config.pretraining_tp = 1


tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"


In [None]:
#Prepare the LoRA adapter layers to be trained

#LoRA config based on QLoRA paper
peft_config = LoraConfig(
        lora_alpha=16,
        lora_dropout=0.1,
        r=64,
        bias="none",
        task_type="CAUSAL_LM",
)

#Prepare model for training
model = prepare_model_for_kbit_training(model)
model = get_peft_model(model, peft_config)
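The configuration above trains only the small low-rank matrices, not the base weights. The sketch below counts those trainable parameters. The rank and alpha come from the `LoraConfig` above; everything else is an assumption that the notebook does not state: that the adapter targets the `q_proj` and `v_proj` matrices of every layer, and that the base model has the Llama 2 7B shape (hidden size 4096, 32 layers).

```python
# Rough count of trainable LoRA parameters.
# Assumptions (not stated in the notebook): the adapter targets q_proj and
# v_proj in every layer, and the base model has hidden size 4096 and 32 layers.
R = 64                 # LoRA rank, from LoraConfig(r=64)
LORA_ALPHA = 16        # from LoraConfig(lora_alpha=16)
HIDDEN_SIZE = 4096     # assumption
NUM_LAYERS = 32        # assumption
TARGETS_PER_LAYER = 2  # assumption: q_proj and v_proj


def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Parameters of A (r x d_in) plus B (d_out x r) for one adapted matrix."""
    return r * (d_in + d_out)


trainable = NUM_LAYERS * TARGETS_PER_LAYER * lora_params(HIDDEN_SIZE, HIDDEN_SIZE, R)
scaling = LORA_ALPHA / R  # factor applied to the low-rank update B @ A

print(f"trainable LoRA parameters: {trainable:,}")
print(f"share of a 7e9-parameter model: {trainable / 7e9:.2%}")
print(f"LoRA scaling factor: {scaling}")
```

Under these assumptions the adapter is well under one percent of the base model's size, which is what makes fine-tuning feasible on a single GPU.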

In [None]:
#Set training hyperparameters

args = TrainingArguments(
    output_dir="./content/drive/MyDrive/AiCore/trained_model/schlager-bot-004",
    num_train_epochs=10,
    per_device_train_batch_size=6 if use_flash_attention else 4,
    gradient_accumulation_steps=2,
    gradient_checkpointing=True,
    optim="paged_adamw_32bit",
    logging_steps=10,
    save_strategy="epoch",
    learning_rate=5e-4,
    # bf16=True, # Only on A100
    fp16=True, # On V100
    # tf32=True, #Only on A100
    max_grad_norm=0.3,
    warmup_ratio=0.03, # has no effect with the "constant" scheduler (no warmup phase)
    lr_scheduler_type="constant",
    disable_tqdm=True # disable tqdm since with packing the values are incorrect
)
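The batch size and gradient accumulation settings combine into an effective batch size per optimizer step. The sketch below works this out for the non-flash-attention branch, assuming a single GPU (the notebook does not state the device count), and cross-checks it against the throughput figures in the final line of the training log.

```python
# Values taken from the TrainingArguments above (use_flash_attention = False)
per_device_train_batch_size = 4
gradient_accumulation_steps = 2
num_devices = 1  # assumption: a single GPU

# Sequences that contribute to each optimizer step
effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)
print(f"effective batch size: {effective_batch_size}")

# Cross-check with the training summary: samples per second / steps per second
samples_per_step = 2.011 / 0.252
print(f"samples per optimizer step according to the log: {samples_per_step:.2f}")
```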

In [None]:
#Set trainer settings

max_seq_length = 500 #Can be optimised based on sample length in dataset

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset_lyrics,
    peft_config=peft_config,
    max_seq_length=max_seq_length,
    tokenizer=tokenizer,
    packing=True,
    formatting_func=format_instruction,
    args=args,
)
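With `packing=True`, several formatted samples are concatenated and cut into chunks of `max_seq_length` tokens instead of padding each sample individually. The sketch below illustrates the idea on toy token ids; `pack_sequences` is a hypothetical, simplified stand-in and not the implementation used by `trl`.

```python
from typing import Iterable, List


def pack_sequences(
    tokenized: Iterable[List[int]], eos_id: int, seq_length: int
) -> List[List[int]]:
    """Concatenate samples (each followed by EOS) and cut fixed-length chunks."""
    buffer: List[int] = []
    for ids in tokenized:
        buffer.extend(ids)
        buffer.append(eos_id)
    # Drop the incomplete tail so that every chunk has exactly seq_length tokens
    n_chunks = len(buffer) // seq_length
    return [buffer[i * seq_length:(i + 1) * seq_length] for i in range(n_chunks)]


# Toy token ids; 0 plays the role of the EOS token
samples = [[11, 12, 13], [21, 22], [31, 32, 33, 34]]
for chunk in pack_sequences(samples, eos_id=0, seq_length=4):
    print(chunk)
```

Note that a chunk can start or end in the middle of a song, so the choice of `max_seq_length` affects how much of each song the model sees in one piece.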

In [None]:
#Limit memory fragmentation: the CUDA caching allocator will not split blocks larger than 512 MB
os.environ['PYTORCH_CUDA_ALLOC_CONF']="max_split_size_mb:512"

In [None]:
#Train
trainer.train() # there will not be a progress bar since tqdm is disabled

#Save model
trainer.save_model()

{'loss': 1.3464, 'learning_rate': 0.0005, 'epoch': 0.08}
{'loss': 1.3198, 'learning_rate': 0.0005, 'epoch': 0.15}
{'loss': 1.2712, 'learning_rate': 0.0005, 'epoch': 0.23}
{'loss': 1.2614, 'learning_rate': 0.0005, 'epoch': 0.31}
{'loss': 1.2492, 'learning_rate': 0.0005, 'epoch': 0.38}
{'loss': 1.25, 'learning_rate': 0.0005, 'epoch': 0.46}
{'loss': 1.2028, 'learning_rate': 0.0005, 'epoch': 0.53}
{'loss': 1.2854, 'learning_rate': 0.0005, 'epoch': 0.61}
{'loss': 1.2336, 'learning_rate': 0.0005, 'epoch': 0.69}
{'loss': 1.2775, 'learning_rate': 0.0005, 'epoch': 0.76}
{'loss': 1.2377, 'learning_rate': 0.0005, 'epoch': 0.84}
{'loss': 1.2591, 'learning_rate': 0.0005, 'epoch': 0.92}
{'loss': 1.2261, 'learning_rate': 0.0005, 'epoch': 0.99}
{'loss': 1.1913, 'learning_rate': 0.0005, 'epoch': 1.07}




{'loss': 1.2617, 'learning_rate': 0.0005, 'epoch': 1.03}
{'loss': 1.1621, 'learning_rate': 0.0005, 'epoch': 1.1}
{'loss': 1.1546, 'learning_rate': 0.0005, 'epoch': 1.18}
{'loss': 1.188, 'learning_rate': 0.0005, 'epoch': 1.26}
{'loss': 1.1413, 'learning_rate': 0.0005, 'epoch': 1.33}
{'loss': 1.0094, 'learning_rate': 0.0005, 'epoch': 1.41}
{'loss': 1.106, 'learning_rate': 0.0005, 'epoch': 1.48}
{'loss': 1.0679, 'learning_rate': 0.0005, 'epoch': 1.56}
{'loss': 1.1239, 'learning_rate': 0.0005, 'epoch': 1.64}
{'loss': 1.131, 'learning_rate': 0.0005, 'epoch': 1.71}
{'loss': 1.1334, 'learning_rate': 0.0005, 'epoch': 1.79}
{'loss': 1.1672, 'learning_rate': 0.0005, 'epoch': 1.87}
{'loss': 1.0772, 'learning_rate': 0.0005, 'epoch': 1.94}
{'loss': 1.0363, 'learning_rate': 0.0005, 'epoch': 2.02}
{'loss': 1.101, 'learning_rate': 0.0005, 'epoch': 2.1}




{'loss': 0.9971, 'learning_rate': 0.0005, 'epoch': 2.05}
{'loss': 1.0153, 'learning_rate': 0.0005, 'epoch': 2.13}
{'loss': 0.9316, 'learning_rate': 0.0005, 'epoch': 2.21}
{'loss': 0.934, 'learning_rate': 0.0005, 'epoch': 2.28}
{'loss': 0.8861, 'learning_rate': 0.0005, 'epoch': 2.36}
{'loss': 0.9474, 'learning_rate': 0.0005, 'epoch': 2.44}
{'loss': 0.9517, 'learning_rate': 0.0005, 'epoch': 2.51}
{'loss': 0.9567, 'learning_rate': 0.0005, 'epoch': 2.59}
{'loss': 0.9101, 'learning_rate': 0.0005, 'epoch': 2.66}
{'loss': 0.8853, 'learning_rate': 0.0005, 'epoch': 2.74}
{'loss': 0.9246, 'learning_rate': 0.0005, 'epoch': 2.82}
{'loss': 0.8847, 'learning_rate': 0.0005, 'epoch': 2.89}
{'loss': 0.9064, 'learning_rate': 0.0005, 'epoch': 2.97}
{'loss': 0.9469, 'learning_rate': 0.0005, 'epoch': 3.05}




{'loss': 0.9417, 'learning_rate': 0.0005, 'epoch': 3.0}
{'loss': 0.743, 'learning_rate': 0.0005, 'epoch': 3.08}
{'loss': 0.7257, 'learning_rate': 0.0005, 'epoch': 3.16}
{'loss': 0.7792, 'learning_rate': 0.0005, 'epoch': 3.23}
{'loss': 0.7119, 'learning_rate': 0.0005, 'epoch': 3.31}
{'loss': 0.7124, 'learning_rate': 0.0005, 'epoch': 3.39}
{'loss': 0.7518, 'learning_rate': 0.0005, 'epoch': 3.46}
{'loss': 0.7126, 'learning_rate': 0.0005, 'epoch': 3.54}
{'loss': 0.7163, 'learning_rate': 0.0005, 'epoch': 3.61}
{'loss': 0.7312, 'learning_rate': 0.0005, 'epoch': 3.69}
{'loss': 0.7352, 'learning_rate': 0.0005, 'epoch': 3.77}
{'loss': 0.7236, 'learning_rate': 0.0005, 'epoch': 3.84}
{'loss': 0.7474, 'learning_rate': 0.0005, 'epoch': 3.92}
{'loss': 0.784, 'learning_rate': 0.0005, 'epoch': 4.0}
{'loss': 0.7032, 'learning_rate': 0.0005, 'epoch': 4.07}




{'loss': 0.7123, 'learning_rate': 0.0005, 'epoch': 4.03}
{'loss': 0.5264, 'learning_rate': 0.0005, 'epoch': 4.11}
{'loss': 0.5811, 'learning_rate': 0.0005, 'epoch': 4.18}
{'loss': 0.5385, 'learning_rate': 0.0005, 'epoch': 4.26}
{'loss': 0.5866, 'learning_rate': 0.0005, 'epoch': 4.34}
{'loss': 0.5599, 'learning_rate': 0.0005, 'epoch': 4.41}
{'loss': 0.535, 'learning_rate': 0.0005, 'epoch': 4.49}
{'loss': 0.5438, 'learning_rate': 0.0005, 'epoch': 4.56}
{'loss': 0.6351, 'learning_rate': 0.0005, 'epoch': 4.64}
{'loss': 0.5723, 'learning_rate': 0.0005, 'epoch': 4.72}
{'loss': 0.603, 'learning_rate': 0.0005, 'epoch': 4.79}
{'loss': 0.5983, 'learning_rate': 0.0005, 'epoch': 4.87}
{'loss': 0.5516, 'learning_rate': 0.0005, 'epoch': 4.95}
{'loss': 0.5647, 'learning_rate': 0.0005, 'epoch': 5.02}
{'loss': 0.5621, 'learning_rate': 0.0005, 'epoch': 5.1}




{'loss': 0.48, 'learning_rate': 0.0005, 'epoch': 5.06}
{'loss': 0.4428, 'learning_rate': 0.0005, 'epoch': 5.13}
{'loss': 0.3718, 'learning_rate': 0.0005, 'epoch': 5.21}
{'loss': 0.3979, 'learning_rate': 0.0005, 'epoch': 5.29}
{'loss': 0.4352, 'learning_rate': 0.0005, 'epoch': 5.36}
{'loss': 0.5018, 'learning_rate': 0.0005, 'epoch': 5.44}
{'loss': 0.4215, 'learning_rate': 0.0005, 'epoch': 5.52}
{'loss': 0.3887, 'learning_rate': 0.0005, 'epoch': 5.59}
{'loss': 0.4216, 'learning_rate': 0.0005, 'epoch': 5.67}
{'loss': 0.4366, 'learning_rate': 0.0005, 'epoch': 5.74}
{'loss': 0.3961, 'learning_rate': 0.0005, 'epoch': 5.82}
{'loss': 0.4107, 'learning_rate': 0.0005, 'epoch': 5.9}
{'loss': 0.4423, 'learning_rate': 0.0005, 'epoch': 5.97}
{'loss': 0.4587, 'learning_rate': 0.0005, 'epoch': 6.05}




{'loss': 0.4675, 'learning_rate': 0.0005, 'epoch': 6.01}
{'loss': 0.2954, 'learning_rate': 0.0005, 'epoch': 6.08}
{'loss': 0.3173, 'learning_rate': 0.0005, 'epoch': 6.16}
{'loss': 0.2678, 'learning_rate': 0.0005, 'epoch': 6.24}
{'loss': 0.3404, 'learning_rate': 0.0005, 'epoch': 6.31}
{'loss': 0.305, 'learning_rate': 0.0005, 'epoch': 6.39}
{'loss': 0.3376, 'learning_rate': 0.0005, 'epoch': 6.47}
{'loss': 0.3221, 'learning_rate': 0.0005, 'epoch': 6.54}
{'loss': 0.2982, 'learning_rate': 0.0005, 'epoch': 6.62}
{'loss': 0.3214, 'learning_rate': 0.0005, 'epoch': 6.69}
{'loss': 0.3311, 'learning_rate': 0.0005, 'epoch': 6.77}
{'loss': 0.3175, 'learning_rate': 0.0005, 'epoch': 6.85}
{'loss': 0.2966, 'learning_rate': 0.0005, 'epoch': 6.92}
{'loss': 0.36, 'learning_rate': 0.0005, 'epoch': 7.0}
{'loss': 0.3337, 'learning_rate': 0.0005, 'epoch': 7.08}




{'loss': 0.2868, 'learning_rate': 0.0005, 'epoch': 7.03}
{'loss': 0.2469, 'learning_rate': 0.0005, 'epoch': 7.11}
{'loss': 0.2162, 'learning_rate': 0.0005, 'epoch': 7.19}
{'loss': 0.2457, 'learning_rate': 0.0005, 'epoch': 7.26}
{'loss': 0.2286, 'learning_rate': 0.0005, 'epoch': 7.34}
{'loss': 0.2027, 'learning_rate': 0.0005, 'epoch': 7.42}
{'loss': 0.2363, 'learning_rate': 0.0005, 'epoch': 7.49}
{'loss': 0.2317, 'learning_rate': 0.0005, 'epoch': 7.57}
{'loss': 0.2255, 'learning_rate': 0.0005, 'epoch': 7.65}
{'loss': 0.2492, 'learning_rate': 0.0005, 'epoch': 7.72}
{'loss': 0.2307, 'learning_rate': 0.0005, 'epoch': 7.8}
{'loss': 0.2394, 'learning_rate': 0.0005, 'epoch': 7.87}
{'loss': 0.2612, 'learning_rate': 0.0005, 'epoch': 7.95}
{'loss': 0.2519, 'learning_rate': 0.0005, 'epoch': 8.03}
{'loss': 0.2491, 'learning_rate': 0.0005, 'epoch': 8.1}




{'loss': 0.1698, 'learning_rate': 0.0005, 'epoch': 8.06}
{'loss': 0.1649, 'learning_rate': 0.0005, 'epoch': 8.14}
{'loss': 0.1651, 'learning_rate': 0.0005, 'epoch': 8.21}
{'loss': 0.1778, 'learning_rate': 0.0005, 'epoch': 8.29}
{'loss': 0.1646, 'learning_rate': 0.0005, 'epoch': 8.37}
{'loss': 0.1665, 'learning_rate': 0.0005, 'epoch': 8.44}
{'loss': 0.1774, 'learning_rate': 0.0005, 'epoch': 8.52}
{'loss': 0.1893, 'learning_rate': 0.0005, 'epoch': 8.6}
{'loss': 0.1679, 'learning_rate': 0.0005, 'epoch': 8.67}
{'loss': 0.1901, 'learning_rate': 0.0005, 'epoch': 8.75}
{'loss': 0.1942, 'learning_rate': 0.0005, 'epoch': 8.82}
{'loss': 0.1928, 'learning_rate': 0.0005, 'epoch': 8.9}
{'loss': 0.195, 'learning_rate': 0.0005, 'epoch': 8.98}
{'loss': 0.1789, 'learning_rate': 0.0005, 'epoch': 9.05}
{'train_runtime': 5200.3916, 'train_samples_per_second': 2.011, 'train_steps_per_second': 0.252, 'train_loss': 0.64468693787815, 'epoch': 9.05}
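The log entries are plain dictionaries, so the loss trend is easy to summarise. A minimal sketch, using a handful of entries copied from the log above and a hypothetical helper `mean_loss_by_epoch`:

```python
from collections import defaultdict

# A few entries copied from the training log above
logs = [
    {'loss': 1.3464, 'learning_rate': 0.0005, 'epoch': 0.08},
    {'loss': 1.3198, 'learning_rate': 0.0005, 'epoch': 0.15},
    {'loss': 0.5264, 'learning_rate': 0.0005, 'epoch': 4.11},
    {'loss': 0.5811, 'learning_rate': 0.0005, 'epoch': 4.18},
    {'loss': 0.195, 'learning_rate': 0.0005, 'epoch': 8.98},
    {'loss': 0.1789, 'learning_rate': 0.0005, 'epoch': 9.05},
]


def mean_loss_by_epoch(entries):
    """Average the logged loss over all entries that fall into the same epoch."""
    buckets = defaultdict(list)
    for entry in entries:
        buckets[int(entry['epoch'])].append(entry['loss'])
    return {epoch: sum(v) / len(v) for epoch, v in sorted(buckets.items())}


for epoch, loss in mean_loss_by_epoch(logs).items():
    print(f"epoch {epoch}: mean loss {loss:.4f}")
```

Since no held-out set is used in this run, a falling training loss alone does not show whether the model generalises or memorises the lyrics.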


In [None]:
#Push peft adapter to huggingface hub

peft_model_id = "niclasfw/schlager-bot-004"
trainer.push_to_hub(peft_model_id)


'https://huggingface.co/niclasfw/schlager-bot-004/tree/main/'

__________________________
## Merge Model and Adapter Layer

To make inference faster and the model easier to load in the future, we merge the trained adapter layer into the base model and push the merged model to the hub.
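Merging works because a LoRA adapter only adds a low-rank update to each adapted weight matrix: the merged weight is W + (alpha / r) * B A. The toy sketch below checks on small matrices that applying the merged weight gives the same result as applying the base weight and the adapter separately. It illustrates the arithmetic only; `matmul` and `merge_lora` are hypothetical helpers, not the `peft` implementation.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(w, lora_a, lora_b, alpha, r):
    """Return W + (alpha / r) * B @ A, the merged weight matrix."""
    scale = alpha / r
    delta = matmul(lora_b, lora_a)
    return [[wij + scale * dij for wij, dij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


# Toy example: a 2x2 weight with a rank-1 adapter
ALPHA, RANK = 2, 1
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 2.0]]    # shape (r, d_in)  = (1, 2)
B = [[0.5], [1.0]]  # shape (d_out, r) = (2, 1)
x = [[3.0], [4.0]]  # input as a column vector

merged = merge_lora(W, A, B, alpha=ALPHA, r=RANK)

# Adapter kept separate: W x + (alpha / r) * B (A x)
separate = [[base[0] + (ALPHA / RANK) * upd[0]]
            for base, upd in zip(matmul(W, x), matmul(B, matmul(A, x)))]

print(matmul(merged, x) == separate)
```

After merging, inference needs a single matrix multiplication per layer instead of the base path plus the adapter path.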



In [None]:
#Load base model and tokenizer
#(only the tokenizer is reused below; the next cell reloads the base weights together with the adapter)

model_id = "LeoLM/leo-hessianai-7b"

#Load base LLM and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    low_cpu_mem_usage=True,
    torch_dtype=torch.float16,
    load_in_4bit=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)


In [None]:
#Load adapter layer, merge model and push to huggingface hub

peft_model_id = "niclasfw/schlager-bot-004"

#Load the base model with the trained LoRA adapter attached
model = AutoPeftModelForCausalLM.from_pretrained(
    pretrained_model_name_or_path=peft_model_id,
    low_cpu_mem_usage=True,
)

#Merge LoRA and base model
merged_model = model.merge_and_unload()

#Save the merged model
merged_model.save_pretrained("merged_model",safe_serialization=True)
tokenizer.save_pretrained("merged_model")

#push merged model to the hub
merged_model.push_to_hub(peft_model_id)
tokenizer.push_to_hub(peft_model_id)


Using pad_token, but it is not set yet.

CommitInfo(commit_url='https://huggingface.co/niclasfw/schlager-bot-004/commit/d99377e8f249e0ded77409d37732b230d774e0f7', commit_message='Upload tokenizer', commit_description='', oid='d99377e8f249e0ded77409d37732b230d774e0f7', pr_url=None, pr_revision=None, pr_num=None)

________________________________________________
## Inference Testing

Finally, we test the model and tweak the generation parameters to get the best output we can.
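The generation call in this section samples with `temperature`, `top_k` and `top_p`. The sketch below shows, on a toy set of logits, how these three parameters narrow down the candidates for the next token. It is a simplified illustration under the assumption that the filters are applied in the order temperature, top-k, top-p; `filter_distribution` is a hypothetical helper, not the `transformers` implementation.

```python
import math


def filter_distribution(logits, temperature=0.95, top_k=15, top_p=0.75):
    """Return {token_index: probability} after temperature, top-k and top-p."""
    # Temperature: values below 1 sharpen the distribution, values above 1 flatten it
    scaled = [value / temperature for value in logits]

    # Top-k: keep only the k highest-scoring tokens
    order = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)
    kept = order[:top_k]

    # Softmax over the kept tokens
    peak = max(scaled[i] for i in kept)
    exps = {i: math.exp(scaled[i] - peak) for i in kept}
    total = sum(exps.values())
    probs = {i: e / total for i, e in exps.items()}

    # Top-p: smallest set of most likely tokens whose cumulative mass reaches top_p
    nucleus, cumulative = [], 0.0
    for i in kept:
        nucleus.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Renormalise so that the remaining probabilities sum to 1
    norm = sum(probs[i] for i in nucleus)
    return {i: probs[i] / norm for i in nucleus}


toy_logits = [2.0, 1.0, 0.0, -1.0]
print(filter_distribution(toy_logits))
```

Lower values of `top_k` and `top_p` make the output more predictable, while higher values allow more varied lyrics at the risk of less coherent text.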

In [2]:
import torch
from transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

In [4]:
model_id = "niclasfw/schlager-bot-004"

# load the fine-tuned model and tokenizer from the hub
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    low_cpu_mem_usage=True,
    torch_dtype=torch.float16,
    load_in_4bit=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)


Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


In [8]:
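#Choose a verse to use as input, preferably something the model hasn't seen during training
#(note: "/n" in the sample is a typo for "\n"; it is left in because the outputs below were generated with it)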

sample = f"""Wer ist der attraktivste Spieler der Bundesliga?/n
Es ist Rosenfelder!\n
Max Rosenfelder/n
Wir wollen mit dem schonsten Mann der Liga feiern gehen!
"""

prompt = f"""### Instruction:
Benuzte den gegebenen Input um ein Schlager Lied zu schreiben.

### Input:
{sample}

### Response:
"""

#Tokenise the prompt once and move it to the GPU
input_ids = tokenizer(prompt, return_tensors="pt", truncation=True).input_ids.cuda()

#Generate 5 outputs to check consistency
for n in range(5):
    outputs = model.generate(
        input_ids=input_ids, pad_token_id=tokenizer.eos_token_id, max_new_tokens=500,
        do_sample=True, top_p=0.75, temperature=0.95, top_k=15,
    )
    #Decode and strip the prompt so that only the generated text remains
    generated = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0][len(prompt):]

    print(f"Version {n+1}")
    print(f"Prompt:\n{sample}\n")
    print(f"Generated output:\n{generated}")

Version 1
Prompt:
Wer ist der attraktivste Spieler der Bundesliga?/n
Es ist Rosenfelder!

Max Rosenfelder/n
Wir wollen mit dem schonsten Mann der Liga feiern gehen!


Generated output:
Wer ist der attraktivste Spieler der Bundesliga?/n
Es ist Rosenfelder!
 Max Rosenfelder/n
 Wir wollen mit dem schonsten Mann der Liga feiern gehen!
 Es ist Rosenfelder!
 Rosenfelder! Rosenfelder! Rosenfelder!
 Wir wollen jetzt Max Rosenfelder singen!
 Rosenfelder! Rosenfelder!
 Wer ist der attraktivste Spieler der Bundesliga?/n
 Es ist Rosenfelder!
 Max Rosenfelder/n
 Wir wollen mit dem schonsten Mann der Liga feiern gehen!
 Es ist Rosenfelder!
 Rosenfelder!


Version 2
Prompt:
Wer ist der attraktivste Spieler der Bundesliga?/n
Es ist Rosenfelder!

Max Rosenfelder/n
Wir wollen mit dem schonsten Mann der Liga feiern gehen!


Generated output:
Wer ist der attraktivste Spieler der Bundesliga?/n
Es ist Rosenfelder!
 Max Rosenfelder/n
Wir wollen mit dem schonsten Mann der Liga feiern gehen!
 Es ist Rosenfelde