# **Fine-Tuning Google Gemma (gemma-2b-it)**

Install the required libraries:

In [None]:
%%capture
%pip install -U bitsandbytes
%pip install -U transformers
%pip install -U peft
%pip install -U accelerate
%pip install -U trl
%pip install -U datasets
%pip install -q tensorflow-cpu
%pip install -q -U keras-nlp tensorflow-hub
# Quote the requirement so the shell does not treat ">=3" as an output redirect
%pip install -q -U "keras>=3"
%pip install -q -U tensorflow-text

Update transformers, then restart the kernel:

In [None]:
%pip install --upgrade transformers

Start the fine-tuning:

In [1]:
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    HfArgumentParser,
    TrainingArguments,
    pipeline,
    logging,
)
from peft import (
    LoraConfig,
    PeftModel,
    prepare_model_for_kbit_training,
    get_peft_model,
)
import os, torch, wandb
from datasets import load_dataset
from trl import SFTTrainer

In [2]:
base_model = "google/gemma-2b-it"
dataset_name = "aymanboufarhi/fstt_data"
new_model = "chat-bot-fstt"

In [3]:
from kaggle_secrets import UserSecretsClient
user_secrets = UserSecretsClient()

Log in to Hugging Face with a read-access token:

In [4]:
!huggingface-cli login --token $'read key'

The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
Token is valid (permission: read).
Your token has been saved to /root/.cache/huggingface/token
Login successful
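Rather than pasting the token into the cell, one option is to look it up by name at run time. The sketch below is a minimal illustration, not the notebook's own method: `resolve_token` and the secret name `HF_TOKEN` are hypothetical, and the commented usage assumes the `user_secrets` client created earlier plus `login` from `huggingface_hub`.

```python
import os
from typing import Callable, Optional


def resolve_token(name: str, secret_getter: Optional[Callable[[str], str]] = None) -> str:
    """Look up a token by name: environment variable first, then an optional secret store."""
    token = os.environ.get(name)
    if not token and secret_getter is not None:
        token = secret_getter(name)
    if not token:
        raise RuntimeError(f"No token found under the name {name!r}")
    return token


# Hypothetical usage on Kaggle, assuming a secret named "HF_TOKEN" was added via Add-ons:
# from huggingface_hub import login
# login(token=resolve_token("HF_TOKEN", user_secrets.get_secret))
```

Keeping the token out of the cell source also keeps it out of any shared or exported copy of the notebook.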


Add your Weights & Biases (wandb) API key to Kaggle Secrets (Add-ons menu) under the name `wandb`:

In [5]:
secret_wandb = user_secrets.get_secret("wandb")

# Monitoring the LLM
wandb.login(key = secret_wandb)
run = wandb.init(
    project='Fine tuning Gemma 2B',
    job_type="training", 
    anonymous="allow"
)

[34m[1mwandb[0m: W&B API key is configured. Use [1m`wandb login --relogin`[0m to force relogin
[34m[1mwandb[0m: Appending key for api.wandb.ai to your netrc file: /root/.netrc
[34m[1mwandb[0m: Currently logged in as: [33maymanboufarhi1[0m ([33mfstt1[0m). Use [1m`wandb login --relogin`[0m to force relogin


In [6]:
# Load the dataset and inspect one example
dataset = load_dataset(dataset_name, split="train[0:1000]")
dataset["text"][100]

Downloading data:   0%|          | 0.00/166k [00:00<?, ?B/s]

Generating train split:   0%|          | 0/276 [00:00<?, ? examples/s]

"<|system|> FSTT c'est la Faculté des Sciences et Techniques de Tanger <</s>> <|user|>  Donne l'objectif de Biotechnologies (Options : animale et végétale) </s> <|assistant|> L’objectif de la licence Biotechnologie consiste à former des étudiants dans les différents champs disciplinaires de la biologie moderne et de permettre aux étudiants d’acquérir de solides connaissances scientifiques pour une réelle compréhension du Vivant. L’enseignement est conçu pour allier concepts fondamentaux et apprentissage des outils méthodologiques. Il est également conçu pour étendre les compétences au-delà du domaine de la biologie, en particulier vers le monde économique. Ce parcours de licence vise donc à :  Donner des bases fondamentales et des techniques dans le domaine de la biologie moléculaire, de la génétique, de l’immunologie, du génie microbiologique, de l’enzymologie, de la physiologie animale ou végétale, de la biotechnologie animale ou végétale, de l’Ecologie appliquée à la biologie et de 

In [7]:
# Load base model(Gemma 2B-it)
bnbConfig = BitsAndBytesConfig(
    load_in_4bit = True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
        base_model,
        quantization_config=bnbConfig,
        device_map="auto"
)

model.config.use_cache = False # silence the warnings. Please re-enable for inference!
model.config.pretraining_tp = 1
model.gradient_checkpointing_enable()

config.json:   0%|          | 0.00/627 [00:00<?, ?B/s]

model.safetensors.index.json:   0%|          | 0.00/13.5k [00:00<?, ?B/s]

Downloading shards:   0%|          | 0/2 [00:00<?, ?it/s]

model-00001-of-00002.safetensors:   0%|          | 0.00/4.95G [00:00<?, ?B/s]

model-00002-of-00002.safetensors:   0%|          | 0.00/67.1M [00:00<?, ?B/s]

`config.hidden_act` is ignored, you should use `config.hidden_activation` instead.
Gemma's activation function will be set to `gelu_pytorch_tanh`. Please, use
`config.hidden_activation` if you want to override this behaviour.
See https://github.com/huggingface/transformers/pull/29402 for more details.


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

generation_config.json:   0%|          | 0.00/137 [00:00<?, ?B/s]
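To see why the 4-bit load matters, the weight footprint can be estimated from the parameter count. This is a rough sketch under stated assumptions: the figure of about 2.5 billion parameters is inferred from the two shards listed above (4.95 GB + 67.1 MB, assuming 16-bit weights), and `weight_memory_gb` is a hypothetical helper. The result is a lower bound, since layers that are typically left unquantized (such as embeddings) and the quantization constants add to it.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate storage for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Assumption: ~2.5e9 parameters, inferred from the checkpoint shard sizes
# (about 5.02 GB in total at 2 bytes per weight).
NUM_PARAMS = 2.5e9

print(f"16-bit weights: {weight_memory_gb(NUM_PARAMS, 16):.2f} GB")
print(f"4-bit weights:  {weight_memory_gb(NUM_PARAMS, 4):.2f} GB")
```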

In [8]:
# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.padding_side = 'right'
tokenizer.pad_token = tokenizer.eos_token
tokenizer.add_eos_token = True
tokenizer.add_bos_token, tokenizer.add_eos_token

tokenizer_config.json:   0%|          | 0.00/34.2k [00:00<?, ?B/s]

tokenizer.model:   0%|          | 0.00/4.24M [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/17.5M [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/636 [00:00<?, ?B/s]

(True, True)

In [9]:
model = prepare_model_for_kbit_training(model)
peft_config = LoraConfig(
    lora_alpha=16,
    lora_dropout=0.1,
    r=64,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=['o_proj', 'q_proj', 'up_proj', 'v_proj', 'k_proj', 'down_proj', 'gate_proj']
)
model = get_peft_model(model, peft_config)
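The number of parameters this LoRA configuration trains can be estimated by hand: each adapted linear layer of shape `(d_in, d_out)` gains two low-rank matrices holding `r * (d_in + d_out)` values in total. The sketch below assumes the published gemma-2b dimensions (18 decoder layers, hidden size 2048, MLP size 16384, a single 256-wide key/value head); these shapes are assumptions not stated in this notebook, and `lora_params` is a hypothetical helper.

```python
R = 64            # LoRA rank from the config above
NUM_LAYERS = 18   # assumption: number of decoder layers in gemma-2b

# Assumption: (in_features, out_features) of each targeted projection in gemma-2b
PROJ_SHAPES = {
    "q_proj": (2048, 2048),
    "k_proj": (2048, 256),
    "v_proj": (2048, 256),
    "o_proj": (2048, 2048),
    "gate_proj": (2048, 16384),
    "up_proj": (2048, 16384),
    "down_proj": (16384, 2048),
}


def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Parameters added by one adapter: A has shape (r, d_in), B has shape (d_out, r)."""
    return r * (d_in + d_out)


per_layer = sum(lora_params(d_in, d_out, R) for d_in, d_out in PROJ_SHAPES.values())
total = per_layer * NUM_LAYERS

print(f"trainable LoRA parameters: {total:,}")
print(f"size in float32: {total * 4 / 1e6:.0f} MB")
```

Under these assumptions the estimate comes to roughly 78 million parameters, about 314 MB in float32, which is consistent with the size of the `adapter_model.safetensors` file uploaded at the end of the notebook. With `lora_alpha=16` and `r=64`, the adapter output is scaled by `lora_alpha / r = 0.25`.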

In [16]:
training_arguments = TrainingArguments(
    output_dir="./gemma-2b-it-v2-fstt",
    num_train_epochs=18,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=1,
    optim="paged_adamw_32bit",
    save_strategy="epoch",
    logging_steps=100,
    logging_strategy="steps",
    learning_rate=2e-4,
    fp16=False,
    bf16=False,
    group_by_length=True,
    report_to="wandb"
)
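These arguments determine how many optimizer steps the run takes. A small sketch of the arithmetic, assuming a single training process and the 276 examples reported when the dataset was loaded; `total_optimizer_steps` is a hypothetical helper, and its rounding for `gradient_accumulation_steps > 1` is an assumption that may differ between library versions.

```python
import math


def total_optimizer_steps(num_examples: int, per_device_batch: int,
                          grad_accum: int, epochs: int, num_devices: int = 1) -> int:
    """Estimate the number of optimizer updates over the whole run."""
    batches_per_epoch = math.ceil(num_examples / (per_device_batch * num_devices))
    # Assumption: one update per `grad_accum` batches, with at least one update per epoch
    steps_per_epoch = max(batches_per_epoch // grad_accum, 1)
    return steps_per_epoch * epochs


steps = total_optimizer_steps(num_examples=276, per_device_batch=2, grad_accum=1, epochs=18)
print(steps)
```

With 138 batches per epoch over 18 epochs this gives 2484, the same `global_step` the trainer reports when training finishes.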

In [17]:
trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    max_seq_length= 512,
    dataset_text_field="text",
    tokenizer=tokenizer,
    args=training_arguments,
    packing= False,
)

In [18]:
trainer.train()



| Step | Training Loss |
|------|---------------|
| 100  | 0.1435 |
| 200  | 0.1792 |
| 300  | 0.1776 |
| 400  | 0.1537 |
| 500  | 0.1296 |
| 600  | 0.1169 |
| 700  | 0.1117 |
| 800  | 0.0908 |
| 900  | 0.0915 |
| 1000 | 0.0897 |




TrainOutput(global_step=2484, training_loss=0.09323269330361039, metrics={'train_runtime': 2705.9986, 'train_samples_per_second': 1.836, 'train_steps_per_second': 0.918, 'total_flos': 8941399784546304.0, 'train_loss': 0.09323269330361039, 'epoch': 18.0})
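The throughput figures in `TrainOutput` can be cross-checked from the other reported values. This is only a sanity-check sketch; it assumes the 276-example dataset and 18 epochs used above, and the variable names are illustrative.

```python
runtime_s = 2705.9986        # train_runtime reported above
global_steps = 2484          # global_step reported above
samples_seen = 276 * 18      # examples per epoch times epochs (assumed from the run setup)

steps_per_second = global_steps / runtime_s
samples_per_second = samples_seen / runtime_s

print(round(steps_per_second, 3), round(samples_per_second, 3))
print(f"runtime: {runtime_s / 60:.1f} minutes")
```

Both rates round to the reported `train_steps_per_second` and `train_samples_per_second`, and the run lasted about 45 minutes.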

In [19]:
wandb.finish()
model.config.use_cache = True


Run history:

| Metric | Trend |
|--------|-------|
| train/epoch | ▁▁▂▂▂▂▃▃▃▄▄▄▅▅▁▁▂▂▂▂▃▃▃▄▄▄▅▅▅▅▆▆▆▇▇▇▇██ |
| train/global_step | ▁▁▂▂▂▂▃▃▃▄▄▄▅▅▁▁▂▂▂▂▃▃▃▄▄▄▅▅▅▅▆▆▆▇▇▇▇██ |
| train/grad_norm | █▅▆▆▇▇▅▄▆▄▃▃▃▃▃▅▄▆▄▃▂▂▃▃▂▂▂▂▂▁▁▁▂▂▂▂▂ |
| train/learning_rate | █▇▇▆▆▅▄▄▃▃▂▂▁██▇▇▇▆▆▆▆▅▅▅▄▄▄▃▃▃▃▂▂▂▁▁ |
| train/loss | █▅▄▃▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ |

Run summary:

| Metric | Value |
|--------|-------|
| total_flos | 8941399784546304.0 |
| train/epoch | 18.0 |
| train/global_step | 2484.0 |
| train/grad_norm | 0.27 |
| train/learning_rate | 1e-05 |
| train/loss | 0.0594 |
| train_loss | 0.09323 |
| train_runtime | 2705.9986 |
| train_samples_per_second | 1.836 |
| train_steps_per_second | 0.918 |


Log in with a Hugging Face access token that has write permission:

In [20]:
!huggingface-cli login --token $'write key'

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)


The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
Token is valid (permission: write).
Your token has been saved to /root/.cache/huggingface/token
Login successful


In [21]:
# Save the adapter locally, then push it to the Hugging Face Hub
trainer.model.save_pretrained(new_model)
trainer.model.push_to_hub(new_model, use_temp_dir=False)



adapter_model.safetensors:   0%|          | 0.00/314M [00:00<?, ?B/s]

CommitInfo(commit_url='https://huggingface.co/aymanboufarhi/chat-bot-fstt/commit/0aa96fa8b6dc5c1140da8d0cdc9c244c7c6eaca7', commit_message='Upload model', commit_description='', oid='0aa96fa8b6dc5c1140da8d0cdc9c244c7c6eaca7', pr_url=None, pr_revision=None, pr_num=None)

In [None]:
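# Prompt using the <|system|>, <|user|> and <|assistant|> markers found in the training data
prompt = '''<|system|>FSTT c'est la Faculté des Sciences et Techniques de Tanger 
<|user|> Donne le Coordinnateur de Génie Informatique 
<|assistant|>'''

# EOS appending was enabled for training; disable it so the prompt does not end with an end-of-sequence token
tokenizer.add_eos_token = False

# Move the tokenized prompt to the device the model was loaded on
inputs = tokenizer(prompt, return_tensors='pt', padding=True, truncation=True).to(model.device)

# max_length counts the prompt tokens plus the generated tokens
outputs = model.generate(**inputs, max_length=500, num_return_sequences=1)

text = tokenizer.decode(outputs[0], skip_special_tokens=True)

print(text)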
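The decoded text contains the prompt followed by the model's reply, so it can help to build the prompt and strip it back out with small helpers. The sketch below is illustrative only: `build_prompt` and `extract_answer` are hypothetical names, the layout approximates the prompt in the cell above, and it assumes the `<|assistant|>` marker survives decoding as plain text (which should hold when it is not registered as a special token).

```python
SYSTEM, USER, ASSISTANT = "<|system|>", "<|user|>", "<|assistant|>"


def build_prompt(system: str, user: str) -> str:
    """Assemble a prompt that ends with the assistant marker so the model continues from it."""
    return f"{SYSTEM}{system.strip()}\n{USER} {user.strip()}\n{ASSISTANT}"


def extract_answer(generated: str) -> str:
    """Return the text after the last assistant marker, or the whole text if no marker is present."""
    _, found, answer = generated.rpartition(ASSISTANT)
    return answer.strip() if found else generated.strip()


# Hypothetical usage with the objects from the cell above:
# prompt = build_prompt("FSTT c'est la Faculté des Sciences et Techniques de Tanger",
#                       "Donne le Coordinnateur de Génie Informatique")
# print(extract_answer(text))
```

Generation tends to be most reliable when the inference prompt matches the training format exactly, so any separators present in the training examples are worth mirroring in `build_prompt`.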