<a href="https://colab.research.google.com/github/sanchit-gandhi/notebooks/blob/main/gemma-transformers-streamlined.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Gemma in the Hugging Face Ecosystem

<p align="center">
  <img src="https://github.com/sanchit-gandhi/notebooks/blob/main/gemma_pipeline.jpg?raw=true" width="800"/>
</p>

## Set-up Python environment

In [None]:
!pip install --upgrade --quiet transformers datasets accelerate trl peft

## Inference with Transformers

Define quantization config:

In [1]:
import torch
from transformers import BitsAndBytesConfig

# Load the weights in 4-bit precision and run computations in float16
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16,
)
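
To make the motivation for 4-bit loading concrete, here is a back-of-the-envelope sketch of the memory needed for the weights at different precisions. The figure of 2.5 billion parameters is an assumed round number for `google/gemma-2b`, and the estimate ignores quantization constants, activations, and the KV cache, so treat the results as rough lower bounds.

```python
def weight_memory_gib(num_params: int, bits_per_param: int) -> float:
    """Approximate memory needed to hold the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 1024**3


NUM_PARAMS = 2_500_000_000  # assumption: rough parameter count for gemma-2b

for name, bits in [("float32", 32), ("float16", 16), ("4-bit", 4)]:
    print(f"{name:>8}: {weight_memory_gib(NUM_PARAMS, bits):.2f} GiB")
```

Under these assumptions, 4-bit weights take a quarter of the float16 footprint.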


Load the pre-trained model and tokenizer:

In [2]:
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2b", low_cpu_mem_usage=True, quantization_config=quantization_config,
)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b", use_fast=True)


Encode the inputs:

In [3]:
input_ids = tokenizer("Recipe for pasta:", return_tensors="pt").input_ids
input_ids = input_ids.to(model.device)

Auto-regressively generate:

In [4]:
from transformers import set_seed

set_seed(0)
pred_ids = model.generate(input_ids, do_sample=True, temperature=0.6, max_new_tokens=256)
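
As a rough picture of what `do_sample=True` with `temperature=0.6` does at each step, the sketch below divides the logits by the temperature, applies a softmax, and samples the next token from the resulting distribution. It is a toy illustration in plain Python, not the Transformers implementation: `softmax_with_temperature`, `toy_logits`, and `sample_sequence` are hypothetical names, and the "model" is a fixed function standing in for a real forward pass.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities; temperature < 1 sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def toy_logits(token_ids, vocab_size=5):
    """Hypothetical stand-in for a model forward pass: favours (last token + 1)."""
    favoured = (token_ids[-1] + 1) % vocab_size
    return [2.0 if i == favoured else 0.0 for i in range(vocab_size)]


def sample_sequence(prompt_ids, max_new_tokens, temperature, seed=0):
    """Auto-regressive loop: sample one token, append it, and repeat."""
    rng = random.Random(seed)
    token_ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        probs = softmax_with_temperature(toy_logits(token_ids), temperature)
        next_id = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        token_ids.append(next_id)
    return token_ids


print(sample_sequence([0], max_new_tokens=8, temperature=0.6))
```

Lowering the temperature concentrates probability on the highest-scoring token, which is why values below 1 tend to give more conservative text.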

Decode the outputs:

In [5]:
pred_text = tokenizer.batch_decode(pred_ids, skip_special_tokens=True)
print(pred_text[0])

Recipe for pasta:

* 4 1/2 cups of all-purpose flour
* 2 tsp of salt
* 2 tsp of sugar
* 12 oz. of warm water
* 2 tsp of active dry yeast
* 3 tbsp of vegetable oil
* 3/4 tsp of granulated garlic
* 1 tsp of granulated onion
* 1/2 tsp of dried marjoram
* 1/4 tsp of ground black pepper
* 1/4 tsp of cayenne pepper

In a large bowl, sift together the flour, salt, sugar, and garlic, onion, and marjoram. Make a well in the dry ingredients and pour in the yeast. Add the oil and use your hands to mix.

Add the water and knead the dough together into a smooth, elastic, and extensible dough.

Cover the dough and let it rise for about 20 minutes.

Divide the dough into two or three loaves and shape them into your desired shape.

Cover the loaves and let them rise for about 30 minutes.

Preheat the oven to 375°F.

Brush a little oil or melted butter on the baking sheet and place the loaves on the sheet.

Stick a


## Datasets

In [2]:
from datasets import load_dataset

dataset = load_dataset("timdettmers/openassistant-guanaco")


In [3]:
sample = dataset["train"][0]
sample

{'text': '### Human: Can you write a short introduction about the relevance of the term "monopsony" in economics? Please use examples related to potential monopsonies in the labour market and cite relevant research.### Assistant: "Monopsony" refers to a market structure where there is only one buyer for a particular good or service. In economics, this term is particularly relevant in the labor market, where a monopsony employer has significant power over the wages and working conditions of their employees. The presence of a monopsony can result in lower wages and reduced employment opportunities for workers, as the employer has little incentive to increase wages or provide better working conditions.\n\nRecent research has identified potential monopsonies in industries such as retail and fast food, where a few large companies control a significant portion of the market (Bivens & Mishel, 2013). In these industries, workers often face low wages, limited benefits, and reduced bargaining po
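
Each example stores the whole conversation in a single `text` string, with turns marked by `### Human:` and `### Assistant:`. If the individual turns are needed, one possible way to recover them is sketched below. The helper `split_turns` and the shortened `example` string are hypothetical and only mirror the format visible in the sample above.

```python
import re

TURN_PATTERN = re.compile(r"### (Human|Assistant): ")


def split_turns(text):
    """Split a '### Human: ...### Assistant: ...' string into (role, message) pairs."""
    parts = TURN_PATTERN.split(text)
    # With one capture group, parts = [prefix, role_1, message_1, role_2, message_2, ...]
    roles, messages = parts[1::2], parts[2::2]
    return [(role, message.strip()) for role, message in zip(roles, messages)]


# Hypothetical, shortened example in the same format as the dataset sample
example = "### Human: What is a monopsony?### Assistant: A market with a single buyer."
for role, message in split_turns(example):
    print(f"{role}: {message}")
```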

## Training with TRL

In [4]:
from trl import SFTTrainer, ModelConfig, get_peft_config
from transformers import TrainingArguments


Define the training and model arguments:

In [25]:
training_args = TrainingArguments(
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    num_train_epochs=3,
    learning_rate=1e-4,
    logging_steps=10,
    gradient_checkpointing=True,
    output_dir="gemma-2b-fine-tuned",
)
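
To get a feel for how long a run with these arguments is, the sketch below estimates the number of optimizer steps from the batch size and epoch count. It is an approximation rather than the Trainer's exact bookkeeping (rounding can differ when gradient accumulation is used), and `NUM_TRAIN_EXAMPLES` is a hypothetical placeholder, not the real size of the dataset.

```python
import math


def total_optimizer_steps(num_examples, per_device_batch_size, num_epochs,
                          num_devices=1, gradient_accumulation_steps=1):
    """Approximate number of optimizer updates over the whole training run."""
    examples_per_step = per_device_batch_size * num_devices * gradient_accumulation_steps
    steps_per_epoch = math.ceil(num_examples / examples_per_step)
    return steps_per_epoch * num_epochs


NUM_TRAIN_EXAMPLES = 10_000  # hypothetical dataset size, for illustration only

steps = total_optimizer_steps(NUM_TRAIN_EXAMPLES, per_device_batch_size=4, num_epochs=3)
print(f"approx. optimizer steps: {steps}")
print(f"approx. log lines at logging_steps=10: {steps // 10}")
```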

In [5]:
import torch

model_args = ModelConfig(
    torch_dtype=torch.float16,
    use_peft=True,
)

Instantiate supervised fine-tuning (SFT) trainer:

In [27]:
trainer = SFTTrainer(
    model,
    tokenizer=tokenizer,
    args=training_args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    dataset_text_field="text",
    max_seq_length=1024,
    peft_config=get_peft_config(model_args)
)
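
With `use_peft=True`, `get_peft_config` is expected to return a LoRA configuration, so only small low-rank adapter matrices are trained while the quantized base weights stay frozen. The sketch below illustrates why this is cheap, assuming a standard LoRA decomposition in which a `d_out x d_in` weight gets two adapters of shapes `rank x d_in` and `d_out x rank`. The layer dimensions and rank used here are hypothetical and are not read from the Gemma config.

```python
def full_param_count(d_in, d_out):
    """Parameters in a dense weight matrix (bias ignored)."""
    return d_in * d_out


def lora_param_count(d_in, d_out, rank):
    """Trainable parameters added by LoRA: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


# Hypothetical layer size and rank, for illustration only
D_IN, D_OUT, RANK = 2048, 2048, 16

full = full_param_count(D_IN, D_OUT)
lora = lora_param_count(D_IN, D_OUT, RANK)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.2%}")
```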



In [28]:
trainer.train()



| Step | Training Loss |
|------|---------------|
| 10   | 2.3323        |
| 20   | 2.1197        |
| 30   | 2.0482        |
| 40   | 1.9308        |
| 50   | 2.1659        |
| 60   | 1.8028        |
| 70   | 1.9096        |


KeyboardInterrupt: 
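
The logged loss is noisy from one logging step to the next (it rises at step 50 before falling again), so a smoothed view can make the trend easier to read. A minimal sketch using a simple moving average over the values logged above; the window size of 3 is an arbitrary choice:

```python
# Training loss values logged at steps 10 to 70 in the run above
logged_losses = [2.3323, 2.1197, 2.0482, 1.9308, 2.1659, 1.8028, 1.9096]


def moving_average(values, window):
    """Average each run of `window` consecutive values."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


smoothed = moving_average(logged_losses, window=3)
print([round(v, 4) for v in smoothed])
```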