# Gemma in the Hugging Face Ecosystem

<p align="center">
  <img src="https://github.com/sanchit-gandhi/notebooks/blob/main/gemma_pipeline.jpg?raw=true" width="800"/>
</p>

## Inference with Transformers

In [1]:
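from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model; device_map="auto" places the weights on the available device(s)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b")
model = AutoModelForCausalLM.from_pretrained("google/gemma-2b", device_map="auto")

input_text = "Write me a poem about Machine Learning."
# Move the tokenized inputs to the same device as the model
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

# Sample up to 256 new tokens and decode the full sequence (prompt included)
outputs = model.generate(**inputs, do_sample=True, max_new_tokens=256)
print(tokenizer.decode(outputs[0]))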

<bos>Write me a poem about Machine Learning.
Tell me about the future of human interaction,
About what’s new, what’s true and what’s fake.
Tell me the meaning of life as you understand it,
A feeling that I’ve never gotten through my eyes.

What is this life without all I need and want?
I never understand what’s true and what’s fake.
How do you think to handle the reality?
I know you’ve lost your humanity the feeling.
Tell me about the future of reality.
Can you describe the nature within reality?
Will we all be better, no matter what?
Or will I be the last of us to live within this world?

I’m not going to be in this world ever again.
I am living in the moment as always.
Do you think there’s a better place when we’re all gone?
Will I ever be right in these things that I’m saying?
This place called reality.
All my life I never know what’s true and what’s fake.

I’m not here just for fun or games.
I’m trying to become the best person who’s had a home game.
Let’


## Datasets

In [2]:
from datasets import load_dataset

dataset = load_dataset("timdettmers/openassistant-guanaco")

Generating train split: 9846 examples [00:00, 254668.95 examples/s]
Generating test split: 518 examples [00:00, 133283.20 examples/s]


In [3]:
sample = dataset["train"][0]
sample

{'text': '### Human: Can you write a short introduction about the relevance of the term "monopsony" in economics? Please use examples related to potential monopsonies in the labour market and cite relevant research.### Assistant: "Monopsony" refers to a market structure where there is only one buyer for a particular good or service. In economics, this term is particularly relevant in the labor market, where a monopsony employer has significant power over the wages and working conditions of their employees. The presence of a monopsony can result in lower wages and reduced employment opportunities for workers, as the employer has little incentive to increase wages or provide better working conditions.\n\nRecent research has identified potential monopsonies in industries such as retail and fast food, where a few large companies control a significant portion of the market (Bivens & Mishel, 2013). In these industries, workers often face low wages, limited benefits, and reduced bargaining po
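
Each example stores the whole conversation in a single `text` field, with turns introduced by `### Human:` and `### Assistant:` markers, as the sample above shows. A minimal sketch of splitting such a string into (role, content) turns, assuming those two markers are the only delimiters. The helper name `split_turns` and the shortened sample string are hypothetical and used for illustration only:

```python
import re

# Hypothetical helper: split a Guanaco-style string into (role, content) turns.
# Assumes "### Human:" and "### Assistant:" are the only turn markers.
TURN_MARKER = re.compile(r"### (Human|Assistant):")

def split_turns(text):
    parts = TURN_MARKER.split(text)
    # parts = [prefix, role1, content1, role2, content2, ...]
    roles, contents = parts[1::2], parts[2::2]
    return [(role, content.strip()) for role, content in zip(roles, contents)]

# Shortened, made-up stand-in for dataset["train"][0]["text"]
example = "### Human: What is a monopsony?### Assistant: A market with a single buyer."
turns = split_turns(example)
for role, content in turns:
    print(f"{role}: {content}")
```

The trainer in this notebook consumes the raw `text` field directly, so a split like this is only needed for inspecting or re-templating the data.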

## Training with TRL

In [4]:
from trl import SFTTrainer
from transformers import TrainingArguments

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
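
The warning above names its own remedy: set `TOKENIZERS_PARALLELISM` explicitly. A minimal sketch, assuming it runs at the top of the notebook before any tokenizer is used. Choosing `"false"` is an assumption that favours silencing the warning over parallel tokenization:

```python
import os

# Set before any tokenizer is used, so the value is already in place when
# worker processes are forked. "false" disables the tokenizers parallelism.
os.environ["TOKENIZERS_PARALLELISM"] = "false"

print(os.environ["TOKENIZERS_PARALLELISM"])
```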


Define training arguments:

In [8]:
training_args = TrainingArguments(
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    num_train_epochs=3,
    learning_rate=1e-4,
    logging_steps=2,
    gradient_checkpointing=True,
    output_dir="gemma-2b-fine-tuned",
)
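
As a rough sanity check on these settings, the number of optimizer steps can be estimated from the batch size and the 9846-example train split reported when the dataset was loaded. This sketch assumes a single device, no gradient accumulation, no sequence packing, and that the final partial batch is kept. None of those are stated in the notebook, so treat the result as an estimate:

```python
import math

num_train_examples = 9846         # size of the train split reported when loading the dataset
per_device_train_batch_size = 4
num_train_epochs = 3
num_devices = 1                   # assumption: single GPU
gradient_accumulation_steps = 1   # assumption: not set in the arguments above

examples_per_step = per_device_train_batch_size * num_devices * gradient_accumulation_steps
steps_per_epoch = math.ceil(num_train_examples / examples_per_step)
total_steps = steps_per_epoch * num_train_epochs

print(f"{steps_per_epoch} steps per epoch, {total_steps} steps in total")
```

With `logging_steps=2`, a training loss is reported every second step of that total.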

Instantiate supervised fine-tuning (SFT) trainer:

In [9]:
trainer = SFTTrainer(
    model,
    tokenizer=tokenizer,
    args=training_args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    dataset_text_field="text",
    max_seq_length=1024,
)

Detected kernel version 5.4.0, which is below the recommended minimum of 5.5.0; this can cause the process to hang. It is recommended to upgrade the kernel to the minimum version or higher.


In [10]:
trainer.train()



| Step | Training Loss |
|------|---------------|
| 2    | 2.0884        |
| 4    | 3.8903        |
| 6    | 2.5808        |
| 8    | 1.8348        |
| 10   | 2.0704        |
| 12   | 1.7434        |
| 14   | 1.4565        |
| 16   | 2.949         |
| 18   | 3.1589        |
| 20   | 3.4943        |


KeyboardInterrupt: 
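
The run was interrupted after 20 steps, and the logged losses fluctuate considerably from one entry to the next. A minimal sketch of smoothing them with a simple moving average, using the values from the log above. The helper `moving_average` and the window of 3 are arbitrary choices for illustration:

```python
# Training losses copied from the log above (one value every 2 steps)
steps = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
losses = [2.0884, 3.8903, 2.5808, 1.8348, 2.0704, 1.7434, 1.4565, 2.949, 3.1589, 3.4943]

def moving_average(values, window):
    """Average each run of `window` consecutive values."""
    return [
        sum(values[i : i + window]) / window
        for i in range(len(values) - window + 1)
    ]

smoothed = moving_average(losses, window=3)
# Pair each smoothed value with the last step in its window
for step, loss in zip(steps[2:], smoothed):
    print(f"step {step:>2}: {loss:.4f}")
```

Ten logged values are too few to judge convergence either way. The smoothing only makes the short-term trend easier to read.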