# Gemma 2 9B 💎

Gemma 2 is Google's latest iteration of open LLMs. It comes in two sizes, 9 billion and 27 billion parameters, each with a base (pre-trained) and an instruction-tuned version. Gemma builds on the research and technology behind Google DeepMind's Gemini models and has a context length of 8K tokens. The following checkpoints are available on the Hugging Face Hub:

- [gemma-2-9b](https://huggingface.co/google/gemma-2-9b): Base 9B model.
- [gemma-2-9b-it](https://huggingface.co/google/gemma-2-9b-it): Instruction fine-tuned version of the base 9B model.
- [gemma-2-27b](https://huggingface.co/google/gemma-2-27b): Base 27B model.
- [gemma-2-27b-it](https://huggingface.co/google/gemma-2-27b-it): Instruction fine-tuned version of the base 27B model.

The Gemma 2 models were trained on roughly twice as much data as the first iteration: 13 trillion tokens for the 27B version and 8 trillion tokens for the 9B version, drawn from web data (primarily English), code, and math. The exact training mix has not been published, so we can only guess that larger-scale and more careful data curation was a major factor in the improved performance.

Gemma 2 comes with the [same license](https://ai.google.dev/gemma/terms) as the first iteration, which is a permissive license that allows redistribution, fine-tuning, commercial use, and derivative works.

## Set Up the Inference Environment



In [None]:
!pip install -q git+https://github.com/huggingface/transformers.git accelerate



## Initialise the Text Generation pipeline

Before downloading the model, make sure to accept the terms and conditions on the [model page](https://huggingface.co/google/gemma-2-9b-it), then log in with your Hugging Face token.

In [None]:
from huggingface_hub import notebook_login

notebook_login()



In [None]:
from transformers import pipeline
import torch

model_id = "google/gemma-2-9b-it"

pipe = pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)
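
Loading the weights in `torch.bfloat16` roughly halves memory compared with 32-bit floats. The following is a back-of-the-envelope sketch, not a measurement: it assumes a nominal 9 billion parameters (the real count differs slightly) and counts only the weights, ignoring activations and the KV cache. The helper name `weight_memory_gb` is hypothetical.

```python
# Rough weight-memory estimate: parameter count times bytes per parameter.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Return the approximate weight size in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# Assumption: a nominal 9e9 parameters for the 9B model.
for dtype in ("float32", "bfloat16", "int8"):
    print(f"{dtype:>8}: ~{weight_memory_gb(9e9, dtype):.0f} GB")
```

With `device_map="auto"`, the weights are placed on the available devices automatically, so an estimate like this is mainly useful for judging whether the model fits on a given GPU.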




## Define the prompt in the messages API format

You can define almost any query, question, or discussion topic for the LLM to process below, written as a list of messages that each have a `role` and a `content` field.

In [None]:
messages = [
    {"role": "user", "content": "Can we put Ketchup in Italian Pizza."},
]
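
When a list of messages is passed to the pipeline, the tokenizer's chat template turns it into a single prompt string. The sketch below is a hand-written approximation of the Gemma turn format, included only to show what the model roughly sees; the function name `render_gemma_prompt` is hypothetical, and the authoritative template is the one shipped with the tokenizer.

```python
def render_gemma_prompt(messages, add_generation_prompt=True):
    """Approximate Gemma's chat format for a list of role/content dicts."""
    parts = ["<bos>"]
    for message in messages:
        # Gemma labels the assistant turn "model" rather than "assistant".
        role = "model" if message["role"] == "assistant" else message["role"]
        content = message["content"].strip()
        parts.append(f"<start_of_turn>{role}\n{content}<end_of_turn>\n")
    if add_generation_prompt:
        # Open a model turn so that generation continues as the assistant.
        parts.append("<start_of_turn>model\n")
    return "".join(parts)


example = [{"role": "user", "content": "Can we put Ketchup in Italian Pizza."}]
print(render_gemma_prompt(example))
```

In practice there is no need to build this string by hand, since the pipeline applies the template itself when it receives a list of messages.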

## Pass the messages to the pipeline

In [None]:
outputs = pipe(
    messages,
    max_new_tokens=256,
    do_sample=False,
)
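
Setting `do_sample=False` selects greedy decoding: at every step the most likely next token is chosen, so repeated runs give the same answer. The toy sketch below illustrates the difference from sampling using a made-up three-token distribution; `pick_next_token` and `toy_probs` are hypothetical names and do not reflect how the library implements decoding internally.

```python
import random


def pick_next_token(probs, do_sample, rng=None):
    """Pick a token from a {token: probability} mapping."""
    if do_sample:
        # Sampling: draw a token in proportion to its probability.
        rng = rng or random.Random()
        tokens = list(probs)
        weights = [probs[token] for token in tokens]
        return rng.choices(tokens, weights=weights, k=1)[0]
    # Greedy: always take the single most probable token.
    return max(probs, key=probs.get)


toy_probs = {"pizza": 0.6, "pasta": 0.3, "ketchup": 0.1}
print(pick_next_token(toy_probs, do_sample=False))
print(pick_next_token(toy_probs, do_sample=True))
```

The greedy call is deterministic, while the sampled call can return any of the three tokens.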

## Extract and print the response

In [None]:
assistant_response = outputs[0]["generated_text"][-1]["content"]
print(assistant_response)

It's a matter of personal preference! 

**Traditionally, Italian pizza does not use ketchup.**  The classic toppings are usually simple and focus on fresh ingredients like:

* **Tomato sauce:**  The base of most Italian pizzas.
* **Mozzarella cheese:**  The quintessential Italian pizza cheese.
* **Fresh basil:**  Adds a bright, herbaceous flavor.
* **Other toppings:**  Pepperoni, mushrooms, olives, onions, peppers, etc.

**However, there's no rule against adding ketchup to your pizza if you enjoy it!**  

Ultimately, pizza is a customizable dish, and you should enjoy it the way you like it best. 

If you're feeling adventurous, you could try a "Hawaiian" style pizza with ham and pineapple, which is another topping combination that might surprise some traditionalists.
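
The indexing `outputs[0]["generated_text"][-1]["content"]` works because, for chat-style input, the pipeline returns the whole conversation with the assistant's reply appended as the last message. The sketch below uses a hand-built stand-in (`mock_outputs`, a hypothetical name with a shortened reply) to show that shape without loading the model.

```python
# A stand-in with the same shape as the pipeline output for chat input.
mock_outputs = [
    {
        "generated_text": [
            {"role": "user", "content": "Can we put Ketchup in Italian Pizza."},
            {"role": "assistant", "content": "It's a matter of personal preference!"},
        ]
    }
]

# First (and only) result, then the last message in the conversation.
last_message = mock_outputs[0]["generated_text"][-1]
print(last_message["role"])
print(last_message["content"])
```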



In [None]:
messages = [
    {"role": "user", "content": "Can we put Ketchup in Italian Pizza. act as an italian!"},
]
outputs = pipe(
    messages,
    max_new_tokens=256,
    do_sample=False,
)
assistant_response = outputs[0]["generated_text"][-1]["content"]
print(assistant_response)

*Scoffs and throws hands up in the air*

Ketchup on pizza?! *Mamma mia!*  What kind of barbarity is this?!  

Pizza is a sacred dish, a work of art!  It deserves respect, not that... that *red goop*!  

We use fresh tomatoes, basil, oregano, maybe a sprinkle of chili flakes for a little kick.  But ketchup?  *Per favore!*  Save that for your hot dogs! 

Go make a proper pizza, with love and respect for the ingredients.  Then maybe, just maybe, you'll understand the true beauty of Italian pizza.  
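
Since every cell repeats the same three steps (build the messages, call the pipeline, pull out the reply), they can be wrapped in a small helper. This is a minimal sketch: `ask` is a hypothetical name, and `fake_pipe` is a stand-in that mimics the pipeline's output shape so the example runs without the model. In the notebook you would pass the real `pipe` instead.

```python
def ask(pipe, prompt, max_new_tokens=256):
    """Send one user prompt through a chat pipeline and return the reply text."""
    messages = [{"role": "user", "content": prompt}]
    outputs = pipe(messages, max_new_tokens=max_new_tokens, do_sample=False)
    return outputs[0]["generated_text"][-1]["content"]


def fake_pipe(messages, **generation_kwargs):
    """Stand-in for the real pipeline: appends a canned assistant message."""
    reply = {"role": "assistant", "content": "(model reply goes here)"}
    return [{"generated_text": messages + [reply]}]


print(ask(fake_pipe, "Can we put Ketchup in Italian Pizza. act as an italian!"))
```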






In [None]:
messages = [
    {"role": "user", "content": "Say thanks to the audience in Italian for their presence, say it in a Milano accent."},
]
outputs = pipe(
    messages,
    max_new_tokens=256,
    do_sample=False,
)
assistant_response = outputs[0]["generated_text"][-1]["content"]


In [None]:
print(assistant_response)
