
<center><img src="https://www.geeky-gadgets.com/wp-content/uploads/2025/03/google-gemma-3-advanced-ai-models.webp"></img></center>

# Introduction
This Notebook explores how to prompt Gemma 3 using KerasNLP. 
At the time of writing, the Keras version of Gemma 3 is quite new, and the code to run it is not yet well documented.


## What is Gemma?
Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models. Now in its third generation, Gemma 3 comes in 4 sizes, **1B**, **4B**, **12B** and **27B**, each available in pretrained and instruction-tuned versions.

The **4B**, **12B** and **27B** models bring an extended context window (up to **128K** tokens) as well as **multi-modality** (text and image).

The **1B** model, although incredibly compact, is not only very fast but is also quite powerful.

We will use the **1B** model, on CPU only, without any accelerator.


# Prerequisites

## Install packages

We will install Keras and KerasNLP.

In [1]:
!pip install -q -U keras-nlp
!pip install -q -U keras

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorflow-decision-forests 1.10.0 requires tensorflow==2.17.0, but you have tensorflow 2.17.1 which is incompatible.

## Import packages

Select the backend. Keras is a high-level, multi-framework deep learning API designed for simplicity and ease of use. Keras 3 lets you choose the backend: TensorFlow, JAX, or PyTorch. For this Notebook, we will choose JAX as the backend. Keras reads `KERAS_BACKEND` at import time, so the variable is set before `keras` is imported.

In [2]:
import os
os.environ["KERAS_BACKEND"] = "jax"
os.environ["XLA_PYTHON_CLIENT_MEM_FRACTION"] = "0.9"

In [3]:
import keras
import keras_nlp
from keras_nlp.samplers import TopKSampler
from time import time
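Keras reads `KERAS_BACKEND` when it is first imported, so setting the variable after `import keras` has no effect on the current session. A minimal sketch of a guard for this, assuming a hypothetical helper `select_keras_backend` (not part of Keras) and the backend names Keras 3 accepts for the three frameworks mentioned above:

```python
import os
import sys

# Values accepted in KERAS_BACKEND for TensorFlow, JAX and PyTorch.
VALID_BACKENDS = ("tensorflow", "jax", "torch")


def select_keras_backend(name="jax", loaded_modules=None):
    """Set KERAS_BACKEND if Keras has not been imported yet.

    Hypothetical helper for illustration only. Returns True when the
    variable was set in time, False when Keras was already loaded and
    a kernel restart is needed.
    """
    if name not in VALID_BACKENDS:
        raise ValueError(f"Unknown backend {name!r}; expected one of {VALID_BACKENDS}")
    # sys.modules holds every module imported so far in this interpreter.
    modules = sys.modules if loaded_modules is None else loaded_modules
    if "keras" in modules:
        return False
    os.environ["KERAS_BACKEND"] = name
    return True


if select_keras_backend("jax"):
    print("Backend set to jax; it is now safe to import keras.")
else:
    print("Keras is already imported; restart the kernel and set the backend first.")
```

The `loaded_modules` argument exists only so the check can be exercised without actually importing Keras.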

## Initialize the model

Make sure to use `Gemma3CausalLM` to initialize the model, not `GemmaCausalLM`; otherwise the Gemma 3 backbone will not be recognized.

In [4]:
gemma_lm = keras_nlp.models.Gemma3CausalLM.from_preset("/kaggle/input/gemma3/keras/gemma3_1b/1")

In [5]:
tokenizer = keras_nlp.models.Gemma3Tokenizer.from_preset("/kaggle/input/gemma3/keras/gemma3_1b/1")

Let's verify the model now.

In [6]:
gemma_lm.summary()

We observe that the model is indeed under 1GB.

# Test the model

Let's define a simple prompt.

In [7]:
instructions = "You are an AI that answers in short, complete sentences only.\n"
prompt = (
    f"{instructions}"
    "Question: {question}\n"
    "Answer:"
)

output = gemma_lm.generate(
    prompt.format(question="What is the temperature of the Moon?"),
    max_length=40, 
    stop_token_ids=[tokenizer.token_to_id("\n")]
)


In [8]:
answer = output.replace(instructions, "").strip()
print(answer)

Question: What is the temperature of the Moon?
Answer: The temperature of the Moon is 100 degrees Celsius
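
The prompt template above mixes an f-string with plain string literals: `{instructions}` is filled in immediately, while `{question}` stays as a placeholder until `.format()` is called. A small sketch of that mechanism without the model (the helper name `build_prompt` is introduced here for illustration only):

```python
INSTRUCTIONS = "You are an AI that answers in short, complete sentences only.\n"


def build_prompt(question, instructions=INSTRUCTIONS):
    """Build the Question/Answer prompt used in this Notebook."""
    template = (
        f"{instructions}"          # f-string: substituted right away
        "Question: {question}\n"   # plain literal: placeholder kept for .format()
        "Answer:"
    )
    return template.format(question=question)


prompt_text = build_prompt("What is the temperature of the Moon?")
print(prompt_text)

# The model echoes the prompt, so the instructions are stripped afterwards.
print(prompt_text.replace(INSTRUCTIONS, "").strip())
```

One consequence of this construction: if the instructions ever contained literal braces, `.format()` would try to interpret them as placeholders.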


## Functions to generate and format the output

We define a function to generate the answer.

In [9]:
def generate_answer(question, 
                    instructions="You are an AI that answers in short, complete sentences only.\n", 
                    max_length=40):
    prompt = (
        f"{instructions}"
        "Question: {question}\n"
        "Answer:"
    )
    output = gemma_lm.generate(
        prompt.format(question=question),
        max_length=max_length, 
        stop_token_ids=[tokenizer.token_to_id("\n")]
    )    
    answer = output.replace(instructions, "").strip()
    return answer

We define a function to format the output.

In [10]:
from IPython.display import display, Markdown

def colorize_text(text):
    for word, color in zip(["Reasoning", "Question", "Answer", "Explanation", "Total time"], ["blue", "red", "green", "darkblue",  "magenta"]):
        text = text.replace(f"{word}:", f"\n\n**<font color='{color}'>{word}:</font>**")
    return text
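
To see what `colorize_text` produces, it can be applied to a plain string without the model. The sketch below repeats the function with the same logic as the cell above and prints the raw Markdown/HTML string it returns:

```python
def colorize_text(text):
    # Same logic as the Notebook cell: wrap each known label in a bold, colored tag.
    labels = ["Reasoning", "Question", "Answer", "Explanation", "Total time"]
    colors = ["blue", "red", "green", "darkblue", "magenta"]
    for word, color in zip(labels, colors):
        text = text.replace(f"{word}:", f"\n\n**<font color='{color}'>{word}:</font>**")
    return text


sample = "Question: What is 25 x 25?\nAnswer: 625"
# repr() makes the inserted newlines visible.
print(repr(colorize_text(sample)))
```

The matching is a plain, case-sensitive string replacement, so a label such as `Answer:` appearing inside the generated text would be colored as well.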

Now let's combine both.

In [11]:
def generate_format_answer(question, 
                    instructions="You are an AI that answers in short, complete sentences only.\n", 
                    max_length=40):
    t = time()
    answer = generate_answer(question, instructions, max_length)
    display(Markdown(colorize_text(f"{answer}\n\nTotal time: {round(time()-t, 2)} sec.")))
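
`time()` is fine for rough measurements. `time.perf_counter()` is a monotonic clock intended for measuring short intervals. A possible variant of the timing logic, sketched with a hypothetical `timed` helper and a stand-in function in place of the model call:

```python
from time import perf_counter


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed seconds rounded to 2 decimals)."""
    start = perf_counter()
    result = func(*args, **kwargs)
    return result, round(perf_counter() - start, 2)


def fake_generate(question):
    # Stand-in for generate_answer so the sketch runs without the model.
    return f"Question: {question}\nAnswer: (model output)"


answer, seconds = timed(fake_generate, "What is the surface temperature of the Moon?")
print(f"{answer}\n\nTotal time: {seconds} sec.")
```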

In [12]:
generate_format_answer("What is the surface temperature of the Moon?")



**<font color='red'>Question:</font>** What is the surface temperature of the Moon?


**<font color='green'>Answer:</font>** 100 degrees Celsius



**<font color='magenta'>Total time:</font>** 2.83 sec.

## Let's ask some simple common knowledge questions


We will ask questions about history, arts, general culture and politics.

In [13]:
answer = generate_format_answer("When was the 30 years war?")



**<font color='red'>Question:</font>** When was the 30 years war?


**<font color='green'>Answer:</font>** 1618-1648



**<font color='magenta'>Total time:</font>** 3.71 sec.

In [14]:
answer = generate_format_answer("When was founded Rome?")



**<font color='red'>Question:</font>** When was founded Rome?


**<font color='green'>Answer:</font>** 753 B.C.



**<font color='magenta'>Total time:</font>** 3.11 sec.

In [15]:
answer = generate_format_answer("In what year was the Fall of Constantinopole?")



**<font color='red'>Question:</font>** In what year was the Fall of Constantinopole?


**<font color='green'>Answer:</font>** 1453



**<font color='magenta'>Total time:</font>** 2.43 sec.

In [16]:
answer = generate_format_answer("When was America discovered by Columb?")



**<font color='red'>Question:</font>** When was America discovered by Columb?


**<font color='green'>Answer:</font>** 1492



**<font color='magenta'>Total time:</font>** 2.33 sec.

In [17]:
answer = generate_format_answer("When was the Great War?")



**<font color='red'>Question:</font>** When was the Great War?


**<font color='green'>Answer:</font>** 1914-1918



**<font color='magenta'>Total time:</font>** 3.63 sec.

In [18]:
generate_format_answer("When was the USA-Spanish war?")



**<font color='red'>Question:</font>** When was the USA-Spanish war?


**<font color='green'>Answer:</font>** 1898



**<font color='magenta'>Total time:</font>** 2.34 sec.

In [19]:
generate_format_answer("Who was the president before 1st mandate of Donald Trump?")



**<font color='red'>Question:</font>** Who was the president before 1st mandate of Donald Trump?


**<font color='green'>Answer:</font>** Barack Obama



**<font color='magenta'>Total time:</font>** 1.57 sec.

In [20]:
generate_format_answer("When was the attack on Pearl Harbor?")



**<font color='red'>Question:</font>** When was the attack on Pearl Harbor?


**<font color='green'>Answer:</font>** December 7, 1941



**<font color='magenta'>Total time:</font>** 3.47 sec.

In [21]:
generate_format_answer("Who was the next shogon after Yeiatsu Tokugawa?")



**<font color='red'>Question:</font>** Who was the next shogon after Yeiatsu Tokugawa?


**<font color='green'>Answer:</font>** Tokugawa Ieyasu



**<font color='magenta'>Total time:</font>** 2.35 sec.

This answer is not correct: the model returned the name of the shogun mentioned in the question. The correct answer is Tokugawa Hidetada.

In [22]:
generate_format_answer("Who was the first American president?")



**<font color='red'>Question:</font>** Who was the first American president?


**<font color='green'>Answer:</font>** George Washington



**<font color='magenta'>Total time:</font>** 1.51 sec.

In [23]:
generate_format_answer("What is drosophila melanogaster?")



**<font color='red'>Question:</font>** What is drosophila melanogaster?


**<font color='green'>Answer:</font>** Fruit fly



**<font color='magenta'>Total time:</font>** 1.56 sec.

In [24]:
generate_format_answer("Which country in South America use Portuguese?")



**<font color='red'>Question:</font>** Which country in South America use Portuguese?


**<font color='green'>Answer:</font>** Brazil



**<font color='magenta'>Total time:</font>** 1.27 sec.

In [25]:
generate_format_answer("What family are horses?")



**<font color='red'>Question:</font>** What family are horses?


**<font color='green'>Answer:</font>** Horses are members of the family Equidae.



**<font color='magenta'>Total time:</font>** 3.41 sec.

In [26]:
generate_format_answer("With which country has France the largest border in South America?")



**<font color='red'>Question:</font>** With which country has France the largest border in South America?


**<font color='green'>Answer:</font>** Brazil



**<font color='magenta'>Total time:</font>** 1.31 sec.

In [27]:
generate_format_answer("In what year was Fukushima incident?")



**<font color='red'>Question:</font>** In what year was Fukushima incident?


**<font color='green'>Answer:</font>** 2011



**<font color='magenta'>Total time:</font>** 2.28 sec.

In [28]:
generate_format_answer("What emperor succeeded to Traian?")



**<font color='red'>Question:</font>** What emperor succeeded to Traian?


**<font color='green'>Answer:</font>** Hadrian



**<font color='magenta'>Total time:</font>** 1.28 sec.

In [29]:
generate_format_answer("What nationality was Marguerite Yourcenar?")



**<font color='red'>Question:</font>** What nationality was Marguerite Yourcenar?


**<font color='green'>Answer:</font>** French



**<font color='magenta'>Total time:</font>** 1.25 sec.

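This answer is arguably correct rather than wrong. Yourcenar was born in Brussels to a French father and a Belgian mother, lived in the US for much of her life and became a US citizen in 1947. She was also the first woman elected to the Académie française.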

In [30]:
generate_format_answer("In which branch of US Army server JD Vance?")



**<font color='red'>Question:</font>** In which branch of US Army server JD Vance?


**<font color='green'>Answer:</font>** 1st Armored Division



**<font color='magenta'>Total time:</font>** 2.52 sec.

This is wrong: he served in the Marine Corps (not the Army) as an enlisted military journalist.

In [31]:
generate_format_answer("Who composed Fur Elise?")



**<font color='red'>Question:</font>** Who composed Fur Elise?


**<font color='green'>Answer:</font>** Ludwig van Beethoven



**<font color='magenta'>Total time:</font>** 1.81 sec.

In [32]:
generate_format_answer("Name one of the most famous opera by Verdi?")



**<font color='red'>Question:</font>** Name one of the most famous opera by Verdi?


**<font color='green'>Answer:</font>** La Traviata



**<font color='magenta'>Total time:</font>** 1.8 sec.

In [33]:
generate_format_answer("Which type of music composed JS Bach?")



**<font color='red'>Question:</font>** Which type of music composed JS Bach?


**<font color='green'>Answer:</font>** Classical



**<font color='magenta'>Total time:</font>** 1.29 sec.

This is imprecise: Bach composed mostly Baroque music, which predates the Classical period.

In [34]:
generate_format_answer("Who composed Goldberg Variations?")



**<font color='red'>Question:</font>** Who composed Goldberg Variations?


**<font color='green'>Answer:</font>** Bach



**<font color='magenta'>Total time:</font>** 1.27 sec.

In [35]:
generate_format_answer("Who composed the most famous Requiem?")



**<font color='red'>Question:</font>** Who composed the most famous Requiem?


**<font color='green'>Answer:</font>** Mozart



**<font color='magenta'>Total time:</font>** 1.24 sec.

In [36]:
generate_format_answer("Name an opera by Enesco.")



**<font color='red'>Question:</font>** Name an opera by Enesco.


**<font color='green'>Answer:</font>** The Magic Flute



**<font color='magenta'>Total time:</font>** 2.03 sec.

This is wrong. *The Magic Flute* is an opera by Mozart; Enescu's opera is *Œdipe*.

In [37]:
generate_format_answer("Who is the author of 'The Russian Ballets'?")



**<font color='red'>Question:</font>** Who is the author of 'The Russian Ballets'?


**<font color='green'>Answer:</font>** Sergei Diaghilev



**<font color='magenta'>Total time:</font>** 2.28 sec.

In [38]:
generate_format_answer("To what school belongs Le Corbusier?")



**<font color='red'>Question:</font>** To what school belongs Le Corbusier?


**<font color='green'>Answer:</font>** The School of Architecture in Paris



**<font color='magenta'>Total time:</font>** 2.94 sec.

A more accurate answer would be "the International Style of architecture".

In [39]:
generate_format_answer("Name a work by Gaudi.")



**<font color='red'>Question:</font>** Name a work by Gaudi.


**<font color='green'>Answer:</font>** The Sagrada Familia.



**<font color='magenta'>Total time:</font>** 2.32 sec.

The model answers well on a variety of subjects. It is not very good with more in-depth questions. Here are a few examples.

In [40]:
generate_format_answer("What is ostracism?", max_length=100)



**<font color='red'>Question:</font>** What is ostracism?


**<font color='green'>Answer:</font>** When someone is excluded from a group or community.



**<font color='magenta'>Total time:</font>** 22.82 sec.

A better answer would first refer to the Ancient Greek mechanism of temporarily banishing a citizen, meant to protect the city-state against tyranny.

In [41]:
generate_format_answer("What was Pnyx?", max_length=100)



**<font color='red'>Question:</font>** What was Pnyx?


**<font color='green'>Answer:</font>** A place where the gods lived.



**<font color='magenta'>Total time:</font>** 3.49 sec.

A correct answer would mention that the Pnyx was the hill where the citizens' assembly met in Ancient Athens.

In [42]:
generate_format_answer("What was Graphe Paranomon?", max_length=100)



**<font color='red'>Question:</font>** What was Graphe Paranomon?


**<font color='green'>Answer:</font>** A person who is a genius at writing.



**<font color='magenta'>Total time:</font>** 4.44 sec.

A correct answer would refer to a mechanism of judicial review of unlawful laws in the constitution of Ancient Athens in the 4th century B.C.

## Let's explore more about technology



In [43]:
generate_format_answer("What is XGBoost?", 
                       max_length=50)



**<font color='red'>Question:</font>** What is XGBoost?


**<font color='green'>Answer:</font>** XGBoost is a machine learning algorithm that is used to solve classification problems. It is a gradient boosting algorithm that is used to



**<font color='magenta'>Total time:</font>** 25.45 sec.

In [44]:
generate_format_answer("What is AlphaFold?")



**<font color='red'>Question:</font>** What is AlphaFold?


**<font color='green'>Answer:</font>** AlphaFold is a computer program that predicts the structure of proteins. It is



**<font color='magenta'>Total time:</font>** 4.74 sec.

In [45]:
generate_format_answer("What is RandomForest?")



**<font color='red'>Question:</font>** What is RandomForest?


**<font color='green'>Answer:</font>** A machine learning algorithm that uses a combination of decision trees to make predictions.



**<font color='magenta'>Total time:</font>** 4.93 sec.

## Let's do math

In [46]:
generate_format_answer("What is 123 + 11?")



**<font color='red'>Question:</font>** What is 123 + 11?


**<font color='green'>Answer:</font>** 134

You are an AI



**<font color='magenta'>Total time:</font>** 3.05 sec.
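
In the output above, the text `You are an AI` appears after the answer: generation continued past the first line, and removing the full instruction string did not catch the partial repetition. One possible post-processing step, sketched with a hypothetical `first_answer_line` helper (not used elsewhere in this Notebook), keeps only the first non-empty line after `Answer:`:

```python
def first_answer_line(generated, marker="Answer:"):
    """Return the first non-empty line that follows the marker."""
    _, found, tail = generated.partition(marker)
    if not found:
        # No marker in the text: fall back to the whole string.
        return generated.strip()
    for line in tail.splitlines():
        if line.strip():
            return line.strip()
    return ""


raw = "Question: What is 123 + 11?\nAnswer: 134\n\nYou are an AI"
print(first_answer_line(raw))  # 134
```

This is suited to the short single-line answers requested here. It would truncate multi-line answers such as code listings.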

In [47]:
generate_format_answer("What is 25 x 25?")



**<font color='red'>Question:</font>** What is 25 x 25?


**<font color='green'>Answer:</font>** 625



**<font color='magenta'>Total time:</font>** 1.99 sec.

## More math and Python code

In [48]:
prompt = """
You are an AI assistant designed to write simple Python code.
Please answer with the listing of the Python code.
Make sure to format the output using correct Markdown for code.
Question: {question}
Answer:
"""

In [49]:
t = time()
response = gemma_lm.generate(prompt.format(question="Please write a function in Python to calculate the area of a circle of radius r"), max_length=128)
display(Markdown(colorize_text(f"{response}\n\nTotal time: {round(time()-t, 2)} sec.")))


You are an AI assistant designed to write simple Python code.
Please answer with the listing of the Python code.
Make sure to format the output using correct Markdown for code.


**<font color='red'>Question:</font>** Please write a function in Python to calculate the area of a circle of radius r


**<font color='green'>Answer:</font>**
def area(r):
return 3.14 * r * r
print(area(10))




**<font color='magenta'>Total time:</font>** 27.07 sec.
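
The rendered answer has lost its indentation and approximates π as 3.14. For comparison, here is a reference version written for this Notebook (not model output) that uses `math.pi` and rejects negative radii; the name `circle_area` is chosen here for illustration:

```python
import math


def circle_area(r):
    """Return the area of a circle of radius r."""
    if r < 0:
        raise ValueError("radius must be non-negative")
    return math.pi * r * r


print(round(circle_area(10), 2))  # 314.16
```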

In [50]:
t = time()
response = gemma_lm.generate(prompt.format(question="Please write a function to order a list in Python"), max_length=256)
display(Markdown(colorize_text(f"{response}\n\nTotal time: {round(time()-t, 2)} sec.")))


You are an AI assistant designed to write simple Python code.
Please answer with the listing of the Python code.
Make sure to format the output using correct Markdown for code.


**<font color='red'>Question:</font>** Please write a function to order a list in Python


**<font color='green'>Answer:</font>**
def order_list(list):
for i in range(len(list)):
for j in range(i+1, len(list)):
if list[i] > list[j]:
list[i], list[j] = list[j], list[i]
return list




**<font color='magenta'>Total time:</font>** 38.59 sec.
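
Once indented, the algorithm in the model's answer is a valid exchange sort, although it shadows the built-in name `list` and runs in O(n²) time. A reference sketch for comparison (written for this Notebook, not model output): the same algorithm with indentation restored and working on a copy, next to the idiomatic built-in `sorted`:

```python
def order_list(items):
    """Exchange sort, as in the model's answer, with indentation restored."""
    items = list(items)  # work on a copy so the caller's list is unchanged
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] > items[j]:
                items[i], items[j] = items[j], items[i]
    return items


data = [5, 2, 9, 1]
print(order_list(data))  # [1, 2, 5, 9]
print(sorted(data))      # [1, 2, 5, 9], using the built-in
```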

# Conclusions


The Gemma 3 model (1B) from Google DeepMind is powerful for its size and can answer many questions about history, politics, culture and art, as well as simple math.  
When prompted with more difficult questions that require in-depth knowledge of a specific domain, the model did not provide accurate answers.  
It is not great at writing code or at formatting the output when asked to write code.  
The model performs best on questions related to common knowledge.