# Unlock the Power of Luganda with Ganda Gemma 1B!

Dive into the world of Luganda artificial intelligence with this guide to using `CraneAILabs/ganda-gemma-1b`. Ganda Gemma is a fine-tuned version of Google's Gemma 3 1B model, built for English-to-Luganda translation and Luganda conversational AI. This notebook walks you through the model's capabilities and then moves on to hands-on code examples.

### **Model at a Glance**

*   **Model Name:** `CraneAILabs/ganda-gemma-1b`
*   **Base Model:** `google/gemma-3-1b-it`
*   **Parameters:** 1 Billion
*   **Input Languages:** English and Luganda
*   **Output Language:** Luganda
*   **Primary Focus:** English-to-Luganda translation and Luganda conversational AI

### **Capabilities: What Can Ganda Gemma 1B Do?**

Ganda Gemma 1B covers the following tasks:

*   **English-to-Luganda Translation:** Seamlessly translate text from English to Luganda.
*   **Conversational AI:** Engage in natural, human-like conversations in Luganda.
*   **Text Summarization:** Condense longer Luganda texts into concise summaries.
*   **Creative and Informational Writing:** Generate a variety of written content in Luganda.
*   **Question Answering:** Provide answers to general knowledge questions in Luganda.

### **Performance that Speaks for Itself**

Ganda Gemma 1B was evaluated on 1,012 translation samples using BLEU and chrF++. On that set it scores slightly above ChatGPT-4o-latest and well above the two Llama 4 models on both metrics (higher is better):

| Model | BLEU Score | chrF++ |
| :--- | :--- | :--- |
| **Ganda Gemma 1B** | **6.98** | **40.63** |
| ChatGPT-4o-latest | 6.46 | 39.98 |
| Llama-4-Maverick | 4.29 | 33.52 |
| Llama-4-Scout | 3.28 | 27.33 |
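
To give a feel for what a character n-gram metric measures, here is a minimal pure-Python sketch of a chrF-style score. It is a simplification written for this notebook, not the evaluation code behind the table: the function names are hypothetical, the example pair is made up for illustration, and real chrF++ additionally mixes in word n-grams, which this sketch omits.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams, with spaces dropped for simplicity."""
    chars = text.replace(" ", "")
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis: str, reference: str, max_n: int = 6, beta: float = 2.0) -> float:
    """Simplified character n-gram F-score on a 0-100 scale."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # one of the texts is too short for this n-gram order
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    # F-beta with beta=2 weights recall more heavily than precision
    return 100 * (1 + beta**2) * p * r / (beta**2 * p + r)


# Made-up pair: the hypothesis differs from the reference by one character
print(simple_chrf("Mwasuze mutia", "Mwasuze mutya"))
```

Because the score is built from overlapping character sequences, a near-miss spelling still earns partial credit, which is one reason character-level metrics are often reported alongside BLEU for morphologically rich languages.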

---



## Let's Get Coding!

### **Setup: Installing the Essentials**

First, let's make sure you have the necessary libraries installed. We'll need `transformers` for interacting with the model and `torch` for the underlying computations.


In [None]:
!pip install transformers
!pip install torch
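
After installing, it can help to confirm which versions ended up in the environment, since Gemma 3 models need a reasonably recent `transformers` release (check the model card for the exact requirement). The sketch below uses only the standard library; `installed_version` is a helper name made up for this notebook.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the libraries this notebook relies on
for name in ("transformers", "torch", "huggingface_hub"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```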

### **Use Case 1: English-to-Luganda Translation**

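Translating from English to Luganda is a core feature of Ganda Gemma 1B. The first cell below logs you in to the Hugging Face Hub, in case the model repository requires authentication; the cells after it load the model and run a translation: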


In [None]:
from huggingface_hub import notebook_login
notebook_login()

In [None]:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer
model_name = "CraneAILabs/ganda-gemma-1b"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

In [None]:
# Create the chat-formatted prompt
messages = [
    {"role": "user", "content": "Translate to Luganda: Good morning, how did you sleep?"}
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
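# The chat template already inserts the BOS token, so don't let the tokenizer add a second one
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)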

# Generate new tokens
with torch.no_grad():
    outputs = model.generate(
        **inputs,  # pass input_ids together with the attention mask
        max_new_tokens=128,
        temperature=0.3,
        top_p=0.95,
        top_k=64,
        repetition_penalty=1.1,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )

# generate() returns the prompt tokens followed by the newly generated tokens.
# Splitting the decoded text on '<start_of_turn>model\n' is unreliable here, because
# skip_special_tokens=True typically strips the turn markers before we can split on them.
# Instead, slice off the prompt tokens and decode only the model's reply.
prompt_length = inputs["input_ids"].shape[-1]
model_response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True).strip()

print(f"Model's Response: {model_response}")
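
To make the chat template less of a black box, the sketch below builds the same kind of prompt string by hand. It approximates the Gemma turn layout (`<start_of_turn>`, `<end_of_turn>`, and the `model` role name); the tokenizer's own template remains the authoritative version, and `gemma_prompt` is a hypothetical helper written only for illustration.

```python
def gemma_prompt(messages: list[dict]) -> str:
    """Approximate the Gemma chat layout for a list of {"role", "content"} messages."""
    parts = ["<bos>"]
    for message in messages:
        # Inside the prompt, Gemma calls the assistant role "model"
        role = "model" if message["role"] == "assistant" else message["role"]
        parts.append(f"<start_of_turn>{role}\n{message['content'].strip()}<end_of_turn>\n")
    # Generation prompt: the model continues writing from here
    parts.append("<start_of_turn>model\n")
    return "".join(parts)


example = gemma_prompt(
    [{"role": "user", "content": "Translate to Luganda: Good morning, how did you sleep?"}]
)
print(example)
```

Printing `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` next to this string is a quick way to check the real format. Note that the text already starts with a BOS marker, which is why tokenizing it again with default settings can produce a duplicate.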

In [None]:
from transformers import pipeline, AutoTokenizer
import torch

model_name = "CraneAILabs/ganda-gemma-1b"

# Load the tokenizer separately to build the prompt
tokenizer = AutoTokenizer.from_pretrained(model_name)

# --- Most Robust Pipeline for CPU ---
generator = pipeline(
    "text-generation",
    model=model_name,
    tokenizer=tokenizer,
    torch_dtype="auto",
    device=-1
)

# 1. Apply the chat template just like before
messages = [
    {"role": "user", "content": "Translate to Luganda: Welcome to our school"}
]
prompt_from_template = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

# 2. Pass the formatted prompt to the generator
result = generator(
    prompt_from_template,
    max_new_tokens=100,
    temperature=0.3,
    do_sample=True,
    return_full_text=False
)

print("\n--- Pipeline Translation ---")
print(f"Generated Text: '{result[0]['generated_text']}'")

### **Use Case 2: Luganda Conversational AI**

Engage directly with the model in Luganda for a conversational experience.


In [None]:
# --- Direct Luganda Conversation ---
prompt = "Mwasuze mutya! Nnyinza ntya okukuyamba leero?" # (Good morning! How can I help you today?)
inputs = tokenizer(prompt, return_tensors="pt")

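# temperature only takes effect when sampling is enabled, so set do_sample=True;
# max_new_tokens limits the reply itself rather than prompt plus reply
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True, temperature=0.3)
# Note: the decoded text contains the prompt followed by the model's continuation
response = tokenizer.decode(outputs[0], skip_special_tokens=True)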

print(f"User Prompt: {prompt}")
print(f"Ganda Gemma's Response: {response}")

### **Use Case 3: Interactive Chat**

The next two cells wrap the model in a `text-generation` pipeline and run a simple chat loop that reads your input until you type `quit` or `exit`.


In [None]:
from transformers import pipeline, AutoTokenizer
import torch

# Define the model name
model_name = "CraneAILabs/ganda-gemma-1b"

# Load the tokenizer separately. This is crucial for using the chat template.
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Create the text-generation pipeline on CPU
# torch_dtype="auto" loads the weights in the dtype stored in the model checkpoint; it does not pick one based on your hardware
chat_pipeline = pipeline(
    "text-generation",
    model=model_name,
    tokenizer=tokenizer,
    torch_dtype="auto",
    device=-1  # Use -1 to explicitly set to CPU
)

print("Chat pipeline is ready. You can now start the conversation.")

In [None]:
print("--- Ganda Gemma Interactive Chat ---")
print("Model is ready. Type your message or a translation request.")
print("Type 'quit' or 'exit' to end the chat.\n")

while True:
    # 1. Get input from the user
    user_input = input("You: ")

    # 2. Check for an exit command
    if user_input.lower() in ["quit", "exit"]:
        print("Goodbye! The chat session has ended.")
        break

    # 3. Format the input using the model's official chat template
    # This is the most reliable way to get good responses from Gemma models.
    messages = [
        {"role": "user", "content": user_input}
    ]
    prompt = chat_pipeline.tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )

    # 4. Generate a response using the pipeline
    # We use 'max_new_tokens' and 'return_full_text=False' to get a clean response.
    response = chat_pipeline(
        prompt,
        max_new_tokens=150,
        do_sample=True,
        temperature=0.7, # A slightly higher temperature can make chat more interesting
        top_k=50,
        top_p=0.95,
        return_full_text=False
    )

    # 5. Print the model's generated response
    print(f"Ganda Gemma: {response[0]['generated_text'].strip()}")
    print("-" * 20) # Separator for clarity
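
The loop above sends each message on its own, so the model does not see earlier turns. One possible way to carry context forward is to keep a running message list and pass the whole list to `apply_chat_template`. The helper below is a sketch under that assumption: `add_turn` and `max_messages` are names invented here, and the trimming rule (start on a user message) reflects the alternating user/assistant order that chat templates generally expect.

```python
def add_turn(history: list[dict], role: str, content: str, max_messages: int = 8) -> list[dict]:
    """Append a message and keep only the most recent `max_messages` entries."""
    history = history + [{"role": role, "content": content}]
    trimmed = history[-max_messages:]
    # Drop leading assistant messages so the conversation starts with the user
    while trimmed and trimmed[0]["role"] != "user":
        trimmed = trimmed[1:]
    return trimmed


history = []
history = add_turn(history, "user", "Mwasuze mutya!")
history = add_turn(history, "assistant", "<model reply goes here>")  # placeholder, not real output
history = add_turn(history, "user", "Translate to Luganda: Welcome to our school")

for message in history:
    print(f"{message['role']}: {message['content']}")
```

Inside the chat loop, the idea would be to call `add_turn` once with the user's input before generating and once with the model's reply afterwards, then build the prompt from `history` instead of a single-message list. Capping the history keeps the prompt within the model's context length.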

***

### **Use Case 4: Text Summarization in Luganda**

Beyond translation, Ganda Gemma can understand and process Luganda text to extract the most important information. This is incredibly useful for condensing long articles, reports, or paragraphs into a short, easy-to-read summary.

We'll use prompt engineering to instruct the model to act as a summarizer.


In [None]:
# --- The Text to Summarize ---
long_luganda_text = """
Ebyenjigiriza bya muwendo nnyo mu bulamu bwa buli muntu ne mu nkulaakulana y'eggwanga. Biyamba abantu okufuna amagezi, obukugu, n'empisa ennungi ebiyamba okweyimirizaawo. Okuva ku baana abato okutuuka ku bantu abakulu, okusoma kubayamba okumanya ebigenda mu maaso mu nsi, okutegeera eddembe lyabwe, n'okuba ab'omugaso mu kitundu. Eggwanga eririna abantu abasomye bulungi lisobola okukulaakulana mangu mu by'enfuna, eby'obufuzi, n'embeera z'abantu.
"""

# --- The Instruction (Prompt) ---
# We tell the model exactly what to do with the text.
instruction = f"Summarize the following Luganda text into one concise sentence: \n\n{long_luganda_text}"

# Format using the chat template for best results
messages = [{"role": "user", "content": instruction}]
prompt = chat_pipeline.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# Generate the summary
# We use a low temperature for fact-based, non-creative output.
summary_result = chat_pipeline(
    prompt,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.2, # Low temperature for factual summary
    return_full_text=False
)

print("--- Luganda Text Summarization ---")
print(f"Original Text:\n{long_luganda_text.strip()}")
print("\n" + "="*25 + "\n")
print(f"Generated Summary:\n{summary_result[0]['generated_text'].strip()}")


### **Use Case 5: Creative and Informational Writing**

Need to write a short story, a poem, or an informational paragraph in Luganda? Ganda Gemma can be your creative partner. By providing a topic or a starting sentence, you can generate a variety of written content.

For creative tasks, we'll increase the `temperature` parameter slightly to encourage the model to generate more diverse and interesting text.
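
To see why temperature changes how adventurous the output is, here is a small standard-library sketch that applies temperature scaling to a set of toy logits. The numbers are invented for illustration and do not come from Ganda Gemma.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing them by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


toy_logits = [2.0, 1.0, 0.5, 0.1]  # made-up scores for four candidate tokens
for temperature in (0.2, 0.8):
    probs = softmax_with_temperature(toy_logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

At the lower temperature almost all of the probability sits on the top-scoring token, while the higher temperature spreads it across the alternatives, which is what makes sampled text more varied.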


In [None]:
# --- Creative Writing: A Short Story ---
story_idea = "Write a short story in Luganda about a clever rabbit (akamyu) and a friendly elephant (enjovu)."

messages_story = [{"role": "user", "content": story_idea}]
prompt_story = chat_pipeline.tokenizer.apply_chat_template(
    messages_story, tokenize=False, add_generation_prompt=True
)

# Use a higher temperature for more creative and unpredictable results
creative_result = chat_pipeline(
    prompt_story,
    max_new_tokens=250, # Allow for a longer story
    do_sample=True,
    temperature=0.8, # Higher temperature for creativity
    top_p=0.95,
    return_full_text=False
)

print("--- Creative Writing Example ---")
print(f"Prompt: {story_idea}")
print("\n" + "="*25 + "\n")
print(f"Generated Story:\n{creative_result[0]['generated_text'].strip()}")

In [None]:
# --- Informational Writing: Factual Content ---
info_idea = "Write a short paragraph in Luganda explaining the importance of boiling drinking water."

messages_info = [{"role": "user", "content": info_idea}]
prompt_info = chat_pipeline.tokenizer.apply_chat_template(
    messages_info, tokenize=False, add_generation_prompt=True
)

# Use a lower temperature for factual, informational text
info_result = chat_pipeline(
    prompt_info,
    max_new_tokens=200,
    do_sample=True,
    temperature=0.3, # Lower temperature for factual content
    return_full_text=False
)

print("\n\n--- Informational Writing Example ---")
print(f"Prompt: {info_idea}")
print("\n" + "="*25 + "\n")
print(f"Generated Paragraph:\n{info_result[0]['generated_text'].strip()}")

### **Use Case 6: General Knowledge Question Answering**

You can ask Ganda Gemma general knowledge questions directly in Luganda. The model answers from what it learned during training; no reference text is supplied.

This is zero-shot prompting: the question is asked without any worked examples. For factual questions, keep the `temperature` low so the model sticks to its most likely answer. This limits random drift, but it does not guarantee a correct answer, and the model can still make things up (hallucinate).


In [None]:
# --- Ask a Question in Luganda ---
question = "Kibuga ki ekikulu ekya Uganda?" # "What is the capital city of Uganda?"

# Format the question using the chat template.
messages_qa = [{"role": "user", "content": question}]
prompt_qa = chat_pipeline.tokenizer.apply_chat_template(
    messages_qa, tokenize=False, add_generation_prompt=True
)

# Generate the answer with a low temperature for accuracy.
answer_result = chat_pipeline(
    prompt_qa,
    max_new_tokens=50,
    do_sample=True,
    temperature=0.1, # Very low temperature for factual answers
    return_full_text=False
)

print("--- Question Answering Example ---")
print(f"Question: {question}")
print("\n" + "="*25 + "\n")
print(f"Generated Answer: {answer_result[0]['generated_text'].strip()}")


In [None]:
# --- Another Example ---
question_2 = "Lwaki ebirime byetaaga amazzi okukula?" # "Why do crops need water to grow?"

messages_qa_2 = [{"role": "user", "content": question_2}]
prompt_qa_2 = chat_pipeline.tokenizer.apply_chat_template(
    messages_qa_2, tokenize=False, add_generation_prompt=True
)

answer_result_2 = chat_pipeline(
    prompt_qa_2,
    max_new_tokens=150,
    do_sample=True,
    temperature=0.2,
    return_full_text=False
)

print("\n\n--- Another Question Answering Example ---")
print(f"Question: {question_2}")
print("\n" + "="*25 + "\n")
print(f"Generated Answer: {answer_result_2[0]['generated_text'].strip()}")

### **Optimal Generation Parameters**

To get the best results from Ganda Gemma 1B, it's recommended to use the following generation parameters:

*   **`temperature`: 0.3** - For focused and coherent responses.
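*   **`top_p`: 0.95** - Nucleus sampling: sample only from the smallest set of tokens whose cumulative probability reaches 0.95.
*   **`top_k`: 64** - Consider only the 64 most likely tokens at each step.
*   **`max_length`: 128** - Limits the total sequence length, prompt included. The translation and pipeline examples in this notebook use `max_new_tokens` instead, which counts only the generated tokens.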
*   **`repetition_penalty`: 1.1** - Helps in reducing word repetition.
*   **`do_sample`: True** - Enables sampling for more dynamic responses.
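
The `top_k` and `top_p` settings in the list above both narrow down which tokens can be sampled. The sketch below shows the idea on a toy distribution, using placeholder token names and made-up probabilities. It is a simplified illustration, not the `transformers` implementation, and `filter_top_k_top_p` is a hypothetical helper.

```python
def renormalize(items):
    """Rescale (token, probability) pairs so the probabilities sum to 1."""
    total = sum(p for _, p in items)
    return [(token, p / total) for token, p in items]


def filter_top_k_top_p(probs: dict, top_k: int, top_p: float) -> dict:
    """Keep the top_k most likely tokens, then the smallest prefix whose mass reaches top_p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    ranked = renormalize(ranked[:top_k])  # top-k step
    kept, mass = [], 0.0
    for token, p in ranked:  # top-p (nucleus) step
        kept.append((token, p))
        mass += p
        if mass >= top_p:
            break
    return dict(renormalize(kept))


toy_probs = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}  # invented distribution
print(filter_top_k_top_p(toy_probs, top_k=3, top_p=0.8))
```

With these toy numbers, `top_k=3` removes the least likely token and `top_p=0.8` then trims the remaining tail, so sampling happens over a smaller, renormalized set of candidates.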


### **Important Limitations to Consider**

While Ganda Gemma 1B is a powerful tool, it's important to be aware of its limitations:

*   **Luganda-Only Output:** The model is designed to respond exclusively in Luganda.
*   **General Knowledge Base:** It has not been trained on specific factual datasets, so its knowledge is general.
*   **No Coding or Math:** The model is not designed for programming or mathematical tasks.
*   **Context Length:** For optimal performance, the context length is limited to 4,096 tokens.
*   **Domain-Specific Fine-Tuning:** For specialized domains, further fine-tuning may be required.
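
The context length limit is easiest to handle as a simple token budget: whatever you reserve for the reply is no longer available for the prompt. The sketch below works through that arithmetic using the 4,096-token figure from the list above. The helper names are hypothetical, and the prompt token count would come from the tokenizer, for example `len(tokenizer(prompt)["input_ids"])`.

```python
def max_prompt_tokens(context_limit: int = 4096, max_new_tokens: int = 128) -> int:
    """Tokens left for the prompt once room for the reply is reserved."""
    if max_new_tokens >= context_limit:
        raise ValueError("max_new_tokens must be smaller than the context limit")
    return context_limit - max_new_tokens


def fits_in_context(prompt_token_count: int, context_limit: int = 4096, max_new_tokens: int = 128) -> bool:
    """Check whether a prompt of the given length leaves room for the reply."""
    return prompt_token_count <= max_prompt_tokens(context_limit, max_new_tokens)


print(max_prompt_tokens())      # budget for the prompt with the default settings
print(fits_in_context(3000))    # a 3,000-token prompt with a 128-token reply
```

A check like this is most useful before summarizing long texts, where the input itself can take up most of the budget.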



### **License Information**

The Ganda Gemma 1B model is released under the Gemma Terms of Use. Please review the terms before using the model.

We hope this notebook serves as a useful starting point for your journey with Luganda AI. Happy coding!