> <p><small><small>This Notebook is made available subject to the licence and terms set out in the <a href = "http://www.github.com/google-deepmind/ai-foundations">AI Research Foundations Github README file</a>.

<img src="https://storage.googleapis.com/dm-educational/assets/ai_foundations/GDM-Labs-banner-image-C5-white-bg.png">

# Lab: Prompt Your SLM with Questions

<a href='https://colab.research.google.com/github/google-deepmind/ai-foundations/blob/master/course_5/gdm_lab_5_1_prompt_your_slm_with_questions.ipynb' target='_parent'><img src='https://colab.research.google.com/assets/colab-badge.svg' alt='Open In Colab'/></a>

15 minutes

Investigate how your small language model responds to prompts in different formats.

## Overview

In the previous activity, you made predictions about how your pre-trained model would behave. In this lab, you will put those predictions to the test. You will load the transformer model trained on the Africa Galore dataset and observe its performance when given different types of prompts, from simple sentence continuations to direct questions.



### What you will learn:

By the end of this lab, you will be able to:

* Compare the quality of the continuations that a model trained only on next-token prediction generates for prompts formatted as statements and for prompts formatted as questions.



### Tasks

In this short lab, you will put your pre-trained transformer to the test to explore its capabilities and limitations.

**In this lab, you will**:

* Load your pre-trained transformer model.

* Prompt the model with a series of different inputs, including both statements and questions.

* Analyze the generated outputs to see the model's behavior on different types of prompts.


If you are able to, we **highly recommend running the code in this lab on a Colab instance with a GPU**. See the section "How to use Google Colaboratory (Colab)" below for instructions on how to do this.



## How to use Google Colaboratory (Colab)

Google Colaboratory (also known as Google Colab) is a platform that allows you to run Python code in your browser. The code is written in cells that are executed on a remote server.

To run a cell, hover over a cell, and click the `run` button to its left. The run button is the circle with the triangle (▶). Alternatively, you can also click a cell and use the keyboard combination Ctrl+Return (or ⌘+Return if you are using a Mac).

To try this out, run the following cell. This should print today's day of the week below it.

In [None]:
from datetime import datetime
print(f"Today is {datetime.today():%A}.")

Note that the order in which you run the cells matters. When you are working through a lab, make sure to always run all cells in order. Otherwise, the code might not work. If you take a break while working on a lab, Colab may disconnect you; in that case, you have to execute all cells again before continuing your work. To make this easier, you can select the cell you are currently working on and then choose __Runtime → Run before__ from the menu above (or use the keyboard combination Ctrl/⌘ + F8). This will re-execute all cells before the current one.

### Using Colab with a GPU

Follow these steps to run the activities in this lab on a GPU:

1.  In the top menu bar, click **Runtime**.
2.  Select **Change runtime type** from the dropdown menu.
3.  In the pop-up window under **Hardware Accelerator**, select **GPU** (usually listed as `T4 GPU`).
4.  Click **Save**.

Your Colab session will now restart with GPU access.

Note that access to GPUs is limited, and at times you may not be able to run this lab on a GPU. All activities will still work, but they will run more slowly and you will have to wait longer for some of the cells to finish running.
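
To confirm that the runtime switch took effect, you can run a quick check in a code cell. The following is a minimal sketch and not part of the lab's own code. It assumes that the NVIDIA command-line tool `nvidia-smi` is on the `PATH` only when a GPU is attached, which is typical for Colab but is not guaranteed elsewhere. The helper name `gpu_tool_available` is hypothetical.

```python
import shutil


def gpu_tool_available() -> bool:
    """Return True if the `nvidia-smi` executable can be found on PATH."""
    # Assumption: the presence of the tool is used as a proxy for an
    # attached GPU. This does not query the GPU itself.
    return shutil.which("nvidia-smi") is not None


if gpu_tool_available():
    print("nvidia-smi found: a GPU runtime appears to be active.")
else:
    print("nvidia-smi not found: the lab will likely run on the CPU.")
```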


## Imports



In this lab, you will use the transformer that you trained in the previous courses. You will also use Keras for defining the model and loading the model parameters.

Run the following cell to import these packages.

In [None]:
%%capture
# Install the custom package for this course.
!pip install "git+https://github.com/google-deepmind/ai-foundations.git@main"

# Packages used.
import os # For setting Keras configuration variables.

# The following line provides configuration for Keras.
os.environ["KERAS_BACKEND"] = "jax"

import keras # For defining and loading the model.
import re # For splitting text on whitespace.
from urllib import request # For downloading model weights.

# For displaying better formatted error messages.
from IPython.display import display, HTML

from ai_foundations import training # For defining your model.
from ai_foundations import generation # For prompting your model.
# For loading the tokenizer.
from ai_foundations.tokenization import BPEWordTokenizer

## Load the tokenizer and the model

In this lab, you will work with a tokenizer and a model that have already been trained. Instead of training these components from scratch, the following cell loads the merge rules for a BPE tokenizer and the parameters of a transformer model. The model and the tokenizer have both been trained on the Africa Galore dataset.

In [None]:
url = "https://storage.googleapis.com/dm-educational/assets/ai_foundations/africa_galore_qa_tokenizer_3000.pkl"

tokenizer = BPEWordTokenizer.from_url(url)

model_url = "https://storage.googleapis.com/dm-educational/assets/ai_foundations/africa_galore_qa_1000ep.weights.h5"
model_filename = "africa_galore_qa_1000ep.weights.h5"
request.urlretrieve(model_url, model_filename)

# Define the model.
model = training.create_model(
    max_length=399,
    vocabulary_size=tokenizer.vocabulary_size,
)

# Load the model parameters from a file.
model.load_weights(model_filename)

print(
    f"Initialized and loaded transformer model with {model.count_params():,}"
    f" parameters."
)
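
To give a sense of what "merge rules" are, here is a minimal sketch of how byte-pair encoding (BPE) applies a learned list of merges to a single word. It is a simplified illustration and not the implementation of `BPEWordTokenizer`. The function `apply_merges` and the three toy merge rules are hypothetical and were not learned from the Africa Galore dataset.

```python
def apply_merges(word, merges):
    """Split `word` into characters, then apply each merge rule in order."""
    symbols = list(word)  # Start from individual characters.
    for left, right in merges:
        merged = []
        i = 0
        while i < len(symbols):
            # Fuse the pair (left, right) wherever it occurs side by side.
            if (
                i + 1 < len(symbols)
                and symbols[i] == left
                and symbols[i + 1] == right
            ):
                merged.append(left + right)
                i += 2
            else:
                merged.append(symbols[i])
                i += 1
        symbols = merged
    return symbols


# Hypothetical merge rules, listed in the order they would have been learned.
toy_merges = [("o", "l"), ("J", "ol"), ("o", "f")]
print(apply_merges("Jollof", toy_merges))  # ['Jol', 'l', 'of']
```

Because the rules are applied in a fixed order, the same word always maps to the same sequence of subword tokens.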

## Activity 1: Prompt the model

Now that you have loaded the pre-trained model, you can test its behavior.

------
> 💻 **Your task:**
>
> 1. Use the text field below to prompt the model with each of the following inputs. Run the cell after each prompt to observe the generated text.
>
>    * Jollof rice is
>    * What is Jollof rice?
>    * Tagine is
>    * What is Tagine?
>
> 2. After you have tested all four prompts, reflect on the outputs.
>    * How did the model perform on the statements (e.g., "Jollof rice is")?
>    * How did its performance compare on the questions?
>    * Did the model perform as you expected?
>
------
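
If you would like to keep the four outputs side by side for the reflection in step 2, a small helper along the following lines may be useful. This is a sketch under stated assumptions: `compare_prompts` and `fake_generate` are hypothetical names, and `fake_generate` is a stand-in so that the example runs without the model.

```python
def compare_prompts(prompts, generate_fn):
    """Return one (prompt, kind, continuation) row per prompt."""
    rows = []
    for prompt in prompts:
        # Label the prompt by its final character.
        kind = "question" if prompt.strip().endswith("?") else "statement"
        rows.append((prompt, kind, generate_fn(prompt)))
    return rows


def fake_generate(prompt):
    # Hypothetical stand-in for the model: it only appends an ellipsis.
    return prompt + " ..."


prompts = [
    "Jollof rice is",
    "What is Jollof rice?",
    "Tagine is",
    "What is Tagine?",
]
for prompt, kind, text in compare_prompts(prompts, fake_generate):
    print(f"[{kind:9}] {prompt!r} -> {text!r}")
```

In the notebook, `generate_fn` could instead wrap the `generation.generate_text` call from the prompt cell in this activity, with the same keyword arguments, and return the first element of the pair that the call produces.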

In [None]:
# @title Prompt your small language model
prompt = "Jollof rice is" #@param {type: "string"}
num_tokens_to_generate = 10 #@param {type: "number"}
generated_text, probs = generation.generate_text(
    prompt,
    num_tokens_to_generate,
    model=model,
    tokenizer=tokenizer,
    pad_token_id=tokenizer.pad_token_id,
    sampling_mode="greedy" # To generate the tokens with highest probability.
)

print(f"Generated text: {generated_text}")
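
The `sampling_mode="greedy"` setting in the cell above selects, at each step, the token with the highest probability. The loop behind that idea can be sketched as follows. This is an illustration only: a hypothetical lookup table (`toy_table`) stands in for the transformer, and `greedy_generate` is not a function from `ai_foundations`.

```python
def greedy_generate(prompt_tokens, next_token_probs, num_tokens):
    """Repeatedly append the most probable next token."""
    tokens = list(prompt_tokens)
    for _ in range(num_tokens):
        probs = next_token_probs(tokens)
        if not probs:  # The toy model has no prediction for this context.
            break
        # Greedy choice: take the token with the highest probability.
        tokens.append(max(probs, key=probs.get))
    return tokens


# Hypothetical next-token probabilities, keyed on the last token only.
toy_table = {
    "is": {"a": 0.6, "the": 0.4},
    "a": {"dish": 0.7, "rice": 0.3},
    "dish": {"from": 0.9, ".": 0.1},
}


def toy_next_token_probs(tokens):
    return toy_table.get(tokens[-1], {})


print(greedy_generate(["Jollof", "rice", "is"], toy_next_token_probs, 3))
# ['Jollof', 'rice', 'is', 'a', 'dish', 'from']
```

Greedy decoding is deterministic, so running the prompt cell twice with the same prompt should produce the same continuation.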

## Summary

In this lab, you prompted a **base model** that was trained on the next-token prediction task. You observed directly that while the model excels at its trained task of completing statements, it fails to reliably answer direct questions. This highlights the crucial difference between a general text completer and an instruction-tuned model.