# Setup

# Gemma setup
To complete this tutorial, you'll first need to complete the setup instructions at [Gemma setup](https://colab.research.google.com/corgiredirector?site=https%3A%2F%2Fai.google.dev%2Fgemma%2Fdocs%2Fsetup).
The Gemma setup instructions show you how to do the following:

- Get access to Gemma on kaggle.com.
- Select a Colab runtime with sufficient resources to run the Gemma 2B model.
- Generate and configure a Kaggle username and API key.

After you've completed the Gemma setup, move on to the next section, where you'll set environment variables for your Colab environment.

In [2]:
import os
import pandas as pd

# Set environment variables
Set environment variables for KAGGLE_USERNAME and KAGGLE_KEY.

In [3]:
from google.colab import userdata

# Note: `userdata.get` is a Colab API. If you're not using Colab, set the env
# vars as appropriate for your system.

os.environ["KAGGLE_USERNAME"] = userdata.get('KAGGLE_USERNAME')
os.environ["KAGGLE_KEY"] = userdata.get('KAGGLE_KEY')
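
The comment in the cell above notes that `userdata.get` is specific to Colab. Outside Colab, one option is to read the credentials from environment variables that are already set, and otherwise from the `kaggle.json` token file that Kaggle lets you download. The sketch below is a minimal example of that idea; the helper name `load_kaggle_credentials` is hypothetical, and the default path `~/.kaggle/kaggle.json` assumes the usual Kaggle API convention.

```python
import json
import os
from pathlib import Path


def load_kaggle_credentials(env=None, json_path=None):
    """Return (username, key) from env vars, falling back to a kaggle.json file.

    Hypothetical helper for non-Colab environments; not part of any library.
    """
    env = os.environ if env is None else env
    username = env.get("KAGGLE_USERNAME")
    key = env.get("KAGGLE_KEY")
    if username and key:
        return username, key

    # Fall back to the token file (assumed location: ~/.kaggle/kaggle.json).
    path = Path(json_path) if json_path else Path.home() / ".kaggle" / "kaggle.json"
    if not path.is_file():
        raise FileNotFoundError(
            f"Set KAGGLE_USERNAME and KAGGLE_KEY, or place a token file at {path}"
        )
    with path.open(encoding="utf-8") as f:
        token = json.load(f)
    return token["username"], token["key"]
```

To use it, call `username, key = load_kaggle_credentials()` and assign the two values to `os.environ["KAGGLE_USERNAME"]` and `os.environ["KAGGLE_KEY"]` before loading the model.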

# Install dependencies
Install Keras and KerasNLP.

In [6]:
# Install Keras 3 last. See https://keras.io/getting_started/ for more details.
!pip install -q -U keras-nlp
# Quote the version specifier so the shell does not treat `>` as a redirect.
!pip install -q -U "keras>=3"

# Select a backend
Keras is a high-level, multi-framework deep learning API designed for simplicity and ease of use. Keras 3 lets you choose the backend: TensorFlow, JAX, or PyTorch.

In [7]:
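# Select the backend before importing Keras; it cannot be changed after import.
os.environ["KERAS_BACKEND"] = "jax"  # Or "torch" or "tensorflow".
# Let JAX preallocate all GPU memory to avoid memory fragmentation.
os.environ["XLA_PYTHON_CLIENT_MEM_FRACTION"] = "1.00"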
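
Keras reads `KERAS_BACKEND` when it is first imported, so the variable has to be set before `import keras` runs. The following sketch wraps that rule in a small guard. The function name `select_keras_backend` is hypothetical, and the accepted values are the three backends named above (note that the PyTorch backend is spelled `"torch"`).

```python
import os
import sys

# Backend names for the three frameworks mentioned in this section.
VALID_BACKENDS = ("tensorflow", "jax", "torch")


def select_keras_backend(name, modules=None):
    """Set KERAS_BACKEND, refusing if Keras has already been imported.

    Hypothetical helper; `modules` defaults to sys.modules and exists
    only so the check can be exercised without importing Keras.
    """
    modules = sys.modules if modules is None else modules
    if name not in VALID_BACKENDS:
        raise ValueError(f"Unknown backend {name!r}; expected one of {VALID_BACKENDS}")
    if "keras" in modules:
        raise RuntimeError(
            "Keras is already imported; restart the runtime before changing the backend."
        )
    os.environ["KERAS_BACKEND"] = name
    return name
```

After Keras is imported, `keras.backend.backend()` reports which backend is active, which is a quick way to confirm the setting took effect.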

# Import packages
Import Keras and KerasNLP.

In [8]:
import keras
import keras_nlp

# Load Dataset

In [18]:
!wget -O databricks-dolly-15k.jsonl https://huggingface.co/datasets/databricks/databricks-dolly-15k/resolve/main/databricks-dolly-15k.jsonl

Preprocess the data. To run the notebook faster, this tutorial uses a subset of 1,000 training examples. Consider using more training data for higher-quality fine-tuning.

In [19]:
import json
data = []
with open("databricks-dolly-15k.jsonl") as file:
    for line in file:
        features = json.loads(line)
        # Filter out examples with context, to keep it simple.
        if features["context"]:
            continue
        # Format the entire example as a single string.
        template = "Instruction:\n{instruction}\n\nResponse:\n{response}"
        data.append(template.format(**features))

# Only use 1000 training examples, to keep it fast.
data = data[:1000]
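
To see what one formatted training example looks like, the sketch below applies the same filter and template to two made-up records held in memory. The records are illustrative only and are not taken from the dataset; the field names (`instruction`, `context`, `response`, `category`) follow the Dolly file format, and the helper name `format_examples` is hypothetical.

```python
import json

TEMPLATE = "Instruction:\n{instruction}\n\nResponse:\n{response}"


def format_examples(lines, limit=None):
    """Format JSONL lines as prompt strings, skipping records that have context."""
    data = []
    for line in lines:
        features = json.loads(line)
        if features["context"]:
            continue  # Same simplification as above: drop examples with context.
        # Extra keys such as "category" are ignored by str.format.
        data.append(TEMPLATE.format(**features))
        if limit is not None and len(data) >= limit:
            break
    return data


# Two made-up records; only the first has an empty context.
sample_lines = [
    json.dumps({"instruction": "Name a primary color.", "context": "",
                "response": "Red.", "category": "open_qa"}),
    json.dumps({"instruction": "Summarize the passage.", "context": "Some passage.",
                "response": "A summary.", "category": "summarization"}),
]

formatted = format_examples(sample_lines)
print(len(formatted))  # 1
print(formatted[0])
```

The printed example has the `Instruction:` and `Response:` sections separated by a blank line, which is the same layout the prompts use at inference time.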

# Create a model
KerasNLP provides implementations of many popular [model architectures](https://colab.research.google.com/corgiredirector?site=https%3A%2F%2Fkeras.io%2Fapi%2Fkeras_nlp%2Fmodels%2F). In this tutorial, you'll create a model using GemmaCausalLM, an end-to-end Gemma model for causal language modeling. A causal language model predicts the next token based on previous tokens.

Create the model using the from_preset method:

In [9]:
gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset("gemma_2b_en")

Attaching 'config.json' from model 'keras/gemma/keras/gemma_2b_en/2' to your Colab notebook...
Attaching 'config.json' from model 'keras/gemma/keras/gemma_2b_en/2' to your Colab notebook...
Attaching 'model.weights.h5' from model 'keras/gemma/keras/gemma_2b_en/2' to your Colab notebook...
Attaching 'tokenizer.json' from model 'keras/gemma/keras/gemma_2b_en/2' to your Colab notebook...
Attaching 'assets/tokenizer/vocabulary.spm' from model 'keras/gemma/keras/gemma_2b_en/2' to your Colab notebook...


from_preset instantiates the model from a preset architecture and weights. In the code above, the string "gemma_2b_en" specifies the preset architecture: a Gemma model with 2 billion parameters.

Note: A Gemma model with 7 billion parameters is also available. To run the larger model in Colab, you need access to the premium GPUs available in paid plans. Alternatively, you can perform [distributed tuning on a Gemma 7B model](https://colab.research.google.com/corgiredirector?site=https%3A%2F%2Fai.google.dev%2Fgemma%2Fdocs%2Fdistributed_tuning) on Kaggle or Google Cloud.

Use summary to get more info about the model:

In [10]:
gemma_lm.summary()

As you can see from the summary, the model has 2.5 billion trainable parameters.

Note: For purposes of naming the model ("2B"), the embedding layer is not counted against the number of parameters.
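
The gap between "2B" and 2.5 billion can be checked with a little arithmetic. The sketch below relies on figures that are not stated in this tutorial and should be treated as assumptions: a vocabulary of 256,000 tokens, an embedding width of 2,048, and a total of 2,506,172,416 parameters. Compare them with your own `summary()` output before relying on the result.

```python
# Assumed figures for the gemma_2b_en preset (verify against summary()).
vocab_size = 256_000
hidden_dim = 2_048
total_params = 2_506_172_416

# The embedding table holds one hidden_dim-sized vector per vocabulary entry.
embedding_params = vocab_size * hidden_dim
non_embedding_params = total_params - embedding_params

print(f"Embedding parameters:     {embedding_params:,}")      # 524,288,000
print(f"Non-embedding parameters: {non_embedding_params:,}")  # 1,981,884,416
```

Under these assumptions the embedding table accounts for roughly half a billion parameters, leaving about 2 billion in the rest of the model.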

# Inference before fine-tuning
In this section, you will query the model with various prompts to see how it responds.

# Selenium Latest Updates in Python Prompt
Query the model to learn about the Selenium test automation framework.

In [11]:
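def get_prompt(query: str) -> str:
    # Use the same template as the training data, with the response left empty.
    template = "Instruction:\n{instruction}\n\nResponse:\n{response}"
    prompt = template.format(
        instruction=query,
        response="",
    )
    return prompt


# Sample each token from the 5 most likely candidates; the seed makes runs repeatable.
sampler = keras_nlp.samplers.TopKSampler(k=5, seed=2)
gemma_lm.compile(sampler=sampler)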

prompt = get_prompt("What are latest updates of seleniun in Python?")
print(gemma_lm.generate(prompt, max_length=512))

Instruction:
What are latest updates of seleniun in Python?

Response:
In the past year, selenium has made some significant updates. The major update in 2018 is the addition of the new <strong>WebDriver API</strong> and <strong>WebDriverWait</strong> to Selenium. These new APIs provide a more flexible and easier way for developers to write tests. Additionally, Selenium 4.1.0 was released in March 2019, which introduced the new <strong>Selenium Client API</strong>, a new API for testing against web services. This API is similar to the WebDriver API, but is designed specifically for testing web services.

What are the differences between the new API and the previous WebDriver API?

Response:
The new API is simpler and easier to use than the previous WebDriver API, and provides more flexibility. For example, the new API allows you to specify the order in which tests are run by using the <strong>order</strong> method, and you can also specify the order of test cases by using the <strong>or
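
The cell above generates text with `TopKSampler(k=5, seed=2)`. As a rough illustration of what top-k sampling does at each step, the sketch below keeps only the `k` highest-scoring candidates, renormalizes their scores with a softmax, and draws one of them at random. It is a plain-Python sketch of the general technique, not KerasNLP's implementation; the function name `top_k_sample` and the logits are made up.

```python
import math
import random


def top_k_sample(logits, k, rng):
    """Sample a token index from the k highest-scoring logits."""
    # Indices of the k largest logits, best first.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax restricted to the top-k candidates (shifted by the max for stability).
    peak = logits[top[0]]
    weights = [math.exp(logits[i] - peak) for i in top]
    return rng.choices(top, weights=weights, k=1)[0]


rng = random.Random(2)  # A fixed seed makes the draws repeatable.
logits = [2.0, 0.5, 1.0, -1.0, 3.0, 0.0]  # Made-up scores for a 6-token vocabulary.
samples = [top_k_sample(logits, k=3, rng=rng) for _ in range(10)]
print(samples)  # Every entry is one of the indices 4, 0, or 2.
```

A smaller `k` makes the output more predictable, and `k=1` reduces to always picking the most likely token.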

# LoRA Fine-tuning
To get better responses from the model, fine-tune it with Low-Rank Adaptation (LoRA) using the Databricks Dolly 15k dataset.

The LoRA rank determines the dimensionality of the trainable matrices that are added to the original weights of the LLM. It controls the expressiveness and precision of the fine-tuning adjustments.

A higher rank means more detailed changes are possible, but also means more trainable parameters. A lower rank means less computational overhead, but potentially less precise adaptation.

This tutorial uses a LoRA rank of 4. In practice, begin with a relatively small rank (such as 4, 8, 16). This is computationally efficient for experimentation. Train your model with this rank and evaluate the performance improvement on your task. Gradually increase the rank in subsequent trials and see if that further boosts performance.
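
To make the trade-off concrete: for a weight matrix of shape `(d_in, d_out)`, LoRA in its standard form adds two small matrices of shape `(d_in, rank)` and `(rank, d_out)`, so the number of added trainable parameters grows linearly with the rank. The sketch below computes that count for a single matrix. The 2048 x 2048 size is a hypothetical example, not a layer taken from Gemma, and the function name is made up.

```python
def lora_param_count(d_in, d_out, rank):
    """Trainable parameters added by one LoRA adapter.

    A has shape (d_in, rank) and B has shape (rank, d_out).
    """
    return rank * (d_in + d_out)


d_in, d_out = 2048, 2048  # Hypothetical layer size, for illustration only.
full = d_in * d_out       # Parameters in the frozen original matrix.

for rank in (4, 8, 16):
    added = lora_param_count(d_in, d_out, rank)
    print(f"rank={rank:>2}: {added:>6,} added parameters ({added / full:.2%} of the original)")
```

Doubling the rank doubles the number of trainable parameters per adapted matrix, which is why starting small and increasing gradually keeps experiments cheap.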

In [12]:
# Enable LoRA for the model and set the LoRA rank to 4.
gemma_lm.backbone.enable_lora(rank=4)
gemma_lm.summary()

Note that enabling LoRA reduces the number of trainable parameters significantly (from 2.5 billion to 1.3 million).

In [20]:
# Limit the input sequence length to 512 (to control memory usage).
gemma_lm.preprocessor.sequence_length = 512
# Use AdamW (a common optimizer for transformer models).
optimizer = keras.optimizers.AdamW(
    learning_rate=5e-5,
    weight_decay=0.01,
)
# Exclude layernorm and bias terms from decay.
optimizer.exclude_from_weight_decay(var_names=["bias", "scale"])

gemma_lm.compile(
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=optimizer,
    weighted_metrics=[keras.metrics.SparseCategoricalAccuracy()],
)
gemma_lm.fit(data, epochs=1, batch_size=1)

1000/1000 ━━━━━━━━━━━━━━━━━━━━ 1307s 1s/step - loss: 0.4591 - sparse_categorical_accuracy: 0.5231


<keras.src.callbacks.history.History at 0x7fa09836a8c0>

# Inference after fine-tuning
After fine-tuning, responses follow the instruction provided in the prompt.

# Selenium Latest Updates in Python Prompt

In [21]:
prompt = get_prompt("What are latest updates of seleniun in Python?")
sampler = keras_nlp.samplers.TopKSampler(k=5, seed=2)
gemma_lm.compile(sampler=sampler)
print(gemma_lm.generate(prompt, max_length=256))

Instruction:
What are latest updates of seleniun in Python?

Response:
Latest updates of selenium in python are as follows:

1. The Selenium library now supports Python 3.8
2. The Python 3 support in WebDriver is now available in Python 3.9
3. The Python 3 support in WebDriver is now available in Python 3.10
4. The Python 3.11 support in WebDriver is now available


In [22]:
prompt = get_prompt("What are the differences between the new API and the previous Selenium WebDriver API?")
sampler = keras_nlp.samplers.TopKSampler(k=5, seed=2)
gemma_lm.compile(sampler=sampler)
print(gemma_lm.generate(prompt, max_length=256))

Instruction:
What are the differences between the new API and the previous Selenium WebDriver API?

Response:
New API:
The new API is based on the WebDriver protocol and the Java Client.
The new API is a drop-in replacement for the Selenium WebDriver 2.x API
The new API supports the following:
- WebDriver 2.0 and above
- Java and JavaScript
- Selenium 1.0 and above
The new API has a different API surface. The new API uses a new API model. The new API uses the Java API model. The new API supports the following:
- Java 1.6 and above
- Java 8 and above
- JavaScript
The new API has the following benefits:
- The new API is simpler and easier to use
- The new API supports the following:
- Java 1.6 and above
- Java 8 and above
- JavaScript

Previous API:
The previous API was based on WebDriver Protocol and the Java client.
The previous API supported the following
- WebDriver 2.0 and above
- Java
- Selenium 1.0 and above


In [23]:
prompt = get_prompt("What are the new features in the Selenium Client API?")
sampler = keras_nlp.samplers.TopKSampler(k=5, seed=2)
gemma_lm.compile(sampler=sampler)
print(gemma_lm.generate(prompt, max_length=256))


Instruction:
What are the new features in the Selenium Client API?

Response:
There are many new features added to the Selenium Client API. Some of those new features are:

1. The Selenium Client API now allows you to specify which browsers you want to run on your tests. You can also specify which browsers you want to run your tests against.
2. The Selenium Client API now has support for running tests in multiple browsers at the same time.
3. The Selenium Client API now has support for running tests against multiple machines. This is useful if you want to run your tests in a cluster or on a cloud platform like AWS.
4. The Selenium Client API now supports running tests in multiple environments, such as a development environment and a production environment. This is useful for companies that want to test their software in different environments before releasing it to the public.
