<a href="https://colab.research.google.com/github/aimenemen/Demo-creating-your-first-repo/blob/main/Kaggle_Keras_Gemma_I_O.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Setup

### Get access to Gemma

To complete this tutorial, you will first need to complete the setup instructions at [Gemma setup](https://ai.google.dev/gemma/docs/setup). The Gemma setup instructions show you how to do the following:

* Get access to Gemma on [kaggle.com](https://kaggle.com).
* Select a Colab runtime with sufficient resources to run
  the Gemma 2B model.
* Generate and configure a Kaggle username and API key.

After you've completed the Gemma setup, move on to the next section, where you'll set environment variables for your Colab environment.

### Select the runtime

To complete this tutorial, you'll need to have a Colab runtime with sufficient resources to run the Gemma model. In this case, you can use a T4 GPU or an A100 GPU (recommended, if available):

1. In the upper-right of the Colab window, select &#9662; (**Additional connection options**).
2. Select **Change runtime type**.
3. Under **Hardware accelerator**, select **T4 GPU** or **A100 GPU**.

### Configure your API key

To use Gemma, you must provide your Kaggle username and a Kaggle API key.

To generate a Kaggle API key, go to the **Account** tab of your Kaggle user profile and select **Create New Token**. This will trigger the download of a `kaggle.json` file containing your API credentials.

In Colab, select **Secrets** (🔑) in the left pane and add your Kaggle username and Kaggle API key. Store your username under the name `KAGGLE_USERNAME` and your API key under the name `KAGGLE_KEY`.

### Set environment variables

Set environment variables for `KAGGLE_USERNAME` and `KAGGLE_KEY`.

In [1]:
import os
from google.colab import userdata

# Note: `userdata.get` is a Colab API. If you're not using Colab, set the env
# vars as appropriate for your system.

#os.environ["GITHUB_TOKEN"] = userdata.get('GITHUB_TOKEN')
os.environ["KAGGLE_USERNAME"] = userdata.get('KAGGLE_USERNAME')
os.environ["KAGGLE_KEY"] = userdata.get('KAGGLE_KEY')

### Install dependencies

Install Keras, KerasNLP, and other dependencies.

In [2]:
# Install Keras 3 last
!pip install -q -U tf-keras
!pip install -q -U keras-nlp==0.10.0
!pip install -q -U kagglehub>=0.2.4
!pip install -q -U keras>=3

[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m1.7/1.7 MB[0m [31m38.6 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m644.9/644.9 MB[0m [31m2.1 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m4.7/4.7 MB[0m [31m92.5 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m5.5/5.5 MB[0m [31m103.1 MB/s[0m eta [36m0:00:00[0m
[?25h[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorflow-text 2.18.1 requires tensorflow<2.19,>=2.18.0, but you have tensorflow 2.19.0 which is incompatible.[0m[31m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m513.7/513.7 kB[0m [31m14.4 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m950.8/950.8 kB[0m [31m45.2 MB/s[0m 

### Select a backend

Keras is a high-level, multi-framework deep learning API designed for simplicity and ease of use. Using Keras 3, you can run workflows on one of three backends: TensorFlow, JAX, or PyTorch.

For this tutorial, configure the backend for JAX.

In [3]:
os.environ["KERAS_BACKEND"] = "jax"
# Avoid memory fragmentation on JAX backend.
os.environ["XLA_PYTHON_CLIENT_MEM_FRACTION"]="1.00"

### Import packages

Import Keras, KerasNLP, and the `csv` package.

In [4]:
import keras_nlp
import keras
import csv

print("KerasNLP version: ", keras_nlp.__version__)
print("Keras version: ", keras.__version__)

KerasNLP version:  0.10.0
Keras version:  3.9.0


## Load Model

Let's download the 2B variant of Gemma from Kaggle. You can see the model page [here](https://www.kaggle.com/models/keras/gemma/keras/gemma_2b_en).

In [5]:
gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset("gemma_2b_en")

Downloading from https://www.kaggle.com/api/v1/models/keras/gemma/keras/gemma_2b_en/2/download/metadata.json...


100%|██████████| 143/143 [00:00<00:00, 391kB/s]


Downloading from https://www.kaggle.com/api/v1/models/keras/gemma/keras/gemma_2b_en/2/download/config.json...


100%|██████████| 555/555 [00:00<00:00, 1.08MB/s]


Downloading from https://www.kaggle.com/api/v1/models/keras/gemma/keras/gemma_2b_en/2/download/model.weights.h5...


100%|██████████| 4.67G/4.67G [01:41<00:00, 49.5MB/s]


Downloading from https://www.kaggle.com/api/v1/models/keras/gemma/keras/gemma_2b_en/2/download/tokenizer.json...


100%|██████████| 401/401 [00:00<00:00, 841kB/s]


Downloading from https://www.kaggle.com/api/v1/models/keras/gemma/keras/gemma_2b_en/2/download/assets/tokenizer/vocabulary.spm...


100%|██████████| 4.04M/4.04M [00:00<00:00, 9.22MB/s]


In [6]:
gemma_lm.summary()

## Load Dataset

In [8]:
pip install --upgrade kaggle


Collecting kaggle
  Downloading kaggle-1.7.4-py3-none-any.whl.metadata (17 kB)
Downloading kaggle-1.7.4-py3-none-any.whl (173 kB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m173.2/173.2 kB[0m [31m6.1 MB/s[0m eta [36m0:00:00[0m
[?25hInstalling collected packages: kaggle
  Attempting uninstall: kaggle
    Found existing installation: kaggle 1.6.17
    Uninstalling kaggle-1.6.17:
      Successfully uninstalled kaggle-1.6.17
Successfully installed kaggle-1.7.4


Let's download a [Medical Question Answering Dataset](https://www.kaggle.com/datasets/jpmiller/layoutlm/data) from Kaggle for this fine-tune example.

In [9]:
import kagglehub

# Download latest version
path = kagglehub.dataset_download("ashukumar27/digital-marketing-questions-answers")

print("Path to dataset files:", path)


Downloading from https://www.kaggle.com/api/v1/datasets/download/ashukumar27/digital-marketing-questions-answers?dataset_version_number=1...


100%|██████████| 31.0k/31.0k [00:00<00:00, 37.1MB/s]

Extracting files...
Path to dataset files: /root/.cache/kagglehub/datasets/ashukumar27/digital-marketing-questions-answers/versions/1





In [16]:
from google.colab import files
uploaded = files.upload()


Saving digital_marketing_qna.csv to digital_marketing_qna.csv


In [15]:
#!unzip /root/.cache/kagglehub/datasets/ashukumar27/digital-marketing-questions-answers/versions/1/data.zip


unzip:  cannot find or open /root/.cache/kagglehub/datasets/ashukumar27/digital-marketing-questions-answers/versions/1/data.zip, /root/.cache/kagglehub/datasets/ashukumar27/digital-marketing-questions-answers/versions/1/data.zip.zip or /root/.cache/kagglehub/datasets/ashukumar27/digital-marketing-questions-answers/versions/1/data.zip.ZIP.


After unzipping the `medquad.csv` file, we should format our data from the `csv` into question and answer examples.

This will be the dataset our model will be fine-tuned on.

In [18]:
import csv

data = []
filename = next(iter(uploaded))  # This fetches the name of the uploaded file

with open(filename, mode='r', encoding='utf-8') as file:
    reader = csv.DictReader(file)
    for row in reader:
        template = "Question:\n{question}\n\nAnswer:\n{answer}"
        data.append(template.format(**row))


Let's take a look at an example to make sure the data has been formatted correctly with the Question-Answer template:

In [19]:
for entry in data[:5]:
    print(entry)


Question:
"What is digital marketing?"

Answer:
"Digital marketing refers to advertising delivered through digital channels like search engines, websites, social media, email, and mobile apps."
Question:
"Why is digital marketing important?"

Answer:
"It allows for more precise targeting, analytics, and ROI measurement, adapting to user behavior and preferences."
Question:
"What's the difference between SEO and PPC?"

Answer:
"SEO (Search Engine Optimization) focuses on organic traffic through search engine results, while PPC (Pay-Per-Click) is about paid advertisements."
Question:
"How can I improve my website's SEO?"

Answer:
"Optimize content for keywords, improve site speed, ensure mobile-friendliness, and earn quality backlinks."
Question:
"What are the key components of a digital marketing strategy?"

Answer:
"They include SEO, content marketing, email marketing, social media, PPC, and analytics."


### Inference before fine tuning

The original Gemma model has a lot of general knowledge, but fine-tuning can help improve domain-specific knowledge.

To test the pre-trained model on more specific medical knowledge, let's pick a more complex disease: **Chronic Eosinophilic Leukemia**.

Let's prompt Gemma by asking about about treatments for that disease, making sure to format our prompt using the Question-Answer template we previously defined.

In [20]:
#import your_language_model_library as gemma_lm  # Replace with the actual library name

# Define the question you want an answer for
question = "What are the latest trends in digital marketing?"

# Prepare the prompt for the language model
prompt = f"Question:\n{question}\n\nAnswer:\n"

# Generate an answer using the language model
answer = gemma_lm.generate(prompt, max_length=128)  # Adjust function parameters as necessary

print(f"Question: {question}\nGenerated Answer: {answer}")


Question: What are the latest trends in digital marketing?
Generated Answer: Question:
What are the latest trends in digital marketing?

Answer:
Digital marketing is constantly evolving and changing. The latest trends in digital marketing include:

1. Artificial Intelligence (AI): AI is being used to automate tasks, personalize content, and improve customer experiences.
2. Chatbots: Chatbots are automated chatbots that can be used to answer customer questions and provide information.
3. Voice Search: Voice search is becoming more popular, with people using their voice to search for information.
4. Social Media Marketing: Social media marketing is a key part of digital marketing, with platforms like Facebook, Twitter


As you can see, the resulting answer from Gemma simply defines the disease, breaking down the definition of leukemia and eosinophils. However, it isn't able to answer the question on treatments!

This is where fine-tuning on our medical dataset can help.

## LoRA Fine-tuning

To get better responses from the model, fine-tune the model with Low Rank Adaptation (LoRA) using our Medical Question-Answer dataset.

The LoRA rank determines the dimensionality of the trainable matrices that are added to the original weights of the LLM. It controls the expressiveness and precision of the fine-tuning adjustments.

A higher rank means more detailed changes are possible, but also means more trainable parameters. A lower rank means less computational overhead, but potentially less precise adaptation.

This tutorial uses a LoRA rank of 4. In practice, begin with a relatively small rank (such as 4, 8, 16). This is computationally efficient for experimentation. Train your model with this rank and evaluate the performance improvement on your task. Gradually increase the rank in subsequent trials and see if that further boosts performance.

In [22]:
# Enable LoRA for the model and set the LoRA rank to 4.
gemma_lm.backbone.enable_lora(rank=4)
gemma_lm.summary()

Note that enabling LoRA reduces the number of trainable parameters significantly.

In [29]:
# Example of explicitly defining the input shape (adjust according to your model's expected input)
input_shape = (128,)  # Example shape, adjust based on your actual needs
gemma_lm.build(input_shape)  # This method builds the model


In [31]:
# Re-compiling the model to ensure it's set up correctly
gemma_lm.compile(
    optimizer=optimizer,
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=[keras.metrics.SparseCategoricalAccuracy()]
)


In [33]:
# If possible, specify the input shape directly in the model's constructor or setup function
gemma_lm.build(input_shape=(None, 128))  # None can be batch size or specific sequence length
print(gemma_lm.summary())  # Check the model's architecture


None


In [35]:
# Re-compiling the model with a potentially simpler optimizer for debugging
gemma_lm.compile(
    optimizer='adam',  # Using a standard Adam optimizer for testing
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)


In [34]:
# Check the first item in the dataset to verify structure and type
print(data[0])  # Assuming data is a list or similar iterable


Question:
"What is digital marketing?"

Answer:
"Digital marketing refers to advertising delivered through digital channels like search engines, websites, social media, email, and mobile apps."


### Inference after fine tuning
After fine tuning the model, let's try the prompt again to ask for treatments to the disease.

In [40]:
# Assuming 'template' is defined somewhere in your notebook like this:
template = "Question:\n{question}\n\nAnswer:\n{answer}"

# Prepare a question from your dataset, for example:
question = "What are the latest trends in digital marketing?"

# Format the prompt to fit the expected input structure of your language model
prompt = template.format(question=question, answer="")

# Generate an answer using the language model
# Make sure gemma_lm has a generate method, and it is used correctly here
try:
    generated_answer = gemma_lm.generate(prompt, max_length=128)
    # Print the generated answer
    print("Generated Answer:", generated_answer)
except AttributeError as e:
    print("Error:", e)
    print("Check if the 'generate' method is correctly implemented in 'gemma_lm'.")
except Exception as e:
    print("An error occurred:", e)


An error occurred: 'NoneType' object is not callable


The response is much more helpful than before fine-tuning, readily listing potential treatment options for Chronic Eosinophilic Leukemia.

## Upload your model to Kaggle

Create a preset directory for your model files.

Then, save the model to that preset directory.

In [None]:
preset = "./medical_gemma"
# Save the model to the preset directory.
gemma_lm.save_to_preset(preset)

Create a Kaggle URI for your model.
It should follow the following format:

`kaggle://{KAGGLE USERNAME}/{MODEL NAME}/keras/{VARIATION NAME}`

In [None]:
kaggle_username = userdata.get('KAGGLE_USERNAME')
model_name = "gemma"
variation_name = "medical_gemma"

uri = f"kaggle://{kaggle_username}/{model_name}/keras/{variation_name}"
uri

'kaggle://nkovela/gemma/keras/medical_gemma'

Then, upload the preset to Kaggle!

If this is your first upload of this model, a Kaggle model page will be created associated with your profile.

You can view all your models on your [Work Page](https://www.kaggle.com/work/models).

In [None]:
# Upload preset to Kaggle
keras_nlp.upload_preset(uri, preset)

Starting upload for file task.json
Uploading: 100%|██████████| 1.91k/1.91k [00:00<00:00, 2.27kB/s]
Upload successful: task.json (2KB)
Starting upload for file tokenizer.json
Uploading: 100%|██████████| 315/315 [00:00<00:00, 374B/s]
Upload successful: tokenizer.json (315B)
Starting upload for file preprocessor.json
Uploading: 100%|██████████| 831/831 [00:00<00:00, 990B/s]
Upload successful: preprocessor.json (831B)
Starting upload for file config.json
Uploading: 100%|██████████| 501/501 [00:00<00:00, 582B/s]
Upload successful: config.json (501B)
Starting upload for file metadata.json
Uploading: 100%|██████████| 143/143 [00:00<00:00, 176B/s]
Upload successful: metadata.json (143B)
Starting upload for file model.weights.h5
Uploading: 100%|██████████| 10.0G/10.0G [06:47<00:00, 24.6MB/s]
Upload successful: model.weights.h5 (9GB)
Starting upload for file vocabulary.spm
Uploading: 100%|██████████| 4.24M/4.24M [00:02<00:00, 1.75MB/s]
Upload successful: vocabulary.spm (4MB)
Your model instance 