
##### Copyright 2024 Google LLC.

In [None]:
#@title Licensed under the Apache License, Version 2.0 (the "License");

# you may not use this file except in compliance with the License.

# You may obtain a copy of the License at

#

# https://www.apache.org/licenses/LICENSE-2.0

#

# Unless required by applicable law or agreed to in writing, software

# distributed under the License is distributed on an "AS IS" BASIS,

# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

# See the License for the specific language governing permissions and

# limitations under the License.

# Fine-tune Gemma models in Keras using LoRA

<table class="tfo-notebook-buttons" align="left">

  <td>

    <a target="_blank" href="https://ai.google.dev/gemma/docs/lora_tuning"><img src="https://ai.google.dev/static/site-assets/images/docs/notebook-site-button.png" height="32" width="32" />View on ai.google.dev</a>

  </td>

  <td>

    <a target="_blank" href="https://colab.research.google.com/github/google/generative-ai-docs/blob/main/site/en/gemma/docs/lora_tuning.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>

  </td>

  <td>

    <a target="_blank" href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/google/generative-ai-docs/main/site/en/gemma/docs/lora_tuning.ipynb"><img src="https://ai.google.dev/images/cloud-icon.svg" width="40" />Open in Vertex AI</a>

  </td>

  <td>

    <a target="_blank" href="https://github.com/google/generative-ai-docs/blob/main/site/en/gemma/docs/lora_tuning.ipynb"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" />View source on GitHub</a>

  </td>

</table>

## Overview



Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models.



Large Language Models (LLMs) like Gemma have been shown to be effective at a variety of NLP tasks. An LLM is first pre-trained on a large corpus of text in a self-supervised fashion. Pre-training helps LLMs learn general-purpose knowledge, such as statistical relationships between words. An LLM can then be fine-tuned with domain-specific data to perform downstream tasks (such as sentiment analysis).



LLMs are extremely large (on the order of billions of parameters). Full fine-tuning, which updates all the parameters in the model, is not required for most applications because typical fine-tuning datasets are much smaller than the pre-training datasets.



[Low Rank Adaptation (LoRA)](https://arxiv.org/abs/2106.09685) is a fine-tuning technique which greatly reduces the number of trainable parameters for downstream tasks by freezing the weights of the model and inserting a smaller number of new weights into the model. This makes training with LoRA much faster and more memory-efficient, and produces smaller model weights (a few hundred MBs), all while maintaining the quality of the model outputs.
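
To make the idea concrete, here is a minimal sketch of a LoRA-adapted linear layer in plain Python. It follows the formulation in the LoRA paper rather than KerasNLP's internal implementation. The toy sizes, the `alpha` scaling value, and the helper names `matvec` and `lora_forward` are assumptions chosen for illustration.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha, rank):
    """Return W x + (alpha / rank) * B (A x), the LoRA-adapted layer output."""
    base = matvec(W, x)               # frozen pretrained path
    update = matvec(B, matvec(A, x))  # trainable low-rank path
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]


rng = random.Random(0)
d_out, d_in, rank = 6, 8, 2  # toy sizes, chosen only for illustration

# W stays frozen; only A (rank x d_in) and B (d_out x rank) would be trained.
W = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
A = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]
B = [[0.0] * rank for _ in range(d_out)]  # zero init: no change at the start

x = [rng.gauss(0, 1) for _ in range(d_in)]
frozen_out = matvec(W, x)
adapted_out = lora_forward(W, A, B, x, alpha=4, rank=rank)
print(adapted_out == frozen_out)
```

Because `B` starts at zero, the adapted layer initially reproduces the frozen layer exactly, and training only has to learn the small matrices `A` and `B`.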



This tutorial walks you through using KerasNLP to perform LoRA fine-tuning on a Gemma 2 2B model. It follows the structure of the original tutorial, which uses the [Databricks Dolly 15k dataset](https://huggingface.co/datasets/databricks/databricks-dolly-15k), a set of 15,000 high-quality human-generated prompt / response pairs specifically designed for fine-tuning LLMs. The code in this notebook applies the same instruction / response format to a medical question-answering dataset.

## Setup

### Get access to Gemma



To complete this tutorial, you will first need to complete the setup instructions at [Gemma setup](https://ai.google.dev/gemma/docs/setup). The Gemma setup instructions show you how to do the following:



* Get access to Gemma on [kaggle.com](https://kaggle.com).

* Select a Colab runtime with sufficient resources to run

  the Gemma 2B model.

* Generate and configure a Kaggle username and API key.



After you've completed the Gemma setup, move on to the next section, where you'll set environment variables for your Colab environment.

### Select the runtime



To complete this tutorial, you'll need to have a Colab runtime with sufficient resources to run the Gemma model. In this case, you can use a T4 GPU:



1. In the upper-right of the Colab window, select &#9662; (**Additional connection options**).

2. Select **Change runtime type**.

3. Under **Hardware accelerator**, select **T4 GPU**.

### Configure your API key



To use Gemma, you must provide your Kaggle username and a Kaggle API key.



To generate a Kaggle API key, go to the **Account** tab of your Kaggle user profile and select **Create New Token**. This will trigger the download of a `kaggle.json` file containing your API credentials.



In Colab, select **Secrets** (🔑) in the left pane and add your Kaggle username and Kaggle API key. Store your username under the name `KAGGLE_USERNAME` and your API key under the name `KAGGLE_KEY`.
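
Outside Colab, one option is to read the downloaded `kaggle.json` file directly instead of typing the values into the notebook. The following is a minimal sketch, assuming the file contains the `username` and `key` fields used by Kaggle API tokens. The helper name `load_kaggle_credentials` and the path in the usage note are hypothetical.

```python
import json
import os


def load_kaggle_credentials(path):
    """Read a kaggle.json file and export its values as environment variables."""
    with open(path) as f:
        creds = json.load(f)
    # Assumed layout of kaggle.json: {"username": "...", "key": "..."}
    os.environ["KAGGLE_USERNAME"] = creds["username"]
    os.environ["KAGGLE_KEY"] = creds["key"]
    return creds["username"]
```

For example, `load_kaggle_credentials(os.path.expanduser("~/.kaggle/kaggle.json"))` would set both variables without the key ever appearing in a notebook cell.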

### Set environment variables



Set environment variables for `KAGGLE_USERNAME` and `KAGGLE_KEY`.

In [None]:
pip install keras keras-nlp huggingface-hub tensorflow


Note: you may need to restart the kernel to use updated packages.


In [None]:
import os

# Set these directly if not using Colab. Replace the placeholders with your own
# values, and never commit or share a notebook that contains a real API key.
os.environ["KAGGLE_USERNAME"] = "your_kaggle_username"
os.environ["KAGGLE_KEY"] = "your_kaggle_key"


### Install dependencies



Install Keras, KerasNLP, and other dependencies.

In [None]:
# Install Keras 3 last. See https://keras.io/getting_started/ for more details.

!pip install -q -U keras-nlp

!pip install -q -U "keras>=3"

### Select a backend



Keras is a high-level, multi-framework deep learning API designed for simplicity and ease of use. Using Keras 3, you can run workflows on one of three backends: TensorFlow, JAX, or PyTorch.



For this tutorial, configure the backend for JAX.

In [None]:
os.environ["KERAS_BACKEND"] = "jax"  # Or "torch" or "tensorflow".

# Avoid memory fragmentation on JAX backend.

os.environ["XLA_PYTHON_CLIENT_MEM_FRACTION"]="1.00"
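
Keras reads `KERAS_BACKEND` when it is first imported, so the variable has to be set before `import keras` runs. The sketch below wraps that rule in a small guard. The helper name `select_backend` and its `loaded_modules` parameter are hypothetical and exist only to make the check explicit and testable.

```python
import os
import sys

VALID_BACKENDS = ("jax", "tensorflow", "torch")


def select_backend(name, loaded_modules=None):
    """Set KERAS_BACKEND, refusing if Keras has already been imported."""
    if loaded_modules is None:
        loaded_modules = sys.modules
    if name not in VALID_BACKENDS:
        raise ValueError(f"Unknown backend {name!r}; expected one of {VALID_BACKENDS}")
    if "keras" in loaded_modules:
        # The backend cannot be switched after import; a runtime restart is needed.
        raise RuntimeError("Keras is already imported; restart the runtime to switch backends.")
    os.environ["KERAS_BACKEND"] = name
    return name
```

Calling `select_backend("jax")` at the top of the notebook has the same effect as the assignment above, but fails loudly if the cell order is wrong.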

### Import packages



Import Keras and KerasNLP.

In [None]:
import keras

import keras_nlp

## Load Dataset

In [None]:
##!wget -O databricks-dolly-15k.jsonl https://huggingface.co/datasets/databricks/databricks-dolly-15k/resolve/main/databricks-dolly-15k.jsonl

Preprocess the data. This tutorial uses a subset of 3,000 training examples to execute the notebook faster. Consider using more training data for higher quality fine-tuning.

In [None]:
import json
import random

# Prompt template shared by the training examples and the inference prompts.
template = "Instruction:\n{instruction}\n\nResponse:\n{response}"

# Load and filter data. Replace the path below with the location of your JSONL file.
data = []
with open("/kaggle/input/med-dataset/formatted_data.jsonl") as file:
    for line in file:
        features = json.loads(line)

        # Filter out examples with context to keep it simple.
        # .get() also tolerates records that have no "context" field.
        if features.get("context"):
            continue

        # Format the entire example as a single string.
        data.append(template.format(**features))

# Keep the first 3000 examples. Shuffling is disabled, so the subset is deterministic.
# random.shuffle(data)
data = data[:3000]
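
The following sketch shows what the preprocessing step produces for individual records. The two sample records are hypothetical, since the contents of the real file are not shown in this tutorial. They only assume the `instruction`, `context`, and `response` fields that the loading code reads.

```python
import json

TEMPLATE = "Instruction:\n{instruction}\n\nResponse:\n{response}"

# Hypothetical JSONL lines, for illustration only.
sample_lines = [
    json.dumps({"instruction": "What is anemia?", "context": "", "response": "Example answer text."}),
    json.dumps({"instruction": "Summarize the passage.", "context": "Some passage.", "response": "A summary."}),
]

examples = []
for line in sample_lines:
    features = json.loads(line)
    # Records with a non-empty context are skipped, as in the loading cell.
    if features.get("context"):
        continue
    # Extra keys such as "context" are ignored by str.format.
    examples.append(TEMPLATE.format(**features))

print(len(examples))
print(examples[0])
```

Only the first record survives the filter, and it becomes a single training string containing both the instruction and the response.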


## Load Model



KerasNLP provides implementations of many popular [model architectures](https://keras.io/api/keras_nlp/models/). In this tutorial, you'll create a model using `GemmaCausalLM`, an end-to-end Gemma model for causal language modeling. A causal language model predicts the next token based on previous tokens.



Create the model using the `from_preset` method:

In [None]:
gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset("/kaggle/input/gemma2/keras/gemma2_2b_en/1")

gemma_lm.summary()

normalizer.cc(51) LOG(INFO) precompiled_charsmap is empty. use identity normalization.


The `from_preset` method instantiates the model from a preset architecture and weights. In the code above, the path `/kaggle/input/gemma2/keras/gemma2_2b_en/1` points to a local copy of the `gemma2_2b_en` preset, a Gemma 2 model with roughly 2 billion parameters.



NOTE: A Gemma model with 7 billion parameters is also available. To run the larger model in Colab, you need access to the premium GPUs available in paid plans. Alternatively, you can perform [distributed tuning on a Gemma 7B model](https://ai.google.dev/gemma/docs/distributed_tuning) on Kaggle or Google Cloud.


## Inference before fine-tuning



In this section, you will query the model with various prompts to see how it responds.

### Medical Q&A prompts



Prompt the model with several medical questions, starting with the risk factors for Lymphocytic Choriomeningitis (LCMV).

In [None]:
prompt = template.format(
    instruction="What common risk factors for Lymphocytic Choriomeningitis (LCMV) should be highlighted in patient education materials?",
    response="",
)
sampler = keras_nlp.samplers.TopKSampler(k=5, seed=2)
gemma_lm.compile(sampler=sampler)
print(gemma_lm.generate(prompt, max_length=256))

Instruction:
What common risk factors for Lymphocytic Choriomeningitis (LCMV) should be highlighted in patient education materials?

Response:
A) A history of travel to an area where the virus is endemic, especially to areas of Africa and South America, is the most significant risk factor in LCMV.
B) Pregnant women and those with a compromised immune system are at risk for LCMV.
C) LCMV is a rare infection that can cause a wide range of clinical symptoms, including meningitis.
D) The incubation period for LCMV is typically two weeks, although it may vary depending on the patient's immune status.

Rationale:
The incubation period for LCMV is typically two weeks, although it may vary depending on the patient's immune status. Pregnant women and those with a compromised immune system are at risk for LCMV. The risk of infection is increased in areas where the virus is endemic, such as Africa and South America. LCMV can be transmitted through contact with infected urine or feces, as well as 

The base model answers in a multiple-choice format followed by a rationale, and the output is cut off when it reaches `max_length`.
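
The generation cell above samples with `TopKSampler(k=5, seed=2)`. As a rough illustration of what top-k sampling does, and not a copy of KerasNLP's implementation, the sketch below restricts sampling to the `k` highest-scoring tokens and draws from a softmax over just those. The logits and the helper name `top_k_sample` are made up for the example.

```python
import math
import random


def top_k_sample(logits, k, rng):
    """Sample one token index from the k highest-scoring logits."""
    # Keep the indices of the k largest logits.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax weights over the kept logits only (max subtracted for stability).
    m = max(logits[i] for i in top)
    weights = [math.exp(logits[i] - m) for i in top]
    return rng.choices(top, weights=weights, k=1)[0]


rng = random.Random(2)
logits = [2.0, 0.5, -1.0, 3.0, 0.0, 1.5, -2.0]  # made-up scores for a 7-token vocabulary
samples = [top_k_sample(logits, k=5, rng=rng) for _ in range(1000)]
print(sorted(set(samples)))
```

Tokens 2 and 6 have the two lowest scores, so with `k=5` they can never be sampled, no matter how many draws are made.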

In [None]:
# Define the prompt with an instruction about diagnosing LCMV
prompt = template.format(
    instruction="What are the primary diagnostic steps for LCMV, and what challenges may arise in accurately diagnosing it?",
    response=""
)

# Generate a response from the language model
print(gemma_lm.generate(prompt, max_length=256))



Instruction:
What are the primary diagnostic steps for LCMV, and what challenges may arise in accurately diagnosing it?

Response:
Diagnostic steps:
-Clinical presentation
-Laboratory tests (CBC, serologic tests for LCMV, and serology for other viruses)
-Viral cultures and viral RNA detection
-Serologic tests are used in LCMV diagnosis, and they are also used to detect other viruses. Serologic tests may not be able to distinguish LCMV from other viruses that have similar symptoms.
-Viral culture can detect LCMV, but it is not as sensitive as PCR testing, which can detect LCMV in the blood.
-PCR testing can detect LCMV in the blood and may be the preferred method for LCMV diagnosis.

Challenges:
-LCMV infection can be difficult to distinguish from other viral infections.

-The incubation period of LCMV can vary, and it can be difficult to determine the onset of symptoms in some cases.

-LCMV can spread through direct contact, and it can be transmitted to people who have been exposed to 

In [None]:
# Define the prompt with an instruction about diagnostic tests for viral infections with neurological symptoms
prompt = template.format(
    instruction="What diagnostic tests are most effective in early detection of viral infections with neurological symptoms?",
    response=""
)

# Generate a response from the language model
print(gemma_lm.generate(prompt, max_length=256))



Instruction:
What diagnostic tests are most effective in early detection of viral infections with neurological symptoms?

Response:
In the early stages of a viral infection, it’s hard to determine which viruses are causing the symptoms. A physician may order a complete blood cell count (CBC). If there are abnormal results, a physician may order a viral panel, which is usually a combination of blood tests that screen for a wide variety of viruses, including HIV, cytomegalovirus, hepatitis, parvovirus, Epstein-Barr, and herpes simplex.

The physician may also order an antibody test to detect the presence of viral antibodies. If a virus is present, the antibodies may be found in the blood, urine, or cerebro spinal fluid. If a virus is not present, the test will be negative.

In some cases, the physician may order a viral culture. A culture of the virus is taken from the blood, urine, or cerebral spinal fluid. If there is no virus present, the culture will be negative.

The physician may a

In [None]:
# Define the prompt with an instruction about early symptoms of zoonotic infections
prompt = template.format(
    instruction="What are the early symptoms of common zoonotic infections, and how can they be differentiated from similar diseases?",
    response=""
)

# Generate a response from the language model
print(gemma_lm.generate(prompt, max_length=256))


Instruction:
What are the early symptoms of common zoonotic infections, and how can they be differentiated from similar diseases?

Response:
The early symptoms of common zoonotic infections can include fever, malaise, headache, muscle aches, diarrhea, nausea, vomiting, abdominal pain, rash, and other symptoms. These symptoms can be similar to those of other diseases, such as the flu, so it's important to consult a healthcare provider to rule out other causes.

Differentiating between the early symptoms of common zoonotic infections and similar diseases can be challenging, but there are several key factors to consider.

1. Duration of symptoms: The duration of symptoms can be a helpful guide. For example, many zoonotic infections have an incubation period of 2-10 days before symptoms appear, while other common infections, such as the flu, have a shorter incubation period of 1-3 days. This can help distinguish between acute infections that are caused by zoonotic pathogens and those that 

In [None]:
# Define the prompt with an instruction about managing viral infections in immunocompromised individuals
prompt = template.format(
    instruction="What are the key considerations for managing viral infections in immunocompromised individuals?",
    response=""
)

# Generate a response from the language model
print(gemma_lm.generate(prompt, max_length=256))



Instruction:
What are the key considerations for managing viral infections in immunocompromised individuals?

Response:
Managing viral infections in immunocompromised individuals can be challenging due to the weakened immune system and the increased susceptibility to various viral diseases. Key considerations include:

1. Identifying the specific viral infection: Determining the type and strain of the virus is crucial for appropriate management and treatment.

2. Monitoring for complications: Immunocompromised individuals may experience complications such as sepsis, pneumonia, or encephalitis, so it is important to monitor vital signs, assess for signs of infection, and provide supportive care as needed.

3. Vaccination: Immunosuppressive therapy may reduce the effectiveness of vaccines, so it is essential to consider vaccination schedules and recommendations for the specific viral infections.

4. Infection control measures: Implementing strict infection control practices, such as pers

Before fine-tuning, the responses are plausible but generic, their format varies from prompt to prompt, and each one is cut off at the 256-token limit.

## LoRA Fine-tuning



To get better responses from the model, fine-tune the model with Low Rank Adaptation (LoRA) using the medical Q&A examples loaded earlier.



The LoRA rank determines the dimensionality of the trainable matrices that are added to the original weights of the LLM. It controls the expressiveness and precision of the fine-tuning adjustments.



A higher rank means more detailed changes are possible, but also means more trainable parameters. A lower rank means less computational overhead, but potentially less precise adaptation.



This tutorial uses a LoRA rank of 8. In practice, begin with a relatively small rank (such as 4, 8, or 16). This is computationally efficient for experimentation. Train your model with this rank and evaluate the performance improvement on your task. Gradually increase the rank in subsequent trials and see if that further boosts performance.
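
The trade-off between rank and trainable parameters can be worked out by hand. For a single weight matrix of shape `d_in x d_out`, LoRA adds two matrices of shapes `d_in x rank` and `rank x d_out`, so the added parameter count grows linearly with the rank. The layer shape below is a hypothetical example and is not taken from the Gemma architecture.

```python
def lora_param_count(d_in, d_out, rank):
    """Trainable parameters LoRA adds to one d_in x d_out weight matrix."""
    return rank * (d_in + d_out)


# Hypothetical layer shape, chosen only for illustration.
d_in, d_out = 2048, 2048
full = d_in * d_out

for rank in (4, 8, 16):
    added = lora_param_count(d_in, d_out, rank)
    print(f"rank={rank:>2}: {added:,} trainable vs {full:,} frozen ({added / full:.2%})")
```

Doubling the rank doubles the number of trainable parameters per adapted layer, which is why small ranks are a cheap starting point.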

In [None]:
# Enable LoRA for the model and set the LoRA rank to 8.

gemma_lm.backbone.enable_lora(rank=8)

gemma_lm.summary()

Note that enabling LoRA reduces the number of trainable parameters significantly, from 2.6 billion to a few million. The exact count depends on the rank and is reported in the summary output.

In [None]:
# Uncomment the line below if you want to enable mixed precision training on GPUs

#keras.mixed_precision.set_global_policy('mixed_bfloat16')

In [None]:


# Limit the input sequence length to 256 (to control memory usage).
gemma_lm.preprocessor.sequence_length = 256

# Use AdamW (a common optimizer for transformer models).
optimizer = keras.optimizers.AdamW(
    learning_rate=5e-5,
    weight_decay=0.01,
)

# Exclude layernorm and bias terms from decay.
optimizer.exclude_from_weight_decay(var_names=["bias", "scale"])

gemma_lm.compile(
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=optimizer,
    weighted_metrics=[keras.metrics.SparseCategoricalAccuracy()],
)

gemma_lm.fit(data, epochs=5, batch_size=1)

Epoch 1/5
3000/3000 ━━━━━━━━━━━━━━━━━━━━ 1307s 426ms/step - loss: 1.1003 - sparse_categorical_accuracy: 0.5921
Epoch 2/5
3000/3000 ━━━━━━━━━━━━━━━━━━━━ 1281s 420ms/step - loss: 0.9810 - sparse_categorical_accuracy: 0.6169
Epoch 3/5
3000/3000 ━━━━━━━━━━━━━━━━━━━━ 1260s 420ms/step - loss: 0.9405 - sparse_categorical_accuracy: 0.6316
Epoch 4/5
3000/3000 ━━━━━━━━━━━━━━━━━━━━ 1260s 420ms/step - loss: 0.9014 - sparse_categorical_accuracy: 0.6440
Epoch 5/5
3000/3000 ━━━━━━━━━━━━━━━━━━━━ 1261s 420ms/step - loss: 0.8631 - sparse_categorical_accuracy: 0.6547


<keras.src.callbacks.history.History at 0x7de7482179d0>
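
One way to read the training log is to convert each epoch's loss into an approximate perplexity with `exp(loss)`. This is only a sketch: it treats the reported loss as an average per-token cross-entropy, which is an assumption, because padding and sample weighting can affect how Keras averages the loss. The loss values are copied from the log above.

```python
import math

# Per-epoch training losses copied from the log above.
epoch_losses = [1.1003, 0.9810, 0.9405, 0.9014, 0.8631]

# exp(cross-entropy) is the usual definition of perplexity for a language model.
perplexities = [math.exp(loss) for loss in epoch_losses]

for epoch, (loss, ppl) in enumerate(zip(epoch_losses, perplexities), start=1):
    print(f"epoch {epoch}: loss={loss:.4f}  exp(loss)={ppl:.2f}")
```

These numbers describe the training data only. A held-out set is needed to tell whether the improvement generalizes.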

### Note on mixed precision fine-tuning on NVIDIA GPUs



Full precision is recommended for fine-tuning. When fine-tuning on NVIDIA GPUs, you can use mixed precision (`keras.mixed_precision.set_global_policy('mixed_bfloat16')`) to speed up training with minimal effect on training quality. Mixed precision fine-tuning consumes more memory, so it is useful only on larger GPUs.





For inference, half-precision (`keras.config.set_floatx("bfloat16")`) will work and save memory while mixed precision is not applicable.
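
The memory saving from half precision follows from simple arithmetic on the parameter count. The sketch below estimates the memory needed for the weights alone, using the 2.6 billion parameter figure quoted earlier in this tutorial. It ignores activations, optimizer state, and any cache used during generation, so real usage is higher.

```python
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2}

num_params = 2.6e9  # approximate parameter count quoted earlier in this tutorial


def weight_memory_gb(num_params, dtype):
    """Memory for the model weights alone, in gigabytes (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in ("float32", "bfloat16"):
    print(f"{dtype}: {weight_memory_gb(num_params, dtype):.1f} GB")
```

Storing the weights in `bfloat16` halves this footprint compared with `float32`.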

## Inference after fine-tuning

After fine-tuning, responses follow the instruction provided in the prompt.

In [None]:
prompt = template.format(
    instruction="What common risk factors for Lymphocytic Choriomeningitis (LCMV) should be highlighted in patient education materials?",
    response="",
)
sampler = keras_nlp.samplers.TopKSampler(k=5, seed=2)
gemma_lm.compile(sampler=sampler)
print(gemma_lm.generate(prompt, max_length=256))

Instruction:
What common risk factors for Lymphocytic Choriomeningitis (LCMV) should be highlighted in patient education materials?

Response:
Scientists are not sure how the virus spreads from infected mice to humans. However, it is thought that people can become infected by touching infected mice or their feces, by breathing in virus-filled particles in the air, and by getting infected through contact with urine and other fluids from infected rodents.


In [None]:
# Define the prompt with an instruction about diagnosing LCMV
prompt = template.format(
    instruction="What are the primary diagnostic steps for LCMV, and what challenges may arise in accurately diagnosing it?",
    response=""
)

# Generate a response from the language model
print(gemma_lm.generate(prompt, max_length=256))



Instruction:
What are the primary diagnostic steps for LCMV, and what challenges may arise in accurately diagnosing it?

Response:
Tests that examine the urine and blood are used to test for LCMV. Tests that examine the urine and blood are also used to identify whether a person has antibodies (proof that the person has had a previous infection) against LCMV.
                    
        The tests used to detect and diagnose LCMV are:          
                
                    -   Urine immunoassay. The urine immunoassay is not used to diagnose LCMV in children younger than age 1. This test is used to diagnose LCMV infection and to monitor the effectiveness of antiviral treatment. This test is not available at all laboratories.     -   ELISA test for IgM antibodies to LCMV. The ELISA test for IgM antibody to LCMV can be used to diagnose acute LCMV infection. This test is not available at all laboratories.    -   ELISA test for IgG antibodies to LCMV.     -   PCR test for viral RNA (

In [None]:
# Define the prompt with an instruction about early symptoms of zoonotic infections
prompt = template.format(
    instruction="What are the early symptoms of common zoonotic infections, and how can they be differentiated from similar diseases?",
    response=""
)

# Generate a response from the language model
print(gemma_lm.generate(prompt, max_length=256))



Instruction:
What are the early symptoms of common zoonotic infections, and how can they be differentiated from similar diseases?

Response:
Most zoonotic infections have no signs or symptoms. Some people who become infected with a zoonotic agent will have flu-like symptoms, such as a fever or muscle aches.
                
The symptoms of zoonotic infections vary, depending on the specific pathogen that causes the infection. The following table shows some of the common zoonotic infections and the signs and symptoms they may cause.
                
     Sign and Symptoms of Zoonotic Infections   Infected person may have the following symptoms:          - Fever.     - Muscle aches.     - Headache.     - Sore throat.     - Nausea.     - Abdominal pain.     - Diarrhea.     - Rash.         Some zoonotic infections may cause more severe symptoms, such as:         - Coughing.     - Breathing problems (which can cause pneumonia).     - Eye problems, such as blindness.     - Confusion, weaknes

In [None]:
# Define the prompt with an instruction about diagnostic tests for viral infections with neurological symptoms
prompt = template.format(
    instruction="What diagnostic tests are most effective in early detection of viral infections with neurological symptoms?",
    response=""
)

# Generate a response from the language model
print(gemma_lm.generate(prompt, max_length=256))


Instruction:
What diagnostic tests are most effective in early detection of viral infections with neurological symptoms?

Response:
Blood tests and spinal fluid tests are used to identify and diagnose viral infections with neurological symptoms.
                    Blood tests. Blood tests are done to check for certain viruses and other substances in the blood. The following types of blood tests are used in the diagnosis of viral infections with neurological symptoms:         -   Antibody blood test (serologic test): This type of blood test checks for antibodies (proteins made by the immune system to fight infection) in the blood. The test detects antibodies that are made by the immune system in response to a certain virus. The results can indicate if a person has had a past infection with a certain virus and how recently the infection occurred.    -   Enzyme-linked immunosorbent assay (ELISA) test: This type of blood test can be done on a blood sample or a cerebrospinal fluid (CSF) sa

In [None]:
# Define the prompt with an instruction about managing viral infections in immunocompromised individuals
prompt = template.format(
    instruction="What are the key considerations for managing viral infections in immunocompromised individuals?",
    response=""
)

# Generate a response from the language model
print(gemma_lm.generate(prompt, max_length=256))


Instruction:
What are the key considerations for managing viral infections in immunocompromised individuals?

Response:
Key Points
                    - Immunocompromised patients should be treated with antiviral drugs to prevent or treat viral infections.    - The following antiviral drugs are used to treat viral infections in immunocompromised patients:         -  Antiviral drugs are available for treatment of herpesviruses.        -  Antiviral drugs are available for treatment of human immunodeficiency virus (HIV) infection and cytomegalovirus (CMV) infection.        -  Antiviral drugs are available for treatment of varicella-zoster virus (VZV) infection.        - Other antiviral drugs may be available for treatment of certain viral infections in immunocompromised patients.        - Patients should have regular tests to look for signs of infection.    - This summary was updated on July 28, 2016.
                
                
                    - Immunocompromised patients shoul

## Saving my fine-tuned model

In [None]:
# Upload the preset to Hugging Face Hub

#hf_uri = "hf://mgbam/finetune_gemma2_2b_en_medical_qa"



# Try uploading without specifying argument names

#keras_nlp.upload_preset(hf_uri, '/content/finetune_gemma2_2b_en_medical_qa')


## Inference with my fine-tuned model

In [None]:
#kaggle_username = "oluidiakhoa"



# Construct the Kaggle URI for uploading the preset as a new model variant

#kaggle_uri = f"kaggle://{kaggle_username}/gemma/keras/finetune1_gemma2_2b_en_medical_qa"

#finetuned_model = keras_nlp.models.GemmaCausalLM.from_preset(kaggle_uri)


In [None]:
#Define the prompt template

#template = "Instruction:\n{instruction}\n\nResponse:\n{response}"



#Format the example with an instruction for the model

#prompt = template.format( instruction="What is the medical definition of 'myelodysplastic syndrome'?", response="" )



#Set up a Top-K Sampler with k=5

#sampler = keras_nlp.samplers.TopKSampler(k=5, seed=2)



#Compile the fine-tuned model with the specified sampler

#finetuned_model.compile(sampler=sampler)



#Generate text based on the prompt with a maximum length of 256 tokens

#print(finetuned_model.generate(prompt, max_length=256))


In [None]:
# Optional: evaluate the model on held-out examples.

# For a KerasNLP causal LM, x_val is a list of strings. Each entry should be a full
# example (instruction plus reference response) formatted with the same template as
# the training data. The preprocessor derives the next-token labels from the text,
# so no separate y_val is needed. The questions below are placeholders only;
# replace them with your actual validation data.

#x_val = [
#    "How is chronic obstructive pulmonary disease (COPD) treated?",
#    "What are the symptoms of epilepsy?",
#    "What is the cause of osteoarthritis?",
#]

# evaluate() reports the loss and the compiled metrics, not perplexity. As a rough
# estimate, perplexity can be taken as exp(loss). Adjust batch_size as needed for
# your model and data size; keep it small on a T4 GPU.

# import math
# loss, accuracy = gemma_lm.evaluate(x_val, batch_size=1)
# print("Approximate perplexity: ", math.exp(loss))


In [None]:
# from sklearn.metrics import precision_score, recall_score, f1_score



# Generate predictions

# predictions = gemma_lm.predict(x_val)



# Assuming `y_val` contains true labels

# precision = precision_score(y_val, predictions, average='weighted')

# recall = recall_score(y_val, predictions, average='weighted')

# f1 = f1_score(y_val, predictions, average='weighted')



# print(f"Precision: {precision}, Recall: {recall}, F1 Score: {f1}")

In [None]:
#loss, accuracy = gemma_lm.evaluate(x_val, y_val)

#print(f"Validation Loss: {loss}, Validation Accuracy: {accuracy}")


In [None]:
#from nltk.translate.bleu_score import sentence_bleu



#reference = "This is the correct response."

#generated = gemma_lm.generate("Provide a response", max_length=256)

#score = sentence_bleu([reference.split()], generated.split())

#print("BLEU score: ", score)


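After fine-tuning, the model answers the medical questions directly, in the explanatory style of the training data, instead of the multiple-choice and bulleted formats it produced before. Some responses carry over irregular whitespace and are still cut off at the 256-token limit.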

Note that for demonstration purposes, this tutorial fine-tunes the model on a small subset of the dataset for five epochs and with a low LoRA rank value. To get better responses from the fine-tuned model, you can experiment with:



1. Increasing the size of the fine-tuning dataset

2. Training for more steps (epochs)

3. Setting a higher LoRA rank

4. Modifying the hyperparameter values such as `learning_rate` and `weight_decay`.

## Summary and next steps



This tutorial covered LoRA fine-tuning on a Gemma model using KerasNLP. Check out the following docs next:



* Learn how to [generate text with a Gemma model](https://ai.google.dev/gemma/docs/get_started).

* Learn how to perform [distributed fine-tuning and inference on a Gemma model](https://ai.google.dev/gemma/docs/distributed_tuning).

* Learn how to [use Gemma open models with Vertex AI](https://cloud.google.com/vertex-ai/docs/generative-ai/open-models/use-gemma).

* Learn how to [fine-tune Gemma using KerasNLP and deploy to Vertex AI](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model_garden/model_garden_gemma_kerasnlp_to_vertexai.ipynb).