##### Copyright 2024 Google LLC.

In [None]:
#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Finetune G_mini models in Keras using LoRA


<table class="tfo-notebook-buttons" align="left">
  <td>
    <a target="_blank" href="https://ai.google.dev/gemma/docs/lora_tuning"><img src="https://ai.google.dev/static/site-assets/images/docs/notebook-site-button.png" height="32" width="32" />View on ai.google.dev</a>
  </td>
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/google/generative-ai-docs/blob/main/site/en/gemma/docs/lora_tuning.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
  <td>
    <a target="_blank" href="https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/335"><img src="https://ai.google.dev/images/cloud-icon.svg" width="40" />Open in Vertex AI</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/google/generative-ai-docs/blob/main/site/en/gemma/docs/lora_tuning.ipynb"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" />View source on GitHub</a>
  </td>
</table>

## Overview

Large Language Models (LLMs) like G_mini can be adapted to specific tasks by fine-tuning them on domain-specific data. However, because these models have so many parameters, full fine-tuning, which updates every parameter, is often infeasible due to its high GPU memory requirements.

[Low Rank Adaptation (LoRA)](https://arxiv.org/abs/2106.09685) is a fine-tuning technique that greatly reduces the number of trainable parameters for downstream tasks by freezing the weights of the model and inserting a smaller number of new weights into it. This makes training with LoRA much faster and more memory-efficient, and it produces smaller model weights (a few hundred MBs).
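To make the idea concrete, here is a minimal NumPy sketch of a single LoRA-augmented dense layer. It is an illustration under stated assumptions, not the Keras NLP implementation: the layer sizes, the `alpha` scaling factor, and the names `W`, `A`, `B`, and `lora_dense` are all hypothetical. The frozen weight `W` is left untouched, and only the two small matrices `A` and `B` would be trained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes; real model layers are far larger.
d_in, d_out, rank = 64, 32, 4
alpha = 8.0  # Hypothetical LoRA scaling factor.

W = rng.normal(size=(d_in, d_out))             # Frozen pretrained weight.
A = rng.normal(scale=0.01, size=(d_in, rank))  # Trainable down-projection.
B = np.zeros((rank, d_out))                    # Trainable up-projection, zero-initialized.


def lora_dense(x):
    """Frozen projection plus a scaled low-rank update."""
    return x @ W + (alpha / rank) * (x @ A @ B)


x = rng.normal(size=(3, d_in))
y = lora_dense(x)

# Because B starts at zero, the adapted layer initially matches the frozen one.
print(np.allclose(y, x @ W))
```

Zero-initializing `B` follows the LoRA paper: training starts from the pretrained model's behavior and only gradually moves away from it.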

This tutorial walks you through using Keras NLP to perform LoRA fine-tuning on a G_mini model using the [IMDb dataset](https://www.kaggle.com/datasets/lakshmi25npathi/imdb-dataset-of-50k-movie-reviews).

## Setup

Import required libraries.

In [None]:
import os
from google.colab import userdata

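Set your Kaggle credentials as environment variables. They are read from Colab secrets and are needed to download the model weights from Kaggle.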

In [None]:
os.environ["KAGGLE_USERNAME"] = userdata.get('KAGGLE_USERNAME')
os.environ["KAGGLE_KEY"] = userdata.get('KAGGLE_KEY')

Install Keras and Keras NLP with the Gemma model.

In [None]:
# Install Keras 3 last. See https://keras.io/getting_started/ for more details.
!pip install -q -U keras-nlp
!pip install -q -U "keras>=3"


Set the Keras backend before importing Keras, because the backend cannot be changed once Keras has been imported. This tutorial uses PyTorch.

In [None]:
os.environ["KERAS_BACKEND"] = "torch"  # Or "jax" or "tensorflow".
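Because the backend is read when Keras is first imported, it can help to guard against setting it too late. The helper below is a hypothetical sketch (`select_keras_backend` and `SUPPORTED_BACKENDS` are not part of Keras): it validates the name against the three backends mentioned in this tutorial and reports whether Keras was already imported in the current session.

```python
import os
import sys

# The three backends named in this tutorial.
SUPPORTED_BACKENDS = ("jax", "tensorflow", "torch")


def select_keras_backend(name):
    """Hypothetical helper: set KERAS_BACKEND and report if it can still take effect."""
    if name not in SUPPORTED_BACKENDS:
        raise ValueError(f"Unsupported backend {name!r}; expected one of {SUPPORTED_BACKENDS}")
    already_imported = "keras" in sys.modules
    os.environ["KERAS_BACKEND"] = name
    return not already_imported


takes_effect = select_keras_backend("torch")
if not takes_effect:
    print("Keras was already imported; restart the runtime for the new backend to apply.")
```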

## Load Dataset

Use [TensorFlow Datasets (TFDS)](https://www.tensorflow.org/datasets/overview) to load the IMDb dataset.

In [None]:
import tensorflow_datasets as tfds

imdb_train = tfds.load(
    "imdb_reviews",
    split="train",
    as_supervised=True,
    batch_size=2,
)
# Drop labels.
imdb_train = imdb_train.map(lambda x, y: x)

imdb_train.unbatch().take(1).get_single_element()

Downloading and preparing dataset 80.23 MiB (download: 80.23 MiB, generated: Unknown size, total: 80.23 MiB) to /root/tensorflow_datasets/imdb_reviews/plain_text/1.0.0...


Dataset imdb_reviews downloaded and prepared to /root/tensorflow_datasets/imdb_reviews/plain_text/1.0.0. Subsequent calls will reuse this data.


<tf.Tensor: shape=(), dtype=string, numpy=b"This was an absolutely terrible movie. Don't be lured in by Christopher Walken or Michael Ironside. Both are great actors, but this must simply be their worst role in history. Even their great acting could not redeem this movie's ridiculous storyline. This movie is an early nineties US propaganda piece. The most pathetic scenes were those when the Columbian rebels were making their cases for revolutions. Maria Conchita Alonso appeared phony, and her pseudo-love affair with Walken was nothing but a pathetic emotional plug in a movie that was devoid of any real meaning. I am disappointed that there are movies like this, ruining actor's like Christopher Walken's good name. I could barely sit through it.">

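Use a subset of the dataset for faster training. Because the dataset was loaded with `batch_size=2`, `take(2000)` keeps 2,000 batches, that is, 4,000 of the 25,000 training reviews.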

In [None]:
imdb_train = imdb_train.take(2000)
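The effect of taking from an already batched dataset can be checked with a plain-Python analogy. This sketch does not use `tf.data`; the `batched` helper and the `range` stand-in for the reviews are hypothetical and only mirror the batch-then-take order used above.

```python
from itertools import islice


def batched(items, batch_size):
    """Yield consecutive lists of up to batch_size items."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


reviews = range(25_000)  # Stand-in for the 25,000 training reviews.

# Batch first (size 2), then take 2,000 elements: each element is a batch.
subset = list(islice(batched(reviews, 2), 2_000))

num_batches = len(subset)
num_examples = sum(len(batch) for batch in subset)
print(num_batches, num_examples)
```

The number of batches is also the number of optimizer steps in one training epoch.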

## Load Model

In [None]:
import keras
import keras_nlp

g_mini_lm = keras_nlp.models.GMiniCausalLM.from_preset("g_mini_2b_en")
g_mini_lm.summary()

Downloading from https://www.kaggle.com/api/v1/models/keras/g_mini/keras/g_mini_2b_en/2/download/config.json...
100%|██████████| 557/557 [00:00<00:00, 1.22MB/s]
Downloading from https://www.kaggle.com/api/v1/models/keras/g_mini/keras/g_mini_2b_en/2/download/model.weights.h5...
100%|██████████| 9.34G/9.34G [02:12<00:00, 75.7MB/s]
Downloading from https://www.kaggle.com/api/v1/models/keras/g_mini/keras/g_mini_2b_en/2/download/tokenizer.json...
100%|██████████| 404/404 [00:00<00:00, 925kB/s]
Downloading from https://www.kaggle.com/api/v1/models/keras/g_mini/keras/g_mini_2b_en/2/download/assets/tokenizer/vocabulary.spm...
100%|██████████| 4.04M/4.04M [00:00<00:00, 16.1MB/s]


### Inference before fine-tuning

Prompt the model with a movie title (*Modern Times*) and let it generate a continuation.

In [None]:
g_mini_lm.generate("Modern times ", max_length=64)

'Modern times 2022 is a year in which the world is facing a lot of changes. The most significant changes are taking place in the field of technology. The use of new and advanced technologies is becoming more and more prevalent in various areas, from healthcare to education.\n\nThe use of technology in healthcare'

The model generates fluent text, but the text is not about a movie.

## LoRA Fine-tuning

In this section, you will perform LoRA fine-tuning on the model.

In [None]:
# Enable LoRA for the model and set the LoRA rank to 4.
g_mini_lm.backbone.enable_lora(rank=4)
g_mini_lm.summary()

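Compare this summary with the one printed when the model was loaded: with LoRA enabled, the original weights are frozen and only the small low-rank adapter matrices remain trainable, so the number of trainable parameters drops sharply.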
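The size of that reduction follows from simple arithmetic: a rank-`r` adapter on a `d_in × d_out` projection adds `r * (d_in + d_out)` trainable weights, while the `d_in * d_out` original weights stay frozen. The sketch below uses a hypothetical 2048 × 2048 layer, not a size read from the model, and `lora_trainable_params` is an illustrative helper.

```python
def lora_trainable_params(d_in, d_out, rank):
    """Trainable weights added by one rank-`rank` LoRA adapter."""
    return rank * (d_in + d_out)


# Hypothetical layer size, chosen for illustration only.
d_in = d_out = 2048
full = d_in * d_out  # Weights in the frozen projection.

for rank in (4, 8, 16):
    lora = lora_trainable_params(d_in, d_out, rank)
    print(f"rank={rank}: {lora:,} trainable vs {full:,} frozen ({lora / full:.2%})")
```

A higher rank gives the adapter more capacity at the cost of more trainable weights, which is why the rank is a natural hyperparameter to tune.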

In [None]:
# Fine-tune on IMDb movie reviews.

# Limit the input sequence length to 128 (to control memory usage).
g_mini_lm.preprocessor.sequence_length = 128
# Use AdamW (a common optimizer for transformer models).
optimizer = keras.optimizers.AdamW(
    learning_rate=5e-5,
    weight_decay=0.01,
)
# Exclude layernorm and bias terms from decay.
optimizer.exclude_from_weight_decay(var_names=["bias", "scale"])

g_mini_lm.compile(
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=optimizer,
    weighted_metrics=[keras.metrics.SparseCategoricalAccuracy()],
)
g_mini_lm.fit(imdb_train, epochs=1)

2000/2000 ━━━━━━━━━━━━━━━━━━━━ 1642s 820ms/step - loss: 2.9327 - sparse_categorical_accuracy: 0.4073


<keras.src.callbacks.history.History at 0x7d01b1b332e0>
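To interpret the reported loss, the NumPy sketch below shows the idea behind sparse categorical cross-entropy computed from logits, with padding positions masked out, and converts the loss printed above into a perplexity. It is a simplified illustration: `masked_sparse_cross_entropy` is a hypothetical helper, the toy inputs are made up, and the exact reduction Keras applies may differ.

```python
import math

import numpy as np


def masked_sparse_cross_entropy(logits, targets, mask):
    """Mean negative log-likelihood of the target tokens over unmasked positions."""
    # Numerically stable log-softmax over the vocabulary axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    token_loss = -log_probs[np.arange(len(targets)), targets]
    return float((token_loss * mask).sum() / mask.sum())


# Toy example: 3 token positions, a vocabulary of 5, the last position is padding.
logits = np.zeros((3, 5))  # All-zero logits give a uniform distribution.
targets = np.array([0, 2, 4])
mask = np.array([1.0, 1.0, 0.0])

toy_loss = masked_sparse_cross_entropy(logits, targets, mask)
print(toy_loss, math.log(5))  # A uniform guess over 5 tokens costs log(5).

# Perplexity is the exponential of the cross-entropy loss.
reported_loss = 2.9327  # Loss shown in the training output above.
perplexity = math.exp(reported_loss)
print(f"perplexity ≈ {perplexity:.1f}")
```

Note that the value in the progress bar is averaged over the steps of the epoch, so it is only a rough indicator of the model's loss at the end of training.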

### Inference after fine-tuning

After fine-tuning, the same prompt produces text that reads like a movie synopsis, in the style of the IMDb reviews.

In [None]:
g_mini_lm.generate("Modern times ", max_length=64)

'Modern times 1970s. A group of young friends decide to take revenge on their high school bully. They plan an elaborate revenge on their bully. They decide to kidnap him and take him to the woods. They plan to shoot him to death and make it look like a suicide. But they are'

## Summary

This tutorial walked you through using Keras NLP to apply LoRA fine-tuning to a G_mini model on the IMDb dataset.