<a href="https://colab.research.google.com/github/weedge/doraemon-nb/blob/main/Fine_tune_Gemma_models_in_Keras_using_LoRA.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

##### Copyright 2024 Google LLC.

In [None]:
#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Fine-tune Gemma models in Keras using LoRA


<table class="tfo-notebook-buttons" align="left">
  <td>
    <a target="_blank" href="https://ai.google.dev/gemma/docs/lora_tuning"><img src="https://ai.google.dev/static/site-assets/images/docs/notebook-site-button.png" height="32" width="32" />View on ai.google.dev</a>
  </td>
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/google/generative-ai-docs/blob/main/site/en/gemma/docs/lora_tuning.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
  <td>
    <a target="_blank" href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/google/generative-ai-docs/main/site/en/gemma/docs/lora_tuning.ipynb"><img src="https://ai.google.dev/images/cloud-icon.svg" width="40" />Open in Vertex AI</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/google/generative-ai-docs/blob/main/site/en/gemma/docs/lora_tuning.ipynb"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" />View source on GitHub</a>
  </td>
</table>

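## Overview

Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models.

Large Language Models (LLMs) like Gemma have been shown to be effective at a variety of natural language processing (NLP) tasks. An LLM is first pre-trained on a large corpus of text in a self-supervised fashion. Pre-training helps LLMs learn general-purpose knowledge, such as statistical relationships between words. An LLM can then be fine-tuned with domain-specific data to perform downstream tasks (such as sentiment analysis).

LLMs are extremely large (parameters on the order of billions). Full fine-tuning (which updates all the parameters in the model) is not required for most applications because typical fine-tuning datasets are much smaller than the pre-training datasets.

[Low Rank Adaptation (LoRA)](https://arxiv.org/abs/2106.09685) is a fine-tuning technique that greatly reduces the number of trainable parameters for downstream tasks by freezing the weights of the model and inserting a smaller number of new weights. This makes training with LoRA much faster and more memory-efficient, and produces smaller model weights (a few hundred MB), while maintaining the quality of the model outputs.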
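
To make the idea of "freezing the weights and inserting a smaller number of new weights" concrete, here is a minimal pure-Python sketch of a single linear layer with a low-rank update. The matrix sizes, the values, and the helper names (`matmul`, `add`, `lora_forward`) are illustrative assumptions, not KerasNLP code. The second factor `B` starts at zero, a common choice so that the adapted layer initially behaves exactly like the frozen one.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Add two matrices of the same shape element-wise."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# Frozen pretrained weight: 3 inputs -> 3 outputs. Never updated during fine-tuning.
W = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 0.0],
     [3.0, 0.0, 1.0]]

# Trainable rank-1 factors (hypothetical values).
A = [[0.5], [-0.5], [1.0]]  # shape 3 x 1
B = [[0.0, 0.0, 0.0]]       # shape 1 x 3, initialized to zero


def lora_forward(x, W, A, B):
    """Compute x @ W + (x @ A) @ B: the frozen path plus the low-rank update."""
    return add(matmul(x, W), matmul(matmul(x, A), B))


x = [[1.0, 2.0, 3.0]]
print(lora_forward(x, W, A, B))  # identical to the frozen layer while B is zero
```

Only `A` and `B` would receive gradient updates during fine-tuning. In this toy case that is 6 values against 9 frozen ones; for realistically sized layers the gap is far larger.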

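This tutorial walks you through using KerasNLP to perform LoRA fine-tuning on a Gemma 2B model with the [Databricks Dolly 15k dataset](https://huggingface.co/datasets/databricks/databricks-dolly-15k). This dataset contains 15,000 high-quality, human-generated prompt/response pairs designed specifically for fine-tuning LLMs.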

## Setup

To complete this tutorial, you first need to follow the instructions at [Gemma setup](https://ai.google.dev/gemma/docs/setup). The Gemma setup instructions show you how to do the following:

- Get access to Gemma on [kaggle.com](https://kaggle.com).
- Select a Colab runtime with sufficient resources to run the Gemma 2B model.
- Generate and configure a Kaggle username and API key.

After you've completed the Gemma setup, move on to the next section, where you'll set environment variables for your Colab environment.

### Select the runtime

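To complete this tutorial, you need a Colab runtime with sufficient resources to run the Gemma model. In this case, you can use a T4 GPU:

1. In the upper-right of the Colab window, click the menu button (usually a triangle icon) to expand additional options.
2. In the drop-down menu, select **Change runtime type**.
3. Under **Hardware accelerator**, select **T4 GPU**.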


### Configure your API key

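To use Gemma, you must provide your Kaggle username and a Kaggle API key.

To generate a Kaggle API key, sign in to your Kaggle account, open the **Account** tab, and select **Create New Token**. This triggers the download of a `kaggle.json` file containing your API credentials.

In Colab, select **Secrets** (🔑) in the left pane and add your Kaggle username and Kaggle API key. Store your username under the name `KAGGLE_USERNAME` and your API key under the name `KAGGLE_KEY`.

For security, don't share your Kaggle username and API key publicly. Keep these credentials stored securely in your Colab environment.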

### Set environment variables

Set environment variables for `KAGGLE_USERNAME` and `KAGGLE_KEY`.

In [2]:
import os
from google.colab import userdata

# Note: `userdata.get` is a Colab API. If you're not using Colab, set the env
# vars as appropriate for your system.

os.environ["KAGGLE_USERNAME"] = userdata.get('KAGGLE_USERNAME')
os.environ["KAGGLE_KEY"] = userdata.get('KAGGLE_KEY')
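
Outside Colab, `userdata.get` is not available. One possible alternative, sketched below, reads the downloaded `kaggle.json` file (which holds `username` and `key` fields) and copies the values into the same environment variables. The helper name `load_kaggle_credentials` and the default path are assumptions for illustration.

```python
import json
import os
from pathlib import Path


def load_kaggle_credentials(path="~/.kaggle/kaggle.json"):
    """Copy the username and key from a kaggle.json file into env vars.

    The default path is an assumption; pass wherever you saved the file.
    """
    creds = json.loads(Path(path).expanduser().read_text())
    os.environ["KAGGLE_USERNAME"] = creds["username"]
    os.environ["KAGGLE_KEY"] = creds["key"]
    return creds["username"]
```

Call `load_kaggle_credentials()` once, before loading the model, in place of the two `userdata.get` lines.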

### Install dependencies

Install Keras, KerasNLP, and other dependencies.

In [3]:
# Install Keras 3 last. See https://keras.io/getting_started/ for more details.
!pip install -q -U keras-nlp
!pip install -q -U "keras>=3"

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorflow 2.15.0 requires keras<2.16,>=2.15.0, but you have keras 3.0.5 which is incompatible.
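
The pip message above reports a version mismatch between TensorFlow and the newly installed Keras 3. To confirm which versions actually ended up in the environment, a small sketch using the standard library's `importlib.metadata` may help. The helper name `installed_version` and the list of packages checked are illustrative choices.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Distribution names as used with pip; adjust to the packages you care about.
for name in ("keras", "keras-nlp", "tensorflow", "jax"):
    print(name, installed_version(name))
```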

### Select a backend

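Keras is a high-level, multi-framework deep learning API designed for simplicity and ease of use. Using Keras 3, you can run workflows on one of three backends: TensorFlow, JAX, or PyTorch.

For this tutorial, configure the backend for JAX.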

In [4]:
os.environ["KERAS_BACKEND"] = "jax"  # Or "torch" or "tensorflow".
# Avoid memory fragmentation on JAX backend.
os.environ["XLA_PYTHON_CLIENT_MEM_FRACTION"]="1.00"
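
Keras reads `KERAS_BACKEND` when the package is first imported, so the variable has to be set before `import keras` runs. The sketch below wraps that ordering rule in a small guard. The helper name `set_keras_backend` is hypothetical, and the list of accepted names is limited to the three backends mentioned above.

```python
import os
import sys

SUPPORTED_BACKENDS = ("jax", "tensorflow", "torch")  # the three named above


def set_keras_backend(name):
    """Set KERAS_BACKEND, refusing if Keras has already been imported."""
    if name not in SUPPORTED_BACKENDS:
        raise ValueError(
            f"Unknown backend {name!r}; expected one of {SUPPORTED_BACKENDS}"
        )
    if "keras" in sys.modules:
        raise RuntimeError(
            "Keras is already imported; restart the runtime to switch backends."
        )
    os.environ["KERAS_BACKEND"] = name


set_keras_backend("jax")
```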

### Import packages

Import Keras and KerasNLP.

In [5]:
import keras
import keras_nlp

## Load Dataset

In [6]:
!wget -O databricks-dolly-15k.jsonl https://huggingface.co/datasets/databricks/databricks-dolly-15k/resolve/main/databricks-dolly-15k.jsonl


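Preprocess the data. This tutorial uses a subset of 1000 training examples to execute the notebook faster. Consider using more training data for higher-quality fine-tuning.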



In [8]:
import json

# Prompt template shared by the training examples and the inference prompts below.
template = "Instruction:\n{instruction}\n\nResponse:\n{response}"

data = []
with open("databricks-dolly-15k.jsonl") as file:
    for line in file:
        features = json.loads(line)
        # Filter out examples with context, to keep it simple.
        if features["context"]:
            continue
        # Format the entire example as a single string.
        data.append(template.format(**features))

# Only use 1000 training examples, to keep it fast.
data = data[:1000]
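
To see what a single formatted training example looks like, the sketch below applies the same template to one made-up record. The record content is hypothetical; only the field names (`instruction`, `context`, `response`, `category`) follow the dataset's JSONL layout. Extra keys such as `category` are simply ignored by `str.format`.

```python
import json

template = "Instruction:\n{instruction}\n\nResponse:\n{response}"

# A made-up line in the same shape as the dataset's JSONL records.
line = json.dumps({
    "instruction": "Name a primary color.",
    "context": "",
    "response": "Red.",
    "category": "open_qa",
})

features = json.loads(line)
example = template.format(**features)
print(example)
```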

## Load Model

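KerasNLP provides implementations of many popular model architectures. In this tutorial, you'll create a model using `GemmaCausalLM`, an end-to-end Gemma model for causal language modeling. A causal language model predicts the next token based on previous tokens.

Create the model using the `from_preset` method: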


In [9]:
gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset("gemma_2b_en")
gemma_lm.summary()

Attaching 'config.json' from model 'keras/gemma/keras/gemma_2b_en/2' to your Colab notebook...
Attaching 'model.weights.h5' from model 'keras/gemma/keras/gemma_2b_en/2' to your Colab notebook...
Attaching 'tokenizer.json' from model 'keras/gemma/keras/gemma_2b_en/2' to your Colab notebook...
Attaching 'assets/tokenizer/vocabulary.spm' from model 'keras/gemma/keras/gemma_2b_en/2' to your Colab notebook...


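The `from_preset` method instantiates the model from a preset architecture and weights. In the code above, the string "gemma_2b_en" specifies the preset architecture: a Gemma model with 2 billion parameters.

NOTE: A Gemma model with 7 billion parameters is also available. To run the larger model in Colab, you need access to the premium GPUs available in paid plans. Alternatively, you can perform [distributed tuning on a Gemma 7B model](https://ai.google.dev/gemma/docs/distributed_tuning) on Kaggle or Google Cloud.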

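## Inference before fine tuning

In this section, you'll query the model with various prompts to see how it responds.

### Europe Trip Prompt

Query the model for suggestions on what to do on a trip to Europe.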

In [10]:
prompt = template.format(
    instruction="What should I do on a trip to Europe?",
    response="",
)
sampler = keras_nlp.samplers.TopKSampler(k=5, seed=2)
gemma_lm.compile(sampler=sampler)
print(gemma_lm.generate(prompt, max_length=256))

Instruction:
What should I do on a trip to Europe?

Response:
It's easy, you just need to follow these steps:

First you must book your trip with a travel agency.
Then you must choose a country and a city.
Next you must choose your hotel, your flight, and your travel insurance
And last you must pack for your trip.
 


What are the benefits of a travel agency?

Response:
Travel agents have the best prices, they know how to negotiate and they can find deals that you won't find on your own.

What are the disadvantages of a travel agency?

Response:
Travel agents are not as flexible as you would like. If you need to change your travel plans last minute, they may charge you a fee for that.
 


How do I choose a travel agency?

Response:
There are a few things you can do to choose the right travel agent. First, check to see if they are accredited by the Better Business Bureau. Second, check their website and see what kind of information they offer. Third, look at their reviews online to see 

The model responds with generic tips on how to plan a trip.
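
The `TopKSampler(k=5, seed=2)` used above restricts each sampling step to the five most likely next tokens. The following pure-Python sketch illustrates the general top-k idea on a toy logit vector. It is not KerasNLP's implementation, and the function name `top_k_sample` and the logit values are made up.

```python
import math
import random


def top_k_sample(logits, k, rng):
    """Sample an index from the k highest logits, weighted by softmax."""
    # Indices of the k largest logits.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over the kept logits only (subtract the max for stability).
    peak = max(logits[i] for i in top)
    weights = [math.exp(logits[i] - peak) for i in top]
    return rng.choices(top, weights=weights, k=1)[0]


rng = random.Random(2)  # fixed seed for repeatable draws
logits = [2.0, 0.5, -1.0, 3.0, 0.0, 1.5]
samples = [top_k_sample(logits, k=3, rng=rng) for _ in range(1000)]
print(sorted(set(samples)))  # only indices 0, 3 and 5 can ever appear
```

A smaller `k` makes the output more predictable, while a larger `k` allows more varied text.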

### ELI5 Photosynthesis Prompt

Prompt the model to explain photosynthesis in terms simple enough for a 5-year-old child to understand.

In [11]:
prompt = template.format(
    instruction="Explain the process of photosynthesis in a way that a child could understand.",
    response="",
)
print(gemma_lm.generate(prompt, max_length=256))

Instruction:
Explain the process of photosynthesis in a way that a child could understand.

Response:
Plants use light energy and carbon dioxide to make sugar and oxygen. This is a simple chemical change because the chemical bonds in the sugar and oxygen are unchanged. Plants also release oxygen during photosynthesis.

Instruction:
Explain how photosynthesis is an example of chemical change.

Response:
Photosynthesis is a chemical reaction that produces oxygen and sugar.

Instruction:
Explain how plants make their own food.

Response:
Plants use energy from sunlight to make sugar and oxygen during photosynthesis.

Instruction:
Explain how the chemical change in a plant during photosynthesis can be described as an example of a chemical reaction.

Response:
Photosynthesis is a chemical change that results in the formation of sugar from carbon dioxide, water, and energy from sunlight.

Instruction:
Explain the role of chlorophyll in plant photosynthesis.

Response:
Chlorophyll is a green 

The model response contains words that might not be easy for a child to understand, such as chlorophyll.
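
As the photosynthesis output above shows, the generated text echoes the prompt and then keeps producing further instruction/response pairs until `max_length` is reached. If only the first answer is of interest, a small post-processing helper can isolate it. The function below is a hypothetical sketch that relies on the `Instruction:` / `Response:` markers from this tutorial's template.

```python
def extract_first_response(generated, marker="Response:\n", stop="\n\nInstruction:"):
    """Return the text after the first marker, cut at the next instruction block."""
    _, _, tail = generated.partition(marker)
    head, _, _ = tail.partition(stop)
    return head.strip()


text = (
    "Instruction:\nWhat is 2 + 2?\n\nResponse:\n4\n\n"
    "Instruction:\nWhat is 3 + 3?\n\nResponse:\n6"
)
print(extract_first_response(text))
```

For example, `extract_first_response(gemma_lm.generate(prompt, max_length=256))` would return only the first answer. If the marker is missing, the helper returns an empty string.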

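## LoRA Fine-tuning

To get better responses from the model, fine-tune the model with Low Rank Adaptation (LoRA) using the Databricks Dolly 15k dataset.

The LoRA rank determines the dimensionality of the trainable matrices that are added to the original weights of the LLM. It controls the expressiveness and precision of the fine-tuning adjustments.

A higher rank means more detailed changes are possible, but also means more trainable parameters. A lower rank means less computational overhead, but potentially less precise adaptation.

This tutorial uses a LoRA rank of 4. In practice, begin with a relatively small rank (such as 4, 8, or 16), which is computationally efficient for experimentation. Train your model with this rank and evaluate the performance improvement on your task. Gradually increase the rank in subsequent trials and see if that further boosts performance.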
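
The trade-off between rank and trainable parameters can be made concrete with a little arithmetic. For a weight matrix of shape `d_in x d_out`, a rank-`r` update adds two factors of shape `d_in x r` and `r x d_out`, so `r * (d_in + d_out)` trainable values. The sketch below tabulates this for the suggested ranks. The 2048 x 2048 layer size is a hypothetical example, not a statement about Gemma's actual layer shapes.

```python
def lora_trainable_params(d_in, d_out, rank):
    """Number of values in the two low-rank factors for one weight matrix."""
    return rank * (d_in + d_out)


d_in, d_out = 2048, 2048  # hypothetical layer size, for illustration only
full = d_in * d_out       # values in the frozen weight matrix

for rank in (4, 8, 16):
    n = lora_trainable_params(d_in, d_out, rank)
    print(f"rank={rank:>2}: {n:>7,} trainable vs {full:,} frozen ({n / full:.2%})")
```

Doubling the rank doubles the number of trainable values, yet for a layer of this size even rank 16 stays under 2% of the frozen matrix.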

In [12]:
# Enable LoRA for the model and set the LoRA rank to 4.
gemma_lm.backbone.enable_lora(rank=4)
gemma_lm.summary()

Note that enabling LoRA reduces the number of trainable parameters significantly (from 2.5 billion to 1.3 million).

In [13]:
# Limit the input sequence length to 512 (to control memory usage).
gemma_lm.preprocessor.sequence_length = 512
# Use AdamW (a common optimizer for transformer models).
optimizer = keras.optimizers.AdamW(
    learning_rate=5e-5,
    weight_decay=0.01,
)
# Exclude layernorm and bias terms from decay.
optimizer.exclude_from_weight_decay(var_names=["bias", "scale"])

gemma_lm.compile(
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=optimizer,
    weighted_metrics=[keras.metrics.SparseCategoricalAccuracy()],
)
gemma_lm.fit(data, epochs=1, batch_size=1)

1000/1000 ━━━━━━━━━━━━━━━━━━━━ 1464s 1s/step - loss: 0.4587 - sparse_categorical_accuracy: 0.5230


<keras.src.callbacks.history.History at 0x7b71a026c1c0>

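### Note on mixed precision fine-tuning on NVIDIA GPUs

Full precision is recommended for fine-tuning. When fine-tuning on NVIDIA GPUs, note that you can use mixed precision (`keras.mixed_precision.set_global_policy('mixed_bfloat16')`) to speed up training with minimal effect on training quality. Mixed precision fine-tuning does consume more memory, so it is useful only on larger GPUs.


For inference, half precision (`keras.config.set_floatx("bfloat16")`) works and saves memory, while mixed precision is not applicable.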
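
A rough back-of-the-envelope estimate shows why half precision saves memory at inference time. The sketch below counts only the bytes needed to store the weights, using the approximate 2.5 billion parameter figure quoted earlier. Activations, optimizer state, and framework overhead are deliberately left out, so real usage will be higher.

```python
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2}


def weight_memory_gb(num_params, dtype):
    """Memory needed to store the weights alone, in gigabytes (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


num_params = 2_500_000_000  # approximate figure quoted in this tutorial

for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gb(num_params, dtype):.1f} GB")
```

Under these assumptions the weights take about 10 GB in `float32` and about 5 GB in `bfloat16`.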

In [14]:
# Uncomment the line below if you want to enable mixed precision training on GPUs
# keras.mixed_precision.set_global_policy('mixed_bfloat16')

## Inference after fine-tuning

After fine-tuning, responses follow the instruction provided in the prompt.

### Europe Trip Prompt


In [15]:
prompt = template.format(
    instruction="What should I do on a trip to Europe?",
    response="",
)
sampler = keras_nlp.samplers.TopKSampler(k=5, seed=2)
gemma_lm.compile(sampler=sampler)
print(gemma_lm.generate(prompt, max_length=256))

Instruction:
What should I do on a trip to Europe?

Response:
The first thing on my list is to visit the Louvre in Paris.  This world-famous museum has the Mona Lisa, Venus de Milo, and other world treasures.  I would love to see the Mona Lisa.  I would also like to visit other art museums in Paris, but I would like to see some other art museums in Europe as well.  I am a big fan of classical music, and the concert venues of Europe are some of my favorites, so I would like to check out some concerts as well.


The model now recommends places to visit in Europe.

### ELI5 Photosynthesis Prompt


In [16]:
prompt = template.format(
    instruction="Explain the process of photosynthesis in a way that a child could understand.",
    response="",
)
print(gemma_lm.generate(prompt, max_length=256))

Instruction:
Explain the process of photosynthesis in a way that a child could understand.

Response:
Photosynthesis is a chemical process in which plants and some other organisms convert light (sunlight) energy to chemical energy. This chemical energy is used to build sugars. Photosynthesis occurs in the chloroplasts, which are organelles that contain chlorophyll. Chlorophyll is a green molecule that absorbs light energy.


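The model now explains photosynthesis in simpler terms.

Note that for demonstration purposes, this tutorial fine-tunes the model on a small subset of the dataset for just one epoch and with a low LoRA rank value. To get better responses from the fine-tuned model, you can experiment with:

1. Increasing the size of the fine-tuning dataset
2. Training for more steps (epochs)
3. Setting a higher LoRA rank
4. Modifying the hyperparameter values such as `learning_rate` and `weight_decay`.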
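
Before scaling up the dataset or the number of epochs, it can help to estimate how long training will take. The sketch below extrapolates from the run logged earlier (1000 steps in 1464 seconds). It assumes the time per step stays constant, which is only a rough approximation, since step time depends on batch size, sequence length, and hardware. The helper name `estimate_fit_seconds` and the 5000-example scenario are hypothetical.

```python
import math


def estimate_fit_seconds(num_examples, batch_size, epochs, seconds_per_step):
    """Estimate wall-clock training time, assuming constant time per step."""
    steps = math.ceil(num_examples / batch_size) * epochs
    return steps * seconds_per_step


seconds_per_step = 1464 / 1000  # taken from the training log in this tutorial

# Hypothetical scenario: 5000 examples for 2 epochs at batch size 1.
estimate = estimate_fit_seconds(5000, batch_size=1, epochs=2,
                                seconds_per_step=seconds_per_step)
print(f"about {estimate / 3600:.1f} hours")
```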

## Summary and next steps

This tutorial covered LoRA fine-tuning on a Gemma model using KerasNLP. Check out the following docs next:

* Learn how to [generate text with a Gemma model](https://ai.google.dev/gemma/docs/get_started).
* Learn how to perform [distributed fine-tuning and inference on a Gemma model](https://ai.google.dev/gemma/docs/distributed_tuning).
* Learn how to [use Gemma open models with Vertex AI](https://cloud.google.com/vertex-ai/docs/generative-ai/open-models/use-gemma).
* Learn how to [fine-tune Gemma using KerasNLP and deploy to Vertex AI](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/model_garden/model_garden_gemma_kerasnlp_to_vertexai.ipynb).