# Parameter-Efficient Fine-Tuning (PEFT) of Gemma Models with Hugging Face Transformers and the PEFT Library
## Goal
Fine-tune Gemma so that it can generate famous quotes.

### Required libraries
- pip install bitsandbytes (or, to install from the official PyPI index: pip install -i https://pypi.org/simple/ bitsandbytes)
- pip install git+https://github.com/huggingface/peft.git
- pip install accelerate
- pip install datasets
- pip install transformers
- pip install torch
- pip install trl
### Log in to your Hugging Face account
- huggingface-cli login
### The code must run on a GPU

In [1]:
from peft import LoraConfig

lora_config = LoraConfig(
    r=8,
    target_modules=["q_proj", "o_proj", "k_proj", "v_proj", "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
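
To give a sense of what `r=8` means, here is a minimal sketch of the extra trainable parameters LoRA adds to a single linear layer. It assumes the standard LoRA formulation, in which a frozen `d_out x d_in` weight gets two low-rank factors of shapes `r x d_in` and `d_out x r`. The 2048 x 2048 layer size and the helper name `lora_param_count` are illustrative assumptions, not values read from the Gemma config or the PEFT library.

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in linear layer."""
    # Factor A has shape (r, d_in) and factor B has shape (d_out, r).
    return r * d_in + d_out * r


# Illustrative layer size (an assumption, not taken from the Gemma config).
d_in, d_out, r = 2048, 2048, 8

full = d_in * d_out
added = lora_param_count(d_in, d_out, r)
print(f"full weight: {full:,} params")
print(f"LoRA adds:   {added:,} params ({added / full:.2%} of the full weight)")
```

Because the added parameter count grows linearly with `r`, doubling the rank doubles the adapter size for every module listed in `target_modules`.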

In [None]:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
import os

model_id = "google/gemma-2b"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

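# The token below is left blank; supply your own Hugging Face access token, or read it
# from the HF_TOKEN environment variable as in the commented-out alternatives.
# tokenizer = AutoTokenizer.from_pretrained(model_id, token=os.environ['HF_TOKEN'])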
tokenizer = AutoTokenizer.from_pretrained(
    model_id, cache_dir="/content/drive/MyDrive/llm/gemma/model", token="")
# model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config, device_map={"":0}, token=os.environ['HF_TOKEN'])
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config, device_map={"": 0}, cache_dir="/content/drive/MyDrive/llm/gemma/model", token="")
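
As a rough illustration of why the model is loaded in 4-bit, the sketch below estimates the memory needed for the weights alone at different precisions. The parameter count of 2.5 billion is an approximate assumption for `google/gemma-2b`, the helper `weight_memory_gib` is hypothetical, and the estimate ignores quantization metadata, activations, and optimizer state.

```python
def weight_memory_gib(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights only, in GiB."""
    return num_params * bits_per_param / 8 / 1024**3


# Approximate parameter count (an assumption for illustration).
num_params = 2.5e9

for label, bits in [("bf16", 16), ("int8", 8), ("nf4 (4-bit)", 4)]:
    print(f"{label:>12}: {weight_memory_gib(num_params, bits):.2f} GiB")
```

The actual footprint on the GPU will be higher than these figures, so treat them as a lower bound when choosing hardware.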

Test the Gemma model with a fairly well-known quote.

In [4]:
text = "Quote: Imagination is more"
device = "cuda:0"
inputs = tokenizer(text, return_tensors="pt").to(device)

outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Quote: Imagination is more important than knowledge. Knowledge is limited. Imagination encircles the world.

- Albert Einstein

The


The model produces a reasonable completion, although it emits a few extra tokens at the end.
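
One way to deal with the trailing tokens is to cut the decoded text after the attribution line. The sketch below is a plain string post-processing step, not part of the original notebook; the function name `truncate_after_author` and the marker strings are assumptions chosen to match the two output formats shown in this notebook.

```python
def truncate_after_author(text: str, markers=("Author:", "- ")) -> str:
    """Keep everything up to and including the first attribution line."""
    lines = text.splitlines()
    for i, line in enumerate(lines):
        # str.startswith accepts a tuple and matches any of its entries.
        if line.strip().startswith(markers):
            return "\n".join(lines[: i + 1]).rstrip()
    # No attribution line found: return the text unchanged apart from trailing whitespace.
    return text.rstrip()


sample = (
    "Quote: Imagination is more important than knowledge. "
    "Knowledge is limited. Imagination encircles the world.\n"
    "\n"
    "- Albert Einstein\n"
    "\n"
    "The"
)
print(truncate_after_author(sample))
```

Applied to the completion above, this drops the dangling `The` and keeps the quote together with its attribution.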

## Start fine-tuning

### 1. Load the English quotes dataset

In [6]:
from datasets import load_dataset

data = load_dataset("Abirate/english_quotes", cache_dir="/content/drive/MyDrive/llm/gemma/data")
data = data.map(lambda samples: tokenizer(samples["quote"]), batched=True)



### 2. Fine-tune

In [None]:
import transformers
from trl import SFTTrainer


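def formatting_func(example):
    # Depending on the TRL version, this receives either a single row or a
    # batch (lists of values). Handle both, and format every row of a batch
    # rather than only the first one.
    if isinstance(example["quote"], str):
        return f"Quote: {example['quote']}\nAuthor: {example['author']}"
    return [
        f"Quote: {quote}\nAuthor: {author}"
        for quote, author in zip(example["quote"], example["author"])
    ]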


trainer = SFTTrainer(
    model=model,
    train_dataset=data["train"],
    args=transformers.TrainingArguments(
        per_device_train_batch_size=1,
        gradient_accumulation_steps=4,
        warmup_steps=2,
        max_steps=10,
        learning_rate=2e-4,
        fp16=True,
        logging_steps=1,
        output_dir="outputs",
        optim="paged_adamw_8bit"
    ),
    peft_config=lora_config,
    formatting_func=formatting_func,
)
trainer.train()
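
To make the training arguments above more concrete, the following sketch works out the effective batch size, the number of examples seen, and the learning rate at each step. It assumes a single GPU and the default linear scheduler of `TrainingArguments` (linear warmup, then linear decay to zero); `linear_schedule_lr` is a hypothetical helper written for illustration, not a transformers function.

```python
def linear_schedule_lr(step: int, base_lr: float, warmup_steps: int, total_steps: int) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr * max(0, total_steps - step) / max(1, total_steps - warmup_steps)


# Values copied from the TrainingArguments above.
per_device_batch, grad_accum, max_steps = 1, 4, 10

# Examples consumed per optimizer step on one GPU (an assumption: single device).
effective_batch = per_device_batch * grad_accum
examples_seen = effective_batch * max_steps

print(f"effective batch size: {effective_batch}")
print(f"examples seen in {max_steps} steps: {examples_seen}")
for step in range(max_steps + 1):
    lr = linear_schedule_lr(step, 2e-4, warmup_steps=2, total_steps=max_steps)
    print(f"step {step:2d}: lr = {lr:.2e}")
```

With these settings the run only touches a few dozen examples, which is enough to shift the output format toward `Quote: ... Author: ...` but is far from a full pass over the dataset.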

In [20]:
text = "Quote: Imagination is"
device = "cuda:0"
inputs = tokenizer(text, return_tensors="pt").to(device)

outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Quote: Imagination is more important than knowledge. Knowledge is limited. Imagination encircles the world.

Author: Albert Einstein


## Fine-tuning on TPU: accelerating with FSDP via SPMD
Hugging Face transformers supports PyTorch/XLA's latest FSDP implementation, which can significantly speed up fine-tuning. To enable it, add an FSDP config to transformers.Trainer:

In [None]:
from transformers import DataCollatorForLanguageModeling, Trainer, TrainingArguments

# Set up the FSDP config. To enable FSDP via SPMD, set xla_fsdp_v2 to True.
fsdp_config = {
    "fsdp_transformer_layer_cls_to_wrap": ["GemmaDecoderLayer"],
    "xla": True,
    "xla_fsdp_v2": True,
    "xla_fsdp_grad_ckpt": True
}

# Finally, set up the trainer and train the model.
trainer = Trainer(
    model=model,
    train_dataset=data["train"],  # Trainer expects a single split, not the whole DatasetDict
    args=TrainingArguments(
        per_device_train_batch_size=64,  # This is actually the global batch size for SPMD.
        num_train_epochs=100,
        max_steps=-1,
        output_dir="./output",
        optim="adafactor",
        logging_steps=1,
        dataloader_drop_last=True,  # Required for SPMD.
        fsdp="full_shard",
        fsdp_config=fsdp_config,
    ),
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
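
Since `per_device_train_batch_size` acts as the global batch size under SPMD and `dataloader_drop_last=True` discards the final partial batch, the number of optimizer steps per epoch can be estimated as below. This is a sketch under stated assumptions: the train split size of 2508 rows is assumed rather than read from the dataset, gradient accumulation is left at its default of 1, and `steps_per_epoch` is a hypothetical helper.

```python
def steps_per_epoch(num_examples: int, global_batch_size: int, drop_last: bool) -> int:
    """Number of batches in one pass over the data."""
    full_batches, remainder = divmod(num_examples, global_batch_size)
    if drop_last or remainder == 0:
        return full_batches
    # Without drop_last, the leftover examples form one extra, smaller batch.
    return full_batches + 1


num_examples = 2508      # assumed size of the train split
global_batch_size = 64   # per_device_train_batch_size under SPMD

steps = steps_per_epoch(num_examples, global_batch_size, drop_last=True)
dropped = num_examples - steps * global_batch_size
print(f"steps per epoch: {steps}, examples dropped per epoch: {dropped}")
print(f"total steps for 100 epochs: {steps * 100}")
```

In practice, `len(data["train"])` gives the real split size to plug in for `num_examples`.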