# Federated ChatGLM Tuning with Parameter Efficient Methods in FATE-LLM

In this tutorial, we demonstrate how to efficiently train a federated ChatGLM-6B model using the FATE-LLM framework. FATE-LLM introduces the "pellm" (Parameter Efficient Large Language Model) module, designed specifically for federated learning with large language models. It supports parameter-efficient methods in federated learning, reducing communication overhead while maintaining model performance. This tutorial focuses on ChatGLM-6B and on fine-tuning it with adapters, which reduces communication volume and improves overall efficiency.


## ChatGLM-6B

ChatGLM-6B is an open bilingual language model based on the General Language Model (GLM) architecture. It is a large transformer-based model with 6.2 billion parameters, trained on about 1T tokens of Chinese and English text. You can download the pretrained model from [here](https://huggingface.co/THUDM/chatglm-6b), or let the program download it automatically the first time you use it.

## Dataset: Advertising Text Generation

This is an advertising text generation dataset. Download it from either of the following links and place it in the examples/data folder:
- [data link 1](https://drive.google.com/file/d/13_vf0xRTQsyneRKdD1bZIr93vBGOczrk/view)
- [data link 2](https://cloud.tsinghua.edu.cn/f/b3f119a008264b1cabd1/?dl=1)

For more details about the data, refer to [this paper](https://aclanthology.org/D19-1321.pdf).

In [5]:
import pandas as pd
df = pd.read_json('${fate_install}/examples/data/AdvertiseGen/train.json', lines=True)

In [6]:
df

Unnamed: 0,content,summary
0,类型#裤*版型#宽松*风格#性感*图案#线条*裤型#阔腿裤,宽松的阔腿裤这两年真的吸粉不少，明星时尚达人的心头爱。毕竟好穿时尚，谁都能穿出腿长2米的效果...
1,类型#裙*风格#简约*图案#条纹*图案#线条*图案#撞色*裙型#鱼尾裙*裙袖长#无袖,圆形领口修饰脖颈线条，适合各种脸型，耐看有气质。无袖设计，尤显清凉，简约横条纹装饰，使得整身...
2,类型#上衣*版型#宽松*颜色#粉红色*图案#字母*图案#文字*图案#线条*衣样式#卫衣*衣款...,宽松的卫衣版型包裹着整个身材，宽大的衣身与身材形成鲜明的对比描绘出纤瘦的身形。下摆与袖口的不...
3,类型#裙*版型#宽松*材质#雪纺*风格#清新*裙型#a字*裙长#连衣裙,踩着轻盈的步伐享受在午后的和煦风中，让放松与惬意感为你免去一身的压力与束缚，仿佛要将灵魂也寄...
4,类型#上衣*材质#棉*颜色#蓝色*风格#潮*衣样式#polo*衣领型#polo领*衣袖长#短...,想要在人群中脱颖而出吗？那么最适合您的莫过于这款polo衫短袖，采用了经典的polo领口和柔...
...,...,...
114594,类型#上衣*风格#运动*风格#休闲*衣样式#外套*衣领型#立领*衣袖长#长袖*衣门襟#拉链*...,基础的外套廓形，直筒，立领长袖，中间金属拉链穿脱，方便实用，带有浓浓的休闲运动味道。日常休闲...
114595,类型#上衣*风格#街头*图案#创意*衣样式#卫衣,在这件卫衣上，BRAND-white集合了女性化的柔美还有不变的街头风采，<UNK><UNK...
114596,类型#裙*版型#宽松*版型#显瘦*颜色#黑色*图案#撞色*裙型#直筒裙*裙款式#拼接,采用简洁大体的黑色格调，宽松舒适的裙子内里，配上落肩的袖子拼接，不惧夏日的炎热，穿出清凉舒适...
114597,类型#上衣*颜色#黑色*颜色#紫色*风格#性感*图案#字母*图案#文字*图案#线条*图案#刺...,卫衣的短款长度设计能够适当地露出腰线，打造出纤瘦的身材十分性感。衣身的字母刺绣图案有着小巧的...
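
Each `content` value in the rows above is a flat string of `key#value` pairs joined by `*`. As a minimal sketch, the pairs can be split with the standard library to inspect which attributes a record carries. The helper name `parse_content` is hypothetical and is not part of FATE-LLM.

```python
from typing import List, Tuple


def parse_content(content: str) -> List[Tuple[str, str]]:
    """Split an AdvertiseGen-style 'key#value*key#value' string into pairs."""
    pairs = []
    for item in content.split("*"):
        # partition keeps everything after the first '#' as the value
        key, _, value = item.partition("#")
        pairs.append((key, value))
    return pairs


# Content field of row 0 in the output above
sample = "类型#裤*版型#宽松*风格#性感*图案#线条*裤型#阔腿裤"
pairs = parse_content(sample)
print(pairs)
```

A list of tuples is used instead of a dict because a key such as `图案` can appear more than once in the same record.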


## ChatGLM-6B with Adapter

In this section, we will guide you through the process of finetuning ChatGLM-6B with adapters using the FATE-LLM framework. Before starting this section, we recommend that you read through this tutorial first: [Model Customization](https://github.com/FederatedAI/FATE/blob/master/doc/tutorial/pipeline/nn_tutorial/Homo-NN-Customize-Model.ipynb).

The ChatGLM model is located at fate_llm/model_zoo/pellm/chatglm.py and can be used directly.

In [7]:
! ls ../../../fate/python/fate_llm/model_zoo/pellm

albert.py  bert.py     deberta.py     gpt2.py			  __pycache__
bart.py    chatglm.py  distilbert.py  parameter_efficient_llm.py  roberta.py


#### Adapters

We can use adapters directly from the peft library. See [Adapter Methods](https://huggingface.co/docs/peft/index) for details. By specifying the adapter name and the adapter
config dict, we can insert adapters into our language models:

In [12]:
from peft import LoraConfig, TaskType

# define the LoRA config; ChatGLM-6B's attention projection is named 'query_key_value'
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    inference_mode=False, r=8, lora_alpha=32, lora_dropout=0.1,
    target_modules=['query_key_value'],
)
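
To see why LoRA keeps communication small, it helps to count the weights it adds. For a frozen linear layer of shape `d_in x d_out`, LoRA with rank `r` trains two matrices of shapes `d_in x r` and `r x d_out`. The sketch below uses `r=8` from the config and assumes ChatGLM-6B's published shapes (hidden size 4096, 28 transformer layers, a fused `query_key_value` projection of width 3 x 4096). These shapes are not stated in this tutorial, so treat the result as an estimate.

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable weights LoRA adds to one d_in x d_out linear layer."""
    return r * (d_in + d_out)


# Assumed ChatGLM-6B shapes (not stated in this tutorial).
HIDDEN_SIZE = 4096
NUM_LAYERS = 28
QKV_OUT = 3 * HIDDEN_SIZE  # fused query/key/value projection
TOTAL_PARAMS = 6.2e9       # "6.2 billion parameters" from the model description

per_layer = lora_param_count(HIDDEN_SIZE, QKV_OUT, r=8)
lora_total = per_layer * NUM_LAYERS
fraction = lora_total / TOTAL_PARAMS

print(f"LoRA weights per layer: {per_layer:,}")
print(f"LoRA weights in total:  {lora_total:,}")
print(f"Share of full model:    {fraction:.4%}")
```

Under these assumptions only a few million values would need to be exchanged per aggregation round, compared with billions for the full model.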

#### Init ChatGLM Model 

In [14]:
import torch as t
from pipeline import fate_torch_hook
from pipeline.component.nn import save_to_fate_llm
fate_torch_hook(t)

model_path = "your download chatglm path"
model = t.nn.Sequential(
    t.nn.CustModel(module_name='pellm.chatglm', class_name='ChatGLMForConditionalGeneration',
                   peft_config=lora_config.to_dict(), peft_type='LoraConfig',
                   pretrained_path=model_path)
)


**During training, all weights of the pretrained language model are frozen and only the adapter weights are trainable. FATE-LLM therefore trains only the adapters locally and aggregates only the adapters' weights during federation.**
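
The aggregation step can be pictured as a weighted average over adapter parameters only. The following is a minimal pure-Python sketch of that idea, not FATE-LLM's actual `fedavg_trainer` implementation. The function name, the `lora_` name filter, the sample-count weighting, and the use of plain lists in place of tensors are all illustrative assumptions.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]


def aggregate_adapters(client_weights: List[Weights],
                       sample_counts: List[int],
                       adapter_key: str = "lora_") -> Weights:
    """Sample-weighted average of parameters whose name contains adapter_key."""
    total = sum(sample_counts)
    # Frozen backbone parameters are skipped: they never leave the client.
    names = [n for n in client_weights[0] if adapter_key in n]
    aggregated = {}
    for name in names:
        size = len(client_weights[0][name])
        aggregated[name] = [
            sum(w[name][i] * c for w, c in zip(client_weights, sample_counts)) / total
            for i in range(size)
        ]
    return aggregated


# Two hypothetical clients with one adapter tensor and one frozen tensor each
guest = {"layer0.lora_A": [1.0, 2.0], "layer0.weight": [9.0, 9.0]}
host = {"layer0.lora_A": [3.0, 6.0], "layer0.weight": [9.0, 9.0]}
result = aggregate_adapters([guest, host], sample_counts=[100, 300])
print(result)
```

Only `layer0.lora_A` appears in the result, which mirrors the point that the frozen pretrained weights take no part in aggregation.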

See [Adapters Overview](https://huggingface.co/docs/peft/index) for the adapters currently available.


#### Init DeepSpeed Config

In [15]:
ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "optimizer": {
        "type": "Adam",
        "params": {
            "lr": 5e-4
        }
    },
    "bf16": {
        "enabled": True
    },
    "zero_optimization": {
        "stage": 2,
        "allgather_partitions": True,
        "allgather_bucket_size": 5e8,
        "overlap_comm": False,
        "reduce_scatter": True,
        "reduce_bucket_size": 5e8,
        "contiguous_gradients": True
    }
}


## Submit Federated Task
To run the federated task, make sure you use fate>=v1.11.2 and deploy it on GPU machines. Before running this code, make sure the training data paths are already bound. The following code should be copied into a script and run from the command line, for example `python federated_chatglm.py`.

You can use this script to submit the model, but training takes a long time and generates a long log, so we do not run it here.

In [None]:
import torch as t
import os
from pipeline import fate_torch_hook
from pipeline.component import HomoNN
from pipeline.backend.pipeline import PipeLine
from pipeline.component import Reader
from pipeline.interface import Data
from pipeline.runtime.entity import JobParameters

fate_torch_hook(t)


guest_0 = 9999
host_1 = 10000
pipeline = PipeLine().set_initiator(role='guest', party_id=guest_0).set_roles(guest=guest_0, host=host_1,
                                                                              arbiter=guest_0)
data_guest = {"name": "ad_guest", "namespace": "experiment"}
data_host = {"name": "ad_host", "namespace": "experiment"}
guest_data_path = "${fate_install}/examples/data/AdvertiseGen/train.json_guest"
host_data_path = "${fate_install}/examples/data/AdvertiseGen/train.json_host"
# make sure the guest's and host's training data are already bound beforehand

reader_0 = Reader(name="reader_0")
reader_0.get_party_instance(role='guest', party_id=guest_0).component_param(table=data_guest)
reader_0.get_party_instance(role='host', party_id=host_1).component_param(table=data_host)

# Add your pretrained model path here; the model and tokenizer will be loaded from this path

from peft import LoraConfig, TaskType
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    inference_mode=False, r=8, lora_alpha=32, lora_dropout=0.1,
    target_modules=['query_key_value'],
)
ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "optimizer": {
        "type": "Adam",
        "params": {
            "lr": 5e-4
        }
    },
    "fp16": {
        "enabled": True
    },
    "zero_optimization": {
        "stage": 2,
        "allgather_partitions": True,
        "allgather_bucket_size": 5e8,
        "overlap_comm": False,
        "reduce_scatter": True,
        "reduce_bucket_size": 5e8,
        "contiguous_gradients": True
    }
}

model_path = "your download chatglm path"
from pipeline.component.homo_nn import DatasetParam, TrainerParam
model = t.nn.Sequential(
    t.nn.CustModel(module_name='pellm.chatglm', class_name='ChatGLMForConditionalGeneration',
                   peft_config=lora_config.to_dict(), peft_type='LoraConfig',
                   pretrained_path=model_path)
)

# DatasetParam
dataset_param = DatasetParam(dataset_name='glm_tokenizer', text_max_length=64, tokenizer_name_or_path=model_path,
                             padding_side="left")
# TrainerParam
trainer_param = TrainerParam(trainer_name='fedavg_trainer', epochs=5, batch_size=4, checkpoint_save_freqs=1, pin_memory=False, task_type="seq_2_seq_lm",
                             data_loader_worker=8, secure_aggregate=False, save_to_local_dir=True, # pay attention to this parameter
                             collate_fn="DataCollatorForSeq2Seq")


nn_component = HomoNN(name='nn_0', model=model , ds_config=ds_config)

# set parameter for client 1
nn_component.get_party_instance(role='guest', party_id=guest_0).component_param(
    dataset=dataset_param,
    trainer=trainer_param,
    torch_seed=100
)

# set parameter for client 2
nn_component.get_party_instance(role='host', party_id=host_1).component_param(
    dataset=dataset_param,
    trainer=trainer_param,
    torch_seed=100
)

# set parameter for server
nn_component.get_party_instance(role='arbiter', party_id=guest_0).component_param(
    trainer=trainer_param
)

pipeline.add_component(reader_0)
pipeline.add_component(nn_component, data=Data(train_data=reader_0.output.data))
pipeline.compile()

pipeline.fit(JobParameters(task_conf={
    "nn_0": {
        "launcher": "deepspeed",
        "world_size": 8 # world_size means num of gpus to train in a single client
    }
}))
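
The script above reads `train.json_guest` and `train.json_host`, but this tutorial does not say how those files were produced. One plausible way, shown here purely as an assumption, is to split the JSON-lines file `train.json` between the two parties. The helper `split_jsonl` is hypothetical, and binding the resulting files to FATE tables is not covered by this sketch.

```python
import json
import tempfile
from pathlib import Path


def split_jsonl(src: Path, guest_dst: Path, host_dst: Path) -> tuple:
    """Alternate non-empty lines of src between guest_dst and host_dst."""
    counts = [0, 0]
    with open(src, encoding="utf-8") as fin, \
         open(guest_dst, "w", encoding="utf-8") as f_guest, \
         open(host_dst, "w", encoding="utf-8") as f_host:
        outputs = (f_guest, f_host)
        n = 0
        for line in fin:
            line = line.strip()
            if not line:
                continue
            outputs[n % 2].write(line + "\n")
            counts[n % 2] += 1
            n += 1
    return tuple(counts)


# Demonstration on a small temporary file instead of the real dataset
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "train.json"
    rows = [{"content": f"c{i}", "summary": f"s{i}"} for i in range(5)]
    src.write_text("\n".join(json.dumps(r) for r in rows), encoding="utf-8")
    guest_n, host_n = split_jsonl(src,
                                  Path(tmp) / "train.json_guest",
                                  Path(tmp) / "train.json_host")
    print(guest_n, host_n)
```

Alternating lines keeps the two shards roughly equal in size. A real deployment would more likely start from data that each party already holds separately.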


### Training With P-Tuning V2 Adapter

To use another adapter such as P-Tuning V2, only slight changes are needed:

In [20]:
from pipeline.component.homo_nn import DatasetParam, TrainerParam
model = t.nn.Sequential(
    t.nn.CustModel(module_name='pellm.chatglm', class_name='ChatGLMForConditionalGeneration',
                   pre_seq_len=128, # only this parameter is needed
                   pretrained_path=model_path)
)
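
As a rough guide to what `pre_seq_len=128` implies for the number of trainable weights, the sketch below counts the entries of a prefix embedding table. It assumes the prefix encoder is a plain embedding with one key vector and one value vector per layer for each virtual token, with no projection MLP, as in the upstream ChatGLM-6B P-Tuning v2 setup. Whether FATE-LLM's wrapper uses exactly this layout, and the model shapes used here (28 layers, hidden size 4096), are assumptions not stated in this tutorial.

```python
def prefix_param_count(pre_seq_len: int, num_layers: int, hidden_size: int) -> int:
    """Entries in a prefix embedding table: one key and one value vector
    per layer for each of the pre_seq_len virtual tokens."""
    return pre_seq_len * num_layers * hidden_size * 2


# Assumed ChatGLM-6B shapes (not stated in this tutorial).
NUM_LAYERS = 28
HIDDEN_SIZE = 4096

prefix_params = prefix_param_count(pre_seq_len=128,
                                   num_layers=NUM_LAYERS,
                                   hidden_size=HIDDEN_SIZE)
print(f"Trainable prefix weights: {prefix_params:,}")
```

The count scales linearly with `pre_seq_len`, so halving the prefix length would halve the volume exchanged per round under these assumptions.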

## Inference

Models trained with FATE-LLM can be found under the directory `${fate_install}/fateflow/model/$jobids/$cpn_name/{model.pkl, checkpoint_xxx.pkl/adapter_model.bin}`. Make sure `save_to_local_dir=True` is set in the trainer parameters.  
The following code is an example of loading trained LoRA adapter weights:

In [None]:
import json
import sys
import torch
from peft import PeftModel, PeftConfig, LoraConfig, TaskType, get_peft_model
from transformers import AutoModel, AutoTokenizer


def load_model(pretrained_model_path):
    _tokenizer = AutoTokenizer.from_pretrained(pretrained_model_path, trust_remote_code=True)
    _model = AutoModel.from_pretrained(pretrained_model_path, trust_remote_code=True)

    _model = _model.half()
    _model = _model.eval()

    return _model, _tokenizer


def load_data(data_path):
    with open(data_path, "r") as fin:
        for _l in fin:
            yield json.loads(_l.strip())

chatglm_model_path = ""  # set this to your downloaded ChatGLM-6B path
model, tokenizer = load_model(chatglm_model_path)

test_data_path = "${fate_install}/examples/data/AdvertiseGen/dev.json"
dataset = load_data(test_data_path)

peft_path = "your trained adapter_model.bin path"  # adapter weights saved by the federated job
peft_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    inference_mode=False, r=8, lora_alpha=32, lora_dropout=0.1,
    target_modules=['query_key_value'],
)

model = get_peft_model(model, peft_config)
model.load_state_dict(torch.load(peft_path), strict=False)
model = model.half()
model.eval()

for p in model.parameters():
    if p.requires_grad:
        print(p)

model.cuda("cuda:0")

content = "advertisement keywords"
model.chat(tokenizer, content, do_sample=False)
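
The `dataset` generator loaded above is not consumed by the example. A minimal sketch of running generation over a few of its records follows. It assumes each record in `dev.json` has `content` and `summary` fields like the training data, and that `model.chat` returns a `(response, history)` pair as in the upstream ChatGLM-6B interface. The helper `generate_ads` and the stand-in `fake_chat` are hypothetical and exist only so the sketch runs without a GPU.

```python
from typing import Callable, Dict, Iterable, List


def generate_ads(records: Iterable[Dict[str, str]],
                 chat_fn: Callable[[str], str],
                 limit: int = 3) -> List[Dict[str, str]]:
    """Run chat_fn on the 'content' field of up to `limit` records."""
    results = []
    for i, record in enumerate(records):
        if i >= limit:
            break
        results.append({
            "content": record["content"],
            "reference": record["summary"],
            "prediction": chat_fn(record["content"]),
        })
    return results


# Stand-in for the real model. With the model loaded above, one could pass:
#   lambda text: model.chat(tokenizer, text, do_sample=False)[0]
def fake_chat(text: str) -> str:
    return f"<generated ad for: {text}>"


demo_records = [{"content": "类型#裙*风格#简约", "summary": "简约的裙子"}]
outputs = generate_ads(demo_records, fake_chat)
print(outputs[0]["prediction"])
```

Keeping the reference summary next to each prediction makes it straightforward to compare them by eye or to feed both into a text-generation metric later.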