### Install Git LFS

In [None]:
!curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
!sudo apt-get install git-lfs && git lfs install

### Download the model

In [None]:
!git clone https://huggingface.co/THUDM/chatglm-6b ../chatglm-6b

### Install dependencies

In [None]:
!pip install -r ./requirement.txt

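### Prepare the dataset

Put your data in `.jsonl` format in the `./data` directory. Each line is one JSON object in the following format:


{"q": "question", "a": "answer"}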
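
A malformed row is easier to catch before tokenization than during training. The following is a minimal sketch, not part of this repository, that writes records in the `q`/`a` format and checks every line. The helper names `write_jsonl` and `validate_jsonl` and the sample records are hypothetical and exist only for illustration.

```python
import json
import tempfile
from pathlib import Path


def write_jsonl(records, path):
    """Write one JSON object per line; ensure_ascii=False keeps non-ASCII text readable."""
    with open(path, "w", encoding="utf-8") as f:
        for rec in records:
            f.write(json.dumps(rec, ensure_ascii=False) + "\n")


def validate_jsonl(path, prompt_key="q", target_key="a"):
    """Return the number of valid rows; raise ValueError on the first bad row."""
    count = 0
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            row = json.loads(line)  # invalid JSON raises a ValueError subclass
            if not isinstance(row, dict):
                raise ValueError(f"line {lineno}: expected a JSON object")
            for key in (prompt_key, target_key):
                value = row.get(key)
                if not isinstance(value, str) or not value:
                    raise ValueError(f"line {lineno}: missing or empty {key!r}")
            count += 1
    return count


# Hypothetical sample records, for illustration only.
records = [
    {"q": "What is LoRA?", "a": "A low-rank adaptation method for fine-tuning."},
    {"q": "What is ChatGLM-6B?", "a": "An open bilingual dialogue language model."},
]

# Write to a temporary directory so the sketch does not touch ./data.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "dataset.jsonl"
    write_jsonl(records, sample_path)
    n_rows = validate_jsonl(sample_path)
    print(n_rows)
```

To check a real file, call `validate_jsonl("./data/dataset.jsonl")`, adjusting the path and keys to match your data.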


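### Tokenize the dataset

To avoid re-tokenizing the dataset on every training run, tokenize it once up front and save the resulting features as a dataset that can be used directly for training. Parameters:

* model_checkpoint: model directory
* input_file: file name of the dataset under `./data`
* prompt_key: field in the dataset that holds the prompt (here, `q`)
* target_key: field in the dataset that holds the completion (here, `a`)
* save_name: name under which the tokenized dataset is saved; the tokenized data is written under `./data/tokenized_data`
* max_seq_length: maximum length of the text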

In [None]:
!CUDA_VISIBLE_DEVICES=0 python ./script/tokenize_dataset_rows.py \
    --input_file dataset.jsonl \
    --prompt_key q \
    --target_key a \
    --save_name dataset \
    --max_seq_length 2000 \
    --skip_overlength False
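
The command above passes `--skip_overlength`, which the parameter list does not describe. One plausible reading, and it is only an assumption because the script itself is not shown here, is that rows longer than `max_seq_length` are dropped when the flag is true and truncated when it is false. The sketch below illustrates that policy on plain lists of token ids; `handle_overlength` is a hypothetical helper, not a function from `tokenize_dataset_rows.py`.

```python
def handle_overlength(token_ids, max_seq_length, skip_overlength):
    """Return the ids to keep, or None if the row should be dropped.

    Assumed policy (not confirmed against the script):
    skip_overlength=True drops rows that are too long,
    skip_overlength=False truncates them to max_seq_length.
    """
    if len(token_ids) <= max_seq_length:
        return token_ids
    if skip_overlength:
        return None  # drop the whole row
    return token_ids[:max_seq_length]  # keep the row, cut off the tail


# Stand-ins for real tokenizer output: rows of 2,500 and 10 token ids.
long_row = list(range(2500))
short_row = list(range(10))

truncated = handle_overlength(long_row, max_seq_length=2000, skip_overlength=False)
skipped = handle_overlength(long_row, max_seq_length=2000, skip_overlength=True)
unchanged = handle_overlength(short_row, max_seq_length=2000, skip_overlength=True)
print(len(truncated), skipped, len(unchanged))
```

Truncation keeps every row but may cut off the end of the answer, while skipping keeps only complete examples at the cost of a smaller dataset.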

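### Fine-tune with LoRA

Parameters:

* tokenized_dataset: directory of the tokenized dataset (the `save_name` value from the previous step)
* lora_rank: LoRA rank; 4 or 8 is recommended, use 8 if GPU memory allows
* per_device_train_batch_size: batch size per GPU
* gradient_accumulation_steps: number of gradient accumulation steps; increases the effective batch size without increasing GPU memory usage
* max_steps: number of training steps
* save_steps: save a checkpoint every this many steps
* save_total_limit: maximum number of checkpoints to keep
* logging_steps: log training status (loss, lr, etc.) every this many steps
* output_dir: directory where the model files are saved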
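
The interaction between batch size, gradient accumulation, and step count is easier to see with numbers. The following sketch works through the arithmetic using the values from this notebook's training command and the `len(dataset)=1199` line in its log. It assumes the script forwards these flags unchanged to the Hugging Face `Trainer`, which keeps the most recent checkpoints when `save_total_limit` is set.

```python
# Values taken from the training command and log in this notebook.
per_device_train_batch_size = 1
gradient_accumulation_steps = 2
max_steps = 2000
save_steps = 200
save_total_limit = 2
dataset_size = 1199  # from the "len(dataset)=1199" log line
num_gpus = 1         # CUDA_VISIBLE_DEVICES=0 exposes a single GPU

# Samples contributing to each optimizer update.
effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_gpus
)

# Total samples processed and the approximate number of passes over the data.
samples_seen = effective_batch_size * max_steps
epochs = samples_seen / dataset_size

# Steps at which a checkpoint is written, and the ones still on disk at the end
# (assumption: the most recent save_total_limit checkpoints are kept).
checkpoint_steps = list(range(save_steps, max_steps + 1, save_steps))
kept = checkpoint_steps[-save_total_limit:]

print(effective_batch_size, samples_seen, round(epochs, 2), kept)
# -> 2 4000 3.34 [1800, 2000]
```

Under these assumptions, each optimizer update sees 2 samples, and the run covers the dataset a little more than three times.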

In [6]:
# Delete the previously fine-tuned model
# !rm -rf /mnt/workspace/glm-fine-tuning/weights

!CUDA_VISIBLE_DEVICES=0 python ./script/chatglm_lora_tuning.py \
    --tokenized_dataset dataset \
    --lora_rank 8 \
    --per_device_train_batch_size 1 \
    --gradient_accumulation_steps 2 \
    --max_steps 2000 \
    --save_steps 200 \
    --save_total_limit 2 \
    --learning_rate 5e-6 \
    --fp16 \
    --remove_unused_columns false \
    --logging_steps 50 \
    --output_dir ./weights/api-fn

Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
'\nlen(dataset)=1199\n'
Explicitly passing a `revision` is encouraged when loading a configuration with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Loading checkpoint shards: 100%|██████████████████| 8/8 [00:07<00:00,  1.07it/s]
{'': 0}
You are adding a <class 'transformers.integrations.TensorBoardCallback'> to the callbacks of this Trainer, but there is already one. The currentlist of callbacks is
:DefaultFlowCallback
TensorBoardCallback
  0%|                                                  | 0/2000 [00:00<?, ?it/s]`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...
{'loss': 2.9936, 'learning_rate': 4.8875

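### Load the fine-tuned model

The fine-tuned model is saved in the `output_dir` configured in the previous step. At a minimum, the `adapter_model.bin` and `adapter_config.json` files from that directory are required for deployment to succeed.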
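
Before starting a service, it can help to confirm that both adapter files are present. This is a minimal sketch; `missing_adapter_files` is a hypothetical helper, and the demo runs against a temporary directory rather than the real `output_dir`.

```python
import tempfile
from pathlib import Path

# The two files named above as the minimum needed for deployment.
REQUIRED_FILES = ("adapter_model.bin", "adapter_config.json")


def missing_adapter_files(output_dir):
    """Return the required adapter files that are absent from output_dir."""
    out = Path(output_dir)
    return [name for name in REQUIRED_FILES if not (out / name).is_file()]


# Demo: a directory that contains only the config file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "adapter_config.json").write_text("{}", encoding="utf-8")
    demo_missing = missing_adapter_files(tmp)
    print(demo_missing)  # -> ['adapter_model.bin']
```

For the training run in this notebook, the call would be `missing_adapter_files("./weights/api-fn")`; an empty list means both files are in place.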

### Start the web service

In [None]:
!python ./server/web.py

Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a `revision` is encouraged when loading a configuration with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Loading checkpoint shards: 100%|██████████████████| 8/8 [00:07<00:00,  1.02it/s]
  user_input = gr.Textbox(show_label=False, placeholder="Input...", lines=10).style(
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Traceback (most recent call last):
  File "/home/pai/lib/python3.9/site-packages/gradio/routes.py", line 439, in run_predict
    output = await app.get_blocks().process_api(
  File "/home/pai/lib/python3.9/site-packages/gradio/blocks.py", line 1384, in process_

### Test the model through the API service

In [5]:
# Install localtunnel to expose the service

!npm install -g localtunnel

[K[?25h/etc/dsw/node/bin/lt -> /etc/dsw/node/lib/node_modules/localtunnel/bin/lt.jsming[0m [35maction:finalize[0m[0m[K
+ localtunnel@2.0.2
added 22 packages from 22 contributors in 10.96s


In [None]:
!lt --port 6006

In [None]:
# Run the ChatGLM API server in the background
get_ipython().system_raw("python ./server/api.py &")
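
Because the server is launched in the background and has to load the model first, requests sent immediately may fail. The sketch below polls a TCP port until it accepts connections. `wait_for_port` is a hypothetical helper, and port 6006 is an assumption taken from the `lt --port 6006` cell; the port that `./server/api.py` actually listens on is not shown here.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)  # not up yet; retry after a short pause
    return False
```

A typical call would be `wait_for_port("127.0.0.1", 6006, timeout=300)`. An open port only shows that the process is listening, not that the model has finished loading.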