Support restuning/side/lora-embedding/qlora, etc #48
Merged: wenmengzhou merged 78 commits into modelscope:main from tastelikefeet:feat/replace_lora on Sep 15, 2023
Commits
- ff9f473 try to instead lora with peft lora (tastelikefeet)
- 5b00f11 Revert "try to instead lora with peft lora" (tastelikefeet)
- 3d1a618 try to add bnb & gptq linear (tastelikefeet)
- 18de6ff add more code (tastelikefeet)
- e836d27 fix bug (tastelikefeet)
- 09e267c lint code (tastelikefeet)
- 07e93c2 wip (tastelikefeet)
- a9ff312 1. prompt&adapter support endwith match 2. llm_sft supports mix tuners (tastelikefeet)
- 30b3e8a add restuner (tastelikefeet)
- cbb0b2f add tests (tastelikefeet)
- f8d6b09 temp (tastelikefeet)
- 0e6ed9b Merge branch 'feat/replace_lora' into test/eval (tastelikefeet)
- ec27d41 add restuner test (tastelikefeet)
- 5c8b401 test (tastelikefeet)
- 534d4c9 wip (tastelikefeet)
- 7734e98 wip (tastelikefeet)
- 0be303b refine code (tastelikefeet)
- 2ec9061 add generation config (tastelikefeet)
- 9538890 fix (tastelikefeet)
- 55ffd9e fix (tastelikefeet)
- eaf9fc9 fix (tastelikefeet)
- 462d3ca add perf (tastelikefeet)
- f956f93 add perf info (tastelikefeet)
- df194a2 fix (tastelikefeet)
- e7cf7f7 fix (tastelikefeet)
- 20b0772 revert code (tastelikefeet)
- a9c1f96 Merge commit 'f925f4297268bbc6a14a12157cbb23a06a225cfb' into feat/rep… (tastelikefeet)
- 7f07766 Merge branch 'feat/replace_lora' into test/eval (tastelikefeet)
- cebdd11 support activate/deactivate adapter (tastelikefeet)
- fa4db29 Merge branch 'feat/replace_lora' into test/eval (tastelikefeet)
- 8511247 fix indent (tastelikefeet)
- d27fe8d fix (tastelikefeet)
- fc2a110 fix bugs (tastelikefeet)
- 8528a71 fix (tastelikefeet)
- 7b56a77 fix inference (tastelikefeet)
- 95ffddf update code (tastelikefeet)
- 3193aae fix and pass pre-commit (tastelikefeet)
- 85c6ce4 Merge commit '975272521e951776644af3137e58709988c29909' into feat/rep… (tastelikefeet)
- c13ea0e fix (tastelikefeet)
- 00ad79b add logger (tastelikefeet)
- f3a5126 add perf item (tastelikefeet)
- 5655f90 fix comments (tastelikefeet)
- 8742751 fix comments (tastelikefeet)
- 8470274 update readme (tastelikefeet)
- f1d6de3 Fixbug (tastelikefeet)
- 564b7d7 fix (tastelikefeet)
- a6cf632 fix comments (tastelikefeet)
- 985f4a0 fix CI (tastelikefeet)
- 985780c support thread local (tastelikefeet)
- 9a2777c fix CI (tastelikefeet)
- b5f46d2 fix bug (tastelikefeet)
- 3da517b fix bug (tastelikefeet)
- a9426f3 support tuner on one module (tastelikefeet)
- d40c6aa fix lora (tastelikefeet)
- e7fa13e fixbug (tastelikefeet)
- f317637 update unittest (tastelikefeet)
- cf737ec fix bug (tastelikefeet)
- e64e302 update unittest (tastelikefeet)
- 479c866 fix type claim (tastelikefeet)
- 9cf1917 add test (tastelikefeet)
- ddc815c add test (tastelikefeet)
- 0868f61 add docs (tastelikefeet)
- 539cea8 update doc (tastelikefeet)
- c10c443 pre-commit passed (tastelikefeet)
- 0faee0f fix (tastelikefeet)
- caf83a5 fix bug (tastelikefeet)
- 103d8c6 Merge branch 'main' into feat/replace_lora (Jintao-Huang)
- aacecfe fix bugs (Jintao-Huang)
- 38bd482 update sh (Jintao-Huang)
- a522bbf update (Jintao-Huang)
- c3cab0d fix bug (tastelikefeet)
- 14cbaac fix arg (tastelikefeet)
- 903cb34 fix bugs (Jintao-Huang)
- a0b806d Merge remote-tracking branch (Jintao-Huang)
- 2aa0182 temporary commit (Jintao-Huang)
- 0503e75 Merge branch 'main' into feat/replace_lora (Jintao-Huang)
- 6a57b10 fix bugs (Jintao-Huang)
- 166d3c0 merge branch (Jintao-Huang)
@@ -0,0 +1,103 @@

# Introduction

Swift is an open-source framework for lightweight training and inference of LLMs. Its core capability is `efficient tuners`: additional structures that are dynamically attached to a model at runtime. During training, the parameters of the original model are frozen and only the tuner parameters are updated, which enables fast training and lower GPU memory usage. The most commonly used tuner is LoRA.

In short, the framework provides the following features:

- **Efficient tuners with SOTA capabilities**: combine them with large models for lightweight training and inference (on commercial-grade GPUs) with good results
- **Trainer integrated with the ModelScope Hub**: built on the `transformers` trainer, it supports training LLMs and uploading trained models to the [ModelScope Hub](https://www.modelscope.cn/models)
- **Runnable model examples**: training and inference scripts for popular large models, plus preprocessing logic for popular open-source datasets, ready to run directly

# Quick Start

This chapter explains how to quickly install swift, set up the runtime environment, and run a first example.

Installing swift is very simple; in a python>=3.8 environment, just run:

```shell
pip install ms-swift
```

The code below uses LoRA to train the `bert-base-uncased` model on a classification task:

**Before running the code below, also install modelscope:**

```shell
pip install modelscope>=1.9.0
```

```python
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'

from modelscope import AutoModelForSequenceClassification, AutoTokenizer, MsDataset
from transformers import default_data_collator

from swift import Trainer, LoRAConfig, Swift, TrainingArguments


model = AutoModelForSequenceClassification.from_pretrained(
    'AI-ModelScope/bert-base-uncased', revision='v1.0.0')
tokenizer = AutoTokenizer.from_pretrained(
    'AI-ModelScope/bert-base-uncased', revision='v1.0.0')
lora_config = LoRAConfig(target_modules=['query', 'key', 'value'])
model = Swift.prepare_model(model, config=lora_config)

train_dataset = MsDataset.load('clue', subset_name='afqmc', split='train').to_hf_dataset().select(range(100))
val_dataset = MsDataset.load('clue', subset_name='afqmc', split='validation').to_hf_dataset().select(range(100))


def tokenize_function(examples):
    return tokenizer(examples["sentence1"], examples["sentence2"],
                     padding="max_length", truncation=True, max_length=128)


train_dataset = train_dataset.map(tokenize_function)
val_dataset = val_dataset.map(tokenize_function)

arguments = TrainingArguments(
    output_dir='./outputs',
    per_device_train_batch_size=16,
)

trainer = Trainer(model, arguments, train_dataset=train_dataset,
                  eval_dataset=val_dataset,
                  data_collator=default_data_collator)

trainer.train()
```

In the example above, `bert-base-uncased` is used as the base model, LoRA modules are patched onto the three Linear layers 'query', 'key' and 'value', and a training run is performed.

After training finishes, you will see an outputs folder with the following structure:

> outputs
>
> |-- checkpoint-xx
>
> |-- configuration.json
>
> |-- default
>
> |-- adapter_config.json
>
> |-- adapter_model.bin
>
> |-- ...

This folder can be used directly for inference:

```python
from modelscope import AutoModelForSequenceClassification, AutoTokenizer
from swift import Trainer, LoRAConfig, Swift


model = AutoModelForSequenceClassification.from_pretrained(
    'AI-ModelScope/bert-base-uncased', revision='v1.0.0')
tokenizer = AutoTokenizer.from_pretrained(
    'AI-ModelScope/bert-base-uncased', revision='v1.0.0')
lora_config = LoRAConfig(target_modules=['query', 'key', 'value'])
model = Swift.from_pretrained(model, model_id='./outputs/checkpoint-21')

print(model(**tokenizer('this is a test', return_tensors='pt')))
```
@@ -0,0 +1,25 @@

# Installation and Usage

## Installing the wheel package

Install with pip:

```shell
pip install ms-swift
```

## Installing from source

```shell
git clone https://github.com/modelscope/swift.git
cd swift
pip install -e .
```
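
A quick way to verify the installation from Python (assuming the package exposes a `__version__` attribute, as most pip-installable packages do):

```python
# Print the installed version to confirm that swift can be imported.
import swift

print(swift.__version__)
```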

## Notebook environment

Most of the models that Swift supports for training can be used on an `A10` GPU, and users can take advantage of the free GPU resources offered by ModelScope:

1. Go to the official [ModelScope](https://www.modelscope.cn) website and log in
2. Click `My Notebook` on the left and start a free GPU instance
3. Enjoy the free A10 GPU
@@ -0,0 +1,123 @@

# Swift API

## Using Swift for training

Call `Swift.prepare_model()` to attach tuners to a model:

```python
from modelscope import Model
from swift import Swift, LoRAConfig
import torch
model = Model.from_pretrained('ZhipuAI/chatglm2-6b', torch_dtype=torch.bfloat16, device_map='auto')
lora_config = LoRAConfig(
    r=16,
    target_modules=['query_key_value'],
    lora_alpha=32,
    lora_dropout=0.)
model = Swift.prepare_model(model, lora_config)
# use model to do other things
```

Multiple tuners can also be used at the same time:

```python
from modelscope import Model
from swift import Swift, LoRAConfig, AdapterConfig
import torch
model = Model.from_pretrained('ZhipuAI/chatglm2-6b', torch_dtype=torch.bfloat16, device_map='auto')
lora_config = LoRAConfig(
    r=16,
    target_modules=['query_key_value'],
    lora_alpha=32,
    lora_dropout=0.)
adapter_config = AdapterConfig(
    dim=model.config.hidden_size,
    target_modules=['mlp'],
    method_name='forward',
    hidden_pos=0,
    adapter_length=32,
)
model = Swift.prepare_model(model, {'first_tuner': lora_config, 'second_tuner': adapter_config})
# use model to do other things
```

When using multiple tuners, the second argument must be a Dict whose keys are the tuner names and whose values are the tuner configs.

After training, call:

```python
model.save_pretrained(save_directory='./output')
```

to save a model checkpoint. The checkpoint files only contain the weights of the tuners, not the weights of the base model itself. The saved structure is:

> outputs
>
> |-- configuration.json
>
> |-- first_tuner
>
> |-- adapter_config.json
>
> |-- adapter_model.bin
>
> |-- second_tuner
>
> |-- adapter_config.json
>
> |-- adapter_model.bin
>
> |-- ...

If a single config is passed instead, the default name `default` is used:

> outputs
>
> |-- configuration.json
>
> |-- default
>
> |-- adapter_config.json
>
> |-- adapter_model.bin
>
> |-- ...

## Using Swift at inference time

Use `Swift.from_pretrained()` to load a checkpoint saved after training:

```python
from modelscope import Model
from swift import Swift
import torch
model = Model.from_pretrained('ZhipuAI/chatglm2-6b', torch_dtype=torch.bfloat16, device_map='auto')
model = Swift.from_pretrained(model, './output')
```

## Loading multiple tuners and using them in parallel across threads

When a model is deployed for serving, a single model may well serve multiple HTTP threads at the same time, where each thread represents one type of user request. Swift supports activating different tuners in different threads:

```python
from modelscope import Model
from swift import Swift
import torch
model = Model.from_pretrained('ZhipuAI/chatglm2-6b', torch_dtype=torch.bfloat16, device_map='auto')
# Assume './output' contains four trained tuners: a, b, c and d
model = Swift.from_pretrained(model, './output')

# Assume two request types: one uses tuners a, b and c, the other uses a, c and d
type_1 = ['a', 'b', 'c']
type_2 = ['a', 'c', 'd']

def request(_input, _type):
    if _type == 'type_1':
        model.set_active_adapters(type_1)
    elif _type == 'type_2':
        model.set_active_adapters(type_2)
    return model(**_input)
```

Using the same tuner in different threads is safe.
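
For completeness, a minimal sketch of driving both request types from separate threads, continuing the snippet above; `example_input` is a hypothetical placeholder for an already-tokenized request:

```python
import threading

# Hypothetical placeholder; in practice this would be something like
# tokenizer(text, return_tensors='pt') for the served model.
example_input = {}

def worker(_type):
    # Adapter activation is thread-local by default, so each worker can
    # switch to its own tuner combination without affecting the other thread.
    output = request(example_input, _type)
    print(_type, 'finished')

threads = [threading.Thread(target=worker, args=(name,)) for name in ('type_1', 'type_2')]
for t in threads:
    t.start()
for t in threads:
    t.join()
```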
@@ -0,0 +1,3 @@

# LLM Training Solution

Swift provides a complete LLM training solution; see the [Examples README](../../examples/pytorch/llm/README_CN.md).
@@ -0,0 +1,69 @@

# API Reference

## Swift

##### Swift.prepare_model(model: Union[nn.Module, 'SwiftModel'], config: Union[SwiftConfig, PeftConfig, Dict[str, SwiftConfig]], **kwargs)

>This static method initializes tuners of the specified types with random weights
>
>model: the model to attach tuners to; it may be a SwiftModel, in which case newly added tuners take effect together with the tuners already in that SwiftModel
>
>config: the config of the tuner(s) to attach; a SwiftConfig or PeftConfig, or a dict of configs keyed by name. If no name is passed, the name defaults to `default`
>
>kwargs:
>
> extra_state_keys: List[str] keys of original-model weights that should additionally be saved to the checkpoint files
>
> inference_mode: bool whether to initialize in inference mode

See each tuner's documentation for the concrete parameters of its SwiftConfig.
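
As a hedged illustration of the signature above, attaching two named LoRA tuners and passing the documented kwargs (the tuner names and the `extra_state_keys` entry are made up for this example):

```python
from modelscope import Model
from swift import Swift, LoRAConfig

model = Model.from_pretrained('ZhipuAI/chatglm2-6b', device_map='auto')
lora_a = LoRAConfig(r=8, target_modules=['query_key_value'])
lora_b = LoRAConfig(r=16, target_modules=['query_key_value'])

model = Swift.prepare_model(
    model,
    {'lora_a': lora_a, 'lora_b': lora_b},  # a dict of named configs; a bare config would be named 'default'
    # extra_state_keys: original-model weight keys to save alongside the tuners
    # (this particular key is purely illustrative)
    extra_state_keys=['transformer.output_layer.weight'],
)
```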

##### Swift.from_pretrained(model: Union[nn.Module, 'SwiftModel'], model_id: str = None, adapter_name: Union[str, List[str]] = None, revision: str = None, **kwargs)

> This static method loads tuner checkpoints that were saved earlier
>
> model: the model to attach tuners to; it may be a SwiftModel, in which case newly added tuners take effect together with the tuners already in that SwiftModel
>
> model_id: a local directory of saved tuners, or a ModelScope Hub id
>
> adapter_name: the names of the adapters to load; the default None loads all of them
>
> kwargs:
>
> inference_mode: bool whether to load in inference mode
>
> revision: the revision of model_id
>
> extra_state_keys: extra weights to be saved on the next save_pretrained
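
A hedged sketch of the loading path, pulling only one of the stored tuners from a local directory (the directory and adapter name reuse the earlier examples):

```python
from modelscope import Model
from swift import Swift

model = Model.from_pretrained('ZhipuAI/chatglm2-6b', device_map='auto')
# Load only 'first_tuner' from the checkpoint directory saved earlier;
# adapter_name=None (the default) would load every stored tuner instead.
model = Swift.from_pretrained(
    model,
    model_id='./output',
    adapter_name=['first_tuner'],
    inference_mode=True,
)
```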

## SwiftModel

Both `Swift.prepare_model` and `Swift.from_pretrained` return an instance of `SwiftModel`. This instance wraps the model that was actually passed in.

##### save_pretrained(self, save_directory: str, safe_serialization: bool = False, adapter_name: Union[str, List[str]] = None, **kwargs)

> Instance method that saves the model to local disk; the result can be loaded directly with Swift.from_pretrained
>
> save_directory: the directory to save to
>
> safe_serialization: whether to save safetensors
>
> adapter_name: the names of the adapters to save; the default None saves all of them
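
Following the parameters above, saving only one of the attached tuners might look like this (the directory and tuner name reuse the earlier examples):

```python
# Save only 'first_tuner'; adapter_name=None (the default) saves every tuner.
model.save_pretrained(
    save_directory='./output',
    safe_serialization=False,
    adapter_name=['first_tuner'],
)
```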

##### set_active_adapters(self, adapter_names: List[str])

> Instance method that sets all adapters that are active in the current thread. If the environment variable `USE_UNIQUE_THREAD` is set to '0', the setting takes effect in all threads at once.
>
> adapter_names: a list of adapter names

##### activate_adapter(self, adapter_name)

> Instance method that activates a single adapter in the current thread. If the environment variable `USE_UNIQUE_THREAD` is set to '0', the setting takes effect in all threads at once.
>
> adapter_name: the adapter name

##### deactivate_adapter(self, adapter_name)

> Instance method that deactivates a single adapter in the current thread. If the environment variable `USE_UNIQUE_THREAD` is set to '0', the setting takes effect in all threads at once.
>
> adapter_name: the adapter name
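
A short hedged sketch combining the three methods above (reusing the `model` and the a/b/c/d tuner names from the multi-threading example):

```python
# By default (USE_UNIQUE_THREAD not set to '0') these calls only affect the
# current thread; setting the environment variable to '0' makes them global.
model.set_active_adapters(['a', 'b'])  # only a and b are active in this thread
model.activate_adapter('c')            # a, b and c are now active
model.deactivate_adapter('b')          # a and c remain active
```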
@@ -0,0 +1,32 @@

# LoRA

LoRA is the lightweight training component from the paper [LoRA: Low-Rank Adaptation of Large Language Models](https://arxiv.org/abs/2106.09685). LoRA can be attached to Linear, Embedding, Conv2d and other operators.

>```python
>LoRAConfig (
>    r: int  the rank of the LoRA structure
>    target_modules: Union[List[str], str]  module keys of the modules to add LoRA to; a str is used for a full-match lookup, a List is matched against module-name suffixes
>    lora_alpha: int  the scaling factor of the LoRA structure; lora_alpha/r is the weight applied to the LoRA branch
>    lora_dropout: float  the dropout rate of the LoRA structure
>    merge_weights: bool  whether to merge the LoRA weights into the original weights at inference time
>    use_merged_linear: bool  whether the target is a merged linear structure
>    enable_lora: List[bool]  when use_merged_linear is True, which parts get a LoRA structure
>    bias: str  whether biases are trained and saved; `none`: no biases are trained, `all`: biases of all modules are trained, `lora_only`: only biases of the LoRA structure are trained
>)
>```

An example of using LoRA:

```python
from modelscope import Model
from swift import Swift, LoRAConfig
import torch
model = Model.from_pretrained('ZhipuAI/chatglm2-6b', torch_dtype=torch.bfloat16, device_map='auto')
lora_config = LoRAConfig(
    r=16,
    target_modules=['query_key_value'],
    lora_alpha=32,
    lora_dropout=0.)
model = Swift.prepare_model(model, lora_config)
# use model to do other things
```
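
As a hedged illustration of the `use_merged_linear` / `enable_lora` fields described above (the values are illustrative, and the fused projection name depends on the model):

```python
from swift import LoRAConfig

# Illustrative sketch: a fused QKV projection treated as a merged linear layer,
# with LoRA enabled for the first and last slices but not the middle one.
merged_lora_config = LoRAConfig(
    r=16,
    target_modules=['query_key_value'],
    lora_alpha=32,
    lora_dropout=0.,
    use_merged_linear=True,
    enable_lora=[True, False, True],
)
```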