modelscope · Jintao-Huang · Dec 14, 2024 · Dec 13, 2024 · Dec 14, 2024
diff --git a/docs/source/GetStarted/快速开始.md b/docs/source/GetStarted/快速开始.md
@@ -8,7 +8,7 @@ ms-swift是魔搭社区提供的大模型与多模态大模型训练部署框架
 - 轻量训练：支持了LoRA、QLoRA、DoRA、LoRA+、ReFT、RS-LoRA、LLaMAPro、Adapter、GaLore、Q-Galore、LISA、UnSloth、Liger-Kernel等轻量微调方式。
 - 分布式训练：支持分布式数据并行（DDP）、device_map简易模型并行、DeepSpeed ZeRO2 ZeRO3、FSDP等分布式训练技术。
 - 量化训练：支持对BNB、AWQ、GPTQ、AQLM、HQQ、EETQ量化模型进行训练。
-- RLHF训练：支持纯文本大模型和多模态大模型的DPO、CPO、SimPO、ORPO、KTO等人类对齐训练方法。
+- RLHF训练：支持纯文本大模型和多模态大模型的DPO、CPO、SimPO、ORPO、KTO、RM等人类对齐训练方法。
 - 多模态训练：支持对图像、视频和语音不同模态模型进行训练，支持VQA、Caption、OCR、Grounding任务的训练。
 - 界面训练：以界面的方式提供训练、推理、评测、量化的能力，完成大模型的全链路。
 - 插件化与拓展：支持自定义模型和数据集拓展，支持对loss、metric、trainer、loss-scale、callback、optimizer等组件进行自定义。
@@ -75,4 +75,4 @@ swift infer \
 > [!TIP]
 > 更多例子可以查看：[examples](https://github.com/modelscope/ms-swift/tree/main/examples)
 >
-> 以python方式进行训练和推理的例子可以查看[notebook](https://github.com/modelscope/ms-swift/tree/main/examples/notebook)
+> 以python方式进行训练和推理的例子可以查看：[notebook](https://github.com/modelscope/ms-swift/tree/main/examples/notebook)
diff --git a/docs/source/Instruction/命令行参数.md b/docs/source/Instruction/命令行参数.md
@@ -91,7 +91,7 @@
 - 🔥learning_rate: 学习率，全参数默认为1e-5，tuner为1e-4
 - lr_scheduler_type: lr_scheduler类型，默认为cosine
 - lr_scheduler_kwargs: lr_scheduler其他参数
-- 🔥gradient_checkpointing_kwargs: 传入`torch.utils.checkpoint`中的参数. 例如设置为`'{"use_reentrant": false}'`
+- 🔥gradient_checkpointing_kwargs: 传入`torch.utils.checkpoint`中的参数. 例如设置为`--gradient_checkpointing_kwargs '{"use_reentrant": false}'`
 - report_to: 默认值为`tensorboard`
 - remove_unused_columns: 默认值False
 - logging_first_step: 是否记录第一个step的打印，默认值True

diff --git a/docs/source/Instruction/推理和部署.md b/docs/source/Instruction/推理和部署.md
@@ -3,7 +3,7 @@
 SWIFT支持以命令行、Python代码和界面方式进行推理和部署：
 - 使用`engine.infer`或者`engine.infer_async`进行python的方式推理. 参考[这里](https://github.com/modelscope/ms-swift/blob/main/examples/infer/demo.py).
 - 使用`swift infer`使用命令行的方式进行推理. 参考[这里](https://github.com/modelscope/ms-swift/blob/main/examples/infer/cli_demo.sh).
-- 使用`swift deploy`进行服务部署，并使用openai API或者`client.infer`的方式推理. 参考[这里](https://github.com/modelscope/ms-swift/tree/main/examples/deploy/client)
+- 使用`swift deploy`进行服务部署，并使用openai API或者`client.infer`的方式推理. 服务端参考[这里](https://github.com/modelscope/ms-swift/tree/main/examples/deploy/server), 客户端参考[这里](https://github.com/modelscope/ms-swift/tree/main/examples/deploy/client).
 - 使用`swift web-ui`部署模型进行界面推理, 可以查看[这里](../GetStarted/界面使用.md)
 
 

diff --git a/docs/source/index.rst b/docs/source/index.rst
@@ -17,12 +17,12 @@ Swift DOCUMENTATION
    :maxdepth: 2
    :caption: Instruction
 
+   Instruction/命令行参数.md
    Instruction/预训练及微调.md
    Instruction/人类对齐.md
    Instruction/推理和部署.md
    Instruction/评测.md
    Instruction/导出.md
-   Instruction/命令行参数.md
    Instruction/支持的模型和数据集.md
    Instruction/使用tuners.md
    Instruction/智能体的支持.md

diff --git a/docs/source_en/GetStarted/Quick-start.md b/docs/source_en/GetStarted/Quick-start.md
@@ -8,7 +8,7 @@ ms-swift is a comprehensive training and deployment framework for large language
 - Lightweight Training: Supports lightweight fine-tuning methods like LoRA, QLoRA, DoRA, LoRA+, ReFT, RS-LoRA, LLaMAPro, Adapter, GaLore, Q-Galore, LISA, UnSloth, Liger-Kernel, and more.
 - Distributed Training: Supports distributed data parallel (DDP), simple model parallelism via device_map, DeepSpeed ZeRO2 ZeRO3, FSDP, and other distributed training technologies.
 - Quantization Training: Provides training for quantized models like BNB, AWQ, GPTQ, AQLM, HQQ, EETQ.
-- RLHF Training: Supports human alignment training methods like DPO, CPO, SimPO, ORPO, KTO for both text-based and multimodal large models.
+- RLHF Training: Supports human alignment training methods like DPO, CPO, SimPO, ORPO, KTO, RM for both text-based and multimodal large models.
 - Multimodal Training: Capable of training models for different modalities such as images, videos, and audios; supports tasks like VQA (Visual Question Answering), Captioning, OCR (Optical Character Recognition), and Grounding.
 - Interface-driven Training: Offers training, inference, evaluation, and quantization capabilities through an interface, enabling a complete workflow for large models.
 - Plugins and Extensions: Allows customization and extension of models and datasets, and supports customizations for components like loss, metric, trainer, loss-scale, callback, optimizer, etc.

diff --git a/docs/source_en/Instruction/Command-line-parameters.md b/docs/source_en/Instruction/Command-line-parameters.md
@@ -92,7 +92,7 @@ This parameter list inherits from transformers `Seq2SeqTrainingArguments`, with
 - 🔥learning_rate: Learning rate, default is 1e-5 for all parameters, and 1e-4 for the tuner.
 - lr_scheduler_type: LR scheduler type, default is cosine.
 - lr_scheduler_kwargs: Other parameters for the LR scheduler.
-- 🔥gradient_checkpointing_kwargs: Parameters passed to `torch.utils.checkpoint`. For example, set to `{"use_reentrant": false}`.
+- 🔥gradient_checkpointing_kwargs: Parameters passed to `torch.utils.checkpoint`. For example, set to `--gradient_checkpointing_kwargs '{"use_reentrant": false}'`.
 - report_to: Default is `tensorboard`.
 - remove_unused_columns: Default is False.
 - logging_first_step: Whether to log the first step print, default is True.

diff --git a/docs/source_en/Instruction/Inference-and-deployment.md b/docs/source_en/Instruction/Inference-and-deployment.md
@@ -3,7 +3,7 @@
 SWIFT supports inference and deployment through command line, Python code, and interface methods:
 - Use `engine.infer` or `engine.infer_async` for Python-based inference. See [here](https://github.com/modelscope/ms-swift/blob/main/examples/infer/demo.py) for reference.
 - Use `swift infer` for command-line-based inference. See [here](https://github.com/modelscope/ms-swift/blob/main/examples/infer/cli_demo.sh) for reference.
-- Use `swift deploy` for service deployment and perform inference using the OpenAI API or `client.infer`. Refer to [here](https://github.com/modelscope/ms-swift/tree/main/examples/deploy/client) for more information.
+- Use `swift deploy` for service deployment and perform inference using the OpenAI API or `client.infer`. Refer to the server guidelines [here](https://github.com/modelscope/ms-swift/tree/main/examples/deploy/server) and the client guidelines [here](https://github.com/modelscope/ms-swift/tree/main/examples/deploy/client).
 - Deploy the model with `swift web-ui` for web-based inference. You can check [here](../GetStarted/Interface-usage.md) for details.
 
 

diff --git a/docs/source_en/index.rst b/docs/source_en/index.rst
@@ -17,12 +17,12 @@ Swift DOCUMENTATION
    :maxdepth: 2
    :caption: Instruction
 
+   Instruction/Command-line-parameters.md
    Instruction/Pre-training-and-Fine-tuning.md
    Instruction/RLHF.md
    Instruction/Inference-and-deployment.md
    Instruction/Evaluation.md
    Instruction/Export.md
-   Instruction/Command-line-parameters.md
    Instruction/Supported-models-and-datasets.md
    Instruction/Use-tuners.md
    Instruction/Agent-support.md

diff --git a/swift/llm/argument/base_args/base_args.py b/swift/llm/argument/base_args/base_args.py
@@ -63,6 +63,8 @@ class BaseArguments(GenerationArguments, QuantizeArguments, DataArguments, Templ
 
     def _init_custom_register(self) -> None:
         """Register custom .py file to datasets"""
+        if isinstance(self.custom_register_path, str):
+            self.custom_register_path = [self.custom_register_path]
         self.custom_register_path = to_abspath(self.custom_register_path, True)
         for path in self.custom_register_path:
             folder, fname = os.path.split(path)

diff --git a/swift/llm/argument/base_args/data_args.py b/swift/llm/argument/base_args/data_args.py
@@ -50,6 +50,8 @@ class DataArguments:
 
     def _init_custom_dataset_info(self):
         """register custom dataset_info.json to datasets"""
+        if isinstance(self.custom_dataset_info, str):
+            self.custom_dataset_info = [self.custom_dataset_info]
         for path in self.custom_dataset_info:
             register_dataset_info(path)
 

diff --git a/swift/llm/dataset/data/dataset_info.json b/swift/llm/dataset/data/dataset_info.json
@@ -309,6 +309,10 @@
     },
     {
         "ms_dataset_id": "swift/Infinity-Instruct",
+        "subsets": ["3M", "7M", "0625", "Gen", "7M_domains"],
+        "columns": {
+            "label": "_"
+        },
         "hf_dataset_id": "BAAI/Infinity-Instruct",
         "tags": ["qa", "quality", "multi-task"],
         "huge_dataset": true

diff --git a/tests/tuners/test_extra_state_dict.py b/tests/tuners/test_extra_state_dict.py
@@ -34,7 +34,7 @@ def test_swift_extra_state_dict(self):
         with open(os.path.join(self.tmp_dir, 'extra_states', 'adapter_model.bin'), 'wb') as f:
             torch.save(state_dict, f)
         model = Model.from_pretrained('damo/nlp_structbert_sentence-similarity_chinese-base')
-        model = Swift.from_pretrained(model, self.tmp_dir)
+        model = Swift.from_pretrained(model, self.tmp_dir, inference_mode=False)
         names = [name for name, value in model.named_parameters() if value.requires_grad]
         self.assertTrue(any('classifier' in name for name in names))
         self.assertTrue(torch.allclose(state_dict['classifier.weight'], model.base_model.classifier.weight))