Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions docs/source/GetStarted/快速开始.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@ ms-swift是魔搭社区提供的大模型与多模态大模型训练部署框架
- 轻量训练:支持了LoRA、QLoRA、DoRA、LoRA+、ReFT、RS-LoRA、LLaMAPro、Adapter、GaLore、Q-Galore、LISA、UnSloth、Liger-Kernel等轻量微调方式。
- 分布式训练:支持分布式数据并行(DDP)、device_map简易模型并行、DeepSpeed ZeRO2 ZeRO3、FSDP等分布式训练技术。
- 量化训练:支持对BNB、AWQ、GPTQ、AQLM、HQQ、EETQ量化模型进行训练。
- RLHF训练:支持纯文本大模型和多模态大模型的DPO、CPO、SimPO、ORPO、KTO等人类对齐训练方法
- RLHF训练:支持纯文本大模型和多模态大模型的DPO、CPO、SimPO、ORPO、KTO、RM等人类对齐训练方法
- 多模态训练:支持对图像、视频和语音不同模态模型进行训练,支持VQA、Caption、OCR、Grounding任务的训练。
- 界面训练:以界面的方式提供训练、推理、评测、量化的能力,完成大模型的全链路。
- 插件化与拓展:支持自定义模型和数据集拓展,支持对loss、metric、trainer、loss-scale、callback、optimizer等组件进行自定义。
Expand Down Expand Up @@ -75,4 +75,4 @@ swift infer \
> [!TIP]
> 更多例子可以查看:[examples](https://github.com/modelscope/ms-swift/tree/main/examples)
>
> 以python方式进行训练和推理的例子可以查看[notebook](https://github.com/modelscope/ms-swift/tree/main/examples/notebook)
> 以python方式进行训练和推理的例子可以查看[notebook](https://github.com/modelscope/ms-swift/tree/main/examples/notebook)
2 changes: 1 addition & 1 deletion docs/source/Instruction/命令行参数.md
Original file line number Diff line number Diff line change
Expand Up @@ -91,7 +91,7 @@
- 🔥learning_rate: 学习率,全参数默认为1e-5,tuner为1e-4
- lr_scheduler_type: lr_scheduler类型,默认为cosine
- lr_scheduler_kwargs: lr_scheduler其他参数
- 🔥gradient_checkpointing_kwargs: 传入`torch.utils.checkpoint`中的参数. 例如设置为`'{"use_reentrant": false}'`
- 🔥gradient_checkpointing_kwargs: 传入`torch.utils.checkpoint`中的参数. 例如设置为`--gradient_checkpointing_kwargs '{"use_reentrant": false}'`
- report_to: 默认值为`tensorboard`
- remove_unused_columns: 默认值False
- logging_first_step: 是否记录第一个step的打印,默认值True
Expand Down
2 changes: 1 addition & 1 deletion docs/source/Instruction/推理和部署.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@
SWIFT支持以命令行、Python代码和界面方式进行推理和部署:
- 使用`engine.infer`或者`engine.infer_async`进行python的方式推理. 参考[这里](https://github.com/modelscope/ms-swift/blob/main/examples/infer/demo.py).
- 使用`swift infer`使用命令行的方式进行推理. 参考[这里](https://github.com/modelscope/ms-swift/blob/main/examples/infer/cli_demo.sh).
- 使用`swift deploy`进行服务部署,并使用openai API或者`client.infer`的方式推理. 参考[这里](https://github.com/modelscope/ms-swift/tree/main/examples/deploy/client)
- 使用`swift deploy`进行服务部署,并使用openai API或者`client.infer`的方式推理. 服务端参考[这里](https://github.com/modelscope/ms-swift/tree/main/examples/deploy/server), 客户端参考[这里](https://github.com/modelscope/ms-swift/tree/main/examples/deploy/client).
- 使用`swift web-ui`部署模型进行界面推理, 可以查看[这里](../GetStarted/界面使用.md)


Expand Down
2 changes: 1 addition & 1 deletion docs/source/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -17,12 +17,12 @@ Swift DOCUMENTATION
:maxdepth: 2
:caption: Instruction

Instruction/命令行参数.md
Instruction/预训练及微调.md
Instruction/人类对齐.md
Instruction/推理和部署.md
Instruction/评测.md
Instruction/导出.md
Instruction/命令行参数.md
Instruction/支持的模型和数据集.md
Instruction/使用tuners.md
Instruction/智能体的支持.md
Expand Down
2 changes: 1 addition & 1 deletion docs/source_en/GetStarted/Quick-start.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@ ms-swift is a comprehensive training and deployment framework for large language
- Lightweight Training: Supports lightweight fine-tuning methods like LoRA, QLoRA, DoRA, LoRA+, ReFT, RS-LoRA, LLaMAPro, Adapter, GaLore, Q-Galore, LISA, UnSloth, Liger-Kernel, and more.
- Distributed Training: Supports distributed data parallel (DDP), simple model parallelism via device_map, DeepSpeed ZeRO2 ZeRO3, FSDP, and other distributed training technologies.
- Quantization Training: Provides training for quantized models like BNB, AWQ, GPTQ, AQLM, HQQ, EETQ.
- RLHF Training: Supports human alignment training methods like DPO, CPO, SimPO, ORPO, KTO for both text-based and multimodal large models.
- RLHF Training: Supports human alignment training methods like DPO, CPO, SimPO, ORPO, KTO, RM for both text-based and multimodal large models.
- Multimodal Training: Capable of training models for different modalities such as images, videos, and audios; supports tasks like VQA (Visual Question Answering), Captioning, OCR (Optical Character Recognition), and Grounding.
- Interface-driven Training: Offers training, inference, evaluation, and quantization capabilities through an interface, enabling a complete workflow for large models.
- Plugins and Extensions: Allows customization and extension of models and datasets, and supports customizations for components like loss, metric, trainer, loss-scale, callback, optimizer, etc.
Expand Down
2 changes: 1 addition & 1 deletion docs/source_en/Instruction/Command-line-parameters.md
Original file line number Diff line number Diff line change
Expand Up @@ -92,7 +92,7 @@ This parameter list inherits from transformers `Seq2SeqTrainingArguments`, with
- 🔥learning_rate: Learning rate, default is 1e-5 for all parameters, and 1e-4 for the tuner.
- lr_scheduler_type: LR scheduler type, default is cosine.
- lr_scheduler_kwargs: Other parameters for the LR scheduler.
- 🔥gradient_checkpointing_kwargs: Parameters passed to `torch.utils.checkpoint`. For example, set to `{"use_reentrant": false}`.
- 🔥gradient_checkpointing_kwargs: Parameters passed to `torch.utils.checkpoint`. For example, set to `--gradient_checkpointing_kwargs '{"use_reentrant": false}'`.
- report_to: Default is `tensorboard`.
- remove_unused_columns: Default is False.
- logging_first_step: Whether to log the first step print, default is True.
Expand Down
2 changes: 1 addition & 1 deletion docs/source_en/Instruction/Inference-and-deployment.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@
SWIFT supports inference and deployment through command line, Python code, and interface methods:
- Use `engine.infer` or `engine.infer_async` for Python-based inference. See [here](https://github.com/modelscope/ms-swift/blob/main/examples/infer/demo.py) for reference.
- Use `swift infer` for command-line-based inference. See [here](https://github.com/modelscope/ms-swift/blob/main/examples/infer/cli_demo.sh) for reference.
- Use `swift deploy` for service deployment and perform inference using the OpenAI API or `client.infer`. Refer to [here](https://github.com/modelscope/ms-swift/tree/main/examples/deploy/client) for more information.
- Use `swift deploy` for service deployment and perform inference using the OpenAI API or `client.infer`. Refer to the server guidelines [here](https://github.com/modelscope/ms-swift/tree/main/examples/deploy/server) and the client guidelines [here](https://github.com/modelscope/ms-swift/tree/main/examples/deploy/client).
- Deploy the model with `swift web-ui` for web-based inference. You can check [here](../GetStarted/Interface-usage.md) for details.


Expand Down
2 changes: 1 addition & 1 deletion docs/source_en/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -17,12 +17,12 @@ Swift DOCUMENTATION
:maxdepth: 2
:caption: Instruction

Instruction/Command-line-parameters.md
Instruction/Pre-training-and-Fine-tuning.md
Instruction/RLHF.md
Instruction/Inference-and-deployment.md
Instruction/Evaluation.md
Instruction/Export.md
Instruction/Command-line-parameters.md
Instruction/Supported-models-and-datasets.md
Instruction/Use-tuners.md
Instruction/Agent-support.md
Expand Down
2 changes: 2 additions & 0 deletions swift/llm/argument/base_args/base_args.py
Original file line number Diff line number Diff line change
Expand Up @@ -63,6 +63,8 @@ class BaseArguments(GenerationArguments, QuantizeArguments, DataArguments, Templ

def _init_custom_register(self) -> None:
"""Register custom .py file to datasets"""
if isinstance(self.custom_register_path, str):
self.custom_register_path = [self.custom_register_path]
self.custom_register_path = to_abspath(self.custom_register_path, True)
for path in self.custom_register_path:
folder, fname = os.path.split(path)
Expand Down
2 changes: 2 additions & 0 deletions swift/llm/argument/base_args/data_args.py
Original file line number Diff line number Diff line change
Expand Up @@ -50,6 +50,8 @@ class DataArguments:

def _init_custom_dataset_info(self):
"""register custom dataset_info.json to datasets"""
if isinstance(self.custom_dataset_info, str):
self.custom_dataset_info = [self.custom_dataset_info]
for path in self.custom_dataset_info:
register_dataset_info(path)

Expand Down
4 changes: 4 additions & 0 deletions swift/llm/dataset/data/dataset_info.json
Original file line number Diff line number Diff line change
Expand Up @@ -309,6 +309,10 @@
},
{
"ms_dataset_id": "swift/Infinity-Instruct",
"subsets": ["3M", "7M", "0625", "Gen", "7M_domains"],
"columns": {
"label": "_"
},
"hf_dataset_id": "BAAI/Infinity-Instruct",
"tags": ["qa", "quality", "multi-task"],
"huge_dataset": true
Expand Down
2 changes: 1 addition & 1 deletion tests/tuners/test_extra_state_dict.py
Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,7 @@ def test_swift_extra_state_dict(self):
with open(os.path.join(self.tmp_dir, 'extra_states', 'adapter_model.bin'), 'wb') as f:
torch.save(state_dict, f)
model = Model.from_pretrained('damo/nlp_structbert_sentence-similarity_chinese-base')
model = Swift.from_pretrained(model, self.tmp_dir)
model = Swift.from_pretrained(model, self.tmp_dir, inference_mode=False)
names = [name for name, value in model.named_parameters() if value.requires_grad]
self.assertTrue(any('classifier' in name for name in names))
self.assertTrue(torch.allclose(state_dict['classifier.weight'], model.base_model.classifier.weight))
Expand Down
Loading