Merged
5 changes: 5 additions & 0 deletions docs/source/Customization/新增模型.md
@@ -7,3 +7,8 @@ swift sft --model my-model --model_type llama --template chatml --dataset xxx
```

If you need to add a new `model_type` or `template`, please submit an issue to us. If you have read our source code, you can also add new types in `llm/template` and `llm/model`.


## Model Registration

Please refer to the example code in [examples](https://github.com/modelscope/swift/blob/main/examples/custom/model.py). The registered content can be parsed by specifying `--custom_register_path xxx.py`.
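As a concrete illustration, the registration file can be combined with the training command shown earlier in this document. This is only a sketch: `my_register.py` and `my-model` are hypothetical placeholders, and the actual registration code lives in the linked examples file.

```shell
# Sketch only: my_register.py and my-model are hypothetical placeholders.
# See examples/custom/model.py for the real registration code.
swift sft \
    --custom_register_path my_register.py \
    --model my-model \
    --model_type llama \
    --template chatml \
    --dataset xxx
```

Since `custom_register_path` is a base argument, the same flag should also work with other subcommands such as `swift infer`.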
6 changes: 3 additions & 3 deletions docs/source/Instruction/使用tuners.md
@@ -18,9 +18,9 @@ A tuner is an additional structural component attached to a model, used to reduce the number of trainable parameters
- Res-Tuning: [Res-Tuning: A Flexible and Efficient Tuning Paradigm via Unbinding Tuner from Backbone](https://arxiv.org/abs/2310.19859) < [arXiv](https://arxiv.org/abs/2310.19859) | [Project Page](https://res-tuning.github.io/) | [Usage](ResTuning.md) >
- Tuners provided by [PEFT](https://github.com/huggingface/peft), such as AdaLoRA, DoRA, FourierFT, etc.

# Interface List
## Interface List

## Static Interfaces of the Swift Class
### Static Interfaces of the Swift Class

- `Swift.prepare_model(model, config, **kwargs)`
- Purpose: loads a tuner onto the model. If the config is a subclass of PeftConfig, the corresponding interface of the PEFT library is used to load the tuner. When a SwiftConfig is used, this interface can accept a SwiftModel instance and be called repeatedly, which has the same effect as passing a dict of configs.
@@ -78,7 +78,7 @@ A tuner is an additional structural component attached to a model, used to reduce the number of trainable parameters
- adapter_name: of type `str`, `List[str]`, `Dict[str, str]`, or `None`. The name(s) of the tuner(s) to load from the tuner directory. If `None`, all tuners are loaded; if `str` or `List[str]`, only the specified tuners are loaded; if `Dict`, the tuner referenced by each `key` is loaded and renamed to the corresponding `value`.
- revision: if model_id is a ModelScope id, revision can specify the corresponding version number

## SwiftModel Interfaces
### SwiftModel Interfaces

The following lists the interfaces that users may call; other internal interfaces, or interfaces not recommended for use, can be viewed in the API Doc generated by the `make docs` command.

2 changes: 1 addition & 1 deletion docs/source/Instruction/命令行参数.md
@@ -11,6 +11,7 @@
- load_dataset_config: When specifying resume_from_checkpoint/ckpt_dir, the `args.json` saved in the checkpoint is read and any parameters that default to None are assigned from it (this can be overridden by passing values manually). If this parameter is set to True, the data parameters are read as well. Default is False.
- use_hf: Default is False. Controls the hub used for model downloading, dataset downloading, and model pushing.
- hub_token: Hub token. The ModelScope hub token can be found [here](https://modelscope.cn/my/myaccesstoken).
- custom_register_path: Path to the `.py` file that registers custom models, chat templates, and datasets.
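For example, resuming a run while also restoring the data arguments recorded in its `args.json` might look like the following sketch (the checkpoint path is a placeholder, and the `true`/`false` boolean-flag syntax is assumed):

```shell
# Sketch only: output/checkpoint-100 is a placeholder path.
swift sft \
    --resume_from_checkpoint output/checkpoint-100 \
    --load_dataset_config true
```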

### Model Arguments

@@ -37,7 +38,6 @@
- 🔥model_name: Used only for self-awareness tasks; pass in the model's Chinese and English names, separated by a space.
- 🔥model_author: Used only for self-awareness tasks; pass in the model author's Chinese and English names, separated by a space.
- custom_dataset_info: Custom simple dataset registration; refer to [Add New Dataset](../Customization/新增数据集.md).
- custom_register_path: Custom complex dataset registration; refer to [Add New Dataset](../Customization/新增数据集.md).

### Template Arguments
- 🔥template: Template type. By default, the template corresponding to the model is used. For a custom model, refer to [Supported Models and Datasets](./支持的模型和数据集.md) and pass this field manually.
36 changes: 18 additions & 18 deletions docs/source/Instruction/支持的模型和数据集.md
@@ -212,24 +212,24 @@
|[swift/Meta-Llama-3-70B-Instruct-AWQ](https://modelscope.cn/models/swift/Meta-Llama-3-70B-Instruct-AWQ)|llama3|llama3|-|-|[study-hjt/Meta-Llama-3-70B-Instruct-AWQ](https://huggingface.co/study-hjt/Meta-Llama-3-70B-Instruct-AWQ)|
|[ChineseAlpacaGroup/llama-3-chinese-8b-instruct](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b-instruct)|llama3|llama3|-|-|[hfl/llama-3-chinese-8b-instruct](https://huggingface.co/hfl/llama-3-chinese-8b-instruct)|
|[ChineseAlpacaGroup/llama-3-chinese-8b](https://modelscope.cn/models/ChineseAlpacaGroup/llama-3-chinese-8b)|llama3|llama3|-|-|[hfl/llama-3-chinese-8b](https://huggingface.co/hfl/llama-3-chinese-8b)|
|[LLM-Research/Meta-Llama-3.1-8B-Instruct](https://modelscope.cn/models/LLM-Research/Meta-Llama-3.1-8B-Instruct)|llama3|llama3|-|-|[meta-llama/Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct)|
|[LLM-Research/Meta-Llama-3.1-70B-Instruct](https://modelscope.cn/models/LLM-Research/Meta-Llama-3.1-70B-Instruct)|llama3|llama3|-|-|[meta-llama/Meta-Llama-3.1-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct)|
|[LLM-Research/Meta-Llama-3.1-405B-Instruct](https://modelscope.cn/models/LLM-Research/Meta-Llama-3.1-405B-Instruct)|llama3|llama3|-|-|[meta-llama/Meta-Llama-3.1-405B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-405B-Instruct)|
|[LLM-Research/Meta-Llama-3.1-8B](https://modelscope.cn/models/LLM-Research/Meta-Llama-3.1-8B)|llama3|llama3|-|-|[meta-llama/Meta-Llama-3.1-8B](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B)|
|[LLM-Research/Meta-Llama-3.1-70B](https://modelscope.cn/models/LLM-Research/Meta-Llama-3.1-70B)|llama3|llama3|-|-|[meta-llama/Meta-Llama-3.1-70B](https://huggingface.co/meta-llama/Meta-Llama-3.1-70B)|
|[LLM-Research/Meta-Llama-3.1-405B](https://modelscope.cn/models/LLM-Research/Meta-Llama-3.1-405B)|llama3|llama3|-|-|[meta-llama/Meta-Llama-3.1-405B](https://huggingface.co/meta-llama/Meta-Llama-3.1-405B)|
|[LLM-Research/Meta-Llama-3.1-70B-Instruct-FP8](https://modelscope.cn/models/LLM-Research/Meta-Llama-3.1-70B-Instruct-FP8)|llama3|llama3|-|-|[meta-llama/Meta-Llama-3.1-70B-Instruct-FP8](https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct-FP8)|
|[LLM-Research/Meta-Llama-3.1-405B-Instruct-FP8](https://modelscope.cn/models/LLM-Research/Meta-Llama-3.1-405B-Instruct-FP8)|llama3|llama3|-|-|[meta-llama/Meta-Llama-3.1-405B-Instruct-FP8](https://huggingface.co/meta-llama/Meta-Llama-3.1-405B-Instruct-FP8)|
|[LLM-Research/Meta-Llama-3.1-8B-Instruct-BNB-NF4](https://modelscope.cn/models/LLM-Research/Meta-Llama-3.1-8B-Instruct-BNB-NF4)|llama3|llama3|-|-|[hugging-quants/Meta-Llama-3.1-8B-Instruct-BNB-NF4](https://huggingface.co/hugging-quants/Meta-Llama-3.1-8B-Instruct-BNB-NF4)|
|[LLM-Research/Meta-Llama-3.1-70B-Instruct-bnb-4bit](https://modelscope.cn/models/LLM-Research/Meta-Llama-3.1-70B-Instruct-bnb-4bit)|llama3|llama3|-|-|[unsloth/Meta-Llama-3.1-70B-Instruct-bnb-4bit](https://huggingface.co/unsloth/Meta-Llama-3.1-70B-Instruct-bnb-4bit)|
|[LLM-Research/Meta-Llama-3.1-405B-Instruct-BNB-NF4](https://modelscope.cn/models/LLM-Research/Meta-Llama-3.1-405B-Instruct-BNB-NF4)|llama3|llama3|-|-|[hugging-quants/Meta-Llama-3.1-405B-Instruct-BNB-NF4](https://huggingface.co/hugging-quants/Meta-Llama-3.1-405B-Instruct-BNB-NF4)|
|[LLM-Research/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4](https://modelscope.cn/models/LLM-Research/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4)|llama3|llama3|-|-|[hugging-quants/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4](https://huggingface.co/hugging-quants/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4)|
|[LLM-Research/Meta-Llama-3.1-70B-Instruct-GPTQ-INT4](https://modelscope.cn/models/LLM-Research/Meta-Llama-3.1-70B-Instruct-GPTQ-INT4)|llama3|llama3|-|-|[hugging-quants/Meta-Llama-3.1-70B-Instruct-GPTQ-INT4](https://huggingface.co/hugging-quants/Meta-Llama-3.1-70B-Instruct-GPTQ-INT4)|
|[LLM-Research/Meta-Llama-3.1-405B-Instruct-GPTQ-INT4](https://modelscope.cn/models/LLM-Research/Meta-Llama-3.1-405B-Instruct-GPTQ-INT4)|llama3|llama3|-|-|[hugging-quants/Meta-Llama-3.1-405B-Instruct-GPTQ-INT4](https://huggingface.co/hugging-quants/Meta-Llama-3.1-405B-Instruct-GPTQ-INT4)|
|[LLM-Research/Meta-Llama-3.1-8B-Instruct-AWQ-INT4](https://modelscope.cn/models/LLM-Research/Meta-Llama-3.1-8B-Instruct-AWQ-INT4)|llama3|llama3|-|-|[hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4](https://huggingface.co/hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4)|
|[LLM-Research/Meta-Llama-3.1-70B-Instruct-AWQ-INT4](https://modelscope.cn/models/LLM-Research/Meta-Llama-3.1-70B-Instruct-AWQ-INT4)|llama3|llama3|-|-|[hugging-quants/Meta-Llama-3.1-70B-Instruct-AWQ-INT4](https://huggingface.co/hugging-quants/Meta-Llama-3.1-70B-Instruct-AWQ-INT4)|
|[LLM-Research/Meta-Llama-3.1-405B-Instruct-AWQ-INT4](https://modelscope.cn/models/LLM-Research/Meta-Llama-3.1-405B-Instruct-AWQ-INT4)|llama3|llama3|-|-|[hugging-quants/Meta-Llama-3.1-405B-Instruct-AWQ-INT4](https://huggingface.co/hugging-quants/Meta-Llama-3.1-405B-Instruct-AWQ-INT4)|
|[AI-ModelScope/Llama-3.1-Nemotron-70B-Instruct-HF](https://modelscope.cn/models/AI-ModelScope/Llama-3.1-Nemotron-70B-Instruct-HF)|llama3|llama3|-|-|[nvidia/Llama-3.1-Nemotron-70B-Instruct-HF](https://huggingface.co/nvidia/Llama-3.1-Nemotron-70B-Instruct-HF)|
|[LLM-Research/Meta-Llama-3.1-8B-Instruct](https://modelscope.cn/models/LLM-Research/Meta-Llama-3.1-8B-Instruct)|llama3_1|llama3_2|transformers>=4.43|-|[meta-llama/Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct)|
|[LLM-Research/Meta-Llama-3.1-70B-Instruct](https://modelscope.cn/models/LLM-Research/Meta-Llama-3.1-70B-Instruct)|llama3_1|llama3_2|transformers>=4.43|-|[meta-llama/Meta-Llama-3.1-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct)|
|[LLM-Research/Meta-Llama-3.1-405B-Instruct](https://modelscope.cn/models/LLM-Research/Meta-Llama-3.1-405B-Instruct)|llama3_1|llama3_2|transformers>=4.43|-|[meta-llama/Meta-Llama-3.1-405B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-405B-Instruct)|
|[LLM-Research/Meta-Llama-3.1-8B](https://modelscope.cn/models/LLM-Research/Meta-Llama-3.1-8B)|llama3_1|llama3_2|transformers>=4.43|-|[meta-llama/Meta-Llama-3.1-8B](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B)|
|[LLM-Research/Meta-Llama-3.1-70B](https://modelscope.cn/models/LLM-Research/Meta-Llama-3.1-70B)|llama3_1|llama3_2|transformers>=4.43|-|[meta-llama/Meta-Llama-3.1-70B](https://huggingface.co/meta-llama/Meta-Llama-3.1-70B)|
|[LLM-Research/Meta-Llama-3.1-405B](https://modelscope.cn/models/LLM-Research/Meta-Llama-3.1-405B)|llama3_1|llama3_2|transformers>=4.43|-|[meta-llama/Meta-Llama-3.1-405B](https://huggingface.co/meta-llama/Meta-Llama-3.1-405B)|
|[LLM-Research/Meta-Llama-3.1-70B-Instruct-FP8](https://modelscope.cn/models/LLM-Research/Meta-Llama-3.1-70B-Instruct-FP8)|llama3_1|llama3_2|transformers>=4.43|-|[meta-llama/Meta-Llama-3.1-70B-Instruct-FP8](https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct-FP8)|
|[LLM-Research/Meta-Llama-3.1-405B-Instruct-FP8](https://modelscope.cn/models/LLM-Research/Meta-Llama-3.1-405B-Instruct-FP8)|llama3_1|llama3_2|transformers>=4.43|-|[meta-llama/Meta-Llama-3.1-405B-Instruct-FP8](https://huggingface.co/meta-llama/Meta-Llama-3.1-405B-Instruct-FP8)|
|[LLM-Research/Meta-Llama-3.1-8B-Instruct-BNB-NF4](https://modelscope.cn/models/LLM-Research/Meta-Llama-3.1-8B-Instruct-BNB-NF4)|llama3_1|llama3_2|transformers>=4.43|-|[hugging-quants/Meta-Llama-3.1-8B-Instruct-BNB-NF4](https://huggingface.co/hugging-quants/Meta-Llama-3.1-8B-Instruct-BNB-NF4)|
|[LLM-Research/Meta-Llama-3.1-70B-Instruct-bnb-4bit](https://modelscope.cn/models/LLM-Research/Meta-Llama-3.1-70B-Instruct-bnb-4bit)|llama3_1|llama3_2|transformers>=4.43|-|[unsloth/Meta-Llama-3.1-70B-Instruct-bnb-4bit](https://huggingface.co/unsloth/Meta-Llama-3.1-70B-Instruct-bnb-4bit)|
|[LLM-Research/Meta-Llama-3.1-405B-Instruct-BNB-NF4](https://modelscope.cn/models/LLM-Research/Meta-Llama-3.1-405B-Instruct-BNB-NF4)|llama3_1|llama3_2|transformers>=4.43|-|[hugging-quants/Meta-Llama-3.1-405B-Instruct-BNB-NF4](https://huggingface.co/hugging-quants/Meta-Llama-3.1-405B-Instruct-BNB-NF4)|
|[LLM-Research/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4](https://modelscope.cn/models/LLM-Research/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4)|llama3_1|llama3_2|transformers>=4.43|-|[hugging-quants/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4](https://huggingface.co/hugging-quants/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4)|
|[LLM-Research/Meta-Llama-3.1-70B-Instruct-GPTQ-INT4](https://modelscope.cn/models/LLM-Research/Meta-Llama-3.1-70B-Instruct-GPTQ-INT4)|llama3_1|llama3_2|transformers>=4.43|-|[hugging-quants/Meta-Llama-3.1-70B-Instruct-GPTQ-INT4](https://huggingface.co/hugging-quants/Meta-Llama-3.1-70B-Instruct-GPTQ-INT4)|
|[LLM-Research/Meta-Llama-3.1-405B-Instruct-GPTQ-INT4](https://modelscope.cn/models/LLM-Research/Meta-Llama-3.1-405B-Instruct-GPTQ-INT4)|llama3_1|llama3_2|transformers>=4.43|-|[hugging-quants/Meta-Llama-3.1-405B-Instruct-GPTQ-INT4](https://huggingface.co/hugging-quants/Meta-Llama-3.1-405B-Instruct-GPTQ-INT4)|
|[LLM-Research/Meta-Llama-3.1-8B-Instruct-AWQ-INT4](https://modelscope.cn/models/LLM-Research/Meta-Llama-3.1-8B-Instruct-AWQ-INT4)|llama3_1|llama3_2|transformers>=4.43|-|[hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4](https://huggingface.co/hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4)|
|[LLM-Research/Meta-Llama-3.1-70B-Instruct-AWQ-INT4](https://modelscope.cn/models/LLM-Research/Meta-Llama-3.1-70B-Instruct-AWQ-INT4)|llama3_1|llama3_2|transformers>=4.43|-|[hugging-quants/Meta-Llama-3.1-70B-Instruct-AWQ-INT4](https://huggingface.co/hugging-quants/Meta-Llama-3.1-70B-Instruct-AWQ-INT4)|
|[LLM-Research/Meta-Llama-3.1-405B-Instruct-AWQ-INT4](https://modelscope.cn/models/LLM-Research/Meta-Llama-3.1-405B-Instruct-AWQ-INT4)|llama3_1|llama3_2|transformers>=4.43|-|[hugging-quants/Meta-Llama-3.1-405B-Instruct-AWQ-INT4](https://huggingface.co/hugging-quants/Meta-Llama-3.1-405B-Instruct-AWQ-INT4)|
|[AI-ModelScope/Llama-3.1-Nemotron-70B-Instruct-HF](https://modelscope.cn/models/AI-ModelScope/Llama-3.1-Nemotron-70B-Instruct-HF)|llama3_1|llama3_2|transformers>=4.43|-|[nvidia/Llama-3.1-Nemotron-70B-Instruct-HF](https://huggingface.co/nvidia/Llama-3.1-Nemotron-70B-Instruct-HF)|
|[LLM-Research/Llama-3.2-1B](https://modelscope.cn/models/LLM-Research/Llama-3.2-1B)|llama3_2|llama3_2|transformers>=4.45|-|[meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B)|
|[LLM-Research/Llama-3.2-3B](https://modelscope.cn/models/LLM-Research/Llama-3.2-3B)|llama3_2|llama3_2|transformers>=4.45|-|[meta-llama/Llama-3.2-3B](https://huggingface.co/meta-llama/Llama-3.2-3B)|
|[LLM-Research/Llama-3.2-1B-Instruct](https://modelscope.cn/models/LLM-Research/Llama-3.2-1B-Instruct)|llama3_2|llama3_2|transformers>=4.45|-|[meta-llama/Llama-3.2-1B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct)|
3 changes: 3 additions & 0 deletions docs/source_en/Customization/New-model.md
@@ -7,3 +7,6 @@ swift sft --model my-model --model_type llama --template chatml --dataset xxx
```

If you need to add a new `model_type` or `template`, please submit an issue to us. If you have read our source code, you can also add new types in `llm/template` and `llm/model`.

## Model Registration
Please refer to the example code in [examples](https://github.com/modelscope/swift/blob/main/examples/custom/model.py). The registered content can be parsed by specifying `--custom_register_path xxx.py`.
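A hedged usage sketch, where `my_register.py` and `my-model` stand in for your own registration file and model id (the real registration code is in the linked examples file):

```shell
# Sketch only: my_register.py and my-model are hypothetical placeholders.
# See examples/custom/model.py for the real registration code.
swift sft \
    --custom_register_path my_register.py \
    --model my-model \
    --template chatml \
    --dataset xxx
```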
2 changes: 1 addition & 1 deletion docs/source_en/Instruction/Command-line-parameters.md
@@ -11,6 +11,7 @@ The introduction to command line parameters will cover base arguments, atomic ar
- load_dataset_config: When specifying resume_from_checkpoint/ckpt_dir, it will read the `args.json` in the saved file and assign values to any parameters that are None (can be overridden by manual input). If this parameter is set to True, it will read the data parameters as well. Default is False.
- use_hf: Default is False. Controls model and dataset downloading, and model pushing to the hub.
- hub_token: Hub token. You can check the modelscope hub token [here](https://modelscope.cn/my/myaccesstoken).
- custom_register_path: Path to the `.py` file that registers custom models, chat templates, and datasets.
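Combining a few of these base arguments, a sketch of resuming training from a saved checkpoint (the checkpoint path is a placeholder, and the `true`/`false` boolean-flag syntax is assumed):

```shell
# Sketch only: output/checkpoint-100 is a placeholder path.
swift sft \
    --resume_from_checkpoint output/checkpoint-100 \
    --load_dataset_config true \
    --use_hf false
```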

### Model Arguments

@@ -38,7 +39,6 @@ The introduction to command line parameters will cover base arguments, atomic ar
- 🔥model_name: For self-awareness tasks, input the model's Chinese and English names separated by space.
- 🔥model_author: For self-awareness tasks, input the model author's Chinese and English names separated by space.
- custom_dataset_info: Custom simple dataset registration, refer to [Add New Dataset](../Customization/New-dataset.md).
- custom_register_path: Custom complex dataset registration, refer to [Add New Dataset](../Customization/New-dataset.md).

### Template Arguments
