diff --git "a/docs/source/Customization/\350\207\252\345\256\232\344\271\211\346\250\241\345\236\213.md" "b/docs/source/Customization/\350\207\252\345\256\232\344\271\211\346\250\241\345\236\213.md"
index 65d9b72775..75b88510a8 100644
--- "a/docs/source/Customization/\350\207\252\345\256\232\344\271\211\346\250\241\345\236\213.md"
+++ "b/docs/source/Customization/\350\207\252\345\256\232\344\271\211\346\250\241\345\236\213.md"
@@ -4,7 +4,6 @@ For ms-swift's built-in models, you can use them directly by specifying model_id or model_path
 
 > [!TIP]
 > When fine-tuning a base model into a chat model with LoRA via `swift sft` (for example, fine-tuning Llama3.2-1B into a chat model), you sometimes need to set the template manually. Add the `--template default` argument to keep the base model from failing to stop properly because it has never seen the special tokens in the chat template.
-
 ## Model Registration
 
 Please refer to the sample code in [examples](https://github.com/modelscope/swift/blob/main/examples/custom). You can parse the registered content by specifying `--custom_register_path xxx.py`.
diff --git "a/docs/source/Instruction/\345\221\275\344\273\244\350\241\214\345\217\202\346\225\260.md" "b/docs/source/Instruction/\345\221\275\344\273\244\350\241\214\345\217\202\346\225\260.md"
index e4c47bae19..385fa17603 100644
--- "a/docs/source/Instruction/\345\221\275\344\273\244\350\241\214\345\217\202\346\225\260.md"
+++ "b/docs/source/Instruction/\345\221\275\344\273\244\350\241\214\345\217\202\346\225\260.md"
@@ -37,7 +37,7 @@
 - strict: if True, raise an error as soon as any row of the dataset has a problem; otherwise discard the bad rows. Defaults to False
 - 🔥model_name: only used for self-cognition tasks; pass the model's Chinese and English names, separated by a space
 - 🔥model_author: only used for self-cognition tasks; pass the model author's Chinese and English names, separated by a space
-- custom_dataset_info: register a simple custom dataset; see [Custom Dataset](../Customization/新增数据集.md)
+- custom_dataset_info: register a simple custom dataset; see [Custom Dataset](../Customization/自定义数据集.md)
 
 ### Template Arguments
 - 🔥template: chat template type; defaults to the template type matching the model. `swift pt` converts the chat template into a generation template
@@ -46,7 +46,7 @@
 - truncation_strategy: how to handle over-long inputs; supports `delete` and `left`, meaning dropping the sample or truncating from the left. Defaults to left
 - 🔥max_pixels: maximum number of pixels (H\*W) for image preprocessing in multimodal models; no scaling by default.
 - tools_prompt: the format for converting the tool list into the system field during agent training; see [Agent Training](./智能体的支持.md). Defaults to 'react_en'
-- loss_scale: how to weight the loss of added tokens during training. Defaults to `'default'`, meaning all responses (including history) are weighted 1 in the cross-entropy loss. For details, see [Pluginization](../Customization/插件.md) and [Agent Training](./智能体的支持.md)
+- loss_scale: how to weight the loss of added tokens during training. Defaults to `'default'`, meaning all responses (including history) are weighted 1 in the cross-entropy loss. For details, see [Pluginization](../Customization/插件化.md) and [Agent Training](./智能体的支持.md)
 - sequence_parallel_size: degree of sequence parallelism. See the [example](https://github.com/modelscope/ms-swift/tree/main/examples/train/sequence_parallel/train.sh)
 - use_chat_template: use the chat template or the generation template; defaults to `True`. `swift pt` automatically switches to the generation template
 - template_backend: use swift or jinja for inference. With jinja, transformers' `apply_chat_template` is used. Defaults to swift
diff --git "a/docs/source/Instruction/\346\231\272\350\203\275\344\275\223\347\232\204\346\224\257\346\214\201.md" "b/docs/source/Instruction/\346\231\272\350\203\275\344\275\223\347\232\204\346\224\257\346\214\201.md"
index 1eb1dabb93..0ad3f0ab64 100644
--- "a/docs/source/Instruction/\346\231\272\350\203\275\344\275\223\347\232\204\346\224\257\346\214\201.md"
+++ "b/docs/source/Instruction/\346\231\272\350\203\275\344\275\223\347\232\204\346\224\257\346\214\201.md"
@@ -237,12 +237,12 @@ To improve agent training, SWIFT provides the following techniques:
 
 The Thought and Final Answer parts are weighted 1, the Action and Action Input parts are weighted 2, the Observation: field itself is weighted 2, and the actual API call result after Observation: is weighted 0
 
-For the concrete loss_scale plugin design, please refer to the [Plugin](../Customization/插件.md) documentation.
+For the concrete loss_scale plugin design, please refer to the [Pluginization](../Customization/插件化.md) documentation.
 
 ### tools(--tools_prompt)
 
-The tools section is the format of the assembled system field. Besides the react_en/react_zh/toolbench formats introduced above, the glm4 format is also supported. Users can also define their own tools_prompt format; likewise, see the [Plugin](../Customization/插件.md) documentation.
+The tools section is the format of the assembled system field. Besides the react_en/react_zh/toolbench formats introduced above, the glm4 format is also supported. Users can also define their own tools_prompt format; likewise, see the [Pluginization](../Customization/插件化.md) documentation.
 
 For a complete agent training script, see [here](https://github.com/modelscope/ms-swift/tree/main/examples/train/agent/train.sh).
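The loss_scale weighting scheme for agent training described above (Thought/Final Answer weighted 1, Action/Action Input weighted 2, the "Observation:" marker weighted 2, and the API result after it weighted 0) can be sketched as follows. This is an illustrative sketch only, not ms-swift's actual plugin code; the function name `agent_loss_scale` and the regex-based splitting are assumptions made for the example:

```python
import re

# Illustrative sketch: assign per-span loss weights to a ReAct-style response.
# Hypothetical helper, not ms-swift's real loss_scale plugin implementation.
def agent_loss_scale(response: str) -> list[tuple[str, float]]:
    # Weight applied to each field marker itself.
    weights = {"Thought:": 1.0, "Final Answer:": 1.0,
               "Action:": 2.0, "Action Input:": 2.0, "Observation:": 2.0}
    # "Action Input:" must come before "Action:" in the alternation,
    # otherwise "Action:" would match first and split it incorrectly.
    pattern = r"(Thought:|Final Answer:|Action Input:|Action:|Observation:)"
    parts = re.split(pattern, response)  # capture group keeps the markers
    scaled, current = [], 1.0
    for part in parts:
        if not part:
            continue
        if part in weights:
            scaled.append((part, weights[part]))
            # Text after "Observation:" is the API call result -> weight 0.
            current = 0.0 if part == "Observation:" else weights[part]
        else:
            scaled.append((part, current))
    return scaled
```

Each returned `(text, weight)` pair would then scale the cross-entropy loss of the tokens in that span.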
diff --git "a/docs/source/Instruction/\351\242\204\350\256\255\347\273\203\345\217\212\345\276\256\350\260\203.md" "b/docs/source/Instruction/\351\242\204\350\256\255\347\273\203\345\217\212\345\276\256\350\260\203.md"
index 7b8810e6ea..8f1c0de715 100644
--- "a/docs/source/Instruction/\351\242\204\350\256\255\347\273\203\345\217\212\345\276\256\350\260\203.md"
+++ "b/docs/source/Instruction/\351\242\204\350\256\255\347\273\203\345\217\212\345\276\256\350\260\203.md"
@@ -2,7 +2,7 @@
 
 Since pre-training and fine-tuning are quite similar, this document covers both.
 
-For the data format requirements of pre-training and fine-tuning, please refer to the [Custom Dataset](../Customization/新增数据集.md) section.
+For the data format requirements of pre-training and fine-tuning, please refer to the [Custom Dataset](../Customization/自定义数据集.md) section.
 
 In terms of data volume, continued pre-training may require anywhere from hundreds of thousands to millions of rows; pre-training from scratch needs a very large number of GPUs and an enormous amount of data and is outside the scope of this document.
 Fine-tuning requires anywhere from a few thousand to a million rows; for smaller amounts of data, consider using RAG instead.
diff --git a/docs/source/index.rst b/docs/source/index.rst
index 9e8a4c4294..4ea240c330 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -35,8 +35,8 @@ Swift DOCUMENTATION
    :maxdepth: 2
    :caption: Customization
 
-   Customization/新增数据集.md
-   Customization/新增模型.md
+   Customization/自定义数据集.md
+   Customization/自定义模型.md
    Customization/插件化.md
 
 Indices and tables
diff --git a/docs/source_en/Customization/Custom-dataset.md b/docs/source_en/Customization/Custom-dataset.md
new file mode 100644
index 0000000000..1584572d22
--- /dev/null
+++ b/docs/source_en/Customization/Custom-dataset.md
@@ -0,0 +1,112 @@
+# Custom Dataset
+
+The standard ms-swift dataset format accepts the following keys: 'messages', 'rejected_response', 'label', 'images', 'videos', 'audios', 'tools', and 'objects'. Among these, 'messages' is required. 'rejected_response' is used for RLHF training such as DPO, 'label' is used for KTO training, and 'images', 'videos', and 'audios' store paths or URLs of multimodal data. 'tools' is for Agent tasks, and 'objects' is for grounding tasks.
+
+There are three core preprocessors in ms-swift: `MessagesPreprocessor`, `AlpacaPreprocessor`, and `ResponsePreprocessor`. `MessagesPreprocessor` converts datasets in messages and sharegpt formats to the standard format, `AlpacaPreprocessor` converts alpaca-format datasets, and `ResponsePreprocessor` converts datasets in query/response format. `AutoPreprocessor` automatically selects the appropriate preprocessor; it typically handles over 90% of cases.
+
+The following four formats will all be converted to the messages field of the ms-swift standard format by `AutoPreprocessor`:
+
+Messages format:
+```jsonl
+{"messages": [{"role": "system", "content": ""}, {"role": "user", "content": ""}, {"role": "assistant", "content": ""}, {"role": "user", "content": ""}, {"role": "assistant", "content": ""}]}
+```
+
+ShareGPT format:
+```jsonl
+{"system": "", "conversation": [{"human": "", "assistant": ""}, {"human": "", "assistant": ""}]}
+```
+
+Alpaca format:
+```jsonl
+{"system": "", "instruction": "", "input": "", "output": ""}
+```
+
+Query-Response format:
+```jsonl
+{"system": "", "query": "", "response": "", "history": [["", ""]]}
+```
+
+There are three ways to integrate a custom dataset, with increasing control over the preprocessing function:
+1. **Recommended**: Directly use `--dataset <dataset_path>` to integrate with AutoPreprocessor. This supports csv, json, jsonl, txt, and folder formats.
+2. Write a dataset_info.json file; see the built-in [dataset_info.json](https://github.com/modelscope/ms-swift/blob/main/swift/llm/dataset/data/dataset_info.json) in ms-swift for reference. One of ms_dataset_id/hf_dataset_id/dataset_path is required, and column-name conversion can be handled through the `columns` field. Format conversion uses AutoPreprocessor. Use `--custom_dataset_info xxx.json` to parse the JSON file.
+3. Manually register the dataset. This offers the most flexible control over the preprocessing function but is more complex. You can refer to the sample code in [examples](https://github.com/modelscope/swift/blob/main/examples/custom) and specify `--custom_register_path xxx.py` to parse the registered content.
+
+## Recommended Dataset Format
+
+Here is the recommended dataset format for ms-swift:
+
+### Pre-training
+
+```jsonl
+{"messages": [{"role": "assistant", "content": "I love music"}]}
+{"messages": [{"role": "assistant", "content": "Coach, I want to play basketball"}]}
+{"messages": [{"role": "assistant", "content": "Which is more authoritative, tomato and egg rice or the third fresh stir-fry?"}]}
+```
+
+### Supervised Fine-tuning
+
+```jsonl
+{"messages": [{"role": "system", "content": "You are a useful and harmless assistant"}, {"role": "user", "content": "Tell me tomorrow's weather"}, {"role": "assistant", "content": "Tomorrow's weather will be sunny"}]}
+{"messages": [{"role": "system", "content": "You are a useful and harmless math calculator"}, {"role": "user", "content": "What is 1 + 1?"}, {"role": "assistant", "content": "It equals 2"}, {"role": "user", "content": "What about adding 1?"}, {"role": "assistant", "content": "It equals 3"}]}
+```
+
+### RLHF
+
+#### DPO/ORPO/CPO/SimPO
+
+```jsonl
+{"messages": [{"role": "system", "content": "You are a useful and harmless assistant"}, {"role": "user", "content": "Tell me tomorrow's weather"}, {"role": "assistant", "content": "Tomorrow's weather will be sunny"}], "rejected_response": "I don't know"}
+{"messages": [{"role": "system", "content": "You are a useful and harmless math calculator"}, {"role": "user", "content": "What is 1 + 1?"}, {"role": "assistant", "content": "It equals 2"}, {"role": "user", "content": "What about adding 1?"}, {"role": "assistant", "content": "It equals 3"}], "rejected_response": "I don't know"}
+```
+
+#### KTO
+
+```jsonl
+{"messages": [{"role": "system", "content": "You are a useful and harmless assistant"}, {"role": "user", "content": "Tell me tomorrow's weather"}, {"role": "assistant", "content": "I don't know"}], "label": false}
+{"messages": [{"role": "system", "content": "You are a useful and harmless math calculator"}, {"role": "user", "content": "What is 1 + 1?"}, {"role": "assistant", "content": "It equals 2"}, {"role": "user", "content": "What about adding 1?"}, {"role": "assistant", "content": "It equals 3"}], "label": true}
+```
+
+### Multimodal
+
+For multimodal datasets, the format is the same as above. The difference is that it includes the keys `images`, `videos`, and `audios`, which hold the multimodal resources:
+```jsonl
+{"messages": [{"role": "system", "content": "You are a useful and harmless assistant"}, {"role": "user", "content": "<image>What is in the image?