diff --git "a/docs/source/LLM/\345\221\275\344\273\244\350\241\214\345\217\202\346\225\260.md" "b/docs/source/LLM/\345\221\275\344\273\244\350\241\214\345\217\202\346\225\260.md"
index e2fd8ff09..006064edd 100644
--- "a/docs/source/LLM/\345\221\275\344\273\244\350\241\214\345\217\202\346\225\260.md"
+++ "b/docs/source/LLM/\345\221\275\344\273\244\350\241\214\345\217\202\346\225\260.md"
@@ -108,7 +108,7 @@
 
 ### FSDP参数
 - `--fsdp`: 默认值`''`, fsdp类型, 详情可以查看该参数[原始文档](https://huggingface.co/docs/transformers/v4.39.3/en/main_classes/trainer#transformers.TrainingArguments.fsdp).
-- `--fsdp_config`: 默认值`None`, fsdp配置文件的路径, 支持传入`fsdp_offload`, 该文件为SWIFT提供的默认配置, 具体可以查看[这里](https://github.com/modelscope/swift/tree/main/swift/llm/fsdp_config/fsdp_offload.json).
+- `--fsdp_config`: 默认值`None`, fsdp配置文件的路径.
 
 ### LoRA+微调参数
 
diff --git a/docs/source_en/LLM/Command-line-parameters.md b/docs/source_en/LLM/Command-line-parameters.md
index 2fda244de..a2b9c1787 100644
--- a/docs/source_en/LLM/Command-line-parameters.md
+++ b/docs/source_en/LLM/Command-line-parameters.md
@@ -107,9 +107,9 @@
 
 ### FSDP Parameters
 
-- `--fsdp`: Default value`''`, the FSDP type, please check[this documentation](https://huggingface.co/docs/transformers/v4.39.3/en/main_classes/trainer#transformers.TrainingArguments.fsdp) for details.
+- `--fsdp`: Default value `''`, the FSDP type, please check [this documentation](https://huggingface.co/docs/transformers/v4.39.3/en/main_classes/trainer#transformers.TrainingArguments.fsdp) for details.
 
-- `--fsdp_config`: Default value`None`, the FSDP config file path, `fsdp_offload` is a special value, check [here](https://github.com/modelscope/swift/tree/main/swift/llm/fsdp_config/fsdp_offload.json) for details.
+- `--fsdp_config`: Default value `None`, the FSDP config file path.
 
 ### LoRA+ Fine-tuning Parameters
 
diff --git a/swift/llm/fsdp_config/fsdp_offload.json b/examples/pytorch/llm/scripts/llama2_70b_chat/qlora_fsdp/fsdp_offload.json
similarity index 100%
rename from swift/llm/fsdp_config/fsdp_offload.json
rename to examples/pytorch/llm/scripts/llama2_70b_chat/qlora_fsdp/fsdp_offload.json
diff --git a/examples/pytorch/llm/scripts/llama2_70b_chat/qlora_fsdp/sft.sh b/examples/pytorch/llm/scripts/llama2_70b_chat/qlora_fsdp/sft.sh
index 6270cf28e..714c61d3b 100644
--- a/examples/pytorch/llm/scripts/llama2_70b_chat/qlora_fsdp/sft.sh
+++ b/examples/pytorch/llm/scripts/llama2_70b_chat/qlora_fsdp/sft.sh
@@ -4,7 +4,7 @@ nproc_per_node=2
 
 PYTHONPATH=../../.. \
 CUDA_VISIBLE_DEVICES=0,1 \
-accelerate launch --config_file "../../../swift/llm/fsdp_config/fsdp_offload.json" \
+accelerate launch --config_file "./scripts/llama2_70b_chat/qlora_fsdp/fsdp_offload.json" \
 llm_sft.py \
     --model_type llama2-70b-chat \
     --model_revision master \
diff --git a/swift/llm/utils/argument.py b/swift/llm/utils/argument.py
index 6167d1a61..7c21f633b 100644
--- a/swift/llm/utils/argument.py
+++ b/swift/llm/utils/argument.py
@@ -290,11 +290,6 @@ def __post_init__(self) -> None:
             self.deepspeed = os.path.abspath(
                 os.path.join(ds_config_folder, 'zero3.json'))
 
-        fsdp_config_folder = os.path.join(__file__, '..', '..', 'fsdp_config')
-        if self.fsdp_config == 'fsdp_offload':
-            self.fsdp_config = os.path.abspath(
-                os.path.join(fsdp_config_folder, 'fsdp_offload.json'))
-
         handle_path(self)
         set_model_type(self)
         if isinstance(self.dataset, str):
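
Reviewer note (not part of the patch): a minimal usage sketch under the new layout. Since fsdp_offload.json now lives beside the example script and the 'fsdp_offload' shorthand was removed from SftArguments.__post_init__, callers pass explicit paths. The launch line mirrors the updated sft.sh; the trailing llm_sft.py arguments are truncated in the hunk above, so only the flags visible in this diff are shown.

    # Run from examples/pytorch/llm (the relative PYTHONPATH in sft.sh assumes this directory).
    PYTHONPATH=../../.. \
    CUDA_VISIBLE_DEVICES=0,1 \
    accelerate launch --config_file "./scripts/llama2_70b_chat/qlora_fsdp/fsdp_offload.json" \
        llm_sft.py \
        --model_type llama2-70b-chat \
        --model_revision master
    # Likewise, with the shorthand gone, --fsdp_config (if used) must be an explicit
    # path, e.g. --fsdp_config ./scripts/llama2_70b_chat/qlora_fsdp/fsdp_offload.json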