ValueError: remaining_argv: ['--ld_alpha', '0.8'] #5979

@hengyu-wang

Hi, I updated my ms-swift version, but I still cannot use ld_alpha.
Here is my training script:
uv run torchrun \
--nproc_per_node=$nproc_per_node \
--nnodes=$nnodes \
--master_addr=$master_addr \
--master_port=$master_port \
--node_rank=$node_rank \
swift/cli/rlhf.py \
--rlhf_type dpo \
--model_type "gemma3_vision" \
--model "Gemma3" \
--train_type full \
--torch_dtype "bfloat16" \
--attn_impl flash_attn \
--dataset "dpo_r1.jsonl" \
--train_dataloader_shuffle true \
--split_dataset_ratio 0 \
--output_dir "./output/Gemma3_27B" \
--eval_strategy "epoch" \
--save_strategy "steps" \
--save_steps 10000 \
--save_total_limit 5 \
--save_only_model true \
--num_train_epochs 1 \
--per_device_train_batch_size 1 \
--per_device_eval_batch_size 1 \
--learning_rate 5e-7 \
--lr_scheduler_type "cosine" \
--warmup_ratio 0.02 \
--gradient_accumulation_steps 16 \
--logging_steps 10 \
--max_length 4096 \
--dataloader_num_workers 16 \
--gradient_checkpointing true \
--deepspeed "zero3" \
--rpo_alpha 0 \
--beta 1 \
--ld_alpha 0.8 \
--report_to wandb

The installed package versions are:
absl-py==2.3.1
accelerate==1.10.1
addict==2.4.0
aiofiles==24.1.0
aiohappyeyeballs==2.6.1
aiohttp==3.12.15
aiosignal==1.4.0
airportsdata==20250909
aliyun-python-sdk-core==2.16.0
aliyun-python-sdk-kms==2.16.5
annotated-types==0.7.0
antlr4-python3-runtime==4.13.2
anyio==4.11.0
astor==0.8.1
async-timeout==5.0.1
attrdict==2.0.1
attrs==25.3.0
av==15.1.0
cachetools==6.2.0
certifi==2025.8.3
cffi==2.0.0
charset-normalizer==3.4.3
click==8.3.0
cloudpickle==3.1.1
compressed-tensors==0.9.3
contourpy==1.3.2
cpm-kernels==1.0.11
crcmod==1.7
cryptography==46.0.1
cupy-cuda12x==13.6.0
cycler==0.12.1
dacite==1.9.2
datasets==3.6.0
decord==0.6.0
deepspeed==0.16.9
deprecated==1.2.18
depyf==0.18.0
dill==0.3.8
diskcache==5.6.3
distro==1.9.0
dnspython==2.8.0
docker-pycreds==0.4.0
einops==0.8.1
email-validator==2.3.0
exceptiongroup==1.3.0
fastapi==0.117.1
fastapi-cli==0.0.13
fastapi-cloud-cli==0.2.1
fastrlock==0.8.3
ffmpy==0.6.1
filelock==3.19.1
flash-attn==2.7.3
fonttools==4.60.0
frozenlist==1.7.0
fsspec==2025.3.0
future==1.0.0
gguf==0.17.1
gitdb==4.0.12
gitpython==3.1.45
googleapis-common-protos==1.70.0
gradio==5.47.1
gradio-client==1.13.2
groovy==0.1.2
grpcio==1.75.0
h11==0.16.0
hf-xet==1.1.10
hjson==3.1.0
httpcore==1.0.9
httptools==0.6.4
httpx==0.28.1
huggingface-hub==0.35.1
idna==3.10
importlib-metadata==8.0.0
interegular==0.3.3
ipaddress==1.0.23
jieba==0.42.1
jinja2==3.1.6
jiter==0.11.0
jmespath==0.10.0
joblib==1.5.2
jsonschema==4.25.1
jsonschema-specifications==2025.9.1
kiwisolver==1.4.9
lark==1.2.2
latex2sympy2-extended==1.10.2
llguidance==0.7.30
llvmlite==0.44.0
lm-format-enforcer==0.10.12
markdown==3.9
markdown-it-py==4.0.0
markupsafe==3.0.2
math-verify==0.8.0
matplotlib==3.10.6
mdurl==0.1.2
mistral-common==1.8.5
modelscope==1.30.0
mpi4py==4.1.0
mpmath==1.3.0
ms-swift==3.8.2
msgpack==1.1.1
msgspec==0.19.0
multidict==6.6.4
multiprocess==0.70.16
nest-asyncio==1.6.0
networkx==3.4.2
ninja==1.13.0
nltk==3.9.1
numba==0.61.2
numpy==2.2.6
nvidia-cublas-cu12==12.4.5.8
nvidia-cuda-cupti-cu12==12.4.127
nvidia-cuda-nvrtc-cu12==12.4.127
nvidia-cuda-runtime-cu12==12.4.127
nvidia-cudnn-cu12==9.1.0.70
nvidia-cufft-cu12==11.2.1.3
nvidia-curand-cu12==10.3.5.147
nvidia-cusolver-cu12==11.6.1.9
nvidia-cusparse-cu12==12.3.1.170
nvidia-cusparselt-cu12==0.6.2
nvidia-nccl-cu12==2.21.5
nvidia-nvjitlink-cu12==12.4.127
nvidia-nvtx-cu12==12.4.127
openai==1.109.1
opencv-python-headless==4.12.0.88
opentelemetry-api==1.26.0
opentelemetry-exporter-otlp==1.26.0
opentelemetry-exporter-otlp-proto-common==1.26.0
opentelemetry-exporter-otlp-proto-grpc==1.26.0
opentelemetry-exporter-otlp-proto-http==1.26.0
opentelemetry-proto==1.26.0
opentelemetry-sdk==1.26.0
opentelemetry-semantic-conventions==0.47b0
opentelemetry-semantic-conventions-ai==0.4.13
orjson==3.11.3
oss2==2.19.1
outlines==0.1.11
outlines-core==0.1.26
packaging==25.0
pandas==2.3.2
partial-json-parser==0.2.1.1.post6
peft==0.17.1
pillow==11.3.0
ply==3.11
prometheus-client==0.23.1
prometheus-fastapi-instrumentator==7.1.0
promise==2.3
propcache==0.3.2
protobuf==3.20.3
psutil==7.1.0
py-cpuinfo==9.0.0
pyarrow==21.0.0
pycountry==24.6.1
pycparser==2.23
pycryptodome==3.23.0
pydantic==2.11.9
pydantic-core==2.33.2
pydantic-extra-types==2.10.5
pydub==0.25.1
pygments==2.19.2
pyjwt==2.10.1
pyopenssl==25.3.0
pyparsing==3.2.5
python-dateutil==2.9.0.post0
python-dotenv==1.1.1
python-etcd==0.4.5
python-json-logger==3.3.0
python-multipart==0.0.20
pytz==2025.2
pyyaml==6.0.3
pyzmq==27.1.0
qwen-vl-utils==0.0.14
ray==2.49.2
referencing==0.36.2
regex==2025.9.18
requests==2.32.5
rich==14.1.0
rich-toolkit==0.15.1
rignore==0.6.4
rouge==1.0.1
rpds-py==0.27.1
ruff==0.13.2
safehttpx==0.1.6
safetensors==0.5.3
schedule==1.2.2
scipy==1.15.3
semantic-version==2.10.0
sentencepiece==0.2.1
sentry-sdk==2.39.0
setproctitle==1.3.7
setuptools==80.9.0
shellingham==1.5.4
shortuuid==1.0.13
simplejson==3.20.1
six==1.17.0
smmap==5.0.2
sniffio==1.3.1
sortedcontainers==2.4.0
starlette==0.48.0
sympy==1.13.1
tensorboard==2.20.0
tensorboard-data-server==0.7.2
thriftpy2==0.5.3
tiktoken==0.11.0
timm==1.0.20
tokenizers==0.21.4
tomlkit==0.13.3
torch==2.6.0
torchaudio==2.6.0
torchvision==0.21.0
tos==2.8.7
tqdm==4.67.1
transformers==4.51.3
transformers-stream-generator==0.0.5
triton==3.2.0
trl==0.19.1
typer==0.19.2
typing-extensions==4.15.0
typing-inspection==0.4.1
tzdata==2025.2
urllib3==1.26.20
uvicorn==0.37.0
uvloop==0.21.0
vllm==0.8.5.post1
watchfiles==1.1.0
websockets==15.0.1
werkzeug==3.1.3
wrapt==1.17.3
xformers==0.0.29.post2
xgrammar==0.1.18
xxhash==3.5.0
yarl==1.20.1
zipp==3.23.0
zstandard==0.25.0
Any help would be greatly appreciated 🙏
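
For readers trying to interpret the error in the title: the message suggests that the argument parser did not recognise `--ld_alpha` and left it, together with its value, in the list of unparsed arguments. The sketch below reproduces that general mechanism with the standard-library `argparse` only. It is not ms-swift's actual parser; the function name `parse_strict` and the small set of registered flags are assumptions made for illustration.

```python
import argparse


def parse_strict(argv):
    """Parse argv and raise if any argument is not recognised.

    Hypothetical helper for illustration; not part of ms-swift.
    """
    parser = argparse.ArgumentParser()
    # Assumed subset of flags. `--ld_alpha` is deliberately NOT registered,
    # mimicking a parser whose argument definitions lack that field.
    parser.add_argument("--rlhf_type")
    parser.add_argument("--beta", type=float)
    parser.add_argument("--rpo_alpha", type=float)

    # parse_known_args returns the parsed namespace plus whatever was left over.
    args, remaining_argv = parser.parse_known_args(argv)
    if remaining_argv:
        raise ValueError(f"remaining_argv: {remaining_argv}")
    return args


if __name__ == "__main__":
    try:
        parse_strict(["--rlhf_type", "dpo", "--beta", "1", "--ld_alpha", "0.8"])
    except ValueError as err:
        print(err)
```

If the real cause is similar, the code that defines the arguments at run time does not know about `ld_alpha`. One thing worth checking, as an assumption rather than a diagnosis, is whether the `swift/cli/rlhf.py` being launched comes from the same ms-swift version as the installed package.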
