Hi, I updated my ms-swift version, but I still cannot use ld_alpha.
Here is my training script:
uv run torchrun \
    --nproc_per_node=$nproc_per_node \
    --nnodes=$nnodes \
    --master_addr=$master_addr \
    --master_port=$master_port \
    --node_rank=$node_rank \
    swift/cli/rlhf.py \
    --rlhf_type dpo \
    --model_type "gemma3_vision" \
    --model "Gemma3" \
    --train_type full \
    --torch_dtype "bfloat16" \
    --attn_impl flash_attn \
    --dataset "dpo_r1.jsonl" \
    --train_dataloader_shuffle true \
    --split_dataset_ratio 0 \
    --output_dir "./output/Gemma3_27B" \
    --eval_strategy "epoch" \
    --save_strategy "steps" \
    --save_steps 10000 \
    --save_total_limit 5 \
    --save_only_model true \
    --num_train_epochs 1 \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 1 \
    --learning_rate 5e-7 \
    --lr_scheduler_type "cosine" \
    --warmup_ratio 0.02 \
    --gradient_accumulation_steps 16 \
    --logging_steps 10 \
    --max_length 4096 \
    --dataloader_num_workers 16 \
    --gradient_checkpointing true \
    --deepspeed "zero3" \
    --rpo_alpha 0 \
    --beta 1 \
    --ld_alpha 0.8 \
    --report_to wandb
The installed package versions are:
absl-py==2.3.1
accelerate==1.10.1
addict==2.4.0
aiofiles==24.1.0
aiohappyeyeballs==2.6.1
aiohttp==3.12.15
aiosignal==1.4.0
airportsdata==20250909
aliyun-python-sdk-core==2.16.0
aliyun-python-sdk-kms==2.16.5
annotated-types==0.7.0
antlr4-python3-runtime==4.13.2
anyio==4.11.0
astor==0.8.1
async-timeout==5.0.1
attrdict==2.0.1
attrs==25.3.0
av==15.1.0
cachetools==6.2.0
certifi==2025.8.3
cffi==2.0.0
charset-normalizer==3.4.3
click==8.3.0
cloudpickle==3.1.1
compressed-tensors==0.9.3
contourpy==1.3.2
cpm-kernels==1.0.11
crcmod==1.7
cryptography==46.0.1
cupy-cuda12x==13.6.0
cycler==0.12.1
dacite==1.9.2
datasets==3.6.0
decord==0.6.0
deepspeed==0.16.9
deprecated==1.2.18
depyf==0.18.0
dill==0.3.8
diskcache==5.6.3
distro==1.9.0
dnspython==2.8.0
docker-pycreds==0.4.0
einops==0.8.1
email-validator==2.3.0
exceptiongroup==1.3.0
fastapi==0.117.1
fastapi-cli==0.0.13
fastapi-cloud-cli==0.2.1
fastrlock==0.8.3
ffmpy==0.6.1
filelock==3.19.1
flash-attn==2.7.3
fonttools==4.60.0
frozenlist==1.7.0
fsspec==2025.3.0
future==1.0.0
gguf==0.17.1
gitdb==4.0.12
gitpython==3.1.45
googleapis-common-protos==1.70.0
gradio==5.47.1
gradio-client==1.13.2
groovy==0.1.2
grpcio==1.75.0
h11==0.16.0
hf-xet==1.1.10
hjson==3.1.0
httpcore==1.0.9
httptools==0.6.4
httpx==0.28.1
huggingface-hub==0.35.1
idna==3.10
importlib-metadata==8.0.0
interegular==0.3.3
ipaddress==1.0.23
jieba==0.42.1
jinja2==3.1.6
jiter==0.11.0
jmespath==0.10.0
joblib==1.5.2
jsonschema==4.25.1
jsonschema-specifications==2025.9.1
kiwisolver==1.4.9
lark==1.2.2
latex2sympy2-extended==1.10.2
llguidance==0.7.30
llvmlite==0.44.0
lm-format-enforcer==0.10.12
markdown==3.9
markdown-it-py==4.0.0
markupsafe==3.0.2
math-verify==0.8.0
matplotlib==3.10.6
mdurl==0.1.2
mistral-common==1.8.5
modelscope==1.30.0
mpi4py==4.1.0
mpmath==1.3.0
ms-swift==3.8.2
msgpack==1.1.1
msgspec==0.19.0
multidict==6.6.4
multiprocess==0.70.16
nest-asyncio==1.6.0
networkx==3.4.2
ninja==1.13.0
nltk==3.9.1
numba==0.61.2
numpy==2.2.6
nvidia-cublas-cu12==12.4.5.8
nvidia-cuda-cupti-cu12==12.4.127
nvidia-cuda-nvrtc-cu12==12.4.127
nvidia-cuda-runtime-cu12==12.4.127
nvidia-cudnn-cu12==9.1.0.70
nvidia-cufft-cu12==11.2.1.3
nvidia-curand-cu12==10.3.5.147
nvidia-cusolver-cu12==11.6.1.9
nvidia-cusparse-cu12==12.3.1.170
nvidia-cusparselt-cu12==0.6.2
nvidia-nccl-cu12==2.21.5
nvidia-nvjitlink-cu12==12.4.127
nvidia-nvtx-cu12==12.4.127
openai==1.109.1
opencv-python-headless==4.12.0.88
opentelemetry-api==1.26.0
opentelemetry-exporter-otlp==1.26.0
opentelemetry-exporter-otlp-proto-common==1.26.0
opentelemetry-exporter-otlp-proto-grpc==1.26.0
opentelemetry-exporter-otlp-proto-http==1.26.0
opentelemetry-proto==1.26.0
opentelemetry-sdk==1.26.0
opentelemetry-semantic-conventions==0.47b0
opentelemetry-semantic-conventions-ai==0.4.13
orjson==3.11.3
oss2==2.19.1
outlines==0.1.11
outlines-core==0.1.26
packaging==25.0
pandas==2.3.2
partial-json-parser==0.2.1.1.post6
peft==0.17.1
pillow==11.3.0
ply==3.11
prometheus-client==0.23.1
prometheus-fastapi-instrumentator==7.1.0
promise==2.3
propcache==0.3.2
protobuf==3.20.3
psutil==7.1.0
py-cpuinfo==9.0.0
pyarrow==21.0.0
pycountry==24.6.1
pycparser==2.23
pycryptodome==3.23.0
pydantic==2.11.9
pydantic-core==2.33.2
pydantic-extra-types==2.10.5
pydub==0.25.1
pygments==2.19.2
pyjwt==2.10.1
pyopenssl==25.3.0
pyparsing==3.2.5
python-dateutil==2.9.0.post0
python-dotenv==1.1.1
python-etcd==0.4.5
python-json-logger==3.3.0
python-multipart==0.0.20
pytz==2025.2
pyyaml==6.0.3
pyzmq==27.1.0
qwen-vl-utils==0.0.14
ray==2.49.2
referencing==0.36.2
regex==2025.9.18
requests==2.32.5
rich==14.1.0
rich-toolkit==0.15.1
rignore==0.6.4
rouge==1.0.1
rpds-py==0.27.1
ruff==0.13.2
safehttpx==0.1.6
safetensors==0.5.3
schedule==1.2.2
scipy==1.15.3
semantic-version==2.10.0
sentencepiece==0.2.1
sentry-sdk==2.39.0
setproctitle==1.3.7
setuptools==80.9.0
shellingham==1.5.4
shortuuid==1.0.13
simplejson==3.20.1
six==1.17.0
smmap==5.0.2
sniffio==1.3.1
sortedcontainers==2.4.0
starlette==0.48.0
sympy==1.13.1
tensorboard==2.20.0
tensorboard-data-server==0.7.2
thriftpy2==0.5.3
tiktoken==0.11.0
timm==1.0.20
tokenizers==0.21.4
tomlkit==0.13.3
torch==2.6.0
torchaudio==2.6.0
torchvision==0.21.0
tos==2.8.7
tqdm==4.67.1
transformers==4.51.3
transformers-stream-generator==0.0.5
triton==3.2.0
trl==0.19.1
typer==0.19.2
typing-extensions==4.15.0
typing-inspection==0.4.1
tzdata==2025.2
urllib3==1.26.20
uvicorn==0.37.0
uvloop==0.21.0
vllm==0.8.5.post1
watchfiles==1.1.0
websockets==15.0.1
werkzeug==3.1.3
wrapt==1.17.3
xformers==0.0.29.post2
xgrammar==0.1.18
xxhash==3.5.0
yarl==1.20.1
zipp==3.23.0
zstandard==0.25.0
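Since the list above is a static snapshot, it may help to confirm which versions the interpreter launched by `uv run` actually sees. The following is only a minimal sketch using the standard-library `importlib.metadata`; the helper name `installed_version` is hypothetical, and it is an assumption (not confirmed here) that whether `--ld_alpha` is accepted depends on the ms-swift and trl versions imported at run time.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Packages assumed to be relevant to the ld_alpha argument (an assumption).
    for name in ("ms-swift", "trl", "transformers"):
        print(f"{name}: {installed_version(name)}")
```

Running it as `uv run python check_versions.py` (the file name is illustrative) in the same project would show whether the environment used for training matches the list above.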
Any help would be greatly appreciated 🙏