WizardCoder-Python-7B model accuracy issue #17

Closed
llwx593 opened this issue Jan 11, 2024 · 4 comments

Comments

@llwx593

llwx593 commented Jan 11, 2024

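Hello. I tested wizardcoder-python-7b on human_eval with weight_mask_rate=0.9 and the accuracy was only about 25. I then tested the just_inference case, i.e. weight_mask_rate=0.0, and the accuracy was still only 34.7. The test commands are as follows: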
CUDA_VISIBLE_DEVICES=7 python -u inference_llms_instruct_math_code.py --dataset_name human_eval --finetuned_model_name WizardCoder-Python-7B-V1.0 --tensor_parallel_size 1 --weight_mask_rate 0.9 --use_weight_rescale
CUDA_VISIBLE_DEVICES=7 python -u inference_llms_instruct_math_code.py --dataset_name human_eval --finetuned_model_name WizardCoder-Python-7B-V1.0 --tensor_parallel_size 1 --weight_mask_rate 0.0
Everything else is identical to the code currently in the repository. Do any hyperparameters need special settings?

@yule-BUAA
Owner

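Hello,

The commands we use are the same, and no specific hyperparameters need to be set at run time.

Could you check whether one of the following two causes applies:

  1. Is the downloaded wizardcoder-python-7b model correct? If the performance reproduced with this repository's code is low, try reproducing it by following the instructions in WizardLM and see whether the performance changes;
  2. Check the vllm version you are using. I use 0.1.4, and I am not sure whether a different version affects the model's inference performance.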

@llwx593
Author

llwx593 commented Jan 12, 2024

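I checked whether the model or the vllm version was the problem, and neither appears to be the cause. I then compared how the official WizardLM repository and your repository run inference on the human_eval task (the official WizardLM repository seems to provide an eval script only for wizardcoder-python-34b) and found two differences:

  1. temperature: the WizardLM repository sets it to 0.2, while MergeLM sets it to 0.0
  2. decoding_style: the WizardLM repository uses loops=100, while MergeLM uses loops=1

I am currently testing the accuracy with the WizardLM repository, but could the gap be related to the two parameters above?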

@yule-BUAA
Owner

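In my runs, temperature is set to 0.0 so that the output uses a deterministic greedy strategy, and setting loops to 1 does not affect the model's results either. I don't think these two parameters are the problem.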
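
To illustrate why a temperature of 0.0 gives deterministic output while 0.2 does not, here is a minimal sketch in plain Python. It is not the code used by vllm or by either repository; `sample_token` and the logits are hypothetical, and treating temperature 0.0 as an argmax is the assumption being demonstrated.

```python
import math
import random


def sample_token(logits, temperature, rng=random):
    """Pick a token index from raw logits (hypothetical helper).

    temperature == 0.0 is treated as greedy decoding (argmax), which is
    deterministic; any positive temperature samples from the softmax.
    """
    if temperature == 0.0:
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    top = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


logits = [1.0, 3.0, 2.5]  # made-up scores for three candidate tokens

# Greedy decoding always returns index 1, the largest logit.
greedy = {sample_token(logits, 0.0) for _ in range(100)}

# With temperature 0.2 the result depends on the random state.
sampled = {sample_token(logits, 0.2, random.Random(seed)) for seed in range(100)}
```

With sampling enabled, repeated runs can produce different completions, so scores from a temperature of 0.2 and a temperature of 0.0 are not expected to match exactly.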

Have you tried WizardCoder-Python-13B or WizardCoder-Python-34B? Can their performance be reproduced?

@llwx593
Author

llwx593 commented Jan 12, 2024

After changing vllm to version 0.1.4 and transformers to version 4.33.1, the accuracy is normal. Thanks.
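
For anyone hitting the same gap, a small sketch for comparing the installed packages against the versions reported in this thread. It assumes Python 3.8+ (for `importlib.metadata`); `EXPECTED` and `check_versions` are hypothetical names, and the pins are simply the two versions mentioned above, not an official requirements list.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions reported in this thread; adjust if your setup differs.
EXPECTED = {"vllm": "0.1.4", "transformers": "4.33.1"}


def check_versions(expected, get_version=version):
    """Return {package: (installed_or_None, wanted)} for every mismatch."""
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = get_version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[package] = (installed, wanted)
    return mismatches


if __name__ == "__main__":
    for package, (installed, wanted) in check_versions(EXPECTED).items():
        print(f"{package}: installed {installed}, thread reports {wanted}")
```

An empty result means both packages match the versions that worked here.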

llwx593 closed this as completed Jan 12, 2024