Model training error #41

Closed
svjack opened this issue Jun 13, 2023 · 9 comments
Labels
bug Something isn't working

Comments

svjack commented Jun 13, 2023

I used textgen/examples/llama/training_llama_demo.py
to fine-tune the model https://huggingface.co/shibing624/chinese-llama-plus-13b-hf
on the example dataset data/zh_csc_train.tsv
and got the following error:
assertion srcindex < srcselectdimsize failed.

Switching to another model, e.g. https://huggingface.co/shibing624/chinese-alpaca-plus-13b-hf,
works fine.
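
For context, this assertion comes from PyTorch's CUDA index-select kernel and means that an index was not smaller than the size of the dimension being indexed. One frequent trigger in general, not confirmed as the cause in this issue, is a token id that is greater than or equal to the number of rows in the model's embedding table, for example when the tokenizer and the checkpoint disagree on vocabulary size. The sketch below checks that condition in plain Python; `find_out_of_range_ids` and the sample numbers are hypothetical and not part of textgen.

```python
from typing import List, Sequence


def find_out_of_range_ids(token_ids: Sequence[int], embedding_rows: int) -> List[int]:
    """Return the token ids that would index outside the embedding table."""
    return [i for i in token_ids if i < 0 or i >= embedding_rows]


# Hypothetical numbers for illustration only: a table with 5 rows
# (valid ids 0..4) and a batch that contains the id 5.
bad_ids = find_out_of_range_ids([0, 3, 5, 4], embedding_rows=5)
print(bad_ids)
```

With a real Hugging Face model, the two numbers to compare would be `len(tokenizer)` and `model.get_input_embeddings().weight.shape[0]`. Whether they differ for chinese-llama-plus-13b-hf is not established in this thread.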

svjack added the bug label Jun 13, 2023
shibing624 (Owner) commented

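Yes, I have noticed this problem. llama-plus-13b also raises this error when running prediction directly on a local machine, while alpaca does not.

It may be caused by a transformers upgrade; I am still investigating.

To train a 13b model, you can substitute another model, such as alpaca-13b or ziya-13b.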

svjack commented Jun 13, 2023

The predict method of the llama model seems to expose rather few GenerationConfig parameters (that is, ones not already fixed by default arguments).
Would it be better if **kwargs could override generation_config?

shibing624 (Owner) commented

svjack commented Jun 13, 2023

There are kwargs: https://github.com/shibing624/textgen/blob/main/textgen/llama/llama_model.py#L519

How can parameters like these be changed?

repetition_penalty=self.args.repetition_penalty,
length_penalty=self.args.length_penalty,

svjack commented Jun 13, 2023

shibing624 commented Jun 13, 2023

Write it like this: https://github.com/shibing624/textgen/blob/main/examples/llama/training_llama_demo.py#L53. Put the parameters into model_args and they will automatically override the defaults.
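
As a rough illustration of the idea in this reply, the sketch below shows a dictionary of overrides being merged into an argument object at initialization. `SketchArgs`, its default values, and its `update_from_dict` method are hypothetical stand-ins written for this example; the linked demo's real model_args and textgen's real argument class are not reproduced here.

```python
from dataclasses import dataclass, fields


@dataclass
class SketchArgs:
    # Hypothetical defaults, not the library's real values.
    max_length: int = 128
    repetition_penalty: float = 1.0
    length_penalty: float = 1.0

    def update_from_dict(self, overrides: dict) -> None:
        """Overwrite defaults with user values, rejecting unknown keys."""
        known = {f.name for f in fields(self)}
        for key, value in overrides.items():
            if key not in known:
                raise KeyError(f"unknown argument: {key}")
            setattr(self, key, value)


# Only the keys present in model_args replace the defaults.
model_args = {"repetition_penalty": 1.3}
args = SketchArgs()
args.update_from_dict(model_args)
print(args)
```

Rejecting unknown keys is a design choice of this sketch, made so that a misspelled parameter name fails loudly instead of being silently ignored.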

svjack commented Jun 13, 2023

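> Write it like this: https://github.com/shibing624/textgen/blob/main/examples/llama/training_llama_demo.py#L53. Put the parameters into model_args and they will automatically override the defaults.

I feel that some of these parameters should not be fixed at initialization but should be settable dynamically at generation time.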

shibing624 (Owner) commented

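The values specified at initialization are the defaults; values passed at generation time can override them, as already happens with the max_length parameter. The other parameters will override the defaults in the same way; I will add this.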
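
The precedence described here, where call-time values win over initialization defaults, can be sketched as follows. `SketchModel`, `resolve_generation_params`, and the default values are hypothetical names and numbers chosen for illustration; this is not the code that was added to textgen.

```python
from typing import Any, Dict


class SketchModel:
    def __init__(self, **defaults: Any) -> None:
        # Hypothetical built-in defaults, updated by init-time values.
        self.defaults: Dict[str, Any] = {
            "max_length": 128,
            "repetition_penalty": 1.0,
            "length_penalty": 1.0,
        }
        self.defaults.update(defaults)

    def resolve_generation_params(self, **kwargs: Any) -> Dict[str, Any]:
        # Later entries win, so call-time kwargs override the defaults.
        # A new dict is built, so the stored defaults stay unchanged.
        return {**self.defaults, **kwargs}


model = SketchModel(repetition_penalty=1.3)
params = model.resolve_generation_params(max_length=256, length_penalty=0.8)
print(params)
```

Because the merge returns a fresh dictionary, an override passed to one call does not leak into the next call.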

shibing624 (Owner) commented

done
