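Hello! I used the CASIA dataset to generate my own mixed training and validation sets. Training runs fine, but the validation stage after each epoch hangs intermittently. The function called during validation is trainer\trainer.py: validation(self, epoch). Do you have a solution? Thank you!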
Hello, do you have any error output you could share for reference?
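Hello, and thank you for the reply! There is no error message; the only symptom is that CPU utilization drops straight to 0. Killing the process does not return any exception information either. I ran some experiments and found that reducing num_worker to below 16 lowers the probability of the problem occurring. The machine has an Intel(R) Xeon(R) Gold 5218R CPU. Could it be a scheduling problem during data loading?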
It may be that the configured num_workers exceeds the number of CPU threads, which causes the worker processes to block.
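One way to rule this out is to clamp the worker count to the CPUs the process can actually use before building the validation loader. The following is only a minimal sketch under that assumption: the helper names `available_cpus` and `capped_num_workers` and the `reserve` parameter are hypothetical and not part of this repository, and it has not been confirmed that this removes the hang.

```python
import os


def available_cpus() -> int:
    """Return the number of logical CPUs this process is allowed to run on."""
    try:
        # Linux: honours CPU affinity masks (e.g. set via taskset).
        return len(os.sched_getaffinity(0))
    except AttributeError:
        # Platforms without sched_getaffinity (e.g. macOS, Windows).
        return os.cpu_count() or 1


def capped_num_workers(requested: int, reserve: int = 1) -> int:
    """Clamp a requested worker count to the usable CPUs.

    `reserve` CPUs are left free for the main training process
    (hypothetical parameter, chosen for illustration).
    """
    if requested < 0 or reserve < 0:
        raise ValueError("requested and reserve must be non-negative")
    limit = max(available_cpus() - reserve, 0)
    return min(requested, limit)


if __name__ == "__main__":
    # Example: the configuration asks for 32 workers.
    workers = capped_num_workers(32)
    print(f"usable CPUs: {available_cpus()}, num_workers used: {workers}")
    # The result would then be passed to the loader, for example:
    #   DataLoader(val_dataset, num_workers=workers, timeout=120)
    # where a positive `timeout` makes a stuck worker raise an error
    # instead of hanging silently (val_dataset is a placeholder name).
```

A result of 0 is still valid for `torch.utils.data.DataLoader`: it means the data is loaded in the main process, which is slower but a useful baseline for checking whether the hang comes from the worker processes at all.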