
【PaddlePaddle Hackathon】Task 50: Implement the Gradient Cache strategy for PaddleNLP semantic indexing to enable ultra-large-batch training of semantic indexing models #1080

Closed
TCChenlong opened this issue Sep 23, 2021 · 1 comment


(This issue is a task issue for the PaddlePaddle Hackathon event; for more details, see the PaddlePaddle Hackathon page.)

【Task Description】

  • Task title: Implement the Gradient Cache strategy for PaddleNLP semantic indexing to enable ultra-large-batch training of semantic indexing models

  • Technical tags: Python, semantic indexing

  • Difficulty: Hard

  • Detailed description: The quality of a semantic indexing model depends heavily on batch_size; in general, the larger the batch_size, the better the model. However, limited GPU memory means batch_size often cannot be made very large on ordinary hardware. The Gradient Cache algorithm proposed in this paper (https://arxiv.org/pdf/2101.06983.pdf) can effectively scale up batch_size, enabling large-batch training of semantic indexing models even with small GPU memory (a minimal sketch of the idea follows below).
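
The sketch below illustrates the Gradient Cache idea in PaddlePaddle dynamic-graph code, assuming a generic `encoder` that maps a sub-batch tensor to [n, d] embeddings and an in-batch-negative cross-entropy loss. The helper name `gradient_cache_step`, the `chunk_size` and `temperature` parameters, and the use of plain tensors as inputs are illustrative assumptions, not part of PaddleNLP; a real semantic indexing model would take tokenized `input_ids`/`token_type_ids`.

```python
import paddle
import paddle.nn.functional as F


def gradient_cache_step(encoder, queries, docs, optimizer,
                        chunk_size=32, temperature=0.05):
    """One training step with the Gradient Cache strategy (Gao et al., 2021).

    Assumes the large batch size is divisible by `chunk_size` and that
    `encoder(x)` returns an [n, d] embedding tensor.
    """
    num_chunks = queries.shape[0] // chunk_size
    q_chunks = paddle.split(queries, num_chunks)
    d_chunks = paddle.split(docs, num_chunks)

    # Pass 1: graph-free forward pass over all sub-batches; only the
    # [batch, d] embeddings are kept, not the encoder activations.
    with paddle.no_grad():
        q_reps = paddle.concat([encoder(c) for c in q_chunks])
        d_reps = paddle.concat([encoder(c) for c in d_chunks])

    # Compute the in-batch-negative loss on the full cached batch and
    # cache the gradients with respect to the representations.
    q_reps.stop_gradient = False
    d_reps.stop_gradient = False
    sims = paddle.matmul(q_reps, d_reps, transpose_y=True) / temperature
    labels = paddle.arange(sims.shape[0])
    loss = F.cross_entropy(sims, labels)
    q_grads, d_grads = paddle.grad(loss, [q_reps, d_reps])

    # Pass 2: re-encode each sub-batch with the graph enabled and push the
    # cached representation gradients through the encoder, accumulating
    # parameter gradients chunk by chunk.
    for rep_grads, chunks in ((q_grads, q_chunks), (d_grads, d_chunks)):
        for chunk, grad_chunk in zip(chunks, paddle.split(rep_grads, num_chunks)):
            surrogate = (encoder(chunk) * grad_chunk).sum()
            surrogate.backward()

    optimizer.step()
    optimizer.clear_grad()
    return float(loss)
```

The key point is that only one sub-batch's encoder activations live in memory at a time, while the loss still sees the full batch of representations, so the effective batch_size is limited by the size of the cached embeddings rather than by encoder activation memory.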

【Deliverables】

  • A task PR submitted to PaddleNLP

  • Related technical documentation (model performance verified to meet expectations)

【Technical Requirements】

  • Proficiency in Python

  • Understanding of the principles of deep learning models

  • Familiarity with basic semantic indexing algorithms (optional)

【References】

github-actions bot commented Feb 9, 2023

This issue is stale because it has been open for 60 days with no activity.

github-actions bot added the stale label on Feb 9, 2023
sijunhe closed this as completed on Feb 21, 2023