[Eager] print gpu mem info #42616

Merged: 5 commits, May 10, 2022


@wanghuancoder wanghuancoder commented May 9, 2022

PR types

Others

PR changes

Others

Describe

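Change the CI GPU memory monitoring logic so that it is compiled even in non-TESTING builds, with FLAGS_enable_gpu_memory_usage_log as the on/off switch.

When the program exits, print the model's peak actual GPU memory usage and its peak requested GPU memory, in MB, in the following format:

[Memory Usage (MB)] gpu 0 : Reserved = 2266.72, Allocated = 2237.15

Allocated is the peak GPU memory actually used by the model. Reserved is the peak GPU memory the model requested from the GPU.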
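
To make the Reserved/Allocated distinction concrete, here is a minimal sketch of peak tracking and of the output format quoted above. It is not Paddle's implementation: `PeakStat` and `FormatUsageMB` are hypothetical names introduced for illustration, and the two-decimal fixed formatting and the 1 MB = 1024 * 1024 bytes conversion are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical stand-in for a per-device memory statistic: it tracks the
// current value and the highest value observed so far.
class PeakStat {
 public:
  // Applies a signed change in bytes (positive = acquire, negative = release).
  void Update(int64_t delta_bytes) {
    current_ += delta_bytes;
    assert(current_ >= 0 && "released more bytes than were acquired");
    peak_ = std::max(peak_, current_);
  }
  int64_t current() const { return current_; }
  int64_t peak() const { return peak_; }

 private:
  int64_t current_ = 0;
  int64_t peak_ = 0;
};

// Formats the two peaks in the layout shown in the PR description.
// Assumption: values are printed with two decimals.
std::string FormatUsageMB(int dev_id, const PeakStat& reserved,
                          const PeakStat& allocated) {
  constexpr double kBytesPerMB = 1024.0 * 1024.0;
  std::ostringstream os;
  os << std::fixed << std::setprecision(2)
     << "[Memory Usage (MB)] gpu " << dev_id
     << " : Reserved = " << reserved.peak() / kBytesPerMB
     << ", Allocated = " << allocated.peak() / kBytesPerMB;
  return os.str();
}
```

For example, if `reserved` is updated once by 64 MB, and `allocated` is updated by +48 MB, -32 MB, and +8 MB, then `FormatUsageMB(0, reserved, allocated)` returns `[Memory Usage (MB)] gpu 0 : Reserved = 64.00, Allocated = 48.00`: the peak stays at 48 MB even though current usage ends at 24 MB.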


paddle-bot-old bot commented May 9, 2022

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

@wanghuancoder wanghuancoder changed the title from "[Do Not Merge] print mem" to "[Eager] print gpu mem info" May 10, 2022
if (FLAGS_enable_gpu_memory_usage_log) {
  std::cout << "[Memory Usage (Byte)] gpu " << dev_id_ << " : "
            << MEMORY_STAT_PEAK_VALUE(Reserved, dev_id_) << std::endl;
  std::cout << "[Memory Usage (MB)] gpu " << dev_id_ << " : Reserved = "
Contributor


For cases with small GPU memory usage, reporting in MB may produce a lot of zeros.

Contributor Author

@wanghuancoder wanghuancoder May 10, 2022


I added another switch, FLAGS_enable_gpu_memory_usage_log_mb. MB is printed by default; when FLAGS_enable_gpu_memory_usage_log_mb=false, bytes are printed instead.
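
The interaction of the two switches can be sketched as follows. This is an illustration under stated assumptions, not the merged code: `MemoryLogOptions` and `FormatPeakUsage` are hypothetical names, the struct fields merely mirror the two flags named in this thread, and the byte-mode layout with `Reserved = ..., Allocated = ...` and the two-decimal MB formatting are assumed.

```cpp
#include <cassert>
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical stand-ins for the two flags discussed in the review.
struct MemoryLogOptions {
  bool enable_log = false;  // mirrors FLAGS_enable_gpu_memory_usage_log
  bool log_in_mb = true;    // mirrors FLAGS_enable_gpu_memory_usage_log_mb
};

// Returns the log line for one device, or an empty string when logging is off.
std::string FormatPeakUsage(const MemoryLogOptions& opts, int dev_id,
                            int64_t reserved_bytes, int64_t allocated_bytes) {
  assert(reserved_bytes >= 0 && allocated_bytes >= 0);
  if (!opts.enable_log) return "";

  std::ostringstream os;
  if (opts.log_in_mb) {
    // Assumption: 1 MB = 1024 * 1024 bytes, printed with two decimals.
    constexpr double kBytesPerMB = 1024.0 * 1024.0;
    os << std::fixed << std::setprecision(2)
       << "[Memory Usage (MB)] gpu " << dev_id
       << " : Reserved = " << reserved_bytes / kBytesPerMB
       << ", Allocated = " << allocated_bytes / kBytesPerMB;
  } else {
    // Byte mode keeps exact integer values, so small usages stay visible.
    os << "[Memory Usage (Byte)] gpu " << dev_id
       << " : Reserved = " << reserved_bytes
       << ", Allocated = " << allocated_bytes;
  }
  return os.str();
}
```

With a reserved peak of 4096 bytes and an allocated peak of 1024 bytes, MB mode in this sketch yields `Reserved = 0.00, Allocated = 0.00`, which is the loss of information raised in the review, while byte mode yields `Reserved = 4096, Allocated = 1024`.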

Contributor

@From00 From00 left a comment


LGTM

@wanghuancoder wanghuancoder merged commit 8164414 into PaddlePaddle:develop May 10, 2022
4 participants