Unbound memory allocation on system with six gpus #47
Comments
Hi, it seems that there is a problem with the window initialization code. Your backtrace shows a NULL window pointer. Could you please provide me with the gdb output after: At which terminal size does it break? |
Here's the full screen terminal size which runs into this issue
and a tmux split pane in which it works is of size The gdb output:
Thank you! 🙇 |
Could you please confirm that the patch 8b56210 on the dev branch fixes your problem? |
Wonderful, the dev branch fixes the problem! 🎉 Thank you for this quick fix! 🤗 Shows me two plots at the top and a third one below (out of six gpus). |
You are welcome, |
Hi, sorry for commenting on a closed issue, but I am still hitting the same problem the OP faced, on a system with 8 GPUs. After reducing the size of the terminal, nvtop works correctly and shows 4 plots, each displaying 2 GPUs. I am building nvtop from the master branch. |
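To make the layouts reported in this thread concrete (6 GPUs on 3 plots, 8 GPUs on 4 plots), here is a minimal sketch of the arithmetic involved. It is not nvtop's actual code: `devices_per_plot`, `plots_that_fit`, and all of their parameters are hypothetical names made up for illustration, and the minimum plot size is an assumption.

```c
#include <assert.h>

/* Hypothetical helper, not taken from nvtop: number of devices that share
 * one plot when num_devices are spread over num_plots plots. */
static unsigned devices_per_plot(unsigned num_devices, unsigned num_plots) {
  assert(num_plots > 0); /* a zero plot count would divide by zero */
  return (num_devices + num_plots - 1) / num_plots; /* ceiling division */
}

/* Hypothetical helper: how many plots of at least min_cols x min_rows fit
 * into a cols x rows area, capped at max_plots. The result can be 0. */
static unsigned plots_that_fit(unsigned cols, unsigned rows,
                               unsigned min_cols, unsigned min_rows,
                               unsigned max_plots) {
  assert(min_cols > 0 && min_rows > 0);
  unsigned fit = (cols / min_cols) * (rows / min_rows);
  return fit < max_plots ? fit : max_plots;
}
```

With these assumptions, `devices_per_plot(8, 4)` and `devices_per_plot(6, 3)` both give 2, which matches the layouts described above. `plots_that_fit` can return 0 for a small area, so a caller would have to skip plot drawing in that case instead of using a window that was never created.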
Hello, Can you please provide the location of the error in the same way Daniel did, the size of your terminal, and the output of the debugger for the following commands:
To generate a debug build you have to specify Thanks |
Address sanitizer backtrace:
|
@Syllo The problem is gone, thank you! |
Hey there - thanks for this amazing monitoring tool! 🙇
Here's an issue I'm hitting: when running on a 6-GPU system, nvtop allocates memory until it gets killed by the Linux OOM killer. It looks like there is an overflow somewhere leading to unbounded memory allocation (at a rate of multiple GB per second).
Another data point: this behavior stops when I run in a small terminal (e.g. 80x24) or in a tmux split pane, which suggests it has something to do with the live utilization plots.
When running a debug build and sending a SIGHUP signal during the memory allocation, I get backtraces pointing to draw_plots as the problem, e.g.
Hope that helps, let me know if you need more information.
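For readers wondering how an overflow can turn into runaway allocation: the sketch below is a generic illustration of one common pattern, an unsigned subtraction that wraps around when a size computation goes negative. The actual cause in nvtop is not shown in this report, so the function names and parameters here are hypothetical and nothing below is taken from `draw_plots`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, unchecked size computation: if reserved > available, the
 * unsigned subtraction wraps around and yields a value close to SIZE_MAX. */
static size_t usable_count_unchecked(size_t available, size_t reserved) {
  return available - reserved;
}

/* Hypothetical guarded variant: returns 0 when nothing is usable, so a
 * caller never sizes a buffer or a loop from a wrapped-around value. */
static size_t usable_count(size_t available, size_t reserved) {
  if (available <= reserved)
    return 0;
  return available - reserved;
}
```

Under these assumptions, `usable_count_unchecked(2, 4)` evaluates to `SIZE_MAX - 1`, while `usable_count(2, 4)` evaluates to 0. A buffer or loop sized from the unchecked result would grow until the process is killed.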