Expanded Hardware Utilization Information #800
TravisCI failed because you didn't add Can you make it work with psutil v1.2.1? That's what's available on 14.04:
It's probably a good idea to just not display the information if you can't retrieve it on a certain system for whatever reason.
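A minimal sketch of that "skip what you can't retrieve" idea, not the code from this PR. The helper name `collect_available` and the `readers` mapping are hypothetical; `proc` stands for any process-like object such as a `psutil.Process`.

```python
def collect_available(proc, readers):
    """Return only the metrics that could actually be read.

    `readers` maps a metric name to a callable that takes `proc`.
    Both names are made up for this illustration.
    """
    info = {}
    for name, read in readers.items():
        try:
            info[name] = read(proc)
        except (AttributeError, NotImplementedError, OSError):
            # Metric not available on this platform or library version:
            # leave the key out so the front-end can simply hide it.
            # Real code would probably also catch psutil.Error here.
            continue
    return info
```

With this shape the template only has to test whether a key is present before rendering it.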
Forgot to commit that one.
Yep, that's what I'm already checking for on the front-end. On the back-end, on Windows, I do not even attempt to obtain it (and therefore it will not be shown in the front-end).
This is great. Do you think we could show the CPU utilization for dataset creation too?
Hmm, probably. But is that useful? I haven't found a need for that myself.
On Wednesday, 1 June 2016, Greg Heinrich notifications@github.com wrote:
It's useful if you want to make sure you are utilizing your CPUs efficiently when creating a large dataset. But don't go out of your way to support that if it's not trivial.
I implemented this hardware utilization display because I had a very distinct need for it. I do not have a need for it when creating a dataset at all. Secondly, implementing it is not trivial at all, as I do not see a way to accurately separate the hardware metrics that correspond exactly to the job; whereas currently the hardware utilization is reported only and exactly for that distinct, specific job, which is [to me and other digitizers] really neat and useful. N.B. I would also love to log these kinds of metrics, but that's for another PR. For example, my favorite metric is the GPU temperature, because it gives insight into some kind of running average of the usage, i.e. <70 deg = inefficient settings/model.
This looks good to me, thanks. There are conflicts that must be resolved before merging. Question: the disk write info looks OK; however, the read statistics appear to be underestimated (I have a 4GB dataset and after several epochs the read counter shows only 96kB). Did I misunderstand what it's supposed to show? Or perhaps the process was reading the database from cache and it didn't count?
Hmm, yes indeed, it seems the disk statistics are unreliable. Possibly due to
On Monday, 13 June 2016, Greg Heinrich notifications@github.com wrote:
Okido ready when you are, fixed, squashed & rebased.
That still looks very good to me, let's see what @lukeyeager thinks :-)
@@ -21,3 +20,18 @@
{% endif %}
</dl>
{% endfor %}
{% if data_cpu %}
<h3>CPU ({% if 'pid' in data_cpu %}#{{data_cpu.pid}}{% endif %})</h3>
This is a little misleading. When I saw "CPU (#10291)" I thought it was talking about a CPU core or something. How about "Process 20291" instead?
Oops I caused a merge conflict with #825. While you're rebasing, can you also:
I'm seeing some training jobs fail to finish with this change. Are you seeing the same? The Caffe task goes to 100% complete, but is stuck at |
No haven't seen that. Will try to reproduce. Might be because process variable |
I think I nailed it now @lukeyeager : replaced the shitty try block with a nice
I'm trying this now and am getting this error:
It kills the background socketio thread and now I'm not getting any GPU or CPU information.
$ python -c 'import psutil;print psutil.__file__;print psutil.__version__'
/usr/lib/python2.7/dist-packages/psutil/__init__.pyc
3.4.2
Have you tried this with older versions of
Sorry, my bad. Yeah, too new. I'll just use the old functions.
On Tuesday, 26 July 2016, Luke Yeager notifications@github.com wrote:
I think I fixed it.
The Travis build is failing, but I think it's related to https://www.traviscistatus.com/incidents/2p40l49r3yxd. I'm following up with them ...
Version fallback for psutil, tested for versions 1,3,4 and added some checks. Implemented showing hw info also for cpu-only systems
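A rough sketch of what such a version fallback can look like; this is not the code from the commit. It relies on the rename that came with psutil 2.0, where `Process.get_cpu_percent()` and `Process.get_memory_info()` became `cpu_percent()` and `memory_info()`. The helper names below are hypothetical, and `proc` is assumed to be a `psutil.Process` or anything shaped like one.

```python
def call_first_available(obj, names):
    """Call the first method in `names` that exists on `obj`."""
    for name in names:
        method = getattr(obj, name, None)
        if callable(method):
            return method()
    raise AttributeError("none of %r found on %r" % (names, obj))


def process_cpu_percent(proc):
    # psutil >= 2.0 spells it cpu_percent(); psutil 1.x used get_cpu_percent()
    return call_first_available(proc, ("cpu_percent", "get_cpu_percent"))


def process_memory_info(proc):
    # psutil >= 2.0 spells it memory_info(); psutil 1.x used get_memory_info()
    return call_first_available(proc, ("memory_info", "get_memory_info"))
```

Probing for the method name avoids parsing `psutil.__version__`, so the same code path works on 1.2.1 and on 3.4.2 without a version table.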
Mkay. I just squashed and rebased hoping to trigger Travis again. edit: yep, worked. I do advise to run a real test like you did before just to be sure.
Looks good to me! Thanks for the nice addition and for supporting multiple versions of psutil!
Does Travis also check on Windows? If not, maybe ask @IsaacYangSLA to check this PR's functionality. Some
I just ran a simple training task on Windows 7. The CPU / Memory usage was shown and updated correctly. So that basically confirms it works on Windows.
…il_info Expanded Hardware Utilization Information
Needed this for identifying potential CPU (and disk) bottlenecks (for example, when testing preprocessing for #777).
Issues that this will immediately reveal include, for example, the 1200% CPU usage I had due to some over-optimization in my BLAS library.
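For reading a number like 1200%: psutil documents that a per-process CPU percentage can exceed 100 when the process runs threads on several cores, so 1200% corresponds to roughly twelve fully busy cores. A small sketch of that arithmetic follows; the function names are hypothetical and not part of this PR.

```python
import multiprocessing


def cores_in_use(cpu_percent):
    """Convert a per-process CPU percentage into 'busy cores'."""
    return cpu_percent / 100.0


def normalized_cpu_percent(cpu_percent, n_cpus=None):
    """Scale a per-process percentage to the 0-100 range of the whole machine."""
    if n_cpus is None:
        # Number of logical CPUs visible to Python
        n_cpus = multiprocessing.cpu_count()
    return cpu_percent / float(n_cpus)
```

Showing the raw value keeps oversubscription visible, while the normalized value is easier to compare across machines.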
Works for Caffe and Torch, but the psutil manual tells me disk usage is not supported on Windows so I'm checking for that one.
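The platform check mentioned above could look roughly like this. It is only a sketch: the function name is hypothetical, and the assumption that disk I/O counters are unavailable on Windows is taken from the description here rather than verified against a particular psutil release.

```python
import platform


def disk_io_supported(system=None):
    """Return False on platforms where disk I/O counters are skipped.

    `system` defaults to platform.system(); passing it explicitly makes
    the check easy to exercise for other platforms.
    """
    if system is None:
        system = platform.system()
    # Assumption from the PR description: no disk usage info on Windows
    return system != "Windows"
```

An alternative is to probe for the method on the process object instead of hard-coding the platform, which also covers psutil builds that lack the call for other reasons.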
## Caffe
## Torch