Values of '[Not Supported]' are not handled properly. #2

Closed
tasptz opened this issue Oct 5, 2017 · 7 comments

@tasptz

tasptz commented Oct 5, 2017

Values of '[Not Supported]' are not handled properly.

In [1]: import GPUtil

In [2]: g = GPUtil.getGPUs()
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-2-871afb3451f3> in <module>()
----> 1 g = GPUtil.getGPUs()

~\AppData\Local\Continuum\Anaconda3\envs\tensorflow\lib\site-packages\GPUtil\__init__.py in getGPUs()
     80                 deviceIds[g] = int(vals[i])
     81             elif (i == 1):
---> 82                 gpuUtil[g] = float(vals[i])/100
     83             elif (i == 2):
     84                 memTotal[g] = int(vals[i])

ValueError: could not convert string to float: '[Not Supported]'
@anderskm
Owner

anderskm commented Oct 5, 2017

Thank you for opening the issue.

The issue can be handled by wrapping the cast in a try/except statement and falling back to some fixed value if the typecast fails.
This, however, raises the question of what the most appropriate value would be. Any number between 0 (no load) and 1 (full load) doesn't really make sense, as we know nothing about the load. NaN comes to mind, but that interferes with the sorting functions, as NaN is an unordered "number".
Despite this, I think the best option is to set it to NaN if the typecast fails and then write a custom sorting function. However, I am open to suggestions.
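
A minimal sketch of the trade-off described here, assuming a hypothetical helper `safe_float` (not part of GPUtil): the failed cast is caught and mapped to NaN, and the comparisons show why NaN then interferes with sorting.

```python
import math


def safe_float(text):
    """Hypothetical helper: parse a number, falling back to NaN on failure."""
    try:
        return float(text)
    except ValueError:
        return float("nan")


load = safe_float("[Not Supported]")
print(math.isnan(load))  # True

# NaN is unordered: every comparison with it is False,
# so a plain sort has no consistent position to give it.
print(load < 0.5, load > 0.5, load == load)  # False False False
```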

@anderskm
Owner

anderskm commented Oct 9, 2017

@tasptz I've committed an update to GPUtil, which should resolve the issue.
The code now tries to typecast the load (and memory) to a float; if that fails, it sets the value to nan.
When getting available GPUs (getAvailable, getFirstAvailable or getAvailability), the optional input "includeNan" can be set to True (default: False) to include GPUs with a nan load or memory.
I have tested it by manually inserting nan values. Would you mind testing it using your setup?
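
The effect of includeNan can be sketched roughly as follows. This is an illustration rather than GPUtil's actual code: the `GPU` tuple, the `is_available` function, and the `max_load`/`max_memory` thresholds are assumptions made for the example.

```python
import math
from collections import namedtuple

# Hypothetical stand-in for a GPU record; not GPUtil's own class.
GPU = namedtuple("GPU", ["id", "load", "memory_util"])


def is_available(gpu, max_load=0.5, max_memory=0.5, include_nan=False):
    """Return True if the GPU is under both thresholds.

    A NaN value counts as passing only when include_nan is True.
    """
    def ok(value, limit):
        if math.isnan(value):
            return include_nan
        return value < limit

    return ok(gpu.load, max_load) and ok(gpu.memory_util, max_memory)


gpus = [GPU(0, 0.1, 0.2), GPU(1, float("nan"), 0.2), GPU(2, 0.9, 0.2)]
print([g.id for g in gpus if is_available(g)])                    # [0]
print([g.id for g in gpus if is_available(g, include_nan=True)])  # [0, 1]
```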

@anderskm anderskm added the bug label Oct 9, 2017
@anderskm anderskm self-assigned this Oct 9, 2017
@anderskm
Owner

@tasptz Have you had a chance to test if the updated version works for you?

@dizcza

dizcza commented Oct 31, 2017

@anderskm I had the same problem, and I confirm that current master (commit a492d3b) fixes it.
However, you broke py2-3 compatibility in v1.2.3...master#diff-6d20cf947cfd76895c515f6b1c48b0a0R145
Python 3's list.sort() does NOT accept a comparator (cmp) argument, so the current master code does not work with py3.
Please consider adding unit tests to prevent such regressions.
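
For context, a short sketch of the incompatibility: Python 3's list.sort() accepts only key and reverse, so passing a comparator raises TypeError. One standard-library way to keep using an old comparator is functools.cmp_to_key (available in Python 2.7 and 3.2+). The comparator `nan_last` below is a hypothetical example, not the code from the linked diff.

```python
import math
from functools import cmp_to_key


def nan_last(a, b):
    """Hypothetical py2-style comparator that orders NaN after every number."""
    if math.isnan(a):
        return 0 if math.isnan(b) else 1
    if math.isnan(b):
        return -1
    return (a > b) - (a < b)


loads = [0.7, float("nan"), 0.1]

try:
    loads.sort(cmp=nan_last)  # accepted by Python 2, TypeError in Python 3
except TypeError as err:
    print("py3 rejects cmp:", err)

# Wrapping the comparator as a key function works on both versions.
loads.sort(key=cmp_to_key(nan_last))
print(loads)  # [0.1, 0.7, nan]
```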

@anderskm
Owner

anderskm commented Nov 1, 2017

@dizcza Yay, and doh! :-)

Thank you for confirming that it fixes the initial problem, and for pointing out that it also broke compatibility. I was not aware that the cmp parameter was removed between py2 and py3. I'll see if I can find a good solution within the next few days. I'm open to suggestions on how best to fix it ;-)

I completely agree with the unit testing. It should be done, and it has been on my to-do list, but I have not had time to set it up properly yet. So far it has mainly been done manually, which is far from ideal, as the recent update shows.

@anderskm
Owner

anderskm commented Nov 1, 2017

@dizcza I believe I have found a solution that is compatible with both py2 and py3.
It uses the key option of list.sort(), which works in both py2 and py3.
E.g.:
GPUs.sort(key=lambda x: np.inf if np.isnan(x.id) else x.id, reverse=False)

It also seems like a nicer solution than the custom compare function.

Will test it properly before committing ;-)
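
A self-contained variant of the key-based sort, written with only the standard library so it runs without NumPy. The `GPU` tuple and the `nan_last_key` helper are assumptions for illustration, and the sort here is by load rather than id.

```python
import math
from collections import namedtuple

GPU = namedtuple("GPU", ["id", "load"])  # hypothetical stand-in record


def nan_last_key(value):
    """Map NaN to +inf so unknown values sort after all known ones."""
    return float("inf") if math.isnan(value) else value


gpus = [GPU(0, 0.8), GPU(1, float("nan")), GPU(2, 0.3)]
gpus.sort(key=lambda g: nan_last_key(g.load))
print([g.id for g in gpus])  # [2, 0, 1]
```

With reverse=True the +inf sentinel would put the NaN entries first, so a descending sort would need -inf as the sentinel instead.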

@anderskm
Owner

anderskm commented Nov 1, 2017

Had a chance to test it, and it seems to work in both py2 and py3.
Tested it manually by setting the load of GPUs with odd IDs to "Not supported" and sorting by load.
Got the same, expected results in both py2 and py3 environments (using Anaconda) on the same machine.
I have pushed a new version (f1aa347).
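
The manual check described above could be automated along these lines. This is a sketch, not a test from the GPUtil repository: the `sort_by_load` function and the dictionary records are stand-ins for the real sorting code and GPU objects.

```python
import math


def sort_by_load(gpus):
    """Stand-in for the sort under test: ascending load, NaN last."""
    return sorted(
        gpus,
        key=lambda g: float("inf") if math.isnan(g["load"]) else g["load"],
    )


def test_odd_ids_with_nan_load_sort_last():
    # Mirror the manual test: GPUs with odd ids report an unsupported load.
    gpus = [
        {"id": i, "load": float("nan") if i % 2 else 0.1 * (4 - i)}
        for i in range(4)
    ]
    ordered = [g["id"] for g in sort_by_load(gpus)]
    assert ordered == [2, 0, 1, 3]


test_odd_ids_with_nan_load_sort_last()
```

Running the same file under both a py2 and a py3 interpreter would cover the compatibility regression raised earlier in the thread.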
