Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed '0mb' pbs.size bug, added PTL test #2287

Merged
merged 1 commit into from Mar 11, 2021

Conversation

alexis-cousein
Copy link
Contributor

Describe Bug or Feature

Unfortunately the base types library has a routine to_bytes that takes a size but only supports the "k" multiplier.

That routine is used in transform_sizes which is called in many places, and there it is called when a quantity is smaller than 10kb.

No doubt the author of the code thought "if it's smaller than 10kb then surely it can't use a multiplier larger than 'k'" but alas, that is forgetting about zero bytes, which is a round number for all multipliers, i.e. also 'm', 'g', 't' and 'p'.

The easiest way to get into trouble is to try to assign a pbs.size("0mb") to a node resource in a hook. That will crash with an exception:

03/03/2021 21:53:15.824408;0001;pbs_python;Svr;pbs_python;PBS server internal error (15011) in Error evaluating Python script, <class 'ValueError'>
03/03/2021 21:53:15.824413;0001;pbs_python;Svr;pbs_python;PBS server internal error (15011) in Error evaluating Python script, invalid literal for int() with base 10: '0m'

In another hook I have a more complete backtrace (this is 2020.1.1):
03/03/2021 20:01:12.420402;0080;pbs_python;Hook;pbs_python;['Traceback (most recent call last):', ' File "", line 6122, in main', ' File "", line 1016, in invoke_handler', ' File "", line 1228, in _exechost_startup_handler', ' File "", line 2601, in crea
te_vnodes', ' File "/opt/pbs/lib/python/altair/pbs/v1/base_types.py", line 1602, in _setitem', ' setattr(self, resname, resval)', ' File "/opt/pbs/lib/python/altair
/pbs/v1/base_types.py", line 1661, in __setattr', ' super().setattr(name, value)', ' File "/opt/pbs/lib/python/altair/pbs/v1/_base_types.py", line 188, in _set
', ' elif (value is None) or (value == "")
', ' File "/opt/pbs/lib/python/altair/pbs/v1/base_types.py", line 479, in _eq', ' so = transform_sizes(self, other)',
' File "/opt/pbs/lib/python/altair/pbs/v1/_base_types.py", line 358, in transform_sizes', ' s_num = to_bytes(s) + 1073741824', ' File "/opt/pbs/lib/python/altair/pbs/v
1/_base_types.py", line 339, in to_bytes', ' s_num = int(s_str)', "ValueError: invalid literal for int() with base 10: '0m'"]

There is another dangerous construct in there: (value == "") is being tried even when value is not a string. In this case it calls the eq for the size class, and that calls transform_sizes which in turn calls the incomplete to_bytes.

But if we compare to the empty string, we'd better guard against 'value' not being a string in the first place; calling the pbs.size eq is nonsensical.

Describe Your Change

Define basestring if necessary (in Python 3 it's undefined and shoud just be set to be synonymous to str) and see if a value is of that type before comparing it to "" to avoid pbs.size comparisons with the empty string.

Extend to_bytes function to deal with suffixes 'm', 'g', 't' and 'p' (the valid ones as per the Reference Guide), to get rid of the incorrect assumption 'values under 10k cannot use large suffixes", which is true for strictly positive sizes but not 0.

Link to Design Doc

Pure bug fix.

Attach Test and Valgrind Logs/Output

Added a test in pbs_test_pbs_python.py:

Test_pbs_python.test_pbs_python_zero_size

Without the fix:
2021-03-09 11:37:09,364 INFO created test python script /tmp/test_0mbubpzhyrw.py
2021-03-09 11:37:09,364 INFO created dummy input file /tmp/dummyhrglixtg.in
2021-03-09 11:37:09,364 INFO running ['/opt/pbs/bin/pbs_python', '--hook', '-i', '/tmp/dummyhrglixtg.in', '/tmp/test_0mbubpzhyrw.py']
2021-03-09 11:37:09,364 INFOCLI2 dragon: sudo -H /opt/pbs/bin/pbs_python --hook -i /tmp/dummyhrglixtg.in /tmp/test_0mbubpzhyrw.py
[...]
2021-03-09 11:37:09,926 INFO ======================================================================
2021-03-09 11:37:09,926 INFO FAILED: test_pbs_python_zero_size (tests.functional.pbs_python_test.Test_pbs_python)

2021-03-09 11:37:09,926 INFO m_oo_m
2021-03-09 11:37:09,926 INFO Traceback (most recent call last):
File "/home/alexis/git/github/alexis-cousein/pbspro/test/tests/functional/pbs_python_test.py", line 98, in test_pbs_python_zero_size
self.assertTrue(msg in combined_output, "Test hooklet rejected event")
AssertionError: False is not true : Test hooklet rejected event

The rejection is the result of an assertion which is on stderr of the process that ran pbs_python:
pbs_python: PBS server internal error (15011) in Error evaluating Python script, <class 'ValueError'>
pbs_python: PBS server internal error (15011) in Error evaluating Python script, invalid literal for int() with base 10: '0m'

With the fix:
2021-03-09 11:41:01,705 INFO created test python script /tmp/test_0mbe0vyvl3j.py
2021-03-09 11:41:01,705 INFO created dummy input file /tmp/dummy621_qvww.in
2021-03-09 11:41:01,705 INFO running ['/opt/pbs/bin/pbs_python', '--hook', '-i', '/tmp/dummy621_qvww.in', '/tmp/test_0mbe0vyvl3j.py']
2021-03-09 11:41:01,705 INFOCLI2 dragon: sudo -H /opt/pbs/bin/pbs_python --hook -i /tmp/dummy621_qvww.in /tmp/test_0mbe0vyvl3j.py
2021-03-09 11:41:01,733 INFO Test hooklet accepted event

Copy link
Contributor

@bayucan bayucan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

@alexis-cousein
Copy link
Contributor Author

alexis-cousein commented Mar 11, 2021

Added PTL test also succeeds:

logfile_PASS.txt

Copy link
Contributor

@nishiya nishiya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

@bayucan bayucan merged commit c859522 into openpbs:master Mar 11, 2021
@bayucan
Copy link
Contributor

bayucan commented Mar 11, 2021

Merged #2287 into master.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants