unit test failures on Perlmutter #1963
Is there a specific reason you're using |
@weaverba137 thanks for the suggestion. There is no specific reason other than for consistency with what is done in desitest. Did you determine that using pytest would solve these failures? |
If the tests are passing on GitHub Actions, then one should try to minimize the differences between that environment and perlmutter. Also, desitest is probably not what we want to be consistent with even in the near-term future. |
Curiously, 4 (not 2!) bootcalib tests fail with pytest on perlmutter when running the full test suite (they all boil down to hitting the same "UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 613: ordinal not in range(128)")
But they pass if you run just the bootcalib test on its own:
I don't have a good explanation for why or how that could matter. Although the test failures are probably harmless, we should get them fixed ASAP so that we don't get used to failing unit tests, which eventually leads to not noticing a genuine failure. |
@sbailey @weaverba137 I suspect this failure is related to #291 but you may be able to find a better solution. Do you want to take a try at resolving this? Thanks. |
The short version is that we might not need the elaborate locale manipulations any longer because perlmutter sets |
Sadly getting rid of the locale manipulations didn't fix it (although it would be great to get rid of them if no longer necessary, even cori sets Something about |
Replacing the line
with
results in success of desispec unit tests on Perlmutter. @sbailey please have a look and let me know if you have any idea why calling
and verified that the environments are identical before and after this call, so there are no hidden changes to environment variables in the call to |
see the bootcalib-test branch for the change that "fixes" the unit test failures @sbailey |
I have no idea. @dmargala or @craigwarner-ufastro do you have any ideas for why a call to |
Wow. That is really strange. I was able to reproduce the error and the success when replacing with |
Do any of the scripts tested in Is the same |
It seems like the underlying issue here is that there are non-ascii characters in a file that is parsed using an ascii codec. Is that correct? From a quick and non-thorough look at the related issues referenced above, it seems like a work-around was put in place instead of a fix to the underlying problem. I'm not sure what or why but perhaps something in cupy is undo-ing that work-around?
source /global/common/software/desi/desi_environment.sh main
git clone https://github.com/desihub/desispec.git
cd desispec
pytest -x py/desispec/test # fails
sed -i s/Å/Ang/g py/desispec/data/arc_lines/*ascii # replace offending character
pytest -x py/desispec/test # passes |
Yes, the file contains non-ascii characters. The key point here is that it is the test that is broken, not the actual code. If you simply open Python and load the file using the function in |
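For reference, the 0xc3 byte in the error message is what one would expect from a UTF-8 encoded "Å" when it is decoded as ASCII. The sketch below is only an illustration: it assumes the character is U+00C5 (consistent with the 0xc3 byte, but not confirmed in this thread), and the sample string is made up rather than taken from the arc line files.

```python
# Hypothetical sample text; the real arc line files are not reproduced here.
# Assumption: the offending character is U+00C5 (LATIN CAPITAL LETTER A WITH RING ABOVE).
text = "lambda \u00c5"

# UTF-8 encodes U+00C5 as the two bytes 0xC3 0x85.
raw = text.encode("utf-8")

# Decoding with the UTF-8 codec round-trips the text.
decoded = raw.decode("utf-8")

# Decoding the same bytes with the ASCII codec fails at the 0xC3 byte.
try:
    raw.decode("ascii")
    ascii_error = None
except UnicodeDecodeError as err:
    ascii_error = err

print(raw)
print(ascii_error)
```

The bytes on disk are the same in both cases; only the codec used to decode them differs.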
@weaverba137 I know the order is determined alphabetically, so maybe not feasible, but in the spirit of making the smallest change possible to keep the unit test in and not getting sidetracked, what do you think of somehow changing the order of unit tests so bootcalib comes first? |
@marcelo-alvarez, maybe I'm misunderstanding something here, but it seems like it's a bit too early to give up on fixing the problem. |
That all depends on your definition of fixing the problem, but I'm happy to use yours :) |
OK, let's back up a step and let me ask: in detail, how did you identify that |
I don't think that's accurate. The test is checking whether
Reproducer outside of the framework of the tests:
from desispec import bootcalib
#- Works
bootcalib.parse_nist('CdI')
#- bootcalib fails if cupy is imported and used
import cupy as cp
x = cp.arange(10)
#- Fails
bootcalib.parse_nist('CdI')
For further context, the files that it is trying to parse are downloaded directly from NIST, so it would be nice to be able to use them as-is and not require a manual cleaning step if we ever need to download an updated version. But that might be less painful than writing our own unicode-aware equivalent of |
@sbailey, thank you for the clarification. I'm still very interested to hear exactly how @marcelo-alvarez discovered that |
Thanks for the context @sbailey. I think the solution here may depend on how likely it is that these particular calib files will ever be read after importing and using cupy/cupyx. Given that it hasn't happened in Iron or on the nights when daily processing ran on Perlmutter, maybe it's not likely?
@weaverba137 I traversed all lines previous to the unit test failures in question systematically until I came upon the line that, if removed, would result in success. |
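A minimal sketch of that kind of one-at-a-time elimination, for anyone who wants to repeat it. Everything here is hypothetical: `find_culprit`, the step strings, and the `toy_fails` predicate are stand-ins for "rerun the tests with this line removed", not the procedure actually used.

```python
def find_culprit(steps, fails):
    """Return the first step whose removal makes `fails` return False, else None.

    `steps` is a list of setup steps; `fails(active_steps)` is a caller-supplied
    predicate that reports whether the failure still occurs with those steps.
    """
    if not fails(steps):
        return None  # nothing to find: the full sequence already passes
    for i, step in enumerate(steps):
        remaining = steps[:i] + steps[i + 1:]
        if not fails(remaining):
            return step
    return None  # no single step is responsible on its own


# Toy stand-ins (assumptions): in practice `fails` would rerun the test suite.
steps = ["import numpy", "import cupy; cupy.arange(10)", "read table"]


def toy_fails(active_steps):
    return any("cupy" in s for s in active_steps)


culprit = find_culprit(steps, toy_fails)
print(culprit)
```

This only finds a failure caused by a single step; if two steps must be removed together, it returns None.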
Using
This works (no error):
> python -c 'from astropy.table import Table; Table.read("ArI_air.ascii", format="ascii.fixed_width")'
Adding a bit of
> python -c 'import cupy; cupy.arange(10); from astropy.table import Table; Table.read("ArI_air.ascii", format="ascii.fixed_width")'
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/conda/envs/test-ascii/lib/python3.10/site-packages/astropy/table/connect.py", line 62, in __call__
out = self.registry.read(cls, *args, **kwargs)
File "/tmp/conda/envs/test-ascii/lib/python3.10/site-packages/astropy/io/registry/core.py", line 219, in read
data = reader(*args, **kwargs)
File "/tmp/conda/envs/test-ascii/lib/python3.10/site-packages/astropy/io/ascii/connect.py", line 19, in io_read
return read(filename, **kwargs)
File "/tmp/conda/envs/test-ascii/lib/python3.10/site-packages/astropy/io/ascii/ui.py", line 350, in read
table = fileobj.read()
File "/tmp/conda/envs/test-ascii/lib/python3.10/encodings/ascii.py", line 26, in decode
return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 675: ordinal not in range(128)
Comparing
> python -c 'import locale; print(locale.getpreferredencoding())'
UTF-8
> python -c 'import cupy; cupy.arange(10); import locale; print(locale.getpreferredencoding())'
ANSI_X3.4-1968
It looks like there is a special Python UTF-8 mode that you can use with either:
> python -X utf8 -c 'import cupy; cupy.arange(10); import locale; print(locale.getpreferredencoding())'
UTF-8
> PYTHONUTF8=1 python -c 'import cupy; cupy.arange(10); import locale; print(locale.getpreferredencoding())'
UTF-8
Forcing Python UTF-8 mode avoids the error:
> python -X utf8 -c 'import cupy; cupy.arange(10); from astropy.table import Table; Table.read("ArI_air.ascii", format="ascii.fixed_width")'
This actually seems more like a CUDA issue rather than |
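The one-liners above can be wrapped into a small before/after check. This is a sketch only: `encoding_report` is a hypothetical helper name, and the cupy step is left commented out so the snippet runs without a GPU.

```python
import locale
import sys


def encoding_report():
    """Collect settings that influence how Python decodes text files by default."""
    return {
        # Same call as in the one-liners in this thread.
        "preferred_encoding": locale.getpreferredencoding(),
        # True when Python was started with -X utf8 or PYTHONUTF8=1.
        "utf8_mode": bool(sys.flags.utf8_mode),
        "filesystem_encoding": sys.getfilesystemencoding(),
    }


before = encoding_report()
# The step under suspicion would go here, e.g.:
#   import cupy; cupy.arange(10)
after = encoding_report()

print("before:", before)
print("after: ", after)
```

With the cupy lines enabled on an affected system, one would expect `preferred_encoding` to differ between the two reports, matching the UTF-8 versus ANSI_X3.4-1968 output shown above.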
@dmargala thanks for getting (closer) to the bottom of this. I can't test it myself right now, but maybe a more comprehensive forcing of the encoding in parse_nist could be the solution |
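One way to read "forcing the encoding" is to open the file with an explicit `encoding="utf-8"`, so that the result no longer depends on the process locale. The sketch below uses only the standard library; `read_lines_utf8`, the file name, and the file contents are hypothetical stand-ins, and whether `parse_nist` can be changed this way is not established in this thread.

```python
import os
import tempfile


def read_lines_utf8(path):
    """Read a text file as UTF-8 regardless of the process locale."""
    with open(path, encoding="utf-8") as fh:
        return fh.read().splitlines()


# Stand-in for an arc line file; the contents are made up for illustration.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "sample_lines.ascii")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("| wave (\u00c5) | ion |\n| 5085.8 | CdI |\n")
    lines = read_lines_utf8(path)

print(lines)
```

The decoded lines could then be handed to a table parser. The astropy ASCII reader may also accept an `encoding` argument directly, but that should be checked against the astropy documentation before relying on it.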
I'm planning to run some tests now. I'll work on the |
Two unit tests fail in the main environment (desiconda/20220119-2.0.1) on Perlmutter:
@sbailey the same two errors also occur with a fresh desiconda install, so these failures should not block our updating desiconda, since desispec is working fine on Perlmutter in spite of these two failures. While not a high priority, we should get this fixed at some point.