[Bug]: Issues with Eureka!'s S4 Astraeus Saving #2

Open
taylorbell57 opened this issue Jun 29, 2022 · 9 comments

Comments

@taylorbell57
Contributor

I am currently hitting an issue with Eureka!'s Stage 4 save files: the files are written successfully, but an error then seems to occur when the automated garbage collection tries to close files that appear to have already been closed. I suspect this is a bug in Astraeus, but I am confused about why it only happens with Eureka!'s Stage 4. Any thoughts, @kevin218?

Traceback:

Saving results
Finished writing to /Volumes/DataDrive/WASP-12b_WFC3/G102/visit11/Stage4/S4_2022-06-29_wfc3_run2/ap5_bg5/S4_wfc3_ap5_bg5_SpecData.h5
Finished writing to /Volumes/DataDrive/WASP-12b_WFC3/G102/visit11/Stage4/S4_2022-06-29_wfc3_run2/ap5_bg5/S4_wfc3_ap5_bg5_LCData.h5
Exception ignored in: <function CachingFileManager.__del__ at 0x180af74c0>
Traceback (most recent call last):
  File "/Users/tjbell1/miniconda3/envs/eureka/lib/python3.9/site-packages/xarray/backends/file_manager.py", line 244, in __del__
  File "/Users/tjbell1/miniconda3/envs/eureka/lib/python3.9/site-packages/xarray/backends/file_manager.py", line 222, in close
  File "/Users/tjbell1/miniconda3/envs/eureka/lib/python3.9/site-packages/h5netcdf/core.py", line 1125, in close
  File "/Users/tjbell1/miniconda3/envs/eureka/lib/python3.9/site-packages/h5py/_hl/files.py", line 525, in close
TypeError: bad operand type for unary ~: 'NoneType'
Exception ignored in: <function File.close at 0x180eceb80>
Traceback (most recent call last):
  File "/Users/tjbell1/miniconda3/envs/eureka/lib/python3.9/site-packages/h5netcdf/core.py", line 1125, in close
  File "/Users/tjbell1/miniconda3/envs/eureka/lib/python3.9/site-packages/h5py/_hl/files.py", line 525, in close
TypeError: bad operand type for unary ~: 'NoneType'
@kevin218
Owner

@taylorbell57 Can you confirm that writeXR returns success = True after writing out each file? Next, are you able to read in each file using readXR? Finally, can you compare the Xarray Datasets from before writing and after loading to make sure they are identical? If a parameter is missing, this might help us understand the issue.
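
One way to run the before/after comparison suggested above is sketched below. It assumes `before` is the Dataset handed to writeXR and `after` is the one returned by readXR. The helper name `report_differences` is hypothetical and not part of Astraeus or Eureka!; it relies only on xarray's `identical` method, and the small in-memory dataset is purely illustrative.

```python
import numpy as np
import xarray as xr


def report_differences(before, after):
    """List the ways two xarray Datasets differ (hypothetical helper)."""
    problems = []
    before_names, after_names = set(before.variables), set(after.variables)
    # Variables or coordinates that exist on one side only.
    for name in sorted(before_names - after_names, key=str):
        problems.append(f"{name!r} is missing after the round trip")
    for name in sorted(after_names - before_names, key=str):
        problems.append(f"{name!r} appeared after the round trip")
    # Shared variables whose values, dims, coords, or attrs changed.
    for name in sorted(before_names & after_names, key=str):
        if not before[name].identical(after[name]):
            problems.append(f"{name!r} differs in values, dims, coords, or attrs")
    # Global attributes that were dropped or added.
    for key in sorted(set(before.attrs) ^ set(after.attrs), key=str):
        problems.append(f"global attribute {key!r} is present on one side only")
    # Catch anything not itemised above, e.g. a changed attribute value.
    if not problems and not before.identical(after):
        problems.append("datasets differ in a way not itemised above")
    return problems


# Illustrative data only; in practice `after` would come from reading the file.
original = xr.Dataset(
    {"flux": ("time", np.array([1.0, 2.0, 3.0]))},
    coords={"time": [0, 1, 2]},
    attrs={"inst": "wfc3"},
)
print(report_differences(original, original.copy(deep=True)))
```

An empty list means the round trip preserved everything that `identical` compares; a missing parameter would show up as a named entry.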

@taylorbell57
Contributor Author

I can confirm that it is returning success = True, because it prints the "Finished writing to ..." messages, which only appear if the write was successful. I'll double-check reading and will try to compare the values before and after writing.

@taylorbell57
Contributor Author

Very strange: I just realized that I don't get this issue in one conda environment (called eureka2) but I do in the other (eureka). I tried to compare the installed packages, but there were far too many differences to go through, so I made a new conda environment (eureka3) and installed Eureka! from scratch using the pip install . method. In that new environment, I get a similar but different error message:

Saving results
Finished writing to /Volumes/DataDrive/WASP-12b_WFC3/G102/visit11/Stage4/S4_2022-07-01_wfc3_run7/ap5_bg5/S4_wfc3_ap5_bg5_SpecData.h5
Finished writing to /Volumes/DataDrive/WASP-12b_WFC3/G102/visit11/Stage4/S4_2022-07-01_wfc3_run7/ap5_bg5/S4_wfc3_ap5_bg5_LCData.h5
Exception ignored in: <function CachingFileManager.__del__ at 0x14a684310>
Traceback (most recent call last):
  File "/Users/tjbell1/miniconda3/envs/eureka3/lib/python3.9/site-packages/xarray/backends/file_manager.py", line 244, in __del__
  File "/Users/tjbell1/miniconda3/envs/eureka3/lib/python3.9/site-packages/xarray/backends/file_manager.py", line 222, in close
  File "/Users/tjbell1/miniconda3/envs/eureka3/lib/python3.9/site-packages/h5netcdf/core.py", line 1125, in close
  File "/Users/tjbell1/miniconda3/envs/eureka3/lib/python3.9/site-packages/h5py/_hl/files.py", line 445, in close
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 275, in h5py.h5f.get_obj_ids
  File "h5py/h5i.pyx", line 46, in h5py.h5i.wrap_identifier
ImportError: sys.meta_path is None, Python is likely shutting down
Exception ignored in: <function File.close at 0x14b1c5b80>
Traceback (most recent call last):
  File "/Users/tjbell1/miniconda3/envs/eureka3/lib/python3.9/site-packages/h5netcdf/core.py", line 1125, in close
  File "/Users/tjbell1/miniconda3/envs/eureka3/lib/python3.9/site-packages/h5py/_hl/files.py", line 445, in close
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 275, in h5py.h5f.get_obj_ids
  File "h5py/h5i.pyx", line 46, in h5py.h5i.wrap_identifier
ImportError: sys.meta_path is None, Python is likely shutting down

This new eureka3 environment differed only slightly from the eureka2 environment, which worked without error, and I was able to figure out that the above error message is caused by having dask==2022.6.1 installed rather than dask==2022.6.0. I get the same behaviour whether working with MIRI or WFC3 data.

The environment that gave the error message from my first post here (eureka) had been set up using the environment.yml file a while ago and had dask-core==2022.4.0 installed, but upgrading it didn't resolve that first error message. I tried a fresh environment.yml install (eureka4) and ended up with dask-core==2022.6.1, which gave me the same ImportError: sys.meta_path is None... error message as the eureka3 environment. Downgrading to dask-core==2022.6.0 with conda gives me no error messages. Very confusingly, though, downgrading to dask-core==2022.4.0 also works fine without any error messages. Similarly, in the eureka3 environment, if I pip install dask==2022.4.0 I get no issues. To be clear, none of these upgrades/downgrades change the installed versions of other packages, just the dask version.
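
For anyone repeating this kind of comparison, a rough sketch of diffing two `pip freeze`-style listings follows. The function names are hypothetical, the parser only understands plain `name==version` lines, and the two short listings are made up for illustration using version numbers mentioned in this thread; they are not the actual environments.

```python
def parse_freeze(text):
    """Map package name to version for `name==version` lines."""
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not a simple pin.
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        packages[name.lower()] = version
    return packages


def diff_environments(listing_a, listing_b):
    """Return {name: (version_in_a, version_in_b)} for packages that differ."""
    a, b = parse_freeze(listing_a), parse_freeze(listing_b)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


# Illustrative listings only, not the real environments.
working = "dask==2022.6.0\nh5netcdf==1.0.1\n"
failing = "dask==2022.6.1\nh5netcdf==1.0.1\ndill==0.3.5.1\n"
for name, versions in diff_environments(working, failing).items():
    print(name, versions)
```

A package that is absent from one listing is reported with `None` on that side, which makes an extra install easy to spot.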

So in summary, my understanding is that at a minimum we need dask<2022.6.1 as a requirement for pip installs and dask-core<2022.6.1 for conda installs.
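
Until such a pin is in place, a quick check of the installed version against that bound might look like the sketch below. The helper names are hypothetical, and the parser is deliberately simple: it assumes purely numeric, dot-separated release numbers such as 2022.6.1 and ignores any suffix.

```python
from importlib import metadata


def release_tuple(version):
    """Turn '2022.6.1' into (2022, 6, 1), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def satisfies_upper_bound(installed, bound="2022.6.1"):
    """True if `installed` is strictly older than `bound`."""
    return release_tuple(installed) < release_tuple(bound)


def check_installed(package="dask", bound="2022.6.1"):
    """Return True/False for an installed package, or None if it is absent."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    return satisfies_upper_bound(installed, bound)


print(check_installed())
```

Comparing integer tuples rather than strings matters here: as strings, "2022.10.0" would sort before "2022.6.1".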

I'll try looking at the outputs from each of these different installs and see if they actually differ or if it's just an issue with marking the files as closed which is causing the error message.

@taylorbell57
Contributor Author

taylorbell57 commented Jul 1, 2022

Alright, it is confirmed that the outputs from the eureka environment which gives the TypeError: bad operand type for unary ~... message are exactly the same as those from the eureka3 environment with dask==2022.6.0 which gives no errors. And changing eureka3 to use dask==2022.6.1 gives the ImportError: sys.meta_path is None... error message but gives the exact same outputs as the other runs with/without errors.

So in summary, there is no actual issue with saving the files; the problem is only that they are not being marked as closed, so the automated garbage collection tries to close already-closed files when the stage ends.
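
The general pattern at play can be sketched with a stand-in object: close every handle explicitly while the interpreter is still fully alive, and make `close()` safe to call twice so a later finalizer has nothing left to do. `FakeHandle` and `save_all` are hypothetical and do not reproduce the h5py error; whether this pattern silences the messages in Eureka! has not been tested here.

```python
import contextlib


class FakeHandle:
    """Hypothetical stand-in for a file-like object with close()."""

    def __init__(self, name):
        self.name = name
        self.closed = False
        self.close_calls = 0

    def close(self):
        self.close_calls += 1
        if self.closed:
            return  # already closed, so a second call is a no-op
        self.closed = True

    def __del__(self):
        # Finalizer: only close if nobody has done so already.
        if not self.closed:
            self.close()


def save_all(names):
    """Open one handle per name and close them all before returning."""
    handles = []
    with contextlib.ExitStack() as stack:
        for name in names:
            handle = FakeHandle(name)
            stack.callback(handle.close)  # runs when the with-block exits
            handles.append(handle)
        # ... the actual writing would happen here ...
    return handles


handles = save_all(["SpecData", "LCData"])
print([(h.name, h.closed) for h in handles])
```

Because each handle is already closed when `save_all` returns, the finalizer skips its own `close()` call when the objects are eventually collected.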

@taylorbell57
Contributor Author

And for some reason we only encounter this in Eureka!'s Stage 4 and none of the other stages...

@taylorbell57
Contributor Author

Aha, the TypeError: bad operand type for unary ~... error comes up when I have the pymc3 and starry related dependencies installed and use the pymc3 branch of Eureka! (but the error message becomes the sys.meta_path error if I switch to the wfc3 branch of Eureka!). Comparing the differences, I found that pip installing dill==0.3.5.1 (which changes no other packages) in my functional environment reintroduces the error message, and uninstalling it in my environment with the pymc3 and starry dependencies removes the error message. With dill<=0.3.4 installed, the error message becomes the TypeError: ... message again; I tried every other version of dill and couldn't get it to run without that TypeError: ... message. Do you have any ideas how installing dill could cause things to break like this? Or should I just reach out to the folks in charge of xarray and/or dill?

@taylorbell57
Contributor Author

Actually, it seems that others already know about this issue:
pydata/xarray#4267
h5netcdf/h5netcdf#50

@kevin218
Owner

kevin218 commented Jul 2, 2022

Are we using h5netcdf version 0.12.0 or later? This seems to have fixed the issue for others.

I have no idea why dill would be causing this issue.

@taylorbell57
Contributor Author

Yeah, I have v1.0.1 of h5netcdf installed
