
Restructure the python embedding logic to run the user-specified instance of python to write/read a temporary pickle file. #1205

Closed
JohnHalleyGotway opened this issue Sep 30, 2019 · 4 comments
Labels
MET: Python Embedding priority: blocker Blocker requestor: Community General Community type: bug Fix something that is not working

JohnHalleyGotway commented Sep 30, 2019

Python embedding in MET does not work well on Cheyenne with the h5py or pygrib modules.

I'm not sure whether we can actually fix this in met-8.1.2. We may need to change the milestone to met-9.0 instead.

This issue has been found for two modules, h5py and pygrib. You can reproduce it by running the commands below. The problem may be a conflict between the HDF5 and GRIB2 libraries that MET was compiled against and the versions used by these Python packages.

module use /glade/p/ral/jntp/MET/MET_releases/modulefiles
module load met/8.1_python
ncar_pylib

This runs fine with pygrib:

cd /glade/p/ral/jntp/MET/MET_Help/mandelbaum_data_20190930/pygrib_problem
python ./read_GFSv3.py ./gfs.t00z.pgrb2.1p00.f048.reduced.grib2

This core dumps:

plot_data_plane PYTHON_NUMPY gfs.ps 'name="./read_GFSv3.py ./gfs.t00z.pgrb2.1p00.f048.reduced.grib2";'

This runs fine with h5py:

cd /glade/p/ral/jntp/MET/MET_Help/mandelbaum_data_20190930/h5py_problem
python read_IMERG_V06_HDF5.py 3B-HHR.MS.MRG.3IMERG.20180102-S200000-E202959.1200.V06B.HDF5 HQprecipitation

This core dumps:

plot_data_plane PYTHON_NUMPY imerg.ps 'name="read_IMERG_V06_HDF5.py 3B-HHR.MS.MRG.3IMERG.20180102-S200000-E202959.1200.V06B.HDF5 HQprecipitation";'

georgemccabe commented:

Regarding the pygrib error:

It appears that calling .data() or .values on a GRIB record via pygrib causes a segmentation fault under plot_data_plane, but not when calling pygrib directly from python on Cheyenne. I found reports from other users hitting this problem when running from python (copied below). There may be a bug in an external library (possibly Jasper?) used by pygrib, since the call works fine via python but not within MET.

from jswhit/pygrib#86:
"We encountered this same problem, albeit, this was not an issue with pygrib itself, but with ECCODES and/or a library on our system. The issue can be recreated with any hrdps file from Environment Canada on an intel-python3 docker image, with pygrib and eccodes installed via conda.

We discovered that while trying to read from the grib file, the stack filled up for the process and resulted in a segfault. Increasing the stack size ulimit solved our problem. We do NOT experience this issue with any other grib data, even data from the gem or rgem, it seems unique. Hopefully this helps!"
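The stack-size workaround quoted above can also be applied programmatically before importing pygrib. This is a minimal sketch using the standard resource module; it is the in-process equivalent of the shell's `ulimit -s`, and the approach (raising the soft limit to the hard limit) is an assumption about what would help here, not something verified against this specific crash:

```python
import resource

# Query the current stack-size limits (soft, hard), in bytes.
soft, hard = resource.getrlimit(resource.RLIMIT_STACK)

# Raise the soft limit to the hard limit. An unprivileged process is
# always allowed to raise its soft limit up to the hard limit, so this
# cannot fail the way `ulimit -s unlimited` sometimes does.
resource.setrlimit(resource.RLIMIT_STACK, (hard, hard))

new_soft, new_hard = resource.getrlimit(resource.RLIMIT_STACK)
print(new_soft, new_hard)  # soft limit now equals the hard limit
```

Note that the stack limit only affects threads created after the change, so this must run before the offending library spins up any worker threads.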

from jswhit/pygrib#74:
"The problem may be caused by Jasper. I have encountered the same (also when reading Meteo-France files) and resolved it by building from source: jasper 2.0.14 (installed in /usr/local/jasper_new and built as static) and eccodes-2.7.3 (don't forget to ask for the python interface and to give the path to the jasper lib and includes). This message is old, but I hope this helps."

georgemccabe commented:

h5py causes a seg fault when you import it on Cheyenne. This is due to a mismatch between the version of HDF5 used to compile MET and the version used to build the h5py python module.

I was able to rebuild h5py on my machine using the same version of HDF5 that I used to install MET, and the script then ran through plot_data_plane. You have to build h5py from source by running:

pip uninstall h5py
pip install --no-binary=h5py h5py

I had to do a little trickery to get this to work using the correct version of HDF5. Here are some of the things I did:

  • To find hdf5.h and hdf5_hl.h:
    export CPATH=/home/mccabe/met/external_libs/hdf5/hdf5-1.8.18/src:/home/mccabe/met/external_libs/hdf5/hdf5-1.8.18/hl/src
  • To find -lhdf5 and -lhdf5_hl: I created symlinks for libhdf5.so and libhdf5_hl.so in a directory that gcc already searches, i.e. /home/mccabe/miniconda3/envs/py2.7/lib (there may be a better way to pass -L to the build, but this is how I got it to work).

Also, I had to modify the read_IMERG_V06_HDF5.py script to convert numpy float32 values to python float values: I changed min(lat) to min(lat).item() and did the same for lon.
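For reference, the numpy-to-python conversion described above reduces to a small sketch; the lat values here are made up for illustration and are not taken from the IMERG file:

```python
import numpy as np

# Illustrative stand-in for latitude values read from the HDF5 file.
lat = np.array([-89.95, 0.05, 89.95], dtype=np.float32)

# min() over a float32 array yields a numpy scalar, not a python float...
np_min = min(lat)
print(type(np_min))   # <class 'numpy.float32'>

# ...while .item() converts it to a native python float, which is what
# code expecting plain python numbers can consume safely.
py_min = min(lat).item()
print(type(py_min))   # <class 'float'>
```

numpy scalar types such as float32 are not subclasses of python's float, which is why passing them where a native float is expected can fail.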

@JohnHalleyGotway JohnHalleyGotway changed the title Python embedding in MET does not work well on Cheyenne with the h5py or pygrib modules. Restructure python embedding logic to run the user-specified instance of python to write/read a temporary pickle file. Jan 3, 2020
@JohnHalleyGotway JohnHalleyGotway changed the title Restructure python embedding logic to run the user-specified instance of python to write/read a temporary pickle file. Restructure the python embedding logic to run the user-specified instance of python to write/read a temporary pickle file. Jan 3, 2020

JohnHalleyGotway commented Feb 4, 2020

This logic has been merged into the develop branch. As of 2/4/2020, python embedding for point and gridded data works both with and without the pickle logic. Also, LD_LIBRARY_PATH and PYTHONPATH do NOT need to be set.

On 2/4/2020, added the following refinements:
(1) Update ascii2nc python embedding to write pickle file to the MET temp directory instead of the current working directory.
(2) Update point and gridded python embedding to DELETE the temporary pickle files.
(3) Rename the temporary pickle files as "tmp_met_pickle..." and "tmp_ascii2nc_pickle..." for gridded and point data, respectively, where "..." is the process id suffix that the make_temp_file_name() function adds.
(4) Made a couple of minor changes to remove stale code and cout statements from the C++ code.
(5) Add consistent Debug(3) log messages to MET about running python scripts and reading pickle files.
(6) Update python scripts to print the name of the script being run and any temp files being written.
(7) Updated read_ascii_point.py python script to format the columns of data correctly.
(8) Added 3 unit tests to unit_python.xml to call plot_data_plane via pickle and call ascii2nc with and without pickle logic.
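The write/read round trip behind refinements (1) through (3) can be sketched as follows. This is a simplified illustration, not MET's actual implementation: the real code builds the name with the make_temp_file_name() function and runs the user-specified python to do the writing, and the data dictionary below is invented for the example:

```python
import os
import pickle
import tempfile

# Name the temporary pickle file "tmp_met_pickle" plus a process-id
# suffix, mirroring the naming convention described above, and place it
# in the temp directory rather than the current working directory.
tmp_dir = tempfile.gettempdir()
tmp_file = os.path.join(tmp_dir, "tmp_met_pickle_{}".format(os.getpid()))

# Step 1: the user's python instance writes the data as a pickle file.
data = {"name": "HQprecipitation", "values": [0.0, 1.5, 2.5]}
with open(tmp_file, "wb") as f:
    pickle.dump(data, f)

# Step 2: the compile-time python embedded in MET reads it back...
with open(tmp_file, "rb") as f:
    restored = pickle.load(f)

# Step 3: ...and deletes the temporary file afterward.
os.remove(tmp_file)

print(restored == data)  # the round trip preserves the data
```

Because the two python interpreters only exchange a pickle file on disk, the user's python can link against whatever HDF5 or GRIB2 libraries it likes without clashing with the libraries MET was compiled against.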

Still would like to reimplement split_path() as ConcatString::dirname() and ConcatString::basename() to get rid of PATH_MAX in several spots.

I also realized that the obs_gc() variable in the ascii2nc output contains bad data values. Need to do more debugging!


JohnHalleyGotway commented Feb 5, 2020

On 2/5/2020, updated ascii2nc point python embedding logic to...

  • Always parse the obs_var column as a string.
  • Call is_number() to determine whether that string is actually numeric.
  • By default, set the use_var_id flag to false, but if we encounter a non-numeric obs_var name, switch it to true. These changes enable the ascii2nc python embedding to write both obs_var and GRIB code output files, just like ascii2nc does when reading MET point files directly.
  • Also reimplemented the split_path() utility function as ConcatString::dirname() and ConcatString::basename() to avoid Fortify warnings about character arrays.
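The obs_var handling in the bullets above can be sketched in python. Both function names here are illustrative: is_number() is a hypothetical stand-in for MET's utility of the same name, and needs_var_id() is invented for this example:

```python
def is_number(s):
    # Hypothetical stand-in for MET's is_number() utility: returns
    # True if the string parses as a numeric value.
    try:
        float(s)
        return True
    except ValueError:
        return False

def needs_var_id(obs_var_column):
    # use_var_id defaults to False; a single non-numeric obs_var name
    # (e.g. "TMP" rather than GRIB code "11") switches it to True.
    use_var_id = False
    for obs_var in obs_var_column:
        if not is_number(obs_var):
            use_var_id = True
            break
    return use_var_id

print(needs_var_id(["11", "17"]))    # all GRIB codes  -> False
print(needs_var_id(["11", "TMP"]))   # non-numeric name -> True
```

Parsing the column as a string first, then testing for numeric content, lets a single code path support both GRIB-code and variable-name input.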
