Bugfix: Fix the fill value setting used in the write_tmp_dataplane internal Python embedding script #2525

Closed
22 tasks
DanielAdriaansen opened this issue Apr 27, 2023 · 3 comments · Fixed by #2529 or #2557
Assignees
Labels
MET: Python Embedding priority: high High Priority requestor: METplus Team METplus Development Team required: FOR OFFICIAL RELEASE Required to be completed in the official release for the assigned milestone type: bug Fix something that is not working
Milestone

Comments

@DanielAdriaansen
Contributor

DanielAdriaansen commented Apr 27, 2023

Describe the Problem

In #2509, an error was corrected inside write_tmp_dataplane.py.

The code changed from this:

# determine fill value
try:
    fill = met_data.get_fill_value()
except:
    fill = -9999.

To this:
# determine fill value
try:
    fill = met_in.met_data.get_fill_value()
except:
    fill = -9999.

Previously, the try block was failing and so -9999. was used as the fill_value when writing the temporary netCDF file (used only for Python embedding with MET_PYTHON_EXE set). This is actually desirable, because -9999. is the fill_value recognized by MET tools.

In #2509, the try block was corrected so that met_in.met_data was properly referenced. Downstream in METplus wrappers, a use case failed with this new code. The reason is that the fill_value in the temporary netCDF file was now set to the fill_value of the met_in.met_data object, which is controlled by the user. In the METplus use case, the user is using a NumPy masked array, and for a data type of float64 the default fill_value is 1e+20 (https://numpy.org/doc/stable/reference/generated/numpy.ma.default_fill_value.html). However, MET tools do not recognize this as a missing data value, so for this use case 1e+20 was treated as valid data.
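
To make the two code paths concrete, here is a small NumPy-only sketch (not taken from write_tmp_dataplane.py itself). It shows the default fill value a float64 masked array reports, and that a plain ndarray has no get_fill_value() method, so a try/except like the one above ends up with -9999. for it. The variable names are just for illustration.

```python
import numpy as np

# Default fill value NumPy assigns to float64 masked arrays.
default_fill = np.ma.default_fill_value(np.dtype("float64"))

# A masked array with one masked (missing) element and no explicit fill_value.
masked = np.ma.masked_array([1.0, 2.0, 3.0], mask=[False, True, False])
masked_fill = masked.get_fill_value()

# A plain ndarray has no get_fill_value(), so the call raises AttributeError
# and the fallback value is used instead.
plain = np.array([1.0, np.nan, 3.0])
try:
    plain_fill = plain.get_fill_value()
except AttributeError:
    plain_fill = -9999.0

print(default_fill, masked_fill, plain_fill)
```

So after #2509 the fill value depends on the type of object the user hands over, which is what the failing use case ran into.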

Some questions:

  1. Why would we want to use the masked array fill_value if it is not respected inside MET tools? Should fill always be set to -9999.?
  2. When creating a netcdf4 variable using createVariable, how are missing data in the met_data object handled? For example, if the user has a NumPy N-D array that contains nan, but fill=-9999. is set in createVariable, does netcdf4-python automatically know to substitute -9999. for nan everywhere in the user's data? I did a quick test of this case for PYTHON_NUMPY and PYTHON_XARRAY and it appears that is the case. But if the user has a special value for fill_value, like "-99", then this does not work. I think that our temporary Python embedding code assumes that the user has nan where there is missing data. We should state this if that is the case. Ironically, I tried substituting "-9999." for "nan" in my data, and netcdf4 did not recognize this as missing data, even though we set fill=-9999. in write_tmp_dataplane.py. Therefore, it really is critical to communicate to the user they must be using nan.
  3. It appears the same is true using the compile time Python instance. The user's N-D array must have nan where there is missing data. MET respects this using the compile time instance, and also using MET_PYTHON_EXE Python. If any other value is substituted, it will not be respected as missing data in MET.
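
If the requirement really is "use nan for missing data", a user whose data carries some other sentinel (the -99 example above) would need to convert it before handing met_data to MET. A minimal sketch of that conversion, assuming a plain NumPy array; sentinel_to_nan is a hypothetical helper name, not something that exists in MET:

```python
import numpy as np

def sentinel_to_nan(data, sentinel):
    """Return a float copy of data with every sentinel entry replaced by NaN."""
    out = np.array(data, dtype=float)  # copy, so the caller's array is untouched
    out[out == sentinel] = np.nan
    return out

# Example: -99.0 marks missing data in the user's raw field.
raw = np.array([[1.5, -99.0], [-99.0, 4.0]])
met_data = sentinel_to_nan(raw, -99.0)
print(met_data)
```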

For the "bug" in this issue, to get the use case to pass downstream, I think we can just remove the entire try/except block and always force fill=-9999. From what I read about netcdf4.createVariable, it seems that if a user passes a masked array (as met_data), and it has a fill/mask value of 1e+20, then it will be respected automatically by netcdf4. Otherwise, the user must pass a NumPy N-D array with nan as the FillValue or an Xarray DataArray object with nan as the FillValue. I tested this for the use case that was failing, and simply forcing fill=-9999. worked. However, we will need to test other cases that do not use masked arrays to make sure this doesn't break them, but I don't think it will.

Expected Behavior

Provide a clear and concise description of what you expected to happen here.

Environment

Describe your runtime environment:
1. Machine: (e.g. HPC name, Linux Workstation, Mac Laptop)
2. OS: (e.g. RedHat Linux, MacOS)
3. Software version number(s)

To Reproduce

Describe the steps to reproduce the behavior:
1. Go to '...'
2. Click on '....'
3. Scroll down to '....'
4. See error
Post relevant sample data following these instructions:
https://dtcenter.org/community-code/model-evaluation-tools-met/met-help-desk#ftp

Relevant Deadlines

List relevant project deadlines here or state NONE.

Funding Source

Define the source of funding and account keys here or state NONE.

Define the Metadata

Assignee

  • Select engineer(s) or no engineer required
  • Select scientist(s) or no scientist required

Labels

  • Select component(s)
  • Select priority
  • Select requestor(s)

Projects and Milestone

  • Select Organization level Project for support of the current coordinated release
  • Select Repository level Project for development toward the next official release or add alert: NEED PROJECT ASSIGNMENT label
  • Select Milestone as the next bugfix version

Define Related Issue(s)

Consider the impact to the other METplus components.

Bugfix Checklist

See the METplus Workflow for details.

  • Complete the issue definition above, including the Time Estimate and Funding Source.
  • Fork this repository or create a branch of main_<Version>.
    Branch name: bugfix_<Issue Number>_main_<Version>_<Description>
  • Fix the bug and test your changes.
  • Add/update log messages for easier debugging.
  • Add/update unit tests.
  • Add/update documentation.
  • Push local changes to GitHub.
  • Submit a pull request to merge into main_<Version>.
    Pull request: bugfix <Issue Number> main_<Version> <Description>
  • Define the pull request metadata, as permissions allow.
    Select: Reviewer(s) and Development issue
    Select: Organization level software support Project for the current coordinated release
    Select: Milestone as the next bugfix version
  • Iterate until the reviewer(s) accept and merge your changes.
  • Delete your fork or branch.
  • Complete the steps above to fix the bug on the develop branch.
    Branch name: bugfix_<Issue Number>_develop_<Description>
    Pull request: bugfix <Issue Number> develop <Description>
    Select: Reviewer(s) and Development issue
    Select: Repository level development cycle Project for the next official release
    Select: Milestone as the next official version
  • Close this issue.
@DanielAdriaansen DanielAdriaansen added type: bug Fix something that is not working alert: NEED MORE DEFINITION Not yet actionable, additional definition required alert: NEED ACCOUNT KEY Need to assign an account key to this issue alert: NEED CYCLE ASSIGNMENT Need to assign to a release development cycle labels Apr 27, 2023
@DanielAdriaansen DanielAdriaansen added this to the MET 11.1.0 milestone Apr 27, 2023
@hsoh-u
Collaborator

hsoh-u commented Apr 27, 2023

A general solution in python/met/dataplane.py, for when MET_PYTHON_EXE is defined, would be:

# determine fill value 
fill = -9999. 
try: 
   custom_fill = met_in.met_data.get_fill_value() 
except: 
   custom_fill = None

...

dp[:] = met_in.met_data
if custom_fill is not None:
   # convert missing values to -9999.
   dp[ dp == custom_fill ] = fill
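
For reference, a self-contained NumPy-only sketch of the same idea, without the netCDF variable. The names MET_FILL and to_met_missing are made up for illustration. One addition beyond the snippet above, which is my assumption rather than part of the proposal: masked entries are filled explicitly with np.ma.filled, because the data stored under a mask does not necessarily equal the array's fill value.

```python
import numpy as np

MET_FILL = -9999.0  # missing-data value recognized by MET tools

def to_met_missing(met_data):
    """Return a plain float array with missing values set to MET_FILL."""
    try:
        custom_fill = met_data.get_fill_value()
    except AttributeError:
        custom_fill = None  # plain ndarray: no user-defined fill value

    # Masked entries become MET_FILL; a plain ndarray passes through unchanged.
    dp = np.array(np.ma.filled(met_data, MET_FILL), dtype=float)
    if custom_fill is not None:
        # Convert any remaining values equal to the custom fill value.
        dp[dp == custom_fill] = MET_FILL
    return dp

masked = np.ma.masked_array([1.0, 2.0, 1e20], mask=[False, True, False])
print(to_met_missing(masked))
print(to_met_missing(np.array([1.0, 2.0])))
```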

@JohnHalleyGotway JohnHalleyGotway added requestor: METplus Team METplus Development Team MET: Python Embedding priority: high High Priority and removed alert: NEED MORE DEFINITION Not yet actionable, additional definition required alert: NEED CYCLE ASSIGNMENT Need to assign to a release development cycle labels Apr 27, 2023
hsoh-u pushed a commit that referenced this issue Apr 27, 2023
hsoh-u pushed a commit that referenced this issue Apr 27, 2023
@hsoh-u hsoh-u linked a pull request Apr 27, 2023 that will close this issue
15 tasks
hsoh-u added a commit that referenced this issue Apr 28, 2023
@JohnHalleyGotway JohnHalleyGotway removed the alert: NEED ACCOUNT KEY Need to assign an account key to this issue label Apr 28, 2023
@JohnHalleyGotway
Collaborator

Closing since PR #2529 was merged into develop.

@hsoh-u hsoh-u reopened this May 11, 2023
@JohnHalleyGotway JohnHalleyGotway added the required: FOR OFFICIAL RELEASE Required to be completed in the official release for the assigned milestone label May 18, 2023
@hsoh-u
Collaborator

hsoh-u commented May 25, 2023

pp_dataplane.py will be introduced, and the user's Python script will be executed by pp_dataplane.py.

Arguments for plot_data_plane:

PYTHON_NUMPY <output_plot_name>
'name="/d1/personal/hsoh/git/features/feature_2525_fill_value_at_dataplane/MET/share/met/python/examples/read_ascii_numpy.py letter.txt LETTER";'

MET calls the following command internally:

python3 /d1/personal/hsoh/git/features/feature_2525_fill_value_at_dataplane/MET/share/met/python/examples/read_ascii_numpy.py letter.txt LETTER";' -plot_range 0.0 255.0 -title "Python enabled numpy plot_data_plane" -v 8

This will be changed to

python3 /d1/personal/hsoh/git/features/feature_2525_fill_value_at_dataplane/MET/share/met/python/met/pp_dataplane.py /d1/personal/hsoh/git/features/feature_2525_fill_value_at_dataplane/MET/share/met/python/examples/read_ascii_numpy.py letter.txt LETTER";' -plot_range 0.0 255.0 -title "Python enabled numpy plot_data_plane" -v 8

pp_dataplane.py checks the values in the data plane and converts NaN, Inf, and 1e+20 (the default fill value for a NumPy masked array of floats) to the MET missing value, -9999. If user_fill_value is set, values equal to it are also changed to -9999.
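
A rough sketch of how such a wrapper could work, to illustrate the idea only; it is not the actual pp_dataplane.py. The function names run_user_script and normalize are hypothetical, and I am assuming that user_fill_value is an optional variable defined by the user's script alongside met_data. The user's script is run with runpy, then the resulting data plane is cleaned up.

```python
import os
import runpy
import sys
import tempfile

import numpy as np

MET_FILL = -9999.0        # missing-data value recognized by MET tools
NUMPY_MASKED_FILL = 1e20  # NumPy's default fill value for float masked arrays

def normalize(met_data, user_fill_value=None):
    """Return a float array with NaN, Inf, 1e20 and user_fill_value set to MET_FILL."""
    dp = np.array(np.ma.filled(met_data, MET_FILL), dtype=float)
    bad = ~np.isfinite(dp) | (dp == NUMPY_MASKED_FILL)
    if user_fill_value is not None:
        bad |= (dp == user_fill_value)
    dp[bad] = MET_FILL
    return dp

def run_user_script(script_path, script_args):
    """Run the user's script as __main__ and return its cleaned met_data."""
    saved_argv = sys.argv
    sys.argv = [script_path] + list(script_args)  # argv as the user script expects it
    try:
        user_globals = runpy.run_path(script_path, run_name="__main__")
    finally:
        sys.argv = saved_argv
    # Assumption: the script defines met_data and, optionally, user_fill_value.
    return normalize(user_globals["met_data"], user_globals.get("user_fill_value"))

# Demo with a throwaway user script that uses -99.0 as its own missing value.
demo = (
    "import numpy as np\n"
    "met_data = np.array([1.0, np.nan, np.inf, 1e20, -99.0])\n"
    "user_fill_value = -99.0\n"
)
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "user_script.py")
    with open(path, "w") as f:
        f.write(demo)
    result = run_user_script(path, [])
print(result)
```

Doing the cleanup in one wrapper keeps the user's script unchanged, whatever convention it uses for missing data.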

hsoh-u pushed a commit that referenced this issue Jun 6, 2023
hsoh-u pushed a commit that referenced this issue Jun 6, 2023
hsoh-u pushed a commit that referenced this issue Jun 6, 2023
hsoh-u pushed a commit that referenced this issue Jun 6, 2023
@hsoh-u hsoh-u linked a pull request Jun 6, 2023 that will close this issue
15 tasks
hsoh-u pushed a commit that referenced this issue Jun 12, 2023
@JohnHalleyGotway JohnHalleyGotway changed the title Bugfix: write_tmp_dataplane uses fill_value unrecognized by MET Bugfix: Fix the fill value setting used in the write_tmp_dataplane internal Python embedding script Jun 13, 2023
hsoh-u pushed a commit that referenced this issue Jun 13, 2023
hsoh-u added a commit that referenced this issue Jun 13, 2023