Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: Regridding time-series variable's climatology from hybrid to pressure levels breaks due to misaligned axes #721

Open
tomvothecoder opened this issue Aug 24, 2023 · 2 comments
Labels
bug Bug fix (will increment patch version) cdat-migration-fy24 CDAT Migration FY24 Task

Comments

@tomvothecoder
Copy link
Collaborator

tomvothecoder commented Aug 24, 2023

In PR #677, I am doing a complete run of ex1.py to make sure all of the metrics and plots are successfully generated. The last remaining issue is an xgcm error that is raised with the variables "U", "V", "OMEGA", and "Z3".

Traceback (most recent call last):
   File "<string>", line 1, in <module>
   File "/global/homes/v/vo13/mambaforge/envs/e3sm_diags_dev_658/lib/python3.10/site-packages/xcdat/regridder/accessor.py", line 389, in vertical
     output_ds = regridder.vertical(data_var, self._ds)
   File "/global/homes/v/vo13/mambaforge/envs/e3sm_diags_dev_658/lib/python3.10/site-packages/xcdat/regridder/xgcm.py", line 204, in vertical
     output_da = grid.transform(
   File "/global/homes/v/vo13/mambaforge/envs/e3sm_diags_dev_658/lib/python3.10/site-packages/xgcm/grid.py", line 2692, in transform
     return ax.transform(da, target, **kwargs)
   File "/global/homes/v/vo13/mambaforge/envs/e3sm_diags_dev_658/lib/python3.10/site-packages/xgcm/grid.py", line 1077, in transform
     target, target_dim, target_data = _parse_target(
   File "/global/homes/v/vo13/mambaforge/envs/e3sm_diags_dev_658/lib/python3.10/site-packages/xgcm/grid.py", line 1072, in _parse_target
     _check_other_dims(target_data)
   File "/global/homes/v/vo13/mambaforge/envs/e3sm_diags_dev_658/lib/python3.10/site-packages/xgcm/grid.py", line 1038, in _check_other_dims
     raise ValueError(
 ValueError: Found additional dimensions [{'time'}]in `target_data` not found in `da`. This could mean that the target array is not on the same position along other axes. If the additional dimensions are associated witha staggered axis, use grid.interp() to move values to other grid position. If additional dimensions are not related to the grid (e.g. climate model ensemble members or similar), use xr.broadcast() before using transform.

Overview

I switched back to the main branch to see how the original code was handling this situation and noticed it breaks there too.

The lat_lon_driver.py tries to regrid the variables "U", "V", "OMEGA", and "Z3" using pressure levels reconstructed from hybrid. The issue is that these variables have already been regridded to 19 pressure levels (source) and no longer have a time axis. Since the variable and pressure level variable (has time axis) do not have matching dimensions, regridding does not work on the variable.

This error is raised: cdms2.error.CDMSError: axis length 129 does not match corresponding dimension 36.

Related code:

def hybrid_to_plevs(var, hyam, hybm, ps, plev):
"""Convert from hybrid pressure coordinate to desired pressure level(s)."""
p0 = 1000.0 # mb
ps = ps / 100.0 # convert unit from 'Pa' to mb
levels_orig = cdutil.vertical.reconstructPressureFromHybrid(ps, hyam, hybm, p0)
levels_orig.units = "mb"
# Make sure z is positive down
if var.getLevel()[0] > var.getLevel()[-1]:
var = var(lev=slice(-1, None, -1))
levels_orig = levels_orig(lev=slice(-1, None, -1))
var_p = cdutil.vertical.logLinearInterpolation(
var(squeeze=1), levels_orig(squeeze=1), plev
)
return var_p

Possible Solutions

From this comment:

We need the axes for the climatology variable and the hybrid variables to align. This means we should calculate the climatologies for the hybrid variables beforehand.

All variables fed into convert_to_pressure_levels(), hybrid_to_plevs() and pressure_to_plevs() should not have a time axis.

Steps to Reproduce (NERSC machine)

  1. Checkout the main branch
  2. Create and activate the dev environment
     mamba env create -f conda/dev.yml
     mamba activate e3sm_diags_dev
  3. Update lat_lon_model_vs_model.cfg to only run one of related variables (e.g., "U")
    [#]
    sets = ["lat_lon"]
    case_id = "model_vs_model"
    variables = ["U"]
    seasons = ["ANN", "DJF", "MAM", "JJA", "SON"]
    plevs = [850.0]
    test_colormap = "PiYG_r"
    reference_colormap = "PiYG_r"
    contour_levels = [-20, -15, -10, -8, -5, -3, -1, 1, 3, 5, 8, 10, 15, 20]
    diff_levels = [-15, -10, -8, -6, -4, -2, -1, 1, 2, 4, 6, 8, 10, 15]
    regrid_method = "bilinear"
  4. Install e3sm_diags into the dev environment
      python -m pip install .
  5. Update result paths and run ex1.py (attached below for convenience)
    import os
    
    from e3sm_diags.parameter.core_parameter import CoreParameter
    from e3sm_diags.run import runner
    
    param = CoreParameter()
    # Location of the data.
    param.test_data_path = "/global/cfs/cdirs/e3sm/e3sm_diags/test_model_data_for_acme_diags/time-series/E3SM_v1"
    param.reference_data_path = "/global/cfs/cdirs/e3sm/e3sm_diags/test_model_data_for_acme_diags/time-series/E3SM_v1"
    
    # Set this parameter to True.
    # By default, e3sm_diags expects the test data to be climo data.
    param.test_timeseries_input = True
    # Years to slice the test data, base this off the years in the filenames.
    param.test_start_yr = "2011"
    param.test_end_yr = "2013"
    
    # Set this parameter to True.
    # By default, e3sm_diags expects the ref data to be climo data.
    param.ref_timeseries_input = True
    # Years to slice the ref data, base this off the years in the filenames.
    param.ref_start_yr = "1850"
    param.ref_end_yr = "1852"
    
    # When running with time-series data, you don't need to specify the name of the data.
    # But you should, otherwise nothing is displayed when the test/ref name is needed.
    param.short_test_name = "historical_H1"
    param.short_ref_name = "historical_H1"
    
    # This parameter modifies the software to accommodate model vs model runs.
    # The default setting for run_type is 'model_vs_obs'.
    param.run_type = "model_vs_model"
    # Name of the folder where the results are stored.
    # Change `prefix` to use your directory.
    prefix = "/global/cfs/cdirs/e3sm/www/vo13/examples"
    param.results_dir = os.path.join(prefix, "ex1_modTS_vs_modTS_3years")
    
    # Below are more optional arguments.
    
    # What plotsets to run the diags on.rm -rf
    # If not defined, then all available sets are used.
    param.sets = ["lat_lon"]
    # What seasons to run the diags on.
    # If not defined, diags are run on ['ANN', 'DJF', 'MAM', 'JJA', 'SON'].
    param.seasons = ["ANN"]
    # Title of the difference plots.
    param.diff_title = "Model (2011-2013) - Model (1850-1852)"
    # For running with multiprocessing.
    param.multiprocessing = False
    # param.num_workers = 32
    
    runner.sets_to_run = ["lat_lon"]
    runner.run_diags([param])

Traceback:

2023-08-24 10:36:27,981 [INFO]: e3sm_diags_driver.py(_save_env_yml:59) >> Saved environment yml file to: [/global/cfs/cdirs/e3sm/www/vo13/examples/ex1_modTS_vs_modTS_3years/prov/environment.yml](https://vscode-remote+ssh-002dremote-002bperlmutter.vscode-resource.vscode-cdn.net/global/cfs/cdirs/e3sm/www/vo13/examples/ex1_modTS_vs_modTS_3years/prov/environment.yml)
2023-08-24 10:36:27,983 [INFO]: e3sm_diags_driver.py(_save_parameter_files:70) >> Saved command used to: [/global/cfs/cdirs/e3sm/www/vo13/examples/ex1_modTS_vs_modTS_3years/prov/cmd_used.txt](https://vscode-remote+ssh-002dremote-002bperlmutter.vscode-resource.vscode-cdn.net/global/cfs/cdirs/e3sm/www/vo13/examples/ex1_modTS_vs_modTS_3years/prov/cmd_used.txt)
2023-08-24 10:36:27,985 [INFO]: e3sm_diags_driver.py(_save_python_script:134) >> Saved Python script to: [/global/cfs/cdirs/e3sm/www/vo13/examples/ex1_modTS_vs_modTS_3years/prov/ipykernel_launcher.py](https://vscode-remote+ssh-002dremote-002bperlmutter.vscode-resource.vscode-cdn.net/global/cfs/cdirs/e3sm/www/vo13/examples/ex1_modTS_vs_modTS_3years/prov/ipykernel_launcher.py)
2023-08-24 10:36:28,350 [INFO]: lat_lon_driver.py(run_diag:146) >> Variable: U
2023-08-24 10:36:52,883 [INFO]: lat_lon_driver.py(run_diag:167) >> Selected pressure level: [850.0]
2023-08-24 10:37:14,668 [ERROR]: e3sm_diags_driver.py(run_diag:296) >> Error in e3sm_diags.driver.lat_lon_driver
Traceback (most recent call last):
  File "/global/u2/v/vo13/E3SM-Project/e3sm_diags/e3sm_diags/e3sm_diags_driver.py", line 293, in run_diag
    single_result = module.run_diag(parameter)
  File "/global/u2/v/vo13/E3SM-Project/e3sm_diags/e3sm_diags/driver/lat_lon_driver.py", line 169, in run_diag
    mv1_p = utils.general.convert_to_pressure_levels(
  File "/global/u2/v/vo13/E3SM-Project/e3sm_diags/e3sm_diags/driver/utils/general.py", line 117, in convert_to_pressure_levels
    mv_p = hybrid_to_plevs(mv, hyam, hybm, ps, plevs)
  File "/global/u2/v/vo13/E3SM-Project/e3sm_diags/e3sm_diags/driver/utils/general.py", line 142, in hybrid_to_plevs
    var_p = cdutil.vertical.logLinearInterpolation(
  File "/global/homes/v/vo13/.conda/envs/e3sm_diags_dev/lib/python3.10/site-packages/cdutil-8.2.1-py3.11.egg/cdutil/vertical.py", line 278, in logLinearInterpolation
    t.setAxisList(ax)
  File "/global/homes/v/vo13/.conda/envs/e3sm_diags_dev/lib/python3.10/site-packages/cdms2/tvariable.py", line 527, in setAxisList
    self.setAxis(i, axislist[i])
  File "/global/homes/v/vo13/.conda/envs/e3sm_diags_dev/lib/python3.10/site-packages/cdms2/tvariable.py", line 517, in setAxis
    raise CDMSError(
cdms2.error.CDMSError: axis length 129 does not match corresponding dimension 36
@tomvothecoder tomvothecoder changed the title [Bug]: Reconstructing hybrid to pressure for "U", "V", "OMEGA", and "Z3" variables breaks [Bug]: Regridding variables "U", "V", "OMEGA", and "Z3" with pressure levels breaks (mismatching dims). Aug 24, 2023
@tomvothecoder tomvothecoder changed the title [Bug]: Regridding variables "U", "V", "OMEGA", and "Z3" with pressure levels breaks (mismatching dims). [Bug]: Regridding variables "U", "V", "OMEGA", and "Z3" with pressure levels breaks (mismatching dims) Aug 24, 2023
@tomvothecoder
Copy link
Collaborator Author

tomvothecoder commented Aug 24, 2023

I think what is happening is that the variable's climatology is being calculated with "ANN", which collapses the time dimension.

In hybrid_to_plevs(), there is no call to genutil.grower() to align the axes:

  • def hybrid_to_plevs(var, hyam, hybm, ps, plev):
    """Convert from hybrid pressure coordinate to desired pressure level(s)."""
    p0 = 1000.0 # mb
    ps = ps / 100.0 # convert unit from 'Pa' to mb
    levels_orig = cdutil.vertical.reconstructPressureFromHybrid(ps, hyam, hybm, p0)
    levels_orig.units = "mb"
    # Make sure z is positive down
    if var.getLevel()[0] > var.getLevel()[-1]:
    var = var(lev=slice(-1, None, -1))
    levels_orig = levels_orig(lev=slice(-1, None, -1))
    var_p = cdutil.vertical.logLinearInterpolation(
    var(squeeze=1), levels_orig(squeeze=1), plev
    )
    return var_p

In line 158 of pressure_to_plevs(), there is a call to genutil.grower() to align the axes:

def pressure_to_plevs(var, plev):
"""Convert from pressure coordinate to desired pressure level(s)."""
# Construct pressure level for interpolation
var_plv = var.getLevel()
if var_plv.units == "Pa":
var_plv[:] = var_plv[:] / 100.0 # convert Pa to mb
levels_orig = MV2.array(var_plv[:])
levels_orig.setAxis(0, var_plv)
# grow 1d levels_orig to mv dimention
var, levels_orig = genutil.grower(var, levels_orig)
# levels_orig.info()
# logLinearInterpolation only takes positive down plevel:
# "I : interpolation field (usually Pressure or depth)
# from TOP (level 0) to BOTTOM (last level), i.e P value
# going up with each level"
if var.getLevel()[0] > var.getLevel()[-1]:
var = var(lev=slice(-1, None, -1))
levels_orig = levels_orig(lev=slice(-1, None, -1))
var_p = cdutil.vertical.logLinearInterpolation(
var(squeeze=1), levels_orig(squeeze=1), plev
)
return var_p

The fix is to update hybrid_to_plevs() with a call to genutil.grower() in the main branch and xr.broadcast() (or something similar) in #677. I tested this and the hybrid_to_plevs() function now works.

UPDATE: Root cause and solution below in this comment.


Call stack:

  1. Get the variables: "ANN" climatology -- collapses time dimension
  1. Convert either hybrid or pressure to pressure levels
  1. Convert hybrid to pressure levels:
  • if mv_plv.long_name.lower().find("hybrid") != -1:
    extra_vars = ["hyam", "hybm", "PS"]
    hyam, hybm, ps = dataset.get_extra_variables_only(
    var, season, extra_vars=extra_vars
    )
    mv_p = hybrid_to_plevs(mv, hyam, hybm, ps, plevs)
  1. Raises cdms2.error.CDMSError: axis length 129 does not match corresponding dimension 36

@tomvothecoder tomvothecoder changed the title [Bug]: Regridding variables "U", "V", "OMEGA", and "Z3" with pressure levels breaks (mismatching dims) [Bug]: Regridding variables with hybrid to pressure level breaks due to misaligned axes Aug 24, 2023
@tomvothecoder tomvothecoder changed the title [Bug]: Regridding variables with hybrid to pressure level breaks due to misaligned axes [Bug]: Regridding climatology variables with hybrid to pressure level breaks due to misaligned axes Aug 24, 2023
@tomvothecoder tomvothecoder changed the title [Bug]: Regridding climatology variables with hybrid to pressure level breaks due to misaligned axes [Bug]: Regridding climatology variables from hybrid to pressure levels breaks due to misaligned axes Aug 24, 2023
@tomvothecoder tomvothecoder changed the title [Bug]: Regridding climatology variables from hybrid to pressure levels breaks due to misaligned axes [Bug]: Regridding time-series variable's climatology from hybrid to pressure levels breaks due to misaligned axes Aug 24, 2023
@tomvothecoder
Copy link
Collaborator Author

tomvothecoder commented Aug 24, 2023

Additional Findings

Root Cause

Using time series files like in ex1.py results in this issue because:

  1. Calculating the climatology variable modifies the time axis (collapse or reduce number of points based on type of climo)
  2. The pressure variable uses the hybrid variables, which still use the original time axis
  3. The climatology variable and the pressure variable have misaligned axes, resulting in the cdms2 regridding error.

Possible Solutions

We need the axes for the climatology variable and the hybrid variables to align. This means we should calculate the climatologies for the hybrid variables beforehand.

All variables fed into convert_to_pressure_levels(), hybrid_to_plevs() and pressure_to_plevs() should not have a time axis.

@tomvothecoder tomvothecoder added the bug Bug fix (will increment patch version) label Oct 4, 2023
@tomvothecoder tomvothecoder added this to the FY24 (Multi-Quarter) milestone Oct 4, 2023
@tomvothecoder tomvothecoder added the cdat-migration-fy24 CDAT Migration FY24 Task label Oct 10, 2023
@tomvothecoder tomvothecoder modified the milestones: FY24 (Multi-Quarter), FY24 Q2 (01/01/24 - 03/31/24) Jan 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Bug fix (will increment patch version) cdat-migration-fy24 CDAT Migration FY24 Task
Projects
None yet
Development

No branches or pull requests

1 participant