Modify grid information to include xgcm compatible metadata to identify grid positions #8

Open
jbusecke opened this issue Jun 28, 2022 · 2 comments

Comments

@jbusecke
Collaborator

There is currently a very brittle attempt in cmip6_preprocessing to store information about the grid position/staggering in a yaml file. This was created by me in an ad hoc way that seems to have produced many wrong configurations (e.g. jbusecke/xMIP#240).

Since this information really is an intrinsic part of the native grid, I would ultimately like to migrate as much as possible to this repo and remove it from cmip6_pp.

After we have settled on a few datasets here and figured out how to implement renaming, I suggest we try to add xgcm compatible metadata to the resulting 'grid' datasets, so that one can simply do

from xgcm import Grid
grid_ds = some_magic_to_get_the_grid(source_id)
grid = Grid(grid_ds)

And this would automatically recognize axis positions and dimension names.
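As a rough illustration of what "xgcm compatible metadata" could look like: xgcm's existing autoparsing reads COMODO-style attributes, where each dimension coordinate carries an `axis` attribute and staggered positions add a `c_grid_axis_shift`. The sketch below builds those attributes. The helper names `comodo_attrs` and `annotate_dims` and the dimension names in the docstring are made up for illustration and are not part of this repo or of xgcm.

```python
# Shift values that COMODO-style parsing associates with each position.
# A centered dimension carries no shift attribute at all.
_SHIFTS = {"center": None, "left": -0.5, "right": 0.5}


def comodo_attrs(axis, position="center"):
    """Return COMODO-style attributes for one dimension coordinate."""
    if position not in _SHIFTS:
        raise ValueError(f"unsupported position: {position!r}")
    attrs = {"axis": axis}
    shift = _SHIFTS[position]
    if shift is not None:
        attrs["c_grid_axis_shift"] = shift
    return attrs


def annotate_dims(ds, positions):
    """Write the attributes onto an xarray-like dataset in place.

    positions maps dimension name -> (axis, position),
    e.g. {"x": ("X", "center"), "x_r": ("X", "right")}.
    Assumption: every listed dimension has a coordinate variable,
    otherwise the attributes have nowhere to be stored.
    """
    for dim, (axis, position) in positions.items():
        ds[dim].attrs.update(comodo_attrs(axis, position))
    return ds
```

With attributes like these on the grid dataset, `Grid(grid_ds)` should be able to pick up the axes without any extra arguments, which is the behavior described above.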

I will have to think about how to actually implement this. The big question here is whether we should rename/modify the files before they are uploaded to the cloud, or provide logic to do that afterwards (similar to cmip6_pp). I currently prefer the latter, but we will have to see if this is feasible with all grid files.
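To make the "logic afterwards" option a bit more concrete, here is a minimal sketch of a per-model lookup that is applied after loading. Everything in it is an assumption for illustration: the `RENAME_RULES` table, the `EXAMPLE-MODEL` source_id, and the `rename_map_for` helper do not exist anywhere yet.

```python
# Hypothetical per-model rules; the source_id and variable names are placeholders.
RENAME_RULES = {
    "EXAMPLE-MODEL": {"parea": "area_t", "uarea": "area_u", "varea": "area_v"},
}


def rename_map_for(source_id, present_names):
    """Return only the renames that apply to names present in a dataset."""
    rules = RENAME_RULES.get(source_id, {})
    return {old: new for old, new in rules.items() if old in present_names}
```

The result could then be passed to `ds.rename(...)`, with `ds.variables` as `present_names`, so that the uploaded files stay untouched and a wrong rule can be fixed without re-uploading anything.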

cc @emaroon

@jbusecke
Collaborator Author

jbusecke commented Jun 28, 2022

Since the COMODO conventions that xgcm currently relies on are probably outdated at this point, we should try to encode the information using the SGRID spec and implement the capability to parse that in xgcm at the same time (xgcm/xgcm#109).
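For reference, SGRID stores the staggering in attributes of a grid topology variable, using entries of the form `face_dim: node_dim (padding: kind)`. A minimal parsing sketch, assuming the attribute text follows that form; the function name `parse_padded_dims` is hypothetical, and how each padding kind should map onto an xgcm axis position is left open here (that is the subject of xgcm/xgcm#109).

```python
import re

# One "dim: node_dim (padding: kind)" entry of an SGRID dimension attribute.
_ENTRY = re.compile(r"(\w+)\s*:\s*(\w+)\s*\(\s*padding\s*:\s*(\w+)\s*\)")


def parse_padded_dims(text):
    """Split an SGRID dimension attribute into (dim, node_dim, padding) tuples."""
    return _ENTRY.findall(text)
```

For example, `parse_padded_dims("xi_rho: xi_psi (padding: both) eta_rho: eta_psi (padding: both)")` yields one tuple per horizontal direction.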

@jdldeauna

Hey @jbusecke,

I worked on a very preliminary version of a method that takes static ocean grid data and applies it to a CMIP6 model grid to make it compatible with xgcm. Just to double-check, would it be part of xmip or xgcm? I think either way would work.

Here is a more detailed notebook, but the basic function is as follows:

def preprocess_static_grid_model(ds):
    """
    Rename variables in a static ocean grid dataset to match xmip conventions
    and collect them as coordinates and metrics for xgcm.

    ds : xarray Dataset
        Static ocean grid downloaded from Pangeo Forge

    Returns the renamed grid variables (coords) and a dict mapping axes
    to the names of the matching metric variables (metrics).
    """

    ds = ds.rename_vars({'parea':'area_t', 'uarea':'area_u', 'varea':'area_v',
                          'pdx':'dx_t', 'udx':'dx_u', 'vdx':'dx_v',
                          'pdy':'dy_t', 'udy':'dy_u', 'vdy':'dy_v',
                          'pdepth':'dz_t', 'udepth':'dz_u', 'vdepth':'dz_v'
                          })

    # u variables are moved to the shifted x dimension (x_r),
    # v variables to the shifted y dimension (y_r)

    # area variables
    area_t = ds['area_t']
    area_u = ds['area_u'].rename({'x':'x_r'})
    area_v = ds['area_v'].rename({'y':'y_r'})

    # cell widths in x
    dx_t = ds['dx_t']
    dx_u = ds['dx_u'].rename({'x':'x_r'})
    dx_v = ds['dx_v'].rename({'y':'y_r'})

    # cell widths in y
    dy_t = ds['dy_t']
    dy_u = ds['dy_u'].rename({'x':'x_r'})
    dy_v = ds['dy_v'].rename({'y':'y_r'})

    # depths (fixed: dz_v was previously read from 'dy_v')
    dz_t = ds['dz_t']
    dz_u = ds['dz_u'].rename({'x':'x_r'})
    dz_v = ds['dz_v'].rename({'y':'y_r'})

    coords = {'area_t': area_t, 'area_u': area_u, 'area_v': area_v,
              'dx_t': dx_t, 'dx_u': dx_u, 'dx_v': dx_v,
              'dy_t': dy_t, 'dy_u': dy_u, 'dy_v': dy_v,
              'dz_t': dz_t, 'dz_u': dz_u, 'dz_v': dz_v,
             }
    # keys are tuples of axis names; single axes need the trailing comma
    metrics = {('X','Y'):['area_t','area_u','area_v'],
               ('X',):['dx_t','dx_u','dx_v'],
               ('Y',):['dy_t','dy_u','dy_v'],
               ('Z',):['dz_t','dz_u','dz_v']
              }

    return coords, metrics

There are two issues that I found:

  1. The dimensions of the MPI static ocean grid datasets do not match those of the corresponding CMIP6 model output.
  2. The depth of grid cells in the static ocean grid is 2-dimensional instead of 3D, so it does not specify the height of each vertical layer. This might be especially important for native grid datasets where vertical levels might change per time step.
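The first issue can be detected up front with a small check before any grid variables are attached. This is only a sketch: `mismatched_dims` is a hypothetical helper, and it assumes the grid and model datasets use the same dimension names.

```python
def mismatched_dims(grid_sizes, model_sizes):
    """Return {dim: (grid_len, model_len)} for shared dimensions that differ."""
    return {
        dim: (grid_sizes[dim], model_sizes[dim])
        for dim in grid_sizes
        if dim in model_sizes and grid_sizes[dim] != model_sizes[dim]
    }
```

It can be called as `mismatched_dims(grid_ds.sizes, model_ds.sizes)`; an empty result means the shared dimensions line up, while anything else points at the dimensions that would need trimming or padding first.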
