Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

extract_shape - crop on the ID-selected region and not on the whole shapefile #1072

Closed
lukasbrunner opened this issue Apr 19, 2021 · 11 comments · Fixed by #1151
Closed

extract_shape - crop on the ID-selected region and not on the whole shapefile #1072

lukasbrunner opened this issue Apr 19, 2021 · 11 comments · Fixed by #1151
Assignees
Labels
bug Something isn't working EUCP www.eucp-project.eu

Comments

@lukasbrunner
Copy link
Contributor

lukasbrunner commented Apr 19, 2021

Describe the bug
PR ESMValGroup/ESMValTool#2120 adds shapefiles for the scientific SREX and AR6 regions to be used by extract_shape in the preprocessor.

I understand the documentation

crop: by default extract_region will be used to crop the data to a minimal rectangular region containing the shape. Set to false to only mask data outside the shape. Data on irregular grids will not be cropped.
(https://docs.esmvaltool.org/projects/ESMValCore/en/latest/recipe/preprocessor.html?highlight=extract_shape#extract-shape)

To say that longitudes and latitudes with no value inside of the given shape will be dropped. However, this does not seem to be the case.

recipe_climwip_test_basic.yml.txt
shapefiles.zip

Example output file:

ncdump -h recipe_climwip_test_basic_20210419_074811/preproc/calculate_weights_climwip/tas_CLIM/CMIP5_ACCESS1-0_Amon_historical-rcp85_r1i1p1_tas_1995-2014.nc
netcdf CMIP5_ACCESS1-0_Amon_historical-rcp85_r1i1p1_tas_1995-2014 {
dimensions:
	lat = 72 ;
	lon = 144 ;
	bnds = 2 ;

Can anyone recreate this?

@lukasbrunner lukasbrunner added bug Something isn't working EUCP www.eucp-project.eu labels Apr 19, 2021
@lukasbrunner lukasbrunner self-assigned this Apr 19, 2021
@lukasbrunner
Copy link
Contributor Author

I am a bit out of my depth here but looking at the code

if start_longitude < -180.:
start_longitude = -180.
else:
if start_longitude < 0:
start_longitude = 0
end_longitude += lon_step
if not cmor_coords:
if end_longitude > 180.:
end_longitude = 180.
else:
if end_longitude > 360:
end_longitude = 360.

I could imagine that this is connected to #799?

@valeriupredoi
Copy link
Contributor

@lukasbrunner how is it that's not working? I see a restricted number of lats/lons in your ncdump output (at least they seem so), if you could post a snapshot of ncview run on your output file, that'd settle it 👍

@valeriupredoi
Copy link
Contributor

I am a bit out of my depth here but looking at the code

if start_longitude < -180.:
start_longitude = -180.
else:
if start_longitude < 0:
start_longitude = 0
end_longitude += lon_step
if not cmor_coords:
if end_longitude > 180.:
end_longitude = 180.
else:
if end_longitude > 360:
end_longitude = 360.

I could imagine that this is connected to #799?

not quite, that's in there to deal with geometry bounds that are not cmor standard (eg are on a -180 to +180 grid)

@lukasbrunner
Copy link
Contributor Author

No there's still all of them there (also note that the longitude is neither strictly in (0, 360) nor in (-180, 180) due to the iris issue)

ncdump.txt

@lukasbrunner
Copy link
Contributor Author

Maybe just to clarify: everything is still working, values are masked correctly they are just not dropped.
The for me its only the mapplot that made troubles but I fixed that in the last commit
https://github.com/ESMValGroup/ESMValTool/blob/e1d9c67e7fd1869514fcfe002e4ac26f04da6695/esmvaltool/diag_scripts/weighting/weighted_temperature_map.py#L76

So no worries if this only happens for me, I just though I mention it :)

@valeriupredoi
Copy link
Contributor

valeriupredoi commented Apr 20, 2021

if you know the corners of your region, you can check to see the corners if the region that will be extracted with this script:

import iris

def _crop_cube(cube,
               start_longitude,
               start_latitude,
               end_longitude,
               end_latitude,
               cmor_coords=True):
    """Crop cubes on a cartesian grid."""
    lon_coord = cube.coord(axis='X')
    lat_coord = cube.coord(axis='Y')
    if lon_coord.ndim == 1 and lat_coord.ndim == 1:
        # add a padding of one cell around the cropped cube
        lon_bound = lon_coord.core_bounds()[0]
        lon_step = lon_bound[1] - lon_bound[0]
        start_longitude -= lon_step
        if not cmor_coords:
            if start_longitude < -180.:
                start_longitude = -180.
        else:
            if start_longitude < 0:
                start_longitude = 0
        end_longitude += lon_step
        if not cmor_coords:
            if end_longitude > 180.:
                end_longitude = 180.
        else:
            if end_longitude > 360:
                end_longitude = 360.
        lat_bound = lat_coord.core_bounds()[0]
        lat_step = lat_bound[1] - lat_bound[0]
        start_latitude -= lat_step
        if start_latitude < -90:
            start_latitude = -90.
        end_latitude += lat_step
        if end_latitude > 90.:
            end_latitude = 90.

    return (start_latitude, end_latitude, start_longitude, end_longitude)

cube = iris.load_cube("/badc/cmip5/data/cmip5/output1/CSIRO-BOM/ACCESS1-0/rcp85/mon/atmos/Amon/r1i1p1/latest/tas/tas_Amon_ACCESS1-0_rcp85_r1i1p1_200601-210012.nc")
sla, ela, slo, elo = _crop_cube(cube, 0.93, 359, -90, 90)
print(sla, ela, slo, elo)

here replace 0.93, 359, -90, 90 with the corners of your shp region - I suspect that the region spans the whole globe(ish) that's why you don't actually crop to any restricted area 👍

@lukasbrunner
Copy link
Contributor Author

Ah okay so I think I start to understand: the shapefile contains all AR6 regions, that will always be global. What I do (and this is probably where I misunderstood things) is to use the ids parameter to select a few regions from the shapefile by name.

So the _crop_cube only looks at the bounds of the entire shapefile (to my understanding) and this is global, the ids are only used for masking later

selections = _get_masks_from_geometries(geometries,
lon,
lat,
method=method,
decomposed=decomposed,
ids=ids)

So this is probably not a bug then and I just misinterpreted the documentation?

@valeriupredoi
Copy link
Contributor

yes, sorry mate, I should have been more clear - you are right, crop runs by choosing all the geometries' bounds:

        if crop:
            cube = _crop_cube(cube,
                              *geometries.bounds,
                              cmor_coords=cmor_coords)
                              

@lukasbrunner
Copy link
Contributor Author

Okay but then its really fine! Everything works for me with the small addition in the mapplot script. So if this is intended and not a bug all is good :)

@valeriupredoi
Copy link
Contributor

it's a good point - not an error but it could be a missed feature - @jvegasbsc implemented the regions (I think) - maybe we can draw his attention here and say it might be worth cropping on the region of interest rather than on the whole shp file?

@valeriupredoi valeriupredoi changed the title extract_shape - crop: true not working? extract_shape - crop on the ID-selected region and not on the whole shapefile Apr 21, 2021
@stefsmeets
Copy link
Contributor

Hi @lukasbrunner , thanks for the detailed explanation and the example! I'm working on a fix for this that first makes a subset of shape ids from the shapefiles. Then takes the extent for the geometries defined by those shapes. Essentially it fits a bounding box for the region(s) defined, and then crops the rest. Would that work?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working EUCP www.eucp-project.eu
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants