Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Value of cells in the new grid that are outside the old grid's domain #15

Open
helene-b-e opened this issue Feb 27, 2018 · 14 comments
Open

Comments

@helene-b-e
Copy link

Hi,

xESMF seems like a great tool, I've just started testing it!
I have one issue though. I am regridding a limited domain with UTM-projection to a larger domain with a regular lat lon grid. In my case it seems that the cells in the new grid that are outside the old grid's domain are set to zero and not nan. Are there any options concerning what the default values are outside the input-grid's domain?

Regards, Helene

I'm using Python 3.6.3, xarray 0.10.0 xesmf 0.1.1, scipy 0.19.1

The script I am using:
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr
from pyproj import Proj,transform
import xesmf as xe

SeN=xr.open_dataset('http://thredds.met.no/thredds/dodsC/metusers/senorge2/seNorge2/archive/PREC1d/seNorge2_PREC1d_grid_1957.nc')
projsrs=SeN['UTM_Zone_33'].proj4

LowerLeftEast= -75000
LowerLeftNorth= 6450000
UpperRightEast= 1120000
UpperRightNorth= 8000000
dx=1000.
ZoneNo = "33"

myP=Proj(SeN['UTM_Zone_33'].proj4)

Xcorners=np.arange(SeN['X'].data[0]-dx/2., SeN['X'].data[-1]+3dx/2., dx)
Ycorners=np.flipud(np.arange(SeN['Y'].data[-1]-dx/2., SeN['Y'].data[0]+3
dx/2., dx))
Lon2, Lat2 = myP(*np.meshgrid(SeN['X'].data,SeN['Y'].data),inverse=True)
Lon2b, Lat2b = myP(*np.meshgrid(Xcorners,Ycorners),inverse=True) #

lons=np.asarray(Lon2)
lats=np.asarray(Lat2)
SeN.coords['lat'] = (('Y','X'),Lat2)
SeN.coords['lon'] = (('Y','X'),Lon2)
SeN.set_coords(['lat','lon'])

SeN.coords['Xb'] = (Xcorners)
SeN.coords['Yb'] = (Ycorners)
SeN.set_coords(['Xb','Yb'])

SeN.coords['lat_b'] = (('Yb','Xb'),Lat2b)
SeN.coords['lon_b'] = (('Yb','Xb'),Lon2b)
SeN.set_coords(['lat_b','lon_b'])
res=0.025
reslon=0.05
ds_out = xe.util.grid_2d(lons.min()-0.06,lons.max()+0.2,0.05,lats.min()-0.08,lats.max()+0.03, 0.025)
dr=SeN['precipitation_amount'][0:110]

regridder_cons = xe.Regridder(SeN, ds_out, 'conservative',reuse_weights=True)
dr_out = regridder_cons(dr)
plt.imshow(np.isnan(dr_out[0].data),origin='lower')

cons_regridding_xesmf_0_not_nan_outside_input_data_dom

@JiaweiZhuang
Copy link
Owner

JiaweiZhuang commented Feb 27, 2018

In my case it seems that the cells in the new grid that are outside the old grid's domain are set to zero and not nan. Are there any options concerning what the default values are outside the input-grid's domain?

Currently only zero is used, which is the default behavior in ESMPy. Adding nan as an option is a good suggestion and I am happy to add it in the next version. Defaulting to 0 also has many use cases especially for multi-tile grids (regrid one by one and add up).

A quick and dirty way to get nan right now is:

  1. Make sure your input data has no 0. Add a constant like indata + C
  2. Regrid as usual outdata = regridder(indata)
  3. Set 0 to nan, by outdata[outdata==0.0] = np.nan. It is fine to use == for floating point comparison here because the unmapped cell is exactly 0.
  4. Scale down the output data outdata - C

Notes on ESMPy: The UnmappedAction option can only take ERROR or IGNORE. With ERROR the regridding operation would simply fail if unmapped cell exists. With IGNORE, 0 will be assigned to unmapped cells. But it is not hard to convert zero to nan in xESMF level.

@helene-b-e
Copy link
Author

Thank you for the quick reply and the temporary fix! It would be great to have nan as an option for the values in a future version.

@NicWayand
Copy link

Also running into this issue but the above fix is not working in my case. I have data between 0-1 and use an offset value of +10.0. Below shows an example image of the remapped grid where the target cells that contained both a missing and valid source cell have values between -10 to 0 (after subtracting back the offset value). I guess this verifies that conservative method is working correctly! But it doesn't allow me to mask correctly. So +1 on adding NaN. @JiaweiZhuang, let me know if I can help, if you can point me in the right direction I can give it a shot.

image

@JiaweiZhuang
Copy link
Owner

Ah, you are right, I overlooked this case. Conservative regridding will average the cell on the input grid's boundary with non-existing "cells" outside the boundary (which are assumed to have zero values). This means the added constant (10 in your case) will be diluted to, say ~5, after regridding, and will then become negative after you abstract 10. So that scaling-up-and-down approach won't work correctly. There is actually a conservative regridding option that won't dilute boundary cells (#17), but it is not in xESMF yet.

A more robust way to get nan is using the information in the weight matrix A:

reuse_regridder_26_1

If there are unmapped cells, several rows in the above matrix will be empty. (no empty row in this case since the output grid is fully mapped) Just add one nan element to each empty row, then the unmapped cells should get nan.

@NicWayand
Copy link

Thanks for the suggestion! Below is function for adding nans to empty rows.

It currently throws a warning that editing a csr matrix is slow... but the lil_matrix doesn't have a indptr function (I am new to sparse matrices so there probably is a way).

# Add Nans to matrices, which makes any output cell with a weight from a NaN input cell = NaN
def add_matrix_NaNs(regridder):
    X = regridder.A
    M = scipy.sparse.csr_matrix(X)
    num_nonzeros = np.diff(M.indptr)
    M[num_nonzeros == 0, 0] = np.NaN
    regridder.A = scipy.sparse.coo_matrix(M)
    return regridder

@JiaweiZhuang
Copy link
Owner

@NicWayand Thanks for the code! Does this work correctly for your use case? For performance issues there should be many ways around.

@NicWayand
Copy link

Yes it did!

@Plantain
Copy link

Plantain commented Sep 5, 2019

add_matrix_NaNs doesn't work as a workaround since 0.2.0

regridder.A = scipy.sparse.coo_matrix(M)
AttributeError: can't set attribute

Does anyone have a new workaround?

@JiaweiZhuang
Copy link
Owner

@Plantain Use regridder.weights instead

@mickaellalande
Copy link

Is there any update about this topic? I used the function define by @NicWayand by replacing regridder.A by regridder.weights:

import scipy
def add_matrix_NaNs(regridder):
    X = regridder.weights
    M = scipy.sparse.csr_matrix(X)
    num_nonzeros = np.diff(M.indptr)
    M[num_nonzeros == 0, 0] = np.NaN
    regridder.weights = scipy.sparse.coo_matrix(M)
    return regridder

It works, but it would be more convenient to have an option directly when we create the regridder. Did I miss something or there is no other way to do this so far?

@JiaweiZhuang
Copy link
Owner

Did I miss something or there is no other way to do this so far?

No updates yet from my side. I agree that it's a useful feature, and welcome discussion on the API design. I am thinking about xesmf.Regridder(..., unmapped='nan'). Defaults to unmapped='zero' (the current behavior).

Also welcome PR, especially a unit test with small, synthetic data.

This is better done after #75, which will allow you to save the modified weights. Currently the weights are saved to disk before you have the chance to apply add_matrix_NaNs

@zxdawn
Copy link

zxdawn commented Apr 5, 2020

@JiaweiZhuang I tried the method as illustrated by @mickaellalande :

def add_matrix_NaNs(regridder):
    '''Deal with wrong boundary'''
    X = regridder.weights
    M = scipy.sparse.csr_matrix(X)
    num_nonzeros = np.diff(M.indptr)
    M[num_nonzeros == 0, 0] = np.NaN
    regridder.weights = scipy.sparse.coo_matrix(M)

    return regridder

wrf_bounds = {'lon': wrf.lon,
              'lat': wrf.lat,
              'lon_b': wrf.lon_b,
              'lat_b': wrf.lat_b,
              }

s5p_bounds = {'lon': s5p['longitude'],
              'lat': s5p['latitude'],
              'lon_b': s5p['assembled_lon_bounds'],
              'lat_b': s5p['assembled_lat_bounds'],
              }

# create regridder
regrid_method = 'conservative'
regridder = xe.Regridder(wrf_bounds, s5p_bounds,
                         regrid_method,
                         reuse_weights=True)
regridder = add_matrix_NaNs(regridder)

# apply regridder
regrid_vars = {'p': regridder(wrf['P']+wrf['PB'])}

# plot
wrf_data = (wrf['P']+wrf['PB'])[0, ...]
wrf_data.plot()
plt.show()
regrid_data = regrid_vars['p'][0, ...].where(regrid_vars['p'][0, ...]>0)
regrid_data.plot(vmin=95000)
plt.show()

But, I still got the diluted grids as same as #17 (comment).

@Jing25
Copy link

Jing25 commented Sep 13, 2021

Hi any updates on this topic? Can we set the nan values now?

@axelschweiger
Copy link

Trying this solution (actually using Nic's ESIO package, I get the following error: Any suggestions
Thanks
axel

home/axel/anaconda3/envs/my_env39/lib/python3.9/site-packages/scipy/sparse/_index.py:125: SparseEfficiencyWarning: Changing the sparsity structure of a csr_matrix is expensive. lil_matrix is more efficient.
self._set_arrayXarray(i, j, x)


AttributeError Traceback (most recent call last)
/tmp/ipykernel_26950/3871824035.py in
31
32 # Add NaNs to empty rows of matrix (forces any target cell with ANY source cells containing NaN to be NaN)
---> 33 regridder = import_data.add_matrix_NaNs(regridder)
34
35 # Regrid variable

~/ax_python_lib/ESIO/esio/import_data.py in add_matrix_NaNs(regridder)
261 M = scipy.sparse.csr_matrix(X)
262 num_nonzeros = np.diff(M.indptr)
--> 263 M[num_nonzeros == 0, 0] = np.NaN
264 # regridder.A = scipy.sparse.coo_matrix(M)
265 regridder.weights = scipy.sparse.coo_matrix(M)

AttributeError: can't set attribute

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants