Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JOSS review - filtering on a tripolar grid and numerical instability #124

Closed
AleksiNummelin opened this issue Dec 20, 2021 · 10 comments
Closed
Labels
question Further information is requested
Projects

Comments

@AleksiNummelin
Copy link

AleksiNummelin commented Dec 20, 2021

Dear @NoraLoose, all, this issue is related to my review for the JOSS submission openjournals/joss-reviews#3947

Thanks for putting together this package, it is really nice addition to the general python universe related to ocean/climate modelling, and I'd expect some users to be interested in the methodology beyond these fields. I've tested the package on examples and some of my own data, and it mostly works as expected (see below). The documentation builds heavily on examples which is good I think, although that means that just based on API it might be a bit harder to understand how certain functions are working. Perhaps in time, one could link examples from the API (maybe this is unorthodox, but could help the user). Adding timing to some of the examples is a nice add on.

I only came across one issue that I could not make to work, and it might be related to #123. When I tested the filtering on a tripolar NorESM (BLOM) grid, the code fails when checking that the southern most row is land. In the grid, that row is indeed land and wet_mask[..., 0, :].any() will give False although the check on line 438 gcm_filters/kernels.py somehow manages to give True. I didn't look much more into this, but I also wondered why this condition is needed, does the algorithm fail if it is violated? I think it is fine to publish the JOSS paper before this is fixed (might be my data), but would be good to solve this since the methodology should be general and one of the main benefits of the algorithm is really the possibility to filter on a tripolar grid.

Another small addition that I thought about was to show how the numerical instability is very sensitive to the number of grid cells. This seems to happen very abruptly at 30 grid cells for float32 and 66 grid cells for float128 using the MOM data. I wonder if these are general numbers independent of data? If so, maybe useful to mention and in any case might be better to show exactly when the instability occurs rather than showing the 10 degree case (which is far beyond what the failure point.)

I've attached code to this issue that goes trough these two cases (python code named as txt so that it can be attached here), hope it is useful, and thanks again for putting this together.

@NoraLoose
Copy link
Member

Hi @AleksiNummelin,

Thanks for reviewing our JOSS submission and for your helpful comments!

I only came across one issue that I could not make to work, and it might be related to #123. When I tested the filtering on a tripolar NorESM (BLOM) grid, the code fails when checking that the southern most row is land.

It's really great that you tested the tripolar Laplacian on the NorESM (BLOM) grid! You actually only have to make a minor modification to get your code to work:

delta_filtered_tripolar_regular_with_land = filter_tripolar_regular_with_land.apply(delta, dims=['y', 'x'])

rather than specifying dims=['x',y']. The order of the dimensions matters here. (I guess your code checked whether your westernmost row is equal to zeros.)
We should maybe emphasize in the documentation that the order of dimensions matters.

I didn't look much more into this, but I also wondered why this condition is needed, does the algorithm fail if it is violated?

Yes, having land in the southernmost row is a necessary condition. To see this, it is helpful to look at the following schematic of how the tripolar exchanges are coded:

tripolar_exchanges

In step 1, if any of the southernmost row (U,V,W,X,Y,Z) is wet (i.e., not land), then the values in F,E,D,C,B,A will contaminate the values is U,V,W,X,Y,Z because we are applying diffusion with periodic boundary conditions. Does this make sense?

@NoraLoose
Copy link
Member

@AleksiNummelin I will get back to your question about numerical instability in a bit. @iangrooms and @hmkhatri, feel free to jump in too.

@AleksiNummelin
Copy link
Author

Thanks @NoraLoose, this clarifies! My mistake on the order of the dims then, I'll admit that I didn't read the API properly on this, I was just following the example, assuming that the order would not matter (didn't even think about it). I checked the API now, and it indeed does not specify that the order matters, maybe it would be helpful to add since the error doesn't really point to that - or at least I didn't come to think about that although it makes total sense now. The other option could be to flip the order if necessary, the only thing I could think of right now would be something like if field.dims.index(dims[0])>field.dims.index(dims[1]): dims_new=dims[::-1] which would work for the apply_to_field in filter.py and similarly one could use the ufield or vfield in apply_to_vector in filter.py, should work I think.

@hmkhatri
Copy link
Contributor

Thanks @AleksiNummelin and @NoraLoose.

As mentioned in Filter Theory section, roundoff errors lead to numerical instability. I am not sure if we could a priori identity the filter scale at which filtering scheme becomes unstable. Also, it is not clear if different types of fields, e.g. vectors vs tracers, would become unstable for the same filter scale.

The easiest way for a user is to try different filter scales and keep track of the total variance, which should decay with increasing the filter scale, in the filtered field. It should give an idea about the instability problem.

@NoraLoose
Copy link
Member

@iangrooms came up with empirical values for when the filter becomes unstable:

filter_params = {
FilterShape.GAUSSIAN: {
1: {"offset": 0.8, "factor": 0.0, "exponent": 1, "max_filter_factor": 67},
2: {"offset": 1.1, "factor": 0.0, "exponent": 1, "max_filter_factor": 77},
},
FilterShape.TAPER: {
1: {"offset": 2.2, "factor": 0.6, "exponent": 2.5, "max_filter_factor": 19},
2: {"offset": 3.2, "factor": 0.7, "exponent": 2.7, "max_filter_factor": 20},
},
}

For instance, in n=2 dimensions and a Gaussian filter, the max filter factor is 77. If the user specifies a filter factor higher than that, the code will issue a warning:

if filter_factor >= max_filter_factor:
warnings.warn(
"Filter scale much larger than grid scale -> numerical instability possible. "
"More information on numerical instability can be found at https://gcm-filters.readthedocs.io/en/latest/theory.html.",

But as @hmkhatri says and as we describe in Grooms et al. (2021), the maximum filter factor can't unfortunately be guaranteed a priori. It depends on the scales that are present in the field that the user wants to filter.

@NoraLoose
Copy link
Member

I checked the API now, and it indeed does not specify that the order matters, maybe it would be helpful to add since the error doesn't really point to that.

@AleksiNummelin, I totally agree that we have to do something about this issue. I opened #125.

@NoraLoose NoraLoose added this to To do in JOSS Paper via automation Dec 21, 2021
@NoraLoose NoraLoose moved this from To do to In progress in JOSS Paper Dec 21, 2021
@iangrooms
Copy link
Member

I'll second (third?) what's already been said about the numerical instability: It depends on a lot of factors, but most importantly it depends on the data being filtered which makes it practically impossible to predict a priori when instability will set in. The tests I ran (on which the warnings are based) use white noise, so the warnings should be "conservative" in the sense that the warning will hopefully kick in before the actual instability kicks in. Most people are filtering data that is nicer than white noise.

We're continually trying to improve this; there's a PR right now (#127) that significantly improves stability of the Gaussian filter.

@NoraLoose
Copy link
Member

Hi @AleksiNummelin,

We have worked on addressing your two comments:

Let us know what you think, and thanks again for your comments!

@AleksiNummelin
Copy link
Author

Thanks @NoraLoose, all, for the impressive work, this is really great! I think the documentation/warning approach on the dimension order is reasonable, I mean, we users should be able to write things in certain order now that it is clearly specified. The new information of the numerical instability in the notebook and especially the factor Gaussian approach are really neat. It is now clear to me and probably to a generic user as well what to expect in terms of the instability.

I'll add note to the review that my questions have been thoroughly answered - I think you can then close this issue.

@NoraLoose
Copy link
Member

Thanks Aleksi!

JOSS Paper automation moved this from In progress to Done Feb 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
Development

No branches or pull requests

4 participants