juntyr left a comment
Could we also get a max relative error to match? Else it looks good to go from my end :)
    y : xr.DataArray
        Shape (realization, time, vertical, latitude, longitude)
    """
    rel_error = np.abs(x - y) / np.abs(x)
What happens for x=0? If x=y, we should ensure that the rel_error=0, otherwise it should be inf
If x = 0 and y = 0, the relative error will evaluate to np.nan and hence be ignored in the max reduction. So it still returns the correct result, unless all entries of x and y are 0. Typing this reasoning out loud makes me realise I should just add an explicit check :D
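To make the reasoning above concrete, here is a small sketch of what the unguarded division does. It uses plain NumPy arrays rather than `xr.DataArray`, and `np.nanmax` stands in for a NaN-skipping max reduction; both are assumptions made only for illustration, not the code in this PR.

```python
import warnings

import numpy as np

x = np.array([0.0, 1.0, 2.0])
y = np.array([0.0, 1.5, 2.0])

# 0/0 evaluates to NaN; NumPy would also emit a RuntimeWarning, silenced here.
with np.errstate(divide="ignore", invalid="ignore"):
    rel_error = np.abs(x - y) / np.abs(x)

# A NaN-skipping max ignores the 0/0 entry ...
max_skipping_nan = np.nanmax(rel_error)

# ... while a plain max propagates the NaN.
max_plain = np.max(rel_error)

# Corner case: if every entry of x and y is 0, nothing is left to reduce over.
zeros = np.zeros(3)
with np.errstate(invalid="ignore"), warnings.catch_warnings():
    warnings.simplefilter("ignore", RuntimeWarning)  # "All-NaN slice encountered"
    max_all_zero = np.nanmax(np.abs(zeros - zeros) / np.abs(zeros))
```

So the result is only right as long as at least one entry survives the NaN filtering, which is the case an explicit check would cover.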
I think we should use an explicit np.where here, e.g.
    np.where(
        x != 0,
        np.abs(x - y) / np.abs(x),
        np.where(
            y == 0,
            0,
            np.inf,
        ),
    )
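A self-contained sketch of that suggestion, in case it helps. The function name `relative_error` is hypothetical, and it works on plain NumPy arrays rather than `xr.DataArray`. Since `np.where` evaluates both branches eagerly, the division still runs for `x == 0`, so the sketch silences the resulting warnings with `np.errstate`.

```python
import numpy as np


def relative_error(x, y):
    """Element-wise relative error with explicit handling of x == 0.

    Hypothetical helper: returns 0 where x == y == 0 and inf where
    x == 0 but y != 0.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Both branches of np.where are computed up front, so the division by
    # zero still happens; suppress the warnings and discard those values.
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = np.abs(x - y) / np.abs(x)

    return np.where(x != 0, ratio, np.where(y == 0, 0.0, np.inf))


x = np.array([0.0, 0.0, 2.0, 4.0])
y = np.array([0.0, 1.0, 2.0, 3.0])
errors = relative_error(x, y)
max_error = errors.max()
```

With this, the all-zero case gives a max relative error of 0 instead of NaN. NaNs that are already present in `x` or `y` still propagate, because `x != 0` is true for NaN and the ratio branch is chosen.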
@juntyr I have also added a max relative error bound now. I also noticed that the default behaviour of calling

In another PR, I think we should add an argument to the metric functions that controls whether they skip NaNs. Skipping NaNs is actually quite useful in practice: for data sources that contain NaNs, and for compressors that do not strictly preserve the NaN pattern (JPEG and SZ3), it gives us an idea of what sort of error to expect if we applied some simple NaN-mask pre/post-processing. If we just return NaN for the metrics, that information is lost. We also already know whether the NaN masks match from our error bound tests, so we can use the pass/fail information to filter the validity of the computed metrics.

But long-term I think it makes sense to compute both versions of each metric: one that skips NaNs and one that doesn't.
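A rough sketch of what such an argument could look like. The name `max_abs_error` and the `skip_nan` parameter are hypothetical, and the sketch uses plain NumPy arrays rather than the benchmark's actual metric functions.

```python
import numpy as np


def max_abs_error(x, y, skip_nan=True):
    """Maximum absolute error, optionally ignoring NaN entries.

    Hypothetical signature, shown only to illustrate the two variants.
    """
    abs_error = np.abs(np.asarray(x, dtype=float) - np.asarray(y, dtype=float))
    # np.nanmax drops NaN entries; np.max propagates them.
    reduce = np.nanmax if skip_nan else np.max
    return float(reduce(abs_error))


x = np.array([1.0, np.nan, 3.0])  # original data with a missing value
y = np.array([1.5, 2.0, 3.0])     # decompressed data that did not keep the NaN

skipping = max_abs_error(x, y, skip_nan=True)
strict = max_abs_error(x, y, skip_nan=False)
```

Here `skipping` reports the error on the valid entries only, while `strict` is NaN and flags that the NaN patterns differ. Reporting both keeps the two pieces of information separate.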
I feel like the failure of the error bound metric (or, if we want to be more explicit, a dedicated "do NaNs match" test just for this) will be enough. Seeing metric=NaN never looks good (even if it's the compressor's fault and not the benchmark's).
Adds a maximum absolute error metric, as was raised during #20.