Dimensionless units as "1" - adapt to numpy 2 #1814

aulemahal · 2024-06-27T23:24:33Z

Pull Request Checklist:

This PR addresses an already opened issue (for bug fixes / features)
- This PR fixes New numpy version breaks everything #1785
Tests for the changes have been added (for bug fixes / features)
- (If applicable) Documentation has been added / updated (for bug fixes / features)
CHANGES.rst has been updated (with summary of main changes)
- Link to issue (:issue:number) and pull request (:pull:number) has been added

What kind of change does this PR introduce?

Changes NaN and NAN to nan, Inf to inf.
Changes a test so the new data type promotion of numpy 2 fits our expected values
Relaxes a test for the same reason
Change expected unit order in some cases (new cf_xarray + pint)
Dimensionless units are now printed as "1".
Simplify pint2cfunits.
Add ensure_absolute_tempetature to its module's __all__ and move ensure_delta up in the same module so both functions are near another in the file.

Does this PR introduce a breaking change?

Yes it does. Fixing numpy 2 issues made me fix pint 0.24.1 issues that made me fix cf_xarray issues which have solution that is not backwards-compatible and now pint and cf_xarray have updated pinned that imply a numpy >=2 pin.

Other information:

~~We will require 3 new pins :~~

Most problems with create_ensemble come from a np2 bug in xarray, which was fixed here promote floating-point numeric datetimes to 64-bit before decoding pydata/xarray#9182. ~~We are thus waiting for a release~~. Xarray 2024.07.0 out on the 30th.
All units problem are solved with Rewrite unit formatter for pint 0.24 and earlier xarray-contrib/cf-xarray#523, which was released in cf-xarray 0.9.3.
The dimensionless unit thing requires pint 0.24.1 ~~which requires numpy 2, so pinning this as well.~~

UPDATE: No pins were added, but the behaviour of xclim will be different for dimensionless indicators depending on the cf_xarray/pint versions installed.

aulemahal · 2024-06-28T22:17:09Z

Sadly, there is one last issue that I did not see how to fix. @coxipi the following tests fail with numpy 2:

FAILED tests/test_indices.py::TestStandardizedIndices::test_standardized_precipitation_index[D-1-gamma-ML-values8-0.02] - AssertionError: 
FAILED tests/test_indices.py::TestStandardizedIndices::test_standardized_precipitation_index[D-12-gamma-ML-values9-0.02] - AssertionError: 
FAILED tests/test_indices.py::TestStandardizedIndices::test_standardized_precipitation_index[D-1-fisk-ML-values10-0.02] - AssertionError:

I'm guessing it's also a data type promotion issue, but I don't know where to look.

coxipi · 2024-06-29T01:48:14Z

The SPI functions were only the canary in the mine... indices assigning "" (dimensionless units) are concerned, you can also test "test_ffdi.py" to see this. It comes from pint2cfunits, here is a minimal example with the problematic part:

import xclim 
value = xclim.core.units.units2pint("")
f"{value:cf}"

I guess this should be reported upstream? I'm not sure what body is in charge of operations like f"{value:cf}". Here is a dumdum change I tested in our codebase and checked it works, just change the return of pint2cfunits

return "" if value == units2pint("") else f"{value:cf}"

aulemahal · 2024-07-02T19:11:27Z

This is fixed when you install my branch from cf-xarray. If you do so (with pint 0.24.1 and xarray's master), you'll see that only SPI errors are left.

Zeitsperre · 2024-07-02T19:37:34Z

If you do so (with pint 0.24.1 and xarray's master), you'll see that only SPI errors are left.

How should we handle the new pint version? Will it support Python3.9+? Can we safely raise the lower pin to that upcoming version?

aulemahal · 2024-07-02T19:57:59Z

I guess I was eager to begin my weekend when I wrote this PR's top comment, I could have given more info ;).

My current PR to cf_xarray will fix the issue for pint 0.24.1. An older pint version can be used with any cf-xarray, and next cf-xarray will be usable with any pint version. Unless we want to write a somewhat complex dependency string, I suggest we pin cf-xarray only, not pint.

Note: I will do further changes to the cf-xarray PR, so the change will be a bit more breaking than initially thought, which goes in the direction of my suggestion.

Pint 0.24.1 supports python 3.9+. When all dependent PRs and releases are done, I'll test xclim with different versions of pint to be sure, before merging this PR.

…imilar functions together

aulemahal · 2024-07-08T20:27:39Z

Oups. The units issue was fixed in cf-xarray, but with a change that is not retrocompatible. The newest cf-xarray will return "1" as the dimensionless unit with pint 0.24.1 (and only with this version). This is what the CF conventions specify, but previous pint made this difficult, so we used to return an empty string.

Changing xclim to return "1" thus requires cf-xarray 0.9.3 and pint >= 0.24.1, but the latter implies numpy >=2¹.
So. This PR that was meant to fix issues with numpy 2 is now pinning numpy to >= 2.

To be clear xclim doesn't actually needs numpy >= 2, but I bumped the pin in the env to make solving envs easier for conda/mamba since it is implied by the pint's pin. ↩

aulemahal · 2024-07-08T20:29:57Z

@coxipi FYI, these are the SPI errors on my side:

_____________________________________________ TestStandardizedIndices.test_standardized_precipitation_index[D-1-gamma-ML-values8-0.02] _____________________________________________
[gw1] linux -- Python 3.12.4 /home/pbourg/miniforge3/envs/xclim/bin/python3.12

self = <test_indices.TestStandardizedIndices object at 0x7f3c5fbf74a0>, open_dataset = <function open_dataset.<locals>._open_session_scoped_file at 0x7f3c5dd1c720>, freq = 'D'
window = 1, dist = 'gamma', method = 'ML', values = [-0.011698, 1.597031, 0.969714, 0.265561, -0.132654], diff_tol = 0.02

    @pytest.mark.slow
    @pytest.mark.parametrize(
        "freq, window, dist, method,  values, diff_tol",
        [
            (
                "MS",
                1,
                "gamma",
                "APP",
                [1.31664, 1.45069, 1.94609, -3.09, 0.850681],
                2e-2,
            ),
            (
                "MS",
                12,
                "gamma",
                "APP",
                [0.598209, 1.55976, 1.69309, 0.9964, 0.7028],
                2e-2,
            ),
            (
                "MS",
                1,
                "gamma",
                "ML",
                [1.460105, 1.602951, 2.072521, -3.09, 0.891468],
                0.04,
            ),
            ("MS", 12, "gamma", "ML", [0.59821, 1.5598, 1.6931, 0.9964, 0.7028], 0.04),
            (
                "MS",
                1,
                "fisk",
                "ML",
                [1.41236, 1.51192, 1.93324, -2.74089, 0.932674],
                2e-2,
            ),
            (
                "MS",
                12,
                "fisk",
                "ML",
                [0.683273, 1.51189, 1.61597, 1.03875, 0.72531],
                2e-2,
            ),
            (
                "D",
                1,
                "gamma",
                "APP",
                [-0.18618353, 1.44582971, 0.95985043, 0.15779587, -0.37801587],
                2e-2,
            ),
            (
                "D",
                12,
                "gamma",
                "APP",
                [-0.24417774, -0.11404418, 0.64997039, 1.07670517, 0.6462852],
                2e-2,
            ),
            (
                "D",
                1,
                "gamma",
                "ML",
                [-0.011698, 1.597031, 0.969714, 0.265561, -0.132654],
                2e-2,
            ),
            (
                "D",
                12,
                "gamma",
                "ML",
                [-0.158116, -0.049222, 0.672544, 1.08332, 0.660903],
                2e-2,
            ),
            (
                "D",
                1,
                "fisk",
                "ML",
                [-0.158949, 1.308225, 0.449846, 0.146699, -0.502737],
                2e-2,
            ),
            (
                "D",
                12,
                "fisk",
                "ML",
                [-0.14151269, -0.01914608, 0.7080277, 1.01510279, 0.6954002],
                2e-2,
            ),
            (
                "D",
                1,
                "fisk",
                "APP",
                [-0.417288, 1.275686, 1.002566, 0.206854, -0.672117],
                2e-2,
            ),
            (
                "D",
                12,
                "fisk",
                "APP",
                [-0.220136, -0.065782, 0.783319, 1.134428, 0.782857],
                2e-2,
            ),
            (
                None,
                1,
                "gamma",
                "APP",
                [-0.18618353, 1.44582971, 0.95985043, 0.15779587, -0.37801587],
                2e-2,
            ),
            (
                None,
                12,
                "gamma",
                "APP",
                [-0.24417774, -0.11404418, 0.64997039, 1.07670517, 0.6462852],
                2e-2,
            ),
        ],
    )
    def test_standardized_precipitation_index(
        self, open_dataset, freq, window, dist, method, values, diff_tol
    ):
        ds = open_dataset("sdba/CanESM2_1950-2100.nc").isel(location=1)
        if freq == "D":
            ds = ds.convert_calendar("366_day", missing=np.nan)
            # to compare with ``climate_indices``
        pr = ds.pr.sel(time=slice("1998", "2000"))
        pr_cal = ds.pr.sel(time=slice("1950", "1980"))
        fitkwargs = {}
        if method == "APP":
            fitkwargs["floc"] = 0
        params = xci.stats.standardized_index_fit_params(
            pr_cal,
            freq=freq,
            window=window,
            dist=dist,
            method=method,
            fitkwargs=fitkwargs,
            zero_inflated=True,
        )
        spi = xci.standardized_precipitation_index(pr, params=params)
        # Only a few moments before year 2000 are tested
        spi = spi.isel(time=slice(-11, -1, 2))
    
        # [Guttman, 1999]: The number of precipitation events (over a month/season or
        # other time period) is generally less than 100 in the US. This suggests that
        # bounds of ± 3.09 correspond to 0.999 and 0.001 probabilities. SPI indices outside
        # [-3.09, 3.09] might be non-statistically relevant. In `climate_indices` the SPI
        # index is clipped outside this region of value. In the values chosen above,
        # this doesn't play role, but let's clip it anyways to avoid future problems.
        # The last few values in time are tested
        spi = spi.clip(-3.09, 3.09)
    
>       np.testing.assert_allclose(spi.values, values, rtol=0, atol=diff_tol)

tests/test_indices.py:628: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

args = (<function assert_allclose.<locals>.compare at 0x7f3c5c017560>, array([-0.08378482,  1.45764713,  0.99329611,  0.27189447, -0.44968379]), array([-0.011698,  1.597031,  0.969714,  0.265561, -0.132654]))
kwds = {'equal_nan': True, 'err_msg': '', 'header': 'Not equal to tolerance rtol=0, atol=0.02', 'strict': False, ...}

    @wraps(func)
    def inner(*args, **kwds):
        with self._recreate_cm():
>           return func(*args, **kwds)
E           AssertionError: 
E           Not equal to tolerance rtol=0, atol=0.02
E           
E           Mismatched elements: 4 / 5 (80%)
E           Max absolute difference among violations: 0.31702979
E           Max relative difference among violations: 6.16232032
E            ACTUAL: array([-0.083785,  1.457647,  0.993296,  0.271894, -0.449684])
E            DESIRED: array([-0.011698,  1.597031,  0.969714,  0.265561, -0.132654])

../../miniforge3/envs/xclim/lib/python3.12/contextlib.py:81: AssertionError
____________________________________________ TestStandardizedIndices.test_standardized_precipitation_index[D-12-gamma-ML-values9-0.02] _____________________________________________
[gw1] linux -- Python 3.12.4 /home/pbourg/miniforge3/envs/xclim/bin/python3.12

self = <test_indices.TestStandardizedIndices object at 0x7f3c5fbf7530>, open_dataset = <function open_dataset.<locals>._open_session_scoped_file at 0x7f3c5dd1c720>, freq = 'D'
window = 12, dist = 'gamma', method = 'ML', values = [-0.158116, -0.049222, 0.672544, 1.08332, 0.660903], diff_tol = 0.02

    @pytest.mark.slow
    @pytest.mark.parametrize(
        "freq, window, dist, method,  values, diff_tol",
        [
            (
                "MS",
                1,
                "gamma",
                "APP",
                [1.31664, 1.45069, 1.94609, -3.09, 0.850681],
                2e-2,
            ),
            (
                "MS",
                12,
                "gamma",
                "APP",
                [0.598209, 1.55976, 1.69309, 0.9964, 0.7028],
                2e-2,
            ),
            (
                "MS",
                1,
                "gamma",
                "ML",
                [1.460105, 1.602951, 2.072521, -3.09, 0.891468],
                0.04,
            ),
            ("MS", 12, "gamma", "ML", [0.59821, 1.5598, 1.6931, 0.9964, 0.7028], 0.04),
            (
                "MS",
                1,
                "fisk",
                "ML",
                [1.41236, 1.51192, 1.93324, -2.74089, 0.932674],
                2e-2,
            ),
            (
                "MS",
                12,
                "fisk",
                "ML",
                [0.683273, 1.51189, 1.61597, 1.03875, 0.72531],
                2e-2,
            ),
            (
                "D",
                1,
                "gamma",
                "APP",
                [-0.18618353, 1.44582971, 0.95985043, 0.15779587, -0.37801587],
                2e-2,
            ),
            (
                "D",
                12,
                "gamma",
                "APP",
                [-0.24417774, -0.11404418, 0.64997039, 1.07670517, 0.6462852],
                2e-2,
            ),
            (
                "D",
                1,
                "gamma",
                "ML",
                [-0.011698, 1.597031, 0.969714, 0.265561, -0.132654],
                2e-2,
            ),
            (
                "D",
                12,
                "gamma",
                "ML",
                [-0.158116, -0.049222, 0.672544, 1.08332, 0.660903],
                2e-2,
            ),
            (
                "D",
                1,
                "fisk",
                "ML",
                [-0.158949, 1.308225, 0.449846, 0.146699, -0.502737],
                2e-2,
            ),
            (
                "D",
                12,
                "fisk",
                "ML",
                [-0.14151269, -0.01914608, 0.7080277, 1.01510279, 0.6954002],
                2e-2,
            ),
            (
                "D",
                1,
                "fisk",
                "APP",
                [-0.417288, 1.275686, 1.002566, 0.206854, -0.672117],
                2e-2,
            ),
            (
                "D",
                12,
                "fisk",
                "APP",
                [-0.220136, -0.065782, 0.783319, 1.134428, 0.782857],
                2e-2,
            ),
            (
                None,
                1,
                "gamma",
                "APP",
                [-0.18618353, 1.44582971, 0.95985043, 0.15779587, -0.37801587],
                2e-2,
            ),
            (
                None,
                12,
                "gamma",
                "APP",
                [-0.24417774, -0.11404418, 0.64997039, 1.07670517, 0.6462852],
                2e-2,
            ),
        ],
    )
    def test_standardized_precipitation_index(
        self, open_dataset, freq, window, dist, method, values, diff_tol
    ):
        ds = open_dataset("sdba/CanESM2_1950-2100.nc").isel(location=1)
        if freq == "D":
            ds = ds.convert_calendar("366_day", missing=np.nan)
            # to compare with ``climate_indices``
        pr = ds.pr.sel(time=slice("1998", "2000"))
        pr_cal = ds.pr.sel(time=slice("1950", "1980"))
        fitkwargs = {}
        if method == "APP":
            fitkwargs["floc"] = 0
        params = xci.stats.standardized_index_fit_params(
            pr_cal,
            freq=freq,
            window=window,
            dist=dist,
            method=method,
            fitkwargs=fitkwargs,
            zero_inflated=True,
        )
        spi = xci.standardized_precipitation_index(pr, params=params)
        # Only a few moments before year 2000 are tested
        spi = spi.isel(time=slice(-11, -1, 2))
    
        # [Guttman, 1999]: The number of precipitation events (over a month/season or
        # other time period) is generally less than 100 in the US. This suggests that
        # bounds of ± 3.09 correspond to 0.999 and 0.001 probabilities. SPI indices outside
        # [-3.09, 3.09] might be non-statistically relevant. In `climate_indices` the SPI
        # index is clipped outside this region of value. In the values chosen above,
        # this doesn't play role, but let's clip it anyways to avoid future problems.
        # The last few values in time are tested
        spi = spi.clip(-3.09, 3.09)
    
>       np.testing.assert_allclose(spi.values, values, rtol=0, atol=diff_tol)

tests/test_indices.py:628: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

args = (<function assert_allclose.<locals>.compare at 0x7f3c5c017ce0>, array([-0.15885414, -0.0491648 ,  0.67586258,  0.96024696,  0.66083077]), array([-0.158116, -0.049222,  0.672544,  1.08332 ,  0.660903]))
kwds = {'equal_nan': True, 'err_msg': '', 'header': 'Not equal to tolerance rtol=0, atol=0.02', 'strict': False, ...}

    @wraps(func)
    def inner(*args, **kwds):
        with self._recreate_cm():
>           return func(*args, **kwds)
E           AssertionError: 
E           Not equal to tolerance rtol=0, atol=0.02
E           
E           Mismatched elements: 1 / 5 (20%)
E           Max absolute difference among violations: 0.12307304
E           Max relative difference among violations: 0.11360728
E            ACTUAL: array([-0.158854, -0.049165,  0.675863,  0.960247,  0.660831])
E            DESIRED: array([-0.158116, -0.049222,  0.672544,  1.08332 ,  0.660903])

../../miniforge3/envs/xclim/lib/python3.12/contextlib.py:81: AssertionError
_____________________________________________ TestStandardizedIndices.test_standardized_precipitation_index[D-1-fisk-ML-values10-0.02] _____________________________________________
[gw1] linux -- Python 3.12.4 /home/pbourg/miniforge3/envs/xclim/bin/python3.12

self = <test_indices.TestStandardizedIndices object at 0x7f3c5fbf75c0>, open_dataset = <function open_dataset.<locals>._open_session_scoped_file at 0x7f3c5dd1c720>, freq = 'D'
window = 1, dist = 'fisk', method = 'ML', values = [-0.158949, 1.308225, 0.449846, 0.146699, -0.502737], diff_tol = 0.02

    @pytest.mark.slow
    @pytest.mark.parametrize(
        "freq, window, dist, method,  values, diff_tol",
        [
            (
                "MS",
                1,
                "gamma",
                "APP",
                [1.31664, 1.45069, 1.94609, -3.09, 0.850681],
                2e-2,
            ),
            (
                "MS",
                12,
                "gamma",
                "APP",
                [0.598209, 1.55976, 1.69309, 0.9964, 0.7028],
                2e-2,
            ),
            (
                "MS",
                1,
                "gamma",
                "ML",
                [1.460105, 1.602951, 2.072521, -3.09, 0.891468],
                0.04,
            ),
            ("MS", 12, "gamma", "ML", [0.59821, 1.5598, 1.6931, 0.9964, 0.7028], 0.04),
            (
                "MS",
                1,
                "fisk",
                "ML",
                [1.41236, 1.51192, 1.93324, -2.74089, 0.932674],
                2e-2,
            ),
            (
                "MS",
                12,
                "fisk",
                "ML",
                [0.683273, 1.51189, 1.61597, 1.03875, 0.72531],
                2e-2,
            ),
            (
                "D",
                1,
                "gamma",
                "APP",
                [-0.18618353, 1.44582971, 0.95985043, 0.15779587, -0.37801587],
                2e-2,
            ),
            (
                "D",
                12,
                "gamma",
                "APP",
                [-0.24417774, -0.11404418, 0.64997039, 1.07670517, 0.6462852],
                2e-2,
            ),
            (
                "D",
                1,
                "gamma",
                "ML",
                [-0.011698, 1.597031, 0.969714, 0.265561, -0.132654],
                2e-2,
            ),
            (
                "D",
                12,
                "gamma",
                "ML",
                [-0.158116, -0.049222, 0.672544, 1.08332, 0.660903],
                2e-2,
            ),
            (
                "D",
                1,
                "fisk",
                "ML",
                [-0.158949, 1.308225, 0.449846, 0.146699, -0.502737],
                2e-2,
            ),
            (
                "D",
                12,
                "fisk",
                "ML",
                [-0.14151269, -0.01914608, 0.7080277, 1.01510279, 0.6954002],
                2e-2,
            ),
            (
                "D",
                1,
                "fisk",
                "APP",
                [-0.417288, 1.275686, 1.002566, 0.206854, -0.672117],
                2e-2,
            ),
            (
                "D",
                12,
                "fisk",
                "APP",
                [-0.220136, -0.065782, 0.783319, 1.134428, 0.782857],
                2e-2,
            ),
            (
                None,
                1,
                "gamma",
                "APP",
                [-0.18618353, 1.44582971, 0.95985043, 0.15779587, -0.37801587],
                2e-2,
            ),
            (
                None,
                12,
                "gamma",
                "APP",
                [-0.24417774, -0.11404418, 0.64997039, 1.07670517, 0.6462852],
                2e-2,
            ),
        ],
    )
    def test_standardized_precipitation_index(
        self, open_dataset, freq, window, dist, method, values, diff_tol
    ):
        ds = open_dataset("sdba/CanESM2_1950-2100.nc").isel(location=1)
        if freq == "D":
            ds = ds.convert_calendar("366_day", missing=np.nan)
            # to compare with ``climate_indices``
        pr = ds.pr.sel(time=slice("1998", "2000"))
        pr_cal = ds.pr.sel(time=slice("1950", "1980"))
        fitkwargs = {}
        if method == "APP":
            fitkwargs["floc"] = 0
        params = xci.stats.standardized_index_fit_params(
            pr_cal,
            freq=freq,
            window=window,
            dist=dist,
            method=method,
            fitkwargs=fitkwargs,
            zero_inflated=True,
        )
        spi = xci.standardized_precipitation_index(pr, params=params)
        # Only a few moments before year 2000 are tested
        spi = spi.isel(time=slice(-11, -1, 2))
    
        # [Guttman, 1999]: The number of precipitation events (over a month/season or
        # other time period) is generally less than 100 in the US. This suggests that
        # bounds of ± 3.09 correspond to 0.999 and 0.001 probabilities. SPI indices outside
        # [-3.09, 3.09] might be non-statistically relevant. In `climate_indices` the SPI
        # index is clipped outside this region of value. In the values chosen above,
        # this doesn't play role, but let's clip it anyways to avoid future problems.
        # The last few values in time are tested
        spi = spi.clip(-3.09, 3.09)
    
>       np.testing.assert_allclose(spi.values, values, rtol=0, atol=diff_tol)

tests/test_indices.py:628: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

args = (<function assert_allclose.<locals>.compare at 0x7f3c5c017ba0>, array([-0.19423456,  1.30819834,  0.53076805,  0.2223399 , -0.50263457]), array([-0.158949,  1.308225,  0.449846,  0.146699, -0.502737]))
kwds = {'equal_nan': True, 'err_msg': '', 'header': 'Not equal to tolerance rtol=0, atol=0.02', 'strict': False, ...}

    @wraps(func)
    def inner(*args, **kwds):
        with self._recreate_cm():
>           return func(*args, **kwds)
E           AssertionError: 
E           Not equal to tolerance rtol=0, atol=0.02
E           
E           Mismatched elements: 3 / 5 (60%)
E           Max absolute difference among violations: 0.08092205
E           Max relative difference among violations: 0.51561972
E            ACTUAL: array([-0.194235,  1.308198,  0.530768,  0.22234 , -0.502635])
E            DESIRED: array([-0.158949,  1.308225,  0.449846,  0.146699, -0.502737])

../../miniforge3/envs/xclim/lib/python3.12/contextlib.py:81: AssertionError

This PR still requires xarray's master to run.

coxipi · 2024-07-09T22:28:00Z

Regarding the spi errors, let me show a case study on the first error ("ML-gamma-D-window==1").

First, here are many values, the error is relatively small but not negligible:

(yes a spi, especially daily, can have some weird peaks)

This error can be traced back to _fit_start:

xclim.indices.stats._fit_start(pr_cal_doy1.values, "gamma")
# old
>>> ((0.707242185969578,),  {'scale': 2.6744933340768465e-05, 'loc': 4.230436713933637e-08})
# new np2
>>> ((np.float32(0.70724225),),  {'scale': np.float32(2.674493e-05), 'loc': np.float32(4.2304368e-08)})

we have float32 in np2, float64 before. Taking np.float32 of old results, I now have closer results:

Bottom line: Very small difference in the output of _fit_start due float64 -> float32 in np2 propagates in the fitting routine, a somewhat different set of parameters can be found, which translates in different SPI values.

Why the remaining small errors in the second figure?

Probably weird rounding errors float64 -> float32. I took the float32 at the end of _fit_start. For instance, for doy==124, we have 0.458947 vs. 0.45894697 for the shape param

# e.g. doy==124
# np.float32(old)
>>> ((0.458947,), {'scale': 4.4886667e-05, 'loc': -6.8731404e-14})
# new np2
>>> ((np.float32(0.45894697),),  {'scale': np.float32(4.488667e-05), 'loc': np.float32(-6.87314e-14)})

Taking the float64 -> float32 earlier in _fit_start could probably help, but I don't know, I tried x = x.astype(np.float32) at the beginning of _fit_start but tt was still giving me float64 after... anyways. I think the point is still pretty clear.

for more information, see https://pre-commit.ci

environment.yml

for more information, see https://pre-commit.ci

Zeitsperre · 2024-07-26T20:56:15Z

@aulemahal I'm tapping out here for now.

I needed to replace netcdf4 everywhere because the latest releases (>=1.7.0) are really unstable, but the older versions are not compatible with numpy>=2.0.0. This is causing issues with a handful of tests, mainly those that either:

Expect messages that are provided by the netcdf4 backend (e.g. progress statements).
Rely on synonymous opening of files (h5netcdf uses blocking more and has a harder time with this).

h5netcdf is now a core dependency, seeing that it is being used by default in all read/write operations. I also added the option for setting the --engine using the CLI (default is unsurprisingly h5netcdf).

Here are the issues I can see that needs adjusting:

Within the tests, there's a fixture called "open_dataset" that hard-codes the use of "h5netcdf". We should be using that more thoroughly throughout the test suite (realized this after the fact).
Perhaps we should see whether we can operate entirely without netcdf4 and investigate the acceptability of optional engines like pydap for OPeNDAP support?
The objects coming out of xclim.ensembles appear to be affected, though it isn't clear to me why that might be.

Ping me if you have questions.

aulemahal · 2024-07-26T21:56:08Z

Thanks for the work!

I think the errors you saw in the ensemble modules are those caused by the float64 issue with cftime in xarray's current release. In my env with xarray@master, I see no failures in this module.

The h5netcdf error "OSError: Unable to synchronously open file (file signature not found)" happens when you try to open a non-HDF5 netcdf with engine "h5netcdf". I.e. : if you wrote it with scipy and try to force engine='h5netcdf' on the opening. Is this the error you are referring to when you talk about tests that "Rely on synonymous opening of files " ?

In my env, the CLI tests failed because the outputs were written without a specified engine and xarray seemed to default to scipy. The subsequent opening with h5netcdf forced failed with this (sadly unclear) error.

I disagree with making h5netcdf a core dependency of xclim. My principal arguments would be that

xclim does not do I/O (except in the cli module, but even then it does very generic calls)
xarray itself does not have any engine as core dependencies. We should follow the same logic.

But it is a test dependency, of course. The best way to force tests not to use netCDF4 is to not have the package in the env in the first place. In my env, I uninstalled it and all tests pass. I'll push somehing and we'll see!

I have no strong feelings about forcing h5netcdf everywhere in the tests. But it seems to be slightly dangerous as seen with the cli tests above : a to_netcdf call without an engine might write with scipy, which would make an open_dataset(..., engine='h5netcdf') call fail.

CHANGELOG.rst

aulemahal · 2024-07-31T18:29:42Z

@Zeitsperre FYI, I removed a test that was ensuring an error was raised on unfound opendap url. As we don't install netCDF4 in the CI anymore, the CI doesn't have opendap support anyway, so I thought the test wasn't necessary anymore.

review-notebook-app · 2024-07-31T18:51:47Z

Check out this pull request on

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

Fix NaN and Inf

583917f

github-actions bot added sdba Issues concerning the sdba submodule. indicators Climate indices and indicators labels Jun 27, 2024

aulemahal added 4 commits June 28, 2024 18:08

Fix tests for numpy 2

06d18ca

remove pint pin

7e9107e

upd changes

e3b9851

Merge branch 'main' into fix-for-np2

7a36c35

Merge branch 'main' into fix-for-np2

deaa702

aulemahal added 4 commits July 8, 2024 14:37

upd pin for cf-xarray

7330221

Merge branch 'main' into fix-for-np2

b446e60

Pin cf_xarray, pint and numpy - adapt to "1" change

c14d0b9

Add ensure_absolute_tempetature to __all__ and refactor to put both s…

da22473

…imilar functions together

aulemahal changed the title ~~Fix for Numpy 2~~ Dimensionless units as "1" - adapt to and pin numpy 2 Jul 8, 2024

coxipi and others added 3 commits July 10, 2024 11:20

new spi test values (due to np2's float32)

14c1f59

[pre-commit.ci] auto fixes from pre-commit.com hooks

7be1d57

for more information, see https://pre-commit.ci

Merge branch 'main' into fix-for-np2

cc4fe55

Zeitsperre reviewed Jul 15, 2024

View reviewed changes

environment.yml Outdated Show resolved Hide resolved

aulemahal mentioned this pull request Jul 16, 2024

Fix wet spells - generic spell statistics #1838

Merged

5 tasks

Zeitsperre and others added 4 commits July 26, 2024 16:50

Merge branch 'main' into fix-for-np2

c21b8ed

[pre-commit.ci] auto fixes from pre-commit.com hooks

d956c3f

for more information, see https://pre-commit.ci

fix changelog

97d653b

revert pins

60a186c

github-actions bot added the API Interfacing and User Concerns label Jul 26, 2024

Zeitsperre added 5 commits July 26, 2024 15:44

add h5netcdf to deptry

0f6f756

add xarray engine to cli

4aeeb89

adjust more IO calls to use h5netcdf

a2b0b42

fix cannon test

cf286d2

use h5netcdf

d557f4d

Remove netCDF4 as a dep - put h5netcdf back to dev deps - fix cli tests

4d95e63

aulemahal changed the title ~~Dimensionless units as "1" - adapt to and pin numpy 2~~ Dimensionless units as "1" - adapt to numpy 2 Jul 31, 2024

Zeitsperre and others added 4 commits July 31, 2024 11:59

replace open_dataset

3312405

Merge branch 'main' into fix-for-np2

39f1e65

Merge branch 'main' into fix-for-np2

f472a9c

Better Version checks

3f3ae71

aulemahal marked this pull request as ready for review July 31, 2024 16:20

oups wrong version

967ddbb

Zeitsperre approved these changes Jul 31, 2024

View reviewed changes

CHANGELOG.rst Outdated Show resolved Hide resolved

github-actions bot added the approved Approved for additional tests label Jul 31, 2024

aulemahal commented Jul 31, 2024

View reviewed changes

CHANGELOG.rst Outdated Show resolved Hide resolved

aulemahal added 3 commits July 31, 2024 13:56

Update CHANGELOG.rst

14a62d5

fix doctests

0524386

Remove opendap test as the CI doesnt have opendap support anymore

3d4beab

Fix notebooks for netCDF4 absence

1bc53fd

github-actions bot added the docs Improvements to documenation label Jul 31, 2024

Fix xarray 2024.7.0 regression (?) issue 9300

3ded3ed

aulemahal merged commit d13cb6c into main Jul 31, 2024
21 checks passed

aulemahal deleted the fix-for-np2 branch July 31, 2024 19:52

aulemahal mentioned this pull request Aug 1, 2024

Slow tests are failing for older numpy #1861

Closed

2 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Dimensionless units as "1" - adapt to numpy 2 #1814

Dimensionless units as "1" - adapt to numpy 2 #1814

aulemahal commented Jun 27, 2024 •

edited

Loading

aulemahal commented Jun 28, 2024

coxipi commented Jun 29, 2024 •

edited

Loading

aulemahal commented Jul 2, 2024

Zeitsperre commented Jul 2, 2024

aulemahal commented Jul 2, 2024

aulemahal commented Jul 8, 2024 •

edited

Loading

aulemahal commented Jul 8, 2024

coxipi commented Jul 9, 2024 •

edited

Loading

Zeitsperre commented Jul 26, 2024

aulemahal commented Jul 26, 2024

aulemahal commented Jul 31, 2024

review-notebook-app bot commented Jul 31, 2024

Dimensionless units as "1" - adapt to numpy 2 #1814

Dimensionless units as "1" - adapt to numpy 2 #1814

Conversation

aulemahal commented Jun 27, 2024 • edited Loading

Pull Request Checklist:

What kind of change does this PR introduce?

Does this PR introduce a breaking change?

Other information:

aulemahal commented Jun 28, 2024

coxipi commented Jun 29, 2024 • edited Loading

aulemahal commented Jul 2, 2024

Zeitsperre commented Jul 2, 2024

aulemahal commented Jul 2, 2024

aulemahal commented Jul 8, 2024 • edited Loading

Footnotes

aulemahal commented Jul 8, 2024

coxipi commented Jul 9, 2024 • edited Loading

Zeitsperre commented Jul 26, 2024

aulemahal commented Jul 26, 2024

aulemahal commented Jul 31, 2024

review-notebook-app bot commented Jul 31, 2024

aulemahal commented Jun 27, 2024 •

edited

Loading

coxipi commented Jun 29, 2024 •

edited

Loading

aulemahal commented Jul 8, 2024 •

edited

Loading

coxipi commented Jul 9, 2024 •

edited

Loading