Simple indicators #1358

ludwiglierhammer · 2023-05-02T07:27:12Z

Hi guys,
this is my first PR regarding issue #1352.
It contains the implementation of the following indices:

snowfall_frequency
snowfall_intensity
mean_daily_windspeed named sfcWind_mean according to tg_mean
maximum_daily_maximum_wind_speed named sfcWindmax_max according to tx_max
mean_precipitation named precip_average according to precip_accumulation
solid_precip_average and liquid_precip_average

I also implemented a conversion from a snowfall flux (prsn) in kg/m**2*s to a snowfall flux in mm/day. I use this conversion to calculate snowfall_frequency and snowfall_intensity. In additon you you can find this conversion in first_snowfall, last_snowfall and days_with_snow. This allows to give thresholds in both kg/m**2*s and mm/day.

Unfortunately, I can't test the new code snippets in my local repository. I get this error message:

fixture 'xdoctest_namespace' not found

Can you help me with this issue?

I'll add some more tests and adjust CHANGES.rst

Cheers, Ludwig

docs/references.bib

coxipi · 2023-05-02T17:37:43Z

Hi Ludwig, thanks for the nice work.

I'm not sure what can cause the fixture problem, I cloned your PR and don't see a problem.

There has been some reorganization, the tests in xclim are not where they used to be. Maybe that's the root of the problem? To test a new feature, I activate xclim with conda activate xclim then update with

conda env update xclim && pip install ".[dev]"

Maybe this way of proceeding would solve your fixtures problem?

xclim/indices/_threshold.py

xclim/indices/_conversion.py

coxipi · 2023-05-02T20:02:37Z

xclim/indices/_threshold.py

+    if unit_thresh._units == unit._units:
+        prsn = prsn_to_mm_per_day(prsn, snr=snr, const=const)
    thresh = convert_units_to(thresh, prsn, context="hydro")


Similar problem here with context="hydro" and having prsn with units mm day-1.

If we have prsn [mm day-1] and thresh [kg m-2 day-1], then thresh = convert_units_to(thresh, prsn, context="hydro") will do a conversion that is not consistent with the choice of const=312 kg m-3. That is, prsn_to_mm_per_day uses const as a conversion factor, whereas convert_units_to(... context="hydro") automatically uses the conversion factor 1000 kg m-3.

On the other hand, having both prsn [mm day-1] and thresh [mm day-1] means that we would use prsn [mm day-1] as an input in prsn_to_mm_per_day, so this is a concrete occurence of the problem outlined in my comment above.

We could

Restrict the use of the indicator first_snowfall to prsn with [mass]/[area]/[time] units. (also, in this case, `context="hydro" would not be necessary).

Allow prsn with [length]/[time] units. We should then figure out what we do when we have prsn [mm day-1] and thresh [kg m-2 day-1]. We could convert prsn with a similar function prsn_to_kg_per_m2_per_day (with some better name)

Maybe taking thresh with units [kg m-2 day-1], so maybe it just complicates things for nothing. Or maybe it's really unusual to have prsn with [mm day-1] to start with, and sticking with option 1) would make sense.

Anyways, I think we need to restrict use cases (1) or make sure every case is covered correctly (2).

I implemented some if statements to convert thresh, low, high and prsn if one of them has units [length] / [time]. What do you think about this approach? Maybe we can add a function str_to_DataArray to xclim.core.units to trim the code.

xclim/indices/_threshold.py

ludwiglierhammer · 2023-05-03T06:47:53Z

Hi Ludwig, thanks for the nice work.

I'm not sure what can cause the fixture problem, I cloned your PR and don't see a problem.

There has been some reorganization, the tests in xclim are not where they used to be. Maybe that's the root of the problem? To test a new feature, I activate xclim with conda activate xclim then update with
conda env update xclim && pip install ".[dev]"
Maybe this way of proceeding would solve your fixtures problem?

Many thanks!

pip install ".[dev]"

solved my problem.

coxipi · 2023-05-03T12:56:23Z

In #1359, we discussed the maximum_consecutive_*_days is just a special case of *_spell_max_length.

The same can be said of sfcWindmax_max and sfcWind_mean (although the generalizing function is a bit more abstract here). The function xclim.indices.generic.statistics covers those cases. For instance, the indicator of sfcWindmax_max could be written as:

sfcWindmax_max = Wind(
...
    # compute=indices.sfcWindmax_max,
    compute=indices.generic.statistics,
    parameters={"reducer": "max"},
)

Anyways, it's nothing specific to these implementations, there are many indices in XClim such as tx_max that could also be replaced by a suitable use of indices.generic.statistics. It should be addressed in the future.

@Zeitsperre , what is your take on this (and the similar discussion in #1359): Should we indeed reduce the amount of needed indices to a minimum like this or keep more specific indices? Also, we could have a generic function that cover indices like XYZ_days, but maybe too much abstraction is not what we aim, I'm not sure.

for more information, see https://pre-commit.ci

coxipi · 2023-05-04T13:34:31Z

xclim/indices/_threshold.py

 def first_snowfall(
    prsn: xarray.DataArray,
    thresh: Quantified = "0.5 mm/day",
+    snr: xarray.DataArray | None = None,
+    const: Quantified = "312 kg m-3",
    freq: str = "AS-JUL",
 ) -> xarray.DataArray:
    r"""First day with solid precipitation above a threshold.


@Zeitsperre Could you give your insight on this. I realize that this index (and the next one) differs from how pr indices usually work, and I want to know if we want to allow this extra layer of complexity.

Indices with pr usually simply keep the units of pr and simply adapt the threshold if needed. This is OK since the conversion is always uniform, same density 1000 kg m-3 everywhere regardless. The case of prsn with possibly space-dependent density (prsn) is more complex.

Should we:

Allow the conversion of prsn inside the index as things are now?

Only let the index convert thresh with a constant conversion factor if needed as in pr indices. In this case, one would need to convert prsn if [length]/[time] units are desired.

Maybe the confusion also comes from the fact that [precipitation] can mean both [mass]/[area]/[time] and [length]/[time]. In case of actual pr, this is not so ambiguous, since the conversion factor is uniform. But for prsn, things are more complex. We were discussing how prsn should be reserved for [mass]/[area]/[time], and there should be another variable name when units are [length]/[time]. In this case, is the [precipitation] unit type too permissive to work with prsn (and its [length]/[time] equivalent)?

(Sorry for the back and forth on this Ludwig. Regardless of the final form of the index, the modifications you made have already been very useful, I used the ideas in the generic flux_and_rate_converter)

I can see how quickly this case is becoming complicated; I think we need to agree on some approaches for handling precipitation. The CF Conventions are pretty specific on what counts as Precipitation Flux (pr; [mass]/[area]/[time]) and Precipitation Amount (no official acronym; [mass]/[area]).

I don't like the idea of hiding conversions within an indice, especially if this conversion step might be used in multiple indices. We don't currently log these conversion steps into the metadata history either, making matters worse. I do think we should be able to support amounts and fluxes within an indicator/index, but how we handle the conversion needs to be clearly communicated (via warnings?) to the user.

The current behaviour to only use a rule-of-thumb liquid water equivalent conversion only if snr is not provided feels like a good approach, so I wouldn't change that.

The more I think about it, the more I agree that we need to define two variables for snow amount and flux. Or at the very least have some code to handle the case of prsn as an amount.

Zeitsperre · 2023-05-04T14:32:39Z

@Zeitsperre , what is your take on this (and the similar discussion in #1359): Should we indeed reduce the amount of needed indices to a minimum like this or keep more specific indices? Also, we could have a generic function that cover indices like XYZ_days, but maybe too much abstraction is not what we aim, I'm not sure.

There are definitely a few key indices for temperature that we opted to not remove when we were doing our last major refactor/simplification, simply because they're always needed (e.g. t{n|g|x}_{min|mean|max}). One reason to keep these around is that indicators can intelligently select the proper variable from a dataset if needed. @aulemahal showed a small application of this off yesterday.

Striking a balance between what is popular and what reduces redundancy is hard, but we should opt for more succinct indices that can be built from (generics) and more offerings of customized indicators.

coxipi · 2023-05-04T16:34:45Z

surface snow vs. snow_precips equivalences

@ludwiglierhammer I would say: snw is to snd what prsn is to pr_solid, do you agree?


xclim name	`snw` (kg m-2)	`swe` (m)	`snd` (m)
cfname	`surface_snow_amount`	`lwe_thickness_of_surface_snow_amount`	`surface_snow_thickness`

xclim name	`prsn` (kg m-2 s-1)	`lwe_prsn` (tentative) (m s-1)	`pr_solid` (tentative) (m s-1)
cfname	`snowfall_flux`	`lwe_snowfall_rate`	???

Do you know if pr_solid has cf_name?

snowfall instead of surface snow

Maybe things are clearer if we remove the "surface" distinction for the snow variables? A snowfall amount here is just some "snow_precipitation rate" integrated over time, isn't it? I find other cf_names in this case:

	Conversion with Water density	Conversion with Snow density
snow vars	`lwe_thickness_of_snowfall_amount`	`thickness_of_snowfall_amount`
snow_precip vars	`lwe_snowfall_rate`	`snowfall_rate` (tentative, didn't find this in CF lists)

snowfall_rate is not defined in CF_lists, but that is the name I would deduce from other entries in the table above.

I would say that the time integral of pr_solid should indeed yield thickness_of_snowfall_amount defined as:

thickness_of_snowfall_amount: "Amount" means mass per unit area. The construction thickness_of_[X_]snowfall_amount means the accumulated "depth" of snow which fell i.e. the thickness of the layer of snow at its own density. There are corresponding standard names for liquid water equivalent (lwe) thickness.

xclim name

Looking for another name than pr_solid: Would prsnd be an acceptable variable name @Zeitsperre ? My logic is that prsn is the precipitation directly related to snw (the time integration of prsn yield the increase in snw). We could have namedprsn as prsnw instead? Logically, the precipitation analog of snd would be prsnd?

Additional thoughts

The index corresponding to first_snowfall in cbcl_climate_2020 seems to use thickness_of_snowfall_amount (m) rather than snowfall_rate (m s-1), no? I'm saying this because the threshold reported is in (cm)

The timing of first snowfall (defined as the first date in the fall when snowfall ≥1 cm)

But they do define density of falling snow (100 kg m-3) which differs from their density of snow on ground (250 kg m-3), so there might be something I'm not understanding. You probably need "density of falling snow" to obtain snowfall_rate and "density of snow on ground" to obtain thickness_of_snowfall_amount, correct?

aulemahal · 2023-05-04T17:19:58Z

If I may : snowfall_rate doesn't exist I think. A [mass]/([area]*[time]) is a flux in CF's vocabulary, thus prsn would be snowfall_flux.

Indeed, I think a snowfall_rate would be [length]/[time], the pr_solid that doesn't exist, your thickness_of_snowfall_rate (I think here "thickness of rate" is the part that doesn't make sense in CF).

coxipi · 2023-05-04T17:22:27Z

If I may : snowfall_rate doesn't exist I think. A [mass]/([area]*[time]) is a flux in CF's vocabulary, thus prsn would be snowfall_flux.

Indeed, I think a snowfall_rate would be [length]/[time], the pr_solid that doesn't exist, your thickness_of_snowfall_rate (I think here "thickness of rate" is the part that doesn't make sense in CF).

Thanks, indeed, snowfall_flux is the right expression with units (kg m-2 s-1).

I think that thickness_of_snowfall_rate should have simply been snowfall_rate if I follow the logic of other entries in the second table.

(those two aspects are corrected above for clarity)

…er/xclim into simple_indicators

Zeitsperre

I think we're about done. A few bugs need to be addressed before merging, though.

tests/test_indices.py

Zeitsperre · 2023-06-07T19:45:36Z

tests/test_indices.py

+    mmday2ms = 86400000
+    prsnd = prsnd_series(
+        (30 - abs(np.arange(366) - 180)) / mmday2ms, start="2000-01-01"
+    )


Maybe it defeats the point of the tests, but wouldn't we want to use pint here instead ?

Unfortunately, I cannot follow your question. Can you give an example how you would do the test? We test whether we get the same results for both prsn and prsnd input.

My bad, what I was wondering was if we should use the existing units-conversion tools to perform these calculations to set up our time series, i.e. create an array of values in mm/day then use xclim.core.units.convert_units_to to convert these values to m/s.

Ah, that makes sense to me. I adjusted the tests with xclim.core.units.convert_units_to and added some more tests for the snowfall indices:

test with prsnd [mm day-1]

test with prsnd [m s-1]

test with prsn [kg m-2 s-1]

All those test yield the same results.

In test_indices.py there is this K2C. We could delete it and call the temperature series with:

da = tas_series(a, units="C")

Then we were more independent of hard-coded conversion things.

True! I know in some cases we are testing to see whether the convert_units_to is working properly, but for examples testing indices specifically, we should migrate, absolutely. Not necessary in this PR!

tests/test_indices.py

Zeitsperre · 2023-06-07T20:18:59Z

xclim/indicators/atmos/_precip.py

@@ -251,12 +256,27 @@ class HrPrecip(Hourly):
    description="{freq} total precipitation.",
    abstract="Total accumulated precipitation. If the average daily temperature is given, the phase parameter can be "
    "used to restrict the calculation to precipitation of only one phase (liquid or solid). Precipitation is "
-    "considered solid if the average daily temperature is below 0°C (and vice versa).",
+    "considered solid if the average daily temperature is below {thresh} (and vice versa).",


Suggested change

"considered solid if the average daily temperature is below {thresh} (and vice versa).",

"considered solid if the average daily temperature is below 0°C (and vice versa).",

The convention we landed on for these templates is to only have fillable fields in the description and long_name

I added "a given threshold" instead of "0°C" (see other threshold indicators) and added {phase} and {thresh} to long_name and description. In indices/_multivariate.py, should we set default phase to "solid and liquid". This would yield to the default long_name "Total accumulated solid and liquid precipitation". I think we should mention this phase and threshold thing in the metadata (long_name, description) too.

We can definitely set the defaults at the indicator or use the defaults that are inherited from indices. Generally, the description and long_name should have information within it that more specifically details the way that this indicator was called (containing fillable fields, like {thresh}), while the abstract would be what the user expects to see when reading the docstring for the function (before having called it).

I think we should explain what is returned when calling it without having modified the call signature.

I think I found the knot in my brain.

The indicators xclim.indicators.atmos.precip_{accumulation|average} are explicitly defined as indicators respecting both liquid and solid precipitation. If one will compute an indicator for either liquid or solid precipitation one can call xclim.indicators.atmos.{liquid|solid}_precip_{accumulation|average}.

Thus, we don't need this liquid-solid differentiation in xclim.indicators.atmos.precip_{accumulation|average} since we can find it in both long_name and description of xclim.indicators.atmos.{liquid|solid}_precip_{accumulation|average}.

Isn't it?

Yes, precisely! Indicators make it simpler for users to return a specific variable with a different default call signature from the indice, which includes all CF-checks and without needing to change any settings.

There are a lot of layers to consider, but the way I envision it is:

indices.generic: Building blocks for creating climate indexes (exposed but no checks for units, nothing specifically tied to a variable/unit).

indices.*: Climate indexes, includes checks for proper units on both input and output datasets, lots of user control and granularity in settings (operators, thresholds, methods, etc.)

indicators.*: Most user-friendly way of calling indices.* with all checks integrated. This is where very specific indexes can be defined with fixed default values and lots of finely-tuned metadata.

xclim/indicators/atmos/_precip.py

xclim/indices/_conversion.py

Co-authored-by: Éric Dupuis <71575674+coxipi@users.noreply.github.com>

Co-authored-by: Trevor James Smith <10819524+Zeitsperre@users.noreply.github.com>

ludwiglierhammer · 2023-06-08T08:03:28Z

@ludwiglierhammer @coxipi This PR is looking very good!

I can explain why some of the doctests are failing; The examples are running pieces of code against actual objects, so if you refer to path_to_sfcWind_file and path_to_sfcWindmax_file, they need to be defined within the doctest setup here:

xclim/xclim/testing/helpers.py

Line 152 in e7664ee

def add_example_file_paths(cache_dir: Path) -> dict[str]:

The file ERA5/daily_surface_cancities_1990-1993.nc contains a few variables that could be used: sfcWind and wsgsmax (standard name: "wind_speed_of_gust").

If you'd like to rename wsgsmax to sfcWindmax (not necessary), this can be done by modifying the file and opening a pull request here: https://github.com/Ouranosinc/xclim-testdata

@Zeitsperre: Thanks for your explanation. I'll try this PR (Ouranosinc/xclim-testdata#25). If this is done are there any further parts we should adjust in xclim?

…er/xclim into simple_indicators

Zeitsperre · 2023-06-08T20:46:23Z

Feel free to ignore the failures from Verify Testing Data; This is simply a workflow to ensure that the modified testdata tag in our CI matches the latest tag (it does).

The reason it is failing is that this PR is coming from a forked branch (nothing can really be done about that).

Zeitsperre · 2023-06-09T16:24:57Z

@Zeitsperre: Thanks for your explanation. I'll try this PR (Ouranosinc/xclim-testdata#25). If this is done are there any further parts we should adjust in xclim?

No, I think we're in the home stretch! I'm waiting on a review of #1388 to fix the final CI-related failures, and I think that this PR will follow right afterwards. Thanks so much once again for all your time and effort!

coxipi · 2023-06-09T17:29:22Z

Thanks for the nice work @ludwiglierhammer @Zeitsperre

Ludwig.Lierhammer added 6 commits April 28, 2023 10:19

indicator: precip_average

898f707

snowfall_intensity, snowfall_frequency

3b0d8da

metadata for snowfall_frequency and snowfall_intensity

e132d7d

pre-commit

a2ece8e

sfcWind_mean, sfcWindmax_max

3be664a

more docstrings for snowfall indices

5fcdc93

coxipi reviewed May 2, 2023

View reviewed changes

docs/references.bib Outdated Show resolved Hide resolved

coxipi reviewed May 2, 2023

View reviewed changes

xclim/indices/_threshold.py Outdated Show resolved Hide resolved

coxipi reviewed May 2, 2023

View reviewed changes

xclim/indices/_conversion.py Outdated Show resolved Hide resolved

coxipi reviewed May 2, 2023

View reviewed changes

xclim/indices/_threshold.py Outdated Show resolved Hide resolved

delete empty phrase

e49e360

Ludwig.Lierhammer added 4 commits May 3, 2023 08:52

low and/or high goes thresh

bc534a1

prsn_to_mm_per_day suggestion

d6a0cba

bunch of if statements

7569509

get intensity

2242e2e

Merge branch 'master' into simple_indicators

5592aad

github-actions bot added docs Improvements to documenation indicators Climate indices and indicators labels May 3, 2023

coxipi and others added 3 commits May 3, 2023 19:08

Generic flux/conversion tools

15902ff

[pre-commit.ci] auto fixes from pre-commit.com hooks

c451e81

for more information, see https://pre-commit.ci

description of new indicators

e81c0f6

coxipi reviewed May 4, 2023

View reviewed changes

coxipi and others added 6 commits June 7, 2023 14:12

Merge branch 'simple_indicators' of https://github.com/ludwiglierhamm…

43d5dc7

…er/xclim into simple_indicators

wsgsmax != wsgs

3e894cb

Error corrected in test: rate2flux -> flux2rate

e6aa3fa

Merge branch 'simple_indicators' of https://github.com/ludwiglierhamm…

d3e3817

…er/xclim into simple_indicators

typo units in test

5bcfbf7

clean up test TODOs, no shadowing

0014842

Zeitsperre approved these changes Jun 7, 2023

View reviewed changes

Zeitsperre and others added 4 commits June 7, 2023 16:51

Merge branch 'master' into simple_indicators

b0c63a6

solve empty return

d85bf1e

Co-authored-by: Éric Dupuis <71575674+coxipi@users.noreply.github.com>

replace fillable thresh fields

05f7432

Update tests/test_indices.py

38adc52

Co-authored-by: Trevor James Smith <10819524+Zeitsperre@users.noreply.github.com>

ludwiglierhammer mentioned this pull request Jun 8, 2023

rename wsgsmax to sfcWindmax Ouranosinc/xclim-testdata#25

Merged

coxipi and others added 4 commits June 8, 2023 10:54

Typo in rate2flux doctest

2906235

Merge branch 'simple_indicators' of https://github.com/ludwiglierhamm…

95f1122

…er/xclim into simple_indicators

Typo in doctest output

e1cef2d

update testdata tag, remove wsgsmax mapping from generate_atmos

0a46004

github-actions bot added the CI Automation and Contiunous Integration label Jun 8, 2023

Ludwig.Lierhammer added 2 commits June 9, 2023 09:25

test snowfall indices with prsnd and prsn and use convert_units_to

d3e61e1

redo metadata description

8edc1a0

coxipi approved these changes Jun 9, 2023

View reviewed changes

Merge branch 'master' into simple_indicators

89808fb

github-actions bot removed the CI Automation and Contiunous Integration label Jun 9, 2023

coxipi merged commit 6d76ebf into Ouranosinc:master Jun 9, 2023
19 checks passed

Zeitsperre mentioned this pull request Jun 9, 2023

Use mamba-org/setup-micromamba GitHub action, support Python3.11 #1388

Merged

5 tasks

coxipi added a commit to ludwiglierhammer/xclim that referenced this pull request Jun 13, 2023

add forgotten issue/pull nums from Ouranosinc#1358

b4dab36

ludwiglierhammer deleted the simple_indicators branch June 14, 2023 06:30

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Simple indicators #1358

Simple indicators #1358

ludwiglierhammer commented May 2, 2023

coxipi commented May 2, 2023 •

edited

coxipi May 2, 2023 •

edited

coxipi May 2, 2023

ludwiglierhammer May 3, 2023

ludwiglierhammer commented May 3, 2023

coxipi commented May 3, 2023 •

edited

coxipi May 4, 2023 •

edited

Zeitsperre May 4, 2023

Zeitsperre commented May 4, 2023

coxipi commented May 4, 2023 •

edited

aulemahal commented May 4, 2023

coxipi commented May 4, 2023 •

edited

Zeitsperre left a comment

Zeitsperre Jun 7, 2023

ludwiglierhammer Jun 8, 2023

Zeitsperre Jun 8, 2023

ludwiglierhammer Jun 9, 2023

Zeitsperre Jun 9, 2023

Zeitsperre Jun 7, 2023

ludwiglierhammer Jun 8, 2023

Zeitsperre Jun 8, 2023

ludwiglierhammer Jun 9, 2023

Zeitsperre Jun 9, 2023

ludwiglierhammer commented Jun 8, 2023 •

edited

Zeitsperre commented Jun 8, 2023 •

edited

Zeitsperre commented Jun 9, 2023

coxipi commented Jun 9, 2023

	"considered solid if the average daily temperature is below {thresh} (and vice versa).",
	"considered solid if the average daily temperature is below 0°C (and vice versa).",

Simple indicators #1358

Simple indicators #1358

Conversation

ludwiglierhammer commented May 2, 2023

coxipi commented May 2, 2023 • edited

coxipi May 2, 2023 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ludwiglierhammer commented May 3, 2023

coxipi commented May 3, 2023 • edited

coxipi May 4, 2023 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Zeitsperre commented May 4, 2023

coxipi commented May 4, 2023 • edited

surface snow vs. snow_precips equivalences

snowfall instead of surface snow

xclim name

Additional thoughts

aulemahal commented May 4, 2023

coxipi commented May 4, 2023 • edited

Zeitsperre left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ludwiglierhammer commented Jun 8, 2023 • edited

Zeitsperre commented Jun 8, 2023 • edited

Zeitsperre commented Jun 9, 2023

coxipi commented Jun 9, 2023

coxipi commented May 2, 2023 •

edited

coxipi May 2, 2023 •

edited

coxipi commented May 3, 2023 •

edited

coxipi May 4, 2023 •

edited

coxipi commented May 4, 2023 •

edited

coxipi commented May 4, 2023 •

edited

ludwiglierhammer commented Jun 8, 2023 •

edited

Zeitsperre commented Jun 8, 2023 •

edited