Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent usage of "degC" and "°C" in Indicator metadata #890

Open
Zeitsperre opened this issue Oct 29, 2021 · 6 comments · May be fixed by #1120
Open

Inconsistent usage of "degC" and "°C" in Indicator metadata #890

Zeitsperre opened this issue Oct 29, 2021 · 6 comments · May be fixed by #1120
Assignees
Labels
invalid This doesn't seem right standards / conventions Suggestions on ways forward wontfix This will not be worked on

Comments

@Zeitsperre
Copy link
Collaborator

I've realized that there is a lack of consistency between the usages of "degC" and "°C" in the Indicator metadata fields. For many hard-coded Indicators (primarily in French metadata translations), we tend to use "°C", values that derived from thresh can use "degC". For example:

  "FREEZETHAW_SPELL_MEAN_LENGTH": {
    "title": "Longueur moyenne des événements de gel-dégel quotidien.",
    "abstract": "Longueur moyenne des périodes de gel-dégel. Ces périodes sont des séries de jours consécutifs où la température minimale quotidienne est sous un seuil et la température maximale quotidienne au-dessus d'un autre. En général, ces deux seuils sont 0°C.",
    "description": "Longueur moyenne {freq:f} des événements de gel-dégel quotidien : Tmax > {thresh_tasmax} and Tmin <= {thresh_tasmin} pour au moins {window} jour(s) consecutif(s).",
    "long_name": "Longueur moyenne {freq:f} des événements de gel-dégel quotidien."
  },

In the fields here, we can have instances where phrases use both "°C" and thresh_* which will result in degC, if using the default indicator call signatures.

Should addressing this inconsistency involve a translation step ("deg" ⇾ "°"), or should we encourage usage of "°C" nomenclature in Indicators? If forced to choose, I would like to see an adoption of "°C" notation throughout the Indicator functions (it reads much better, IMO).

How would we go about addressing this?

@Zeitsperre Zeitsperre added invalid This doesn't seem right standards / conventions Suggestions on ways forward labels Oct 29, 2021
@Zeitsperre Zeitsperre added this to the v0.31 milestone Oct 29, 2021
@Zeitsperre Zeitsperre self-assigned this Oct 29, 2021
@Zeitsperre
Copy link
Collaborator Author

@aulemahal What degree of complexity should we be considering here?

@aulemahal
Copy link
Collaborator

I believe we could do most of the work in the Indicator's format? Right now, "quantity strings", like thresholds, are not handled differently than simple strings. I think a solution to this issue would be to parse them (into pint objects) and then reformat as a string, handling the special cases.

I am suggesting something a bit more complex than str.replace('deg', °). I guess pint has tools to help us?

If we are to play in format, another bug we have is with templated var_names. We allow to change a variable name according to passed arguments, but in the case of a threshold, for example, that means the var name contains a space!

MWE:

import xclim as xc
from xclim.testing import open_dataset

ds = open_dataset('ERA5/daily_surface_cancities_1990-1993.nc')

out = xc.indicators.cf.ctmgeTT(threshold='1 degC', ds=ds)
out.name
# 'ctmge1 degC'

Both issues can be fixed by an improvement of the formatting logic.

@Zeitsperre Zeitsperre modified the milestones: v0.31, v0.32 Nov 5, 2021
@tlogan2000 tlogan2000 modified the milestones: v0.32, v0.33 Dec 13, 2021
@Zeitsperre
Copy link
Collaborator Author

To follow up on this, if we want to base some work off of

def _convert_value_to_str(var_name, val) -> str:
(or even refactor that sub-function to utils.py), that might help us along.

@Zeitsperre Zeitsperre modified the milestones: v0.33, v0.34 Jan 27, 2022
@tlogan2000 tlogan2000 modified the milestones: v0.34, v0.35 Feb 14, 2022
@aulemahal
Copy link
Collaborator

I'm hitting more reefs that I first thought... My goal was to use cf_xarray's cf unit formatter directly, but it seems that it's not exactly what we want.

Issue 1: it prints unicode character for some units (°C), we might not want that in the units attribute of netCDFs.
Issue 2: Pint is made so that dimensionless units are hardcoded to "dimensionless" in the long version, which ain't cool.
Issue 2.5 : The CF formatter of cf_xarray is broken if we request a "short" version (in an attempt to fix issue 2).

Thus, unless the current issue is really wanted. I would wait (or code it myself) for a resolution of those upstream issues before putting work here.

@aulemahal aulemahal modified the milestones: v0.35, v0.36 Mar 17, 2022
@Zeitsperre
Copy link
Collaborator Author

Marking this as a wontfix for now, since the changes need to be performed upstream in our dependency stack.

@Zeitsperre Zeitsperre added the wontfix This will not be worked on label Mar 18, 2022
@huard
Copy link
Collaborator

huard commented Apr 22, 2022

See hgrecco/pint#1486

@Zeitsperre Zeitsperre linked a pull request Sep 13, 2022 that will close this issue
6 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
invalid This doesn't seem right standards / conventions Suggestions on ways forward wontfix This will not be worked on
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants