New Product Definition Templates to generalize the percentiles (100-quantiles) forecast Templates to any q-quantiles #53

Closed
sebvi opened this issue Oct 20, 2020 · 8 comments

Comments

@sebvi
Contributor

sebvi commented Oct 20, 2020

Branch

https://github.com/wmo-im/GRIB2/tree/issue-53

Summary and purpose

This document proposes new templates to generalize percentile forecasts to a partitioning of any size called “quantile”, percentiles being 100-quantiles.

Action proposed

The team is requested to approve the content of this proposal for inclusion with the next update of the WMO Manual on Codes.

Discussions

At ECMWF a new experimental global post-processed product, called “ecPoint-rainfall”, was introduced in April 2019 into the suite of real-time forecast products available for use by forecasters worldwide. The methods and products have attracted a great deal of interest, with many national met services and commercial customers requesting access to the GRIB data. ecPoint has also been the subject of many presentations (e.g. see https://youtu.be/QfZW34we2u8) and short articles in print and online (e.g. https://www.harry-otten-prize.org/news.html, scroll to “21 September 2019”). A more comprehensive paper (“A new low-cost technique improves weather forecasts across the world”) was submitted to Nature and is currently in the second review round (preprint available here: https://arxiv.org/abs/2003.14397). Meanwhile further work is underway to develop related products from extended range forecasts and from the ERA5 re-analysis, post-processing 2m temperature (ecPoint-temperature) as well as rainfall; for these we expect a broad user base to develop. Furthermore, ecPoint products can be generated from any global model or ensemble.
We would thus like to archive the new ecPoint data type, but because of its “unusual” format the current GRIB definitions do not permit this. A particular limitation is that the largest number of quantiles one can store for a probabilistic product is 101 (i.e. percentiles, using the percentile forecast templates). We would like to extend this to embrace the “permille” concept, whereby one can store 1001 quantiles (i.e. levels of 0, 0.1, 0.2, 0.3, …, 99.8, 99.9, 100%, stored as quantile values 0, 1, 2, 3, …, 998, 999, 1000). Extending in this way provides the user with much more information on the distribution tails, which is where much of the value of ecPoint output lies, particularly for anticipating extreme events, such as extreme localised rainfall that can lead to devastating flash floods.
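As a rough illustration of the “permille” idea, a quantile can be identified by the pair (total number of quantiles q, quantile value k), with probability level k/q. The sketch below is only an assumption about how a user might do this mapping; the function names are hypothetical and not part of ecCodes or any GRIB library.

```python
from fractions import Fraction


def quantile_level(k: int, q: int) -> Fraction:
    """Return the probability level k/q for quantile value k out of q (illustrative helper)."""
    if q <= 0 or not 0 <= k <= q:
        raise ValueError("expected q > 0 and 0 <= k <= q")
    return Fraction(k, q)


def permille_value(percent: str) -> int:
    """Map a percentage given as text (e.g. '99.9') to its 1000-quantile value."""
    value = Fraction(percent) * 10  # exact arithmetic, no floating-point rounding
    if value.denominator != 1 or not 0 <= value <= 1000:
        raise ValueError("not representable as a 1000-quantile")
    return int(value)


# A single octet holds at most 255, so quantile values up to 1000 need two octets.
MAX_ONE_OCTET = 2**8 - 1
MAX_TWO_OCTETS = 2**16 - 1
```

For example, `permille_value("99.9")` gives the quantile value 999, and `quantile_level(999, 1000)` gives the level 999/1000.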

Detailed proposal

To generalize the concept of percentile to a partitioning of any size, called “quantile”, we propose to introduce two new templates based on the existing percentile templates 4.6 and 4.10. Please note that the quantile value is now encoded on 2 octets to allow the encoding of “permille” (1000-quantiles).

Product definition template 4.86 - Quantile forecasts at a horizontal level or in a horizontal layer at a point in time.

Octet No. Contents
10 Parameter category (see Code table 4.1)
11 Parameter number (see Code table 4.2)
12 Type of generating process (see Code table 4.3)
13 Background generating process identifier (defined by originating centre)
14 Forecast generating process identifier (defined by originating centre)
15–16 Hours after reference time of data cut-off (see Note)
17 Minutes after reference time of data cut-off
18 Indicator of unit of time range (see Code table 4.4)
19–22 Forecast time in units defined by octet 18
23 Type of first fixed surface (see Code table 4.5)
24 Scale factor of first fixed surface
25–28 Scaled value of first fixed surface
29 Type of second fixed surface (see Code table 4.5)
30 Scale factor of second fixed surface
31–34 Scaled value of second fixed surface
35–36 Total number of quantiles q
37–38 Quantile value (between 0 and q)

Note: Hours greater than 65534 will be coded as 65534.
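To make the octet layout concrete, here is a minimal sketch that decodes octets 10 to 38 of the proposed template 4.86 with Python's standard `struct` module. It assumes that the scaled value of the first fixed surface occupies octets 25–28, as in the existing percentile template 4.6, and for simplicity it reads every field as an unsigned big-endian integer, ignoring the GRIB2 sign-bit and missing-value conventions. The field names are hypothetical labels, not ecCodes keys.

```python
import struct

# Octets 10-38 of the proposed template 4.86: big-endian, no padding.
# Assumption: octets 25-28 hold the scaled value of the first fixed surface.
PDT_4_86 = struct.Struct(">5BH2BI2BI2BI2H")

# Hypothetical field labels, in octet order.
FIELDS = (
    "parameter_category",            # octet 10
    "parameter_number",              # octet 11
    "type_of_generating_process",    # octet 12
    "background_process_id",         # octet 13
    "forecast_process_id",           # octet 14
    "hours_after_cutoff",            # octets 15-16
    "minutes_after_cutoff",          # octet 17
    "unit_of_time_range",            # octet 18
    "forecast_time",                 # octets 19-22
    "type_first_surface",            # octet 23
    "scale_factor_first_surface",    # octet 24
    "scaled_value_first_surface",    # octets 25-28 (assumed)
    "type_second_surface",           # octet 29
    "scale_factor_second_surface",   # octet 30
    "scaled_value_second_surface",   # octets 31-34
    "total_number_of_quantiles",     # octets 35-36
    "quantile_value",                # octets 37-38
)


def decode_pdt_4_86(octets: bytes) -> dict:
    """Decode octets 10-38 of a Section 4 using the layout sketched above."""
    if len(octets) != PDT_4_86.size:
        raise ValueError(f"expected {PDT_4_86.size} octets, got {len(octets)}")
    return dict(zip(FIELDS, PDT_4_86.unpack(octets)))
```

The two 2-octet fields at the end are what allow a 1000-quantile to be stored, which a single octet could not hold.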

Product definition template 4.87 – Quantile forecasts at a horizontal level or in a horizontal layer in a continuous or non-continuous time interval

Octet No. Contents
10 Parameter category (see Code table 4.1)
11 Parameter number (see Code table 4.2)
12 Type of generating process (see Code table 4.3)
13 Background generating process identifier (defined by originating centre)
14 Forecast generating process identifier (defined by originating centre)
15–16 Hours after reference time of data cut-off (see Note 1)
17 Minutes after reference time of data cut-off
18 Indicator of unit of time range (see Code table 4.4)
19–22 Forecast time in units defined by octet 18
23 Type of first fixed surface (see Code table 4.5)
24 Scale factor of first fixed surface
25–28 Scaled value of first fixed surface
29 Type of second fixed surface (see Code table 4.5)
30 Scale factor of second fixed surface
31–34 Scaled value of second fixed surface
35–36 Total number of quantiles q
37–38 Quantile value (between 0 and q)
39–40 Year of end of overall time interval
41 Month of end of overall time interval
42 Day of end of overall time interval
43 Hour of end of overall time interval
44 Minute of end of overall time interval
45 Second of end of overall time interval
46 n – number of time range specifications describing the time intervals used to calculate the statistically processed field
47–50 Total number of data values missing in the statistical process
51–62 Specification of the outermost (or only) time range over which statistical processing is done
51 Statistical process used to calculate the processed field from the field at each time increment during the time range (see Code table 4.10)
52 Type of time increment between successive fields used in the statistical processing (see Code table 4.11)
53 Indicator of unit of time for time range over which statistical processing is done (see Code table 4.4)
54–57 Length of the time range over which statistical processing is done, in units defined by the previous octet
58 Indicator of unit of time for the increment between the successive fields used (see Code table 4.4)
59–62 Time increment between successive fields, in units defined by the previous octet (see Note 3)
63–nn These octets are included only if n > 1, where nn = 50 + 12 x n
63–74 As octets 51–62, next innermost step of processing
75–nn Additional time range specifications, included in accordance with the value of n. Contents as octets 51 to 62, repeated as necessary.

Notes:
(1) Hours greater than 65534 will be coded as 65534.
(2) The reference time in section 1 and the forecast time together define the beginning of the overall time interval.
(3) An increment of zero means that the statistical processing is the result of a continuous (or near-continuous) process, not the processing of a number of discrete samples. Examples of such continuous processes are the temperatures measured by analogue maximum and minimum thermometers or thermographs, and the rainfall measured by raingauge.
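The variable-length tail of template 4.87 and Note 1 can be sketched as follows. This is only an illustration of the arithmetic in the table (each time range specification takes 12 octets, the first starting at octet 51, so the last octet is nn = 50 + 12 x n); the function names are hypothetical.

```python
def time_range_spec_octets(n: int) -> list[tuple[int, int]]:
    """Return the (first, last) octet numbers of each of the n time range specifications."""
    if n < 1:
        raise ValueError("at least one time range specification is required")
    # Specification i (0-based) starts 12 * i octets after octet 51.
    return [(51 + 12 * i, 62 + 12 * i) for i in range(n)]


def last_octet(n: int) -> int:
    """Return nn, the last octet of the template, for n time range specifications."""
    return 50 + 12 * n


def coded_hours(hours: int) -> int:
    """Apply Note 1: hours greater than 65534 are coded as 65534."""
    return min(hours, 65534)
```

With n = 2, the specifications occupy octets 51–62 and 63–74, matching the table, and `last_octet(2)` is 74.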

These new templates should be properly referenced in Code table 4.0:
Code figure Meaning
86 Quantile forecasts at a horizontal level or in a horizontal layer at a point in time
87 Quantile forecasts at a horizontal level or in a horizontal layer in a continuous or non-continuous time interval

@amilan17 amilan17 added this to the FT-2021-1 milestone Nov 3, 2020
@amilan17 amilan17 removed their assignment Nov 5, 2020
@sebvi
Contributor Author

sebvi commented Nov 23, 2020

I added a tentative commit for this branch, please check.

On a side note, I noticed that template 4.10 is still tagged "experimental" in the csv file. Does anyone know why?

@jitsukoh

jitsukoh commented Jan 18, 2021

@sebvi could you provide sample data using this template and a decode output to complete the validation process? I understand that the use of products encoded using this new template will be limited to users of ecCodes at this moment, and therefore there is no reason not to approve it, because there are no other decoders available. Please comment if there are any objections.

C.f. 7.3 Testing with relevant applications
For changes that have an impact on automated processing systems, the extent of the testing required before validation should be decided by the designated committee on a case-by-case basis, depending on the nature of the change. Changes involving a relatively high risk and/or impact on the systems should be tested by the use of at least two independently developed tool sets and two independent centres. In that case, results should be made available to the designated committee with a view to verifying the technical specifications.

> ON a side note I noticed that template 4.10 is still tagged "experimental" in the csv file. Anyone knows why?

When GRIB2 was introduced, not all templates were validated because there were no users, and at a later stage, these templates were flagged as “experimental” (there is a note under each template saying “Preliminary note: This template was not validated at the time of publication and should be used with caution. Please report any use to the WMO Secretariat (Observing and Information Systems Department) to assist for validation.”) and this is indicated as "experimental" in the status column of computer-readable tables. PDT 4.10 is one of these templates.

@sebvi
Contributor Author

sebvi commented Jan 21, 2021 via email

@sebvi
Contributor Author

sebvi commented Jan 22, 2021 via email

@sebvi
Contributor Author

sebvi commented Jan 22, 2021

seems to work now

I have uploaded a sample file for each new template (zip). The bitmap and data sections have been removed; otherwise the files were too big.
PDT86_ecmwf.zip
PDT87_ecmwf.zip

Here are the dumps, as txt files, produced using ecCodes:
PDT86_ecmwf.txt
PDT87_ecmwf.txt

@amilan17
Member

@sebvi - The CSV needs some improvement and also, given the discussion today, should we keep the status as Experimental? It looks like you put this here, because the product is experimental.... 

quantile forecasts at a horizontal level or in a horizontal layer in a continuous or non-continuous time interval,35-36, Total number of quantiles q

@sebvi
Contributor Author

sebvi commented Jan 29, 2021

@amilan17 you are correct, I forgot the few empty columns at the end
I will update that.

Status: My understanding is that, using the current workflow (that we may drop soon), the status should change to operational once validated.

@amilan17
Member

@sebvi  thanks for the fix. That's not a practice that I want to support, so I'm going to change it to Operational now so I don't have to edit later and to minimize confusion when cleaning up the status columns.

3 participants