New BUFR descriptors to represent observations from TROPICS #117

Closed
marijanacrepulja opened this issue Apr 11, 2022 · 26 comments · Fixed by #133

@marijanacrepulja
Contributor

marijanacrepulja commented Apr 11, 2022

Summary and purpose

ECMWF is proposing new BUFR descriptors for representing TROPICS observations.

Stakeholders

Action proposed

The team is kindly asked to review and approve the contents for inclusion within the next update to the WMO Manual on Codes.

Discussions

ECMWF is preparing for operational assimilation of TROPICS observations. We therefore need new BUFR descriptors to accommodate observations from TROPICS.

Detailed proposal

1. Add a new element in table B 0-33-094 for TROPICS calibration quality flags

F-X-Y | ELEMENT NAME | UNIT | SCALE | REFERENCE VALUE | DATA WIDTH (bits)
0-33-094 | Calibration quality control flags | Numeric | 0 | 0 | 24

2. Add a new CODE/FLAG TABLE 0-33-094 for TROPICS calibration quality flags

Bit No Calibration quality control flags
1-15 Reserved
16 Non ocean
17 Lunar or Solar intrusion
18 Spacecraft maneuver
19 Cold calibration consistency
20 Warm calibration consistency
21 Descending
22 Night
23 Payload rear orientation
All 24 Missing

3. Add a new table B descriptor 0-08-097 for Method used to calculate the average instrument temperature

F-X-Y | ELEMENT NAME | UNIT | SCALE | REFERENCE VALUE | DATA WIDTH (bits)
0-08-097 | Method used to calculate the average instrument temperature | Numeric | 0 | 0 | 7

4. Add a new CODE/FLAG TABLE 0-08-097 for Method used to calculate the average instrument temperature

Code Method used to calculate the average instrument temperature
0 The average of six temperature sensors placed throughout the payload
1 Average of WF-band IFP and RFE sensors for channels 1 to 8
2 Average of G-band RFE sensor for channels 9 to 12
3-126 Reserved
127 Missing value
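
For readers who want to see how the two proposed tables would be applied to decoded values, here is a minimal sketch. It assumes the usual BUFR flag table convention, in which bit 1 is the most significant bit of the field and a value with all bits set means missing. The function names are hypothetical and not part of any BUFR library; the table entries are copied from the proposal above.

```python
# Bit names copied from the proposed flag table 0-33-094 (24-bit data width).
FLAG_WIDTH = 24
CALIBRATION_FLAGS = {
    16: "Non ocean",
    17: "Lunar or Solar intrusion",
    18: "Spacecraft maneuver",
    19: "Cold calibration consistency",
    20: "Warm calibration consistency",
    21: "Descending",
    22: "Night",
    23: "Payload rear orientation",
}

# Code figures copied from the proposed code table 0-08-097 (7-bit data width).
TEMPERATURE_METHODS = {
    0: "The average of six temperature sensors placed throughout the payload",
    1: "Average of WF-band IFP and RFE sensors for channels 1 to 8",
    2: "Average of G-band RFE sensor for channels 9 to 12",
}
METHOD_MISSING = 127


def decode_calibration_flags(value):
    """Return the names of the set flags, or None when all 24 bits are set (missing)."""
    if value == (1 << FLAG_WIDTH) - 1:
        return None
    names = []
    for bit, name in sorted(CALIBRATION_FLAGS.items()):
        # Bit 1 is the leftmost bit, so bit n carries weight 2**(width - n).
        if value & (1 << (FLAG_WIDTH - bit)):
            names.append(name)
    return names


def decode_temperature_method(code):
    """Return the method text, 'Reserved' for codes 3-126, or None for missing."""
    if code == METHOD_MISSING:
        return None
    return TEMPERATURE_METHODS.get(code, "Reserved")
```

With this numbering, bit 16 has weight 256 and bit 22 has weight 4, so `decode_calibration_flags(260)` returns `["Non ocean", "Night"]`.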
@marijanacrepulja marijanacrepulja self-assigned this Apr 11, 2022
@marijanacrepulja marijanacrepulja added this to the FT2022-2 milestone Apr 11, 2022
@marijanacrepulja marijanacrepulja added this to Submitted in BUFR4 Amendments via automation Apr 11, 2022
@marijanacrepulja
Contributor Author

@SimonElliottEUM, during our meeting in March we concluded that it would be beneficial to share the new BUFR template for TROPICS with CGMS. Could you please ask for their feedback?
@jbathegit, could you please share the new BUFR template for TROPICS with relevant colleagues on your side as well? I am happy to take your comments on board.
My colleagues from research have circulated the BUFR template to the relevant group, and they are satisfied with it.
Many thanks, Marijana

@amilan17 amilan17 moved this from Submitted to In discussion in BUFR4 Amendments May 11, 2022
@jbathegit
Contributor

  • What's the difference between the 3 different instrument temperatures that are being replicated using 1-01-003? If you really want to report 3 different temperatures, then I think you need to add in a significance qualifier, or some other coordinate class descriptor within this replicated sequence in order to distinguish the difference between them.
  • If you only have 12 channels, then why do you need to use an extended delayed descriptor replication factor? In other words, why not use 0-31-001 (8-bits allowing for up to 255 replications) instead of 0-31-002 (16 bits allowing for up to 65535 replications)?

@marijanacrepulja
Contributor Author

Many thanks for the valuable comment, @jbathegit.
I have not introduced a new descriptor to distinguish between the 3 instrument temperatures.

@amilan17
Member

https://github.com/wmo-im/CCT/wiki/Teleconference-18-and-19.5.2022 notes:

  1. update bits in flag table 0-33-094
  2. update acronym table
  3. Simon needs to present to the CGMS group and request feedback by mid-June; if there isn't agreement, we can postpone to FT2023-1
  4. Simon will help validate

@amilan17 amilan17 added the branch label Jun 7, 2022
@amilan17
Member

amilan17 commented Jun 8, 2022

https://github.com/wmo-im/CCT/wiki/Teleconference-8.6.2022 notes:

still waiting for feedback by mid-June

@amilan17 amilan17 moved this from In discussion to In Validation in BUFR4 Amendments Jun 13, 2022
@amilan17 amilan17 moved this from In Validation to In discussion in BUFR4 Amendments Jun 13, 2022
@amilan17
Member

@SimonElliottEUM @marijanacrepulja Did you get feedback? Do you still want this to go through FT2022-2? If so, please update the branch ASAP. Thanks!

@SimonElliottEUM
Contributor

@amilan17 @marijanacrepulja @jbathegit Comments after CGMS TFSDC review as follows:
Add new satellite sub-identifier 0-01-016 at the start of the Table D sequence.
Giving various coordinate descriptors (in sequence those from 0-05-001 to 0-05-022) for every channel and in every subset will make for a very big product. We should be really sure that all the descriptors need to be reproduced separately for every channel.

@SimonElliottEUM
Contributor

@amilan17 @marijanacrepulja @jbathegit Also please change the name of the sequence not to refer to TROPICS, then we can use it for other similar data (like forthcoming AWS). If it were only for TROPICS, we wouldn't need delayed replication, etc.

Note that the suggestion to add satellite sub-identifier 0-01-016 will also help make the Table D sequence useful for other constellations.

@marijanacrepulja
Contributor Author

@SimonElliottEUM @jbathegit @amilan17
Thank you for the feedback. I will update the proposal accordingly.
I would name the sequence "Observations from constellation of smallsats".
Could you please let me know your thoughts?

@SimonElliottEUM
Contributor

@jbathegit @amilan17 @marijanacrepulja Thanks for working on the sequence name Marijana. We could also use the sequence for a single satellite not part of a constellation so we should enable that.
How about we call the sequence "Satellite sounding data" - this would be suitably generic and match the style of 3-10-026 (Satellite radio occultation data).

@marijanacrepulja
Contributor Author

marijanacrepulja commented Jun 22, 2022

@SimonElliottEUM @jbathegit @amilan17
Please see the comment from my RD colleague Niels.

Question: Giving various coordinate descriptors (in sequence those from 0-05-001 to 0-05-022) for every channel and in every subset will make for a very big product. We should be really sure that all the descriptors need to be reproduced separately for every channel.

Answer: It's a valid point, but it reflects the nature of the instrument - and it is becoming increasingly common that not all channels have the same geolocation. Please note though that for TROPICS the geolocation will not be different for every channel; some groups of channels will have the same geolocation. Compression methods in BUFR should recognise this and use it to reduce file size accordingly.

@SimonElliottEUM
Contributor

@marijanacrepulja @jbathegit @amilan17 We are used to having groups of channels with their own coordinates. But the compression does not handle this - it works between subsets, not between channels in one subset. The approach we will adopt for similar cases (MWI, ICI for example), is to group the channels and have the coordinates for each group. This saves a lot of space and also works with compression. For example, if the 12 channels were in groups of 4, 3 and 5:
First group
0-05-042
...
0-05-022
1-04-004
0-12-163
...
0-33-094
then second group
0-05-042
...
0-05-022
1-04-003
0-12-163
...
0-33-094
then third group
0-05-042
...
0-05-022
1-04-005
0-12-163
...
0-33-094
In this way the compression within the subset is not needed, and we save a lot of space.
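
A rough way to see the saving is to count the values written per subset under each layout. The sketch below is only illustrative. The group sizes 4, 3 and 5 come from the example above, but the counts of 6 coordinate descriptors per group and 9 descriptors per channel are assumptions taken from the draft sequence posted later in this thread, and `values_per_subset` is a hypothetical helper.

```python
def values_per_subset(group_sizes, n_coord, n_per_channel, grouped):
    """Count the channel-related values in one subset.

    grouped=False repeats the coordinates for every channel;
    grouped=True writes them once per group of channels.
    """
    n_channels = sum(group_sizes)
    if grouped:
        return len(group_sizes) * n_coord + n_channels * n_per_channel
    return n_channels * (n_coord + n_per_channel)


groups = [4, 3, 5]  # 12 channels split as in the example above
# Assumed counts: 6 coordinate descriptors, 9 descriptors per channel.
per_channel_layout = values_per_subset(groups, 6, 9, grouped=False)
grouped_layout = values_per_subset(groups, 6, 9, grouped=True)
coordinate_saving = per_channel_layout - grouped_layout
```

Under these assumptions the coordinates drop from 72 values per subset (12 channels times 6) to 18 (3 groups times 6), while the per-channel data stay the same.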

@marijanacrepulja
Contributor Author

@SimonElliottEUM @jbathegit @amilan17

One difficulty with re-using templates tends to be the quality flags (33094) which often are specific to instrument processing. But one could probably accommodate different processing flags for other missions.

We can rename 0-33-094 TROPICS calibration quality flags to calibration quality flags.

@marijanacrepulja
Contributor Author

@SimonElliottEUM thank you for the example of grouping channels.

We need to have channel IDs related to location, brightness temperature and quality information.
There are 3D arrays, e.g. brightness_temperature(channels, scans, spots), latitude(channels, scans, spots), longitude(channels, scans, spots), satellite-zenith(channels, scans, spots), bearing(channels, scans, spots), solar-zenith(channels, scans, spots), solar-azimuth(channels, scans, spots),

where channels=12, scans=2859, spots=81.

Also, we need to use subsets to accommodate 81 spots.
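
To make the array layout concrete, the sketch below shows one way the 3D arrays could map onto BUFR subsets. It assumes one subset per field of view, i.e. per (scan, spot) pair, with the per-channel values collected inside that subset. That mapping is an assumption for illustration only, and the helper names are hypothetical.

```python
CHANNELS, SCANS, SPOTS = 12, 2859, 81  # dimensions quoted in the comment above


def subset_index(scan, spot, spots=SPOTS):
    """Position of field of view (scan, spot) when subsets are written scan by scan."""
    return scan * spots + spot


def channel_values(array, scan, spot):
    """Collect one value per channel for a single subset.

    `array` is any nested sequence indexed as array[channel][scan][spot].
    """
    return [channel[scan][spot] for channel in array]


total_subsets = SCANS * SPOTS            # one subset per field of view
total_values = total_subsets * CHANNELS  # values of one per-channel variable
```

Under that assumption a single file would hold 231579 subsets, each carrying 12 values for every per-channel variable such as brightness temperature.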

@marijanacrepulja
Contributor Author

@SimonElliottEUM @jbathegit @amilan17

One difficulty with re-using templates tends to be the quality flags (33094) which often are specific to instrument processing. But one could probably accommodate different processing flags for other missions.

We can rename 0-33-094 TROPICS calibration quality flags to calibration quality flags.

@SimonElliottEUM we already have 033076 "Calibration quality flags".
I believe we should come up with another name for 033094.
Shall we name it "Calibration quality control flags"?

@SimonElliottEUM
Contributor

@marijanacrepulja @jbathegit @amilan17 we already have 0-33-076 "Calibration quality flags". But it doesn't matter even if the same name comes twice. "Calibration quality control flags" is also fine.
Much more important is the replication for the arrays. I think there is work to be done here and the proposal is not mature enough. As TROPICS 2 & 3 have just been destroyed, and there will be a pause, we are not in a great rush. Please could we give ourselves another 6 months and address this carefully?

@marijanacrepulja
Contributor Author

marijanacrepulja commented Jun 27, 2022

@SimonElliottEUM @jbathegit @amilan17

The size of the BUFR file is 37MB and it contains 1.5 hours of data; the input netCDF file is 56MB. I believe this should not be an issue. With my RD colleagues we considered looping over channels vs. ‘bands’, and decided that looping over channels was simpler and likely more straightforward to apply to future sensors. File size aside, grouping geolocation by band could be misleading and would not be easy to relate to channel number, brightness temperature and quality information. Also, we would like to use the data in operations; delaying approval of the BUFR template for TROPICS so that it becomes operational only in November 2023 would be too late.

@amilan17
Member

amilan17 commented Jul 1, 2022

@marijanacrepulja @SimonElliottEUM @jbathegit -- 

Marijana wants this to go through fast-track 2022-2 and if all are in agreement with current discussion resolutions, then the branch needs to be updated and validated by mid-next week. Is this possible? If not, we will have to wait until FT2023-1.

@marijanacrepulja
Contributor Author

marijanacrepulja commented Jul 1, 2022

@SimonElliottEUM
I followed your approach of grouping the channels and giving the coordinates once per group. The BUFR size is now 10MB for 1.5 hours of data, which is in line with other data sets; e.g. one 6-hour file of FY-3D MWHS-2 data in BUFR is 39MB.
Could you please have a look?
If we can agree on it, I will update the branch and provide sample data.
Many thanks.

310078 Satellite sounding data Note
001007 SATELLITE IDENTIFIER
001016 SATELLITE SUB IDENTIFIER
002019 SATELLITE INSTRUMENTS
002020 SATELLITE CLASSIFICATION
001033 IDENTIFICATION OF ORIGINATING/GENERATING CENTRE
001034 IDENTIFICATION OF ORIGINATING/GENERATING SUB-CENTRE
301011 YEAR, MONTH, DAY
301013 HOUR, MINUTE, SECOND
004007 SECONDS WITHIN A MINUTE (MICROSECOND ACCURACY)
005040 ORBIT NUMBER
201132 INCREASE WIDTH
005041 SCAN LINE NUMBER
201000 INCREASE WIDTH CANCEL
005043 FIELD OF VIEW NUMBER
033079 GRANULE LEVEL QUALITY FLAGS
033080 SCAN LEVEL QUALITY FLAG
033078 GEOLOCATION QUALITY
007002 HEIGHT OR ALTITUDE
102003 REPLICATE 2 DESCRIPTORS 3 TIMES
008097 METHOD USED TO CALCULATE THE AVERAGE INSTRUMENT TEMPERATURE
012164 INSTRUMENT TEMPERATURE
117000 DELAYED REPLICATION OF 17 DESCRIPTORS
031001 DELAYED DESCRIPTOR REPLICATION FACTOR
005001 LATITUDE (HIGH ACCURACY)
006001 LONGITUDE (HIGH ACCURACY)
007024 SATELLITE ZENITH ANGLE
005021 BEARING OR AZIMUTH
007025 SOLAR ZENITH ANGLE
005022 SOLAR AZIMUTH
109000 DELAYED REPLICATION OF 9 DESCRIPTORS
031001 DELAYED DESCRIPTOR REPLICATION FACTOR
005042 CHANNEL NUMBER
002153 SATELLITE CHANNEL CENTRE FREQUENCY
002154 SATELLITE CHANNEL BAND WIDTH
002104 ANTENNA POLARISATION
012066 ANTENNA TEMPERATURE
012163 BRIGHTNESS TEMPERATURE
012158 NOISE-EQUIVALENT DELTA TEMPERATURE WHILE VIEWING COLD TARGET
012159 NOISE-EQUIVALENT DELTA TEMPERATURE WHILE VIEWING WARM TARGET
033094 CALIBRATION QUALITY CONTROL FLAGS
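
The nested delayed replication at the end of the sequence can be unrolled as in the following sketch. The descriptor lists are copied from the listing above, with the inner replication factor written as 031001. The group sizes are chosen by the caller; the split of 4, 3 and 5 channels used in the note below is only the illustrative one from earlier in this thread, and `expand_groups` is a hypothetical helper, not part of any BUFR library.

```python
# Coordinates given once per group of channels (from the listing above).
COORDINATES = ["005001", "006001", "007024", "005021", "007025", "005022"]

# Descriptors repeated for every channel within a group (from the listing above).
PER_CHANNEL = [
    "005042", "002153", "002154", "002104", "012066",
    "012163", "012158", "012159", "033094",
]


def expand_groups(group_sizes):
    """Unroll 117000/109000 into the flat order in which values would appear."""
    expanded = ["031001"]  # outer factor: number of channel groups
    for size in group_sizes:
        expanded.extend(COORDINATES)
        expanded.append("031001")  # inner factor: channels in this group
        for _ in range(size):
            expanded.extend(PER_CHANNEL)
    return expanded


layout = expand_groups([4, 3, 5])
```

For three groups holding 12 channels in total, `layout` contains 130 entries: 1 outer factor, 3 sets of 6 coordinates, 3 inner factors, and 12 sets of 9 per-channel descriptors.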

@SimonElliottEUM
Contributor

I am still uncomfortable with this. I note that it was probably written in a bit of a hurry.
"102003 DELAYED DESCRIPTOR REPLICATION FACTOR" does not refer to delayed replication and there is no replication factor in the sequence.
"117000 REPLICATION OF 17 DESCRIPTORS" is delayed replication but the wording doesn't reflect that.
Generally double nested delayed replication data are very tricky to handle in terms of memory allocation. For TROPICS (again noting no rush as satellites 2 and 3 just failed) there could equally well be fixed replication.
I am on leave until 07/07/22 for family issues in UK. I remain uncomfortable rushing this through for FT 2022-2, but in extremis I guess it could work

@marijanacrepulja
Contributor Author

@SimonElliottEUM many thanks for your feedback.

I have fixed the wording for the delayed replication.
As for double nested delayed replication, I had a look at the existing BUFR sequences and a few of them use that approach, e.g. 307102 and 308015. I believe users can handle that.
I have used double nested delayed replication to make this BUFR sequence generic so that it can be reused for future missions. We could have any number of groups of channels with the same geolocation, and any number of channels within each group. If I use fixed replication, the sequence will work only for TROPICS. My understanding was that the main idea is to make the BUFR sequence generic.
I don't know what additional work needs to be done; I believe we can use the sequence to encode TROPICS and other similar missions. Therefore I have updated the branch with the new entries and provided sample data. I would be grateful if we could finalise validation.

tropics.tar.gz

@marijanacrepulja
Contributor Author

@SimonElliottEUM can we agree on the plan for the TROPICS template? Would you be able to validate the samples, or do you have any other proposals? Many thanks.

@marijanacrepulja marijanacrepulja changed the title New BUFR template for TROPICS New BUFR descriptors to represent observations from TROPICS Jul 26, 2022
@marijanacrepulja marijanacrepulja moved this from In discussion to In Validation in BUFR4 Amendments Jul 26, 2022
@marijanacrepulja
Contributor Author

@amilan17 as we discussed, I have changed this issue to propose new descriptors for the representation of TROPICS data. This will allow the data to be used in operations until we define the BUFR template. The BUFR descriptors have been discussed at the meeting.

Could you please review the branch and move it to ready for FT?

@amilan17 amilan17 mentioned this issue Jul 26, 2022
@amilan17 amilan17 moved this from In Validation to Validated in BUFR4 Amendments Jul 26, 2022
@amilan17 amilan17 linked a pull request Jul 26, 2022 that will close this issue
@amilan17 amilan17 moved this from Validated to Ready for FT Approval Procedure in BUFR4 Amendments Jul 26, 2022
@marijanacrepulja
Contributor Author

tropics.tar.zip

@SimonElliottEUM Could you please validate the provided sample? Many thanks!

@SimonElliottEUM
Contributor

@marijanacrepulja @amilan17 So far it looks good to me. Can you let me have the corresponding data in their original format (netCDF or HDF maybe), so that I can confirm the decoded output from BUFR matches?

@marijanacrepulja
Contributor Author

@SimonElliottEUM
Many thanks for your support with this. The size of netCDF exceeds the 25MB limit. I'll send you an email with the file.
