GRIB2 - Template 42 #18

Closed
Dadoof opened this issue Jul 22, 2024 · 10 comments · Fixed by #22

Comments

@Dadoof

Dadoof commented Jul 22, 2024

Hello there folks,

At ECMWF - a major producer of GRIB files of weather data - they are undergoing a transition of sorts. They want to use GRIB2 exclusively, as it offers superior compression to GRIB1. They can move more data more quickly the more compressed it is.

The data, which can be obtained here: https://registry.opendata.aws/ecmwf-forecasts/ is, as near as I can tell, causing problems for NGRIB.

In essence, I think we need to add a DataRepresentation section that handles template 42. Details:
Section 5 - Template 42: Grid point data - CCSDS recommended lossless compression (https://codes.ecmwf.int/grib/format/grib2/templates/5/42/)

Regards,
Brian E. / Dadoof

Dadoof changed the title from "GRIB2" to "GRIB2 - Template 42" on Jul 23, 2024
@nmangue
Owner

nmangue commented Jul 29, 2024

Hello,

I have looked at what is needed to support this template, and it is not a trivial change. This data representation relies on Adaptive Entropy Coding (AEC). eccodes decodes it with libaec, a C implementation of the algorithm, but there is nothing equivalent in the .NET world.
Supporting it would require either wrapping and packaging the native libraries or reimplementing the algorithm in C#.

@Dadoof
Author

Dadoof commented Jul 29, 2024

Good day. I see what you mean; that is part of why neither I nor anyone else where I work had tried it yet.

I will say this: our use of NGRIB has been tremendous over the years because of how well it works across platforms. We sure hope to continue to use it given how good it is. Thus, as these ECMWF data will become the standard over time, we'll need to be sure to be able to work with them.

Your point that there is no libaec equivalent in .NET is... significant. I am totally grasping at straws, but would this be at all helpful: https://github.com/GribApiDotNet/GribApi.NET

Thanks for the response, and for seeing what is involved. We appreciate your work a great deal.

@nmangue
Owner

nmangue commented Jul 31, 2024

The GribApi.NET project is a wrapper of the grib_api native library which itself depends on libaec. It also seems to provide only a Windows binary. For NGrib, I would like to keep the implementation cross-platform.

I have found the matching eccodes unpack function. The implementation of the grib part is relatively straightforward. The heavy lifting is being able to use libaec from C#.

I will give it a try.

@simonkingrwe

Hi, I work with @Dadoof and have looked at this as a .NET developer rather than a C developer. I have been able to wrap libaec so that it can be called from C#, and in a copy of NGrib I have added a DataRepresentation section for template 42 that inherits from GridPointDataSimplePacking. In the DoEnumerateDataValues method I have added a call to the decompressor that wraps libaec. The decompressor converts the resulting values into an array of int, which is passed into the inherited Unpack method to return the full set of floats.
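The integer-to-float step described above can be sketched as follows. This is an illustrative Python sketch, not NGrib's C# Unpack method: the function and parameter names are hypothetical, and it assumes the standard GRIB2 simple-packing formula Y = (R + X * 2^E) / 10^D, where R is the reference value, E the binary scale factor, and D the decimal scale factor.

```python
def unpack_simple(packed, reference_value, binary_scale, decimal_scale):
    """Turn packed integers X into physical values Y = (R + X * 2**E) / 10**D.

    Hypothetical helper for illustration; the names are not taken from NGrib.
    """
    b = 2.0 ** binary_scale    # 2**E
    d = 10.0 ** decimal_scale  # 10**D
    return [(reference_value + x * b) / d for x in packed]


# Made-up inputs: R = 250.0, E = -1, D = 1
values = unpack_simple([0, 1, 2, 3], reference_value=250.0,
                       binary_scale=-1, decimal_scale=1)
print(values)
```

With template 42 the only difference from plain simple packing is where the integers come from: they are produced by the AEC decompressor rather than read directly from the bit stream.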

I have added a unit test that has allowed me to verify the results and compare them with WGrib output for the same GRIB2 file; they match perfectly.

There are a few things that need attention:

  1. I have had to add a wrapper method to libaec for the call to decode, because a direct call from C# is difficult to achieve: some of the structs in libaec are not constructable. To be able to use libaec, I will need to get that wrapper method accepted by the libaec maintainers.
  2. As libaec is a C library, we may need to look at how it is supplied to NGrib, since it will need to be compiled for the target environment.
  3. The DataRepresentation template handler needs to know the expected size of the output, which can be derived from DataPointsNumber on the GridDefinition section. I haven't found a simple way to pass this value across yet (I have been focussed on the decompress/decode element first).
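To make point 3 concrete, here is a rough Python sketch of how the output buffer size could be derived from the number of data points, and how the decompressed bytes could then be split into integers. All names are hypothetical. The byte width per sample is an assumption based on how libaec documents its sample storage (1 byte for up to 8 bits, 2 bytes for up to 16 bits, 4 bytes for up to 32 bits, with 3 bytes for 17 to 24 bits only when the 3-byte option is enabled), and the byte order depends on the decoder flags, so both should be checked against the actual decoder configuration.

```python
def sample_width(bits_per_value, three_byte=False):
    """Bytes used per decoded sample (assumed storage rule, see lead-in)."""
    if not 1 <= bits_per_value <= 32:
        raise ValueError("bits_per_value must be between 1 and 32")
    if bits_per_value <= 8:
        return 1
    if bits_per_value <= 16:
        return 2
    if bits_per_value <= 24:
        return 3 if three_byte else 4
    return 4


def expected_output_size(num_points, bits_per_value, three_byte=False):
    """Size in bytes of the buffer the decompressor must fill."""
    return num_points * sample_width(bits_per_value, three_byte)


def bytes_to_ints(buf, width, byteorder="big"):
    """Split a decompressed buffer into unsigned integers of `width` bytes."""
    if len(buf) % width != 0:
        raise ValueError("buffer length is not a multiple of the sample width")
    return [int.from_bytes(buf[i:i + width], byteorder)
            for i in range(0, len(buf), width)]


# Made-up example: 1000 grid points packed with 12 bits per value
print(expected_output_size(1000, 12))
print(bytes_to_ints(bytes([0, 1, 1, 0]), width=2))
```

The point of the sketch is that the decoder cannot size its output from the compressed section alone; the point count has to come from the grid definition.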

Hopefully that all makes sense. I am happy to create a fork and add my changes there for you to review, though I need to clean the code up now that it is working and add a few more unit tests. Let me know what you would like to do. I will also discuss with the libaec project whether my wrapper can be included there.

@nmangue
Owner

nmangue commented Aug 4, 2024

I was able to get a working version that you can check on the branch feature/template-5.42. I have created a new package Libaec that handles the interop, provides a managed API, and supports cross-platform functionality. This implementation does not require changes to the original libaec library.

I used the latest eccodes version to build the expected test results. The values read are almost equal, with a tolerance of 10^-4.
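A comparison of this kind can be sketched in Python as below. This is not the NGrib test code; the helper name is hypothetical, and it assumes the tolerance is absolute rather than relative, which the comment above does not specify.

```python
import math


def all_close(actual, expected, tol=1e-4):
    """True if both sequences have equal length and differ by at most `tol`
    element-wise (absolute tolerance; an assumption, see lead-in)."""
    if len(actual) != len(expected):
        return False
    return all(math.isclose(a, e, rel_tol=0.0, abs_tol=tol)
               for a, e in zip(actual, expected))


# Made-up values standing in for decoded output and the eccodes reference
decoded = [273.15002, 274.00001, 275.5]
reference = [273.15, 274.0, 275.5]
print(all_close(decoded, reference))
```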

@simonkingrwe

simonkingrwe commented Aug 5, 2024

Thank you, that looks like it will be perfect for what we (and others) will need for the ECMWF Open Data feeds. Given your approach to the interop, do you think the libaec build included in your Libaec wrapper package will support code running in a Docker Linux container? If so, when do you think you will be able to update the NGrib NuGet package?

@nmangue
Owner

nmangue commented Aug 5, 2024

Yes, it should be OK. I have just published pre-release v0.12.0. Could you test it?

@simonkingrwe

Hi, thank you. I will test it today and report back tomorrow.

@simonkingrwe

Hi, apologies that it has taken a bit longer. I have been able to verify that the new version works on both Windows and Linux on Intel chips. The Linux deployment was in a Docker container. I found an oddity that is related more to ECMWF than to NGrib: for precipitation (parameter tp) you normally do not get a dataset for hour 0, but for some reason the Open Data records include a precipitation dataset that contains no data, just the metadata (see below). I have added a check when processing the index that looks at the length of the dataset and ignores it if the length is not within expectations.

image
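A length check of the kind described above might look like the following Python sketch. It is purely illustrative: it assumes the index is a set of JSON lines carrying `param`, `step`, and `_length` fields, and the `min_length` threshold and sample records are made up. The actual check and its "expected" length are not shown in this thread.

```python
import json


def usable_records(index_lines, min_length):
    """Keep index entries whose message length is at least `min_length` bytes.

    Field names and threshold are hypothetical (see lead-in).
    """
    kept = []
    for line in index_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        if record.get("_length", 0) >= min_length:
            kept.append(record)
    return kept


# Made-up index: a metadata-only tp record at step 0 and a normal one at step 3
sample_index = [
    '{"param": "tp", "step": "0", "_offset": 0, "_length": 180}',
    '{"param": "tp", "step": "3", "_offset": 180, "_length": 612345}',
]
records = usable_records(sample_index, min_length=1000)
print([(r["param"], r["step"]) for r in records])
```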

@Dadoof
Author

Dadoof commented Aug 9, 2024

Success I'd say - I consider this ticket now closed/resolved. Thanks all!
