pdbufr does not parse data correctly. #25
Comments
Hi @meteoDaniel, Many thanks for this report! I hope I have good news for you :) First, it seems that you found an interesting behaviour in pdbufr - it expects 'proper' tuples as input, and you omit the trailing comma in your filter. The second point is that we have quite a large code refactor waiting to be released - if you're in a position to install pdbufr from git, I encourage you to do so and use the latest master branch. We used data very similar to yours to develop and test it with, so we expect it to work with this version. When I do this, I get sensible results from your filter (I hand-checked a few using Metview's BUFR examiner, otherwise known as CodesUI in its standalone form). I'd be interested to know if these tips allow you to get what you need from pdbufr. Cheers,
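For readers unfamiliar with the trailing-comma point above: in plain Python, parentheses around a single value do not create a tuple, only the comma does. A minimal illustration follows; the value 2.0 is just an example and is not taken from the original filter.

```python
# Parentheses alone only group an expression; the trailing comma makes a tuple.
not_a_tuple = (2.0)   # this is simply the float 2.0
one_tuple = (2.0,)    # this is a tuple containing one float

print(type(not_a_tuple).__name__)
print(type(one_tuple).__name__)
```

So a filter value meant as a one-element tuple needs to be written with the trailing comma.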
@iainrussell thanks for this update. For example, filtering for the 2m temperature works now. I want to give you an update on my investigations:
And then a value (here
I think the point is that this strategy only works if I can match each measure to the correct And I have tested other
Hi @meteoDaniel, Glad the fix is working! For timePeriod, I think you may just have a typo? There is no timePeriod of 10.0 in the data (I believe), but there is -10, and putting that in the filter instead of 10 works for me. I can also query all the unique values of timePeriod like this:
and I get this:
Metview also agrees that these are all the values in the file. For your larger question, I think I'm getting a bit lost in terms of what you want. I can indeed see that this is a complicated BUFR file, so it would be good to be able to handle it properly. Can you describe what information you'd like to retrieve from it please? Many thanks!
Thanks a lot for your support. I will investigate pdbufr further later on.
I know dealing with BUFR is a mess, and my first impression of pdbufr is good. But it is not able to deal with one of the main problems of observation data encoded in BUFR. So my intention is to work on this issue together with you, to give the world a BUFR reader that is as good as cfgrib is.
Take a look at the data I attached and you will find that airTemperature is defined multiple times within one subset. (From my experience the reports are separated into subsets.) So I thought I could use pdbufr's filter to access the right temperature. This results in an empty DataFrame. It is also necessary to filter for timePeriod, but that does not work either.
In my own implementation I use bufr_dump and parse the output first as a bytes object and afterwards as JSON, and dump it into a DataFrame. Then I loop through the lines and store each timePeriod and heightOfSensor value to map them to the measures. The rule is that the latest sensor information and/or time period information is valid for the value. I guess this behaviour should be implemented behind the filter function too.
Why do I name this issue "pdbufr does not parse data correctly"? It is not clear what kind of airTemperature is parsed (2m or 0.05m), but from my point of view it is mandatory to know this information to parse the data correctly.
Another point: during my investigation of eccodes+python and bufr_dump I have found out that bufr_dump is much compared to the use of the eccodes python interface (or what is suggested in the documentation eccodes doc ).
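The mapping rule described above (the most recent heightOfSensor and timePeriod apply to the values that follow them) can be sketched roughly as below. This is only an illustration under assumptions: the function name attach_context, the flat list of (key, value) pairs and the sample numbers are all hypothetical, "heightOfSensor" is the shorthand used in this issue rather than a confirmed ecCodes key name, and this is not how pdbufr itself is implemented.

```python
from typing import Any, Dict, Iterable, List, Tuple

# Keys whose most recent value describes the measurements that follow them.
# "heightOfSensor" is the shorthand from this issue; the real key may differ.
CONTEXT_KEYS = ("heightOfSensor", "timePeriod")


def attach_context(
    items: Iterable[Tuple[str, Any]],
    measure_keys: Tuple[str, ...] = ("airTemperature",),
    context_keys: Tuple[str, ...] = CONTEXT_KEYS,
) -> List[Dict[str, Any]]:
    """Walk the items in order and tag each measure with the latest context."""
    context: Dict[str, Any] = {key: None for key in context_keys}
    rows: List[Dict[str, Any]] = []
    for key, value in items:
        if key in context:
            # Remember the latest sensor height / time period seen so far.
            context[key] = value
        elif key in measure_keys:
            # Copy the current context onto the measurement.
            rows.append({"key": key, "value": value, **context})
    return rows


# Made-up subset: the same key appears twice with different sensor heights.
subset = [
    ("heightOfSensor", 2.0),
    ("airTemperature", 275.1),
    ("heightOfSensor", 0.05),
    ("timePeriod", -10),
    ("airTemperature", 273.4),
]

rows = attach_context(subset)
for row in rows:
    print(row)
```

With rows tagged like this, selecting the 2m temperature becomes a plain comparison on the heightOfSensor field. The sketch does not handle any resetting of context within a subset, which a real decoder would likely need to consider.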
@alexamici
Z__C_EDZW_20210214100000_bda01.synop_bufr_GER_999999_999999__MW_536.zip