
Level2File() cannot open NEXRAD Level II chunk files available on AWS S3 #1470

Closed
timsliwin opened this issue Aug 19, 2020 · 2 comments · Fixed by #1476
Labels
Area: IO (pertains to reading data); Type: Enhancement (enhancement to existing functionality)
Milestone: 1.0

Comments

timsliwin commented Aug 19, 2020

As part of the NOAA Big Data Project, NEXRAD Level II data is available on Amazon's AWS S3 service. There are two buckets with NEXRAD data. The main one, 'noaa-nexrad-level2', contains full volume files. A second bucket, 'unidata-nexrad-level2-chunks', contains near-real-time radar data that has not yet been assembled into full volumes.

MetPy has functionality to read full Level II files with metpy.io.Level2File; however, when this same class is applied to the chunk data, Level2File does not work.

According to Appendix B of the NEXRAD Interface Control Document (https://www.roc.noaa.gov/wsr88d/PublicDocs/ICDs/2620010H.pdf), chunk filenames end with an indicator of S, I, or E: S marks the Start, I an Intermediate piece, and E the End of the complete volume. Files ending in S open normally with Level2File, but both I and E files fail with a ValueError saying the year is not within the expected range. Since the S file is the first chunk of the volume sequence, it presumably carries the Level II volume header that the other chunks lack, which is what allows Level2File to initialize properly.
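A quick way to check this (a hedged sketch, not part of the original report): a Level II volume header begins with the ASCII tag 'AR2V' (e.g. 'AR2V0006.'), so peeking at the first bytes of each chunk shows which ones actually carry a header. The helper name has_volume_header is hypothetical.

```python
def has_volume_header(first_bytes: bytes) -> bool:
    """Return True if the data appears to begin with a NEXRAD Level II
    volume header, which starts with the ASCII tag 'AR2V'."""
    return first_bytes[:4] == b'AR2V'


# 'S' chunks should start with the header; 'I' and 'E' chunks
# begin directly with message data and would return False.
print(has_volume_header(b'AR2V0006.905'))                   # True
print(has_volume_header(b'\x00\x00\x00\x00message data'))   # False
```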

Here is example code showing the problem:

-----Code-----

import boto3
import botocore
from botocore.client import Config

from metpy.io import Level2File

# Anonymous (unsigned) access to the public NEXRAD chunks bucket
s3 = boto3.resource('s3', config=Config(signature_version=botocore.UNSIGNED,
                                        user_agent_extra='Resource'))
bucket = s3.Bucket('unidata-nexrad-level2-chunks')

# Grab one 'S' (start) chunk and one 'I'/'E' chunk for the KLBB radar
for obj in bucket.objects.filter(Prefix='KLBB'):
    if obj.key[-1] == 'S':
        datas = obj
    else:
        dataei = obj

print(datas.key)
Level2File(datas.get()['Body'])   # works: 'S' chunk carries the volume header
print(dataei.key)
Level2File(dataei.get()['Body'])  # fails: 'I'/'E' chunks lack the header

-----Output-----

KLBB/902/20200819-033259-001-S
<metpy.io.nexrad.Level2File object at 0x7f7e69d35910>
KLBB/901/20200819-032322-043-E
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/centos/dev/radar_venv/lib/python3.7/site-packages/metpy/io/nexrad.py", line 179, in __init__
    self._read_volume_header()
  File "/home/centos/dev/radar_venv/lib/python3.7/site-packages/metpy/io/nexrad.py", line 196, in _read_volume_header
    self.dt = nexrad_to_datetime(self.vol_hdr.date, self.vol_hdr.time_ms)
  File "/home/centos/dev/radar_venv/lib/python3.7/site-packages/metpy/io/nexrad.py", line 83, in nexrad_to_datetime
    return datetime.datetime.utcfromtimestamp((julian_date - 1) * day + ms_midnight * milli)
ValueError: year 3830650 is out of range

Desired behavior: be able to read the data from the individual chunks in the 'unidata-nexrad-level2-chunks' S3 bucket as soon as they become available, so that near-real-time data can be used rather than waiting (up to 20-30 minutes) for full volumes to be assembled and appear in the 'noaa-nexrad-level2' bucket.
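Until chunk support exists, one possible workaround (a hedged sketch, under the assumption that a full volume is simply the byte-concatenation of its S, I, and E chunks in sequence-number order) is to assemble a volume buffer before parsing. The assemble_volume and chunk_sort_key helpers are hypothetical; the key layout is taken from the example keys above (e.g. 'KLBB/902/20200819-033259-001-S').

```python
import io


def chunk_sort_key(key: str) -> int:
    """Extract the sequence number from a chunk key such as
    'KLBB/902/20200819-033259-001-S' (layout as in the keys above)."""
    return int(key.rsplit('-', 2)[1])


def assemble_volume(chunks: dict) -> io.BytesIO:
    """Concatenate chunk payloads (key -> bytes) in sequence order,
    assuming a volume is the byte-concatenation of its chunks."""
    ordered = sorted(chunks, key=chunk_sort_key)
    return io.BytesIO(b''.join(chunks[k] for k in ordered))


# Synthetic stand-in payloads; real bytes would come from bucket.objects
chunks = {
    'KLBB/902/20200819-033259-002-I': b'<chunk 2>',
    'KLBB/902/20200819-033259-001-S': b'<chunk 1>',
    'KLBB/902/20200819-033259-003-E': b'<chunk 3>',
}
buf = assemble_volume(chunks)
print(buf.read())  # b'<chunk 1><chunk 2><chunk 3>'
```

The resulting BytesIO could then be handed to Level2File in place of a single object's body, at the cost of waiting for the E chunk.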

@timsliwin added the 'Type: Bug' label Aug 19, 2020
@dopplershift added the 'Area: IO' and 'Type: Enhancement' labels and removed the 'Type: Bug' label Aug 19, 2020
dopplershift (Member) commented:

Thanks for the sample code; that cut my testing time in half. Here's the fix that will end up, in some form, in Level2File.__init__():

        try:
            self._read_volume_header()
        except ValueError:
            self._buffer._offset = 0

(My real version is a bit nicer, but that requires changing more lines than are easy to communicate here...)

So if you can bear to edit your MetPy source, you can have this work today. I was already working on a much larger update for our NEXRAD code, and this will go in with it. Expect it to appear with the 1.0 release.
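The pattern in the snippet above can be illustrated with a toy parser (a sketch only, not MetPy's actual code): attempt to parse a fixed-size header, and on a parse failure rewind the buffer so reading continues as if the header were simply absent. The toy format here (a 'HDR!' magic plus record count, followed by uint16 records) is invented for illustration.

```python
import io
import struct


def read_records(stream):
    """Parse an optional 8-byte header ('HDR!' + uint32 count) followed
    by big-endian uint16 records; if the header is missing or invalid,
    rewind and parse records only. Toy format, not real Level II."""
    start = stream.tell()
    try:
        magic, count = struct.unpack('>4sI', stream.read(8))
        if magic != b'HDR!':
            raise ValueError('bad header magic')
    except (ValueError, struct.error):
        stream.seek(start)  # header absent: rewind, as in the fix above

    data = stream.read()
    return list(struct.unpack('>%dH' % (len(data) // 2), data))


with_header = io.BytesIO(struct.pack('>4sI3H', b'HDR!', 3, 1, 2, 3))
without_header = io.BytesIO(struct.pack('>3H', 1, 2, 3))
print(read_records(with_header))     # [1, 2, 3]
print(read_records(without_header))  # [1, 2, 3]
```

The same records come back either way, which is exactly the behavior wanted for the headerless 'I' and 'E' chunks.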

@dopplershift dopplershift added this to the 1.0 milestone Aug 19, 2020
timsliwin (Author) commented:

Made the source code change, and the fix works like a charm. I'm now able to plow through the real-time level 2 chunk files with ease. Thank you!

dopplershift added a commit to dopplershift/MetPy that referenced this issue Aug 24, 2020
This, for instance, makes it possible to use Level2File with the
realtime chunks available on AWS.
@dopplershift mentioned this issue Aug 24, 2020
dopplershift added further commits referencing this issue (Aug 26 and Oct 2, 2020) with the same message.