Creating scene with SEVIRI HRIT reader fails with UnicodeDecodeError #2073

Closed
gerritholl opened this issue Mar 30, 2022 · 9 comments · Fixed by #2077

Comments

@gerritholl
Collaborator

Describe the bug

Reading SEVIRI HRIT files fails with UnicodeDecodeError upon scene creation.

To Reproduce

import os
from glob import glob
from satpy import Scene
from satpy.utils import debug_on; debug_on()
seviri_files = glob("/media/nas/x21308/scratch/SEVIRI/202103300900/H-000*")
sc = Scene(filenames=seviri_files, reader=["seviri_l1b_hrit"])

Expected behavior

Success.

Actual results

[DEBUG: 2022-03-30 11:36:33 : satpy.readers.yaml_reader] Reading ('/data/gholl/checkouts/satpy/satpy/etc/readers/seviri_l1b_hrit.yaml',)
/data/gholl/checkouts/satpy/satpy/readers/seviri_base.py:453: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  ('GsicsCalMode', np.bool),
/data/gholl/checkouts/satpy/satpy/readers/seviri_base.py:454: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  ('GsicsCalValidity', np.bool),
[DEBUG: 2022-03-30 11:36:33 : satpy.readers.yaml_reader] Assigning to seviri_l1b_hrit: ['/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-HRV______-000013___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-HRV______-000009___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-HRV______-000014___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-HRV______-000023___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-HRV______-000005___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-HRV______-000003___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-HRV______-000017___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-HRV______-000007___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-HRV______-000002___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-HRV______-000018___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-HRV______-000015___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-HRV______-000020___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-HRV______-000011___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-HRV______-000010___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-HRV______-000021___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-HRV______-000004___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-HRV______-000001___-202103300900-__', 
'/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-HRV______-000006___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-HRV______-000019___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-HRV______-000008___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-HRV______-000012___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-HRV______-000022___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-HRV______-000024___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-HRV______-000016___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_016___-000004___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_016___-000006___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_016___-000003___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_016___-000007___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_016___-000005___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_016___-000001___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_016___-000008___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_016___-000002___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_039___-000001___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_039___-000008___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_039___-000002___-202103300900-__', 
'/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_039___-000006___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_039___-000004___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_039___-000007___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_039___-000003___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_039___-000005___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_087___-000008___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_087___-000006___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_087___-000005___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_087___-000004___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_087___-000007___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_087___-000003___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_087___-000001___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_087___-000002___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_097___-000005___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_097___-000006___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_097___-000002___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_097___-000008___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_097___-000007___-202103300900-__', 
'/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_097___-000003___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_097___-000004___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_097___-000001___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_108___-000002___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_108___-000004___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_108___-000001___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_108___-000005___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_108___-000006___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_108___-000007___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_108___-000008___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_108___-000003___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_120___-000004___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_120___-000003___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_120___-000006___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_120___-000001___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_120___-000008___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_120___-000007___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_120___-000005___-202103300900-__', 
'/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_120___-000002___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_134___-000007___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_134___-000001___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_134___-000002___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_134___-000006___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_134___-000003___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_134___-000005___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_134___-000008___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-IR_134___-000004___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-VIS006___-000005___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-VIS006___-000001___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-VIS006___-000007___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-VIS006___-000002___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-VIS006___-000008___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-VIS006___-000003___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-VIS006___-000004___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-VIS006___-000006___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-VIS008___-000006___-202103300900-__', 
'/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-VIS008___-000008___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-VIS008___-000004___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-VIS008___-000002___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-VIS008___-000007___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-VIS008___-000005___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-VIS008___-000001___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-VIS008___-000003___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-WV_062___-000008___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-WV_062___-000003___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-WV_062___-000007___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-WV_062___-000001___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-WV_062___-000004___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-WV_062___-000006___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-WV_062___-000005___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-WV_062___-000002___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-WV_073___-000004___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-WV_073___-000007___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-WV_073___-000008___-202103300900-__', 
'/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-WV_073___-000002___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-WV_073___-000006___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-WV_073___-000003___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-WV_073___-000001___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-WV_073___-000005___-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-_________-PRO______-202103300900-__', '/media/nas/x21308/scratch/SEVIRI/202103300900/H-000-MSG4__-MSG4________-_________-EPI______-202103300900-__']
Traceback (most recent call last):
  File "/data/gholl/checkouts/protocode/mwe/seviri-unicode-error.py", line 6, in <module>
    sc = Scene(filenames=seviri_files, reader=["seviri_l1b_hrit"])
  File "/data/gholl/checkouts/satpy/satpy/scene.py", line 106, in __init__
    self._readers = self._create_reader_instances(filenames=filenames,
  File "/data/gholl/checkouts/satpy/satpy/scene.py", line 127, in _create_reader_instances
    return load_readers(filenames=filenames,
  File "/data/gholl/checkouts/satpy/satpy/readers/__init__.py", line 569, in load_readers
    reader_instance.create_filehandlers(
  File "/data/gholl/checkouts/satpy/satpy/readers/yaml_reader.py", line 1156, in create_filehandlers
    created_fhs = super(GEOSegmentYAMLReader, self).create_filehandlers(
  File "/data/gholl/checkouts/satpy/satpy/readers/yaml_reader.py", line 616, in create_filehandlers
    filehandlers = self._new_filehandlers_for_filetype(filetype_info,
  File "/data/gholl/checkouts/satpy/satpy/readers/yaml_reader.py", line 604, in _new_filehandlers_for_filetype
    return list(filtered_iter)
  File "/data/gholl/checkouts/satpy/satpy/readers/yaml_reader.py", line 572, in filter_fh_by_metadata
    for filehandler in filehandlers:
  File "/data/gholl/checkouts/satpy/satpy/readers/yaml_reader.py", line 513, in _new_filehandler_instances
    yield filetype_cls(filename, filename_info, filetype_info, *req_fh, **fh_kwargs)
  File "/data/gholl/checkouts/satpy/satpy/readers/seviri_l1b_hrit.py", line 228, in __init__
    super(HRITMSGPrologueFileHandler, self).__init__(filename, filename_info,
  File "/data/gholl/checkouts/satpy/satpy/readers/seviri_l1b_hrit.py", line 207, in __init__
    super(HRITMSGPrologueEpilogueBase, self).__init__(filename, filename_info, filetype_info, hdr_info)
  File "/data/gholl/checkouts/satpy/satpy/readers/hrit_base.py", line 161, in __init__
    self._get_hd(hdr_info)
  File "/data/gholl/checkouts/satpy/satpy/readers/hrit_base.py", line 182, in _get_hd
    hdr_id = np.frombuffer(fp.read(common_hdr.itemsize), dtype=common_hdr, count=1)[0]
  File "/data/gholl/mambaforge/envs/py310/lib/python3.10/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 3: invalid start byte

Environment Info:

  • OS: openSUSE 15.3
  • Satpy Version: fails with latest main

Additional context

Fails with Satpy main. Succeeds with Satpy 0.35.0. git bisect suggests this bug was introduced in 78ef550.

@mraspaud
Member

@pdebuyl do you have an idea what is happening here?

@gerritholl
Collaborator Author

It looks like a file is getting opened in text mode instead of binary mode.
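A minimal sketch of that failure mode, using only the built-in `open` and a temporary file. The four payload bytes are hypothetical and chosen only so that byte 0x80 sits at offset 3, as in the traceback; they are not taken from a real HRIT header.

```python
import os
import tempfile

# Hypothetical bytes, picked so that 0x80 is at offset 3 (not real HRIT data).
payload = b"\x00\x00\x10\x80"

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    path = tmp.name

try:
    # Binary mode hands back the raw bytes unchanged.
    with open(path, mode="rb") as fp:
        raw = fp.read(len(payload))

    # Text mode tries to decode the bytes as UTF-8 and fails on 0x80.
    try:
        with open(path, mode="r", encoding="utf-8") as fp:
            fp.read(len(payload))
        text_error = None
    except UnicodeDecodeError as err:
        text_error = err
finally:
    os.remove(path)

print(raw)
print(text_error)
```

The encoding is spelled out here only to make the sketch deterministic; without it, text mode falls back to the platform's default encoding.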

@pnuu
Member

pnuu commented Mar 30, 2022

Looks like the indicated PR assumed that the mode kwarg was always passed explicitly. The seviri_l1b_hrit reader doesn't define it, so mode doesn't exist. Adding the kwarg to seviri_l1b_hrit.py and hrit_base.py fixes the reader.

@pdebuyl
Contributor

pdebuyl commented Mar 30, 2022

I started with mode='rb' set in the context manager. I did test that it would work, but I might not have found all the cases with actual data (the mocks wouldn't cover them). I can fix it tomorrow.

@pdebuyl
Contributor

pdebuyl commented Mar 30, 2022

Two solutions:

  1. Add the default mode to the constructor of generic_open:
    def __init__(self, filename, *args, mode='rb', **kwargs):
        """Keep filename and mode."""
        self.filename = filename
        self.open_args = args
        self.open_kwargs = kwargs | {'mode': mode}
  2. Add the mode explicitly to the calls to generic_open in hrit_base.py and in seviri_l1b_hrit.py.

I don't know if you have a preference between those two solutions.
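To make the difference between the two options concrete, here is a rough sketch. `OpenSketch` and `BinaryDefaultOpenSketch` are hypothetical stand-ins written for this example; they are not Satpy's generic_open and only assume that the wrapper forwards its arguments to the built-in `open`.

```python
import os
import tempfile


class OpenSketch:
    """Hypothetical wrapper that forwards its arguments to open() unchanged."""

    def __init__(self, filename, *args, **kwargs):
        self.filename = filename
        self.open_args = args
        self.open_kwargs = kwargs

    def __enter__(self):
        self.fp = open(self.filename, *self.open_args, **self.open_kwargs)
        return self.fp

    def __exit__(self, exc_type, exc_value, traceback):
        self.fp.close()


class BinaryDefaultOpenSketch(OpenSketch):
    """Option 1: the wrapper itself defaults to binary mode."""

    def __init__(self, filename, *args, mode="rb", **kwargs):
        super().__init__(filename, *args, mode=mode, **kwargs)


with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x00\x00\x10\x80")  # hypothetical binary content
    path = tmp.name

try:
    # Option 1: no mode given by the caller, the wrapper supplies "rb".
    with BinaryDefaultOpenSketch(path) as fp:
        via_default = fp.read()

    # Option 2: the caller states the mode; the wrapper keeps open()'s defaults.
    with OpenSketch(path, mode="rb") as fp:
        via_explicit = fp.read()

    # Without an explicit mode, the plain wrapper opens in text mode.
    with OpenSketch(path) as fp:
        plain_mode = fp.mode
finally:
    os.remove(path)

print(via_default == via_explicit, plain_mode)
```

One caveat of the first variant in this sketch: a caller who passes the mode positionally would hand `open` the mode twice, which raises a TypeError.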

@pdebuyl
Contributor

pdebuyl commented Mar 31, 2022

I went with "explicit" to avoid bypassing the defaults of the regular open function.

@djhoese
Member

djhoese commented Mar 31, 2022

self.open_kwargs = kwargs | {'mode': mode}

The | handling for dicts was only recently added, right? So maybe kwargs.setdefault("mode", mode) and then self.open_kwargs = kwargs?
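For reference, the `|` operator for dicts arrived in Python 3.9. The two spellings can be compared in a small sketch; the helper names below are made up for this illustration and do not exist in Satpy.

```python
def merge_with_union(kwargs, mode="rb"):
    """Needs Python 3.9+; the right-hand operand wins on duplicate keys."""
    return kwargs | {"mode": mode}


def merge_with_setdefault(kwargs, mode="rb"):
    """Works on older Python versions; keeps an existing 'mode' entry."""
    merged = dict(kwargs)  # copy so the caller's dict is not mutated
    merged.setdefault("mode", mode)
    return merged


# Same result when the caller did not put "mode" into kwargs.
print(merge_with_union({"buffering": 0}))
print(merge_with_setdefault({"buffering": 0}))

# Different result when "mode" is already present in kwargs.
print(merge_with_union({"mode": "r"}))
print(merge_with_setdefault({"mode": "r"}))
```

In the constructor proposed above, mode is a separate keyword parameter, so kwargs would never contain a "mode" key and both spellings would give the same dict there.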

@pdebuyl
Contributor

pdebuyl commented Mar 31, 2022

Hi @djhoese, I submitted a PR with an explicit mode=... in the hrit readers instead. The kwargs | {'mode': mode} proposal made rb the default, which is not the standard behavior of open, so I preferred not to use it.

@djhoese
Member

djhoese commented Apr 1, 2022

which is not the standard behavior of open,

Sounds good. Thanks.
