Add a reader for FY-4A LMI level 2 data #1103

Draft · wants to merge 5 commits into base: main
27 changes: 27 additions & 0 deletions satpy/etc/readers/lmi_l2.yaml
@@ -0,0 +1,27 @@
# References:
#   - L2 Data of FY4A Lightning Mapping Imager
#   - http://fy4.nsmc.org.cn/data/en/data/realtime.html

reader:
  name: lmi_l2
  description: Lightning Mapping Imager Dataset reader
  reader: !!python/name:satpy.readers.yaml_reader.FileYAMLReader
  sensors: [lmi]

file_types:
  lmi:
    file_reader: !!python/name:satpy.readers.lmi_l2.LMIL2FileHandler
    file_patterns: ['{platform_id:4s}-_{instrument:3s}---_N_{observation_type:s}_{longitude:5s}_L2-_LMI{product:s}_SING_NUL_{start_time:%Y%m%d%H%M%S}_{end_time:%Y%m%d%H%M%S}_{resolution:s}_N{subtask_num:s}V{version:s}.NC']

datasets:
  LAT:
    name: LAT
    file_type: lmi
    standard_name: latitude
    units: degree_north

  LON:
    name: LON
    file_type: lmi
    standard_name: longitude
    units: degree_east
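The file_patterns entry encodes the observation start and end times directly in the filename. As a quick sanity check (the sample filename below is made up to match the pattern, not a real product name), the timestamp fields can be pulled out with plain string handling:

```python
from datetime import datetime

# Hypothetical LMI L2 filename shaped like the pattern above (illustrative values):
fname = ("FY4A-_LMI---_N_REGX_1047E_L2-_LMIE_SING_NUL_"
         "20190725052510_20190725052515_7000M_N0001V1.NC")

# Splitting on underscores, the start and end timestamps are the 4th- and
# 3rd-from-last fields:
parts = [p for p in fname.split("_") if p]
start_time = datetime.strptime(parts[-4], "%Y%m%d%H%M%S")
end_time = datetime.strptime(parts[-3], "%Y%m%d%H%M%S")
```

In the reader itself this parsing is done by satpy's YAML machinery, which populates `filename_info` from the pattern fields.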
154 changes: 154 additions & 0 deletions satpy/readers/lmi_l2.py
@@ -0,0 +1,154 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Copyright (c) 2019 Satpy developers
#
# This file is part of satpy.
#
# satpy is free software: you can redistribute it and/or modify it under the
# terms of the GNU General Public License as published by the Free Software
# Foundation, either version 3 of the License, or (at your option) any later
# version.
#
# satpy is distributed in the hope that it will be useful, but WITHOUT ANY
# WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR
# A PARTICULAR PURPOSE. See the GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License along with
# satpy. If not, see <http://www.gnu.org/licenses/>.
"""Lightning Mapping Imager Dataset reader.

The files read by this reader are described in:

http://fy4.nsmc.org.cn/data/en/data/realtime.html

"""

import logging

import numpy as np

from satpy.readers.netcdf_utils import NetCDF4FileHandler, netCDF4

logger = logging.getLogger(__name__)


class LMIL2FileHandler(NetCDF4FileHandler):
    """File handler for LMI L2 netCDF files."""

    @property
    def start_time(self):
        """Start time parsed from the filename."""
        return self.filename_info['start_time']

    @property
    def end_time(self):
        """End time parsed from the filename."""
        return self.filename_info['end_time']

    @property
    def platform_shortname(self):
        """Short platform name parsed from the filename (e.g. FY4A)."""
        return self.filename_info['platform_id']

    @property
    def sensor(self):
        """Lowercase sensor name parsed from the filename."""
        return self.filename_info['instrument'].lower()

    def available_datasets(self, configured_datasets=None):
        """Automatically determine datasets provided by this file."""
        logger.debug("Available_datasets begin...")
        handled_variables = set()
        lat_shape = None

        # update previously configured datasets
        logger.debug("Starting previously configured variables loop...")
        for is_avail, ds_info in (configured_datasets or []):
            if ds_info['name'] == 'LAT':
                lat_shape = self[ds_info['name'] + '/shape']

            # some other file handler knows how to load this
            if is_avail is not None:
                yield is_avail, ds_info

            var_name = ds_info.get('file_key', ds_info['name'])
            matches = self.file_type_matches(ds_info['file_type'])
            # we can confidently say that we can provide this dataset and can
            # provide more info
            if matches and var_name in self:
                logger.debug("Handling previously configured variable: %s",
                             var_name)
                # Because assembled variables and bounds use the same file_key,
                # we need to omit file_key once.
                handled_variables.add(var_name)
                new_info = ds_info.copy()  # don't mess up the above yielded
                yield True, new_info
            elif is_avail is None:
                # if we didn't know how to handle this dataset
                # and no one else did, then we should keep it
                # going down the chain
                yield is_avail, ds_info

        # Iterate over dataset contents
        for var_name, val in self.file_content.items():
            # Only evaluate variables
            if isinstance(val, netCDF4.Variable):
                logger.debug("Evaluating new variable: %s", var_name)
                var_shape = self[var_name + "/shape"]
                logger.debug("Dims: %s", var_shape)
                if var_shape == lat_shape:
                    logger.debug("Found valid additional dataset: %s",
                                 var_name)
                    # Skip anything we have already configured
                    if var_name in handled_variables:
                        logger.debug("Already handled, skipping: %s", var_name)
                        continue
                    handled_variables.add(var_name)
                    logger.debug("Using short name of: %s", var_name)
                    # Create new ds_info object
                    new_info = {
                        'name': var_name,
                        'file_key': var_name,
                        'coordinates': ['LON', 'LAT'],
                        'file_type': self.filetype_info['file_type'],
                        'resolution': self.filename_info['resolution'].lower(),
                    }
                    yield True, new_info
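The generator above follows satpy's dynamic-dataset protocol: previously configured datasets are passed through first, then any file variable with the same shape as LAT is offered as a new dataset, with a set guarding against duplicates. A stripped-down sketch of just that dedup flow, using toy dicts instead of the real satpy objects:

```python
# Toy model (not satpy itself) of the dedup logic in available_datasets:
# configured datasets are yielded first, then names discovered in the file,
# skipping anything already handled.
def available(configured, file_vars):
    handled = set()
    for is_avail, info in configured:
        handled.add(info['name'])
        yield True, info
    for name in file_vars:
        if name in handled:
            continue  # already configured in YAML
        handled.add(name)
        yield True, {'name': name, 'file_key': name}

configured = [(None, {'name': 'LAT'}), (None, {'name': 'LON'})]
found = list(available(configured, ['LAT', 'LON', 'ER', 'EVENT_COUNT']))
names = [info['name'] for _, info in found]
```

The real method additionally filters on file-type matches and variable shape; the point here is only that each name is yielded exactly once.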

    def get_metadata(self, data, ds_info):
        """Get metadata merged from the file, the YAML config and the filename."""
        metadata = {}
        metadata.update(data.attrs)
        units = data.attrs['units']
        # fix the wrong unit "uJ/m*m/ster": replace the garbled
        # character with a proper micro sign (as a str, not bytes)
        if not units.isascii():
            metadata['units'] = '\u00b5J/m*m/ster'
        metadata.update(ds_info)
        metadata.update({
            'platform_shortname': self.filename_info['platform_id'],
            'sensor': self.filename_info['instrument'].lower(),
            'start_time': self.start_time,
            'end_time': self.end_time,
        })

        return metadata

    def get_dataset(self, ds_id, ds_info):
        """Load a dataset."""
        logger.debug("Getting data for: %s", ds_id.name)
        file_key = ds_info.get('file_key', ds_id.name)
        data = self[file_key]
        data.attrs = self.get_metadata(data, ds_info)
        # rename coords
        data = data.rename({'x': 'y'})
        # assign 'y' coords, which is useful for MultiScene,
        # although the units aren't meters
        len_data = data.coords['y'].shape[0]
        data.coords['y'] = np.arange(len_data)
Member:
Why do you need to assign these? I don't think you need y coordinates here. Maybe I'm wrong and naming it y is a bad idea, but I think that is consistent with other readers and the rest of Satpy. The dimension should be y, but no coords for it. My thought was instead that you add a data.coords['time'] = [... series of times ...] then the MultiScene timeseries function could be updated to look at that coordinate. So time would be the only coordinate you add in the reader, the base reader will then add lon and lat for you.

Member Author:
@djhoese

Got it! Here's the method:

        tseries = pd.DatetimeIndex(np.repeat(data.attrs['start_time'], data.shape[0]))
        data = data.expand_dims({'time': tseries})
        data = data.rename({'x': 'y'})

Example:

<xarray.DataArray 'ER' (time: 31, y: 31)>
dask.array<where, shape=(31, 31), dtype=float32, chunksize=(31, 31), chunktype=numpy.ndarray>
Coordinates:
  * time     (time) datetime64[ns] 2019-07-25T05:25:10 ... 2019-07-25T05:25:10
Dimensions without coordinates: y

But I get this error when using MultiScene with mscn = MultiScene.from_files(filenames, reader='lmi_l2', group_keys=('start_time', 'subtask_num')) and imgs = mscn.blend(blend_function=timeseries):

ValueError: arguments without labels along dimension 'y' cannot be aligned because they have different dimension sizes: {8, 74, 44, 45, 86, 55, 24, 60, 93, 31}

The modification of MultiScene is mentioned in #1173.
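The time-coordinate idea from this thread can be tried in isolation. A minimal sketch with toy data (not real LMI events), assuming only numpy, pandas and xarray are available:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Toy 1-D "event" array standing in for an LMI product variable.
data = xr.DataArray(np.arange(3.0), dims=('x',))

# Repeat the granule start_time once per event and attach it as a new
# 'time' dimension, as suggested in the review comment above.
start = np.datetime64('2019-07-25T05:25:10')
tseries = pd.DatetimeIndex(np.repeat(start, data.shape[0]))
data = data.expand_dims({'time': tseries})
data = data.rename({'x': 'y'})
```

After this, the array has dims ('time', 'y') with a time coordinate, matching the shape of the ER example shown earlier in the thread; the alignment error arises later because different granules have different numbers of events along 'y'.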

        # check fill value
        fill = data.attrs.pop('FillValue')
        data = data.where(data != fill)
        # remove attributes that could be confusing later
        data.attrs.pop('ancillary_variables', None)
        data.attrs.pop('Description', None)
        # select valid data
        data = data.where((data >= min(data.attrs['valid_range'])) &
                          (data <= max(data.attrs['valid_range'])))

        return data
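The fill-value and valid_range masking at the end of get_dataset can be reproduced standalone. A sketch with made-up attributes (the FillValue and valid_range values here are illustrative, not taken from a real LMI file):

```python
import numpy as np
import xarray as xr

# Toy array: one fill value, two valid samples, one out-of-range sample.
data = xr.DataArray(
    np.array([9999.0, 1.0, 5.0, 300.0]),
    attrs={'FillValue': 9999.0, 'valid_range': [0.0, 200.0]},
)

# Same two-step masking as in the reader: first mask the fill value,
# then mask anything outside valid_range. where() replaces masked
# entries with NaN.
fill = data.attrs.pop('FillValue')
data = data.where(data != fill)
data = data.where((data >= min(data.attrs['valid_range'])) &
                  (data <= max(data.attrs['valid_range'])))
```

Both masked samples end up as NaN while the in-range values pass through unchanged.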