Improve MiRS reader handling of missing metadata #1671
Merged

Changes shown from 30 of 38 commits.
Commits (all authored by joleenf):

d01b1e0  Merge branch 'iss1387' of github.com:joleenf/satpy
aa10285  Merge branch 'master' of https://github.com/pytroll/satpy
b2f71df  Merge with upstream development
c164f61  Merge remote-tracking branch 'refs/remotes/origin/master'
f8e1c42  Merge branch 'master' of https://github.com/pytroll/satpy
826122a  Add reader_kwarg to omit limb correction on ATMS sensors.
79218d7  Fix docstring in mirs reader class
1cdbc0c  add check for calling limb correction only when sensor is atms
e783a8c  Fix docstring
fc610c8  Add a kwarg test for limb_correction.
fdba5fe  Remove unused variables and add a noaa-20 test
a66e3bc  Fold reader_kwarg test into basic_load
6132c67  Simplify if/then statement, split parameterize for readability.
f4bdbda  Simplify assertion if/then statement for limb_correction.
b0140d1  Remove extra line getting the name of the sensor
e5038a5  Getting changes to dask_ewa resampling
b9b7a69  Merge remote-tracking branch 'upstream/master'
9f85942  Use valid range when present
d1a0bb5  Check to confirm that valid range is no longer in attributes.
582837d  Add a test to check valid range was applied correctly.
879fce4  valid_range is inclusive so include both min/max in acceptable values
214ad77  Merge branch 'master' of https://github.com/pytroll/satpy
08a82ce  Add attributes when missing/apply attributes from both file and yaml …
ad76084  Fix _FillValue so that it is read as an integer
b5b3f86  Update reading of yaml
0da2c2c  Merge branch 'main' of https://github.com/pytroll/satpy into mirs_met…
bb9df5a  Test units for TEST_VARS
f4010fc  Add descriptions for some of the variables in the yaml.
6988f3b  BUG FIXES: apply attributes before limb correction and fix typos in yaml
c40ee15  Remove the change to the file_patterns, the extra file_type is not ne…
3e367c8  Add BT to yaml and use in creation of ds_info for data_id
18d3356  commit the yaml mentioned in previous commit
2ad0ba3  Take file_key out of BT dataset so that it does not get carried throu…
ebc2eff  Don't add more in yaml than necessary
fa82afa  Make sure btemp information is only initializing with yaml when neces…
6a5c817  Simplify the reading of coefficients so reading old/versus new coeffi…
1b79da9  Take out check for n_chn and n_fov since they are fixed now.
1649241  Simplify logic which repeatedly checked if file_type matched yaml dat…
@@ -309,7 +309,7 @@ def _get_coeff_filenames(self):
         return coeff_fn
 
-    def get_metadata(self, ds_info):
+    def update_metadata(self, ds_info):
         """Get metadata."""
         metadata = {}
         metadata.update(ds_info)
@@ -334,44 +334,70 @@ def _nan_for_dtype(data_arr_dtype):
         return np.nan
 
     @staticmethod
-    def _scale_data(data_arr, attrs):
-        # handle scaling
-        # take special care for integer/category fields
-        scale_factor = attrs.pop('scale_factor', 1.)
-        add_offset = attrs.pop('add_offset', 0.)
+    def _scale_data(data_arr, scale_factor, add_offset):
+        """Scale data, if needed."""
         scaling_needed = not (scale_factor == 1 and add_offset == 0)
         if scaling_needed:
             data_arr = data_arr * scale_factor + add_offset
-        return data_arr, attrs
+        return data_arr
 
-    def _fill_data(self, data_arr, attrs):
-        try:
-            global_attr_fill = self.nc.missing_value
-        except AttributeError:
-            global_attr_fill = None
-        fill_value = attrs.pop('_FillValue', global_attr_fill)
-
-        fill_out = self._nan_for_dtype(data_arr.dtype)
+    def _fill_data(self, data_arr, fill_value, scale_factor, add_offset):
+        """Fill missing data with NaN."""
         if fill_value is not None:
+            fill_value = self._scale_data(fill_value, scale_factor, add_offset)
+            fill_out = self._nan_for_dtype(data_arr.dtype)
             data_arr = data_arr.where(data_arr != fill_value, fill_out)
-        return data_arr, attrs
+        return data_arr
 
-    def _apply_valid_range(self, data_arr, attrs):
-        # handle valid_range
-        valid_range = attrs.pop('valid_range', None)
+    def _apply_valid_range(self, data_arr, valid_range, scale_factor, add_offset):
+        """Get and apply valid_range."""
         if valid_range is not None:
             valid_min, valid_max = valid_range
+            valid_min = self._scale_data(valid_min, scale_factor, add_offset)
+            valid_max = self._scale_data(valid_max, scale_factor, add_offset)
 
             if valid_min is not None and valid_max is not None:
                 data_arr = data_arr.where((data_arr >= valid_min) &
                                           (data_arr <= valid_max))
-        return data_arr, attrs
+        return data_arr
 
+    def apply_attributes(self, data, ds_info):
+        """Combine attributes from file and yaml and apply.
+
+        File attributes should take precedence over yaml if both are present
+
+        """
+        try:
+            global_attr_fill = self.nc.missing_value
+        except AttributeError:
+            global_attr_fill = 1.0
+
+        # let file metadata take precedence over ds_info from yaml,
+        # but if yaml has more to offer, include it here, but fix
+        # units.
+        ds_info.update(data.attrs)
+
+        scale = ds_info.pop('scale_factor', 1.0)
+        offset = ds_info.pop('add_offset', 0.)
+        fill_value = ds_info.pop("_FillValue", global_attr_fill)
+        valid_range = ds_info.pop('valid_range', None)
+
+        units_convert = {"Kelvin": "K"}
+        data_unit = ds_info['units']
+        ds_info['units'] = units_convert.get(data_unit, data_unit)
+
+        data = self._scale_data(data, scale, offset)
+        data = self._fill_data(data, fill_value, scale, offset)
+        data = self._apply_valid_range(data, valid_range, scale, offset)
+
+        return data, ds_info
+
     def get_dataset(self, ds_id, ds_info):
         """Get datasets."""
         if 'dependencies' in ds_info.keys():
             idx = ds_info['channel_index']
             data = self['BT']
+            data, ds_info = self.apply_attributes(data, ds_info)
             data = data.rename(new_name_or_name_dict=ds_info["name"])
 
             if self.sensor.lower() == "atms" and self.limb_correction:
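The refactored helpers apply scaling first, then fill masking, then the valid range, and they now pass `_FillValue` and `valid_range` through the same scale/offset before comparing, since those attributes are stored in raw (unscaled) units. The ordering can be sketched in plain Python (lists of floats stand in for the reader's xarray/dask arrays; all names and values below are illustrative, not satpy's API):

```python
import math

def scale_data(values, scale_factor, add_offset):
    """Apply linear scaling only when it would change the data."""
    if scale_factor == 1 and add_offset == 0:
        return values
    return [v * scale_factor + add_offset for v in values]

def fill_data(values, fill_value, scale_factor, add_offset):
    """Replace the fill value with NaN, scaling the fill value first."""
    if fill_value is None:
        return values
    fill_value = fill_value * scale_factor + add_offset
    return [math.nan if v == fill_value else v for v in values]

def apply_valid_range(values, valid_range, scale_factor, add_offset):
    """Mask values outside the scaled, inclusive valid range."""
    if valid_range is None:
        return values
    lo = valid_range[0] * scale_factor + add_offset
    hi = valid_range[1] * scale_factor + add_offset
    return [v if lo <= v <= hi else math.nan for v in values]

# Raw counts with scale 0.5, offset 100.0, _FillValue -999, valid_range (0, 300)
raw = [-999, 0, 150, 300, 301]
data = scale_data(raw, 0.5, 100.0)
data = fill_data(data, -999, 0.5, 100.0)
data = apply_valid_range(data, (0, 300), 0.5, 100.0)
# data is now [nan, 100.0, 175.0, 250.0, nan]
```

Comparing the raw fill value against already-scaled data would never match, which is why the diff scales `fill_value`, `valid_min`, and `valid_max` before the comparisons.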
@@ -385,19 +411,25 @@ def get_dataset(self, ds_id, ds_info):
             data = data[:, :, idx]
         else:
             data = self[ds_id['name']]
+            data, ds_info = self.apply_attributes(data, ds_info)
 
-        data.attrs = self.get_metadata(ds_info)
+        data.attrs = self.update_metadata(ds_info)
         return data
 
     def _available_if_this_file_type(self, configured_datasets):
+        handled_vars = set()
         for is_avail, ds_info in (configured_datasets or []):
             if is_avail is not None:
                 # some other file handler said it has this dataset
                 # we don't know any more information than the previous
                 # file handler so let's yield early
                 yield is_avail, ds_info
                 continue
+            if self.file_type_matches(ds_info['file_type']):
+                handled_vars.add(ds_info['name'])
             yield self.file_type_matches(ds_info['file_type']), ds_info
+        yield from self._available_new_datasets(handled_vars)
 
(djhoese marked a review conversation on these lines as resolved.)
 
     def _count_channel_repeat_number(self):
         """Count channel/polarization pair repetition."""
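The reworked `_available_if_this_file_type` threads a `handled_vars` set through the availability generator so that variables already claimed by YAML-configured datasets are not advertised a second time by file-based discovery. That pattern can be sketched standalone (dataset names and file types below are made up for illustration):

```python
def available_datasets(configured_datasets, file_vars, my_file_type):
    """Yield (is_available, ds_info) pairs, then any file-only variables."""
    handled_vars = set()
    for is_avail, ds_info in (configured_datasets or []):
        if is_avail is not None:
            # another handler already decided; pass its answer through
            yield is_avail, ds_info
            continue
        matches = ds_info["file_type"] == my_file_type
        if matches:
            handled_vars.add(ds_info["name"])
        yield matches, ds_info
    # advertise file variables not covered by the configuration
    for name in file_vars:
        if name not in handled_vars:
            yield True, {"name": name, "file_type": my_file_type}

configured = [
    (None, {"name": "btemp_88v", "file_type": "mirs_atms"}),
    (True, {"name": "rain_rate", "file_type": "other"}),
]
results = list(available_datasets(configured, ["btemp_88v", "surface_type"], "mirs_atms"))
# "btemp_88v" is claimed by configuration, so only "surface_type" is added as new
```

Without the set, a variable present both in the YAML and in the file would be yielded twice, which is the duplication this part of the diff removes.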
@@ -433,6 +465,7 @@ def _available_btemp_datasets(self):
                 'name': new_name,
                 'description': desc_bt,
                 'units': 'K',
+                'scale_factor': self.nc['BT'].attrs['scale_factor'],
                 'channel_index': idx,
                 'frequency': "{}GHz".format(normal_f),
                 'polarization': normal_p,
@@ -447,20 +480,19 @@ def _get_ds_info_for_data_arr(self, var_name):
             'name': var_name,
             'coordinates': ["longitude", "latitude"]
         }
 
         if var_name in ["longitude", "latitude"]:
             ds_info['standard_name'] = var_name
         return ds_info
 
     def _is_2d_yx_data_array(self, data_arr):
         has_y_dim = data_arr.dims[0] == "y"
         has_x_dim = data_arr.dims[1] == "x"
         return has_y_dim and has_x_dim
 
-    def _available_new_datasets(self):
+    def _available_new_datasets(self, handled_vars):
         """Metadata for available variables other than BT."""
         possible_vars = list(self.nc.items()) + list(self.nc.coords.items())
         for var_name, data_arr in possible_vars:
+            if var_name in handled_vars:
+                continue
             if data_arr.ndim != 2:
                 # we don't currently handle non-2D variables
                 continue
@@ -479,7 +511,6 @@ def available_datasets(self, configured_datasets=None):
 
         """
         yield from self._available_if_this_file_type(configured_datasets)
-        yield from self._available_new_datasets()
         yield from self._available_btemp_datasets()
 
     def __getitem__(self, item):
@@ -491,10 +522,6 @@ def __getitem__(self, item):
 
         """
         data = self.nc[item]
-        attrs = data.attrs.copy()
-        data, attrs = self._scale_data(data, attrs)
-        data, attrs = self._fill_data(data, attrs)
-        data, attrs = self._apply_valid_range(data, attrs)
 
         # 'Freq' dimension causes issues in other processing
         if 'Freq' in data.coords:
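With per-variable decoding moved out of `__getitem__`, the new `apply_attributes` is the single place where YAML metadata and file attributes meet: `dict.update` lets file attributes win, and units are then normalized (e.g. "Kelvin" becomes "K"). Just that merge-and-normalize step can be sketched as follows (attribute values are invented for illustration; this is not satpy's exact code):

```python
def merge_metadata(yaml_info, file_attrs):
    """Merge metadata dicts; file attributes take precedence over YAML."""
    ds_info = dict(yaml_info)      # start from YAML-configured metadata
    ds_info.update(file_attrs)     # file metadata overrides shared keys
    units_convert = {"Kelvin": "K"}
    data_unit = ds_info.get("units")
    ds_info["units"] = units_convert.get(data_unit, data_unit)
    return ds_info

yaml_info = {"name": "btemp_88v", "units": "K", "description": "from yaml"}
file_attrs = {"units": "Kelvin", "scale_factor": 0.01}
merged = merge_metadata(yaml_info, file_attrs)
# "Kelvin" from the file wins over the YAML "K", then is normalized back to "K";
# YAML-only keys like "description" survive the merge
```

This precedence order is what lets the YAML fill in attributes the files omit while still trusting the files when both sources define a key.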
Review comment: Both scale_factor and _FillValue aren't provided in the files?