Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option to replace saturated MODIS L1b values with max valid value #2057

Merged
merged 14 commits into from Apr 8, 2022

Conversation

djhoese
Copy link
Member

@djhoese djhoese commented Mar 11, 2022

As discussed on slack, the MODIS L1b files include a special fill value (65535) that indicates that the detector saturated. This value is only present for band 2. Additionally there is a fill value that means that the value could not be aggregated from the finer resolution version of the band to the 500m or 1km version. This "can't aggregate" value shows up when saturated pixels as well.

This PR adds a new "mask_saturated" kwarg to the modis_l1b reader that, by default, replaces these values with NaNs as it is now. If False, the saturated and can't aggregate fill values are replaced with the max valid value. The name of this kwarg matches the MSI reader (https://github.com/pytroll/satpy/blob/main/satpy/readers/msi_safe.py#L114) which has the same default and behavior.

  • Tests added
  • Fully documented

@djhoese djhoese added enhancement code enhancements, features, improvements component:readers labels Mar 11, 2022
@djhoese djhoese requested a review from simonrp84 March 11, 2022 16:57
@djhoese djhoese requested a review from mraspaud as a code owner March 11, 2022 16:57
@djhoese djhoese self-assigned this Mar 11, 2022
@codecov
Copy link

codecov bot commented Mar 11, 2022

Codecov Report

Merging #2057 (5e8a088) into main (a6f5cdf) will increase coverage by 0.00%.
The diff coverage is 86.81%.

@@           Coverage Diff           @@
##             main    #2057   +/-   ##
=======================================
  Coverage   93.86%   93.86%           
=======================================
  Files         283      283           
  Lines       42328    42379   +51     
=======================================
+ Hits        39730    39781   +51     
  Misses       2598     2598           
Flag Coverage Δ
behaviourtests 4.72% <0.00%> (-0.01%) ⬇️
unittests 94.43% <86.81%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
satpy/readers/modis_l1b.py 64.75% <80.64%> (+9.08%) ⬆️
satpy/readers/hdfeos_base.py 91.86% <100.00%> (ø)
satpy/tests/reader_tests/_modis_fixtures.py 97.60% <100.00%> (+0.09%) ⬆️
satpy/tests/reader_tests/test_modis_l1b.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a6f5cdf...5e8a088. Read the comment docs.

@coveralls
Copy link

coveralls commented Mar 11, 2022

Coverage Status

Coverage increased (+0.007%) to 94.37% when pulling 5e8a088 on djhoese:feature-modis-saturation into a6f5cdf on pytroll:main.

@djhoese
Copy link
Member Author

djhoese commented Mar 11, 2022

Turns out the uncertainty indexes are 15 in real world files where the pixels are saturated. Current solution is to not check uncertainty if not masking saturated pixels.

@djhoese
Copy link
Member Author

djhoese commented Mar 23, 2022

@mraspaud I've been talking to Kathy and @simonrp84 (on slack) and we can't figure out how/why/if the uncertainty should be used. How do you feel about me removing them in favor of the more "traditional" fill value filtering that is/was already used.

@mraspaud
Copy link
Member

I'll be back at the office on Friday, I'll try to find time to talk to the original author of that test and see if he can remember why he did it in the first place. For now, deactivating the uncertainty check when the new option is set to true seems reasonable.

Copy link
Member

@mraspaud mraspaud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The principle is sound, I just have a couple of suggestions. Also, 65532 fill value seems to have some relation with saturation, should we use it?

satpy/readers/modis_l1b.py Show resolved Hide resolved
satpy/readers/modis_l1b.py Outdated Show resolved Hide resolved
@djhoese
Copy link
Member Author

djhoese commented Mar 31, 2022

Regarding 65532, Simon and I talked about that on slack. That is apparently related to the space view being saturated and is not useful here.

@djhoese djhoese changed the title Add option to replace saturated MODIS L1b band 2 values with max reflectance Add option to replace saturated MODIS L1b values with max valid value Apr 5, 2022
Copy link
Member

@mraspaud mraspaud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for this PR. It looks good on the functionality, I just have some style comments.

array = xr.DataArray(from_sds(subdata, chunks=CHUNK_SIZE)[band_index, :, :],
dims=['y', 'x']).astype(np.float32)
valid_range = var_attrs['valid_range']
array = self._mask_invalid_or_fill_saturated(array, valid_range)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really like having one function doing two things here, could we move the if self._mask_saturated logic here and split that function?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could, but it would result in duplicated code as the two options both check valid_min...nevermind, re-reading it is clear that I could do the masking first as a separate method. I'll see what I can do.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I separated this out. I'm not sure if one placement of the if statement is better than another. Let me know what you think.

# lats=satscene[band].area.lats[indices, :])
self._add_satpy_metadata(key, projectable)
return projectable
return subdata, uncertainty, var_attrs, band_index
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a lot of return values :) Would it be worth having a class for this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or build the DataArray here and put eg uncertainty[band_index] as an ancillary dataset in the attrs?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I didn't like this much either, but it wasn't obvious how to do it any other way. It overall seemed better than the huge method that was here before. My goal was to avoid letting datasets or dataset (the name of the variable in the file to use) leak to the outer scope, but maybe I make one method that returns dataset and band_index...nah because then I still have to self.sd.select the dataset again and get the var_attrs.

The problem with making the DataArray here is that it actually doesn't get me much since var_attrs isn't actually applied to the DataArray at all. I'm not sure how I feel about the uncertainty indexes going in the attrs either. I'm leaning towards don't like it 😉

I'll think about doing a separate helper class or something while I work on other stuff today.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok so it required a tiny bit of duplicated logic (.select), BUT I think it is overall cleaner. Check it out.

250: ['EV_250_RefSB'],
}

def __init__(self, filename, filename_info, filetype_info, mask_saturated=True, **kwargs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the kwargs for?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The kwargs are actually needed if this reader (file handler) is loaded at the same time as other readers. This is an unfortunate side effect of how the Scene handles reader keyword arguments and we've done it in other readers. The Scene, when given a dictionary of keyword arguments, will pass them to all readers being loaded. If a file handler/reader doesn't have **kwargs then any unrecognized keyword arguments will raise an exception.

Recently @gerritholl added the ability to specify the exact reader you want to pass keyword arguments to from the Scene __init__ so maybe this isn't required anymore, but I'd be nervous about deprecating it too fast. I could remove it for this reader with the understanding that I only need to handle mask_saturated and the assumption that the base HDFEOS file handler has no other keyword arguments. I could then remove the **kwargs that I added to the base HDFEOS file handler above.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I looked at this more and I think we need this in the other file handlers. The base and geo file handlers in hdfeos.py will also both receive the mask_saturated=True keyword argument and won't know what to do with it. The easiest way to capture that and ignore it is to use **kwargs.

satpy/readers/modis_l1b.py Outdated Show resolved Hide resolved
Copy link
Member

@mraspaud mraspaud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@mraspaud
Copy link
Member

mraspaud commented Apr 8, 2022

CI is still failing, I'm merging main

@djhoese
Copy link
Member Author

djhoese commented Apr 8, 2022

I'll fix the failing tests as soon as I can.

@djhoese
Copy link
Member Author

djhoese commented Apr 8, 2022

The issues identified by the code analyzers that are actually introduced by this PR aren't things that can easily be changed. Given the approval, I'm going to merge this now.

@djhoese djhoese merged commit 6985a3f into pytroll:main Apr 8, 2022
@djhoese djhoese deleted the feature-modis-saturation branch April 8, 2022 13:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component:readers enhancement code enhancements, features, improvements
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants