Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add reader kwarg to 'ahi_hrit' to disable exact start_time #1986

Merged
merged 3 commits into from Feb 4, 2022

Conversation

pnuu
Copy link
Member

@pnuu pnuu commented Jan 20, 2022

The 'ahi_hrit' by default takes the exact start_time from the data it self. The start_time is down to microsecond level, and is different for every dataset. This will cause for example Sun zenith angle correction to be computed separately for every VIS channel.

This PR introduces a reader keyword argument use_acquisition_time_as_start_time to control whether nominal (parsed from the filename) or exact acquisition start time will be used. The default, due to significantly better performance, is the nominal time parsed from the filename (reader_kwargs={'use_acquisition_time_as_start_time': False}).

The exact times are always available from the channel coordinates:

scene.load(["B03"])
print(scene["B03].coords["acq_time"].data)
array(['2021-12-08T06:00:20.131200000', '2021-12-08T06:00:20.191948000',
       '2021-12-08T06:00:20.252695000', ...,
       '2021-12-08T06:09:39.449390000', '2021-12-08T06:09:39.510295000',
       '2021-12-08T06:09:39.571200000'], dtype='datetime64[ns]')

The first value represents the exact start time, and the last one the exact end time of the data acquisition.

Another option would've been to ensure all the angle computations were using a rounded time (to closest minute or so), but that might not be desirable for some use cases.

@pnuu pnuu added the enhancement code enhancements, features, improvements label Jan 20, 2022
@pnuu pnuu self-assigned this Jan 20, 2022
@codecov
Copy link

codecov bot commented Jan 20, 2022

Codecov Report

Merging #1986 (e0168d8) into main (cccbe06) will increase coverage by 0.02%.
The diff coverage is 96.15%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1986      +/-   ##
==========================================
+ Coverage   93.54%   93.56%   +0.02%     
==========================================
  Files         279      281       +2     
  Lines       41322    41480     +158     
==========================================
+ Hits        38654    38811     +157     
- Misses       2668     2669       +1     
Flag Coverage Δ
behaviourtests 4.77% <0.00%> (-0.02%) ⬇️
unittests 94.09% <96.15%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
satpy/readers/hrit_jma.py 97.98% <80.00%> (+0.04%) ⬆️
satpy/tests/reader_tests/test_ahi_hrit.py 100.00% <100.00%> (ø)
satpy/tests/enhancement_tests/test_viirs.py 100.00% <0.00%> (ø)
satpy/tests/enhancement_tests/test_enhancements.py 100.00% <0.00%> (ø)
satpy/readers/msu_gsa_l1b.py 100.00% <0.00%> (ø)
satpy/tests/reader_tests/test_msu_gsa_l1b.py 100.00% <0.00%> (ø)
satpy/tests/writer_tests/test_cf.py 99.69% <0.00%> (+<0.01%) ⬆️
satpy/writers/cf_writer.py 94.35% <0.00%> (+0.05%) ⬆️
satpy/resample.py 79.73% <0.00%> (+0.38%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cccbe06...e0168d8. Read the comment docs.

@coveralls
Copy link

coveralls commented Jan 20, 2022

Coverage Status

Coverage increased (+0.02%) to 94.025% when pulling e0168d8 on pnuu:bugfix-ahi-start_time-slowdown into cccbe06 on pytroll:main.

@djhoese
Copy link
Member

djhoese commented Jan 20, 2022

To be clear, this PR does not fix any of the performance issues you ran into right?

I know the exact start time should be used by default, but it is too bad for performance that we can't set this new option to False by default. Otherwise, this looks good to me.

Copy link
Member

@sfinkens sfinkens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, very useful! Just nitpicking...


def __init__(self, filename, filename_info, filetype_info):
By default, the reader computes the exact start time. As this time is different for every channel,
things like angle calculation (SZA correction) can get very slow. To use approximate times, the user can
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd favor "nominal" over "approximate", what do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I think there are two whitespaces in the beginning of that sentence.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I'll use nominal. And the double whitespace is a leftover reflex from using Emacs, which thinks wrapping works properly only if there are two whitespaces after a comma...

@pnuu
Copy link
Member Author

pnuu commented Jan 20, 2022

To be clear, this PR does not fix any of the performance issues you ran into right?

No, but setting use_exact_start_time=False helps.

I know the exact start time should be used by default, but it is too bad for performance that we can't set this new option to False by default. Otherwise, this looks good to me.

So I should make the nominal time the default?

The exact scanline times will any way be available for any scientific users.

@pnuu
Copy link
Member Author

pnuu commented Jan 21, 2022

I opted to read Dave's comment so that False should be the default. Updated the docstring and tests to reflect this.

@pnuu
Copy link
Member Author

pnuu commented Jan 21, 2022

For comparison, generation of 6 full disk composites that use SZA correction. First with use_exact_start_time=True (the previous default):
Screenshot 2022-01-21 at 11-04-45 Bokeh Plot
Elapsed time: 49.85 seconds. Maximum memory usage approximately 9.2 GB.

And with use_exact_start_time=False (new default):
Screenshot 2022-01-21 at 11-04-56 Bokeh Plot
Elapsed time: 39.99 seconds. Maximum memory usage approximately 7 GB.

Copy link
Member

@mraspaud mraspaud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I just have a suggestion:
I think use_exact_start_time is not precise enough. How about use_nominal_time_as_start_time or use_acquisition_time_as_start_time?

@pnuu
Copy link
Member Author

pnuu commented Feb 4, 2022

Changed the kwarg to use_acquisition_time_as_start_time.

@mraspaud mraspaud merged commit 34bd429 into pytroll:main Feb 4, 2022
@pnuu pnuu deleted the bugfix-ahi-start_time-slowdown branch February 4, 2022 11:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement code enhancements, features, improvements
Projects
None yet
Development

Successfully merging this pull request may close these issues.

AHI HRIT reader has gotten slower
5 participants