Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for temporal composites #2495

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

gerritholl
Copy link
Collaborator

@gerritholl gerritholl commented May 25, 2023

Add a framework for temporal composites.

Current status

Implementation:

Draft.

When loading a temporal composite from a multiscene, the first iteration is failing as expected, because time dimension information is missing. The MultiScene blend function then calls generate_possible_composites after timeseries blending, and the composite is loaded.

See below for a list of open tasks.

Unit tests

Passing.

Documentation

Present.

  • Closes Define and calculate composites with temporal dependencies #1748
  • Tests added
  • Fully documented
  • Fix implementation
  • Don't throw a warning for every single scene
  • Disambiguate between "needs resampling" and "needs blending", as the latter is going to be very rare and such a message will just confuse users
  • Add functionality to handle the temporal component of the prerequisites
  • Add one or more composites that use the new temporal rgb compositor (this also helps as documentation).
  • Add time tolerances or slicing, for the (common) case where start times are not rounded

Started work on unit tests for a temporal composites framework.
Currently the unit test fails because there is no implementation, and it is
also not quite testing the right thing.
@gerritholl gerritholl requested a review from pnuu May 25, 2023 09:55
Implement a first working draft of temporal compositors.
Needs thinking and tuning still.
@gerritholl gerritholl marked this pull request as ready for review May 25, 2023 16:07
@gerritholl gerritholl requested a review from mraspaud as a code owner May 25, 2023 16:07
@gerritholl gerritholl marked this pull request as draft May 26, 2023 06:49
When blending a multiscene, build the new wishlist from the wishlists
shared between the scenes, analogous to the common datasets.  If we just
copy the wishlist from the first scene, generating possible composites
will fail if there are datasets in the first scene that are not in the
multiscene because they don't have the same datasets.
@gerritholl gerritholl marked this pull request as ready for review May 26, 2023 09:49
@gerritholl
Copy link
Collaborator Author

Although I still have some open ends, a first round of review would be already valuable, in case developers have doubt with the direction this is going.

@codecov
Copy link

codecov bot commented May 26, 2023

Codecov Report

Merging #2495 (224d79e) into main (3b7a477) will increase coverage by 0.02%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main    #2495      +/-   ##
==========================================
+ Coverage   94.83%   94.85%   +0.02%     
==========================================
  Files         337      338       +1     
  Lines       49430    49558     +128     
==========================================
+ Hits        46875    47007     +132     
+ Misses       2555     2551       -4     
Flag Coverage Δ
behaviourtests 4.44% <10.16%> (+<0.01%) ⬆️
unittests 95.47% <100.00%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
satpy/composites/__init__.py 91.51% <100.00%> (+0.30%) ⬆️
satpy/multiscene/_multiscene.py 94.15% <100.00%> (+1.33%) ⬆️
satpy/scene.py 93.52% <100.00%> (ø)
...tests/multiscene_tests/test_temporal_composites.py 100.00% <100.00%> (ø)

... and 2 files with indirect coverage changes

@coveralls
Copy link

coveralls commented May 26, 2023

Pull Request Test Coverage Report for Build 5092382948

  • 118 of 118 (100.0%) changed or added relevant lines in 4 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.02%) to 95.419%

Totals Coverage Status
Change from base Build 5078286241: 0.02%
Covered Lines: 47120
Relevant Lines: 49382

💛 - Coveralls

@pnuu
Copy link
Member

pnuu commented May 26, 2023

No comments on the code, just a question: is this functional already? For example, can the "temporal RGB" in #2488 already created? If yes, we could close that in favor of this more flexible approach. A simple example built-in temporal composite would be great, and at some point a short section to MultiScene documentation with a code example how to create the said composite even better :-)

@gerritholl
Copy link
Collaborator Author

import hdf5plugin
from glob import glob
from satpy.multiscene import MultiScene, timeseries
from satpy.utils import debug_on; debug_on()

ms = MultiScene.from_files(
        glob("/media/nas/x21308/MTG_test_data/2022_05_MTG_Testdata/RC007[012]/*BODY*.nc"),
        reader="fci_l1c_nc",
        group_keys=["repeat_cycle_in_day"])
ms.load(["temporal_rgb_vis06"])
sc = ms.blend(blend_function=timeseries)
ls = sc.resample("eurol")
ls.save_datasets()

temporal_rgb_vis06_20170920_114008

@gerritholl gerritholl requested a review from adybbroe as a code owner May 26, 2023 11:35
@pnuu
Copy link
Member

pnuu commented May 26, 2023

Hmm. I tried this composite:

  temporal_rgb_hrv:
    compositor: !!python/name:satpy.composites.TemporalRGB
    prerequisites:
      - name: HRV
        time: 0
      - name: HRV
        time: -60 min
      - name: HRV
        time: -120 min
    standard_name: temporal_rgb_hrv

with this code:

import sys

from satpy import MultiScene
from satpy.multiscene import timeseries


fnames = sys.argv[1:]

mscn = MultiScene.from_files(
    fnames,
    reader='seviri_l1b_hrit',
    group_keys=['start_time'],
)
mscn.load(['temporal_rgb_hrv'])
blended = mscn.blend(blend_function=timeseries)
scn = blended.resample('euron1', resampler='gradient_search')
scn.save_datasets(filename="/tmp/{start_time:%Y%m%d_%H%M}_{platform_name}_{name}.png")

and get this trace:

The following datasets were not created and may require resampling or temporal blending to be generated: DataID(name='temporal_rgb_hrv')
Traceback (most recent call last):
  File "pandas/_libs/index.pyx", line 548, in pandas._libs.index.DatetimeEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 2263, in pandas._libs.hashtable.Int64HashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 2273, in pandas._libs.hashtable.Int64HashTable.get_item
KeyError: 1683705309504000000

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/lahtinep/mambaforge/envs/py310/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 3802, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 516, in pandas._libs.index.DatetimeEngine.get_loc
  File "pandas/_libs/index.pyx", line 550, in pandas._libs.index.DatetimeEngine.get_loc
KeyError: Timestamp('2023-05-10 07:55:09.504000')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/lahtinep/mambaforge/envs/py310/lib/python3.10/site-packages/pandas/core/indexes/datetimes.py", line 736, in get_loc
    return Index.get_loc(self, key, method, tolerance)
  File "/home/lahtinep/mambaforge/envs/py310/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 3804, in get_loc
    raise KeyError(key) from err
KeyError: Timestamp('2023-05-10 07:55:09.504000')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/lahtinep/mambaforge/envs/py310/lib/python3.10/site-packages/xarray/core/indexes.py", line 486, in sel
    indexer = self.index.get_loc(label_value)
  File "/home/lahtinep/mambaforge/envs/py310/lib/python3.10/site-packages/pandas/core/indexes/datetimes.py", line 738, in get_loc
    raise KeyError(orig_key) from err
KeyError: numpy.datetime64('2023-05-10T07:55:09.504000000')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/lahtinep/bin/test_temporal_rgb_timeseries.py", line 17, in <module>
    blended = mscn.blend(blend_function=timeseries)
  File "/home/lahtinep/Software/pytroll/pytroll_packages/satpy/satpy/multiscene/_multiscene.py", line 365, in blend
    new_scn.generate_possible_composites(True)
  File "/home/lahtinep/Software/pytroll/pytroll_packages/satpy/satpy/scene.py", line 1430, in generate_possible_composites
    keepables = self._generate_composites_from_loaded_datasets()
  File "/home/lahtinep/Software/pytroll/pytroll_packages/satpy/satpy/scene.py", line 1449, in _generate_composites_from_loaded_datasets
    return self._generate_composites_nodes_from_loaded_datasets(needed_comp_nodes)
  File "/home/lahtinep/Software/pytroll/pytroll_packages/satpy/satpy/scene.py", line 1455, in _generate_composites_nodes_from_loaded_datasets
    self._generate_composite(node, keepables)
  File "/home/lahtinep/Software/pytroll/pytroll_packages/satpy/satpy/scene.py", line 1513, in _generate_composite
    composite = compositor(prereq_datasets,
  File "/home/lahtinep/Software/pytroll/pytroll_packages/satpy/satpy/composites/__init__.py", line 1775, in __call__
    new_projectables = self._apply_temporal_prerequisites(projectables)
  File "/home/lahtinep/Software/pytroll/pytroll_packages/satpy/satpy/composites/__init__.py", line 1763, in _apply_temporal_prerequisites
    new_da = proj.sel(time=ref_time+pd.Timedelta(dq["time"]))
  File "/home/lahtinep/mambaforge/envs/py310/lib/python3.10/site-packages/xarray/core/dataarray.py", line 1531, in sel
    ds = self._to_temp_dataset().sel(
  File "/home/lahtinep/mambaforge/envs/py310/lib/python3.10/site-packages/xarray/core/dataset.py", line 2580, in sel
    query_results = map_index_queries(
  File "/home/lahtinep/mambaforge/envs/py310/lib/python3.10/site-packages/xarray/core/indexing.py", line 189, in map_index_queries
    results.append(index.sel(labels, **options))
  File "/home/lahtinep/mambaforge/envs/py310/lib/python3.10/site-packages/xarray/core/indexes.py", line 488, in sel
    raise KeyError(
KeyError: "not all values found in index 'time'. Try setting the `method` keyword argument (example: method='nearest')."

Could this be something data related?

@gerritholl
Copy link
Collaborator Author

Good question. Maybe something about the different start times (nominal, actual, etc.). The time matching as I've currently implemented it needs to be exact, which would only work with nominal start times. I'll have a look.

@gerritholl
Copy link
Collaborator Author

We probably need a way to add tolerance or slicing capability, "search between T-65 and T-55" or so.

@pnuu
Copy link
Member

pnuu commented May 26, 2023

The SEVIRI datasets have these time-related keys:

nominal_start_time 2023-05-10 08:00:10.708720
nominal_end_time 2023-05-10 08:05:10.430222
start_time 2023-05-10 08:00:10.708000
end_time 2023-05-10 08:04:17.969000

There doesn't seem to be exact times available...

satpy/composites/__init__.py Outdated Show resolved Hide resolved
gerritholl and others added 3 commits May 26, 2023 14:37
Co-authored-by: Panu Lahtinen <pnuu+git@iki.fi>
Add a test for a direct compositor construction and calling with the
temporal RGB, and correct an error in the documentation that I found as
a consequence.
@djhoese djhoese added enhancement code enhancements, features, improvements component:compositors component:multiscene labels May 26, 2023
Copy link
Member

@djhoese djhoese left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lots of questions, but this is looking pretty good. One major high-level concern though.

... group_keys=["repeat_cycle_in_day"])
>>> ms.load(["vis_06"])
>>> sc = ms.blend(blend_function=satpy.multiscene.timeseries)
>>> comp = temporal([sc["vis_06"]]*3) # pass three times: same source for each timestep
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how I feel about this. I had figured the user might ask for the prerequisites without any time information, have a separate time kwarg, and the composite would then parse the time dimension and coordinate to determine which time steps to use. So in this case it would receive a single vis_06 with (time, y, x) dimensions and use data_arr.coords["time"] to find/match the necessary time requirements.

This gets at a larger issue I have or at least something that scares me which is putting the time information in the DataQuery. The DataQuery was designed without time in mind and putting it there now just because it doesn't break seems...risky. Or at least, seems like something we're going to regret later on. I was hoping the time information would go in a separate kwarg like time_order or time_intervals or something.

Comment on lines +1707 to +1708
class MissingTime(Exception):
"""Raised when temporal composite building lacks time dimension."""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer this at the top of the module next to IncompatibleAreas.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe this exception and all of the temporal classes should go in a new satpy/composites/temporal.py?



class BaseTemporalCompositor(CompositeBase):
"""Compositors combining multiple time steps.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""Compositors combining multiple time steps.
"""Compositor combining data from different times.

Comment on lines +1716 to +1722
To use this, the user must start with a :class:`MultiScene` and load the
temporal composite from there. Composite generation will fail due to missing
temporal information in the containing scenes, but it will add the temporal
composite to the scene wishlists. Now we run ``MultiScene.blend(timeseries)``,
which will create DataArrays with a time dimension. After blending, the
blend method tries again to generate the composites, and on this second run
the generation should work.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Theoretically a user could create a Scene and add a DataArray with a time dimension and do .load and it would work just fine (I think). Alternatively, a reader could (some day) return a DataArray with a time dimension.

Comment on lines +1739 to +1746
def _ensure_prerequisites_have_times(self, prerequisites):
"""Make sure all prerequisites have times defined."""
for preq in prerequisites:
if "time" not in preq.to_dict():
raise KeyError(
"In a temporal composite, all prerequisites should have "
"time information defined. Time information is missing for "
f"{preq!s}")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this also make sure that pd.Timedelta understands the time value? Or should that wait until __call__ time as you have it now?

Comment on lines +654 to +660
prerequisites:
- wavelength: 0.64
time: 0
- wavelength: 0.64
time: -10 min
- wavelength: 0.64
time: -20 min
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this means green and blue are before the red channel in time? Just curious, is this a standard way of looking at this or would +10, +20, should something equivalent?

Comment on lines +362 to +365
new_scn._wishlist = self._shared_wishlist()
# without copying the dependency tree, there is a KeyError in dependency_tree.trunk
new_scn._dependency_tree = self.first_scene._dependency_tree.copy()
new_scn.generate_possible_composites(True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you/we want to try copying the first Scene to get around this copying of hidden/private properties? I know we were worried about it, but maybe it is worth it? 🤷‍♂️

Comment on lines +1524 to +1526
except (IncompatibleAreas, MissingTime):
LOG.debug("Delaying generation of %s because of incompatible areas "
"or missing time dimension", str(compositor.id))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we split this debug message or make it specific to the case (temporal versus area)?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component:compositors component:multiscene enhancement code enhancements, features, improvements
Projects
Status: Ready for review
Development

Successfully merging this pull request may close these issues.

Define and calculate composites with temporal dependencies
4 participants