Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor Scene loading and dependency tree #1351

Merged
merged 14 commits into from
Dec 23, 2020

Conversation

mraspaud
Copy link
Member

@mraspaud mraspaud commented Sep 8, 2020

This is a start for refactoring the loading of data in the scene class.

More changes will be needed when a way forward emerges from the discussion in #1469

This PR also solves #1443.

@mraspaud mraspaud added enhancement code enhancements, features, improvements refactor component:scene labels Sep 8, 2020
@mraspaud mraspaud self-assigned this Sep 8, 2020
@coveralls
Copy link

coveralls commented Sep 8, 2020

Coverage Status

Coverage increased (+0.006%) to 90.275% when pulling dae9f28 on mraspaud:refactor-scene-loading into 477136b on pytroll:master.

@codecov
Copy link

codecov bot commented Sep 8, 2020

Codecov Report

Merging #1351 (055bc02) into master (c0e21e6) will increase coverage by 1.03%.
The diff coverage is 97.80%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1351      +/-   ##
==========================================
+ Coverage   90.58%   91.62%   +1.03%     
==========================================
  Files         239      246       +7     
  Lines       34313    36015    +1702     
==========================================
+ Hits        31081    32997    +1916     
+ Misses       3232     3018     -214     
Flag Coverage Δ
behaviourtests 4.42% <17.98%> (-0.12%) ⬇️
unittests 92.08% <97.36%> (+1.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
satpy/dependency_tree.py 97.57% <87.50%> (+0.31%) ⬆️
satpy/node.py 88.88% <95.45%> (+2.52%) ⬆️
satpy/scene.py 90.92% <98.02%> (+0.31%) ⬆️
satpy/tests/test_node.py 100.00% <100.00%> (ø)
satpy/tests/test_scene.py 99.73% <100.00%> (+<0.01%) ⬆️
satpy/tests/utils.py 97.70% <100.00%> (+2.29%) ⬆️
satpy/readers/seviri_l1b_hrit.py 90.93% <0.00%> (-1.25%) ⬇️
satpy/readers/_geos_area.py 100.00% <0.00%> (ø)
satpy/tests/reader_tests/test_glm_l2.py 100.00% <0.00%> (ø)
... and 40 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c0e21e6...055bc02. Read the comment docs.

satpy/scene.py Outdated
".mitiff": "mitiff"}
return mapping.get(extension.lower(), 'simple_image')

def remove_failed_datasets(self, keepables):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did this have to be made public for the case where we call it above on new_scn?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, exactly. As you know, it's bad practice to use the private methods of a class outside of it.

@mraspaud
Copy link
Member Author

mraspaud commented Sep 8, 2020

The refactoring here is a WIP (hence the draft). I'm trying to see if I can get the channel and composite loading logic out of the Scene class, eg in a delegate SceneLoader class.

@mraspaud mraspaud mentioned this pull request Dec 18, 2020
3 tasks
@mraspaud mraspaud marked this pull request as ready for review December 18, 2020 15:21
satpy/node.py Show resolved Hide resolved
Copy link
Member

@djhoese djhoese left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GitHub made this really hard to review as it didn't really understand the rearrangement of Scene methods. I'm not sure I fully understand everything that was done, but I know a lot of the complexity of the dep tree is covered by the tests so I'm not afraid to merge this.

However, you removed property from missing_datasets which is a significant change for a refactor PR, especially since it doesn't seem like it is needed (it doesn't take any additional arguments). I also had some questions on some of the variable names you chose.

satpy/scene.py Outdated
ds = xr.Dataset(ds_dict, coords={"latitude": (["y", "x"], lats),
"longitude": (["y", "x"], lons)})

ds.attrs = mdata
return ds

def _get_actual_dataarrays(self, datasets):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a better name for this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replaced with _get_dataarrays_from_identifiers


def _find_matching_ids_in_readers(self, dataset_key):
matching_ids = {}
for reader_name, reader_instance in self.readers.items():
matching_ids[reader_name] = []
for file_handler_name, reader_instance in self.readers.items():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These aren't file handlers.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. I got confused by looking at the values, forgetting that the reader name is also set in the yaml file for the format. Reverted.

@@ -343,13 +339,13 @@ def _get_unique_id_from_sorted_ids(sorted_ids, distances):
raise MissingDependencies
return result

def _get_unique_node_from_id(self, result, data):
def _get_unique_reader_node_from_id(self, unique_id, reader_name):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unique_id? Why not call it data_id if that's what it is?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, thanks

satpy/scene.py Outdated
@@ -217,7 +217,6 @@ def end_time(self):
"""Return the end time of the file."""
return self.attrs['end_time']

@property
def missing_datasets(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this no longer a property? This is a backwards incompatibility. Not really the best thing for a refactor PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree this shouldn't be in a refactor PR, I will remove it. The PR was hastily finished, and that needed more work obviously which hasn't been done yet.

However, the reason I have this here is that I believe it should be a function, not a property. I got properly confused when doing the refactoring because I wasn't expecting the value of missing_dataset to be changing as it did between load calls. Maybe that's just me, but it seemed much more appropriate to make this a function then.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because I wasn't expecting the value of missing_dataset to be changing as it did between load calls

Not sure I understand how making it a method stops it from changing between load calls. Also not sure why it wouldn't change between load calls. If you try to load something and it fails, it is missing, if you try to load something else and it fails, it is also missing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

making it a method doesn't stop it from changing of course, but a method feels more dynamic as opposed to a property or attribute. Anyway this is changed back now.

Copy link
Member

@djhoese djhoese left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making the changes. Looks good to me as far as I can tell. Like I said GitHub isn't showing the diffs super well with all of the reorganization. I trust you.

@djhoese djhoese changed the title Refactor scene loading Refactor Scene loading and dependency tree Dec 23, 2020
@djhoese djhoese merged commit a701b9a into pytroll:master Dec 23, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component:scene enhancement code enhancements, features, improvements refactor
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Loading and resampling composites sometimes discards their dependencies
3 participants