Metadata filtering #58

Merged: mraspaud merged 8 commits into develop from metadata-filtering on Aug 22, 2017
mraspaud (Member)

This adds the possibility of using metadata to filter the data to load, e.g.:
glbl = Scene(service='0° Service', sensor="seviri", reader="hrit_msg", start_time=tslot, end_time=tslot + dt.timedelta(minutes=5), base_dir=thedir)
This filters on service, provided it is defined in the filehandler metadata.

mraspaud requested a review from djhoese on August 18, 2017 08:16
coveralls commented Aug 18, 2017

Coverage increased (+0.005%) to 58.904% when pulling 9687721 on metadata-filtering into 65833a8 on develop.

coveralls commented Aug 18, 2017

Coverage increased (+0.005%) to 58.904% when pulling c636990 on metadata-filtering into 65833a8 on develop.

coveralls commented Aug 18, 2017

Coverage increased (+0.005%) to 58.904% when pulling 860add8 on metadata-filtering into 65833a8 on develop.

djhoese (Member) left a comment

Looks good, except that the removal of **kwargs should be looked at.

@@ -381,7 +381,9 @@ def __init__(self,
start_time=None,
end_time=None,
area=None,
filter_filenames=True, **kwargs):
Member

I am not 100% sure, but I believe **kwargs is needed to absorb any additional keyword arguments that were specified for one reader but are not useful for another. The two cases where this would show up are (1) a Scene with more than one reader (not a thing right now, I think) and (2) instantiating readers in order to match file names while passing all user-provided keyword arguments.

Member Author

ok, putting it back.
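
To illustrate the concern about **kwargs, here is a minimal sketch. The classes ReaderA, ReaderB and StrictReader, the `calibration` option, and the helper `instantiate_all` are all hypothetical names invented for this example; they are not part of the project's API. The sketch only shows how a constructor without **kwargs fails when every user-provided keyword argument is passed to every reader.

```python
class ReaderA:
    """Hypothetical reader that understands a `calibration` option."""

    def __init__(self, start_time=None, end_time=None, calibration=None, **kwargs):
        self.start_time = start_time
        self.calibration = calibration
        self.ignored = kwargs  # options meant for other readers end up here


class ReaderB:
    """Hypothetical reader that has no use for `calibration`."""

    def __init__(self, start_time=None, end_time=None, **kwargs):
        self.start_time = start_time
        self.ignored = kwargs  # unknown options are accepted and set aside


class StrictReader:
    """Like ReaderB, but without **kwargs in the signature."""

    def __init__(self, start_time=None, end_time=None):
        self.start_time = start_time


def instantiate_all(reader_classes, **user_kwargs):
    """Pass every user-provided keyword argument to every reader class."""
    return [cls(**user_kwargs) for cls in reader_classes]


# Both tolerant readers accept the full set of options.
readers = instantiate_all([ReaderA, ReaderB], start_time=0, calibration="counts")
print(readers[1].ignored)

# The strict reader raises TypeError on the option it does not know.
try:
    instantiate_all([StrictReader], start_time=0, calibration="counts")
except TypeError as err:
    print("StrictReader rejected the call:", err)
```

Under these assumptions, keeping **kwargs in the signature is what lets one set of user options be handed to several readers without each reader having to know the others' options.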

@@ -543,8 +560,8 @@ def new_filehandlers_for_filetype(self, filetype_info, filenames):
         filehandler_iter = self.new_filehandler_instances(filetype_info,
                                                           filename_iter)
         return [fhd
-                for fhd in self.filter_fh_by_area(self.filter_fh_by_time(
-                    filehandler_iter))]
+                for fhd in self.filter_fh_by_area(self.filter_fh_by_time(self.filter_fh_by_mda(
Member

Since these are generators, maybe it would look better to instantiate them separately, above the list comprehension, and then use the final one in the list comprehension. Or shouldn't this just be list(generator)?

Member Author (mraspaud), Aug 21, 2017

So, like this?

filtered = self.filter_fh_by_area(self.filter_fh_by_time(filehandler_iter))
return filtered

We could probably return the iterator, by the way.

Member Author

No, we can't.

Member

I mean:

filtered_time = self.filter_fh_by_time(filehandler_iter)
filtered_area = self.filter_fh_by_area(filtered_time)
return list(filtered_area)

Or something similar.
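
A self-contained sketch of the style suggested above, with each generator bound to its own name and the last one materialised with list(). The functions `filter_by_time`, `filter_by_area` and `select`, and the dict-based handlers, are stand-ins invented for this example and are not the project's methods. The second half shows that a generator is exhausted after one pass; the thread does not say why returning the iterator was ruled out, so this is offered only as one general property worth keeping in mind.

```python
def filter_by_time(handlers, start, end):
    """Yield handlers whose time falls inside [start, end]."""
    for fh in handlers:
        if start <= fh["time"] <= end:
            yield fh


def filter_by_area(handlers, area):
    """Yield handlers matching the area; no area means no filtering."""
    for fh in handlers:
        if area is None or fh["area"] == area:
            yield fh


def select(handlers, start, end, area=None):
    """Chain the generators one step at a time, then materialise a list."""
    filtered_time = filter_by_time(handlers, start, end)
    filtered_area = filter_by_area(filtered_time, area)
    return list(filtered_area)


handlers = [
    {"name": "a", "time": 1, "area": "eu"},
    {"name": "b", "time": 5, "area": "eu"},
    {"name": "c", "time": 2, "area": "us"},
]

# A list can be iterated as often as needed.
selected = select(handlers, 0, 3, area="eu")

# A bare generator is consumed by the first pass over it.
lazy = filter_by_area(filter_by_time(handlers, 0, 3), "eu")
first_pass = [fh["name"] for fh in lazy]
second_pass = [fh["name"] for fh in lazy]  # empty: the generator is exhausted
```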

yield filehandler
continue
for key, val in self.metadata.items():
if key in filehandler.mda and val != filehandler.mda[key]:
Member

Is mda a common shorthand for metadata? Should we be more specific and consistent and call it metadata?

Member

Also, maybe the base file handler should create the metadata dictionary and assign the filename info to it?
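
The matching rule visible in the hunk above (drop a handler only when a requested key is present in its metadata and the value differs) can be sketched as a standalone generator. `FakeHandler` and `filter_by_metadata` are hypothetical names for this example. The treatment of handlers that have no `mda` attribute at all (they pass through unfiltered) is an assumption, since the hunk does not show the condition that guards the early `yield`.

```python
class FakeHandler:
    """Hypothetical stand-in for a file handler; `mda` is optional."""

    def __init__(self, name, mda=None):
        self.name = name
        if mda is not None:
            self.mda = mda


def filter_by_metadata(filehandlers, wanted):
    """Yield handlers whose metadata does not contradict `wanted`."""
    for fh in filehandlers:
        mda = getattr(fh, "mda", None)
        if mda is None:
            # Assumption: a handler without metadata cannot be rejected.
            yield fh
            continue
        # A key missing from the handler's metadata is not a mismatch;
        # only a present key with a different value is.
        if all(key not in mda or mda[key] == val for key, val in wanted.items()):
            yield fh


handlers = [
    FakeHandler("a", {"service": "0° Service"}),
    FakeHandler("b", {"service": "RSS"}),
    FakeHandler("c", {}),  # metadata present, but no 'service' key
    FakeHandler("d"),      # no metadata attribute at all
]
kept = [fh.name for fh in filter_by_metadata(handlers, {"service": "0° Service"})]
```

In this sketch only handler "b" is rejected, because it is the only one that defines `service` with a different value.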

coveralls commented Aug 22, 2017

Coverage decreased (-0.07%) to 58.832% when pulling 8ac62c0 on metadata-filtering into 65833a8 on develop.

mraspaud merged commit 1faa560 into develop on Aug 22, 2017
mraspaud deleted the metadata-filtering branch on August 22, 2017 13:37
djhoese mentioned this pull request on Sep 3, 2017