New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ObservationFilter class #1912

Merged
merged 6 commits into from Nov 14, 2018

Conversation

Projects
2 participants
@dcfidalgo
Contributor

dcfidalgo commented Nov 8, 2018

This PR is part of the PIG #1877 (roughly the 2 scenario of the implementation road map).

I send the PR now to maybe discuss it tomorrow in the call, but i am still missing tests, so not ready to merge yet!

In principle this PR already enables us to add a time filter and some event filters to an observation, which will be applied on the fly when accessing the events and gti properties of the DataStoreObservation.

@cdeil cdeil self-assigned this Nov 9, 2018

@cdeil cdeil added the feature label Nov 9, 2018

@cdeil cdeil added this to To do in Data via automation Nov 9, 2018

@cdeil cdeil added this to the 0.9 milestone Nov 9, 2018

@cdeil

This comment has been minimized.

Member

cdeil commented Nov 9, 2018

@dcfidalgo - Thanks!

I had a quick look.

One concern is that there needs to be a strategy concerning the handling of the edges of the intervals.
Time interval operations are similar to binning and grouping, and our experience with energy bin grouping in Gammapy was a huge mess (for users and devs), we had multiple iterations that were buggy over a year or two, because we didn't have a well thought out strategy and code review and tests from the start. I think now we have something that works reliably, in the end to achieve that always (e1, e2, e3) grouped onto itself would yield (e1, e2, e3) again we had to introduce something complex here to achieve this in a stable way even if there are rounding errors:

temp = np.interp(ebounds, ebounds_src, bin_edges_src)
bin_edges = np.round(temp, decimals=0).astype(np.int)

I'm not sure if we need this for time or phase intervals, it depends on what processing we do for those and what the output should be, but probably what you add here isn't it yet, specifically I see that you do a < x < b in several places, i.e. left and right edge excluded. Probably that's not what we want, we should do the common choice of left inclusive and right exclusive a <= x < b, see e.g. here: https://github.com/chaimleib/intervaltree
I think introducing eps in the code is never a good solution, we should try to get the behaviour we want just with < and <=, and if really needed with np.interp like we did with energy.

There's also the issue that the select methods you're adding on EventList here are almost identical to the ones we already have here: https://docs.gammapy.org/0.8/_modules/gammapy/data/obs_table.html#ObservationTable
I'm not sure what to do about that. Maybe an internal TableFilter class that takes a Table on init and then applies the filter, which would mean code re-use and consistent behaviour if both EventList and ObservationTable have one-line methods that do something like return TableFilter(self.table).select_time( <forward arguments >). The part I don't like about that is that the docstring will still be duplicated and is pretty long.

Finally, my usual comment: I think you're doing too much in one PR here. Scenario 2 says "Add an empty filter to an Observation"

https://github.com/gammapy/gammapy/blob/2d4955067f43f958d101907599f78479c47372d0/docs/development/pigs/pig-006.rst#implementation-road-map

You are doing that, plus already implementing scenario 3 "filter an Observation by time" and even filter by phase also. These extra things need some thought and discussion and better docs and added tests, especially the time filtering, but also the phase filtering how it should behave and the edges 0 and 1.

The example in the class docstring of ObservationFilter is nice. But it just creates the filter, doesn't do anything useful with it yet. Suggest to move that to a new section on this RST page https://docs.gammapy.org/0.8/data/index.html for now, or a filter sub-page there, and to load up one obs and apply a filter. Alternatively, you can also move this docs example to a test case for now, and then the docs additions come later. But IMO it should be extended a bit to also show the methods you are adding here in action, that helps for code review to start from the "user perspective".

So @dcfidalgo - overall this is a great addition, thanks! If you have time next week and want to continue in this PR and code review, that's OK I think, as long as it gets merged by the end of next week. This might be the fastest way. The other options are to split something out into a smaller PR, or to just merge this in soon and to open one or two follow-up issues or PRs to discuss the tricky parts concerning the detailed behaviour of time and phase filter and add the tests for that functionality there.

@dcfidalgo

This comment has been minimized.

Contributor

dcfidalgo commented Nov 9, 2018

@cdeil Thanks for your comments!

There's also the issue that the select methods you're adding on EventList here are almost identical to the ones we already have here: https://docs.gammapy.org/0.8/_modules/gammapy/data/obs_table.html#ObservationTable
I'm not sure what to do about that. Maybe an internal TableFilter class that takes a Table on init and then applies the filter, which would mean code re-use and consistent behaviour if both EventList and ObservationTable have one-line methods that do something like return TableFilter(self.table).select_time( ). The part I don't like about that is that the docstring will still be duplicated and is pretty long.

Looking ahead, i think the future of the ObservationTable class is not clear given that the whole data accessing stuff will likely change during the next year. So i would wait before trying to unify these methods, maybe in the end only the selection methods of EventList will be needed.

Finally, my usual comment: I think you're doing too much in one PR here. Scenario 2 says "Add an empty filter to an Observation"

The idea was to give a broader overview of the filter implementation, but i could have done it without explicitly implementing the new selection methods! I will replace the implementation with a NotImplementedError, then we can focus this PR and discussion on the ObservationFilter class.

So i will add some tests for the ObservationFilter class and improve the doc string, and then maybe it can go in within the next week.

@dcfidalgo dcfidalgo force-pushed the dcfidalgo:introducing_filter_class branch from 39bc137 to 302c9bc Nov 14, 2018

@dcfidalgo

This comment has been minimized.

Contributor

dcfidalgo commented Nov 14, 2018

I removed the implementations of the select_time and select_custom (for phase selection) method in the GTI and EventListBase class, respectively, and replaced them with a NotImplementedError.

I also improved a bit the example in the ObservationFilter class to show how to hook it up to an observation.

Two tests were also added:

  • check the event filter types
  • run the test case of an empty observation filter

@dcfidalgo dcfidalgo force-pushed the dcfidalgo:introducing_filter_class branch from 71bf84d to 3492019 Nov 14, 2018

@cdeil

cdeil approved these changes Nov 14, 2018

Thanks!

@cdeil cdeil merged commit eb4b495 into gammapy:master Nov 14, 2018

1 of 4 checks passed

Codacy/PR Quality Review Hang in there, Codacy is reviewing your Pull request.
Details
continuous-integration/appveyor/pr Waiting for AppVeyor build to complete
Details
continuous-integration/travis-ci/pr The Travis CI build is in progress
Details
Scrutinizer Analysis: No new issues – Tests: passed
Details

Data automation moved this from To do to Done Nov 14, 2018

@dcfidalgo dcfidalgo deleted the dcfidalgo:introducing_filter_class branch Nov 15, 2018

@cdeil cdeil changed the title from Introducing the ObservationFilter class to Add ObservationFilter class Nov 23, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment