New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Take observation time from GTI table #1908

Merged
merged 6 commits into from Nov 14, 2018

Conversation

Projects
2 participants
@dcfidalgo
Contributor

dcfidalgo commented Nov 7, 2018

This PR addresses issues #1866 and #1504.

The observation time will now be calculated using the GTI table pointed to in the HDU index file.
For backward compatibility, it will take the observation time from the Obs index file if there is no GTI table available.

Maybe we could add a warning that taking the obs time from the Obs index file is deprecated and will raise an error in the future?

I checked the data sets we have in gammapy-extra for differences between the live-times (computed from the GTI table and taken from the obs index file). Only the data sets

  • datasets/hess-crab4-hd-hap-prod2 and
  • test_datasets/cube/data

show a deviation of about 0.1-0.5%. For the rest of the data sets there is basically no difference.
For datasets/hess-crab4-pa there is no GTI table available.

@cdeil cdeil added the cleanup label Nov 9, 2018

@cdeil cdeil added this to To do in Data via automation Nov 9, 2018

@cdeil cdeil added this to the 0.9 milestone Nov 9, 2018

@cdeil cdeil self-assigned this Nov 9, 2018

@dcfidalgo

This comment has been minimized.

Contributor

dcfidalgo commented Nov 9, 2018

We just discussed this PR in the call. Conclusion: rather than changing the observation_time_duration and observation_live_time_duration property, it would be better to modify the gti property in a way that it creates a GTI table on the fly if necessary.

I will make this change here in this PR.

@dcfidalgo

This comment has been minimized.

Contributor

dcfidalgo commented Nov 13, 2018

I introduced a _create_missing_gti() method, that is called when no GTI table is present in the HDU index file. Also added a test for this method.

Also, observation_time_duration and observation_live_time_duration are no "lazyproperties" anymore, since when applying a time filter, these values can change. So i think it is safer to not cache here.

@cdeil From my side this would be ready to merge.

@dcfidalgo dcfidalgo force-pushed the dcfidalgo:obs_time_from_gti branch 2 times, most recently from d5dbcc4 to a04ddf2 Nov 13, 2018

@dcfidalgo dcfidalgo force-pushed the dcfidalgo:obs_time_from_gti branch from a04ddf2 to edf1ebb Nov 13, 2018

@dcfidalgo

This comment has been minimized.

Contributor

dcfidalgo commented Nov 13, 2018

Just did a rebase on the master, after Axels message on slack.

@cdeil

@dcfidalgo - Thanks!

Some suggestions inline.

I was expecting that you would read the EVENTS header and create the GTI from there.
Here you create it from the OBS_INDEX file.
OK, fine to leave as-is. But I'd suggest to move the method to make the GTI object on the ObservationTable class. This will also simplify the test.
Later we will probably have code to make the GTI table or an ObservationInfo object that also contains pointing info from the EVENTS header, that's needed if we no longer rely on index files.

I specifically don't like the name _create_missing_gti because it doesn't mention from which info this gti is created and wanted to suggest a better / longer name - but @dcfidalgo if you agree to move this to a method on the ObservationTable class, the point is resolved.

The caching, i.e. where we have lazyproperty and where property is pretty arbitrary by now.
Possibly the solution will be to introduce an ObservationInfo object that contains all time and pointing-related info, and that would be cached. Basically an Observation would consists of three parts: events, irfs, info.

Given that here all the remaining lazy properties (except possibly GTI) just access obs_info, which is just a row from a table, i.e. very fast, doesn't access the disk, you could either remove all lazyproperty completely in this file, or only keep them for the obs_info and gti properties. Or just leave as-is for now, as something to be figured out in a follow-up PR.

HDUDOC="https://github.com/open-gamma-ray-astro/gamma-astro-data-formats",
HDUVERS="0.2",
HDUCLAS1="GTI",
MJDREFI=self.data_store.obs_table.meta["MJDREFI"],

This comment has been minimized.

@cdeil

cdeil Nov 13, 2018

Member

We already use time_ref_from_dict(self.data_store.obs_table.meta) in other places in this file. Suggest to use it here also to get consistent behaviour.

This comment has been minimized.

@dcfidalgo

dcfidalgo Nov 14, 2018

Contributor

time_ref_from_dict already returns a Time object, whereas here i rather need the int and float values.

'obs index' header. The `DataStoreObservation._create_missing_gti()` method creates a GTI table on-the-fly using
exactly those values.
"""
ds = data_store()

This comment has been minimized.

@cdeil

cdeil Nov 13, 2018

Member

data_store is a session-scoped fixture. Fixtures shouldn't be called, they should be injected into tests. (see other existing uses of this fixture). This will give a warning or error with the latest pytest release.

Suggest to put the _create_missing_gti method as ObservationTable.make_gti_table(obs_id) and add a test for it there without doing something complex like modifying a DataStore fixture object. Given that this is just a small one-line workaround I think it's fine to have no extra unit test from the DataStoreObservation side.

ds.hdu_table.remove_row(1) # Remove reference to GTI table -> GTI table will be created on-the-fly
obs_time_2 = ds.obs(20136).observation_time_duration
assert obs_time_1 == obs_time_2

This comment has been minimized.

@cdeil

cdeil Nov 13, 2018

Member

Exact asserts on floats probably isn't a good idea.

Do an allclose assert for floats, and use one of our helper functions if it makes the assert allclose simpler:
https://docs.gammapy.org/0.8/utils/index.html#id11

This comment has been minimized.

@dcfidalgo

dcfidalgo Nov 14, 2018

Contributor

Done! See the new test in test_obs_table.py

@dcfidalgo

This comment has been minimized.

Contributor

dcfidalgo commented Nov 14, 2018

@cdeil Thanks for the comments!

I moved the method to the ObservationTable class. For the test i had to modify a bit the test observation table to reflect the time header format.

I left a small test in the test_observations.py that just checks if we correctly catch the case when a GTI table is missing in the HDU Index file.

Given that here all the remaining lazy properties (except possibly GTI) just access obs_info, which is just a row from a table, i.e. very fast, doesn't access the disk, you could either remove all lazyproperty completely in this file, or only keep them for the obs_info and gti properties. Or just leave as-is for now, as something to be figured out in a follow-up PR.

For now i simply removed all lazy properties, maybe we can discuss a possible optimization in a follow-up PR.

@cdeil

@dcfidalgo - Thanks!

Two small comments inline, I'll make one follow-up commit and merge this now.

)
idx = self.get_obs_idx(obs_id)[0]
start = self["TSTART"][idx].astype("float64")

This comment has been minimized.

@cdeil

cdeil Nov 14, 2018

Member

I don't think an astype("float64") here can be helpful; the spec says it should be 64-bit, and if it isn't going 32 -> 64 at this point doesn't help, if the precision is already low, the result will be the same.

I'll do a small edit here to ensure the unit in the GTI table is "s". It should already be in the obs table, but given that you hardcode self.meta.get("TIMEUNIT", "s") above, I think it's a little less error-prone to also go to "s" for the value here.

"""This tests the case when a GTI table is missing in the HDU index file.
For backward compatibility we create a GTI table on-the-fly by means of the Obs Index file
TODO: This method can be removed once GTI tables are explicitly required in Gammapy.

This comment has been minimized.

@cdeil

cdeil Nov 14, 2018

Member

@dcfidalgo - I'll remove this test already now. I've had very bad experiences with tests that modify fixture objects, and then depending on test execution order other tests broke and it was very hard to debug. I also don't like to have such complex tests where state is backed up and restored.
Overall this is an edge case that should go away soon, I think it's OK to leave this one line untested.

@cdeil

cdeil approved these changes Nov 14, 2018

@cdeil cdeil merged commit 9db3efb into gammapy:master Nov 14, 2018

0 of 3 checks passed

Scrutinizer Created
Details
continuous-integration/appveyor/pr Waiting for AppVeyor build to complete
Details
continuous-integration/travis-ci/pr The Travis CI build is in progress
Details

Data automation moved this from To do to Done Nov 14, 2018

@dcfidalgo dcfidalgo deleted the dcfidalgo:obs_time_from_gti branch Nov 14, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment