Allows DataStoreMaker to be used with IRFs not following CALDB structure #3846

QRemy · 2022-03-09T16:19:07Z

DataStore.from_events_files is used to create HDU and observation index tables from the EVENTS header. But this works only for IRFs following the CALDB structure, and the events file header must contain the TELESCOP, CALDB, and IRF keywords (which are not part of GADF).

This PR improves the generation of the IRF filename from the CALDB information and, alternatively, allows reading it directly from an extra IRF_FILE keyword in the events metadata.
The event_sampling notebook now shows how to add this keyword, so the generated index files are no longer broken.
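
For readers following the thread, here is a minimal sketch of the lookup order described above. It is not the code from this PR: the function name `irf_path_from_header`, the plain-dict header, and the `$CALDB/data/<telescope>/<caldb>/bcf/<irf>` layout are assumptions made for illustration only.

```python
import os
from pathlib import Path


def irf_path_from_header(header, events_path):
    """Return a candidate IRF location for one events file (illustrative only)."""
    # IRF_FILE is the extra keyword proposed in this PR; an explicit path wins.
    if "IRF_FILE" in header:
        return Path(os.path.expandvars(header["IRF_FILE"]))

    # A CALDB-style lookup needs all three keywords in the events header.
    if all(key in header for key in ("TELESCOP", "CALDB", "IRF")):
        telescope = header["TELESCOP"].lower()
        # Assumed directory layout; adapt it to the actual CALDB tree in use.
        irf_dir = f"$CALDB/data/{telescope}/{header['CALDB']}/bcf/{header['IRF']}"
        # Expand $CALDB before any file-system access.
        return Path(os.path.expandvars(irf_dir))

    # Otherwise assume the IRFs are stored in the events file itself.
    return Path(events_path)
```

Expanding environment variables before touching the file system mirrors the "use expandvars before glob" commit listed further down; if `$CALDB` is unset, `os.path.expandvars` leaves the placeholder unchanged.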

adonath · 2022-03-11T13:41:58Z

Related: open-gamma-ray-astro/gamma-astro-data-formats#183

adonath · 2022-03-11T13:57:06Z

Conclusion from the dev meeting:

Adding new keywords requires changes and discussion in GADF (see link above)
We can introduce Observation.write() to write events and IRF into one file
We can introduce a helper function like create_hdu_index_table(observations, irf_filenames={}) to create the HDU index table for the case where only one single set of IRFs is used

QRemy · 2022-03-11T15:37:36Z

I modified the DataStoreMaker to accept an optional irf_paths argument,
so now we can call DataStore.from_events_files(events_paths, irf_paths).
If irf_paths is provided, it must have the same length as events_paths (shown in the notebook).
If it is None, the events file header must contain the CALDB and IRF keywords to locate the IRF file;
otherwise the IRFs are assumed to be contained in the events files.
This should cover all the use cases we have.
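
The three cases above can be sketched roughly as follows. This is an illustration under stated assumptions, not the DataStoreMaker implementation: `pair_events_with_irfs` is a hypothetical helper, headers are stand-in dicts, and the string "CALDB" simply marks files whose IRFs would be looked up in the CALDB tree.

```python
from pathlib import Path


def pair_events_with_irfs(events_paths, irf_paths=None, headers=None):
    """Pair each events file with its IRF source (hypothetical helper)."""
    events_paths = [Path(p) for p in events_paths]

    # Case 1: explicit IRF paths, one per events file.
    if irf_paths is not None:
        if len(irf_paths) != len(events_paths):
            raise ValueError("irf_paths must have the same length as events_paths")
        return list(zip(events_paths, (Path(p) for p in irf_paths)))

    # Stand-in for the EVENTS headers; empty dicts if none are given.
    headers = headers or [{}] * len(events_paths)
    pairs = []
    for path, header in zip(events_paths, headers):
        if "CALDB" in header and "IRF" in header:
            # Case 2: the header keywords point into the CALDB tree.
            pairs.append((path, "CALDB"))
        else:
            # Case 3: the IRFs are assumed to live in the events file.
            pairs.append((path, path))
    return pairs
```

Raising on a length mismatch, rather than silently truncating with `zip`, keeps a wrong pairing from ending up in the index tables.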

adonath · 2022-03-11T16:43:44Z

@QRemy Maybe still support the case where irf_path is a single dict internally and document it? Just to make it even more convenient for users?

QRemy · 2022-03-11T17:09:40Z

> @QRemy Maybe still support the case where irf_path is a single dict internally and document it? Just to make it even more convenient for users?

Like the irfs argument in Observation.create?

adonath · 2022-03-11T17:52:01Z

No, I meant something along the lines of:

def from_events_files(event_paths, irf_paths):
    """..."""
    if len(irf_paths) == 1:
        irf_paths = [irf_paths] * len(event_paths)
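
One way to read the suggestion above is as a broadcast of a single IRF path to every events file. The sketch below assumes paths rather than dicts, and `broadcast_irf_paths` is a hypothetical name, not part of gammapy. Note that `[irf_paths] * n` would nest a one-element list, so this version repeats the list itself.

```python
from pathlib import Path


def broadcast_irf_paths(events_paths, irf_paths):
    """Return one IRF path per events file (hypothetical helper)."""
    # A bare str or Path is treated as one path shared by all events files.
    if isinstance(irf_paths, (str, Path)):
        irf_paths = [irf_paths]
    irf_paths = [Path(p) for p in irf_paths]

    # Repeat a single entry so that it matches the number of events files.
    if len(irf_paths) == 1:
        irf_paths = irf_paths * len(events_paths)

    if len(irf_paths) != len(events_paths):
        raise ValueError("irf_paths must contain one path, or one per events file")
    return irf_paths
```

A call such as `broadcast_irf_paths(events_paths, "irf_file.fits")` would then cover the common case where all observations share one set of IRFs.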

codecov · 2022-03-14T13:48:07Z

Codecov Report

Merging #3846 (81f6a80) into master (fd0bb96) will increase coverage by 0.03%.
The diff coverage is 95.00%.

@@            Coverage Diff             @@
##           master    #3846      +/-   ##
==========================================
+ Coverage   93.77%   93.81%   +0.03%     
==========================================
  Files         162      162              
  Lines       20044    20060      +16     
==========================================
+ Hits        18797    18819      +22     
+ Misses       1247     1241       -6

Impacted Files	Coverage Δ
gammapy/data/data_store.py	`92.80% <94.59%> (+1.89%)`	⬆️
gammapy/utils/scripts.py	`95.65% <100.00%> (+0.19%)`	⬆️
gammapy/modeling/iminuit.py	`94.39% <0.00%> (+1.86%)`	⬆️

registerrier

Thanks @QRemy .

I have left some inline comments in the notebook.

@fabiopintore is this new behavior fine?

registerrier · 2022-04-01T17:24:39Z

docs/tutorials/analysis/3D/event_sampling.ipynb

@@ -747,8 +746,11 @@
   "outputs": [],
   "source": [
    "%%time\n",
+    "n_obs = len(tstarts)\n",
+    "irf_paths = [Path(irf_filename)] * n_obs\n",


Is it useful here? It is only when reading that you need the path to the IRF now, no?

QRemy · 2022-04-01

In general, observations can have different IRF files, so the list of irf_paths has to be defined before the loop. I wanted to stress that in the notebook, even though it is not necessary for this use case (we could, for example, use two different zenith angles).

registerrier · 2022-04-01T17:26:59Z

docs/tutorials/analysis/3D/event_sampling.ipynb

-    "paths = list(path.rglob(\"events*.fits\"))\n",
-    "data_store = DataStore.from_events_files(paths)\n",
+    "events_paths = list(path.rglob(\"events*.fits\"))\n",
+    "data_store = DataStore.from_events_files(events_paths, irf_paths)\n",


This is probably where you could define the list of irf_paths, no?

fabiopintore · 2022-04-03T12:18:53Z

It seems OK to me, and I have no further comments. Thanks @QRemy for taking care of this!

registerrier

Thanks @QRemy . This looks good to me.

QRemy requested a review from registerrier March 9, 2022 16:19

QRemy added bug feature labels Mar 9, 2022

adonath added this to the 1.0 milestone Mar 11, 2022

adonath added this to To do in gammapy.data via automation Mar 11, 2022

QRemy force-pushed the nb_events_sampling_caldb branch 2 times, most recently from 601f255 to c3b34e0 Compare March 14, 2022 17:37

adonath self-assigned this Mar 22, 2022

registerrier previously approved these changes Apr 1, 2022

QRemy added 12 commits April 4, 2022 12:13

read irf filename from events header without CALDB

14524c3

add example in the notebook

d7d8834

use expandvars before glob

f89498e

irf_paths in input

a9ccfab

adapt notebook

8de837b

docstring

6be690c

fix keyword

fce9393

events_paths list from loop

cea0cfc

support only one path for irf

0d45bfb

remove unused import

241a737

use iterdir instead of glob

6d64268

adapt tests

197eb44

QRemy added 7 commits April 4, 2022 12:13

fix tests

3a4bd63

$CALDB replace for monkeypatch env

58f58a8

move datastore for dc1 with monkeypatch into fixture

a07117c

remove unused import

ec12488

typo in notebook

44f2eda

define caldb in docstring example

9b37bec

notebook typo

81f6a80

QRemy dismissed registerrier’s stale review via 81f6a80 April 4, 2022 10:20

QRemy force-pushed the nb_events_sampling_caldb branch from 59a5717 to 81f6a80 Compare April 4, 2022 10:20

registerrier approved these changes Apr 7, 2022

registerrier merged commit c92ce85 into gammapy:master Apr 7, 2022

gammapy.data automation moved this from To do to Done Apr 7, 2022
