New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make background cube models #319

Merged
merged 50 commits into from Aug 21, 2015

Conversation

Projects
None yet
2 participants
@mapazarr
Member

mapazarr commented Aug 8, 2015

Adding script to produce bg cube models. This is a continuation of the work started in: gammasky/hess-host-analyses#1.
TODO list:

  • divide the code into smaller functions
  • move most of the functionality to Gammapy classes/functions
  • implement H.E.S.S. to Gammapy obs lists converter
  • reorganize the bg models class
  • add simple dummy bg event list generator
  • add tests for scripts and new methods in Gammapy
  • add script to compare 2 sets of bg cube models
  • remove DEBUG plots
  • implement comments from: gammasky/hess-host-analyses#1 (and this PR)
  • update/implement docs (stringdocs and high-level docs)

@mapazarr mapazarr added this to the 0.4 milestone Aug 8, 2015

@mapazarr mapazarr force-pushed the mapazarr:prod_bg_cubes branch from 8b5a461 to 942adff Aug 8, 2015

@mapazarr

This comment has been minimized.

Member

mapazarr commented Aug 8, 2015

I'm aware of the travis-ci failure. It's because of the use of matplotlib without marking it as optional dependency (using the proper decorator in the tests). I'll work on that. It'll get fixed once I remove the plotting part (used for DEBUG for now).

@mapazarr

This comment has been minimized.

Member

mapazarr commented Aug 8, 2015

@cdeil I will need some fake data (event lists) for the tests. Is there something available? Otherwise I would write a supersimple random tool to do so.

@mapazarr mapazarr force-pushed the mapazarr:prod_bg_cubes branch 2 times, most recently from 30cc622 to 5996121 Aug 8, 2015

@mapazarr

This comment has been minimized.

Member

mapazarr commented Aug 8, 2015

I'm aware of the travis-ci failure. It's because of the use of scipy without marking it as optional dependency (using the proper decorator in the tests). I'll work on that.

@cdeil

This comment has been minimized.

Member

cdeil commented Aug 9, 2015

@mapazarr – A simple tool that samples background events according to some simple shape and spectrum would be good, even if it takes a day (but not more) to build. This way you'll have an easy way to generate example data, and you'll have "true" background cube models that you can compare the "reconstructed" ones against.

@cdeil cdeil added the feature label Aug 10, 2015

@cdeil cdeil self-assigned this Aug 10, 2015

@mapazarr

This comment has been minimized.

Member

mapazarr commented Aug 10, 2015

I'm aware that I need to rebase. Since the PR is still WIP, I'll do this later.

@mapazarr

This comment has been minimized.

Member

mapazarr commented Aug 11, 2015

@cdeil I have divided the bg cube model production into smaller functions and added them to Gammapy. (I still need to do it for the observation grouping.) Can you please have a look at the structure of the code in background/make.py and tell me if you like it?

# 'set_zero_level',
# 'smooth'
# @ C. Deil: does it make sense?

This comment has been minimized.

@mapazarr

mapazarr Aug 11, 2015

Member

@cdeil What do you think about this TODO?

This comment has been minimized.

@cdeil

cdeil Aug 11, 2015

Member

Yes, I think having separate container classes makes sense:

  • one that represents a cube
  • one that contains the counts, time and rate cubes

I don't have a clear picture yet on whether it would be good to have other classes, e.g.

  • Grouping specification
  • The algorithm that takes a grouping specification and produces all the cubes

If you can come up with a written down proposal for the hangout on Wednesday that would be good,
otherwise we start by reviewing this PR and then do some brainstorming how to best structure this (within the given time left, ~ 1 week).

This comment has been minimized.

@mapazarr

mapazarr Aug 21, 2015

Member

OK, separated.

@cdeil

This comment has been minimized.

Member

cdeil commented Aug 11, 2015

Please rebase on master now (making a backup copy before you start).
This has several advantages:

  • travis-ci runs
  • I can more easily try out your branch
  • sometimes the diff on github is more accurate
  • one large rebase at the end is harder than smaller rebases with only few conflicts from time to time.
Observation list to use for the histogramming.
fits_path : str
Path to the data files.
events_cube : `~gammapy.background.CubeBackgroundModel`

This comment has been minimized.

@cdeil

cdeil Aug 11, 2015

Member

Please rename events_cube -> counts_cube everywhere.

This comment has been minimized.

@mapazarr
Cube container for the events.
livetime_cube : `~gammapy.background.CubeBackgroundModel`
Cube container for the livetime.
DEBUG : int

This comment has been minimized.

@cdeil

cdeil Aug 11, 2015

Member

This is not a good way to do logging.

Please change to this way of generating log messages everywhere:
https://gammapy.readthedocs.org/en/latest/development/index.html?highlight=logging#generating-log-messages

This comment has been minimized.

@mapazarr

mapazarr Aug 21, 2015

Member

Fixed.

return events_cube, livetime_cube
def divide_bin_volume(cube):

This comment has been minimized.

@cdeil

cdeil Aug 11, 2015

Member

Such a function that just operates on one object is better located as a method on the object.
Any reason not to make this a CubeBackgroundModel member function?

This comment has been minimized.

@mapazarr

mapazarr Aug 21, 2015

Member

Moved to Cube.

return cube
def set_zero_level(cube):

This comment has been minimized.

@cdeil

cdeil Aug 11, 2015

Member

Same here ... why not make this a cube member function?
Is it because you are thinking about code re-use with other cube objects (like SpectralCube) that could also use this function?

This comment has been minimized.

@mapazarr

mapazarr Aug 21, 2015

Member

Moved to Cube.

Smooth method:
1. slice model in energy bins: 1 image per energy bin

This comment has been minimized.

@cdeil

cdeil Aug 11, 2015

Member

If the smoothing happens on 2D images, it's best (for readability and code re-use) to split that part that operates on 2D images out into a separate function. You could put it in gammapy/image/utils.py.

This comment has been minimized.

@mapazarr

mapazarr Aug 21, 2015

Member

Sorry, no time for this right now. Since we anyway are not sure this is the best smoothing method, I'm not sure it is worth the effort. Maybe in a later PR. Added to list in #333.

Observation list to use for the histogramming.
fits_path : str
Path to the data files.
DEBUG : int

This comment has been minimized.

@cdeil

cdeil Aug 11, 2015

Member

Again ... it's important to use Python logging.

This function could be one of those cases I'm talking about in the last paragraph here where it makes sense that they optionally take their own logger as input. Example.

This comment has been minimized.

@mapazarr
def create_bg_observation_list(fits_path):
"""Make total observation list and filter the observations.
In a first version, all obs taken within 3 deg of a known source will be rejected. If a source is extended, twice the extension is added to the corresponding exclusion region radius of 3 deg.

This comment has been minimized.

@cdeil

cdeil Aug 11, 2015

Member

Break line so that it's easily readable.

This comment has been minimized.

@mapazarr
# https://github.com/mapazarr/gammapy/blob/bg-api/dev/background-api.py#L55
altitude_edges = Angle([0, 20, 23, 27, 30, 33, 37, 40, 44, 49, 53, 58, 64, 72, 90], 'degree')
if DEBUG > 1:
altitude_edges = Angle([0, 45, 90], 'degree')

This comment has been minimized.

@cdeil

cdeil Aug 11, 2015

Member

I think we and others will want to try many different ways of grouping the off runs, and binning / smoothing / fitting the cubes. So having one hard-coded config and one alternative "debug" config for quicker tests doesn't scale to this task.

Making this nice and easy to configure is a hard problem and we have to think / discuss how to do this best.
My first thought would be do have some "...Config" or "...Specification" class (or plain Python dict for now)
and then for the command line tool accept a YAML config file describing this configuration as input.

I think that's what I want to do in Gammapy for all the configurable analyses ... YAML config files.
There's many packages that do this ... here's just one example: http://fermipy.readthedocs.org/en/latest/quickstart.html#creating-a-configuration-file

This comment has been minimized.

@cdeil

cdeil Aug 11, 2015

Member

@mapazarr – For now you could go one step in this direction and collect the parameters for the "hess-mayer" and "hess-debug" config in a dict and let the user select which config to run via a string option.

This comment has been minimized.

@mapazarr

mapazarr Aug 21, 2015

Member

Superdone: ObservationGroups class.

# save the observation list to a fits file
outfile = outdir +\
'bg_observation_table_alt{0}_az{1}.fits.gz'.format(i_alt, i_az)

This comment has been minimized.

@cdeil

cdeil Aug 11, 2015

Member

This way of naming files doesn't scale to other grouping parameters or schemes.
Maybe the best would be to just use GROUP_ID here and then have a separate data structure that defines the grouping and can be written / read.

(I'm not claiming to have a concrete great suggestion here ... this is the issue we've been discussing for a long time now that it's hard to have a good API / classes for observation grouping.)

This comment has been minimized.

@mapazarr

mapazarr Aug 21, 2015

Member

Done. It's hard though to compare models. See workaround in examples/plot_bg_cube_model_comparison.py (included in this PR). Lines ~50 - ~70.
But yes, I agree it is better.

for oldfile in os.listdir(outdir):
os.remove(outdir + oldfile)
# loop over altitude and azimuth angle bins: remember 1 bin less than bin boundaries

This comment has been minimized.

@cdeil

cdeil Aug 11, 2015

Member

Again, this function only works for exactly this grouping in (azimuth, zenith).
It would be much more flexible and better to have one for loop over observation groups here, i.e. to have the observation grouping and background modeling decoupled!

This comment has been minimized.

@mapazarr

mapazarr Aug 21, 2015

Member

Again: done.

setup.py Outdated
@@ -94,6 +93,7 @@
'gammapy-iterative-source-detect = gammapy.scripts.iterative_source_detect:main',
'gammapy-look-up-image = gammapy.scripts.look_up_image:main',
'gammapy-model-image = gammapy.scripts.model_image:main',
'gammapy-make_bg_cube_models = gammapy.scripts.make_bg_cube_models:main',

This comment has been minimized.

@cdeil

cdeil Aug 11, 2015

Member

Please stick with the naming scheme to always use hyphens, not underscores in command line tools.
Hyphens are easier to type (no shift) and mixing hyphens and underscores will mean that users always have to guess.
gammapy-make_bg_cube_models -> gammapy-make-bg-cube-models

This comment has been minimized.

@mapazarr

mapazarr Aug 19, 2015

Member

Fixed.

@@ -10,3 +10,4 @@
from .template import *
from .kernel import *
from .models import *
from .make import *

This comment has been minimized.

@cdeil

cdeil Aug 11, 2015

Member

I think make is too generic a name, plus it would be good to have the class and functions operating on that class (if any remain standalone) in the same file.
So how about renaming make.py -> cube_model.py and moving the CubeBackgroundModel class here?

This comment has been minimized.

@mapazarr

mapazarr Aug 12, 2015

Member

Well, most of the functions should probably move to other places. I like having the generic naming of models.py and make.py, in case other models come in the future and can share the same files. I don't have a strong opinion on that. I was just following the same convention as in the dataset module.

After this clarification: should I keep it as it is? or should I use cube specific naming?

BTW, the generic Cube class mentioned in #319 (diff) could go into a cube.py file, similar to maps.py.

This comment has been minimized.

@cdeil

cdeil Aug 12, 2015

Member

I especially don't like make.py, because a year from now there will be lots of other code that makes things in gammpay.background, e.g. reflected regions or exclusion regions or ... and that code shouldn't be in the same file as what's in make.py now.

So either cube_model.py like I suggested, or cube.py like you did seems better to me.

But in the end ... we will accumulate functionality for a while and then try to do a consistent naming across Gammapy before the 1.0 release. So it's not terribly important what we choose now ... you decide!

This comment has been minimized.

@mapazarr

mapazarr Aug 21, 2015

Member

Done. I moved all methods to either Cube in cube.py or CubeBackgroundModel in models.py. I left only the skeleton in make_bg_cube_model in make.py.

@cdeil

This comment has been minimized.

Member

cdeil commented Aug 11, 2015

@mapazarr – I've left a bunch of inline comments. Some (like rebase, logging) are straightforward to address, others (like grouping, configuring) are vague and need more thought / discussion.

One thing that would help me very much to give useful feedback on this would be if I had a high-level docs page with examples how to run the command line tool so that I can quickly start playing around with this. I'm not sure if generating a ~ 10 run example dataset and using it for a public docs page in Gammapy is best, or a HESS-internal page showing how to do it for a small and quick and the full HESS off run dataset. Ideally both. :-)

@mapazarr mapazarr force-pushed the mapazarr:prod_bg_cubes branch from 0712892 to 8205f3f Aug 11, 2015

@mapazarr

This comment has been minimized.

Member

mapazarr commented Aug 12, 2015

@cdeil OK, I added a documentation page. Sorry, I thought it was clear (for now), when running the inline command with --help and left the doc for later. It's a subpage of the main background page.

I still didn't have time to work on the test data for Gammapy. As for now, it runs on H.E.S.S. data (prod02). You can run a quick test with:

gammapy-make_bg_cube_models /path/to/fits/event_lists/base/dir --test True

The whole production (without --test) takes ~ 5 min on my computer.

The output files are created in the current directory:
* ``bg_observation_table.fits.gz``: total observatin table used for

This comment has been minimized.

@cdeil

cdeil Aug 12, 2015

Member

Typo: observatin -> observation

This comment has been minimized.

@mapazarr

mapazarr Aug 21, 2015

Member

Fixed.

@mapazarr mapazarr referenced this pull request Aug 21, 2015

Closed

Improve cube bg model production #333

1 of 20 tasks complete
@mapazarr

This comment has been minimized.

Member

mapazarr commented Aug 21, 2015

@cdeil OK, I addressed almost all comments. Please don't add any new ones, since I still need to finish a couple of thing (and sleep).

mapazarr added a commit to mapazarr/gammapy that referenced this pull request Aug 21, 2015

@mapazarr mapazarr force-pushed the mapazarr:prod_bg_cubes branch from 21c3562 to 73a45d3 Aug 21, 2015

@mapazarr

This comment has been minimized.

Member

mapazarr commented Aug 21, 2015

@deil Ok, I finished with the comments. Can you please press the green button? There is surely room for improvement, but I can't stay awake longer.

@cdeil

This comment has been minimized.

Member

cdeil commented Aug 21, 2015

@mapazarr – Merging this now. Thank you!

This is the biggest PR we've ever had in Gammapy (+3,682 −882). It contains a lot of great new functionality (observation handling, grouping, cube background model production).

One issue is that this goes in without substantial code review of methods, API, implementation, tests, docs because GSoC is over and I wasn't available much.
So we'll have to see how we handle this for the coming weeks / months. One possibility is that I take some time (a few days needed?) review the diff of this PR and the task list in the follow-up issue #333 to this PR.
Or we'll just do a big systematic review of all of Gammapy and try to improve it in the months before the 1.0 release ... probably we should do this as most code in Gammapy hasn't been thoroughly reviewed and many things changed (like became available in Astropy or we decided to use Sherpa).
Or a little bit of both.

cdeil added a commit that referenced this pull request Aug 21, 2015

Merge pull request #319 from mapazarr/prod_bg_cubes
Make background cube models

@cdeil cdeil merged commit 912db36 into gammapy:master Aug 21, 2015

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
@cdeil

This comment has been minimized.

Member

cdeil commented Aug 21, 2015

The Sphinx build on RTD succeeded:
https://readthedocs.org/projects/gammapy/builds/3264248/
(and RTD since yesterday? shows a much better build log, with the commands they are running and the corresponding output)

Here's some of the new high-level docs pages from this PR ... they all look OK to me (formatting-wise ... I have to take a day next week to go through them):

@mapazarr

This comment has been minimized.

Member

mapazarr commented Aug 21, 2015

snapshot42
Release Gammapy 0.3.1 ? :-D

@mapazarr

This comment has been minimized.

Member

mapazarr commented Aug 21, 2015

The work should continue in #333, or other subsequent PRs.

@cdeil

This comment has been minimized.

Member

cdeil commented Aug 21, 2015

Release Gammapy 0.3.1?

No, we'll just release 0.4 in ~ 1 month.
Better to get some usage / testing for the stuff from this PR. And I also want to have a first version of morphology and spectrum fitting in the next release (which is what most users want).

JonathanDHarris added a commit to JonathanDHarris/gammapy that referenced this pull request Sep 25, 2015

JonathanDHarris added a commit to JonathanDHarris/gammapy that referenced this pull request Sep 25, 2015

JonathanDHarris added a commit to JonathanDHarris/gammapy that referenced this pull request Sep 25, 2015

JonathanDHarris added a commit to JonathanDHarris/gammapy that referenced this pull request Sep 25, 2015

JonathanDHarris added a commit to JonathanDHarris/gammapy that referenced this pull request Sep 25, 2015

JonathanDHarris added a commit to JonathanDHarris/gammapy that referenced this pull request Sep 25, 2015

JonathanDHarris added a commit to JonathanDHarris/gammapy that referenced this pull request Sep 25, 2015

dlennarz pushed a commit to dlennarz/gammapy that referenced this pull request Oct 11, 2015

dlennarz pushed a commit to dlennarz/gammapy that referenced this pull request Oct 11, 2015

dlennarz pushed a commit to dlennarz/gammapy that referenced this pull request Oct 11, 2015

dlennarz pushed a commit to dlennarz/gammapy that referenced this pull request Oct 11, 2015

dlennarz pushed a commit to dlennarz/gammapy that referenced this pull request Oct 11, 2015

dlennarz pushed a commit to dlennarz/gammapy that referenced this pull request Oct 11, 2015

dlennarz pushed a commit to dlennarz/gammapy that referenced this pull request Oct 11, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment