
GSoC 2015 Application Manuel Paz Arribas: Astropy: background modeling for Gammapy

mapazarr edited this page Mar 27, 2015 · 8 revisions

## Sub-organization information

Sub-organization:

Astropy

Personal background and programming experience

I am a PhD student working on gamma-ray astronomy. I work on data analysis using H.E.S.S. (ref: http://www.mpi-hd.mpg.de/hfm/HESS/) observations and CTA (ref: https://portal.cta-observatory.org/Pages/Home.aspx) simulations.

As a member of both the H.E.S.S. Collaboration and the CTA Consortium, I have spent the past few years developing, testing and improving the analysis code. For H.E.S.S. in particular, I have been active in morphology analysis (skymaps and profiles), spectral analysis and background modeling. For CTA, I have helped adapt the H.E.S.S. code to read CTA simulations. In addition, I have developed code to perform trigger rate and data rate estimations as well as the analysis of simulated gamma-ray sources.

The software I have been working on serves a large community (200+ members), so I am used to working collaboratively. Unfortunately the code is not open source, so I cannot provide links to my contributions in the application. Since the code is mostly written in C++ and based on ROOT (ref: https://root.cern.ch) classes, I have extensive experience in C++ coding. I also have some experience with Python scripting and some of its packages: numpy, matplotlib, scipy.

I have worked extensively with CVS and occasionally with SVN, so I understand the procedures of working with repositories. Although git is new to me, I have already completed some tutorials and made several pull requests to the Gammapy affiliated package, so I have the basic know-how needed to participate actively in the development.

Since the use of Python for analysis code and scripting is spreading in the scientific community in general and the astronomy community in particular, I am excited about participating in a project where I can develop my programming skills in Python and its packages. In addition, the astronomy community is putting much effort into the development of common Python tools with Astropy (ref: http://www.astropy.org/) and its affiliated packages like Gammapy (ref: https://gammapy.readthedocs.org/en/latest/). This project would allow me to gain expertise with these tools that I can later apply to my research.

## Project proposal information

Proposal title:

Astropy: background modeling for Gammapy

Proposal abstract:

Gamma-ray astronomy has developed rapidly over the past two decades, with both ground-based imaging atmospheric Cherenkov telescope (IACT) experiments like H.E.S.S. (ref: http://www.mpi-hd.mpg.de/hfm/HESS/), MAGIC (ref: https://magic.mpp.mpg.de/) and VERITAS (ref: http://veritas.sao.arizona.edu/) and satellites like Fermi (ref: http://fermi.gsfc.nasa.gov/). In addition, the next-generation IACT experiment, CTA (ref: https://portal.cta-observatory.org/Pages/Home.aspx), is in its prototype phase.

Fermi data are publicly accessible, and CTA data will be as well in the future. Gammapy (ref: https://gammapy.readthedocs.org/en/latest/) is an open source (BSD licensed) gamma-ray astronomy Python package. It is an in-development affiliated package of Astropy (ref: http://www.astropy.org/) that builds on the core scientific Python stack to provide tools to simulate and analyze the gamma-ray sky for telescopes such as Fermi, H.E.S.S. and CTA.

The instruments developed for detecting gamma rays accumulate many background events. Most are rejected either by an intelligent triggering system in the detectors or in the early stages of the data analysis. Unfortunately, the events passing the gamma-ray selection cuts are still dominated by background, called the gamma-like background. In order to extract the gamma-ray signal, clever algorithms for modeling the gamma-like background are essential.

If my project is accepted, I will implement the most successful background modeling methods widely used by the gamma-ray community in the Astropy/Gammapy framework. In a first step, I will implement tools to create background model templates from observations with no or only a few gamma-ray sources in the field of view. In a second step, I will develop algorithms to estimate the background in observations containing gamma-ray sources, in order to detect them and measure their spatial shape and energy spectrum, in some cases using the model templates from the first step.

Proposal detailed description:

The Gammapy package reads as input a list of events with reconstructed properties like direction, energy and time, and aims to deliver high-level analysis output like skymaps and spectra or cubes (x, y, energy). In order to do this, accurate background models to reject the dominant gamma-like background are essential.

The background modeling techniques currently in use by the community are the 2D background model methods described in Berge 2007 (ref: http://adsabs.harvard.edu/abs/2007A%26A...466.1219B): the ring background, the reflected region background, the template background, the field of view (FoV) background and the ON/OFF background. In addition, a 3D model (the cube background model) is widely used by Fermi. All these methods fall into two categories, according to the observation strategy:

  1. Background models from OFF observations, where the background is modeled from observations far away from any known sources. IACT experiments require dedicated OFF-source observations for modeling the background. The methods are:
  • (3D) cube background (long-lat-energy): the background is determined from observations of empty sky under the same conditions as the ON observations (i.e. observations of the region of interest). The events are binned in spatial coordinates (for instance long, lat) and energy (x, y, e). The normalization between ON and OFF observations is deduced from the total event numbers of ON and OFF observations excluding the ON region.

  • (2D) ON/OFF: similar to the cube background, except that the binning is done only in spatial coordinates.

  2. Background models from ON observations, where the background is modeled using observations within or close to the region of interest (a.k.a. ON region). The methods are:
  • FoV background: the entire FoV, excluding the ON region and other possible gamma-ray sources (i.e. exclusion regions) is used to derive a normalization of the acceptance model derived from OFF observations.

  • (Adaptive) ring background: a ring region around the trial position in the sky is used to determine the background. The background normalization in each position of the ring needs to be corrected for the acceptance of the system. Depending on the extension of the exclusion regions, the size of the ring can automatically be adjusted for each pixel in the sky, according to the usable area of the ring, so that enough background falls into it.

  • Reflected background: the ON region is reflected w.r.t. the center of the FoV at different angles. Since the acceptance of the system is to a good approximation radially symmetric w.r.t. the center of the camera, it does not need to be corrected. This method requires observations in the so-called wobble mode, where the detectors do not point directly at the target, but slightly offset.

  • Template background: the background is estimated from the events within the ON region, but displaced in the image-shape parameter space.
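
The ON/OFF and FoV methods above share one step: scaling a background template to the observed counts outside the excluded regions. A minimal NumPy sketch of that step follows. The function name `normalize_background` and the toy arrays are hypothetical illustrations, not existing Gammapy code.

```python
import numpy as np


def normalize_background(counts, template, exclusion_mask):
    """Scale a background template to the counts outside excluded pixels.

    exclusion_mask is True where pixels must be ignored (ON region,
    known sources). Hypothetical helper, not an existing Gammapy function.
    """
    usable = ~exclusion_mask
    template_sum = template[usable].sum()
    if template_sum <= 0:
        raise ValueError("template has no content outside the exclusion mask")
    norm = counts[usable].sum() / template_sum
    return norm * template, norm


# Toy data: flat template, 4 background counts per pixel,
# plus a source of 50 extra counts inside the excluded pixel.
counts = np.full((5, 5), 4.0)
counts[2, 2] += 50.0
template = np.ones((5, 5))
exclusion_mask = np.zeros((5, 5), dtype=bool)
exclusion_mask[2, 2] = True

background, norm = normalize_background(counts, template, exclusion_mask)
excess = counts - background
```

Because the source pixel is masked, it does not bias the normalization, and the excess map isolates the source counts.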

None of these methods are currently available in Gammapy. In the project I will concentrate on the background methods suited for skymaps and cube production. Specifically: cube background, FoV background, ring background and adaptive ring background.
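
To make the ring background concrete, the following sketch sums counts and acceptance over a ring around every pixel, ignoring excluded pixels, and scales the ring counts by the acceptance ratio. It is an illustration under simplifying assumptions (single-pixel ON region, fixed ring radii in pixels, zero padding at the image edge). All function names are hypothetical and the loop over ring offsets stands in for a proper convolution.

```python
import numpy as np


def ring_offsets(r_in, r_out):
    """Return (dy, dx) pixel offsets with r_in <= radius <= r_out."""
    r = int(np.ceil(r_out))
    dy, dx = np.mgrid[-r:r + 1, -r:r + 1]
    radius = np.hypot(dy, dx)
    inside = (radius >= r_in) & (radius <= r_out)
    return list(zip(dy[inside], dx[inside]))


def ring_sum(image, offsets, pad):
    """Sum image values over the ring around every pixel (zero padded)."""
    padded = np.pad(image, pad, mode="constant")
    ny, nx = image.shape
    total = np.zeros(image.shape, dtype=float)
    for dy, dx in offsets:
        total += padded[pad + dy:pad + dy + ny, pad + dx:pad + dx + nx]
    return total


def ring_background(counts, acceptance, exclusion_mask, r_in, r_out):
    """Estimate the background per pixel from a surrounding ring."""
    usable = (~exclusion_mask).astype(float)
    offsets = ring_offsets(r_in, r_out)
    pad = int(np.ceil(r_out))
    n_off = ring_sum(counts * usable, offsets, pad)
    acc_off = ring_sum(acceptance * usable, offsets, pad)
    # alpha is the ON/OFF acceptance ratio; zero where the ring is unusable
    alpha = np.divide(acceptance, acc_off,
                      out=np.zeros_like(acc_off), where=acc_off > 0)
    return alpha * n_off, alpha


# Toy data: flat background of 5 counts, one bright excluded source.
counts = np.full((21, 21), 5.0)
counts[10, 10] += 100.0
acceptance = np.ones((21, 21))
exclusion_mask = np.zeros((21, 21), dtype=bool)
exclusion_mask[9:12, 9:12] = True

background, alpha = ring_background(counts, acceptance, exclusion_mask,
                                    r_in=3.0, r_out=5.0)
excess = counts - background
```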

I will start the project by implementing the tools needed in Astropy for the background techniques. In particular, fill and lookup methods for NDData, a table model defined on a grid with interpolation and smoothing methods, and a radially symmetric 2D model with evaluation methods. The strategy for the latter is unclear at present. The idea is to extract the functionality from gammapy.irf.TablePSF (ref: https://gammapy.readthedocs.org/en/latest/api/gammapy.irf.TablePSF.html) and put it in a more general and accessible class. This general class could go into either Astropy, Photutils (ref: https://photutils.readthedocs.org/en/latest/) or Gammapy: the exact location still needs to be discussed on the astropy-dev mailing list or in an issue. In addition, methods for deriving radial profiles from images are necessary. For this, the radial profile code from image_tools (ref: https://github.com/keflavich/image_tools/blob/master/image_tools/radialprofile.py) can be used as a starting point. This functionality could fit in the astropy.nddata.utils module (ref: http://astropy.readthedocs.org/en/latest/nddata/index.html#module-astropy.nddata.utils).

Once this functionality is available, the actual background models can be implemented in Gammapy. While the corresponding pull requests wait to be merged into Astropy, the code will be temporarily placed in gammapy.extern (ref: https://github.com/gammapy/gammapy/tree/master/gammapy/extern), so that the work on the background models does not suffer from delays. In a first step, the tools needed for deriving background models should be implemented. Specifically, container classes for the data cubes and the radially symmetric acceptance models, masks to implement exclusion regions, and methods for filling and smoothing the background models. The smoothing is necessary in order to reduce Poisson noise due to the low number of background events, especially at high energies. In a second step, the models can be used for deriving background subtraction techniques from Berge 2007 (ref: http://adsabs.harvard.edu/abs/2007A%26A...466.1219B). In the case of the (adaptive) ring background, downsampling the image (skymap) using gammapy.image.downsample_2N (ref: https://gammapy.readthedocs.org/en/latest/api/gammapy.image.downsample_2N.html) might be necessary to speed up the process.
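
The effect of smoothing and downsampling on a sparsely filled counts image can be sketched with plain NumPy. The boxcar kernel here is only the simplest possible choice, and `downsample_by_two` is a hypothetical stand-in written for this example. It is not the gammapy.image.downsample_2N function, whose interface is not reproduced here.

```python
import numpy as np


def boxcar_smooth(image, width=3):
    """Smooth a 2D image with a separable, normalized boxcar kernel."""
    kernel = np.ones(width) / width

    def smooth_1d(values):
        # mode="same" keeps the length; edges are implicitly zero padded
        return np.convolve(values, kernel, mode="same")

    rows_done = np.apply_along_axis(smooth_1d, 1, image)
    return np.apply_along_axis(smooth_1d, 0, rows_done)


def downsample_by_two(image):
    """Sum 2x2 pixel blocks; both image dimensions must be even."""
    ny, nx = image.shape
    return image.reshape(ny // 2, 2, nx // 2, 2).sum(axis=(1, 3))


# Toy background image with few counts per pixel (Poisson noise dominated)
rng = np.random.default_rng(0)
counts = rng.poisson(2.0, size=(64, 64)).astype(float)

smoothed = boxcar_smooth(counts, width=5)
coarse = downsample_by_two(counts)
```

Summing blocks conserves the total number of counts, while smoothing trades spatial resolution for lower pixel-to-pixel scatter. Pixels near the image edge are biased low by the zero padding and would need separate treatment.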

In order to achieve all these goals, I will work incrementally on several pull requests (one every week or two). A tentative agenda is given below. I am aware that adding unit and functional tests and writing docstrings and high-level docs is a significant part of the time needed to realise this project, in order to have a clean, easy-to-understand code base for users and future developers. To keep the schedule short this is not listed separately for each item below, but I estimate that roughly 30% of the time will be needed to develop and verify tests and 20% to write docstrings and high-level docs.

On a personal note, I think Gammapy is a software package with a bright future in the gamma-ray community, especially with the upcoming CTA observatory, and I am looking forward to using the methods I implement for my own gamma-ray data analyses.

Tentative agenda:

Total duration of coding period: 12 weeks (May 25 - August 21). Tests and documentation are already taken into account in the schedule.

Bonding period:

  • familiarize myself with the specific tools needed from Scipy (interpolation methods), Astropy (NDData and models) and Gammapy (gammapy.background and gammapy.data) before the beginning of the coding period.
  • discuss/define in detail the API and methods to be implemented in gammapy.background on the Gammapy mailing list and in GitHub issues.
  • start an Astropy issue for the table model.
  • start a discussion about the best way to implement the radially symmetric model and where to place it.

General tools in Astropy:

  • week 1: May 25 - May 31: implement lookup and fill methods for NDData, solving issue 619 (ref: https://github.com/astropy/astropy/issues/619)
  • week 2: June 1 - June 7: implement table model with interpolation and smoothing methods
  • week 3: June 8 - June 14: implement radially symmetric model with evaluation methods and radial profile extraction utility

Toolbox for deriving background models from OFF observations in Gammapy:

  • week 4: June 15 - June 21: develop toy background simulation tools to create events lists for testing, solving issue 1 (ref: https://github.com/gammapy/gammapy/issues/1)
  • week 5: June 22 - June 28: develop the container classes in gammapy.data: 1 class for counts cube data set and 1 class for radially symmetric acceptance
  • week 6: June 29 - July 5: use masks to implement exclusion regions for known gamma-ray sources and bright stars and correct livetime accordingly -> mid-term evaluations happen between weeks 5 and 6
  • weeks 7 and 8: July 6 - July 12 and July 13 - July 19: implement methods for smoothing of background model: a simple approach with a smoothing kernel, and a more complex and computationally intensive but more accurate case with the fitting of a function in a sliding window.

Background models from ON observations in Gammapy:

  • week 9: July 20 - July 26: use toolbox for implementing FoV background: methods to project background template model images/cubes onto count images/cubes and normalize them outside exclusion regions.
  • week 10: July 27 - August 2: use toolbox for implementing ring background: methods to convolve the FoV background with a ring and normalize it according to the acceptance in each part of the ring.
  • week 11: August 10 - August 16: implement adaptive ring background: methods to adapt automatically the dimensions of the ring according to the area factor of the ring after applying the exclusion regions mask.
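
The adaptive ring step in the last item can be pictured as a search over ring radii: the ring is moved outward until a large enough fraction of its area lies outside the exclusion mask. The sketch below is one possible reading of that idea. The function name, the fixed ring width and the default thresholds are all assumptions for illustration, and the final algorithm may adapt the ring differently.

```python
import numpy as np


def adaptive_ring_radius(exclusion_mask, position, r_in_start, width,
                         min_area_fraction=0.8, r_step=1.0, r_in_max=50.0):
    """Return the smallest inner radius whose ring has enough usable area.

    The area fraction is computed relative to the ring pixels that fall
    inside the image. Hypothetical helper, radii in pixels.
    """
    y, x = np.indices(exclusion_mask.shape)
    radius = np.hypot(y - position[0], x - position[1])
    r_in = r_in_start
    while r_in <= r_in_max:
        ring = (radius >= r_in) & (radius <= r_in + width)
        n_ring = ring.sum()
        if n_ring > 0:
            usable_fraction = (ring & ~exclusion_mask).sum() / n_ring
            if usable_fraction >= min_area_fraction:
                return r_in
        r_in += r_step
    raise ValueError("no ring with enough usable area found")


# Toy mask: circular exclusion region of radius 8 pixels at the image center
y, x = np.indices((41, 41))
exclusion_mask = np.hypot(y - 20, x - 20) <= 8

r_in = adaptive_ring_radius(exclusion_mask, position=(20, 20),
                            r_in_start=3.0, width=2.0)
```

For a trial position inside the excluded source, the ring has to grow past the exclusion radius before enough background pixels fall into it.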

Buffer:

  • ~week 12: August 17 - August 21: reserved as a buffer period to finish a pull request in case something takes longer than expected or unforeseen difficulties arise. If everything runs smoothly and the buffer period is unnecessary the time will be used for improving the implemented methods or implementation of new ones from Berge 2007 (ref: http://adsabs.harvard.edu/abs/2007A%26A...466.1219B).

Links to patches:

Link to GSoC 2015 blog:
