---
title: "`exoplanet`: Gradient-based probabilistic inference for exoplanet data & other astronomical time series"
tags:
  - Python
  - astronomy
authors:
  - name: Daniel Foreman-Mackey
    orcid: 0000-0002-9328-5652
    affiliation: 1
  - name: Rodrigo Luger
    orcid: 0000-0002-0296-3826
    affiliation: "1,2"
  - name: Eric Agol
    orcid: 0000-0002-0802-9145
    affiliation: "3,2"
  - name: Thomas Barclay
    orcid: 0000-0001-7139-2724
    affiliation: 4
  - name: Luke G. Bouma
    orcid: 0000-0002-0514-5538
    affiliation: 5
  - name: Timothy D. Brandt
    orcid: 0000-0003-2630-8073
    affiliation: 6
  - name: Ian Czekala
    orcid: 0000-0002-1483-8811
    affiliation: "7,8,9,10"
  - name: Trevor J. David
    orcid: 0000-0001-6534-6246
    affiliation: "1,11"
  - name: Jiayin Dong
    orcid: 0000-0002-3610-6953
    affiliation: "7,8"
  - name: Emily A. Gilbert
    orcid: 0000-0002-0388-8004
    affiliation: 12
  - name: Tyler A. Gordon
    orcid: 0000-0001-5253-1987
    affiliation: 3
  - name: Christina Hedges
    orcid: 0000-0002-3385-8391
    affiliation: "13,14"
  - name: Daniel R. Hey
    orcid: 0000-0003-3244-5357
    affiliation: "15,16"
  - name: Brett M. Morris
    orcid: 0000-0003-2528-3409
    affiliation: 17
  - name: Adrian M. Price-Whelan
    orcid: 0000-0003-0872-7098
    affiliation: 1
  - name: Arjun B. Savel
    orcid: 0000-0002-2454-768X
    affiliation: 18
affiliations:
  - name: Center for Computational Astrophysics, Flatiron Institute, New York, NY, USA
    index: 1
  - name: Virtual Planetary Laboratory, University of Washington, Seattle, WA, USA
    index: 2
  - name: Department of Astronomy, University of Washington, Seattle, WA, USA
    index: 3
  - name: Center for Space Sciences and Technology, University of Maryland, Baltimore County, Baltimore, MD, USA
    index: 4
  - name: Department of Astrophysical Sciences, Princeton University, Princeton, NJ, USA
    index: 5
  - name: Department of Physics, University of California, Santa Barbara, Santa Barbara, CA, USA
    index: 6
  - name: Department of Astronomy and Astrophysics, The Pennsylvania State University, University Park, PA, USA
    index: 7
  - name: Center for Exoplanets and Habitable Worlds, The Pennsylvania State University, University Park, PA, USA
    index: 8
  - name: Center for Astrostatistics, The Pennsylvania State University, University Park, PA, USA
    index: 9
  - name: Institute for Computational and Data Sciences, The Pennsylvania State University, University Park, PA, USA
    index: 10
  - name: Department of Astrophysics, American Museum of Natural History, New York, NY, USA
    index: 11
  - name: Department of Astronomy and Astrophysics, University of Chicago, Chicago, IL, USA
    index: 12
  - name: NASA Ames Research Center, Moffett Field, CA, USA
    index: 13
  - name: Bay Area Environmental Research Institute, Moffett Field, CA, USA
    index: 14
  - name: Sydney Institute for Astronomy, School of Physics, University of Sydney, Camperdown, New South Wales, Australia
    index: 15
  - name: Stellar Astrophysics Centre, Department of Physics and Astronomy, Aarhus University, Aarhus, Denmark
    index: 16
  - name: Center for Space and Habitability, University of Bern, Bern, Switzerland
    index: 17
  - name: Department of Astronomy, University of Maryland, College Park, MD, USA
    index: 18
date: 23 April 2021
bibliography: paper.bib
---

Summary

exoplanet is a toolkit for probabilistic modeling of astronomical time series data, with a focus on observations of exoplanets, using PyMC3 [@pymc3]. PyMC3 is a flexible and high-performance model-building language and inference engine that scales well to problems with a large number of parameters. exoplanet extends PyMC3’s modeling language to support many of the custom functions and probability distributions required when fitting exoplanet datasets or other astronomical time series.

While it has been used for other applications, such as the study of stellar variability [e.g., @gillen20; @medina20], the primary purpose of exoplanet is the characterization of exoplanets [e.g., @gilbert20; @plavchan20] or multiple-star systems [e.g., @czekala21] using time-series photometry, astrometry, and/or radial velocity. In particular, the typical use case would be to use one or more of these datasets to place constraints on the physical and orbital parameters of the system, such as planet mass or orbital period, while simultaneously taking into account the effects of stellar variability.

Statement of need

Time-domain astronomy is a priority of the observational astronomical community, with huge survey datasets currently available and more forthcoming. Within this research domain, there is significant investment into the discovery and characterization of exoplanets, planets orbiting stars other than our Sun. These datasets are large (on the scale of hundreds of thousands of observations per star from space-based observatories such as Kepler and TESS), and the research questions are becoming more ambitious (in terms of both the computational cost of the physical models and the flexibility of these models). The packages in the exoplanet ecosystem are designed to enable rigorous probabilistic inference with these large datasets and high-dimensional models by providing a high-performance and well-tested infrastructure for integrating these models with modern modeling frameworks such as PyMC3. Since its initial release at the end of 2018, exoplanet has been widely used, with 64 citations of the Zenodo record [@zenodo] so far.

The exoplanet software ecosystem

Besides the primary exoplanet package, the exoplanet ecosystem of projects includes several other libraries. This paper describes, and is the primary reference for, this full suite of packages. The following provides a short description of each library within this ecosystem and discusses how they are related.

  • exoplanet[^1] is the primary library, and it includes implementations of many special functions required for exoplanet data analysis. These include the spherical geometry for computing orbits, some exoplanet-specific distributions for eccentricity [@kipping13b; @vaneylen19] and limb darkening [@kipping13], and exposure-time integrated limb-darkened transit light curves.
  • exoplanet-core[^2] provides efficient, well-tested, and differentiable implementations of all of the exoplanet-specific operations that must be compiled for performance. These include an efficient solver for Kepler's equation [based on the algorithm proposed by @raposo17] and limb-darkened transit light curves [@agol20]. Besides the implementation for PyMC3, exoplanet-core includes implementations in numpy [@numpy] and jax [@jax].
  • celerite2[^3] is an updated implementation of the celerite algorithm[^4] [@foremanmackey17; @foremanmackey18] for scalable Gaussian Process regression for time series data. Like exoplanet-core, celerite2 includes support for numpy, jax, and PyMC3, as well as some recent generalizations of the celerite algorithm [@gordon20].
  • pymc3-ext[^5] includes a set of helper functions to make PyMC3 more amenable to the typical astronomical data analysis workflow. For example, it provides a tuning schedule for PyMC3's sampler [based on the method used by the Stan project and described by @carpenter17] that performs better on models with correlated parameters.
  • rebound-pymc3[^6] provides an interface between REBOUND [@rein12], REBOUNDx [@tamayo20], and PyMC3 to enable inference with full N-body orbit integration.
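
To make concrete what a "differentiable" solver for Kepler's equation involves, the following is a minimal sketch in plain Python. It is not the algorithm of @raposo17 and it does not use the exoplanet-core API: it substitutes a textbook Newton iteration, and the function names `solve_kepler` and `kepler_gradients` are hypothetical. The gradients follow from implicit differentiation of $M = E - e \sin E$, so they can be evaluated without differentiating through the iteration itself.

```python
import math


def solve_kepler(mean_anomaly, ecc, tol=1e-14, max_iter=100):
    """Solve M = E - e*sin(E) for the eccentric anomaly E (Newton's method).

    Illustrative assumption: a plain Newton iteration, not the scheme
    used by exoplanet-core.
    """
    if not 0.0 <= ecc < 1.0:
        raise ValueError("eccentricity must satisfy 0 <= e < 1")
    # Reduce M to [-pi, pi] and use the odd symmetry E(-M) = -E(M).
    m = math.remainder(mean_anomaly, 2.0 * math.pi)
    sign = math.copysign(1.0, m)
    m = abs(m)
    # Starting at E = pi, the residual is convex on [0, pi], so the
    # Newton iterates approach the root from above.
    ecc_anom = math.pi
    for _ in range(max_iter):
        resid = ecc_anom - ecc * math.sin(ecc_anom) - m
        step = resid / (1.0 - ecc * math.cos(ecc_anom))
        ecc_anom -= step
        if abs(step) < tol:
            break
    return sign * ecc_anom


def kepler_gradients(ecc_anom, ecc):
    """Return (dE/dM, dE/de) from implicit differentiation of Kepler's equation."""
    denom = 1.0 - ecc * math.cos(ecc_anom)
    return 1.0 / denom, math.sin(ecc_anom) / denom


# Example values chosen for illustration only.
M, e = 1.3, 0.4
E = solve_kepler(M, e)
dE_dM, dE_de = kepler_gradients(E, e)
```

Because the gradients depend only on the converged value of $E$, their cost is a handful of floating-point operations regardless of how many iterations the solver needed. Convergence of this simple iteration can be slow for eccentricities very close to one.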

Documentation & case studies

The main documentation page for the exoplanet libraries lives at docs.exoplanet.codes where it is hosted on ReadTheDocs. The tutorials included with the documentation are automatically executed on every push or pull request to the GitHub repository, with the goal of ensuring that the tutorials are always compatible with the current version of the code. The celerite2 project has its own documentation page at celerite2.readthedocs.io, with tutorials that are similarly automatically executed.

Alongside these documentation pages, there is a parallel "Case Studies" website at gallery.exoplanet.codes that includes more detailed example use cases for exoplanet and the other libraries described here. Like the tutorials on the documentation page, these case studies are automatically executed using GitHub Actions, but at a lower cadence (once a week and whenever a new release of the exoplanet library is made) since the runtime is much longer. \autoref{fig:figure} shows the results of two example case studies demonstrating some of the potential use cases of the exoplanet software ecosystem.

Some examples of datasets fit using exoplanet. The full analyses behind these examples are available on the "Case Studies" page as Jupyter notebooks. (left) A fit to the light curves of a transiting exoplanet observed by two different space-based photometric surveys: Kepler and TESS. (right) The phase-folded radial velocity time series for an exoplanet observed from different observatories with different instruments, fit simultaneously using exoplanet. \label{fig:figure}

Similar tools

There is a rich ecosystem of tooling available for inference with models such as the ones supported by exoplanet. Each of these tools has its own set of strengths and limitations and we will not make a detailed comparison here, but it is worth listing some of these tools and situating exoplanet in this context.

Some of the most popular tools in this space include (and note that this is far from a comprehensive list!) EXOFAST [@eastman13; @eastman19], radvel [@fulton18], juliet [@espinoza19], exostriker [@trifonov19], PYANETI [@barragan19], allesfitter [@guenther20], and orbitize [@blunt20]. Similar tools also exist for modeling observations of eclipsing binary systems, including JKTEBOP [@southworth04], eb [@irwin11], and PHOEBE [@conroy20]. These packages all focus on providing a high-level interface for designing models and then executing a fit. In contrast, exoplanet is designed to be lower level and more conceptually similar to tools like batman [@kreidberg15], PyTransit [@parviainen15], ldtk [@parviainen15b], ellc [@maxted16], starry [@luger19], or Limbdark.jl [@agol20], which provide the building blocks for evaluating the models required for inference with exoplanet datasets. In fact, several of the higher-level packages listed above include these lower-level libraries as dependencies, and our hope is that exoplanet could provide the backend for future high-level libraries.

As emphasized in the title of this paper, the main selling point of exoplanet when compared to other tools in this space is that it supports differentiation of all components of the model and is designed to integrate seamlessly with the aesara [@aesara] automatic differentiation framework used by PyMC3. It is worth noting that aesara was previously known as Theano [@theano], so these names are sometimes used interchangeably in the PyMC3 or exoplanet documentation[^7]. Access to these derivatives allows the use of modern inference algorithms such as No-U-Turn Sampling [@hoffman14] or Automatic Differentiation Variational Inference [@kucukelbir17]. These algorithms can have some computational and conceptual advantages over inference methods that do not use gradients, especially for high-dimensional models. The computation of gradients is also useful for model optimization; this is necessary when, say, searching for new exoplanets, mapping out degeneracies or multiple modes of a posterior, or estimating uncertainties from a Hessian. Care has been taken to provide gradients that are numerically stable, and that are more accurate and faster to evaluate than finite-difference gradients.
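
The accuracy argument can be illustrated with a small sketch, assuming a toy circular-orbit radial velocity model that is not part of exoplanet; the names `rv_model`, `rv_grad_period`, and `forward_difference` and all numerical values are hypothetical. The analytic derivative with respect to the orbital period is compared with forward finite differences at several step sizes: large steps suffer from truncation error, while very small steps are dominated by floating-point round-off.

```python
import math


def rv_model(t, period, semi_amp, t0):
    """Toy circular-orbit radial velocity: K * sin(2*pi*(t - t0) / P)."""
    return semi_amp * math.sin(2.0 * math.pi * (t - t0) / period)


def rv_grad_period(t, period, semi_amp, t0):
    """Analytic derivative of rv_model with respect to the period P."""
    phase = 2.0 * math.pi * (t - t0) / period
    # d(phase)/dP = -phase / P, then apply the chain rule.
    return -semi_amp * math.cos(phase) * phase / period


def forward_difference(func, x, step):
    """One-sided finite-difference estimate of d(func)/dx."""
    return (func(x + step) - func(x)) / step


# Illustrative values only (time and period in days, amplitude in m/s).
t, period, semi_amp, t0 = 27.3, 3.52, 12.0, 0.4
exact = rv_grad_period(t, period, semi_amp, t0)

# Absolute error of the finite-difference estimate for each step size.
errors = {}
for step in (1e-2, 1e-5, 1e-8, 1e-13):
    approx = forward_difference(
        lambda p: rv_model(t, p, semi_amp, t0), period, step
    )
    errors[step] = abs(approx - exact)

for step, err in errors.items():
    print(f"step = {step:.0e}  |finite difference - analytic| = {err:.3e}")
```

A finite-difference gradient also needs at least one extra model evaluation per parameter, so its cost grows with the dimension of the model, and the best step size differs from parameter to parameter.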

Acknowledgements

We would like to thank the Astronomical Data Group at Flatiron for listening to every iteration of this project and for providing great feedback every step of the way.

This research was partially conducted during the Exostar19 program at the Kavli Institute for Theoretical Physics at UC Santa Barbara, which was supported in part by the National Science Foundation under Grant No. NSF PHY-1748958.

Besides the software cited above, exoplanet is also built on top of ArviZ [@arviz] and AstroPy [@astropy13; @astropy18].

References

Footnotes

  1. https://github.com/exoplanet-dev/exoplanet

  2. https://github.com/exoplanet-dev/exoplanet-core

  3. https://celerite2.readthedocs.io

  4. https://celerite.readthedocs.io

  5. https://github.com/exoplanet-dev/pymc3-ext

  6. https://github.com/exoplanet-dev/rebound-pymc3

  7. More information about this distinction is available at https://docs.exoplanet.codes/en/stable/user/theano/