So much issue closing...
1: fixing hard spaces (regex: \\\(.*\)\~/\\\1\\\ /g
2: getting rid of all e.g.s
3: also had to fix use of \pr (what) vs. \pr{what}
dfm committed Feb 7, 2012
1 parent 7109ce7 commit 3f52147
Showing 2 changed files with 44 additions and 42 deletions.
Binary file modified document/doc.pdf
86 changes: 44 additions & 42 deletions document/doc.tex
@@ -144,17 +144,17 @@

We introduce a stable, well tested Python implementation of the affine-%
invariant ensemble sampler for Markov chain Monte Carlo (MCMC)
proposed by Goodman~\&~Weare (2010). The code is open source and has
proposed by Goodman \& Weare (2010). The code is open source and has
already been used in several published projects in the Astrophysics
literature. The algorithm behind \this~has several advantages over
literature. The algorithm behind \this\ has several advantages over
traditional MCMC sampling methods and it has excellent performance as
measured by the autocorrelation time.
One major advantage of the algorithm is that it requires hand-tuning of
only 2 parameters compared to $\sim N^2$ for
a traditional algorithm in an $N$-dimensional parameter space. In this
\paper, we describe the algorithm and the details of our implementation
and API. Taking advantage of the naturally parallel nature of
the algorithm, \this~permits \emph{any} user to take advantage of
the algorithm, \this\ permits \emph{any} user to take advantage of
multiple CPUs without extra effort. This is a huge advantage over a
na\"ive implementation of the published algorithm when applied to
computationally expensive problems.
@@ -191,7 +191,7 @@ \section{Introduction}
parameter spaces. This has proved useful in too many research
applications to list here but the results from the Wilkinson Microwave
Anisotropy Probe (WMAP) mission provide a dramatic example
\citep[e.g.][]{Dunkley:2005}.
\citep[for example][]{Dunkley:2005}.

Arguably the most important advantage of Bayesian data analysis is
that it is possible to \emph{marginalize} over nuisance parameters. A
@@ -226,7 +226,8 @@ \section{Introduction}

Most uses of MCMC in the astrophysics literature are based on slight
modifications to the Metropolis-Hastings (M-H) method
\citep[e.g.][]{MacKay:2003}. Each step in a M-H chain is proposed using a
\citep[for example][]{MacKay:2003}. Each step in a M-H chain is proposed
using a
multivariate Gaussian centered on the current position of the chain. Since
each term in the covariance matrix of this proposal distribution is an
unspecified parameter, this method has $N\,[N+1]/2$ tuning parameters (where
@@ -235,8 +236,8 @@ \section{Introduction}
tuning parameters and there is no fool-proof method for choosing the values
correctly. As a result, many heuristic methods have been developed to attempt
to determine the optimal parameters in a data-driven way
\citep[e.g.][]{Gregory:2005,Dunkley:2005,Widrow:2008}. Unfortunately, these
methods all require ``burn-in'' phases where shorter Markov chains
\citep[for example][]{Gregory:2005,Dunkley:2005,Widrow:2008}. Unfortunately,
these methods all require ``burn-in'' phases where shorter Markov chains
are sampled and the results are used to tune the hyperparameters. This extra
cost is unacceptable when the likelihood calls are computationally heavy.
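As a concrete illustration of the scaling quoted above, the covariance matrix of an $N$-dimensional Gaussian proposal has $N\,[N+1]/2$ independent entries, each one a tuning parameter. A minimal sketch (the helper name is ours, not from the paper):

```python
def n_tuning_parameters(ndim):
    # A symmetric ndim x ndim covariance matrix has ndim diagonal
    # entries plus ndim * (ndim - 1) / 2 independent off-diagonal
    # entries, for ndim * (ndim + 1) / 2 in total.
    return ndim * (ndim + 1) // 2

# The number of hand-tuned proposal parameters grows quadratically:
for ndim in (2, 5, 10, 50):
    print(ndim, n_tuning_parameters(ndim))
```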

@@ -273,7 +274,7 @@ \section{Introduction}
performance. \citet{Hou:2011} were the first group to implement this
algorithm to solve a physics problem. The implementation presented here is
an independent effort that has already proved effective in several projects
\citep[][Foreman-Mackey \& Widrow~2012, in prep.]{Lang:2011,
\citep[][Foreman-Mackey \& Widrow\ 2012, in prep.]{Lang:2011,
Bovy:2011, Dorman:2012}.
In what follows, we summarize the GW algorithm and the implementation
decisions made in \this. We also describe the small changes
@@ -290,31 +291,31 @@ \section{The Algorithm}\sectlabel{algo}
$\{ \model_i \, \forall i=1, \ldots, M \}$ from
the joint probability distribution
\begin{equation}
\pr (\model, \nuisance, \data) = \pr (\model, \nuisance)
\pr{\model, \nuisance, \data} = \pr{\model, \nuisance}
\, \pr (\data | \model, \nuisance)
\end{equation}
where the prior distribution $\pr (\model, \nuisance)$ and the likelihood
function $\pr (\data|\model,\nuisance)$ can be relatively easily (but not
where the prior distribution $\pr{\model, \nuisance}$ and the likelihood
function $\pr{\data|\model,\nuisance}$ can be relatively easily (but not
necessarily quickly) computed for a particular value of
$(\model_i, \nuisance_i)$. Since the normalization $\pr (\data)$ is
$(\model_i, \nuisance_i)$. Since the normalization $\pr{\data}$ is
independent of $\model$ and $\nuisance$, the joint distribution above is
proportional to the posterior probability $\pr (\model, \nuisance | \data)$
proportional to the posterior probability $\pr{\model, \nuisance | \data}$
given any one choice of generative model. Therefore, once the samples
produced by MCMC are available, the marginalized constraints on $\model$
(\eq{marginalization}) can be approximated by the histogram of the samples
projected into the subspace spanned by $\model$. In particular, the
expectation value of a particular parameter $\phi \in \model$ given the
samples $\{ \phi_i \}$ is
\begin{equation}
E[\phi] = \int \phi \, \pr ( \model, \nuisance | \data )
E[\phi] = \int \phi \, \pr{\model, \nuisance | \data}
\, \dd \model \, \dd \nuisance
\approx \frac{1}{M} \sum_{i=1} ^M \phi_i .
\end{equation}
Generating the samples $\model_i$ is a non-trivial process unless
$\pr (\model, \nuisance, \data)$ is a very specific analytic distribution
(e.g.~Gaussian). MCMC is a procedure for generating a random walk in the
parameter space that relatively efficiently draws a representative set of
samples from the distribution. Each point in a Markov chain
$\pr{\model, \nuisance, \data}$ is a very specific analytic distribution
(for example, a Gaussian). MCMC is a procedure for generating a random walk
in the parameter space that relatively efficiently draws a representative set
of samples from the distribution. Each point in a Markov chain
$\link (t) = [\model(t), \nuisance(t)]$
depends only on the position of the previous link $\link (t-1)$.
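The sample-mean estimate of $E[\phi]$ above can be sketched in a few lines. The samples here are synthetic stand-ins for MCMC output, drawn from a known Gaussian so the answer is checkable:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical posterior samples over (phi, nuisance); a real analysis
# would produce these with MCMC rather than drawing them directly.
M = 100_000
samples = rng.normal(loc=[1.5, -0.3], scale=[0.2, 1.0], size=(M, 2))

# Marginalizing over the nuisance parameter amounts to ignoring its
# column; E[phi] is approximated by the sample mean of the phi column.
phi_hat = samples[:, 0].mean()
```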

@@ -326,11 +327,11 @@ \section{The Algorithm}\sectlabel{algo}
$Q(Y; X(t))$, (2) accept this proposal with probability
\begin{equation}
\mathrm{min} \left \{ 1,
\frac{\pr(\vector{Y} | \data)}{\pr(\vector{X}(t) | \data)} \,
\frac{\pr{\vector{Y} | \data}}{\pr{\vector{X}(t) | \data}} \,
\frac{Q(X(t); Y)}{ Q(Y;X(t))} \right \}.
\end{equation}
It is worth emphasizing that if this step is accepted $X(t+1) = Y$; otherwise,
the new position is set to the previous one $X(t+1) \gets X(t)$ (i.e.~the
the new position is set to the previous one $X(t+1) \gets X(t)$ (i.e.\ the
position $X(t)$ is \emph{double counted}).
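The propose, accept, reject cycle just described, including the double counting on rejection, can be sketched as follows. The symmetric Gaussian proposal and the toy target are assumptions for illustration; with a symmetric $Q$, the ratio $Q(X(t); Y) / Q(Y; X(t))$ cancels:

```python
import numpy as np

def mh_step(x, log_prob, proposal_cov, rng):
    # Propose Y from a Gaussian centered on the current position X(t).
    y = rng.multivariate_normal(x, proposal_cov)
    # For a symmetric proposal the acceptance probability reduces to
    # min{1, p(Y | D) / p(X(t) | D)}, computed here in log space.
    if np.log(rng.uniform()) < log_prob(y) - log_prob(x):
        return y   # accept: X(t+1) = Y
    return x       # reject: X(t) is double counted

# Toy target: a standard 2-D Gaussian (an assumption, not the paper's).
log_prob = lambda x: -0.5 * np.dot(x, x)
rng = np.random.default_rng(0)
chain = [np.zeros(2)]
for _ in range(1000):
    chain.append(mh_step(chain[-1], log_prob, np.eye(2), rng))
```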

The M-H algorithm converges (as $t \to \infty$) to a stationary set of
@@ -351,8 +352,8 @@ \section{The Algorithm}\sectlabel{algo}
\begin{algorithmic}[1]

\STATE Draw a sample $Y \sim Q (Y; X(t))$
\STATE $q \gets [\pr(\vector{Y}) \, Q(X(t); Y)]
/ [\pr(\vector{X}(t)) \, Q(Y;X(t))]$
\STATE $q \gets [\pr{\vector{Y}} \, Q(X(t); Y)]
/ [\pr{\vector{X}(t)} \, Q(Y;X(t))]$
\STATE $r \gets R \sim [0, 1]$
\IF{$r \le q$}
\STATE $\vector{X}(t+1) \gets \vector{Y}$
@@ -395,7 +396,7 @@ \section{The Algorithm}\sectlabel{algo}
\begin{equation}
\eqlabel{acceptance}
q = \min \left \{ 1, Z^{n-1} \,
\frac{\pr(\vector{Y})}{\pr(\vector{X_k} (t))} \right \}
\frac{\pr{\vector{Y}}}{\pr{\vector{X_k} (t)}} \right \}
\end{equation}
where $n$ is the dimension of the parameter space. This procedure is then
repeated for each walker in the ensemble \emph{in series} following the
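A sketch of one stretch move implementing the acceptance probability above. The proposal $Y = X_j + Z\,(X_k - X_j)$, with $Z$ drawn from $g(z) \propto 1/\sqrt{z}$ on $[1/a, a]$ and the default $a = 2$, follows Goodman & Weare (2010) and is not shown in this excerpt:

```python
import numpy as np

def stretch_move(xk, xj, log_prob, rng, a=2.0):
    n = xk.size
    # Inverse-CDF draw of Z from g(z) proportional to 1/sqrt(z)
    # on [1/a, a], per Goodman & Weare (2010).
    z = ((a - 1.0) * rng.uniform() + 1.0) ** 2 / a
    y = xj + z * (xk - xj)
    # q = min{1, Z^(n-1) p(Y) / p(Xk)}, evaluated in log space.
    if np.log(rng.uniform()) < (n - 1) * np.log(z) + log_prob(y) - log_prob(xk):
        return y   # accept
    return xk      # reject: the walker repeats its position

# Smoke test on a toy isotropic Gaussian target (an assumption).
log_prob = lambda x: -0.5 * np.dot(x, x)
rng = np.random.default_rng(1)
xk, xj = np.array([1.0, 0.0]), np.array([0.0, 1.0])
xnew = stretch_move(xk, xj, log_prob, rng)
```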
@@ -524,7 +525,7 @@ \section{Benchmarks \& Tests} \sectlabel{tests}
\tau = \sum_{t= -\infty} ^{\infty} \frac{C(t)}{C(0)} .
\end{equation}

\this~can optionally calculate the autocorrelation time using the Python
\this\ can optionally calculate the autocorrelation time using the Python
module \project{acor}\footnote{\url{http://github.com/dfm/acor}} to estimate
the autocorrelation time. This module is a direct port of the original
algorithm \citepalias[described by][]{Goodman:2010} and implemented by those
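The integrated autocorrelation time defined above can be estimated directly from a chain. This sketch uses a fixed summation window, a simplifying assumption rather than the adaptive scheme used by the acor module, and is validated on an AR(1) process whose exact value $\tau = (1+\rho)/(1-\rho)$ is known:

```python
import numpy as np

def integrated_autocorr_time(x, window=50):
    # tau = sum over lags t of C(t) / C(0); since C(t) = C(-t), each
    # positive lag is counted twice. The window is a hand-picked cutoff.
    x = np.asarray(x, dtype=float) - np.mean(x)
    c0 = np.dot(x, x) / x.size
    tau = 1.0
    for t in range(1, window + 1):
        tau += 2.0 * (np.dot(x[:-t], x[t:]) / x.size) / c0
    return tau

# AR(1) chain with coefficient rho: the exact answer is (1+rho)/(1-rho).
rng = np.random.default_rng(3)
rho, n = 0.5, 200_000
eps = rng.normal(size=n)
x = np.empty(n)
x[0] = eps[0]
for i in range(1, n):
    x[i] = rho * x[i - 1] + eps[i]
tau_hat = integrated_autocorr_time(x)  # close to (1 + 0.5)/(1 - 0.5) = 3
```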
@@ -550,7 +551,8 @@ \section{Benchmarks \& Tests} \sectlabel{tests}

\section{Discussion \& Tips}

% DFM: In this section, I put the key advice from each paragraph in \emph{}. Hogg
% DFM: In this section, I put the key advice from each paragraph in
% \emph{}. Hogg

The goal of this project has been to make a sampler that is a useful
tool for a large class of data analysis problems---a ``hammer'' if you
@@ -661,7 +663,7 @@ \section{Discussion \& Tips}
\begin{thebibliography}{}
\raggedright

\bibitem[Bovy~\etal(2011)]{Bovy:2011}
\bibitem[Bovy\ \etal(2011)]{Bovy:2011}
Bovy,~J., Rix,~H.-W., Liu,~C., Hogg,~D.~W., Beers,~T.~C., \& Lee,
Y.~S., 2011, \apj, submitted, arXiv:1111.1724 [astro-ph.GA]
% http://adsabs.harvard.edu/cgi-bin/bib_query?arXiv:1111.1724
@@ -670,21 +672,21 @@ \section{Discussion \& Tips}
{Christen}, J., \emph{A general purpose scale-independent MCMC algorithm},
technical report I-07-16, CIMAT, Guanajuato, 2007.

\bibitem[Dorman~\etal(2012)]{Dorman:2012}
\bibitem[Dorman\ \etal(2012)]{Dorman:2012}
{Dorman},~C., {Guhathakurta},~P., {Fardal},~M.~A., {Geha},~M.~C.,
{Howley},~K.~M., {Kalirai},~J.~S., {Lang},~D., {Cuillandre}, J.,
{Dalcanton},~J., {Gilbert},~K.~M., {Seth},~A.~C., {Williams},~B.~F.,
\& {Yniguez},~B., 2012, \apj, submitted
\& {Yniguez},\ B., 2012, \apj, submitted
% http://adsabs.harvard.edu/abs/2012AAS...21934608D

\bibitem[Dunkley~\etal(2005)]{Dunkley:2005}
\bibitem[Dunkley\ \etal(2005)]{Dunkley:2005}
{Dunkley}, J., {Bucher}, M., {Ferreira}, P.~G., {Moodley}, K.,
\& {Skordis}, C.,
2005, \mnras, 356, 925-936
% http://adsabs.harvard.edu/abs/2005MNRAS.356..925D

\bibitem[Goodman~\&~Weare(2010)]{Goodman:2010}
Goodman,~J., \& Weare,~J.,
\bibitem[Goodman~\&\ Weare(2010)]{Goodman:2010}
Goodman,~J., \& Weare,\ J.,
2010, Comm.\ App.\ Math.\ Comp.\ Sci., 5, 65

\bibitem[Gregory(2005)]{Gregory:2005}
@@ -693,12 +695,12 @@ \section{Discussion \& Tips}
Cambridge University Press, 2005
% http://adsabs.harvard.edu/abs/2005blda.book.....G

\bibitem[Hou~\etal(2011)]{Hou:2011}
\bibitem[Hou\ \etal(2011)]{Hou:2011}
{Hou}, F., {Goodman}, J., {Hogg}, D.~W., {Weare}, J., \& {Schwab}, C.,
2011, arXiv:1104.2612
% http://adsabs.harvard.edu/abs/2011arXiv1104.2612H

\bibitem[Lang~\& Hogg(2011)]{Lang:2011}
\bibitem[Lang\ \& Hogg(2011)]{Lang:2011}
{Lang}, D. and {Hogg}, D.~W.,
2011, arXiv:1103.6038
% http://adsabs.harvard.edu/abs/2011arXiv1103.6038L
@@ -707,7 +709,7 @@ \section{Discussion \& Tips}
{MacKay}, D., \emph{Information Theory, Inference, and Learning Algorithms},
Cambridge University Press, 2003

\bibitem[Widrow~\etal(2008)]{Widrow:2008}
\bibitem[Widrow\ \etal(2008)]{Widrow:2008}
{Widrow}, L.~M. and {Pym}, B. and {Dubinski}, J.,
2008, \apj, 679, 1239
% http://adsabs.harvard.edu/abs/2008ApJ...679.1239W
@@ -720,17 +722,17 @@ \section{Usage}\sectlabel{api}

\paragraph{Installation}

The easiest way to install \this~is using
The easiest way to install \this\ is using
\pip\footnote{\url{http://pypi.python.org/pypi/pip/}}. Running the command
\begin{lstlisting}
% pip install emcee
\end{lstlisting}
at the command line of a UNIX-based system will install the package and its
\Python~dependencies. If you would like to install for all users, you might
need to run the above command with superuser permissions. \this~depends on
\Python~($>2.7$) and \numpy\footnote{\url{http://numpy.scipy.org}} ($>1.6$)
\Python\ dependencies. If you would like to install for all users, you might
need to run the above command with superuser permissions. \this\ depends on
\Python\ ($>2.7$) and \numpy\footnote{\url{http://numpy.scipy.org}} ($>1.6$)
and the associated \texttt{dev} headers. On some systems, you might need to
install these packages separately. On most systems where \Python~has already
install these packages separately. On most systems where \Python\ has already
been installed, this won't be necessary but if it is, you can install
dependencies (on \Ubuntu, for example) with the command:
\begin{lstlisting}
@@ -742,12 +744,12 @@ \section{Usage}\sectlabel{api}
\begin{lstlisting}
% python setup.py install
\end{lstlisting}
in the unzipped directory. Make sure that you have \numpy~installed in this
in the unzipped directory. Make sure that you have \numpy\ installed in this
case as well.

\paragraph{Issues \& Contributions}

The development of \this~is being coordinated on \github~at
The development of \this\ is being coordinated on \github\ at
\url{http://github.com/dfm/emcee} and contributions are welcome. If you
encounter any problems with the code, please report them at
\url{http://github.com/dfm/emcee/issues} (DFM: check this url) and consider
@@ -758,7 +760,7 @@ \section{Usage}\sectlabel{api}
A very simple sample problem and a common unit test for sampling code is
the performance of the code on a high dimensional Gaussian density
\begin{equation}\eqlabel{mgauss}
\pr (\mathbf{x}) \propto \exp\left ( -\frac{1}{2} \mathbf{x}^T \,
\pr{\mathbf{x}} \propto \exp\left ( -\frac{1}{2} \mathbf{x}^T \,
\Sigma^{-1} \, \mathbf{x} \right ).
\end{equation}
We will incrementally work through this example here and the source code
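The density in the equation above translates directly into a log-probability function of the kind the sampler consumes. The function and variable names below are ours, and the covariance is a random but fixed stand-in:

```python
import numpy as np

def ln_gauss(x, icov):
    # ln p(x) = -0.5 * x^T Sigma^{-1} x, up to an additive constant.
    return -0.5 * np.dot(x, np.dot(icov, x))

# A hypothetical 5-dimensional target with a fixed random covariance.
ndim = 5
rng = np.random.default_rng(4)
A = rng.normal(size=(ndim, ndim))
cov = np.dot(A, A.T) + ndim * np.eye(ndim)  # symmetric positive definite
icov = np.linalg.inv(cov)

# The quadratic form vanishes at the mode x = 0, where ln p peaks.
lp0 = ln_gauss(np.zeros(ndim), icov)
```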
