In [None]:
import os
import numpy as np

You can run terminal commands in a Jupyter notebook by prefacing them with `!`. This will untar the `data/` directory:

In [None]:
!tar -xzf data.tgz

# building an `enterprise` model

The main purpose of `enterprise` is to generate the PTA **likelihood** function.
Then you can use that function to do whatever you want.

The multivariate Gaussian likelihood is defined as:
$$\mathcal{L} = \det\left(2\pi\, \mathbb{C} \right)^{-1/2} \,\, \exp\left( -\frac{1}{2} \vec{q}\cdot \mathbb{C}^{-1} \cdot \vec{q} \right)$$
where $\mathbb{C}$ is the covariance matrix and $\vec{q}$ is the residual vector.

In the PTA case, we construct $\mathbb{C}$ from the white noise covariance $\mathbb{N}$, the design matrix for the Gaussian process models $\mathbb{T}$, and the prior covariance of the Gaussian process coefficients $\mathbb{B}$.
The residual vector is the TOA residuals, $\delta\vec{t}$, minus any deterministic signals in the model, $\vec{s}$ (e.g. a continuous GW from a single SMBHB).

$$ \mathbb{C} = \mathbb{N} + \mathbb{TBT}^T $$
$$ \vec{q} = \delta\vec{t} - \vec{s} $$
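
As a rough illustration of the two equations above, here is a brute-force NumPy sketch that evaluates the log-likelihood for a tiny toy problem. The function name `gaussian_lnlikelihood` and all of the toy arrays are made up for this example, and the normalization is taken to be $\det(2\pi\mathbb{C})^{-1/2}$. This is not how `enterprise` does the calculation; it only shows how the pieces fit together.

```python
import numpy as np


def gaussian_lnlikelihood(q, N, T, B):
    """Brute-force ln L for C = N + T B T^T (only sensible for toy sizes)."""
    C = N + T @ B @ T.T
    # ln det(2 pi C), computed in a numerically stable way
    sign, logdet = np.linalg.slogdet(2.0 * np.pi * C)
    if sign <= 0:
        raise ValueError("C must be positive definite")
    # q . C^{-1} . q without forming C^{-1} explicitly
    chi2 = q @ np.linalg.solve(C, q)
    return -0.5 * (logdet + chi2)


# toy problem (all values are arbitrary)
rng = np.random.default_rng(0)
n_toa, n_coef = 6, 2
N = np.diag(np.full(n_toa, 1e-12))         # white noise variances, s^2
T = rng.standard_normal((n_toa, n_coef))   # toy design matrix
B = np.diag([1e-12, 4e-12])                # toy GP prior variances, s^2
dt = 1e-6 * rng.standard_normal(n_toa)     # toy TOA residuals, s
s = np.zeros(n_toa)                        # no deterministic signal

lnL = gaussian_lnlikelihood(dt - s, N, T, B)
```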



While the likelihood is always a multivariate Gaussian likelihood, the details will depend on the particulars of the model.
`enterprise` is set up so that you specify the **data**, **model**, and **model parameters**, then it constructs the covariance matrix and residual vector needed to calculate the likelihood for a given set of parameters.  The `enterprise` parameter specification is based on the **prior** for each parameter.

The `enterprise.signals.signal_base.PTA` object has two commonly used methods:
* `PTA.get_lnlikelihood()`
* `PTA.get_lnprior()`

Each takes a parameter vector (list of parameter values) as an input.
These two methods are commonly passed to a **Markov chain Monte Carlo** (MCMC) sampler in order to draw samples from the model posterior probability distribution.
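
To make the role of those two functions concrete, here is a minimal Metropolis sampler sketch. It only assumes two callables that each map a parameter vector to a float. The names `metropolis`, `toy_lnlike`, and `toy_lnprior` are hypothetical stand-ins, not part of `enterprise`, and a real analysis would use a dedicated sampler rather than this toy.

```python
import numpy as np


def metropolis(lnlike, lnprior, x0, step, n_steps, seed=0):
    """Toy random-walk Metropolis sampler for ln(posterior) = lnlike + lnprior."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    lnpost = lnlike(x) + lnprior(x)
    chain = np.empty((n_steps, x.size))
    for i in range(n_steps):
        prop = x + step * rng.standard_normal(x.size)
        lnprior_prop = lnprior(prop)
        if np.isfinite(lnprior_prop):
            lnpost_prop = lnprior_prop + lnlike(prop)
        else:
            lnpost_prop = -np.inf  # outside the prior: always rejected
        # accept with probability min(1, posterior ratio)
        if np.log(rng.uniform()) < lnpost_prop - lnpost:
            x, lnpost = prop, lnpost_prop
        chain[i] = x
    return chain


# stand-ins for PTA.get_lnlikelihood and PTA.get_lnprior
def toy_lnlike(x):
    return -0.5 * np.sum(x**2)


def toy_lnprior(x):
    return 0.0 if np.all(np.abs(x) < 5.0) else -np.inf


chain = metropolis(toy_lnlike, toy_lnprior, x0=[1.0, -1.0], step=0.8, n_steps=5000)
```

With a real model, the two toy functions would be replaced by the likelihood and prior methods of the `PTA` object.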

The `enterprise.signals` module contains nearly everything we need to construct the data model and get our likelihood and prior functions.

## load data

The `data/` directory contains `.par` and `.tim` files for three pulsars, one from each recent PTA dataset:
* J1600-3053 from the EPTA 6-pulsar dataset
* J2241-5236 from PPTA DR2
* J2317+1439 from NANOGrav 12.5 yr

Each PTA uses a slightly different data model, so these three will give us an introductory walkthrough of what we can do with `enterprise`.

To keep things organized, here's a dictionary of the filenames for each pulsar:

In [None]:
datafiles = {
    "J1600-3053":{"par":"J1600-3053_EPTA_6psr.par", "tim":"J1600-3053_EPTA_6psr.tim"},
    "J2241-5236":{"par":"J2241-5236_PPTA_dr2.par", "tim":"J2241-5236_PPTA_dr2.tim"},
    "J2317+1439":{"par":"J2317+1439_NANOGrav_12y.par", "tim":"J2317+1439_NANOGrav_12y.tim"},
}

Let's load the data for J2317+1439 as an `enterprise.pulsar.Pulsar` object and construct its model.

In [None]:
from enterprise.pulsar import Pulsar

In [None]:
datadir = os.path.abspath("data")

In [None]:
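# full paths to the .par and .tim files for J2317+1439
pfile = os.path.join(datadir, datafiles["J2317+1439"]["par"])
tfile = os.path.join(datadir, datafiles["J2317+1439"]["tim"])

psr = Pulsar(pfile, tfile)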

(the warnings are generated by `tempo2` and are nothing to worry about...)

## white noise model

The basic PTA white noise model contains **EFAC** and **EQUAD**.
EQUAD, $\mathcal{Q}$, is an additional error term added in quadrature to the timing uncertainty for each TOA, $\sigma$.
EFAC, $\mathcal{F}$, is an overall scale factor that multiplies the timing uncertainty for each TOA.
Both account for excess white noise that is not captured by the timing uncertainty.
In a perfect world, where we fully understood the noise in our telescopes, EFAC would be $1$.
In practice it is often pretty close to $1$.

The basic white noise covariance matrix $\mathbb{N}$ is diagonal
$$\mathbb{N}_{ii} = \mathcal{F}^2\left({\sigma_i}^2 + \mathcal{Q}^2\right) $$

In practice each observing system may have different noise properties.
We define $\mathcal{F}_k$ and $\mathcal{Q}_k$, where $k$ denotes the observing system (e.g. L-band PUPPI or PDFB 20cm), so each system gets its own noise model.
This is referred to as **white noise per backend**.
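
The per-backend formula can be sketched in plain NumPy as follows. The function `white_noise_diag`, the backend labels, and the numbers are invented for illustration; `enterprise` builds the same kind of object through its signal and selection classes.

```python
import numpy as np


def white_noise_diag(toaerrs, backend_flags, efac, log10_equad):
    """Diagonal of N with one EFAC and one EQUAD per backend.

    efac and log10_equad are dicts keyed by backend name (toy interface).
    """
    toaerrs = np.asarray(toaerrs, dtype=float)
    backend_flags = np.asarray(backend_flags)
    ndiag = np.zeros_like(toaerrs)
    for k in np.unique(backend_flags):
        mask = backend_flags == k          # TOAs observed with backend k
        equad = 10.0 ** log10_equad[k]     # the prior is on log10(EQUAD)
        ndiag[mask] = efac[k] ** 2 * (toaerrs[mask] ** 2 + equad**2)
    return ndiag


toaerrs = np.array([1e-7, 2e-7, 1e-7, 3e-7])   # TOA uncertainties, s
backends = np.array(["A", "A", "B", "B"])      # hypothetical backend flags
ndiag = white_noise_diag(
    toaerrs,
    backends,
    efac={"A": 1.0, "B": 2.0},
    log10_equad={"A": -7.0, "B": -8.0},
)
```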

* `white_signals.MeasurementNoise` implements this base white noise model
* `selections.Selection` implements the backend (or any other) selection scheme
* the `parameter` module contains what we'll need to specify the parameter priors

In [None]:
from enterprise.signals import parameter
from enterprise.signals import white_signals
from enterprise.signals import selections

### Construct a white noise "signal" with EFAC and EQUAD per backend
the priors are:
* EFAC $\rightarrow$ Uniform(0.1, 10)
* EQUAD $\rightarrow$ LogUniform($10^{-8.5}, 10^{-5}$) sec
 * $\log_{10}$(EQUAD) $\rightarrow$ Uniform(-8.5, -5)

In [None]:
# define model parameters


In [None]:
# use built in backend selection


In [None]:
# make white noise "signal"


The NANOGrav 12.5yr dataset used **channelized TOAs**, breaking a single observation into many TOAs for each radio frequency channel.  To account for jitter noise, which is perfectly correlated in all TOAs from the same observation, we use the ECORR model.

This makes the white noise covariance matrix, $\mathbb{N}$, block diagonal, where each block groups all of the TOAs from the same observation.
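
Here is a small NumPy sketch of how a fully correlated term per observation produces that block structure. It assumes a single ECORR value and TOAs that are already sorted by observation; the function `ecorr_covariance` and the epoch indices are made up for this example.

```python
import numpy as np


def ecorr_covariance(ndiag, epoch_index, ecorr):
    """Dense N = diag(ndiag) + ecorr^2 * U U^T for a toy dataset.

    epoch_index[i] is the (integer) observation that TOA i belongs to.
    """
    ndiag = np.asarray(ndiag, dtype=float)
    epoch_index = np.asarray(epoch_index)
    n_epochs = epoch_index.max() + 1
    # U[i, j] = 1 if TOA i belongs to observation j, else 0
    U = np.zeros((ndiag.size, n_epochs))
    U[np.arange(ndiag.size), epoch_index] = 1.0
    return np.diag(ndiag) + ecorr**2 * (U @ U.T)


# five TOAs: three from the first observation, two from the second
N = ecorr_covariance(np.full(5, 4e-14), [0, 0, 0, 1, 1], ecorr=1e-7)
```

TOAs in the same observation share an off-diagonal entry of ECORR$^2$, while TOAs from different observations stay uncorrelated.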

### Construct a per backend ECORR "signal"
prior:
* ECORR $\rightarrow$ LogUniform($10^{-8.5}, 10^{-5}$) sec
 * $\log_{10}$(ECORR) $\rightarrow$ Uniform(-8.5, -5)

When additional signals are added to a model in `enterprise` they are literally **added** to make the joint model:
```python
model = modA + modB
```

In [None]:
# define model parameters


In [None]:
# make ecorr "signal" and add it to the existing WN signal


## Gaussian process models

Gaussian process models are defined by a design matrix $\mathbb{T}$, a vector of basis coefficients $\vec{b}$, and a prior on the coefficients, $\mathbb{B}$.  For instance, in the powerlaw red noise model the basis coefficients are the amplitudes of the Fourier modes, and the prior is the powerlaw, which constrains these amplitudes to follow the right spectral shape.

The PTA likelihood **marginalizes** over the coefficients.
Only the GP prior affects the likelihood calculation.
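
To make the design matrix and prior more tangible, here is a NumPy sketch of a sine/cosine Fourier basis and a powerlaw prior on its coefficients. The function names are hypothetical, and the normalization (amplitude referenced to a frequency of 1/yr, frequency bin width $1/T_\mathrm{span}$) is a common PTA convention assumed here rather than taken from this notebook.

```python
import numpy as np

FYR = 1.0 / (365.25 * 86400.0)  # 1/yr in Hz


def fourier_design_matrix(toas, n_modes, Tspan=None):
    """Alternating sine/cosine columns at frequencies k / Tspan, k = 1..n_modes."""
    toas = np.asarray(toas, dtype=float)
    if Tspan is None:
        Tspan = toas.max() - toas.min()
    freqs = np.arange(1, n_modes + 1) / Tspan
    F = np.zeros((toas.size, 2 * n_modes))
    F[:, 0::2] = np.sin(2.0 * np.pi * toas[:, None] * freqs)
    F[:, 1::2] = np.cos(2.0 * np.pi * toas[:, None] * freqs)
    # each frequency appears twice: once for sine, once for cosine
    return F, np.repeat(freqs, 2)


def powerlaw_prior(f, log10_A, gamma, Tspan):
    """Prior variance (s^2) of each Fourier coefficient for a powerlaw spectrum."""
    amp2 = (10.0**log10_A) ** 2
    return amp2 / (12.0 * np.pi**2) * FYR ** (gamma - 3.0) * f ** (-gamma) / Tspan


toas = np.linspace(0.0, 10.0 * 365.25 * 86400.0, 200)  # 10 yr of toy TOAs, s
Tspan = toas.max() - toas.min()
F, f = fourier_design_matrix(toas, n_modes=30, Tspan=Tspan)
phi = powerlaw_prior(f, log10_A=-15.0, gamma=13.0 / 3.0, Tspan=Tspan)
```

The steep powerlaw puts most of the prior variance in the lowest frequencies, which is what makes the noise "red".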

In [None]:
from enterprise.signals import gp_signals
from enterprise.signals import gp_priors

### Construct a Fourier basis GP model for red noise and GWB
Both signals will use the `gp_signals.FourierBasisGP` class.  To differentiate them, we will pass the GWB signal a name.

prior:
* $\log_{10}(A_\mathrm{RN}) \rightarrow$ Uniform(-20, -11)
* $\gamma_\mathrm{RN} \rightarrow$ Uniform(0, 7)


* $\log_{10}(A_\mathrm{GWB}) \rightarrow$ Uniform(-18, -13)
* $\gamma_\mathrm{GWB} = 13/3$, constant (for SMBHB background)

We can also pass a name to `Parameter` classes, which tells them to generate a single common parameter shared by all pulsars, rather than a new one for each.  This is important for GWB parameters.

In [None]:
# define RN model parameters


In [None]:
# define powerlaw for GP prior


In [None]:
# make RN signal


In [None]:
# define GWB model parameters (don't forget to pass each a name)


In [None]:
# define powerlaw for GP prior using GWB parameters


In [None]:
# make GWB signal (don't forget to name it)


### instantiate a built in gp signal object for the timing model
There are two implementations of the linear timing model:

* `gp_signals.TimingModel`
* `gp_signals.MarginalizingTimingModel`

Each can take an option to use an SVD, which helps stabilize the matrix when the timing model is poorly constrained.
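
One way to see why an SVD helps is to replace a badly scaled design matrix by its left singular vectors, which span the same space but are orthonormal. The sketch below uses a made-up three-column design matrix; whether `enterprise` does exactly this internally is an assumption, so treat it as an illustration of the idea only.

```python
import numpy as np

t = np.linspace(0.0, 1.0, 50)
# toy timing design matrix whose columns have very different scales
M = np.column_stack([np.ones_like(t), 1e3 * t, 1e6 * t**2])

# thin SVD: the columns of U are an orthonormal basis for the columns of M
U, s, Vt = np.linalg.svd(M, full_matrices=False)

cond_M = np.linalg.cond(M)  # large: M is poorly conditioned
cond_U = np.linalg.cond(U)  # close to 1: U is well conditioned
```

Because the timing model coefficients are marginalized over, swapping `M` for `U` changes the basis but not the space of timing model perturbations being fit.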

## instantiate a `PTA` and compute the log-likelihood

Remember a model is the literal sum of its parts.
When we add `Signal` objects together we get a `SignalCollection` object.

The `enterprise` `SignalCollection` is a class factory.
It acts like a function that takes a `Pulsar` object as its input in order to apply that model to that pulsar
and returns an instantiated class instance.

The `signal_base.PTA` object can take either a single, instantiated pulsar model or a list of several pulsar models.
To do a single-pulsar noise run we will use just this one pulsar.

In [None]:
from enterprise.signals import signal_base

In [None]:
# add together the signals


In [None]:
# make a PTA object


## generate a random point in the model domain and calculate the likelihood and prior

Each `enterprise` parameter object comes with its own `.sample` method, which draws a random sample from the prior distribution.
We can generate a point in the domain by sampling each parameter in turn.

`PTA.get_lnlikelihood` and `PTA.get_lnprior` can take an array-like input for the parameter vector or a dictionary.  Let's make a dictionary whose keys are the parameter names and whose values are the random samples.

In [None]:
# make a parameter vector as a dictionary


In [None]:
# calculate logL and logPr


The `PTA` object contains all of the other objects that went into making it, including `Parameter`s and `Signal`s.

Let's look at what's in there...

### take a look at the output of each:
* `PTA.signals`
* `PTA.params`
* `PTA.param_names`
* `PTA.pulsars`
* `PTA.pulsarmodels`

The `PTA.summary` method outputs some information about the `PTA` object.
It's useful to print it or save it to file to keep track of things.

### print the output of `PTA.summary`

## What goes into the likelihood?

The full covariance matrix $\mathbb{C}$ is $N_\mathrm{TOA}\times N_\mathrm{TOA}$, which is a nightmare to invert directly.
Instead, a lot of fancy linear algebra builds $\mathbb{C}^{-1}$ out of more manageable parts.
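
The key manipulation is usually the Woodbury identity together with the matrix determinant lemma, which trade the big $N_\mathrm{TOA}\times N_\mathrm{TOA}$ inverse for an $N_\mathrm{coef}\times N_\mathrm{coef}$ one. The following NumPy sketch checks both identities on random toy matrices with a diagonal $\mathbb{N}$ and a finite diagonal $\mathbb{B}$. All values are dimensionless and invented for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
n_toa, n_coef = 40, 4
ndiag = rng.uniform(0.5, 2.0, n_toa)       # diagonal of N
T = rng.standard_normal((n_toa, n_coef))   # toy design matrix
bdiag = rng.uniform(0.5, 2.0, n_coef)      # diagonal of B

# direct approach: build and invert the full n_toa x n_toa matrix
C = np.diag(ndiag) + T @ np.diag(bdiag) @ T.T
C_inv_direct = np.linalg.inv(C)
logdet_direct = np.linalg.slogdet(C)[1]

# Woodbury: only the small matrix Sigma = B^{-1} + T^T N^{-1} T is factored
Ninv_T = T / ndiag[:, None]
Sigma = np.diag(1.0 / bdiag) + T.T @ Ninv_T
C_inv_woodbury = np.diag(1.0 / ndiag) - Ninv_T @ np.linalg.solve(Sigma, Ninv_T.T)

# determinant lemma: ln det C = ln det N + ln det B + ln det Sigma
logdet_lemma = np.sum(np.log(ndiag)) + np.sum(np.log(bdiag)) + np.linalg.slogdet(Sigma)[1]
```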

### T-matrix

The $\mathbb{T}$ matrix is $N_\mathrm{TOA} \times N_\mathrm{coef}$, where $N_\mathrm{coef}$ is the total number of GP coefficients in all GP models.  The $\mathbb{T}$ matrix is the *design matrix* for the **GP basis**.  We can access it with the `PTA.get_basis` method.

The GW and RN models share a common Fourier basis with $60$ coefficients ($30$ sine amplitudes & $30$ cosine amplitudes).

J2317 has $230$ timing model parameters!
There are $19$ "traditional" timing model parameters:
* spin $(2)$: F0, F1
* astrometric $(5)$: ELONG, ELAT, PMELONG, PMELAT, PX
* binary ELL1H $(7)$: A1, PB, TASC, EPS1, EPS2, H3, H4
* FD $(3)$
* jumps $(2)$

But the DMX model has $211$ bins!

This makes $N_\mathrm{coef} = 290$.

In [None]:
# check the number of TOAs


In [None]:
# check the shape of the "T-matrix"


### N-matrix

The block-diagonal white noise covariance $\mathbb{N}$ matrix is accessed via `PTA.get_ndiag`.

It's not much to look at, because it is wrapped in a special class that sets up the Sherman-Morrison matrix inversion scheme.
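
For reference, the Sherman-Morrison formula inverts a diagonal matrix plus a rank-one term without ever forming the dense block, which is exactly the structure of one observation's block: $\mathrm{diag}(d) + c\,\vec{1}\vec{1}^T$ with $c = \mathrm{ECORR}^2$. The sketch below is a standalone illustration with a hypothetical function name and toy numbers, not the class used by `enterprise`.

```python
import numpy as np


def sherman_morrison_solve(d, c, x):
    """Return (diag(d) + c * 1 1^T)^{-1} x without forming the matrix."""
    d = np.asarray(d, dtype=float)
    x = np.asarray(x, dtype=float)
    dinv_x = x / d
    # rank-one correction from the Sherman-Morrison formula
    correction = c * np.sum(dinv_x) / (1.0 + c * np.sum(1.0 / d))
    return dinv_x - correction / d


d = np.array([1.0, 2.0, 4.0])   # diagonal white noise variances of one block
c = 0.5                         # fully correlated variance (ECORR^2)
x = np.array([1.0, 0.0, -1.0])
y = sherman_morrison_solve(d, c, x)
```

The cost is linear in the number of TOAs in the block, instead of cubic for a dense solve.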

In [None]:
# check the output of the "N-matrix"


### B-matrix

The GP prior $\mathbb{B}$ matrix can be broken out as:
$$
  \mathbb{B} =
    \begin{bmatrix}
    \infty & 0\\
    0 & \varphi
    \end{bmatrix},
$$
where $\infty$ is a diagonal matrix of infinities representing the improper, uniform prior on the linear timing model coefficients and $\varphi$ is the Fourier basis prior.
The Fourier modes are approximately orthogonal, so $\varphi$ is diagonal too.
This means the whole $\mathbb{B}$ matrix is diagonal and can be stored as a vector.
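
Storing the diagonal as a vector also makes the inverse trivial, and the infinities simply turn into zeros. Here is a tiny NumPy sketch with made-up sizes and values; an actual implementation might use a very large finite number in place of `np.inf`.

```python
import numpy as np

n_tm = 3                               # toy number of timing model coefficients
phi = np.array([4.0, 4.0, 1.0, 1.0])   # toy Fourier prior variances (sine/cosine pairs)

# the diagonal of B, stored as a vector
b_diag = np.concatenate([np.full(n_tm, np.inf), phi])

# infinite prior variance becomes zero in the inverse,
# i.e. no prior penalty on the timing model coefficients
b_inv_diag = 1.0 / b_diag
```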

Since $\varphi$ is the non-trivial part of the $\mathbb{B}$ matrix, the two aren't always clearly distinguished...
In `enterprise` the `PTA.get_phi` and `PTA.get_phiinv` methods access the $\mathbb{B}$ matrix and its inverse.

In [None]:
# check the shape of the "B-matrix"


### other parts of the covariance matrix
Some other useful combinations of matrices are used in the $\mathbb{C}$ matrix inversion for the likelihood calculation, for example `PTA.get_TNT` and `PTA.get_TNr`.
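
Reading the names as $\mathbb{T}^T\mathbb{N}^{-1}\mathbb{T}$ and $\mathbb{T}^T\mathbb{N}^{-1}\vec{r}$, the sketch below shows how these two small objects are enough to evaluate $\vec{r}\cdot\mathbb{C}^{-1}\cdot\vec{r}$, even when part of the prior is improper. It uses a diagonal $\mathbb{N}$ and random toy values; the variable names are chosen for this example only.

```python
import numpy as np

rng = np.random.default_rng(3)
n_toa, n_coef = 50, 5
T = rng.standard_normal((n_toa, n_coef))   # toy design matrix
ndiag = rng.uniform(0.5, 2.0, n_toa)       # diagonal of N
r = rng.standard_normal(n_toa)             # toy residuals
# inverse prior; the zeros correspond to the improper (infinite variance) prior
phiinv = np.array([0.0, 0.0, 1.0, 0.5, 0.25])

TNT = T.T @ (T / ndiag[:, None])   # T^T N^{-1} T, shape (n_coef, n_coef)
TNr = T.T @ (r / ndiag)            # T^T N^{-1} r, shape (n_coef,)

# r . C^{-1} . r via the Woodbury identity; C itself is never formed
Sigma = TNT + np.diag(phiinv)
rCr = r @ (r / ndiag) - TNr @ np.linalg.solve(Sigma, TNr)
```

Only `Sigma`, an $N_\mathrm{coef}\times N_\mathrm{coef}$ matrix, has to be factored, regardless of how many TOAs there are.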

### s-vector
Our model doesn't have a deterministic signal, $\vec{s}$.  But if it did, we could get it from the `PTA` with the `get_delay` method.

In [None]:
# "s-vector"