# Using `enterprise` to analyze single pulsar noise

In this notebook you will learn:
* How to use `enterprise` to interact with PTA data,
* How to search PTA data for GWs using a single pulsar,
* How to post-process your results.

If you would like to work through this notebook without installing the software, we have prepared a [Google Colab notebook](https://colab.research.google.com/drive/1sBALRUi6wCykAAKH8Lp5TdS69QUmNgZq#scrollTo=t1FXF9NO5HpG).

By copying that notebook to your own Google Colab account, you can install and run the software there without installing anything on your computer.

This notebook is intended to give quick results while demonstrating the basic process of looking for red noise in a single pulsar with the white noise fixed. If you are interested in searching for the GWB with more pulsars, or if you are interested in correlations of the red noise, please see `pta_gwb_analysis.ipynb`.

# Load packages and modules

In [1]:
from __future__ import division

%matplotlib inline
%config InlineBackend.figure_format = 'retina'
%load_ext autoreload
%autoreload 2

import os, glob, json, pickle
import matplotlib.pyplot as plt
import numpy as np
import scipy.linalg as sl

import enterprise
from enterprise.pulsar import Pulsar
import enterprise.signals.parameter as parameter
from enterprise.signals import utils
from enterprise.signals import signal_base
from enterprise.signals import selections
from enterprise.signals.selections import Selection
from enterprise.signals import white_signals
from enterprise.signals import gp_signals
from enterprise.signals import deterministic_signals
import enterprise.constants as const

import corner
from PTMCMCSampler.PTMCMCSampler import PTSampler as ptmcmc


import precession_model



## Get par, tim, and noise files

In [2]:
psrlist = None # define a list of pulsar name strings that can be used to filter.

In [3]:
# set the data directory
datadir = '../data'
print(datadir)

../data


In [4]:
# for the entire pta
psrname = 'B1937+21'
parfiles = sorted(glob.glob(datadir + '/par/' + psrname + '*par'))
timfiles = sorted(glob.glob(datadir + '/tim/' + psrname + '*tim'))

print(parfiles)
print(timfiles)

['../data/par/B1937+21_NANOGrav_12yv3.gls.par']
['../data/tim/B1937+21_NANOGrav_12yv3.tim']


## Load into Pulsar class list
* The `enterprise` `Pulsar` class uses `libstempo` (or optionally `PINT`) to read in `par` and `tim` files, then stores all pulsar data in a `Pulsar` object. This object contains all the data and metadata needed for the ensuing pulsar and PTA analysis. You no longer need to reference the `par` and `tim` files after this cell.
* Note below that you can explicitly declare which version of the JPL solar-system ephemeris (e.g. `DE438`) is used to compute the Roemer delay between the geocenter and the barycenter. Otherwise the default value is taken from the `par` file. Explicitly declaring the version here is good practice.

In [5]:
psrs = []
ephemeris = 'DE438'
for p, t in zip(parfiles, timfiles):
    psr = Pulsar(p, t, ephem=ephemeris)
    psrs.append(psr)




* We can read in previously computed noise properties from single-pulsar white noise analyses. These are things like `EFAC`, `EQUAD`, and (for `NANOGrav`) `ECORR`.
* In practice, we set these white-noise properties as fixed in the low-frequency noise / GW searches.
* See `singlepulsar_whitenoise_analysis.ipynb` for the methods used to find these values.
* The noise properties have been stored as `json` files, and are read into one large parameter dictionary.
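
To make the structure of such a noise dictionary concrete, here is a minimal sketch that loads a small JSON excerpt and keeps only the entries for one pulsar. The key pattern `<pulsar>_<backend>_<parameter>` is an assumption modeled on how `enterprise` names backend-specific parameters; the backend names and all numerical values are invented for illustration and are not taken from the 12.5-year noise file.

```python
import json

# Hypothetical excerpt of a noise file; keys and values are illustrative only.
noise_json = """
{
  "B1937+21_L-wide_PUPPI_efac": 1.02,
  "B1937+21_L-wide_PUPPI_log10_t2equad": -7.1,
  "B1937+21_L-wide_PUPPI_log10_ecorr": -7.4,
  "J1713+0747_L-wide_PUPPI_efac": 0.98
}
"""

# Same pattern as the notebook: collect everything in one flat dictionary.
params = {}
params.update(json.loads(noise_json))

# Keep only the entries that belong to a single pulsar.
psrname = "B1937+21"
psr_params = {k: v for k, v in params.items() if k.startswith(psrname + "_")}

for name, value in sorted(psr_params.items()):
    print(f"{name:45s} {value}")
```

Because the dictionary is flat, the pulsar name and backend embedded in each key are what later allow the values to be matched to the right `Constant` parameter.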

In [6]:
## Get parameter noise dictionary
noise_ng12 = datadir + '/channelized_12p5yr_v3_full_noisedict.json'
print(noise_ng12)
params = {}
with open(noise_ng12, 'r') as fp:
    params.update(json.load(fp))

../data/channelized_12p5yr_v3_full_noisedict.json


In [7]:
# find the maximum time span to set GW frequency sampling
tmin = [p.toas.min() for p in psrs]
tmax = [p.toas.max() for p in psrs]
Tspan = np.max(tmax) - np.min(tmin)
print(tmin)
print(tmax)

[4602276511.100282]
[5005342074.766501]
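
As a quick sanity check on these numbers, the sketch below converts the TOA span printed above from seconds to years and builds the Fourier frequencies `k / Tspan` that a standard power-law red-noise or GW model would sample. The choice of 30 components is an assumption made here for illustration.

```python
import numpy as np

# TOA range printed above, in seconds
tmin = 4602276511.100282
tmax = 5005342074.766501
Tspan = tmax - tmin

SECONDS_PER_YEAR = 365.25 * 86400.0  # Julian year
Tspan_yr = Tspan / SECONDS_PER_YEAR

# Fourier frequencies k / Tspan for k = 1..n_components
n_components = 30  # assumed value, for illustration
freqs = np.arange(1, n_components + 1) / Tspan

print(f"Tspan = {Tspan_yr:.2f} yr")
print(f"lowest frequency  = {freqs[0]:.3e} Hz")
print(f"highest frequency = {freqs[-1]:.3e} Hz")
```

The lowest frequency is `1 / Tspan`, so a longer data set extends the search to lower frequencies.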


* Usually, in a full PTA analysis we fix all of the white noise (EFAC, EQUAD, and ECORR) parameters to the values obtained from the noise files. This is done by using `Constant` parameters. Here we do not give the `Constant` parameters a value when they are created; instead, the values are assigned later, keyed by each parameter's pulsar- and backend-specific name, via the `set_default_params` method of `PTA`.

* For a single pulsar, it is not necessary to hold the white noise constant, but the computation time grows with the number of free parameters. For this notebook, we hold it constant.

* We use the `Selection` object to define which noise parameters are assigned to which chunks of TOAs. This selection is based on unique combinations of backends and receivers.
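
The idea behind a backend selection can be sketched with plain `numpy`: every unique backend flag gets a boolean mask over the TOAs, and each mask then receives its own set of white-noise parameters. The function `by_backend_sketch` and the flag values below are hypothetical stand-ins written for this illustration; they are not the `enterprise` implementation.

```python
import numpy as np

def by_backend_sketch(backend_flags):
    """Return one boolean TOA mask per unique backend flag value."""
    flags = np.asarray(backend_flags)
    return {str(val): flags == val for val in np.unique(flags)}

# Hypothetical backend flags for six TOAs
backend_flags = ["L-wide_PUPPI", "S-wide_PUPPI", "L-wide_PUPPI",
                 "L-wide_ASP", "S-wide_PUPPI", "L-wide_ASP"]

masks = by_backend_sketch(backend_flags)
for backend, mask in masks.items():
    # indices of the TOAs that belong to this backend
    print(backend, np.flatnonzero(mask))
```

Every TOA falls in exactly one mask, so the masks partition the data set.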

In [8]:
# define selection by observing backend
selection = selections.Selection(selections.by_backend)

### Parameters
* For this **detection** search, we will use a `Uniform` prior on the red noise, and set the white noise parameters as `Constant` with values added in later from the noise dictionary.

* In a single pulsar analysis, we can't look at spatial correlations, so we exclude them here. They are covered in the full PTA analysis in the `pta_gwb_analysis.ipynb` notebook.
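
To illustrate what a `Uniform` prior contributes to the sampling, here is a minimal `numpy` sketch of a normalized uniform log-prior together with a random draw from it. The helper `uniform_logprior` is written for this example and is not part of `enterprise`; the bounds are the ones this notebook uses for `log10_A_dm`.

```python
import numpy as np

def uniform_logprior(x, pmin, pmax):
    """Log of a normalized uniform density on [pmin, pmax]; -inf outside."""
    x = np.asarray(x, dtype=float)
    inside = (x >= pmin) & (x <= pmax)
    return np.where(inside, -np.log(pmax - pmin), -np.inf)

rng = np.random.default_rng(0)
pmin, pmax = -20.0, -11.0  # log10_A_dm bounds used in this notebook

# Draw a starting value from the prior
x0 = rng.uniform(pmin, pmax)

print(uniform_logprior(x0, pmin, pmax))     # finite: -log(pmax - pmin)
print(uniform_logprior(-10.0, pmin, pmax))  # -inf: outside the support
```

A proposed point outside the prior range has zero prior density, so its log-prior is `-inf` and the sampler always rejects it.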

In [9]:
# white noise parameters
efac = parameter.Constant() 
equad = parameter.Constant() 
ecorr = parameter.Constant() # we'll set these later with the params dictionary

# red noise parameters

# dm-variation parameters (defined here, but not used in the model built below)
log10_A_dm = parameter.Uniform(-20, -11)
gamma_dm = parameter.Uniform(0, 7)


### Signals

In [10]:
# white noise
ef = white_signals.MeasurementNoise(efac=efac, log10_t2equad=equad, selection=selection)
ec = white_signals.EcorrKernelNoise(log10_ecorr=ecorr, selection=selection)

# red noise: custom block from the local precession_model module
# (its parameters are the ones listed by pta.param_names below)
rn = precession_model.RedNoise_delay_block()

# timing model
tm = gp_signals.TimingModel(use_svd=True)
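
The effect of the two white-noise terms passed to `MeasurementNoise` can be sketched as follows. This assumes the TEMPO2-style convention implied by the `log10_t2equad` argument, in which the variance of each TOA is `EFAC**2 * (sigma_TOA**2 + EQUAD**2)`. The function name and the numbers are made up for illustration.

```python
import numpy as np

def scaled_toa_variance(toaerrs, efac, log10_t2equad):
    """TOA variance under the assumed convention efac^2 * (sigma^2 + equad^2)."""
    toaerrs = np.asarray(toaerrs, dtype=float)
    equad = 10.0 ** log10_t2equad
    return efac**2 * (toaerrs**2 + equad**2)

toaerrs = np.array([1.0e-7, 2.0e-7, 5.0e-7])  # hypothetical TOA errors [s]
var = scaled_toa_variance(toaerrs, efac=1.1, log10_t2equad=-7.0)

print(np.sqrt(var))  # effective per-TOA uncertainties [s]
```

With `efac > 1` and a nonzero `EQUAD`, every effective uncertainty is larger than the quoted TOA error, which is the usual situation when the template-fitting errors underestimate the true scatter.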

In [11]:
# full model
s = ef + ec + rn + tm 
print(s)

<class 'enterprise.signals.signal_base.SignalCollection.<locals>.SignalCollection'>


In [12]:
# initialize PTA (this cell will take a minute or two to run)
models = []
        
for p in psrs:    
    models.append(s(p))
    
pta = signal_base.PTA(models)

In [13]:
pta.param_names

['RedNoise_P', 'RedNoise_a1', 'RedNoise_a2', 'RedNoise_k', 'RedNoise_t0']

In [14]:
# set white noise parameters with dictionary
pta.set_default_params(params)

In [15]:
# set initial parameters drawn from prior
x0 = np.hstack([p.sample() for p in pta.params])
ndim = len(x0)

In [16]:
# set up the sampler:
# initial jump covariance matrix
cov = np.diag(np.ones(ndim) * 0.01**2)
outDir = '../chains_pta_gwb'

sampler = ptmcmc(ndim, pta.get_lnlikelihood, pta.get_lnprior, cov, 
                 outDir=outDir, resume=False)
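
To show the role of the initial jump covariance matrix, here is a plain Metropolis sampler with a fixed Gaussian proposal, applied to a toy two-dimensional Gaussian target. This is a deliberately simplified stand-in: `PTMCMCSampler` combines several adaptive jump proposals (the `SCAMweight`, `AMweight`, and `DEweight` arguments used below) and is not reproduced here. The function `metropolis` and the toy target are assumptions of this example.

```python
import numpy as np

def metropolis(lnpost, x0, cov, nsteps, seed=0):
    """Plain Metropolis sampler with a fixed Gaussian jump proposal."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    lp = lnpost(x)
    chain = np.empty((nsteps, x.size))
    accepted = 0
    for i in range(nsteps):
        y = rng.multivariate_normal(x, cov)  # propose a jump around x
        lp_y = lnpost(y)
        # accept with probability min(1, exp(lp_y - lp))
        if np.log(rng.uniform()) < lp_y - lp:
            x, lp = y, lp_y
            accepted += 1
        chain[i] = x
    return chain, accepted / nsteps

def lnpost(x):
    """Toy target: two-dimensional standard normal (up to a constant)."""
    return -0.5 * np.sum(x**2)

ndim = 2
cov = np.diag(np.ones(ndim) * 0.5**2)  # jump covariance sets the step size
chain, acc = metropolis(lnpost, np.zeros(ndim), cov, nsteps=20000)

print("acceptance rate:", acc)
print("posterior mean :", chain[5000:].mean(axis=0))
```

If the jump covariance is much smaller than the posterior width, nearly every step is accepted but the chain moves slowly; if it is much larger, most proposals are rejected. Adaptive samplers tune this during the run.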

In [None]:
# sample for N steps (run time depends on the model; see the progress output below)
N = int(1e6)  # normally, we would use 5e6 samples (this will save time)
x0 = np.hstack([p.sample() for p in pta.params])
sampler.sample(x0, N, SCAMweight=30, AMweight=15, DEweight=50)

  logpdf = np.log(self.prior(value, **kwargs))


Finished 1.00 percent in 1536.097142 s Acceptance rate = 0.0291111
Adding DE jump with weight 50
Finished 53.40 percent in 238236.409410 s Acceptance rate = 0.151996

In [None]:
chain = np.loadtxt(os.path.join(outDir, 'chain_1.txt'))
burn = int(0.25 * chain.shape[0])
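
The burn-in cut and a basic numerical summary can be sketched on a synthetic chain. The example assumes the usual `PTMCMCSampler` file layout, in which the parameter columns are followed by four bookkeeping columns (log-posterior, log-likelihood, and two acceptance rates); the random numbers below merely stand in for `chain_1.txt`.

```python
import numpy as np

rng = np.random.default_rng(1)
nsamp, npar = 4000, 5

# Synthetic stand-in for chain_1.txt: 5 parameter columns + 4 bookkeeping columns
chain = rng.normal(size=(nsamp, npar + 4))

burn = int(0.25 * chain.shape[0])  # drop the first 25% as burn-in
samples = chain[burn:, :npar]      # keep the parameter columns only

# Median and central 68% credible interval for each parameter
lo, med, hi = np.percentile(samples, [16, 50, 84], axis=0)
for i in range(npar):
    print(f"param {i}: {med[i]:+.3f} (+{hi[i] - med[i]:.3f} / -{med[i] - lo[i]:.3f})")
```

The 25% burn-in fraction is a convention rather than a rule; the trace plots are the place to check whether it is enough for a given run.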

In [None]:
ind_k = list(pta.param_names).index('RedNoise_k')
ind_P = list(pta.param_names).index('RedNoise_P')
ind_a1 = list(pta.param_names).index('RedNoise_a1')
ind_a2 = list(pta.param_names).index('RedNoise_a2')
ind_t0 = list(pta.param_names).index('RedNoise_t0')

In [None]:
# Make trace-plot to diagnose sampling
plt.plot(chain[burn:, ind_k])

In [None]:
plt.plot(chain[burn:, ind_P])

In [None]:
plt.plot(chain[burn:, ind_a1])

In [None]:
plt.plot(chain[burn:, ind_a2])

In [None]:
plt.plot(chain[burn:, ind_t0])
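
Trace plots can be complemented by a numerical check of how strongly successive samples are correlated. The sketch below estimates the autocorrelation at a few lags for a synthetic correlated chain (an AR(1) process chosen for this example); with a real run, a column such as `chain[burn:, ind_k]` would take the place of `x`.

```python
import numpy as np

def autocorr(x, lag):
    """Sample autocorrelation of a 1-D chain at a given positive lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return np.dot(x[:-lag], x[lag:]) / np.dot(x, x)

# Synthetic correlated chain: AR(1) process with coefficient 0.9
rng = np.random.default_rng(2)
n, phi = 50000, 0.9
x = np.empty(n)
x[0] = 0.0
for i in range(1, n):
    x[i] = phi * x[i - 1] + rng.normal()

for lag in (1, 10, 50):
    print(lag, round(autocorr(x, lag), 3))
```

A chain whose autocorrelation decays slowly contains fewer independent samples than its length suggests, so it needs to run longer or be thinned before the histograms are trusted.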

In [None]:
# Plot a histogram of the marginalized posterior distribution
plt.hist(chain[burn:,ind_k], 50, density=True, histtype='stepfilled', 
         lw=2, color='C0', alpha=0.5)
plt.xlabel('RedNoise_k')
plt.ylabel('PDF')

In [None]:
plt.hist(chain[burn:,ind_P], 50, density=True, histtype='stepfilled', 
         lw=2, color='C0', alpha=0.5)
plt.xlabel('RedNoise_P')
plt.ylabel('PDF')

In [None]:
plt.hist(chain[burn:,ind_a1], 50, density=True, histtype='stepfilled', 
         lw=2, color='C0', alpha=0.5)
plt.xlabel('RedNoise_a1')
plt.ylabel('PDF')

In [None]:
plt.hist(chain[burn:,ind_a2], 50, density=True, histtype='stepfilled', 
         lw=2, color='C0', alpha=0.5)
plt.xlabel('RedNoise_a2')
plt.ylabel('PDF')

In [None]:
plt.hist(chain[burn:,ind_t0], 50, density=True, histtype='stepfilled', 
         lw=2, color='C0', alpha=0.5)
plt.xlabel('RedNoise_t0')
plt.ylabel('PDF')
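
For the y axis to be a probability density, as the `PDF` label implies, the histogram has to be normalized so that it integrates to one. The sketch below checks this with `np.histogram` on synthetic samples that stand in for one column of the chain; `plt.hist` accepts the same `density=True` argument.

```python
import numpy as np

rng = np.random.default_rng(3)
# Synthetic stand-in for one post-burn-in parameter column of the chain
samples = rng.normal(loc=2.0, scale=0.5, size=20000)

# density=True rescales the bin counts so the histogram integrates to one
pdf, edges = np.histogram(samples, bins=50, density=True)
widths = np.diff(edges)

print(np.sum(pdf * widths))
```

Without `density=True` the bar heights are raw sample counts, which depend on the chain length and the bin width.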