# The big hole in our story

Thus far, we've talked about how to time pulsars and about how we actually time pulsars in the IPTA. After this, we'll talk about how to search for a gravitational wave background with ENTERPRISE.

But in real life, there's a very important step in between these processes. That step is called "Single Pulsar Noise Analysis." 

In this not-tutorial, we'll briefly run through why single pulsar noise analysis is important, some of the noise parameters we can calculate, and how we evaluate noise parameters. However, we won't actually run a noise analysis. That's because running a single pulsar noise analysis has a very high ratio of "sitting around" time to "actively running commands" time. 

If you're hoping to get involved in noise analysis for DR3 or you want to learn more, the DR3 noise leads are in the process of prepping a supplemental tutorial on noise!

Another big disclaimer: There are significant differences in approach to noise analysis between the different PTAs. This is a quick intro to "vanilla" noise modeling but may not reflect what your local PTA is doing.

# Why do we do noise analysis?

Consider the schematic below, showing light going from the pulsar all the way to the telescope. 

We've already talked about pulsar spin parameters, pulsar binary parameters, and astrometric parameters. Noise analysis is what lets us (more fully) understand the effects of the interstellar medium, the solar wind, and other things that aren't modeled in our timing model. 

If that description sounds vague, it's because it is vague! Noise analysis can help us learn about intrinsic noise in the pulsar such as spin noise, about effects in our telescope, and about the interstellar medium. With the partial exception of dispersion measure (see below), we don't have a clear analytic model for these processes. Instead, we'll use MCMC methods to find best-fit parameters for both the white (time-uncorrelated) and red (time-correlated) noise present in our PTA datasets.



<img src="images/nb_timing_schematic.png" alt="A schematic showing light traveling from the pulsar to the telescope" width="800" height="auto">

Image Credit: Agazie et al. 2023

# Sources of Noise in PTA Datasets

What are some sources of noise in PTA datasets? 

- Effects that lead to underestimated TOA errors (achromatic -- independent of radio frequency)
    - Pulsar jitter 
    - Giant pulses
    - Mode changing
    - Nulling
- Spin Noise (achromatic -- independent of radio frequency)
  - Rotational Irregularities
  - Glitches
- Profile Changes (chromatic -- depends on radio frequency)
- Binary orbital irregularities (achromatic -- independent of radio frequency)
- Interstellar medium effects (chromatic -- depends on radio frequency)
    - Dispersion
    - Refraction
    - Scintillation
- Solar Wind (chromatic -- a dispersive delay, and spatially correlated)
- Solar System Ephemeris uncertainty (achromatic, but spatially correlated)
- Clock Errors (achromatic, but spatially correlated)
- Observatory based uncertainty (could be chromatic or achromatic)

Obviously, if we talk about each of these, we'll be here all day. But the point is: Noise enters our dataset from EVERYWHERE. 

Next, we'll talk briefly about how we describe these noise effects in a vanilla noise model.

# Dispersion Measure


The most prominent effect on timing comes from pulses traveling through the cold, ionized plasma of the ISM. The index of refraction of the medium is frequency-dependent, resulting in lower frequencies arriving later at the telescope than higher frequencies. The amount of time that a signal will be shifted by is given by

$$t_{\mathrm{DM}} = 4.15~\mathrm{ms~\left(\frac{\nu}{GHz}\right)^{-2}\left(\frac{DM}{pc~cm^{-3}}\right)}$$

where DM is the dispersion measure, the integral of the electron number density along the line of sight, $\mathrm{DM} =\displaystyle{\int}_0^D n_e\, dl$, and $D$ is the distance from the Earth to the pulsar. Because every frequency is delayed by the amount given by the equation above, we can determine the DM by measuring the difference in arrival time between two frequencies $\nu_1$ and $\nu_2$:

$$\Delta t_{\mathrm{DM}} = 4.15~\mathrm{ms~\left[\left(\frac{\nu_1}{GHz}\right)^{-2} - \left(\frac{\nu_2}{GHz}\right)^{-2}\right]\left(\frac{DM}{pc~cm^{-3}}\right)}$$

with the same units as above. Therefore, by measuring the times-of-arrival (TOAs) of pulses at two different frequencies, we can estimate the dispersion measure and remove the effect.
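To make the two equations above concrete, here is a minimal Python sketch. The function names, the DM value, and the two observing frequencies are made up for illustration; only the 4.15 ms constant and the $\nu^{-2}$ scaling come from the text.

```python
# Dispersion constant from the equation above: delay in ms for
# frequency in GHz and DM in pc cm^-3.
K_DM_MS = 4.15


def dm_delay_ms(freq_ghz, dm):
    """Dispersive delay (ms) at freq_ghz relative to infinite frequency."""
    return K_DM_MS * dm * freq_ghz ** -2


def dm_from_two_freqs(delta_t_ms, freq1_ghz, freq2_ghz):
    """Invert the two-frequency delay equation to get DM (pc cm^-3)."""
    return delta_t_ms / (K_DM_MS * (freq1_ghz ** -2 - freq2_ghz ** -2))


# Hypothetical pulsar with DM = 10 pc cm^-3, observed at 0.8 and 1.4 GHz.
dm_true = 10.0
delay_low = dm_delay_ms(0.8, dm_true)   # lower frequency arrives later
delay_high = dm_delay_ms(1.4, dm_true)
delta_t = delay_low - delay_high

dm_recovered = dm_from_two_freqs(delta_t, 0.8, 1.4)
print(f"delay at 0.8 GHz: {delay_low:.2f} ms")
print(f"delay at 1.4 GHz: {delay_high:.2f} ms")
print(f"recovered DM: {dm_recovered:.3f} pc cm^-3")
```

With real data the measured arrival-time difference also carries TOA uncertainty, so the recovered DM has an error bar that this sketch ignores.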

What does this effect look like in our datasets? It looks like higher frequency (bluer) light arriving before lower frequency (redder) light, as shown in the figure below.


Image credit: H. Thankful Cromartie

<img src="images/DM.png" alt="An illustration showing blue pulses arriving before red pulses" width="400" height="auto">

Dispersion measure is in a funny place, because it's both part of our timing model and part of our noise model. For each pulsar, we have a nominal DM value -- the dispersion measure that removes this effect. For a lot of pulsar applications, that's enough. But for high precision timing (like in a PTA), we need a better model for DM. 

If you look carefully at our timing models from the previous tutorials, we've fit for DM, DM1, and DM2: DM and its first and second derivatives. But DM modeling can be even more sophisticated than that! Each PTA uses its own procedure, but in IPTA DR3, we're using a **Gaussian Process** approach.

Gaussian process models represent a series of values $\vec{y}$ as samples from a multivariate Gaussian distribution:
\begin{equation}
p(\vec{y}) = \mathscr{N}(m, \mathbf{C}), 
\end{equation}
where $m$ is the mean value and $\mathbf{C}$ is a covariance matrix.
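As a rough illustration of what "samples from a multivariate Gaussian" means, the sketch below draws a few random time series with NumPy. The squared-exponential kernel, its amplitude and length scale, and the observation epochs are all assumptions chosen for illustration. This is not the DMGP model itself.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical observation epochs (days since the first observation).
t = np.linspace(0.0, 3000.0, 100)

# Assumed squared-exponential kernel; sigma and ell are illustrative values.
sigma = 1e-4   # typical DM variation amplitude (pc cm^-3)
ell = 300.0    # correlation length scale (days)
dt = t[:, None] - t[None, :]
C = sigma**2 * np.exp(-0.5 * (dt / ell) ** 2)

# A small diagonal term keeps the covariance numerically well behaved.
C += 1e-6 * sigma**2 * np.eye(t.size)

# Zero mean: these are deviations from the nominal DM.
m = np.zeros(t.size)

# Each row of `samples` is one random DM(t) time series drawn from N(m, C).
samples = rng.multivariate_normal(m, C, size=3)
print(samples.shape)
```

Changing `ell` changes how quickly the drawn curves wiggle, and changing `sigma` changes how far they wander from the mean. In a real analysis, hyperparameters like these are what the noise model fits for.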

There's a lot more description of how this works in, for example, [Larsen et al. 2024](https://iopscience.iop.org/article/10.3847/1538-4357/ad5291/pdf). But the main point is that the DMGP model allows us to generate a time series for DM through the dataset.

Here's an example time series for J1909-3744 (from Larsen et al. 2024). The x-axis shows the span of the observations in MJD; the y-axis shows the time series deviation from the nominal DM for this pulsar based on the DMGP model.

Image Credit: Larsen et al. 2024
<img src="images/DMGP.png" alt="The DMGP model for J1909-3744" width="400" height="auto">

# Noise Parameters (besides DM)

## White Noise Modeling

The TOA uncertainties we use for pulsar timing derive from the matched filter process of making TOAs, as described in [Taylor 1992](https://ui.adsabs.harvard.edu/abs/1992RSPTA.341..117T/abstract) and [Lommen and Demorest 2013](https://ui.adsabs.harvard.edu/abs/2013CQGra..30v4001L/abstract).

However, there are sources of TOA uncertainty that aren't captured by this estimate, like jitter, spin noise, scattering, or RFI. For that reason, we need to "boost" the white noise level, or overall uncertainty in the TOAs, to reflect reality.

We use three parameters for this process: EFAC ($\mathcal{F}$), EQUAD ($\mathcal{Q}$), and ECORR ($\mathcal{J}$). Each combination of receiver and backend gets its own set of parameters. Since not every system gets an ECORR, the total number of white noise parameters is at most 3 times the number of receiver/backend pairs.

If we refer to each receiver/backend pair as $re/be$, then the components of the covariance matrix are
\begin{equation}
C_{ij} = \mathcal{F}^2(re/be) \left[\sigma_{S/N, i}^2 + \mathcal{Q}^2 (re/be) \right] \delta_{ij} + \mathcal{J}^2(re/be)\mathcal{U}_{ij}
\end{equation}
where $i, j$ denote TOA indices across all observing epochs. The EFAC and EQUAD terms are diagonal, while the ECORR term is block diagonal: $\mathcal{U}$ has entries of 1 for pairs of TOAs from the same observation and 0 everywhere else.
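The covariance matrix above can be built explicitly for a toy case. This is a minimal sketch assuming a single receiver/backend pair, so there is one EFAC, one EQUAD, and one ECORR. The function name, TOA errors, and parameter values are hypothetical.

```python
import numpy as np


def white_noise_cov(toa_errs, epoch_ids, efac, equad, ecorr):
    """White noise covariance for one receiver/backend pair.

    toa_errs  : template-matching TOA uncertainties (sigma_S/N)
    epoch_ids : integer label of the observation each TOA belongs to
    """
    toa_errs = np.asarray(toa_errs, dtype=float)
    epoch_ids = np.asarray(epoch_ids)

    # Diagonal part: EFAC^2 * (sigma^2 + EQUAD^2)
    diag = efac**2 * (toa_errs**2 + equad**2)

    # U is 1 where two TOAs share an observation, 0 otherwise.
    U = (epoch_ids[:, None] == epoch_ids[None, :]).astype(float)

    return np.diag(diag) + ecorr**2 * U


# Four hypothetical TOAs (errors in microseconds) from two observations.
sigma = [0.5, 0.6, 0.4, 0.7]
epochs = [0, 0, 1, 1]
C = white_noise_cov(sigma, epochs, efac=1.1, equad=0.2, ecorr=0.3)
print(np.round(C, 4))
```

Printing `C` shows the structure described in the text: two 2x2 blocks, with the off-diagonal entries inside each block equal to ECORR squared and zeros between different observations.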

## Red Noise Modeling 

The red noise components of the noise model are a combination of all the signals that are time-correlated on long timescales. This can include ISM effects as well as clock errors, solar system ephemeris errors, and ultimately the gravitational wave background.

We describe these models with a power law of the form
\begin{equation}
P = A^2 f^{-\gamma}
\end{equation}
where $A$ is an amplitude and $\gamma$ is a spectral index.
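Here is a short sketch that evaluates this power law on a set of frequencies. The frequency grid (multiples of one over the data span), the 15-year span, and the amplitude are assumptions for illustration, and the units are arbitrary because the equation above leaves out any normalization. The value $\gamma = 13/3$ is the spectral index expected for a gravitational wave background from supermassive black hole binaries.

```python
import numpy as np


def powerlaw_psd(f, A, gamma):
    """Power law spectrum P = A^2 f^(-gamma), in arbitrary units."""
    return A**2 * np.asarray(f, dtype=float) ** (-gamma)


# Assumed setup: a 15 yr dataset sampled at frequencies i / T.
T_span_yr = 15.0
n_freqs = 30
f = np.arange(1, n_freqs + 1) / T_span_yr   # cycles per year

P = powerlaw_psd(f, A=1.0, gamma=13.0 / 3.0)

# "Red" means the lowest frequencies carry the most power.
print(f"P at lowest frequency:  {P[0]:.3e}")
print(f"P at highest frequency: {P[-1]:.3e}")
```

Because the spectrum is so steep, the first few frequency bins dominate, which is why long data spans matter so much for red noise and gravitational wave background searches.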

## Noise Parameter Summary

From Agazie et al. 2023
![noise_param_table](images/noise_param_table.png)



# Evaluating Noise Parameters

Noise parameters are generally found via Markov Chain Monte Carlo sampling. We then assess them based on the shape of the posteriors (and the general values of the parameters).
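To give a feel for what the sampler is doing, here is a toy Metropolis sampler that recovers a single EFAC from simulated white residuals. Everything here is an assumption for illustration: the simulated data, the flat prior range, and the proposal width. A real analysis samples many parameters at once with dedicated tools, not a hand-written loop like this one.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Simulated residuals: pure white noise, with TOA errors underestimated
# by a factor of efac_true (all values hypothetical).
n_toa = 200
sigma = np.full(n_toa, 1.0)
efac_true = 1.5
resid = rng.normal(0.0, efac_true * sigma)


def log_posterior(efac):
    """Gaussian log-likelihood plus a flat prior on EFAC in [0.1, 5]."""
    if not 0.1 <= efac <= 5.0:
        return -np.inf
    var = (efac * sigma) ** 2
    return -0.5 * np.sum(resid**2 / var + np.log(var))


n_steps = 5000
chain = np.empty(n_steps)
current = 1.0
current_lp = log_posterior(current)

for i in range(n_steps):
    # Propose a small random step and accept it with the Metropolis rule.
    proposal = current + rng.normal(0.0, 0.1)
    proposal_lp = log_posterior(proposal)
    if np.log(rng.uniform()) < proposal_lp - current_lp:
        current, current_lp = proposal, proposal_lp
    chain[i] = current

# Drop the first part of the chain as burn-in.
posterior = chain[1000:]
print(f"EFAC = {posterior.mean():.2f} +/- {posterior.std():.2f}")
```

A histogram of `posterior` is the kind of plot we inspect when judging whether a noise parameter looks reasonable.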

Below, I've reproduced the best resource I've ever found on evaluating noise parameter posteriors. (If you're a NANOGrav Timer, you already know what this is.)

...

It's Jeff Hazboun's Legendary Happy & Sad Posterior Cartoons!

## Reasonable (Happy) Posteriors

The posteriors for the noise parameters usually fall into three categories, dictated by the shape of the posterior.

### Happy Posterior Shapes:
* A very constrained parameter, with a Gaussian-like posterior. [High Significance]
* A mildly constrained parameter, with part that is Gaussian-like and part that looks like it is dominated by the flat prior. [Low Significance]
* An unconstrained parameter, except that some portion of the highest end of the parameter space is excluded. [Upper Limit]

Examples of these are sketched below. 

### Happy Posterior Cartoons
<img src="images/noise_posteriors_happy.jpeg" alt="Happy Posteriors" width="1000" height="auto">


## Unusual Posteriors
Here we summarize some of the common forms of parameters that do not fall into the usual and expected categories above. 

### Unusual Posterior Shapes:
* A posterior that is pushed up against the highest end of its prior range. [Pushing against prior]
* A completely unconstrained parameter that is basically returning its flat, uniform prior. [Filling Prior]
* An EFAC that is significantly larger than 1. [High EFAC]
* An EFAC that extends down to values significantly below 1, with EQUAD values that go to high values at the same time. [Low EFAC and covariant EQUAD]
### Sad Posterior Cartoons
<img src="images/noise_posteriors_sad.jpeg" alt="Sad Posteriors" width="1000" height="auto">
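Looking at the plots is the real test, but the "pushing against prior" and "filling prior" shapes can also be flagged with a crude number: the fraction of posterior samples that land in the top slice of the prior range. The sketch below is a rough heuristic for illustration only, not an established procedure. The function name, the 10% edge width, the prior range, and the synthetic chains are all assumptions.

```python
import numpy as np


def prior_edge_fraction(samples, prior_min, prior_max, edge=0.1):
    """Fraction of samples in the top `edge` fraction of the prior range."""
    samples = np.asarray(samples, dtype=float)
    cut = prior_max - edge * (prior_max - prior_min)
    return float(np.mean(samples > cut))


rng = np.random.default_rng(seed=2)
prior_min, prior_max = -20.0, -11.0   # hypothetical log10 amplitude prior

# Synthetic stand-ins for real chains.
filling = rng.uniform(prior_min, prior_max, size=10_000)
pushing = prior_max - np.abs(rng.normal(0.0, 0.3, size=10_000))
upper_limit = rng.uniform(prior_min, -14.0, size=10_000)

frac_filling = prior_edge_fraction(filling, prior_min, prior_max)
frac_pushing = prior_edge_fraction(pushing, prior_min, prior_max)
frac_upper = prior_edge_fraction(upper_limit, prior_min, prior_max)

print(f"filling prior:         {frac_filling:.2f}")
print(f"pushing against prior: {frac_pushing:.2f}")
print(f"upper limit:           {frac_upper:.2f}")
```

A fraction close to `edge` is what a flat, prior-dominated posterior gives near the top of the range. A fraction far above `edge` suggests the posterior is piling up against the upper bound. A fraction near zero is consistent with an upper limit or a well-constrained parameter.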

### Possible reasons for unusual posteriors
The reasons below should be added to as we investigate unusual noise posteriors. Currently most of these reasons are anecdotal. 

*They are listed in order of how much you should worry about them.*

#### Pushing Against Prior
If we actually believe that the value for a parameter pushing up against a prior is larger than the prior range, then the prior range should be extended.
* Significant outliers present
* Non-converged timing model
* ...

#### Filling Prior
* Too few TOAs to inform this parameter
* ...

#### _While uncommon, the following types of posteriors have been found in past datasets_

#### Low EFAC / Covariant EQUAD
* Too few TOAs to break degeneracy between EFAC and EQUAD
* ...

#### High EFAC
A few pulsars, like B1937+21, have large EFAC values, probably due to large scattering and profile mismatch. EFACs >4 have been used in past datasets.
* Scattered profiles
* ...

# Further Reading

This is by no means a comprehensive list, but I've tried to compile a few papers that might help you learn more about noise modeling for PTAs.

More than anything else, this reading is inspired by [Agazie et al., 2023. "The NANOGrav 15 yr Data Set: Detector Characterization and Noise Budget"](https://ui.adsabs.harvard.edu/abs/2023ApJ...951L..10A/abstract). The lead authors on this paper were Jeff Hazboun and Michael Lam -- if you want to talk more about noise, I know they'd love to. 

Some sections are also derived from existing NANOGrav tutorials on DM & noise processes.


Other selected useful reading:\
[Verbiest et al., 2009. "Timing stability of millisecond pulsars and prospects for gravitational-wave detection"](https://ui.adsabs.harvard.edu/abs/2009MNRAS.400..951V/abstract)\
[Lam et al., 2016. "Systematic and Stochastic Variations in Pulsar Dispersion Measures"](https://ui.adsabs.harvard.edu/abs/2016ApJ...821...66L/abstract)\
[Cordes & Shannon, 2010. "A Measurement Model for Precision Pulsar Timing"](https://ui.adsabs.harvard.edu/abs/2010arXiv1010.3785C/abstract)\
[Goncharov et al., 2021. "Identifying and mitigating noise sources in precision pulsar timing data sets"](https://ui.adsabs.harvard.edu/abs/2021MNRAS.502..478G/abstract)\
[Chen et al., 2023. "The Chinese Pulsar Timing Array data release I. Single pulsar noise analysis"](https://ui.adsabs.harvard.edu/abs/2025arXiv250604850C/abstract)\
[Srivastava et al., 2023. "Noise analysis of the Indian Pulsar Timing Array data release I"](https://ui.adsabs.harvard.edu/abs/2024asi..confO..35S/abstract)\
[Chalumeau et al., 2023. "Noise analysis in the European Pulsar Timing Array data release 2 and its implications on the gravitational-wave background search"](https://ui.adsabs.harvard.edu/abs/2022MNRAS.509.5538C/abstract)\
[Reardon et al., 2023. "The Gravitational-wave Background Null Hypothesis: Characterizing Noise in Millisecond Pulsar Arrival Times with the Parkes Pulsar Timing Array"](https://ui.adsabs.harvard.edu/abs/2023ApJ...951L...7R%2F/abstract)\
[Larsen et al., 2024. "The NANOGrav 15 yr Data Set: Chromatic Gaussian Process Noise Models for Six Pulsars"](https://ui.adsabs.harvard.edu/abs/2024ApJ...972...49L/abstract)\
[Miles et al., 2025. "The MeerKAT Pulsar Timing Array: the 4.5-yr data release and the noise and stochastic signals of the millisecond pulsar population"](https://ui.adsabs.harvard.edu/abs/2025MNRAS.536.1467M/abstract)
