# Probability Distributions for Cospectra

We are trying to derive the probability distributions of cospectra (the real part of a cross spectrum). 

This notebook contains some practical experiments to confirm that our maths is correct.

First, let's make some light curves:

In [1]:
%load_ext autoreload
%autoreload 2
%matplotlib notebook
import matplotlib.pyplot as plt
import seaborn as sns

sns.set_context('talk')
sns.set_style("whitegrid")
sns.set_palette("colorblind")

import numpy as np
import scipy.stats
import scipy.special
import scipy.fftpack

from stingray import Lightcurve, Crossspectrum
np.random.seed(1209432)


We're going to make simple, flat Poisson distributed light curves:

In [2]:
time = np.linspace(0, 10, 1000000)

counts1 = np.random.poisson(10, time.shape[0])  
counts2 = np.random.poisson(10, time.shape[0])  

lc1 = Lightcurve(time, counts1)
lc2 = Lightcurve(time, counts2)

Let's plot those:

In [3]:
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(10, 8), sharex=True)
ax1.plot(lc1.time, lc1.counts, drawstyle="steps-mid", lw=1, color="black")
ax2.plot(lc2.time, lc2.counts, drawstyle="steps-mid", lw=1, color="black")
ax1.set_xlim(lc1.time[0], lc1.time[-1])
ax2.set_xlim(lc1.time[0], lc1.time[-1])

ax2.set_xlabel("Time [s]")
ax1.set_ylabel("Counts/bin")
ax2.set_ylabel("Counts/bin")

plt.tight_layout()


Okay, now we can compute the Fourier transform for each light curve:

In [4]:
fourier1 = scipy.fftpack.fft(lc1.counts)
fourier2 = scipy.fftpack.fft(lc2.counts)

freqs = scipy.fftpack.fftfreq(lc1.n, lc1.dt)

fourier1 = fourier1[freqs>0]
fourier2 = fourier2[freqs>0]

The real and imaginary parts of the Fourier amplitudes are each normally distributed with a mean of 0 and a standard deviation of $\sqrt{\sum_{k=1}^N{x_k}/2}$:

In [5]:
norm = scipy.stats.norm(0, np.sqrt(np.sum(lc1.counts)/2.0))

u = np.linspace(-10000, 10000, 100000)
prob_norm = norm.pdf(u)
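
The width of this normal distribution follows from the Poisson variance of the counts: for a flat light curve with mean rate $\lambda$ per bin and $j \neq 0$, $\mathrm{Var}[\mathrm{Re}\,F_j] = \sum_k \mathrm{Var}(x_k)\cos^2(2\pi jk/N) = \lambda N/2 \approx \sum_k x_k/2$. Here is a minimal stand-alone sketch of that check. The seed, rate and number of bins are arbitrary choices for illustration, and it uses `numpy.fft` rather than `scipy.fftpack` only to keep the example self-contained:

```python
import numpy as np

# Hypothetical stand-alone check; seed, rate and length are arbitrary choices.
rng = np.random.default_rng(42)
n_bins = 2**16
counts = rng.poisson(10, n_bins)

# rfft returns frequencies from 0 up to Nyquist; drop the zero-frequency and
# Nyquist terms, which are purely real and follow a different distribution.
fourier = np.fft.rfft(counts)[1:-1]

# Expected standard deviation of the real and imaginary parts
expected_std = np.sqrt(np.sum(counts) / 2.0)

ratio_real = np.std(fourier.real) / expected_std
ratio_imag = np.std(fourier.imag) / expected_std
print(ratio_real, ratio_imag)
```

Both ratios should come out close to 1 if the expected width is right.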

In [6]:
fig, ax = plt.subplots(1, 1, figsize=(6,4))
ax.hist(fourier1.real, bins=500, density=True, color="black", alpha=0.5,
        histtype="stepfilled", label="real part of \n Fourier amplitudes");

ax.plot(u, prob_norm, lw=2, color="red", label=r"$p(x) = N\left(0, \sqrt{\sum_{k=1}^N{x_k}/2}\right)$")

ax.set_xlabel("Fourier amplitude")
ax.set_ylabel("Probability density")

ax.legend(prop={"size":10})
plt.tight_layout()


Okay, cool. This is the same for the other amplitudes (both real and imaginary).

We now need to calculate the probability distribution of the product of two such variables, $Z = A_{X,j} A_{Y,j}$.
This turns out to be related to the zeroth-order modified Bessel function of the second kind, $K_0$.

We are going to need the standard deviation of our Fourier amplitudes:

In [7]:
std1 = np.sqrt(np.sum(lc1.counts)/2.0)
std2 = np.sqrt(np.sum(lc2.counts)/2.0)

Now we can define the probability distribution for the product of the two amplitudes:

In [8]:
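def bessel_probability(x, std1, std2):
    """PDF of the product of two zero-mean normal variables with widths std1 and std2."""
    x_abs = np.abs(x)
    both_std = std1 * std2
    y = x_abs/(both_std)
    order = 0
    # kn is the modified Bessel function of the second kind of integer order
    k = scipy.special.kn(order, y)/(np.pi*both_std)
    return k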
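
A quick sanity check on this density is that it integrates to one. The sketch below restates it for unit widths and integrates it numerically with `scipy.integrate.quad`. The helper name `product_pdf` is hypothetical and not part of the notebook; it uses `scipy.special.k0`, which is the order-zero case of `kn`:

```python
import numpy as np
import scipy.integrate
import scipy.special

def product_pdf(z, std1=1.0, std2=1.0):
    """Density of the product of two independent zero-mean normal variables."""
    both_std = std1 * std2
    return scipy.special.k0(np.abs(z) / both_std) / (np.pi * both_std)

# The density is symmetric with an integrable logarithmic peak at z = 0,
# so integrate the positive half in two pieces and double the result.
near, _ = scipy.integrate.quad(product_pdf, 0.0, 1.0)
far, _ = scipy.integrate.quad(product_pdf, 1.0, np.inf)
total = 2.0 * (near + far)
print(total)
```

Splitting the integral at 1 is just a convenience, so that the peak at zero and the infinite tail are handled separately.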

And make the corresponding multiplication for our simulated amplitudes:

In [9]:
a_prod = fourier1.real * fourier2.real

Let's compare this to the theoretically expected probability:

In [10]:
u = np.linspace(np.min(a_prod), np.max(a_prod), 100000)
k_prod = bessel_probability(u, std1, std2)

In [11]:
fig, ax = plt.subplots(1, 1, figsize=(6,4))
ax.hist(a_prod, bins=500, density=True, color="black", alpha=0.5,
        histtype="stepfilled", label="product of real Fourier amplitudes");

ax.plot(u, k_prod, lw=2, color="red", label="product probability distribution")

ax.set_xlabel("product of Fourier amplitude")
ax.set_ylabel("Probability density")

ax.set_xlim(-2e7, 2e7)
ax.set_ylim(0, 7.5e-7)
ax.legend(prop={"size":10})
plt.tight_layout()


That looks pretty good! Let's now define a Laplace distribution for the cospectral densities:

In [12]:
lapl = scipy.stats.laplace(0, std1*std2)

lapl_prob = lapl.pdf(u)

And we compute the co-spectral densities:

In [13]:
csd = (fourier1.real*fourier2.real + fourier1.imag*fourier2.imag)
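
The reason a Laplace distribution shows up is that the cospectrum at each frequency is the sum of two independent products of normal variables, and the sum of two such products follows a Laplace distribution with scale `std1*std2`. Here is a small Monte Carlo sketch of that statement for unit-width normals, where the Laplace scale is 1, the variance is 2 and the 95th percentile is $-\ln(0.1)$. The seed and sample size are arbitrary choices:

```python
import numpy as np
import scipy.stats

# Hypothetical stand-alone check with unit-variance normal "amplitudes"
rng = np.random.default_rng(0)
n = 400_000
re1, im1, re2, im2 = rng.standard_normal((4, n))

# Same combination as the cospectrum: product of real parts plus product of imaginary parts
cosp = re1 * re2 + im1 * im2

lapl_unit = scipy.stats.laplace(0, 1)
sample_var = np.var(cosp)
sample_q95 = np.quantile(cosp, 0.95)

print(sample_var, lapl_unit.var())
print(sample_q95, lapl_unit.ppf(0.95))
```

The sample variance and quantile should agree with the Laplace values to within the Monte Carlo scatter.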

In [14]:
fig, ax = plt.subplots(1, 1, figsize=(6,4))
ax.hist(csd, bins=500, density=True, color="black", alpha=0.5,
        histtype="stepfilled", label="Simulated co-spectral densities");

ax.plot(u, lapl_prob, lw=2, color="red", label="Laplace distribution")

ax.set_xlabel("Co-spectral density")
ax.set_ylabel("Probability density")

ax.set_xlim(-2e7, 2e7)
ax.set_ylim(0, 2.7e-7)
ax.legend(prop={"size":10})
plt.tight_layout()


For comparison, let's also create the power-spectral densities:

In [15]:
from stingray import Powerspectrum

In [16]:
ps = Powerspectrum(lc1, norm="leahy")
cs = Crossspectrum(lc1, lc2, norm="leahy")



In [17]:
u2 = np.linspace(0, 10, 100000)

chi2 = scipy.stats.chi2(2)
chi2_prob = np.exp(chi2.logpdf(u2))

In [18]:
chi2_prob

array([ 0.5       ,  0.499975  ,  0.49995   , ...,  0.00336931,
        0.00336914,  0.00336897])

In [19]:
fig, ax = plt.subplots(1, 1, figsize=(6,4))
ax.hist(ps.power, bins=500, density=True, color="black", alpha=0.5,
        histtype="stepfilled", label="Simulated powers");

ax.plot(u2, chi2_prob, lw=2, color="red", label="Chi squared distribution")

ax.set_xlabel("Power spectral density")
ax.set_ylabel("Probability density")

ax.set_xlim(0, 10)
ax.set_ylim(0, 0.6)
ax.legend(prop={"size":10})
plt.tight_layout()


Let's plot the normalized versions side by side:

In [20]:
mean_nphot = np.sqrt(np.sum(lc1.counts)*np.sum(lc2.counts))

csd_normed = 2*csd/mean_nphot
# From Bachetti+15, the standard deviation is expected to be ~sqrt(2)
print(np.std(csd_normed.real) / np.sqrt(2))

1.00064087352


In [21]:
u = np.linspace(-5, 5, 10000)

lapl = scipy.stats.laplace(0, 1)
lapl_prob = lapl.pdf(u)


In [22]:
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10,4), sharey=True)
ax1.hist(csd_normed, bins=500, density=True, color="black", alpha=0.5,
        histtype="stepfilled", label="Simulated \n co-spectral densities");


ax1.plot(u, lapl_prob, lw=2, color="red", label="Laplace distribution")

ax1.set_xlabel("Co-spectral density")
ax1.set_ylabel("Probability density")
ax1.set_xlim(-5, 5)
ax1.set_ylim(0, 0.6)

ax1.legend(prop={"size":11})

ax2.hist(ps.power, bins=500, density=True, color="black", alpha=0.5,
        histtype="stepfilled", label="Simulated powers");

ax2.plot(u2, chi2_prob, lw=2, color="red", label=r"$\chi^2$ distribution")

ax2.set_xlabel("power spectral density")

ax2.legend(prop={"size":11})


ax2.set_xlim(0, 10)
ax2.set_ylim(0, 0.55)
plt.tight_layout()

plt.savefig("../paper/cs_dist.png", format="png")


### Detection Probabilities

I want to know how the detection probabilities differ. The p-value of observing a value at least as high as a given power under the noise-only hypothesis is the tail probability of the distribution, which is given by its *survival function*:

In [23]:
lapl_sf = lapl.sf(u2)
chi2_sf = chi2.sf(u2)
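
Both survival functions have simple closed forms, which makes the difference in the tails easy to see: for a Laplace distribution with zero mean and unit scale the one-sided tail probability is $\frac{1}{2} e^{-x}$ for $x \geq 0$, while for a $\chi^2$ distribution with 2 degrees of freedom it is $e^{-x/2}$. A minimal sketch comparing these expressions with `scipy.stats` on an arbitrary grid:

```python
import numpy as np
import scipy.stats

# Arbitrary grid of non-negative values for the comparison
x = np.linspace(0, 10, 1001)

# Closed-form tail probabilities
lapl_sf_closed = 0.5 * np.exp(-x)   # Laplace(0, 1), valid for x >= 0
chi2_sf_closed = np.exp(-x / 2.0)   # chi-squared with 2 degrees of freedom

lapl_ok = np.allclose(scipy.stats.laplace(0, 1).sf(x), lapl_sf_closed)
chi2_ok = np.allclose(scipy.stats.chi2(2).sf(x), chi2_sf_closed)
print(lapl_ok, chi2_ok)
```

The Laplace tail falls off twice as fast in the exponent, so the same tail probability is reached at a much lower value.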

In [24]:
from stingray.powerspectrum import classical_pvalue

In [25]:
cp = np.array([classical_pvalue(p, 1) for p in u2[1:]])

Where do we get a detection at 95% confidence?

In [26]:
lapl_ppf = lapl.ppf(0.95)
chi2_ppf = chi2.ppf(0.95)

In [27]:
lapl_ppf

2.302585092994045

In [28]:
chi2_ppf

5.9914645471079799

In [29]:
fig, ax = plt.subplots(1, 1, figsize=(6,4))

ax.plot(u2, lapl_sf, lw=2, color="black", label="Laplace distribution")
ax.plot(u2[1:], cp, lw=2, color="red", label="Classical p-value")
ax.plot(u2, chi2_sf, lw=2, color="black", linestyle="dotted", label="Chi-square distribution")

ax.vlines(lapl_ppf, 0, 0.7, lw=2, color="black", zorder=10)
ax.vlines(chi2_ppf, 0, 0.7, lw=2, color="black", 
          linestyle="dotted", zorder=10)

ax.set_xlabel("Co-spectral density")
ax.set_ylabel("Tail probability")

ax.set_xlim(0, 10)
ax.set_ylim(0, 1)
ax.legend(prop={"size":10})
plt.tight_layout()

plt.savefig("../paper/tailprob.png", format="png")


#### Example Period Detection

Let's make an example: two short light curves with 1000 data points each. We inject the same weak sinusoidal signal into both, so that the cospectrum and the power spectrum should each show an outlier at the signal frequency (10 Hz for a period of 0.1 s):

In [30]:
time = np.linspace(0, 10, 1000)

period = 0.1
amp = 0.055

p = 10.0 * (1 + amp*np.sin(2.0*np.pi*time/period))

rng = np.random.RandomState(100)

counts1 = rng.poisson(p, time.shape[0])  
counts2 = rng.poisson(p, time.shape[0])  

lc1 = Lightcurve(time, counts1)
lc2 = Lightcurve(time, counts2)

In [31]:
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10,4), sharey=True)
ax1.plot(lc1.time, lc1.counts, drawstyle="steps-mid", color="black")
ax2.plot(lc2.time, lc2.counts, drawstyle="steps-mid", color="black")

ax1.set_xlim(lc1.time[0], lc1.time[-1])
ax2.set_xlim(lc2.time[0], lc2.time[-1])

ax1.set_xlabel("Time [s]")
ax2.set_xlabel("Time [s]")

ax1.set_ylabel("Counts/bin")
plt.tight_layout()


Let's make the cross spectra and power spectra:

In [32]:
ps = Powerspectrum(lc1, norm="leahy")
cs = Crossspectrum(lc1, lc2, norm="leahy")



In [33]:
lapl_ppf

2.302585092994045

In [34]:
chi2_ppf

5.9914645471079799

In [35]:
import matplotlib.patches as mpatches
from matplotlib.patches import Polygon
from matplotlib.collections import PatchCollection

In [36]:
fig, ax1 = plt.subplots(1, 1, figsize=(6,4))
ax1.plot(cs.freq, cs.power, drawstyle="steps-mid", 
         color="black", label="simulated data")

ax1.set_xlim(ps.freq[0], ps.freq[-1])

# thresholds for a 1% false-alarm probability, corrected for the number of frequencies searched
lapl_ppf = lapl.isf(0.01/len(ps.freq))
chi2_ppf = chi2.isf(0.01/len(ps.freq))

ax1.hlines(lapl_ppf, ps.freq[0], ps.freq[-1], 
           lw=2, color="red", label="Laplace distribution")
ax1.hlines(chi2_ppf, ps.freq[0], ps.freq[-1], lw=2, 
           color="red", linestyle="dotted", label=r"$\chi^2$ distribution")

ax1.set_xlabel("Frequency [Hz]")
ax1.set_ylabel("Cospectral densities")

plt.legend(bbox_to_anchor=(0., 0.99, 1., .11), loc=0,
           ncol=3, mode="expand", borderaxespad=0.1,
          prop={"size":10})

plt.tight_layout()
plt.savefig("../paper/cs_detec.png", format="png")
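
The thresholds in this plot are corrected for the number of frequencies searched: the desired false-alarm probability is divided by the number of trials before inverting the survival function. Inverting the closed-form tails gives $-\ln(2q)$ for the unit-scale Laplace distribution (one-sided, valid for $q \leq 1/2$) and $-2\ln q$ for the $\chi^2$ distribution with 2 degrees of freedom. A minimal sketch, where `n_trials = 500` is a hypothetical stand-in for `len(ps.freq)`:

```python
import numpy as np
import scipy.stats

p_false = 0.01          # overall false-alarm probability
n_trials = 500          # hypothetical number of frequencies searched
q = p_false / n_trials  # tail probability allowed per frequency

# Thresholds from the inverse survival functions
lapl_thresh = scipy.stats.laplace(0, 1).isf(q)
chi2_thresh = scipy.stats.chi2(2).isf(q)

# Closed-form equivalents
lapl_closed = -np.log(2.0 * q)
chi2_closed = -2.0 * np.log(q)

print(lapl_thresh, lapl_closed)
print(chi2_thresh, chi2_closed)
```

For the same false-alarm probability the Laplace threshold sits well below the $\chi^2$ one, which is why the two horizontal lines in the plot are so far apart.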
