### GW tutorial 2: Detector noise and GW150914

Author: Melissa Lopez

Email: m.lopez@uu.nl

First, we import the packages needed to read and plot the data.

In [None]:
%matplotlib inline
from pycbc.catalog import Merger
import pycbc.psd
import pylab
import matplotlib.pyplot as plt
import numpy as np

In the previous part we have seen the detector response to gravitational waves (GW). Currently, we have three ground-based detectors: LIGO Hanford (H1), LIGO Livingston (L1) and Virgo (V1). But what does the data actually look like?

To answer this question we are going to take a look at the first detection: **GW150914**. Let's load the data from L1.

In [None]:
m = Merger('GW150914')
ifo = 'L1'
data = m.strain(ifo)

**Exercise 1:** We can see that this object is a PyCBC time series, so you can check what its attributes are (see [here](https://pycbc.org/pycbc/latest/html/pycbc.types.html#module-pycbc.types.timeseries)). Let's check some of them.

- What is the duration of the time series?

- What is its sampling rate?

- How many data points does it have?

- What are $\Delta_{f}$ and $\Delta_{t}$? Can you find a relation between these and the number of data points?

_Hint:_ $\Delta_{f} = 1/duration$ and  $\Delta_{t} = 1/sample\_rate$
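
The relation in the hint can be checked numerically without any detector data. The sketch below uses illustrative values (a 32 s series sampled at 4096 Hz; these are assumptions, not values read from `data`) and NumPy's `rfftfreq` to show that $N = duration/\Delta_{t}$, so that $N \Delta_{t} \Delta_{f} = 1$.

```python
import numpy as np

# Illustrative values (assumptions, not read from the detector data)
duration = 32.0        # seconds
sample_rate = 4096.0   # Hz

delta_t = 1.0 / sample_rate   # time resolution
delta_f = 1.0 / duration      # frequency resolution
num_points = int(round(duration * sample_rate))

# Frequencies of the one-sided discrete Fourier transform of such a series
freqs = np.fft.rfftfreq(num_points, d=delta_t)

print(f"N = {num_points}")
print(f"N * delta_t * delta_f = {num_points * delta_t * delta_f}")
print(f"Frequency spacing = {freqs[1] - freqs[0]} Hz")
print(f"Highest frequency (Nyquist) = {freqs[-1]} Hz")
```

The frequency spacing equals $\Delta_{f}$ and the highest frequency is half the sampling rate, which is the Nyquist frequency used later in this tutorial.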

In [None]:
# Duration of the time series
duration = data.duration
print(f"Duration of the time series: {duration} seconds")

# Sampling rate
sample_rate = data.sample_rate
print(f"Sampling rate: {sample_rate} Hz")

# Time resolution
delta_t = 1 / sample_rate
print(f"Time resolution (Delta_t): {delta_t} seconds")

# Frequency resolution
delta_f = 1 / duration
print(f"Frequency resolution (Delta_f): {delta_f} Hz")

# Number of data points
num_points = len(data)
print(f"Number of data points: {num_points}")

# Verifying the relation between num_points, delta_t, and duration
calculated_num_points = duration / delta_t
print(f"Calculated number of points (duration / Delta_t): {calculated_num_points}")


These quantities are key parameters of the time series. Now, let's plot the actual data to see what it looks like.

In [None]:
plt.plot(data.sample_times, data, label='Raw L1 data', color='cornflowerblue')
plt.xlabel('GPS time (s)')
plt.ylabel('Amplitude (dimensionless)')
plt.legend()

In GW astronomy we use GPS time to record when a GW signal reached Earth. You can see that the amplitude of this data is super small, $\mathcal{O}(10^{-18})$! GW interferometers are able to detect a change in distance of ~1/10,000th the size of a proton.
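
GPS time counts seconds since 6 January 1980 and, unlike UTC, ignores leap seconds. The minimal sketch below converts a GPS time to UTC with the standard library only. The helper `gps_to_utc` is hypothetical (it is not part of PyCBC), the GPS-UTC offset of 17 s is hard-coded and only valid between 1 July 2015 and 31 December 2016, and the GPS time of GW150914 is the approximate published value.

```python
from datetime import datetime, timedelta, timezone

GPS_EPOCH = datetime(1980, 1, 6, tzinfo=timezone.utc)
LEAP_SECONDS_2015 = 17  # GPS - UTC offset, valid from 2015-07-01 to 2016-12-31 only


def gps_to_utc(gps_seconds, leap_seconds=LEAP_SECONDS_2015):
    """Convert GPS seconds to a UTC datetime (hypothetical helper, fixed leap-second offset)."""
    return GPS_EPOCH + timedelta(seconds=gps_seconds - leap_seconds)


gw150914_gps = 1126259462.4  # approximate GPS time of GW150914
utc = gps_to_utc(gw150914_gps)
print(utc.strftime("%Y-%m-%d %H:%M:%S UTC"))
```

For dates outside that range the leap-second offset differs, so a dedicated time library would be the safer choice.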

But where is GW150914? We cannot see it (yet) because the data contain many different noise contributions from the detector. The detector has a "noise budget", described by the power spectral density (PSD), $S_{n}(f)$, that depends on its specific design.

**Exercise 2**: Estimate the PSD of the data using `filter_psd` (see [here](https://pycbc.org/pycbc/latest/html/pycbc.types.html#pycbc.types.timeseries.TimeSeries.filter_psd)) and plot it. Limit your plot according to the minimum frequency (say, 1 Hz) and  the Nyquist frequency. 

_Hint_: Note that the PSD is a frequency series.

In [None]:
# Define the frequency range for plotting (1 Hz to the Nyquist frequency)
min_freq = 1.0  # Minimum frequency (Hz)
nyquist_freq = data.sample_rate / 2

# Estimate the PSD with filter_psd(segment_duration, delta_f, flow).
# The 4 s segment length is a choice made here; the PSD is interpolated
# to the frequency resolution of the data.
psd = data.filter_psd(4, data.delta_f, min_freq)

# Plot the PSD
plt.figure(figsize=(10, 6))
plt.loglog(psd.sample_frequencies, psd)
plt.xlim(min_freq, nyquist_freq)
plt.xlabel('Frequency [Hz]')
plt.ylabel('Power Spectral Density [1/Hz]')
plt.title('Power Spectral Density of GW150914 (LIGO Livingston)')
plt.grid(True)
plt.show()

The PSD shows the different contributions to the detector noise.
While we computed the PSD from 1 Hz, you can see that there is a weird behaviour for frequencies < 5 Hz. This is because the detector is not calibrated at these frequencies, so our analysis needs to start at higher frequencies. For current ground-based detectors, we usually start at 10 Hz or higher, depending on the source.

GW signals are tiny, so with this much noise we won't be able to see GW150914. To "flatten" all frequency contributions we whiten the data.
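
To make the idea of whitening concrete, here is a minimal NumPy sketch on synthetic noise. It is not PyCBC's implementation: the noise spectrum, the helper names `averaged_psd` and `band_power_ratio`, and all numerical values are assumptions chosen for illustration. The data are divided in the frequency domain by the amplitude spectral density $\sqrt{S_{n}(f)}$, which brings all frequencies to a comparable level.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
sample_rate = 1024   # Hz (illustrative)
duration = 16        # seconds (illustrative)
n = sample_rate * duration

# Make coloured noise by shaping white noise in the frequency domain
freqs = np.fft.rfftfreq(n, d=1.0 / sample_rate)
shape = 1.0 / (1.0 + (freqs / 20.0) ** 2)  # toy spectrum: loud at low frequencies
coloured = np.fft.irfft(np.fft.rfft(rng.normal(size=n)) * shape, n=n)


def averaged_psd(x, seg_len):
    """Toy PSD estimate: mean of Hann-windowed periodograms of non-overlapping segments."""
    segments = x[: len(x) // seg_len * seg_len].reshape(-1, seg_len)
    window = np.hanning(seg_len)
    spectra = np.fft.rfft(segments * window, axis=1)
    return (2.0 * np.abs(spectra) ** 2 / (sample_rate * np.sum(window ** 2))).mean(axis=0)


seg_len = 2 * sample_rate  # 2 s segments
psd_estimate = averaged_psd(coloured, seg_len)
psd_freqs = np.fft.rfftfreq(seg_len, d=1.0 / sample_rate)

# Whiten: divide the Fourier transform by the amplitude spectral density
asd = np.sqrt(np.interp(freqs, psd_freqs, psd_estimate))
whitened = np.fft.irfft(np.fft.rfft(coloured) / asd, n=n)


def band_power_ratio(x, low_band=(5, 50), high_band=(300, 500)):
    """Ratio of the mean spectral power in a low band to that in a high band."""
    power = np.abs(np.fft.rfft(x)) ** 2
    low = power[(freqs >= low_band[0]) & (freqs < low_band[1])].mean()
    high = power[(freqs >= high_band[0]) & (freqs < high_band[1])].mean()
    return low / high


ratio_coloured = band_power_ratio(coloured)
ratio_whitened = band_power_ratio(whitened)
print(f"Low/high band power ratio before whitening: {ratio_coloured:.3g}")
print(f"Low/high band power ratio after whitening:  {ratio_whitened:.3g}")
```

Before whitening the low-frequency band dominates by orders of magnitude; afterwards the two bands carry comparable power.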

**Exercise 3:** We can whiten the data with [this](https://pycbc.org/pycbc/latest/html/pycbc.types.html#pycbc.types.timeseries.TimeSeries.whiten) function. Use `segment_duration = 4` and `max_filter_duration=4`. 

- Plot the whitened data next to the raw detector noise. What differences can you see?

- Estimate the PSD of the whitened data. Plot it in the same graph as the raw PSD. What differences can you see?


In [None]:
# Whiten the data with 4 s PSD segments and a 4 s maximum filter duration
whitened_data = data.whiten(segment_duration=4, max_filter_duration=4)

# Plot raw and whitened data in separate panels: their amplitudes differ by
# many orders of magnitude, so a shared axis would hide the raw strain
plt.figure(figsize=(12, 9))
plt.subplot(3, 1, 1)
plt.plot(data.sample_times, data, label="Raw data", color="blue")
plt.xlabel('GPS time [s]')
plt.ylabel('Strain')
plt.title('Raw vs whitened data (GW150914)')
plt.legend()
plt.grid(True)

plt.subplot(3, 1, 2)
plt.plot(whitened_data.sample_times, whitened_data, label="Whitened data", color="red")
plt.xlabel('GPS time [s]')
plt.ylabel('Whitened strain')
plt.legend()
plt.grid(True)

# Estimate the PSDs with filter_psd(segment_duration, delta_f, flow)
psd_raw = data.filter_psd(4, data.delta_f, 10)
psd_whitened = whitened_data.filter_psd(4, whitened_data.delta_f, 10)

# Plot both PSDs on the same axes
plt.subplot(3, 1, 3)
plt.loglog(psd_raw.sample_frequencies, psd_raw, label="Raw PSD", color="blue")
plt.loglog(psd_whitened.sample_frequencies, psd_whitened, label="Whitened PSD", color="red", alpha=0.7)
plt.xlim(10, nyquist_freq)
plt.xlabel('Frequency [Hz]')
plt.ylabel('Power Spectral Density [1/Hz]')
plt.title('Raw vs whitened PSD')
plt.legend()
plt.grid(True)

plt.tight_layout()
plt.show()

Now all frequency contributions are at the same level, as we can see from the PSD. However, GW150914 is still hidden in the data. 


**Exercise 4:** We can apply a [low-pass filter](https://pycbc.org/pycbc/latest/html/pycbc.types.html#pycbc.types.timeseries.TimeSeries.lowpass_fir) and a [high-pass filter](https://pycbc.org/pycbc/latest/html/pycbc.types.html#pycbc.types.timeseries.TimeSeries.highpass_fir) to limit the frequency content of the data. The low-pass filter removes frequencies above 250 Hz (order=512) and the high-pass filter removes frequencies below 30 Hz (order=512).

- Estimate the PSD of the bandpassed data and plot it together with the raw PSD and the whitened PSD. What is the bandpass doing?

- Plot the bandpassed data [cropping](https://pycbc.org/pycbc/latest/html/pycbc.types.html#pycbc.types.timeseries.TimeSeries.crop) 12s on the left and 13s on the right. What can you see? Compare this to the raw data. What frequencies dominate in each case?
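
To see what an FIR low-pass filter does, here is a generic windowed-sinc sketch in NumPy. It is not PyCBC's implementation: the helper `lowpass_taps`, the Kaiser window parameter and the test tones are assumptions for illustration. A 100 Hz tone passes through a 250 Hz low-pass filter almost unchanged, while a 1000 Hz tone is strongly suppressed.

```python
import numpy as np

sample_rate = 4096  # Hz (illustrative)
t = np.arange(0, 1.0, 1.0 / sample_rate)
low_tone = np.sin(2 * np.pi * 100 * t)    # inside the pass band
high_tone = np.sin(2 * np.pi * 1000 * t)  # inside the stop band
signal = low_tone + high_tone


def lowpass_taps(cutoff_hz, order, sample_rate, beta=5.0):
    """Windowed-sinc low-pass FIR coefficients (hypothetical helper)."""
    n = np.arange(order + 1) - order / 2
    fc = cutoff_hz / sample_rate            # cutoff in cycles per sample
    taps = 2 * fc * np.sinc(2 * fc * n)     # ideal low-pass impulse response
    taps *= np.kaiser(order + 1, beta)      # taper to limit ripple
    return taps / taps.sum()                # unit gain at 0 Hz


order = 512
taps = lowpass_taps(250, order, sample_rate)

# mode="same" keeps the output aligned with the input for a symmetric filter
filtered = np.convolve(signal, taps, mode="same")

# The first and last order/2 samples are corrupted by the filter edges
core = slice(order // 2, -(order // 2))
rms_before = np.sqrt(np.mean((signal[core] - low_tone[core]) ** 2))
rms_after = np.sqrt(np.mean((filtered[core] - low_tone[core]) ** 2))
print(f"RMS deviation from the 100 Hz tone before filtering: {rms_before:.3f}")
print(f"RMS deviation from the 100 Hz tone after filtering:  {rms_after:.3f}")
```

The corrupted samples at the edges are the reason why filtered data are usually cropped before plotting.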

In [None]:
# Band-pass the whitened data: high-pass at 30 Hz, then low-pass at 250 Hz
bandpassed_data = whitened_data.highpass_fir(30, 512).lowpass_fir(250, 512)

# Estimate the PSD of the band-passed data with filter_psd(segment_duration, delta_f, flow)
psd_bandpassed = bandpassed_data.filter_psd(4, bandpassed_data.delta_f, 10)

# Plot the raw, whitened, and band-passed PSDs
plt.figure(figsize=(12, 6))

plt.loglog(psd_raw.sample_frequencies, psd_raw, label="Raw PSD", color="blue")
plt.loglog(psd_whitened.sample_frequencies, psd_whitened, label="Whitened PSD", color="red", alpha=0.7)
plt.loglog(psd_bandpassed.sample_frequencies, psd_bandpassed, label="Bandpassed PSD", color="green", alpha=0.7)

plt.xlim(10, nyquist_freq)
plt.xlabel('Frequency [Hz]')
plt.ylabel('Power Spectral Density [1/Hz]')
plt.title('PSD of Raw, Whitened, and Bandpassed Data')
plt.legend()
plt.grid(True)

plt.show()

# Crop 12 s from the left and 13 s from the right: crop(left, right)
cropped_bandpassed_data = bandpassed_data.crop(12, 13)

# Plot raw and band-passed data in separate panels (very different amplitude scales)
plt.figure(figsize=(12, 8))

plt.subplot(2, 1, 1)
plt.plot(data.sample_times, data, label="Raw Data", color="blue", alpha=0.7)
plt.xlabel('GPS time [s]')
plt.ylabel('Strain')
plt.title('Raw vs Bandpassed (Cropped) Data')
plt.legend()
plt.grid(True)

plt.subplot(2, 1, 2)
plt.plot(cropped_bandpassed_data.sample_times, cropped_bandpassed_data, label="Bandpassed Data (Cropped)", color="green", alpha=0.7)
plt.xlabel('GPS time [s]')
plt.ylabel('Whitened strain')
plt.legend()
plt.grid(True)

plt.tight_layout()
plt.show()


**Exercise 5:** Last but not least, we generate a spectrogram (time-frequency representation) using [this](https://pycbc.org/pycbc/latest/html/pycbc.types.html#pycbc.types.timeseries.TimeSeries.qtransform) function.

_Hint_: Use `logfsteps=200`, `qrange=(110, 110)` and `frange=(20, 512)` for the Q-transform, and `vmax=3.5` for the colour scale of the plot, as is standard.

In [None]:
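# Q-transform of the whitened data; it returns the times, frequencies and power
times, freqs, power = whitened_data.qtransform(logfsteps=200, qrange=(110, 110), frange=(20, 512))

# Plot the spectrogram; vmax caps the colour scale
plt.figure(figsize=(12, 6))
plt.pcolormesh(times, freqs, power**0.5, vmax=3.5)
plt.yscale('log')
plt.xlim(m.time - 0.5, m.time + 0.5)  # zoom in around the merger time
plt.title('Spectrogram of GW150914 (LIGO Livingston)')
plt.xlabel('GPS time [s]')
plt.ylabel('Frequency [Hz]')
plt.show()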


As we have seen, each detector's noise budget is described by its PSD, $S_{n}(f)$, which depends on its specific design. Let's see how the detectors will improve in the next observing runs.

**Bonus track:** From the `sensitivity_curves` folder, [load](https://pycbc.org/pycbc/latest/html/pycbc.psd.html#pycbc.psd.read.from_txt) the following PSDs:

- Third observing run of H1: `aligo_O3actual_H1.txt`
- Third observing run of L1: `aligo_O3actual_L1.txt`
- Third observing run of V1: `avirgo_O3actual.txt`
- Simulated fourth observing run of LIGO: `aligo_O4high.txt`

Note that these PSDs are from the [LIGO public website](https://dcc.ligo.org/ligo-t2000012/public). Also read the PSDs of the [Einstein Telescope](https://pycbc.org/pycbc/latest/html/pycbc.psd.html#pycbc.psd.analytical.EinsteinTelescopeP1600143) (ET) and [Cosmic Explorer](https://pycbc.org/pycbc/latest/html/pycbc.psd.html#pycbc.psd.analytical.CosmicExplorerP1600143) (CE).

Plot all these PSDs. What can you say about ET and CE improvements?

_Hint_: Minimum frequency is 10 Hz, sampling rate 8192 Hz and duration is 16s.


In [None]:
# Frequency-series settings from the hint
min_freq = 10  # Minimum frequency in Hz
sampling_rate = 8192  # Sampling rate in Hz
duration = 16  # Duration in seconds
nyquist_freq = sampling_rate / 2  # Nyquist frequency
delta_f = 1.0 / duration  # Frequency resolution in Hz
flen = int(nyquist_freq / delta_f) + 1  # Number of frequency samples

# Load the PSDs with from_txt(filename, length, delta_f, low_freq_cutoff).
# With the default is_asd_file=True the files are read as amplitude spectral densities.
folder = "sensitivity_curves/"
psd_h1 = pycbc.psd.from_txt(folder + "aligo_O3actual_H1.txt", flen, delta_f, min_freq)
psd_l1 = pycbc.psd.from_txt(folder + "aligo_O3actual_L1.txt", flen, delta_f, min_freq)
psd_v1 = pycbc.psd.from_txt(folder + "avirgo_O3actual.txt", flen, delta_f, min_freq)
psd_o4 = pycbc.psd.from_txt(folder + "aligo_O4high.txt", flen, delta_f, min_freq)

# Analytical PSDs for Einstein Telescope (ET) and Cosmic Explorer (CE)
psd_et = pycbc.psd.analytical.EinsteinTelescopeP1600143(flen, delta_f, min_freq)
psd_ce = pycbc.psd.analytical.CosmicExplorerP1600143(flen, delta_f, min_freq)

# Plot all the PSDs
plt.figure(figsize=(12, 6))

# Third observing run: LIGO (H1, L1) and Virgo (V1)
plt.loglog(psd_h1.sample_frequencies, psd_h1, label="H1 - O3", color="blue")
plt.loglog(psd_l1.sample_frequencies, psd_l1, label="L1 - O3", color="red")
plt.loglog(psd_v1.sample_frequencies, psd_v1, label="V1 - O3", color="green")

# Simulated fourth observing run of LIGO
plt.loglog(psd_o4.sample_frequencies, psd_o4, label="LIGO - O4", color="purple")

# Einstein Telescope and Cosmic Explorer
plt.loglog(psd_et.sample_frequencies, psd_et, label="Einstein Telescope (ET)", color="orange")
plt.loglog(psd_ce.sample_frequencies, psd_ce, label="Cosmic Explorer (CE)", color="brown")

# Set plot limits and labels
plt.xlim(min_freq, nyquist_freq)
plt.xlabel('Frequency [Hz]')
plt.ylabel('Power Spectral Density [1/Hz]')
plt.title('Comparison of PSDs for Different Detectors and Observing Runs')
plt.legend(loc="lower left")
plt.grid(True)

plt.show()


Good job! This is the end of the second part. Maybe you can go for a break?