<hr style="height: 1px;">
<i>This notebook was authored by the 8.S50x Course Team, Copyright 2022 MIT All Rights Reserved.</i>
<hr style="height: 1px;">
<br>

<h1>Project 1: Detecting Gravitational-Waves from Binary Black Hole Merger 
with LIGO Open Data</h1>


<a name='section_1_0'></a>
<hr style="height: 1px;">


## <h2 style="border:1px; border-style:solid; padding: 0.25em; color: #FFFFFF; background-color: #90409C">PROJ1.0 Overview</h2>


<h3>Navigation</h3>

<table style="width:100%">
    <tr>
        <td style="text-align: left; vertical-align: top; font-size: 10pt;"><a href="#section_1_1">PROJ1.1 Experiment with GW-tools</a></td>
        <td style="text-align: left; vertical-align: top; font-size: 10pt;"><a href="#problems_1_1">PROJ1.1 Checkpoints</a></td>
    </tr>
    <tr>
        <td style="text-align: left; vertical-align: top; font-size: 10pt;"><a href="#section_1_2">PROJ1.2 Filters on LIGO Data</a></td>
        <td style="text-align: left; vertical-align: top; font-size: 10pt;"><a href="#problems_1_2">PROJ1.2 Checkpoints</a></td>
    </tr>
    <tr>
        <td style="text-align: left; vertical-align: top; font-size: 10pt;"><a href="#section_1_3">PROJ1.3 Analytic Model Part I - Frequency vs. Time</a></td>
        <td style="text-align: left; vertical-align: top; font-size: 10pt;"><a href="#problems_1_3">PROJ1.3 Checkpoints</a></td>
    </tr>
    <tr>
        <td style="text-align: left; vertical-align: top; font-size: 10pt;"><a href="#section_1_4">PROJ1.4 Analytic Model Part II - Strain vs. Time</a></td>
        <td style="text-align: left; vertical-align: top; font-size: 10pt;"><a href="#problems_1_4">PROJ1.4 Checkpoints</a></td>
    </tr>
    <tr>
        <td style="text-align: left; vertical-align: top; font-size: 10pt;"><a href="#section_1_5">PROJ1.5 Search for Signal in Long Time Range</a></td>
        <td style="text-align: left; vertical-align: top; font-size: 10pt;"><a href="#problems_1_5">PROJ1.5 Checkpoints</a></td>
    </tr>
</table>



<h3>Learning Objectives</h3>

In this Project we will explore the following objectives:

- experiment with GW-tools
- transformations and filters on LIGO data (Fourier, PSD, whitening, etc.)
- creating and fitting functions to data in time and frequency domain
- extrapolating parameters from functional fits
- searches for signals in the long time range of LIGO data


<h3>Introduction to Gravitational-Waves</h3>

The existence of gravitational waves (GW) was first predicted by Albert Einstein in his General Theory of Relativity in 1916. He found that the linearized weak-field equations had wave solutions. By analogy to electromagnetism, time variation of the mass quadrupole moment of the source is expected to lead to transverse waves of spatial strain. The existence of GW was first demonstrated indirectly in 1974 by the discovery of a binary system composed of a pulsar in orbit around a neutron star by Hulse and Taylor <a href="https://ui.adsabs.harvard.edu/abs/1975ApJ...195L..51H/abstract" target="_blank">[1]</a>. However, direct detections of GW did not arrive until 2016. In that year, the LIGO (Laser Interferometer Gravitational-Wave Observatory) collaboration reported the first direct detection of GW from a binary black hole system merging to form a single black hole <a href="https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.116.061102" target="_blank">[2]</a>. The observations reported in this paper and further GW detections provide new tests of general relativity in its strong-field regime, and GW observations have become an important new means to learn about the Universe.

In this project, you will reproduce the results reported in <a href="https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.116.061102" target="_blank">[2]</a> with LIGO open data. This tutorial will show how to analyze a particular GW event, GW150914 (the first GW ever detected). In the tutorial, you will find out how to download the data collected by the Hanford Observatory starting from Mon Sep 14 09:16:37 GMT 2015, plot the strain, whiten and filter the strain, plot a q-transform of the data, and extract the features of the source with a simple analytic model. After getting familiar with the basic analysis methods, you will need to explore more events, check the consistency between detectors, match numerical relativity waveform templates to extract accurate information about the source, compare with LIGO published results, and develop machinery to search for GW events within a long time range.

<h3>Setting the correct matplotlib version</h3>

First, let's install the correct version of matplotlib for our packages. 

**If you are using Google Colab**: This install will require you to restart the runtime because Colab has a different version of matplotlib installed. Run the cell below and then click "RESTART RUNTIME" or go to Runtime -> Restart runtime. Then, run the cell again and continue on. 

**If you are using standalone Jupyter Notebook**: This step should run without problems and will install matplotlib 3.3.0 into the Python kernel that Jupyter is using. 

In [None]:
#>>>RUN: PROJ1.0-runcell00

!pip install matplotlib==3.3.0

<h3>Importing Libraries</h3>

Before beginning, run the cell below to import the relevant libraries for this notebook. 

In [None]:
#>>>RUN: PROJ1.0-runcell02

# pip will install the packages if they aren't available in your environment
!pip install gwpy numpy scipy h5py wget lmfit

In [None]:
#>>>RUN: PROJ1.0-runcell03

# Import the packages you will need for the project
import numpy as np
import math
from gwpy.timeseries import TimeSeries
from scipy.linalg import fractional_matrix_power
from scipy.stats import zscore

import matplotlib.pyplot as plt
import matplotlib.mlab as mlab
import h5py

import wget
import os 

import lmfit
from lmfit import Model, minimize, fit_report, Parameters

<h3>Setting Default Figure Parameters</h3>

The following code cell sets default values for figure parameters.

In [None]:
#>>>RUN: PROJ1.0-runcell04

#set plot resolution
%config InlineBackend.figure_format = 'retina'

#set default figure parameters
plt.rcParams['figure.figsize'] = (9,6)

medium_size = 12
large_size = 15

plt.rc('font', size=medium_size)          # default text sizes
plt.rc('xtick', labelsize=medium_size)    # xtick labels
plt.rc('ytick', labelsize=medium_size)    # ytick labels
plt.rc('legend', fontsize=medium_size)    # legend
plt.rc('axes', titlesize=large_size)      # axes title
plt.rc('axes', labelsize=large_size)      # x and y labels
plt.rc('figure', titlesize=large_size)    # figure title

<a name='section_1_1'></a>
<hr style="height: 1px;">

## <h2 style="border:1px; border-style:solid; padding: 0.25em; color: #FFFFFF; background-color: #90409C">PROJ1.1 Experiment with GW-tools</h2>    

| [Top](#section_1_0) | [Previous Section](#section_1_0) | [Checkpoints](#problems_1_1) | [Next Section](#section_1_2) |


<h3>Downloading and Plotting the Time Series Data</h3>

In this section, we will load in a GW event and experiment with different methods to visualize it. First, we'll **take a look at 32 seconds of raw time-series data**, but we will quickly see that a signal is almost impossible to distinguish in the raw data due to LIGO detector noise. We will then apply a transformation in order to see where this noise originates. 

First download the data. Then make a plot of strain vs. time. Run the cells below. Source and attribution information is found below.

>**data:** H-H1_GWOSC_4KHZ_R1-1126257415-4096.hdf5 <br>
>**source:** https://www.gw-openscience.org/eventapi/html/GWTC-1-confident/GW150914/v3/ <br>
>**attribution:** R. Abbott et al. (LIGO Scientific Collaboration and Virgo Collaboration), "Open data from the first and second observing runs of Advanced LIGO and Advanced Virgo", SoftwareX 13 (2021) 100658. <br>
>**use statement:** "This research has made use of data or software obtained from the Gravitational Wave Open Science Center (gw-openscience.org), a service of LIGO Laboratory, the LIGO Scientific Collaboration, the Virgo Collaboration, and KAGRA. LIGO Laboratory and Advanced LIGO are funded by the United States National Science Foundation (NSF) as well as the Science and Technology Facilities Council (STFC) of the United Kingdom, the Max-Planck-Society (MPS), and the State of Niedersachsen/Germany for support of the construction of Advanced LIGO and construction and operation of the GEO600 detector. Additional support for Advanced LIGO was provided by the Australian Research Council. Virgo is funded, through the European Gravitational Observatory (EGO), by the French Centre National de Recherche Scientifique (CNRS), the Italian Istituto Nazionale di Fisica Nucleare (INFN) and the Dutch Nikhef, with contributions by institutions from Belgium, Germany, Greece, Hungary, Ireland, Japan, Monaco, Poland, Portugal, Spain. KAGRA is supported by Ministry of Education, Culture, Sports, Science and Technology (MEXT), Japan Society for the Promotion of Science (JSPS) in Japan; National Research Foundation (NRF) and Ministry of Science and ICT (MSIT) in Korea; Academia Sinica (AS) and National Science and Technology Council (NSTC) in Taiwan."   

In [None]:
#>>>RUN: PROJ1.1-runcell01
    
# Download the corresponding dataset
import wget
import os 

try: 
    os.mkdir('PROJ1') 
except OSError as error: 
    print(error)

wget.download('https://www.gw-openscience.org/eventapi/html/GWTC-1-confident/GW150914/v3/H-H1_GWOSC_4KHZ_R1-1126257415-4096.hdf5', 'PROJ1')

In [None]:
#>>>RUN: PROJ1.1-runcell02

# Set parameters
fn = 'PROJ1/H-H1_GWOSC_4KHZ_R1-1126257415-4096.hdf5' # data file
tevent = 1126259462.422 # GPS time (continuous time scale from Jan 1980)
evtname = 'GW150914' # event name
detector = 'H1' # detector: L1 or H1

# Load LIGO data and crop 16 seconds before and 16 seconds after the event
strain = TimeSeries.read(fn, format='hdf5.gwosc')
center = int(tevent)
strain = strain.crop(center-16, center+16)

# Show the LIGO strain vs. time
plt.figure()
strain.plot()
plt.ylabel('strain')
plt.show()

### <span style="border:3px; border-style:solid; padding: 0.15em; border-color: #90409C; color: #90409C;">Checkpoint 1.1.1</span>

Carefully read the plot of LIGO strain vs. time to understand the axes.

What day of the week (in UTC time-zone) was the GW observation made?

Hint: The plotting API includes this information. 

We can see that the signal (occurring at 16 seconds) in the above figure is almost impossible to see. In fact, it is **lower** in magnitude than the rest of the time series, because LIGO data is dominated by low-frequency noise! Let's take a look at exactly where this noise is coming from. 

<h3>Plotting in the Frequency Domain</h3>

Plotting these data in the Fourier domain gives us an idea of the frequency content of the data. A way to visualize the frequency content of the data is to plot the amplitude spectral density (ASD). The ASDs are the square root of the power spectral densities (PSDs), which are **averages of the squared magnitudes of the fast Fourier transforms (FFTs)** of the data. They are an estimate of the "strain-equivalent noise" of the detectors versus frequency, which limits the ability of the detectors to identify GW signals.

Run the following code to plot the ASD vs. frequency.

In [None]:
#>>>RUN: PROJ1.1-runcell03

# Plot the amplitude spectral density (ASD)

# The ASD can be computed with the built-in asd() method in GWpy
asd = strain.asd(fftlength=8)

plt.clf()
asd.plot()
plt.xlim(10, 2000)
plt.ylim(1e-24, 1e-19)
plt.ylabel('ASD (strain/Hz$^{1/2})$')
plt.xlabel('Frequency (Hz)')
plt.show()

You can see strong spectral lines in the data - corresponding to large power sources at the corresponding frequencies. They are all of instrumental origin. Some are engineered into the detectors (mirror suspension resonances at ~500 Hz and harmonics, calibration lines, control dither lines, etc) and some (60 Hz and harmonics) are unwanted. We'll return to these, later.

You can't see the signal in this plot, since it is relatively weak and less than a second long, while this plot averages over 32 seconds of data. So this plot is entirely dominated by instrumental noise.
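To make the PSD and ASD definitions above concrete, here is a minimal NumPy sketch that averages squared FFT magnitudes over non-overlapping segments of synthetic data. The helper name `averaged_asd`, the synthetic white noise, and the injected 60 Hz line are assumptions made for illustration only. GWpy's `asd` method performs its own averaging, so this is a sketch of the idea and not a reproduction of that implementation.

```python
import numpy as np

def averaged_asd(x, fs, seg_len):
    """One-sided ASD from the average of |FFT|^2 over non-overlapping segments."""
    window = np.hanning(seg_len)
    n_seg = len(x) // seg_len
    segments = x[: n_seg * seg_len].reshape(n_seg, seg_len) * window
    # Average the squared FFT magnitudes over the segments (this is the PSD)
    psd = np.mean(np.abs(np.fft.rfft(segments, axis=1)) ** 2, axis=0)
    # One-sided density normalisation (DC and Nyquist bins are over-counted by 2)
    psd *= 2.0 / (fs * np.sum(window ** 2))
    freqs = np.fft.rfftfreq(seg_len, d=1.0 / fs)
    return freqs, np.sqrt(psd)

rng = np.random.default_rng(0)
fs = 4096                                   # sample rate in Hz
t = np.arange(32 * fs) / fs                 # 32 seconds of samples
noise = rng.normal(size=t.size)             # unit-variance white noise
line = 0.5 * np.sin(2 * np.pi * 60 * t)     # a narrow "spectral line" at 60 Hz

freqs, asd = averaged_asd(noise + line, fs, seg_len=8 * fs)
peak_freq = freqs[np.argmax(asd)]
print(f"strongest line at {peak_freq:.2f} Hz")
```

In this toy example the narrow line stands far above the flat noise floor, which for unit-variance white noise sits near $\sqrt{2/f_s}$. This mirrors how instrumental lines stand out in the detector ASD.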


### <span style="border:3px; border-style:solid; padding: 0.15em; border-color: #90409C; color: #90409C;">Checkpoint 1.1.2</span>

Above 20 Hz, at what frequency (to the nearest 100 Hz) does the largest spectral-line amplitude noise occur on the LIGO detector? Enter your answer as a number with precision 1e2.

Hint: Look more closely at the data by changing your `(x,y)` limits.

In [None]:
#>>>PROBLEM: PROJ1.1.2
# Use this cell for drafting your solution (if desired),
# then enter your solution in the interactive problem online to be graded.

asd = strain.asd(fftlength=8)

x_min = #YOUR CODE HERE
x_max = #YOUR CODE HERE
y_min = #YOUR CODE HERE
y_max = #YOUR CODE HERE

plt.clf()
asd.plot()
plt.xlim(x_min, x_max)
plt.ylim(y_min, y_max)
plt.ylabel('ASD (strain/Hz$^{1/2})$')
plt.xlabel('Frequency (Hz)')
plt.show()

>#### Follow-up 1.1.2a (ungraded)
>
>What are the largest sources of noise in the signal? Can you identify them?

<a name='section_1_2'></a>
<hr style="height: 1px;">

## <h2 style="border:1px; border-style:solid; padding: 0.25em; color: #FFFFFF; background-color: #90409C">PROJ1.2 Filters on LIGO Data</h2>    

| [Top](#section_1_0) | [Previous Section](#section_1_1) | [Checkpoints](#problems_1_2) | [Next Section](#section_1_3) |

<h3>Overview</h3>

In this section, we will continue working with the LIGO data we just loaded in. However, now we will **apply filters in order to clean up the data**. Filters can be applied in a variety of different physics applications to remove unwanted dependencies in your data. For time-series we will use the whitening and bandpass filters. 


<h3>Whitening</h3>

From the ASD above, we can see that noise fluctuations are much larger at low and high frequencies and near spectral lines. 

We can "whiten" the data, suppressing the extra noise at low frequencies and at the spectral lines, to better see the weak signals in the most sensitive band. This is done by normalizing the power at all frequencies so that excess power at any frequency is more obvious. The whitening transformation is well explained here <a href="https://courses.media.mit.edu/2010fall/mas622j/whiten.pdf" target="_blank">[3]</a>. With timeseries, there is a shortcut - taking the ifft(fft/asd)! This is implemented below.
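Before running the GWpy-based cell, the `ifft(fft/asd)` idea can be illustrated with NumPy alone on synthetic data. In this sketch the noise shape, the helper names `segment_asd` and `band_power`, and the chosen frequency bands are all assumptions for illustration. The ASD is estimated from averaged segments and interpolated onto the full FFT grid, which is one possible choice and not necessarily what the course code or GWpy does.

```python
import numpy as np

def segment_asd(x, fs, seg_len):
    """ASD from averaged, Hann-windowed segments."""
    window = np.hanning(seg_len)
    n_seg = len(x) // seg_len
    segments = x[: n_seg * seg_len].reshape(n_seg, seg_len) * window
    psd = np.mean(np.abs(np.fft.rfft(segments, axis=1)) ** 2, axis=0)
    psd *= 2.0 / (fs * np.sum(window ** 2))
    return np.fft.rfftfreq(seg_len, d=1.0 / fs), np.sqrt(psd)

def band_power(x, fs, f_lo, f_hi):
    """Mean squared FFT magnitude between f_lo and f_hi."""
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    power = np.abs(np.fft.rfft(x)) ** 2
    return power[(freqs >= f_lo) & (freqs < f_hi)].mean()

rng = np.random.default_rng(1)
fs, n = 1024, 16 * 1024
freqs = np.fft.rfftfreq(n, d=1.0 / fs)

# Synthetic "detector" noise: white noise made much louder at low frequencies
shape = 1.0 + 100.0 / (1.0 + freqs)
colored = np.fft.irfft(np.fft.rfft(rng.normal(size=n)) * shape, n=n)

# Whitening: divide the FFT by the (interpolated) ASD and transform back
asd_freqs, asd = segment_asd(colored, fs, seg_len=fs)
asd_full = np.interp(freqs, asd_freqs, asd)
whitened = np.fft.irfft(np.fft.rfft(colored) / asd_full, n=n)

ratio_before = band_power(colored, fs, 5, 50) / band_power(colored, fs, 200, 400)
ratio_after = band_power(whitened, fs, 5, 50) / band_power(whitened, fs, 200, 400)
print(f"low/high band power before: {ratio_before:.1f}, after: {ratio_after:.2f}")
```

After whitening, the low and high bands carry roughly equal power per frequency bin, which is what "normalizing the power at all frequencies" means in practice.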


In [None]:
#>>>RUN: PROJ1.2-runcell01

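def rough_whitener(strain_data, crop_window=30): 
  # Note: crop_window is currently unused
  asd_data = strain_data.asd()  # ASD from a single FFT over the full segment
  fft_data = strain_data.fft()  # one-sided FFT of the strain
  # Divide the FFT by the ASD, then transform back to the time domain
  whitened = np.fft.irfft(np.abs(1/asd_data)*fft_data)
  return whitened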

whitened_timeseries = TimeSeries(rough_whitener(strain))
whitened_timeseries.t0 = tevent - 16 # defining start time for plot
whitened_timeseries.dt = 1/4096 # defining timestep at 4096 Hz
plt.clf()
whitened_timeseries[0:-1].plot() # you can plot different steps of time here
plt.ylabel('strain (whitened)')
plt.title('Roughly Whitened Data')
plt.show()

### <span style="border:3px; border-style:solid; padding: 0.15em; border-color: #90409C; color: #90409C;">Checkpoint 1.2.1</span>

Look very closely at the "Roughly Whitened Data" plot. What unwanted effect did the transform produce? Keep in mind that we want data to be normalized by the ASD. 

Hint: If needed, you can look around by plotting different parts of the timeseries. 


      A. The transform has not normalized the data. 
      B. The time on the plot is wrong. 
      C. The transform has not removed noise and there are spikes everywhere. 
      D. There are edge effects that look wrong at the very beginning and very end. 





<h3>Removing Artifacts</h3>

The GWpy whitener <a href="https://github.com/gwpy/gwpy/blob/master/gwpy/timeseries/timeseries.py#L1657-L1752" target="_blank">[4]</a> does some additional fancy time-series operations, including fixing the edge effects of the rough whitener using a window function <a href="https://en.wikipedia.org/wiki/Window_function" target="_blank">[5]</a>, so we will use that included function. 

In [None]:
#>>>RUN: PROJ1.2-runcell02

# Whitening data using GWpy 

white_data = strain.whiten() #We will just use the GWpy whitening function here
plt.clf()
white_data.plot()
plt.ylabel('strain (whitened)')
plt.show()

Now we are starting to see some signal around 16 seconds! It's still not good enough, so we will apply a band-pass filter to reject high-frequency noise. But first, let's check out the Q-transform to see what frequency ranges contain our GW. 

<h3>Q-Transform</h3>

To see where our signal lies in the frequency spectrum (and what range to bandpass on), we take a q-transform of the data and check the time variation of frequency. This q-transform is closely related to the magnitude of the short-time Fourier transform [[5]](https://en.wikipedia.org/wiki/Short-time_Fourier_transform), normally shown on a log-intensity axis (e.g. energy).
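As a rough illustration of what a time-frequency map shows, the sketch below applies SciPy's ordinary short-time Fourier spectrogram to a synthetic signal whose frequency rises with time. The linear 50 Hz to 300 Hz sweep, the noise level, and the window settings are arbitrary assumptions. This uses `scipy.signal.spectrogram` as a stand-in and is not GWpy's `q_transform`.

```python
import numpy as np
from scipy.signal import spectrogram

fs = 4096
t = np.arange(fs) / fs                          # one second of samples
f_inst = 50.0 + 250.0 * t                       # instantaneous frequency, 50 -> 300 Hz
phase = 2.0 * np.pi * np.cumsum(f_inst) / fs    # integrate frequency to get phase
rng = np.random.default_rng(2)
x = np.sin(phase) + 0.5 * rng.normal(size=t.size)

# Short-time Fourier power: rows are frequencies, columns are time segments
freqs, seg_times, power = spectrogram(x, fs=fs, nperseg=256, noverlap=192)

# Follow the loudest frequency in each time segment
ridge = freqs[np.argmax(power, axis=0)]
print(f"ridge starts near {ridge[0]:.0f} Hz and ends near {ridge[-1]:.0f} Hz")
```

The ridge of maximum power tracks the rising frequency of the synthetic signal, which is the same kind of track that the merger leaves in the Q-transform plot.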

In [None]:
#>>>RUN: PROJ1.2-runcell03

dt = 1  #-- Set width of q-transform plot, in seconds
hq = strain.q_transform(outseg=(tevent-dt, tevent+dt))

plt.clf()
fig = hq.plot()
ax = fig.gca()
fig.colorbar(label="Normalised energy")
ax.grid(False)
plt.xlim(tevent-0.5, tevent+0.5)
plt.ylim(0, 1000)
plt.ylabel('Frequency (Hz)')
plt.title('Q-Transform')
plt.show()

<h3>Bandpassing</h3>

A bandpass filter can select data from a specific range of frequencies and is used in a wide variety of physics applications. The code below shows you a framework to do bandpassing within GWpy. This will **select data with frequencies above bandpass_low and below bandpass_high**.
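The effect of a bandpass can be seen on a toy signal made of three tones, only one of which lies inside the pass band. This is a minimal sketch using a SciPy Butterworth filter; the band edges of 50 Hz and 300 Hz, the tone frequencies, and the helper `tone_amplitude` are illustrative assumptions, and GWpy's `bandpass` method designs its own filter internally.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

fs = 4096
t = np.arange(4 * fs) / fs
# Three unit-amplitude tones: below, inside, and above the pass band
x = (np.sin(2 * np.pi * 10 * t)
     + np.sin(2 * np.pi * 150 * t)
     + np.sin(2 * np.pi * 1500 * t))

# 4th-order Butterworth bandpass, applied forwards and backwards (zero phase)
sos = butter(4, [50, 300], btype="bandpass", fs=fs, output="sos")
y = sosfiltfilt(sos, x)

def tone_amplitude(sig, times, freq):
    """Amplitude of the sinusoid at `freq`, by projection onto a complex exponential."""
    return 2.0 * np.abs(np.mean(sig * np.exp(-2j * np.pi * freq * times)))

for freq in (10, 150, 1500):
    print(f"{freq:5d} Hz: amplitude after filtering = {tone_amplitude(y, t, freq):.3f}")
```

Only the 150 Hz tone survives with close to its original amplitude, while the tones outside the band are strongly suppressed.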

The function defined below plots the time series and ASD after applying a bandpass filter over a given range. At first, a bandpass of `[1,1000]` is chosen, which essentially leaves the data unchanged. The plots are as follows:
- the whitened time series data after applying the bandpass
- a 0.3 second window of the whitened time series data, centered around the known event
- the ASD of the unwhitened data, after applying the bandpass

In [None]:
#>>>RUN: PROJ1.2-runcell04

def bandpass(bandpass_low, bandpass_high): 
  white_data_bp = white_data.bandpass(bandpass_low, bandpass_high)

  plt.clf()
  white_data_bp.plot()
  plt.ylabel('strain (whitened + band-pass)')
  plt.title('32 Second Window around GW')
  plt.show()

  plt.clf()
  white_data_bp.plot()
  plt.ylabel('strain (whitened + band-pass)')
  plt.xlim(tevent-0.15, tevent+0.15)
  plt.title('0.3 Second Window around GW')
  plt.show()

  #Note: for the sake of comparison, we are using unwhitened strain
  strain_bandpass = strain.bandpass(bandpass_low, bandpass_high) 
  asd = strain_bandpass.asd(fftlength=8)
  plt.clf()
  asd.plot()
  plt.xlim(10, 2000)
  plt.ylim(1e-24, 1e-19)
  plt.ylabel('ASD (strain/Hz$^{1/2})$')
  plt.xlabel('Frequency (Hz)')
  plt.show()
  
  return white_data_bp


bandpass_low, bandpass_high = [1,1000]
bandpass(bandpass_low, bandpass_high)

### <span style="border:3px; border-style:solid; padding: 0.15em; border-color: #90409C; color: #90409C;">Checkpoint 1.2.2</span>

Your goal in this checkpoint will be to select the correct bandpassing range based on the **Q-transform plot** we made above. In the below cell, experiment with the filter thresholds (`bandpass_low` and `bandpass_high`), and find a window leading to a **clear** signal on the upper and middle (strain) plots.  On the lower (ASD) plot, check out how this bandpass filter affects the ASD and compare to the original ASD plot!

Enter your answer as a list of two numbers, where you have chosen suitable values within the correct bandpassing range: `[bandpass_low, bandpass_high]`

Hint: Keep in mind that `bandpass_low >0` and if needed, look up the typical frequency range for a BBH merger. A *clear* signal will be around 2x larger magnitude than the normal background data.

In [None]:
#>>>PROBLEM: PROJ1.2.2

bandpass_low = X #YOUR CODE HERE
bandpass_high = Y #YOUR CODE HERE

white_data_bp = bandpass(bandpass_low, bandpass_high)

>#### Follow-up 1.2.2a (ungraded)
>
>Can you optimize your choice of bandpass based on some constraint on the Q-transform and/or time series data? Try to write this code and return a choice of bandpass automatically.

<a name='section_1_3'></a>
<hr style="height: 1px;">

## <h2 style="border:1px; border-style:solid; padding: 0.25em; color: #FFFFFF; background-color: #90409C">PROJ1.3 Analytic Model Part I - Frequency vs. Time</h2>    

| [Top](#section_1_0) | [Previous Section](#section_1_2) | [Checkpoints](#problems_1_3) | [Next Section](#section_1_4) |


<h3>Overview</h3>

In this section, you will develop an analytic model to describe a black hole merger event. Then, by fitting this model to your data, we will extract parameters of the event itself, including the chirp mass of the orbiting binary system. 

<h3>Analytic Model: Frequency vs Time</h3>

Calculating the actual waveform requires complicated numerical simulations. However, we can use basic knowledge of General Relativity (GR) and Newtonian mechanics to perform an approximate analytic calculation of the waveform. For an orbiting binary system with component masses $m_{1}$ and $m_{2}$, according to GR the angular frequency $\omega(t)$ of the GW radiation satisfies

$$
\begin{equation}
\dot{\omega} = \frac{12}{5}2^{\frac{1}{3}}\left(\frac{G\mathcal{M}_{c}}{c^{3}}\right)^{\frac{5}{3}}\omega^{\frac{11}{3}}
\tag{1} 
\end{equation}
$$

where $G$ and $c$ are the gravitational constant and the speed of light, respectively, and $\mathcal{M}_{c}$ is the so-called chirp mass, defined by

$$
\begin{equation}
\mathcal{M}_{c} = \frac{(m_{1}m_{2})^{\frac{3}{5}}}{(m_{1}+{m_{2}})^{\frac{1}{5}}}
\tag{2}
\end{equation}
$$
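The chirp mass definition in Eq. [2] translates directly into a one-line function. The helper name `chirp_mass` and the example masses below are illustrative choices, not values taken from the data.

```python
def chirp_mass(m1, m2):
    """Chirp mass from Eq. [2]; the result has the same units as m1 and m2."""
    return (m1 * m2) ** (3 / 5) / (m1 + m2) ** (1 / 5)

# Equal masses m give a chirp mass of m / 2**(1/5), roughly 0.87 m
print(chirp_mass(30.0, 30.0))

# The definition is symmetric under exchange of the two masses
print(chirp_mass(10.0, 40.0), chirp_mass(40.0, 10.0))
```

Because only this combination of the two masses enters Eq. [1], a fit to the frequency evolution constrains the chirp mass and not the individual masses.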

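Integrating Eq. [1] up to the merger time $t_{0}$, and writing $\Delta t = t_{0} - t$ for the time remaining before the merger, we get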

$$
\begin{eqnarray}
\int\omega^{-\frac{11}{3}}d\omega &= \int\frac{12}{5}2^{\frac{1}{3}}\left(\frac{G\mathcal{M}_{c}}{c^{3}}\right)^{\frac{5}{3}}dt \tag{3}\\
    \Rightarrow \omega(t) &= \frac{5^{\frac{3}{8}}}{4}\left(\frac{c^{3}}{G\mathcal{M}_{c}}\right)^{\frac{5}{8}}\Delta t^{-\frac{3}{8}} \tag{4}\\
\end{eqnarray}
$$
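The step from Eq. [3] to Eq. [4] can be spelled out as follows. Here $\tau \equiv G\mathcal{M}_{c}/c^{3}$ is a shorthand introduced only for this derivation, $\Delta t = t_{0} - t$ is the time remaining before the merger at $t_{0}$, and the frequency is assumed to diverge at the merger:

$$
\begin{eqnarray}
\int_{\omega(t)}^{\infty}\omega'^{-\frac{11}{3}}d\omega' &= \frac{3}{8}\,\omega(t)^{-\frac{8}{3}} = \frac{12}{5}2^{\frac{1}{3}}\tau^{\frac{5}{3}}\Delta t \\
\Rightarrow \omega(t)^{-\frac{8}{3}} &= \frac{2^{\frac{16}{3}}}{5}\tau^{\frac{5}{3}}\Delta t \\
\Rightarrow \omega(t) &= \left(\frac{2^{\frac{16}{3}}}{5}\right)^{-\frac{3}{8}}\tau^{-\frac{5}{8}}\Delta t^{-\frac{3}{8}} = \frac{5^{\frac{3}{8}}}{4}\tau^{-\frac{5}{8}}\Delta t^{-\frac{3}{8}}
\end{eqnarray}
$$

The last equality uses $\left(2^{16/3}\right)^{-3/8} = 2^{-2} = 1/4$.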

Now we get the time dependence of frequency. Considering

$$
\begin{equation}
  \frac{GM_{\odot}}{c^{3}} \approx 4.93~\mathrm{\mu s}
  \tag{5}
\end{equation}
$$

We have

$$
\begin{equation}
  \omega(t) = 948.5\left(\frac{M_{\odot}}{\mathcal{M}_{c}}\right)^{\frac{5}{8}}\left(\frac{1\:\mathrm{s}}{\Delta t}\right)^{\frac{3}{8}}\;\mathrm{[Hz\cdot rad]}
  \tag{6}
\end{equation}
$$

This is a first order frequency equation. For a second-order derivation of this equation, check out this paper: Gravitational-Radiation Damping of Compact Binary System to Second Post-Newtonian order by Luc Blanchet et al. <a href="https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.74.3515" target="_blank">[PRL]</a>, <a href="https://arxiv.org/pdf/gr-qc/9501027.pdf" target="_blank">[ArXiv]</a>.
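The numerical prefactor in Eq. [6] and the consistency of Eq. [4] with Eq. [1] can both be checked in a few lines of plain Python. The value of $GM_{\odot}/c^{3}$ is taken from Eq. [5]; the chirp mass of 30 solar masses and the time to merger of 0.1 s are arbitrary illustrative choices, and the function names are hypothetical.

```python
T_SUN = 4.93e-6  # G * M_sun / c^3 in seconds, from Eq. [5]

# Prefactor of Eq. [6]: (5^(3/8) / 4) * (G M_sun / c^3)^(-5/8)
prefactor = 5 ** (3 / 8) / 4 * T_SUN ** (-5 / 8)
print(f"prefactor = {prefactor:.1f} rad/s")

def omega(delta_t, mc_solar):
    """Eq. [4]: angular frequency at a time delta_t (in s) before merger."""
    tau = mc_solar * T_SUN
    return 5 ** (3 / 8) / 4 * tau ** (-5 / 8) * delta_t ** (-3 / 8)

def omega_dot_rhs(w, mc_solar):
    """Right-hand side of Eq. [1]."""
    tau = mc_solar * T_SUN
    return 12 / 5 * 2 ** (1 / 3) * tau ** (5 / 3) * w ** (11 / 3)

# Central finite difference; note d/dt = -d/d(delta_t) since delta_t = t0 - t
mc, delta_t, h = 30.0, 0.1, 1e-6
lhs = -(omega(delta_t + h, mc) - omega(delta_t - h, mc)) / (2 * h)
rhs = omega_dot_rhs(omega(delta_t, mc), mc)
print(f"d(omega)/dt = {lhs:.3f}, right-hand side of Eq. [1] = {rhs:.3f}")
```

The two printed rates agree to within the accuracy of the finite difference, which confirms that Eq. [4] solves Eq. [1].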

<h3>Plotting the Model</h3>

Now let's plot this function and see what it looks like. We will compare it to the Q-transform and see that they are very similar in shape. 

### <span style="border:3px; border-style:solid; padding: 0.15em; border-color: #90409C; color: #90409C;">Checkpoint 1.3.1</span>

Now let's plot this function and see what it looks like compared to the Q-transform. First, fill in the code below to define the function.

In [None]:
#>>>PROBLEM: PROJ1.3.1

def gwfreq(iT, iM,iT0, cutoff=2e-3):
    #Returns the frequency of a gravitational wave at a specific time in merger
    #iM : chirp mass in units of solar masses 
    #iT : time where frequency is being sampled
    #iT0 : time where complete merger occurs
    #cutoff : arbitrary time window where merger ends before iT0
    
    const = #YOUR CODE HERE
    idelta_T = #YOUR CODE HERE, should be array of differences (iT0 - iT)
    output = const*np.power(np.maximum(idelta_T, cutoff),-3./8.)
    return output



<h3>Comparison</h3>

Run the code below to compare the model function to the Q-Transform (lower plot). Notice that they are very similar in shape. 

In [None]:
#>>>RUN: PROJ1.3-runcell01

# sampling time from t= 0s to t= 4s
times = np.linspace(0, 4., 500)

# frequency of a system with a chirp mass of 25 solar masses merging at t = 4 s
freq = gwfreq(iT=times, iM=25, iT0=4, cutoff=2e-2)

plt.clf()
plt.plot(times, freq)
plt.xlabel('Time (s)')
plt.ylabel('Frequency (Hz)')
plt.title('Simple Analytic Model')
plt.xlim(0, 4)
plt.show()


# Plotting Q-Transform shape for comparison
dt = 1  #-- Set width of q-transform plot, in seconds
hq = strain.q_transform(outseg=(tevent-dt, tevent+dt))

plt.clf()
fig = hq.plot()
ax = fig.gca()
fig.colorbar(label="Normalised energy")
ax.grid(False)
plt.xlim(tevent-0.5, tevent+0.5)
plt.ylim(0, 500)
plt.ylabel('Frequency (Hz)')
plt.title('Q-Transform From Data (plotted before)')
plt.show()

<h3>Fitting Frequency vs. Time</h3>

Can we fit the GW model to the Q-transform? Would this yield meaningful fit parameters?

First, we must project the amplitude of the Q-transform into the frequency-time domain. You can try to do this your own way, but we have included code below for one such method.

In [None]:
#>>>RUN: PROJ1.3-runcell02

def project_spectrogram(hq, threshold=30): 
  '''
  Given a spectrogram and threshold, this function will project down the 3D 
  spectrogram into a scatter plot by picking the points in intensity that lie
  above the threshold

  hq : spectrogram
  threshold : energy intensity above which points are chosen for projection
  '''
  projected_spec_times = []
  projected_spec_freq = []

  offset = hq.t0
  hq.times -= hq.t0
  hq_times = hq.times.value
  hq_freq = hq.frequencies.value
  hq_values = hq.value

  # Pick spectrogram values above threshold
  for x_pixel in range(hq.shape[0]): 
      for y_pixel in range(hq.shape[1]):
          if hq_values[x_pixel, y_pixel] > threshold: 
            projected_spec_times.append(hq_times[x_pixel])
            projected_spec_freq.append(hq_freq[y_pixel])

  # Average over y-values corresponding to same x-values
  fixed_projection_times = []
  fixed_projection_freq = []
  pool = []
  for i in range(len(projected_spec_times)-1): 
    if projected_spec_times[i] == projected_spec_times[i+1]:
      pool.append(projected_spec_freq[i])
    else: 
      fixed_projection_times.append(projected_spec_times[i])
      fixed_projection_freq.append(np.mean(pool))
      pool = []
  return fixed_projection_times, fixed_projection_freq

projected_spec_times, projected_spec_freq = project_spectrogram(hq)

plt.figure()
plt.scatter(projected_spec_times, projected_spec_freq)
plt.xlim((0.5, 1.5))
plt.ylim((0, 500))
plt.xlabel('Time (s)')
plt.ylabel('Frequency (Hz)')
plt.title('Q-Transform Projected to Scatter Plot')
plt.show()
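Other projections are possible. As one hedged alternative to the pixel-picking approach in the cell above, the sketch below computes an energy-weighted mean frequency in each time bin, using only pixels above a threshold. The function name `weighted_ridge`, the synthetic energy map, and the threshold are assumptions for illustration; the array layout (time along the first axis, frequency along the second) follows the indexing used in `project_spectrogram`.

```python
import numpy as np

def weighted_ridge(energy, times, freqs, threshold):
    """Energy-weighted mean frequency per time bin, using pixels above threshold.

    energy has shape (len(times), len(freqs)).
    """
    weights = np.where(energy > threshold, energy, 0.0)
    total = weights.sum(axis=1)
    keep = total > 0                      # drop time bins with nothing above threshold
    mean_freq = weights[keep] @ freqs / total[keep]
    return times[keep], mean_freq

# Synthetic energy map: a Gaussian band around a rising frequency track
rng = np.random.default_rng(4)
times = np.linspace(0.0, 1.0, 50)
freqs = np.linspace(0.0, 500.0, 101)
true_ridge = 50.0 + 200.0 * times ** 2
energy = 100.0 * np.exp(-0.5 * ((freqs[None, :] - true_ridge[:, None]) / 15.0) ** 2)
energy += rng.uniform(0.0, 1.0, size=energy.shape)   # weak background

ridge_times, ridge_freqs = weighted_ridge(energy, times, freqs, threshold=30.0)
print(np.max(np.abs(ridge_freqs - true_ridge)))
```

Weighting by energy avoids treating every above-threshold pixel equally, at the cost of being more sensitive to a few very loud pixels.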

Next, we define a function to minimize, which is the difference between our fit and the projected Q-transform data. We perform the fitting below. **Your goal in this checkpoint will be to analyze the results of this fit.**
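The residual-based setup used in the next cell can be tried out on synthetic data first. In this sketch, SciPy's `least_squares` is used as a stand-in for lmfit's `minimize`, and the model `power_law`, its parameters, the noise level, and the bounds are all hypothetical choices for illustration. The residual has the same form as in the course code: model minus data, divided by the uncertainty.

```python
import numpy as np
from scipy.optimize import least_squares

def power_law(t, amp, t0):
    """Hypothetical stand-in model: amp * (t0 - t)^(-3/8), clipped close to t0."""
    return amp * np.maximum(t0 - t, 1e-3) ** (-3.0 / 8.0)

def residual(params, t, data, eps):
    """Weighted residual: (model - data) / uncertainty."""
    amp, t0 = params
    return (power_law(t, amp, t0) - data) / eps

# Synthetic data with known parameters and Gaussian noise
rng = np.random.default_rng(3)
t = np.linspace(0.0, 0.9, 60)
true_amp, true_t0 = 40.0, 1.0
eps = np.full(t.size, 2.0)
data = power_law(t, true_amp, true_t0) + rng.normal(scale=eps)

fit = least_squares(residual, x0=[30.0, 1.1], args=(t, data, eps),
                    bounds=([0.0, 0.91], [np.inf, 5.0]))
amp_fit, t0_fit = fit.x
chi2_per_point = np.sum(fit.fun ** 2) / t.size
print(f"amp = {amp_fit:.2f}, t0 = {t0_fit:.3f}, chi2 per point = {chi2_per_point:.2f}")
```

When the model is appropriate and the uncertainties are realistic, the fitted parameters land close to the true ones and the chi-squared per point is of order one. Large departures from this are a sign that the model, the data selection, or the assumed uncertainties need a second look.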

In [None]:
#>>>RUN: PROJ1.3-runcell03

# define gwfreq_dif for lmfit::minimize()
def gwfreq_dif(params, x, data, eps):
  iM=params["iM"]
  iT0=params["iT0"]
  cutoff=params["cutoff"]
  val=gwfreq(x, iM, iT0, cutoff)
  return (val-data)/eps

model = lmfit.Model(gwfreq)
p = model.make_params()
p['iM'].set(25)     # Mass guess
p['iT0'].set(1)  # By construction we put the merger in the center
p['cutoff'].set(2e-2)
unc = np.full(len(projected_spec_freq),10)
out = minimize(gwfreq_dif, params=p, args = (projected_spec_times, projected_spec_freq, unc))
print(fit_report(out))


plt.figure()
plt.scatter(projected_spec_times, projected_spec_freq)
plt.xlim((0.5, 1.5))
plt.ylim((0, 500))
plt.xlabel('Time (s)')
plt.ylabel('Frequency (Hz)')
plt.plot(np.linspace(0.5, 1.5, 500), model.eval(params=out.params,iT=np.linspace(0.5, 1.5, 500)),'r',label='best fit')
plt.title('Analytic Function (fitted to data)')
plt.show()


### <span style="border:3px; border-style:solid; padding: 0.15em; border-color: #90409C; color: #90409C;">Checkpoint 1.3.2</span>

What is the chirp mass of the merger according to the best fit? Round to the nearest integer mass.

Hint: Look at the fit summary for information about the fit. 

### <span style="border:3px; border-style:solid; padding: 0.15em; border-color: #90409C; color: #90409C;">Checkpoint 1.3.3</span>

Why isn't this fit producing a reasonable number for the chirp mass value? Choose **two** of the following options:

- The fitted function is completely incorrect form given the real data.
- The frequency-domain fit will always have a $t^{-3/8}$ power law and so the fit isn't sensitive to chirp mass.
- Uncertainty values were too large on the real data.
- Q-transform projection to 2-dimensions doesn't capture the edges or tails of the GW in frequency space. 

>#### Follow-up 1.3.3a (ungraded)
>
>Is there a better way to project down the Q-transform? Try it! Does this improve the mass value or chi-squared from the fit at all? 

<a name='section_1_4'></a>
<hr style="height: 1px;">

## <h2 style="border:1px; border-style:solid; padding: 0.25em; color: #FFFFFF; background-color: #90409C">PROJ1.4 Analytic Model Part II - Strain vs. Time</h2>    

| [Top](#section_1_0) | [Previous Section](#section_1_3) | [Checkpoints](#problems_1_4) | [Next Section](#section_1_5) |


<h3>Analytic Model: Waveform before merger</h3>

Now, we build the waveform with the radiation power $dE(t)/dt$.

$$
\begin{eqnarray}
\\
f(t) &= A(t)\cos(\omega(t)\Delta t +\phi) \tag{7}\\
&\propto \frac{dE(t)}{dt}\cos(\omega(t)\Delta t +\phi) \tag{8} \\
\\
\end{eqnarray}
$$


With Newtonian mechanics, we know the energy of a binary orbit is

$$
\begin{eqnarray}
\\
    E &= E_{\mathrm{k}} + E_{\mathrm{u}} \tag{9}\\
    &= \frac{1}{2}\mu\dot{r}^{2} + \frac{1}{2}\frac{m_{1}m_{2}}{m_{1}+m_{2}}\omega^{2}R^{2} - \frac{Gm_{1}m_{2}}{R} \tag{10}\\
   &= \frac{1}{2}\frac{m_{1}m_{2}}{m_{1}+m_{2}}\omega^{2}R^{2} - \frac{Gm_{1}m_{2}}{R} \tag{11}\\ 
   \\
\end{eqnarray}
$$

The last step holds because $\dot{r} = 0$ for a circular orbit. According to Kepler's third law,

$$
\begin{equation}
\\
\tag{12}
  \omega^{2} = \frac{G(m_{1}+m_{2})}{R^{3}}
\\
\end{equation}
$$

Substituting Eq. [12] into Eq. [11], and then eliminating $R$ in favor of $\omega$, we have

$$
\begin{eqnarray}
E &= -\frac{Gm_{1}m_{2}}{2R} \tag{13}\\
\tag{14}    &\propto \mathcal{M}_{c}^{\frac{5}{3}}\omega(t)^{\frac{2}{3}} \\
\\
\end{eqnarray}
$$
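The algebra leading to Eqs. [13] and [14] can be checked numerically. In this sketch the values of $G$ and the solar mass are approximate, and the masses and separation are arbitrary illustrative numbers. The explicit prefactor $-\frac{1}{2}G^{2/3}$ used for Eq. [14] follows from substituting $R = (G(m_{1}+m_{2})/\omega^{2})^{1/3}$ into Eq. [13].

```python
import math

G = 6.674e-11       # gravitational constant in m^3 kg^-1 s^-2 (approximate)
M_SUN = 1.989e30    # solar mass in kg (approximate)

m1, m2 = 30 * M_SUN, 25 * M_SUN   # illustrative component masses
R = 1.0e6                          # illustrative orbital separation in metres

total_mass = m1 + m2
mu = m1 * m2 / total_mass                      # reduced mass
omega = math.sqrt(G * total_mass / R ** 3)     # Kepler's third law, Eq. [12]

# Eq. [11]: kinetic plus potential energy of a circular orbit
energy_eq11 = 0.5 * mu * omega ** 2 * R ** 2 - G * m1 * m2 / R
# Eq. [13]: the same energy after using Kepler's law
energy_eq13 = -G * m1 * m2 / (2 * R)
# Eq. [14] with its prefactor: -(1/2) G^(2/3) Mc^(5/3) omega^(2/3)
chirp = (m1 * m2) ** (3 / 5) / total_mass ** (1 / 5)
energy_eq14 = -0.5 * G ** (2 / 3) * chirp ** (5 / 3) * omega ** (2 / 3)

print(energy_eq11, energy_eq13, energy_eq14)
```

All three expressions give the same (negative) binding energy, which confirms that the orbital energy depends on the masses only through the chirp mass once $R$ is expressed in terms of $\omega$.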

Taking the time derivative of Eq. [14], and using Eq. [1] for $\dot{\omega}$,

$$
\begin{eqnarray}
\frac{dE}{dt} &\propto \mathcal{M}_{c}^{\frac{5}{3}}\omega(t)^{-\frac{1}{3}}\dot{\omega} \tag{15}\\
\tag{16}    &\propto \left(\mathcal{M}_{c}\omega(t)\right)^{\frac{10}{3}} \\
\\
\end{eqnarray}
$$

Substituting Eq. [16] into Eq. [8], we obtain the waveform

$$
\begin{eqnarray}
\tag{17}
f(t) &= C\left(\mathcal{M}_{c}\omega(t)\right)^{\frac{10}{3}}\cos(\omega(t)\Delta t +\phi), \\
\\
\end{eqnarray}
$$

where $C$ is a constant. We can use the result that we derived before for $\omega(t)$:

$$
\begin{eqnarray}
\omega(t) &= 948.5\left(\frac{M_{\odot}}{\mathcal{M}_{c}}\right)^{\frac{5}{8}}\left(\frac{1\:\mathrm{s}}{\Delta t}\right)^{\frac{3}{8}}\;\mathrm{[Hz\cdot rad]} \tag{18} \\
\\
\end{eqnarray}
$$

<h3>Analytic Model: Waveform after merger</h3>

The waveform amplitude above is only supposed to work before the merger ($t \leq t_{0}$), and we need to come up with a way to define the ringdown amplitude ($t > t_{0}$). Computing the actual amplitude requires months of supercomputer time to build good templates! As a simple approximate solution, you can use a simple exponential decay function. It will be your next task to write this function.

First, run the code below to properly bandpass filter the data (if you have not done this already). This will define the data to which we will fit our function. Use the bandpass values below, as your fit results will be compared to results that we computed using these parameters.

In [None]:
#>>>RUN: PROJ1.4-runcell01

#BE SURE TO SET THE VALUES OF YOUR BANDPASS TO THE FOLLOWING:
bandpass_low = 30 #YOUR CODE HERE
bandpass_high = 400 #YOUR CODE HERE
white_data_bp = bandpass(bandpass_low, bandpass_high)

Now, in the code below, we have set up all of the fitting apparatus, which does the following:
- `osc_dif`: defines a function to minimize
- `plot_fit_function_not_fitted`: plots the fit function without performing lmfit, for some default input values
- `plot_fit_function_fitted`: plots the fit function after performing lmfit and prints the fit report
- `osc`: the function to be fit, which takes the variables `(t, Mc, t0, cutoff, C, phi, tau)`.

For `osc`, we have written a simple sinusoidal function, which obviously is not an appropriate fit for the data. The variable `tau` is left as an extra free parameter that you may choose to use.

Run the code below to see the output, then consider the following questions.

In [None]:
#>>>RUN: PROJ1.4-runcell02

# define osc_dif for lmfit::minimize()
def osc_dif(params, x, data, eps):
  iM=params["Mc"]
  iT0=params["t0"]
  cutoff=params['cutoff']
  norm=params["C"]
  phi=params["phi"]
  tau=params["tau"]
  
  val=osc(x, iM, iT0, cutoff, norm, phi, tau)
  return (val-data)/eps

def plot_fit_function_not_fitted(function):
  times = np.linspace(-0.1, 0.3, 1000)
  freq = function(times, 30, 0.18, 1e-2, 1, 0, 0)
  plt.figure(figsize=(12, 4))
  plt.subplots_adjust(left=0.1, right=0.9, top=0.85, bottom=0.2)
  plt.plot(times, freq)
  plt.xlabel('Time (s) since '+str(tevent))
  plt.ylabel('strain')
  plt.title('Analytic Function (not fitted to data)')
  plt.show()


def plot_fit_function_fitted(function):
  sample_times = white_data_bp.times.value
  sample_data = white_data_bp.value
  indxt = np.where((sample_times >= (tevent-0.17)) & (sample_times < (tevent+0.13)))
  x = sample_times[indxt]
  x = x-x[0]
  white_data_bp_zoom = sample_data[indxt]

  plt.figure(figsize=(12, 4))
  plt.subplots_adjust(left=0.1, right=0.9, top=0.85, bottom=0.2)
  plt.plot(x, white_data_bp_zoom)
  plt.xlabel('Time (s)')
  plt.ylabel('strain (whitened + band-pass)')

  model = lmfit.Model(osc)
  p = model.make_params()
  p['Mc'].set(25)     # Mass guess
  p['t0'].set(0.17)  # The window starts 0.17 s before tevent, so the merger is near x = 0.17
  p['cutoff'].set(2e-3)
  p['C'].set(1e-12)      # normalization guess 
  p['phi'].set(0)    # Phase guess
  p['tau'].set(0)    # Extra free parameter guess
  unc = np.full(len(white_data_bp_zoom),np.std(white_data_bp_zoom))
  out = minimize(osc_dif, params=p, args=(x, white_data_bp_zoom, unc))
  print(fit_report(out))
  plt.plot(x, model.eval(params=out.params,t=x),'r',label='best fit')
  plt.title('Analytic Function (fitted to data)')
  plt.show()

def osc(t, Mc, t0, cutoff, C, phi, tau):
    # Placeholder example to show how plot_fit_function_fitted works
    val = np.sin(100*t)
    return val

plot_fit_function_not_fitted(osc)
plot_fit_function_fitted(osc)

<a name='problems_1_4'></a>

| [Top](#section_1_0) | [Restart Section](#section_1_4) | [Next Section](#section_1_5) |


### <span style="border:3px; border-style:solid; padding: 0.15em; border-color: #90409C; color: #90409C;">Checkpoint 1.4.1</span>

Build a function `osc(t, Mc, t0, cutoff, C, phi, tau)` that properly models the oscillation of the waveform. It should have the form of Eq. [17] for $t \leq t_{0}$ and decay to zero for $t > t_{0}$. Run the fit in your notebook.

Hint: Remember to use an exponential decay function. Recall that the frequency of the GW can be found using `gwfreq(iT,iM,iT0)` as defined above.

To score the result of your fit, we will independently fit your function `osc(t, Mc, t0, cutoff, C, phi, tau)` against a different set of data. Your function will be accepted if it produces a reduced chi-square `<=0.6`.
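
To build some intuition for that threshold, the sketch below shows how a reduced chi-square is commonly formed from normalized residuals like those returned by `osc_dif`: the sum of their squares divided by the number of degrees of freedom. The grader's exact procedure is not shown in this notebook, so the helper name `reduced_chi2` and the toy data are assumptions made purely for illustration.

```python
import numpy as np

def reduced_chi2(data, model, eps, n_free):
    """Sum of squared normalized residuals per degree of freedom."""
    resid = (model - data) / eps
    return np.sum(resid**2) / (len(data) - n_free)

# Toy data: a sine "signal" plus Gaussian noise (not LIGO data).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 500)
signal = 2.0 * np.sin(2 * np.pi * 5 * x)
data = signal + rng.normal(0.0, 1.0, x.size)

# Same choice as in plot_fit_function_fitted: a single uncertainty for every
# sample, equal to the standard deviation of the data itself.
eps = np.full(data.size, np.std(data))

chi2_flat = reduced_chi2(data, np.zeros_like(data), eps, n_free=0)  # model = 0
chi2_true = reduced_chi2(data, signal, eps, n_free=0)               # model = signal
print(chi2_flat, chi2_true)
```

With this choice of `eps`, a model that is identically zero gives a reduced chi-square close to 1, so a value well below 1 suggests that the model accounts for a real share of the variance in the window.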

In [None]:
#>>>PROBLEM: PROJ1.4.1
# Use this cell for drafting your solution (if desired),
# then enter your solution in the interactive problem online to be graded.

print('def osc(t, Mc, t0, cutoff, C, phi, tau):')

def osc(t, Mc, t0, cutoff, C, phi, tau):
    # Enter code here
    val = 0
    return val

### <span style="border:3px; border-style:solid; padding: 0.15em; border-color: #90409C; color: #90409C;">Checkpoint 1.4.2</span>

What is the chirp mass of the merger according to the best fit? Round to the nearest integer (in solar masses).

>#### Follow-up 1.4.2a (ungraded)
>     
>Are you happier with the fit now? What are some other things you might do to get an even better result?

<a name='section_1_5'></a>
<hr style="height: 1px;">

## <h2 style="border:1px; border-style:solid; padding: 0.25em; color: #FFFFFF; background-color: #90409C">PROJ1.5 Search for Signal in Long Time Range</h2>   

| [Top](#section_1_0) | [Previous Section](#section_1_4) | [Checkpoints](#problems_1_5) |


<h3>Overview</h3>

Previously, we looked at the small range of data where we knew we would find the signal. That's cheating; in reality, we don't know in advance where the signal is! So now we will search for a signal in the entire LIGO dataset that you downloaded. You will need to develop machinery to search for GW events across a long time range.

<h3>Objective</h3>

Using the Project 1 notebook, find a gravitational wave signal in the extensive dataset. Demonstrate that you have found the signal by analyzing the significance or chi-square. Explain your approach and results thoroughly.

*Tip: You can slice the time into short ranges, fit each one individually, and look at the significance. You should see a spike in significance/$\chi^2$ when the gravitational wave is correctly fit by the analytic function. Remember: not all parameters need to vary as in previous fits. You can set* `p['parameter'].vary = False` *for parameters you deem unnecessary to vary.*
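
The slicing idea in the tip can be sketched on toy data as follows. Everything here is an illustrative assumption: `scan_windows` is a hypothetical helper, the data are synthetic noise with an injected burst, and the score is a simple stand-in (mean squared amplitude) for the fit-based significance or $\chi^2$ that you would compute in the project.

```python
import numpy as np

def scan_windows(data, window, step, score):
    """Apply score() to successive windows; return start indices and scores."""
    starts = np.arange(0, len(data) - window + 1, step)
    scores = np.array([score(data[s:s + window]) for s in starts])
    return starts, scores

# Toy data: unit-variance noise with a short burst injected at sample 2000.
rng = np.random.default_rng(1)
data = rng.normal(0.0, 1.0, 4000)
data[2000:2200] += 5.0 * np.sin(2 * np.pi * np.arange(200) / 20.0)

# Stand-in score: mean squared amplitude of each window. In the project, this
# would be replaced by fitting osc to the window and recording chi^2 or
# significance.
starts, scores = scan_windows(data, window=200, step=100,
                              score=lambda seg: np.mean(seg**2))
print("highest-scoring window starts at sample", starts[np.argmax(scores)])
```

Overlapping windows (a step smaller than the window length) reduce the chance that a signal is split across two slices, at the cost of more fits.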

<h3>Expectations and Grading</h3>

For this open-ended task, you will be expected to develop some procedure, analyze your results, and present your findings. Specifically, you will do the following:
       
1. Submit a PDF of your work on MITx, to be graded by your peers based on the criteria outlined below.
2. Grade the work of others based on the same criteria.

For full credit on this peer-reviewed checkpoint, we specifically expect you to complete these three tasks (and support your work with thorough explanation):

- Section 1: Develop a procedure for finding GW-events.
- Section 2: Explain your procedure.
- Section 3: Describe your results and characterize the significance (analyze chi-squared values, report the GW-event time, etc.)

<h3>Peer-Evaluation Rubric</h3>

Submit a PDF of your notebook below to MITx Online. Afterwards, you must peer-grade 3 submissions based on the criteria below (your submission will also be graded on these criteria):

<p align="center">
<img src="https://raw.githubusercontent.com/mitx-8s50/images/main/PROJ1/rubric.png" width="800"/>
</p>

<h3>Section 1: Develop a procedure for finding GW-events</h3>

**Begin your work below. You may use the starting code if you wish.**

*Tip: You can slice the time into short ranges, fit each one individually, and look at the significance. You should see a spike in significance/$\chi^2$ when the gravitational wave is correctly fit by the analytic function. Remember: not all parameters need to vary as in previous fits. You can set* `p['parameter'].vary = False` *for parameters you deem unnecessary to vary.*

In [None]:
#>>>PROBLEM: PROJ1.5.1
# Use this cell for drafting your solution (if desired),
# then enter your solution in the interactive problem online to be graded.

print('Develop machinery to search for GW events across a long time range.')
print('Use the variable white_data_bp, which we created before by whitening and bandpassing')

print(white_data_bp)

times = []
sigs = [] # find significance as a function of time
chi2 = [] # find chi^2 as a function of time


# Develop search function here


plt.figure(figsize=(12, 4))
plt.subplots_adjust(left=0.1, right=0.9, top=0.85, bottom=0.2)
plt.plot(times, sigs)
plt.xlabel('Time (s)')
plt.ylabel(r'N/$\sigma_{N}$')  # raw string avoids the invalid escape sequence "\s"
plt.show()

plt.figure(figsize=(12, 4))
plt.subplots_adjust(left=0.1, right=0.9, top=0.85, bottom=0.2)
plt.plot(times, chi2)
plt.xlabel('Time (s)')
plt.ylabel(r'$\chi^{2}$')  # raw string avoids the invalid escape sequence "\c"
plt.show()

<h3>Section 2: Explanation of procedure</h3>

**Explain your code here:**





<h3>Section 3: Explanation of results</h3>

**Describe your results here:**


