# **UMN LIGO/Multi-Messenger Astronomy Introduction Notebook**

# Useful Background Information

## Astrophysics, Multi-messenger Astronomy, LIGO, Low-latency, Gamma-ray Bursts

* Multi-messenger Astronomy Wikipedia: https://en.wikipedia.org/wiki/Multi-messenger_astronomy
* LIGO Wikipedia: https://en.wikipedia.org/wiki/LIGO
* Gravitational Waves Caltech: https://www.ligo.caltech.edu/page/what-are-gw 
* Gamma-ray Bursts (GRBs) Wikipedia: https://en.wikipedia.org/wiki/Gamma-ray_burst 
* Low-latency gravitational wave alert products and performance paper: https://arxiv.org/pdf/2308.04545 and attached PDF
* RAVEN methods paper: attached PDF
* LLPIC paper: attached PDF

## Python 101

Install any libraries that are not already available in your Python environment (`numpy`, `matplotlib`, `pandas`, etc.) using `pip install [library]`.

* Numpy tutorial: https://www.w3schools.com/python/numpy/default.asp 
* Numpy cheat sheet: https://www.datacamp.com/cheat-sheet/numpy-cheat-sheet-data-analysis-in-python 
* Matplotlib tutorial: https://www.w3schools.com/python/matplotlib_intro.asp 
* Matplotlib cheat sheet: https://www.datacamp.com/cheat-sheet/matplotlib-cheat-sheet-plotting-in-python 
* Pandas tutorial: https://www.w3schools.com/python/pandas/default.asp
* Linux/terminal/directory guide: https://dev.to/softwaresennin/linux-directory-structure-simplified-a-comprehensive-guide-3012 

In [None]:
# Import these libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# RAVEN MDC Dataset

The RAVEN Mock Data Challenge (MDC) dataset is a large, simulated population of compact binary coalescence (CBC) events generated to test and develop low-latency multi-messenger follow-up pipelines. Each row represents one synthetic gravitational-wave event with parameters drawn from astrophysically motivated distributions, including component masses, spins, luminosity distance, redshift, inclination angle, sky position (RA/Dec), and signal-to-noise ratio (SNR). These simulations mimic what LIGO/Virgo/KAGRA would observe during real operations and provide a realistic environment for building and validating tools like RAVEN, which identifies potential coincidences between gravitational-wave alerts and high-energy transients (e.g., GRBs).

In [None]:
# Read the csv of the dataset
data = pd.read_csv('path/to/directory/grb_simulated.csv')
# Display the dataframe (a notebook shows the last expression in a cell)
data

In [None]:
# Lists the columns in the dataframe
data.keys()

## Below are descriptions of the columns/parameters and what they mean

### Timing and Position
* `time_H`, `time_L`, `time_V` - Arrival times at the Hanford, Livingston, and Virgo detectors in GPS seconds
* `time` - Central GPS time of the injection (geocenter)
* `right_ascension` - Sky location (RA) of the source in radians
* `declination` - Sky location (Dec) of the source in radians
* `polarization` - GW polarization angle ($\psi$) in radians
* `coa_phase` - Coalescence phase ($\phi$), the phase of the waveform at merger
* `inclination` - Inclination angle ($\theta$), the angle between orbital angular momentum and line of sight in radians

### Masses and Spins
* `mass1`, `mass2` - Detector-frame component masses (i.e., observed, redshifted masses)
* `mass1_source`, `mass2_source` - Source-frame masses (corrected for redshift)
* `Mc` - Detector-frame chirp mass (combination of m1, m2)
* `Mc_source` - Source-frame chirp mass
* `chi_eff` - Effective spin: mass-weighted projection of component spins along orbital angular momentum
* `chi_p` - Effective precession spin parameter
* `spin1x`, `spin1y`, `spin1z` - Spin components of primary (object 1) in Cartesian coordinates
* `spin2x`, `spin2y`, `spin2z` - Spin components of secondary (object 2)
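
To make the relationships between these columns concrete, here is a minimal sketch on a tiny made-up table (the numbers are invented, not taken from the MDC file). It assumes the dataset follows the usual conventions: source-frame mass equals detector-frame mass divided by $(1 + z)$, and $\chi_{\mathrm{eff}} = \frac{m_1 \chi_{1z} + m_2 \chi_{2z}}{m_1 + m_2}$. It is worth checking these against the real columns rather than taking them for granted.

```python
import pandas as pd

# Toy values invented for illustration; they are NOT from the MDC file.
toy = pd.DataFrame({
    "mass1": [1.6, 12.0],     # detector-frame masses
    "mass2": [1.4, 8.0],
    "z": [0.05, 0.20],        # redshift
    "spin1z": [0.02, 0.50],   # spin components along the orbital angular momentum
    "spin2z": [0.00, -0.10],
})

# Assumed convention: source-frame mass = detector-frame mass / (1 + z)
toy["mass1_source"] = toy["mass1"] / (1 + toy["z"])
toy["mass2_source"] = toy["mass2"] / (1 + toy["z"])

# Assumed convention: effective spin is the mass-weighted average of the z-spins
total_mass = toy["mass1"] + toy["mass2"]
toy["chi_eff"] = (toy["mass1"] * toy["spin1z"] + toy["mass2"] * toy["spin2z"]) / total_mass

print(toy)
```

On the real dataframe, comparing such a recomputed column with the stored one (for example with `np.allclose`) is a quick way to confirm which convention the file uses.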

### Cosmology and Distance
* `z` - Cosmological redshift
* `distance` - Luminosity distance in Mpc
* `eff_dist_H`, `eff_dist_L`, `eff_dist_V` - Effective distance to the source (accounts for orientation) for each detector

### Waveform and Analysis
* `approximant` - Waveform model used to simulate the injection (e.g., SEOBNRv4_ROM)
* `fref` - Reference frequency for spin vectors
* `flow` - Lower frequency cutoff used in the waveform injection (Hz)

### SNR 
* `snr_H`, `snr_L`, `snr_V` - SNRs in Hanford, Livingston, and Virgo
* `snr_net` - Network SNR (combined from all detectors)
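
A common way to combine per-detector SNRs is to add them in quadrature, $\rho_{\mathrm{net}} = \sqrt{\rho_H^2 + \rho_L^2 + \rho_V^2}$. The sketch below assumes that `snr_net` in this dataset is defined that way, which should be verified against the real column. The values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Toy per-detector SNRs, invented for illustration only.
toy = pd.DataFrame({
    "snr_H": [3.0, 8.0],
    "snr_L": [4.0, 6.0],
    "snr_V": [0.0, 2.0],
})

# Quadrature sum of the single-detector SNRs (assumed definition of snr_net)
snr_net_check = np.sqrt(toy["snr_H"]**2 + toy["snr_L"]**2 + toy["snr_V"]**2)

print(snr_net_check)
```

With the real data, `np.allclose(snr_net_check, data["snr_net"])` would show whether the stored column matches this definition.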

### Tidal and Eccentricity
* `lambda1`, `lambda2` - Tidal deformability parameters for object 1 and 2 (important for BNS, NSBH)
* `eccentricity` - Orbital eccentricity at a given reference frequency

### Log Prior Weights
* `logpdraw` - Log probability density of the full injection draw
* `logpdraw_time`, `logpdraw_eccentricity`, etc. - Component-wise log probabilities for injection priors
* `logpdraw_inclination_polarization_coa_phase` - Joint log prior for inclination, polarization, and phase
* `logpdraw_lambda1_lambda2_GIVEN_mass1_source_mass2_source` - Conditional prior on tidal parameters
* `logpdraw_z_mass1_source_mass2_source_spin1x_spin1y_spin1z_spin2x_spin2y_spin2z` - Joint prior for redshift, mass, spin
* `logpdraw_right_ascension_declination` - Sky location prior

# Getting Started

Complete the tasks below, which introduce basic Python skills and show how to work with the RAVEN MDC dataset. If a task requires something you have not learned yet, a quick web search or the tutorials linked above will usually cover it.

## Python Basics and Dataframes

### 1. Load in the csv file

### 2. Print dataset metadata
* Number of rows 
* Number of columns
* Column names

### 3. Compute basic summary statistics, something similar to `df.describe()`

### 4. Show the first 5 rows/events
* Poke around with them and see what the parameters look like

### 5. Identify which columns correspond to mass, spin, distance, etc. Look at the columns and their values that seem the most important/interesting to you

### 6. Determine if there are any missing values in each column
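
As a starting point for the tasks in this subsection, here is a minimal sketch of the relevant pandas calls. It runs on a small invented dataframe named `toy` (a hypothetical stand-in, so that it works without the CSV file). The same calls apply to the dataframe loaded from `grb_simulated.csv`.

```python
import numpy as np
import pandas as pd

# Small invented table; one value is deliberately missing.
toy = pd.DataFrame({
    "mass1": [1.6, 12.0, 30.0, 1.8],
    "mass2": [1.4, 8.0, 25.0, np.nan],
    "distance": [120.0, 800.0, 1500.0, 200.0],
})

# Metadata: number of rows, number of columns, column names
n_rows, n_cols = toy.shape
print(n_rows, n_cols)
print(list(toy.columns))

# Summary statistics (count, mean, std, min, quartiles, max)
print(toy.describe())

# First rows (head() returns 5 rows by default, or fewer if the table is shorter)
print(toy.head())

# Number of missing values in each column
missing = toy.isna().sum()
print(missing)
```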

## Basic Filtering & Subsetting

### 1. Find all events with SNR > 15

### 2. Find the 10 closest events (smallest distance)

### 3. Find events with inclination < 20° (face-on candidates); remember that the `inclination` column is in radians

### 4. Count how many events come from each source type if you have a “type” column (e.g., BNS, NSBH, BBH)
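
The filtering tasks above can be sketched with boolean masks, `nsmallest`, and `value_counts`. The dataframe below is invented, and the `type` column is hypothetical, since the column list above does not include one. The example uses `snr_net` as "the SNR", which is also an assumption. The task may intend a single-detector SNR instead.

```python
import numpy as np
import pandas as pd

# Invented events; the "type" column is hypothetical.
toy = pd.DataFrame({
    "snr_net": [8.0, 22.0, 16.5, 11.0, 30.0],
    "distance": [900.0, 150.0, 400.0, 650.0, 90.0],   # Mpc
    "inclination": [0.10, 1.20, 0.30, 2.90, 0.25],    # radians
    "type": ["BBH", "BNS", "NSBH", "BBH", "BNS"],
})

# Boolean mask: keep rows where the condition is True
loud = toy[toy["snr_net"] > 15]

# The n rows with the smallest distance (n=2 here; use 10 on the real data)
closest = toy.nsmallest(2, "distance")

# Convert the 20 degree threshold to radians before comparing
face_on = toy[toy["inclination"] < np.radians(20.0)]

# Number of events per category
counts = toy["type"].value_counts()

print(loud, closest, face_on, counts, sep="\n\n")
```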

## Deriving Quantities

### 1. Compute and add a column for the symmetric mass ratio: $\eta = \frac{m_1 m_2}{(m_1 + m_2)^2}$

### 2. Compute the chirp mass again, without adding the column to the dataframe: $\mathcal{M}_c = \frac{(m_1 m_2)^{3/5}}{(m_1 + m_2)^{1/5}}$

### 3. Compute and add a column for luminosity distance in Gpc
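
The formulas above translate almost directly into column arithmetic. The sketch below uses an invented two-row dataframe. The new column names `eta` and `distance_Gpc` are illustrative choices, not names from the dataset.

```python
import pandas as pd

# Invented values for illustration.
toy = pd.DataFrame({
    "mass1": [1.4, 30.0],
    "mass2": [1.4, 10.0],
    "distance": [40.0, 1500.0],   # Mpc
})

m1, m2 = toy["mass1"], toy["mass2"]

# Symmetric mass ratio, stored as a new column
toy["eta"] = m1 * m2 / (m1 + m2) ** 2

# Chirp mass kept in a separate variable (not added to the dataframe)
chirp_mass = (m1 * m2) ** (3 / 5) / (m1 + m2) ** (1 / 5)

# Luminosity distance converted from Mpc to Gpc (1 Gpc = 1000 Mpc)
toy["distance_Gpc"] = toy["distance"] / 1000.0

print(toy)
print(chirp_mass)
```

A useful sanity check: for equal masses, $\eta = 0.25$, which is its maximum value, and the chirp mass reduces to $m / 2^{1/5}$.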

## Plotting

### 1. Make histograms of the following columns (pick a sensible number of bins, such as 30-50)
* mass1
* mass2
* Mc
* SNR
* distance and/or redshift
* inclination angle
* $E_{iso}$
* $T_{90}$
* $E_{peak}$

### 2. Label axes clearly and give each plot a title

### 3. Compare the shapes of the distributions
* Which ones look skewed? Why?
* Which ones look symmetrical? Why?

### 4. Compare histograms of SNR vs distance
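
A labeled histogram can be sketched as follows. The data are random numbers standing in for a real column such as `data["mass1"]`, and the axis label assumes masses are in solar masses, which the column descriptions above do not state explicitly.

```python
import numpy as np
import matplotlib.pyplot as plt

# Random stand-in for a real column; NOT drawn from the MDC distributions.
rng = np.random.default_rng(0)
fake_mass1 = rng.uniform(1.0, 50.0, size=1000)

fig, ax = plt.subplots()

# hist returns the bin counts, the bin edges, and the drawn patches
counts, bin_edges, _ = ax.hist(fake_mass1, bins=40)

ax.set_xlabel("mass1 (solar masses, assumed unit)")
ax.set_ylabel("Number of events")
ax.set_title("Detector-frame primary mass (toy data)")

# In a notebook the figure renders at the end of the cell;
# in a plain script, call plt.show() here.
```

To compare two distributions, such as SNR and distance, one option is `fig, axes = plt.subplots(1, 2)` with one histogram per axis, since the two quantities have different units and ranges.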

## GRB-Specific Tasks

### 1. Using the `inclination` column:
* Identify “on-axis” (θ < 20°) vs “off-axis” events
* Make histograms of their SNR and distances
* How does inclination affect detectability?
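
One possible way to split the events is to label each row and then group by that label. The sketch below uses invented numbers, so the printed medians say nothing about the real population. The column name `viewing` is a hypothetical choice, and `snr_net` is assumed to be the SNR of interest.

```python
import numpy as np
import pandas as pd

# Invented events for illustration only.
toy = pd.DataFrame({
    "inclination": [0.10, 0.30, 1.00, 1.40, 0.20, 1.57],   # radians
    "snr_net": [25.0, 18.0, 9.0, 7.5, 21.0, 6.0],
    "distance": [300.0, 350.0, 320.0, 310.0, 400.0, 330.0],  # Mpc
})

# On-axis means inclination below 20 degrees (threshold converted to radians)
on_axis_mask = toy["inclination"] < np.radians(20.0)
toy["viewing"] = np.where(on_axis_mask, "on-axis", "off-axis")

# Median SNR and distance for each group
summary = toy.groupby("viewing")[["snr_net", "distance"]].median()
print(summary)
```

The two subsets `toy[on_axis_mask]` and `toy[~on_axis_mask]` can then be passed to separate histogram calls for the SNR and distance comparison.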

### 2. Energy budget estimates:
* Convert $E_{iso}$ to a rough flux at Earth: $F \sim \frac{E_{iso}}{4\pi D^2}$, where $D$ is the luminosity distance
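
The main pitfall in this estimate is units. The sketch below assumes $E_{iso}$ is given in erg and the distance in Mpc, and converts the distance to cm before applying the formula. The function name `rough_flux_at_earth` is hypothetical. With these units the result is an energy per unit area (erg/cm²), and redshift corrections are ignored.

```python
import numpy as np

MPC_IN_CM = 3.0857e24  # approximate number of centimeters in one megaparsec

def rough_flux_at_earth(e_iso_erg, distance_mpc):
    """Spread an isotropic energy over a sphere of radius D.

    Assumes e_iso_erg is in erg and distance_mpc is in Mpc;
    returns energy per unit area in erg / cm^2.
    """
    d_cm = np.asarray(distance_mpc, dtype=float) * MPC_IN_CM
    return np.asarray(e_iso_erg, dtype=float) / (4.0 * np.pi * d_cm**2)

# Illustrative inputs, not taken from the dataset
near = rough_flux_at_earth(1e51, 100.0)
far = rough_flux_at_earth(1e51, 200.0)

print(near, far)
```

Doubling the distance reduces the result by a factor of four, which is the inverse-square behavior the formula encodes. The function also accepts whole dataframe columns, since the arithmetic is vectorized.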