# Bayesian Signal Denoising and Parameter Optimization for Gravitational Wave Detection in LIGO Data

### Project Overview
This project focuses on detecting and analyzing gravitational wave signals in publicly released LIGO strain data. Gravitational waves, predicted by Albert Einstein in 1916, are ripples in spacetime produced by massive astronomical events such as the merger of black holes. The dataset used here corresponds to the first direct detection of gravitational waves, event GW150914, observed by the LIGO detectors on September 14, 2015. This detection confirmed a key prediction of Einstein's theory of General Relativity. The data is notable both for its historical significance and for its central challenge: distinguishing a faint gravitational wave signal from background noise. By applying Bayesian inference, Gaussian Mixture Models, and Monte Carlo simulations, we aim to improve the extraction of these weak signals, explore the underlying astrophysical parameters, and strengthen the detection of such rare events. The project illustrates how data science and statistical methods can contribute to scientific discovery.

### Data Source
The data for this project comes from the LIGO Open Science Center (LOSC, since renamed the Gravitational Wave Open Science Center), which publishes data from the LIGO experiment. The project uses the data for the first detected gravitational wave event, GW150914, as a case study for signal processing and optimization techniques.
Data can be accessed at the LIGO Open Science Center: https://www.gw-openscience.org
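
The frame file downloaded later in this notebook follows the usual LIGO naming pattern of site, description, GPS start time, and duration, separated by hyphens. The sketch below splits that name to show which stretch of time the file covers. The helper `parse_frame_name` is hypothetical, written only for illustration, and the event time is the approximate published GPS time of GW150914, entered by hand as an assumption.

```python
from pathlib import PurePosixPath


def parse_frame_name(file_name):
    """Split a frame file name of the form SITE-DESCRIPTION-GPSSTART-DURATION.gwf."""
    stem = PurePosixPath(file_name).stem  # drop the .gwf extension
    site, description, gps_start, duration = stem.split("-")
    return {
        "site": site,
        "description": description,
        "gps_start": int(gps_start),
        "duration": int(duration),
    }


info = parse_frame_name("H-H1_LOSC_4_V1-1126259446-32.gwf")

# Assumption: approximate GPS time of GW150914, typed in by hand
event_gps = 1126259462.4

offset = event_gps - info["gps_start"]
inside = 0 <= offset < info["duration"]
print(f"File covers {info['duration']} s starting at GPS {info['gps_start']}")
print(f"Event is roughly {offset:.1f} s into the file (inside file: {inside})")
```

With these values the event sits near the middle of the 32-second segment, which leaves noise-only data on both sides for estimating the detector noise.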


In [1]:
import os
import requests
import numpy as np
import matplotlib.pyplot as plt
from pycbc.frame import read_frame
from pycbc.filter import highpass
from pycbc.psd import welch
from pycbc.types import TimeSeries

ModuleNotFoundError: No module named 'pycbc'
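
The error above means PyCBC is not installed in the active environment. A minimal sketch for checking the notebook's dependencies before running the import cell follows. The list `REQUIRED` and the helper `missing_packages` are names chosen here for illustration, and the suggested `pip install` line assumes a pip-managed environment.

```python
import importlib.util

# Top-level packages imported by this notebook
REQUIRED = ["requests", "numpy", "matplotlib", "pycbc"]


def missing_packages(names):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
    print("Try: pip install " + " ".join(missing))
else:
    print("All required packages are importable.")
```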

In [None]:
# URL for the specific LIGO dataset
ligo_data_url = 'https://www.gw-openscience.org/s/events/GW150914/H-H1_LOSC_4_V1-1126259446-32.gwf'
file_name = 'H-H1_LOSC_4_V1-1126259446-32.gwf'
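# Strain channel stored in the LOSC frame files; the name
# 'H1:GWOSC-4KHZ_R1_STRAIN' belongs to the later GWOSC R1 file releases
channel_name = 'H1:LOSC-STRAIN'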

In [None]:
# Step 1: Download the file if not already present
def download_ligo_data(url, file_name):
    if not os.path.exists(file_name):
        print(f"Downloading {file_name}...")
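        response = requests.get(url, stream=True, timeout=60)
        # Fail early on HTTP errors so an error page is never saved as a .gwf file
        response.raise_for_status()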
        with open(file_name, 'wb') as f:
            for chunk in response.iter_content(chunk_size=8192):
                if chunk:
                    f.write(chunk)
        print(f"Download complete: {file_name}")
    else:
        print(f"File already exists: {file_name}")

# Step 2: Load the Gravitational Wave Data
def load_ligo_data(file_path, channel_name):
    # read_frame returns a PyCBC TimeSeries that already carries the sample
    # spacing (delta_t) and GPS start time stored in the frame file, so no
    # extra conversion or hard-coded sample rate is needed
    strain_data = read_frame(file_path, channel_name)

    print(f"Data Loaded: {len(strain_data)} samples at {strain_data.sample_rate} Hz")
    
    return strain_data

# Step 3: Apply a Highpass Filter to Remove Low-Frequency Noise
def apply_highpass_filter(data, cutoff=15.0):
    filtered_data = highpass(data, cutoff, 8)
    print("Highpass filter applied.")
    return filtered_data

# Step 4: Estimate Power Spectral Density (Optional: Visualization)
def estimate_psd(data, seg_length=4, avg_method='median'):
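    # welch expects integer sample counts, while sample_rate is a float
    seg_len = int(seg_length * data.sample_rate)
    # Use 50% overlap between segments instead of the fixed default stride
    psd = welch(data, seg_len=seg_len, seg_stride=seg_len // 2, avg_method=avg_method)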
    plt.loglog(psd.sample_frequencies, psd, label='Power Spectral Density')
    plt.xlabel('Frequency [Hz]')
    plt.ylabel('Strain^2/Hz')
    plt.title('Power Spectral Density (PSD) of Gravitational Wave Data')
    plt.show()
    return psd

# Step 5: Plot the Raw and Preprocessed Data (Optional)
def plot_time_series(raw_data, filtered_data, title="Gravitational Wave Data"):
    plt.figure(figsize=(10, 6))
    plt.subplot(2, 1, 1)
    plt.plot(raw_data.times, raw_data, color='red', label='Raw Data')
    plt.title(f'{title}: Raw Data')
    plt.xlabel('Time [s]')
    plt.ylabel('Strain')
    
    plt.subplot(2, 1, 2)
    plt.plot(filtered_data.times, filtered_data, color='blue', label='Filtered Data')
    plt.title(f'{title}: Highpass Filtered Data')
    plt.xlabel('Time [s]')
    plt.ylabel('Strain')
    plt.tight_layout()
    plt.show()

In [None]:
# Main Execution

# Step 1: Download the data
download_ligo_data(ligo_data_url, file_name)

# Step 2: Load the data
strain_data = load_ligo_data(file_name, channel_name)

# Step 3: Apply the highpass filter
filtered_data = apply_highpass_filter(strain_data, cutoff=15.0)

# Step 4: Estimate and plot the PSD
psd = estimate_psd(filtered_data)

# Step 5: Plot the raw and filtered time-series data
plot_time_series(strain_data, filtered_data)
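
To make the PSD step more concrete, here is a NumPy-only sketch of Welch's method run on synthetic white noise. It is not PyCBC's implementation: it averages the segment periodograms with a plain mean rather than the median used above, and the function name `welch_psd` and the synthetic noise are assumptions made for illustration. For white noise with unit variance the one-sided PSD should come out close to `2 / sample_rate`.

```python
import numpy as np


def welch_psd(x, sample_rate, seg_len, seg_stride=None):
    """One-sided PSD estimate from averaged, Hann-windowed periodograms."""
    if seg_stride is None:
        seg_stride = seg_len // 2  # 50% overlap between segments
    window = np.hanning(seg_len)
    norm = sample_rate * np.sum(window ** 2)
    periodograms = []
    for start in range(0, len(x) - seg_len + 1, seg_stride):
        segment = x[start:start + seg_len] * window
        power = np.abs(np.fft.rfft(segment)) ** 2 / norm
        # Fold negative frequencies into the positive ones
        # (DC and Nyquist bins are not doubled; assumes even seg_len)
        power[1:-1] *= 2.0
        periodograms.append(power)
    freqs = np.fft.rfftfreq(seg_len, d=1.0 / sample_rate)
    return freqs, np.mean(periodograms, axis=0)


# Synthetic stand-in for 32 s of data sampled at 4096 Hz
sample_rate = 4096
rng = np.random.default_rng(0)
noise = rng.normal(0.0, 1.0, size=32 * sample_rate)

freqs, psd = welch_psd(noise, sample_rate, seg_len=4 * sample_rate)
expected = 2.0 / sample_rate  # flat one-sided PSD of unit-variance white noise
print(f"mean PSD = {psd[1:-1].mean():.3e}, expected about {expected:.3e}")
```

Using 4-second segments gives a frequency resolution of 0.25 Hz, the same segment length as the `seg_length=4` default in `estimate_psd`.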