Written by Anuj Kankani and Siddarth Mahesh                                     
Any bugs are Matt Cerep's fault

A large chunk of the SXS catalog is stored on Anuj's thornyflat scratch account                                                                         
If you need access ask Anuj and he can add you to the allowed users list        
This way you don't have to redownload the catalog, and any files you download from sxs can be used by everyone                                                

In [1]:
#!pip install sxs
#!pip install kuibit
#!pip install numpy
#!pip install pandas
#!pip install matplotlib

In [2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import sxs
import kuibit
from kuibit.gw_mismatch import mismatch_from_strains
from kuibit.timeseries import TimeSeries

This just sets the necessary config options for downloading, and lets you see where the data is being downloaded. If you want to change the data storage location on your machine you have to change the environment variables SXSCONFIGDIR and SXSCACHEDIR. Note: On the cluster make sure you do this, otherwise it will NOT download to scratch by default

In [3]:
print("Welcome!")
sxs.write_config(download=True,cache=True)
print("config directory is",sxs.sxs_directory("config"))
print("cache directory is",sxs.sxs_directory("cache"))

Welcome!
config directory is /home/siddharth-mahesh/.config/sxs
cache directory is /home/siddharth-mahesh/.cache/sxs


Get the catalog information from SXS

In [4]:
catalog = sxs.load("catalog")

Skipping download from 'https://data.black-holes.org/catalog.json' because local file is newer


This determines the simulations we will use. This code block can obviosuly be edited as needed to get the simulations we want.

In [5]:
# There is probably a shorter way to do this, but I'm not familiar enough with dataframes, and this is more user friendly imo.
# SXS defines reference values which are calculated after the junk radiation has settled down.
def get_sims(catalog):
  df = catalog.table #create a Pandas dataframe
  BBH_sims = df.loc[df['object_types'] == 'BHBH']
  r,c = BBH_sims.shape
  print("There are",r,"total BBH simulations in the catalog")

  #print(df.columns) #uncomment this if you want to see all the columns in the catalog

  SPIN_TOLERANCE = 1e-5
  MAXIMUM_ECCENTRICITY = 1e-3

  index_arr = []

  for index, row in BBH_sims.iterrows():
      chi_x1 = row['reference_dimensionless_spin1'][0]
      chi_y1 = row['reference_dimensionless_spin1'][1]
      chi_z1 = row['reference_dimensionless_spin1'][2]
      chi_x2 = row['reference_dimensionless_spin2'][0]
      chi_y2 = row['reference_dimensionless_spin2'][1]
      chi_z2 = row['reference_dimensionless_spin2'][2]
      ecc = row['reference_eccentricity']
      mass_ratio = row['reference_mass_ratio']

      #make sure x and y components of spin are under tolerance as well as the eccentricity
      if(abs(chi_x1)<=SPIN_TOLERANCE and abs(chi_y1)<=SPIN_TOLERANCE and abs(chi_x2)<=SPIN_TOLERANCE and abs(chi_y2)<=SPIN_TOLERANCE and abs(ecc)<MAXIMUM_ECCENTRICITY):
        index_arr.append(index)

  sims = BBH_sims[BBH_sims.index.isin(index_arr)]
  r,c = sims.shape
  print("Found",r,"BBH simulations matching conditions")

  return sims

sims is now a Pandas dataframe with all the catalog information for the simulations we want.

In [6]:
sims = get_sims(catalog)

There are 2019 total BBH simulations in the catalog
Found 436 BBH simulations matching conditions


The search_expression variable defines the file that will be used in every simulation. This variable can be edited to get multiple resolution levels, psi4 data, etc... More information can be found at the bottom of https://sxs.readthedocs.io/en/main/tutorials/02-Catalog/ in the "Selecting Data Sets" section as well as in the source code. search_expression="/Lev/rhOverM" searches for the latest version, highest resolution strain data

In [7]:
search_expression = "/Lev/rhOverM"

First we can check the size of the file we will download

In [24]:
def get_file_size(catalog,sims,search_expression):
  id = np.array(sims.index)
  total_file_size = 0
  for str in id:
      #this assumes every sim is identified as SXS:BBH:####v# where # is a number
      BBH_sim = catalog.select_files(str+search_expression)
      for key,subdict in BBH_sim.items():
          total_file_size += subdict['filesize']

  print("total size is ",(total_file_size/1000000)/1000,' Gb')

In [25]:
get_file_size(catalog,sims,search_expression)

total size is  43.272072269999995  Gb


 We need to specify the extrapolation order we want to use. SXS:BBH:1111 only has a 2nd order extrapolation, but everything else has 4

In [10]:
EXTRAPOLATION_ORDER = 4

Finally, we can download the waveforms into our cache directory. If the file already exists in our cache, it will not download it.

In [11]:
def download_waveforms(sims,search_expression,EXTRAPOLATION_ORDER = 4):
  for str in id:
    try:
      sxs.load(str+search_expression, extrapolation_order=EXTRAPOLATION_ORDER)
    except:
      print("Could not find given extrapolation order. Trying 2nd order extrapolation")
      sxs.load(str+search_expression, extrapolation_order=2)

In [12]:
#download_waveforms(sims)

Everything below is analysis code. The general principle is to take a sxs waveform, convert the needed modes to kuibit timeseries and use kuibit's built in functionality as much as possible. Nothing has been tested here yet. All analysis functions that take in a sxs waveform assume that the junk radiation has been sliced off.

Helper function to find nearest value in an array

In [13]:
def find_nearest(array, value):
    array = np.asarray(array)
    idx = (np.abs(array - value)).argmin()
    return array[idx]

In [14]:
def find_nearest_index(array, value):
    array = np.asarray(array)
    idx = (np.abs(array - value)).argmin()
    return idx

Quick helper function that takes in a sxs waveform and returns a kuibit timeseries for the given l,m mode.

In [15]:
def get_kuibit_lm(w,l,m):
    index = w.index(l, m)
    w_temp = w[:,index]
    return TimeSeries(w.t,w_temp)

Compute the time difference between the peak strain of two modes.

In [16]:
def get_peak_strain_time_difference_between_modes(w,l1,m1,l2,m2):
  t1 = get_kuibit_lm(w,l1,m1)
  t2 = get_kuibit_lm(w,l2,m2)
  return t1.time_at_maximum()-t2.time_at_maximum()


return the waveform frequency as a timeseries for a given l,m mode

In [17]:
def get_kuibit_frequency_lm(w,l,m):
  ts = get_kuibit_lm(w,l,m)
  #I'm pretty sure this is right, but if there is a mistake somewhere it's probably here
  return ts.phase_angular_velocity()

return the isco frequency given a final mass and spin.

In [18]:
def get_w_isco(m,mf,af):
  # note that m is the multipole moment of the waveform
  # get r_isco and Omega_isco first (See Bardeen, Press, Teukolsky 1972)
  Z1 = 1. + ((1. - af*af)**(1./3.))*((1 + af)**(1/3) + (1 - af)**(1/3))
  Z2 = np.sqrt(3*af*af + Z1*Z1)
  r_isco = (3 + Z2 - np.sqrt(3 - Z1)*(3 + Z1 + 2*Z2))*mf
  Omega_isco = 1/((r_isco**1.5 + af)*mf)
  w_isco = m*Omega_isco
  return w_isco


return the time difference between the peak strain and t_isco for a given l,m mode, Assumes junk radiation has been removed

In [19]:
def get_isco_peak_strain_time_difference_lm(w,l,m,mf,af):
  ts = get_kuibit_lm(w,l,m)
  freq = get_kuibit_frequency_lm(w,l,m)
  w_isco = get_w_isco(m,mf,af)
  t_peak = ts.time_at_maximum()
  #There may be cases where the waveform well after the peak strain becomes noisy and reaches w_isco again, so we only look for w_isco before the peak strain
  freq = freq.cropped(end=t_peak)
  t_isco = freq.t[find_nearest_index(freq.y,w_isco)]
  return t_peak-t_isco

Calculate the mismatch between a sxs waveform and a Fully(Freedom) Effective One Body model for a given l,m mode

In [20]:
#NEEDS TO BE TESTED, ALSO SHOULD CHECK THAT KUIBIT CALCULATES MISMATCHES THE SAME WAY SEOBNR DOES
#NEED TO ADD FUNCTIONALITY TO MAKE SURE START AND END TIMES ARE THE SAME
#probably should just pass in a start and end time and use kuibit to crop the data? idk how you wrote the BOB code so not messing with this for now
def calculate_mismatch(w,FEOB,SEOB,l,m):
  sxs = get_kuibit_lm(w,l,m)
  mismatch_FEOB = mismatch_from_strains(sxs,FEOB)
  mismatch_SEOB = mismatch_from_strains(sxs,SEOB)
  return mismatch_FEOB, mismatch_SEOB

Now that we have downloaded all our waveforms, we can load them and analyze as necessary

In [21]:
#you can convert all the modes into a kuibit GravitationalWavesOneDet, but kuibit assumes GravitationalWavesOneDet is psi4 at a finite distance
#you can easily work around this by setting r = 1 for extrapolated data and just being aware your data is the strain
#but in the interest of making this as easy to work with, without requiring too much kuibit knowledge, and avoiding future mistakes
#the analysis functions are built to take in a sxs waveform with junk radiation removed, convert the necessary modes to kuibit timeseries and go from there
def analyze_waveforms(sims,search_expression,EXTRAPOLATION_ORDER=4):
  delta_t = []
  x_arr = []
  for index, row in sims.iterrows():
    try:
      w = sxs.load(index+search_expression, extrapolation_order=EXTRAPOLATION_ORDER)
    except:
      print("Could not find given extrapolation order. Trying 2nd order extrapolation")
      w = sxs.load(str+search_expression, extrapolation_order=2)
    
    relax_time = row['relaxation_time']
    #this removes junk radiation according to the time calculated by sxs. sxs warns this is a very rough estimate
    #The analysis functions assume junk radiation has been taken care of so it is necessary to remove junk radiation here
    w_sliced = w[w.index_closest_to(relax_time):]
    rem_mass = row['remnant_mass']
    rem_spin = row['remnant_dimensionless_spin']
    #as an example we can plot the time difference between the peak strain time of the (2,2) and (4,4) modes as a function of the initial mass ratio
    delta_t.append(get_peak_strain_time_difference_between_modes(w_sliced,l1=2,m1=2,l2=4,m2=4))
    x_arr.append(row['initial_mass_ratio'])
  return x_arr, delta_t

  







In [22]:
#in case you want to see the possible column values you can call
#print(sims.columns)

In [23]:
x_arr, delta_t = analyze_waveforms(sims,search_expression,4)
plt.scatter(x_arr,delta_t)
plt.show()


Found the following files to load from the SXS catalog:
    SXS:BBH:0001v6/Lev5/rhOverM_Asymptotic_GeometricUnits_CoM.h5
Found the following files to load from the SXS catalog:
    SXS:BBH:0002v7/Lev6/rhOverM_Asymptotic_GeometricUnits_CoM.h5
Downloading to /home/siddharth-mahesh/.cache/sxs/SXS:BBH:0002v4/Lev6/rhOverM_Asymptotic_GeometricUnits_CoM.h5:


  0%|                                                                                     | 0/166211179 [00:00…

Found the following files to load from the SXS catalog:
    SXS:BBH:0004v5/Lev6/rhOverM_Asymptotic_GeometricUnits_CoM.h5
Downloading to /home/siddharth-mahesh/.cache/sxs/SXS:BBH:0004v2/Lev6/rhOverM_Asymptotic_GeometricUnits_CoM.h5:


  0%|                                                                                     | 0/158296692 [00:00…

Could not find given extrapolation order. Trying 2nd order extrapolation


TypeError: unsupported operand type(s) for +: 'type' and 'str'