# Clone, get data and load waveforms from a simulation to the CoRe DB

Illustrates the use of the watpy objects to work with the CoRe DB.

SB 09/2021 
AG 05/2023 
core@uni-jena.de

## 0. Start

Clone the repo somewhere and install the package:

```
git clone https://git.tpi.uni-jena.de/core/watpy.git
cd watpy
python setup.py install
```

Now we will be cloning part of the CoRe DB from the gitlab repository

https://core-gitlfs.tpi.uni-jena.de/core_database

so prepare a folder for those data:

```
mkdir CoRe_DB_clone # for local clone of the CoRe DB
```

or

In [None]:
import os
os.makedirs('./CoRe_DB_clone', exist_ok=True) 

## 1. Clone the CoRe DB

In [None]:
from watpy.coredb.coredb import *

Initialize a `CoRe_db()` object by specifying the path where we would like the CoRe DB to be initialized. 
The initialization either clones the special repository `core_database_index` (and only this one) into the given path or, if the repository already exists there, synchronizes it with (i.e. pulls) the latest version.

In [None]:
# Tell Git to accept our certificate
!git config --global http.sslCAInfo /etc/certs/core-gitlfs-tpi-uni-jena-de.pem

In [None]:
db_path = './CoRe_DB_clone/'
cdb = CoRe_db(db_path)

The `cdb` object contains the CoRe DB index, which is a `CoRe_idx()` object with essential metadata for all the simulations contained in the DB. The metadata are stored in a list of `CoRe_md()` objects; `CoRe_md` is a simple class wrapping a Python dictionary.

In [None]:
idb = cdb.idb

print(idb.dbkeys) # show the database_key for each simulation

# show the metadata in the CoRe DB index for each simulation
entries = 0
for i in idb.index: 
    entries += 1
    for k, v in i.data.items():
        print('  {} = {}'.format(k,v))
    
    break # comment this out to see all entries (large output)
print('Shown {} entries'.format(entries))

It is also possible to plot some quantities:

In [None]:
fig, ax = idb.show('id_eos', to_float=False) 

In [None]:
fig, ax = idb.show('id_gw_frequency_Momega22', to_float=True) 
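
For a quick text summary of the same kind of information, the index entries can also be tallied directly. The following is a minimal sketch that uses plain dictionaries as stand-ins for the `data` attribute of the index entries; the sample records and the helper `count_by_key` are hypothetical and not part of watpy.

```python
from collections import Counter

def count_by_key(records, key):
    """Count how many metadata dictionaries share each value of `key`."""
    return Counter(rec[key] for rec in records if key in rec)

# Hypothetical stand-ins for `i.data` of the entries in `idb.index`
sample_index = [
    {'database_key': 'XXX:0001', 'id_eos': 'DD2'},
    {'database_key': 'XXX:0002', 'id_eos': 'SLy'},
    {'database_key': 'XXX:0003', 'id_eos': 'DD2'},
]

eos_counts = count_by_key(sample_index, 'id_eos')
for eos, n in eos_counts.most_common():
    print('  {} : {}'.format(eos, n))
```

With a real index one would pass `[i.data for i in idb.index]` as `records`.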

**Note that the index contains only a subset of all the available metadata for each simulation.** We can find a group of simulations based on these metadata using the usual dictionary manipulation:

In [None]:
key = 'id_eos'
val = 'DD2'
mdl_id_eos_DD2 = [i for i in idb.index if i.data[key] == val] # list of CoRe_md objects (metadata dictionaries)

# show metadata for these runs
for md in mdl_id_eos_DD2:
    for k, v in md.data.items():
        print('  {} = {}'.format(k,v))
        

The corresponding dbkeys are:

In [None]:
dbkeys_id_eos_DD2 = [md.data['database_key'] for md in mdl_id_eos_DD2]

print(dbkeys_id_eos_DD2)
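
The same selection can be generalized to several criteria at once. The sketch below works on plain dictionaries standing in for `i.data`; the helper `select_dbkeys` and the sample records are hypothetical illustrations, not watpy functions.

```python
def select_dbkeys(records, **criteria):
    """Return the database_key of every record matching all key=value criteria."""
    selected = []
    for rec in records:
        if all(rec.get(k) == v for k, v in criteria.items()):
            selected.append(rec['database_key'])
    return selected

# Hypothetical stand-ins for `i.data` of the entries in `idb.index`
sample_index = [
    {'database_key': 'XXX:0001', 'id_eos': 'DD2', 'id_mass_ratio': 1.0},
    {'database_key': 'XXX:0002', 'id_eos': 'SLy', 'id_mass_ratio': 1.0},
    {'database_key': 'XXX:0003', 'id_eos': 'DD2', 'id_mass_ratio': 1.5},
]

print(select_dbkeys(sample_index, id_eos='DD2'))
print(select_dbkeys(sample_index, id_eos='DD2', id_mass_ratio=1.0))
```

Values in the real index may be stored as strings (the `to_float` argument of `idb.show` suggests this), so numeric criteria may need a conversion before comparing.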

We could now sync the local DB with the entire CoRe DB:

In [None]:
#cdb.sync(lfs=True, verbose=False) # this takes some time ...

But if only a subset of simulations is required, we can clone only the corresponding repositories. We can restrict the synchronization to a subset of simulations by explicitly passing the `dbkeys` argument:

In [None]:
cdb.sync(dbkeys=dbkeys_id_eos_DD2, verbose=False, lfs=True, prot='https')
#cdb.sync(dbkeys='THC:0001',verbose=False, lfs=True, prot='https') # To sync only one simulation

The `cdb` object holds a dictionary of `CoRe_sim()` objects labelled by the `database_key`. Each `CoRe_sim()` object contains the metadata and data of all the runs of a simulation, i.e. the content of one of the git repos in the CoRe DB group.

In [None]:
sim = cdb.sim

# see also 2. below
print(sim.keys())

print(sim['THC:0013'].run)

print(sim['THC:0013'].run['R01'])
print(sim['THC:0013'].run['R01'].data) # now you can work with this!
print(sim['THC:0013'].run['R01'].md) # now you can work with this!

We now have the data we want.

## 2. Get the CoRe DB data

Now let's look more closely at what is inside one simulation object:

In [None]:
thc13 = sim['THC:0013']

# metadata of this simulation - common data for all runs, from metadata_main.txt
for k, v in thc13.md.data.items():
    print('  {} = {}'.format(k,v))

The runs available for this simulation:

In [None]:
print(thc13.run.keys())

Each run is a `CoRe_run()` object that contains the metadata for the run and the actual data. The metadata are accessed as usual:

In [None]:
thc13_r01 = thc13.run['R01']

# metadata of this simulation run - note this has more info, from metadata.txt
for k, v in thc13_r01.md.data.items():
    print('  {} = {}'.format(k,v))

The actual data are stored in a `CoRe_h5()` object that allows us to easily read (write) from (to) the HDF5 format. For example, we can dump the HDF5 data into `.txt` files. We can choose what to extract ($h$, $\Psi_4$ or the energetics of the waveform) or just extract everything in the same directory where the original HDF5 archive was stored. The `.txt` files can now be loaded with any python routine (or with the `wave` classes of watpy).

In [None]:
# h5 data file
print(thc13_r01.data)
print(thc13_r01.data.dfile)

# extract to txt
thc13_r01.data.write_strain_to_txt() 
thc13_r01.data.write_psi4_to_txt()
thc13_r01.data.write_EJ_to_txt()

# or all three in one:
#r01.h5.extract_all()
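
As an illustration of loading such a text file with plain Python, here is a minimal sketch. It assumes a whitespace-separated table of numbers with `#` header lines; the file name, the column layout and the helper `load_columns` are hypothetical, so check the header of the extracted files for the actual columns.

```python
import os
import tempfile

def load_columns(path, comment='#'):
    """Read a whitespace-separated text table into a list of float rows."""
    rows = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith(comment):
                continue  # skip blank lines and header comments
            rows.append([float(x) for x in line.split()])
    return rows

# Hypothetical example file, written here only to make the sketch runnable
tmpdir = tempfile.mkdtemp()
example = os.path.join(tmpdir, 'example_waveform.txt')
with open(example, 'w') as f:
    f.write('# col1 col2 col3\n0.0 1.0 0.0\n0.5 0.8 -0.6\n')

rows = load_columns(example)
print(rows)
```

If NumPy is available, `numpy.loadtxt(path)` reads the same kind of file into an array and skips `#` lines by default.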

**NOTE: The extracted files are not meant to be tracked by the CoRe DB repo and should not be added to any commit.**
The `CoRe_run` object can help you to delete the `.txt` files as follows:

In [None]:
#thc13_r01.clean_txt() # delete files extracted from the HDF5

The `CoRe_h5()` object also has routines to read the data directly at a chosen extraction radius. If no radius is given, or if the selected radius is not among the available ones, the largest is chosen by default. So, one can finally see the data:

In [None]:
fig, ax = thc13_r01.data.show('rh_22')
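
The default-radius rule described above can be sketched in a few lines. This only illustrates the stated behaviour and is not watpy's implementation; the function `pick_radius` and the list of radii are hypothetical.

```python
def pick_radius(available, requested=None):
    """Return `requested` if it is available, otherwise the largest radius."""
    if requested is not None and requested in available:
        return requested
    return max(available)

radii = [400.0, 500.0, 700.0]  # hypothetical extraction radii
print(pick_radius(radii, 500.0))  # requested radius is available
print(pick_radius(radii))         # no radius given
print(pick_radius(radii, 123.0))  # requested radius is not available
```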

Finally, we can check the content of the HDF5 and import the dataset as a numpy array:

In [None]:
# check dset, h5dump -n
dset = thc13_r01.data.dump()

# import as numpy array
# 'read' is deprecated, use read_dset if possible
dset = thc13_r01.data.read('rh_22')
print(dset)

# plot it
import matplotlib.pyplot as plt
plt.plot(dset[:,0],dset[:,1])

## 3. Load waveforms from a simulation

Here we give an example of how to load waveforms from a simulation and write data and metadata in CoRe format. Something like this must be done, for example, every time a new simulation needs to be added to the CoRe DB.

In [None]:
from watpy.wave.wave import *
from watpy.utils import ioutils
from watpy.utils.units import MSun_sec
import numpy as np
import os, glob

In [None]:
Msun_sec = MSun_sec() #4.925794970773135e-06

# metadata
thcsim = {}
thcsim['folder'] = './MySim_THC_135135' # simulation folder
thcsim['mass'] = 2 * 1.364 # binary mass in solar masses
thcsim['q'] = 1.0 # mass ratio, >= 1
thcsim['f0_Hz'] = 565.08 # initial GW frequency in Hz
thcsim['f0'] = thcsim['f0_Hz'] * Msun_sec
thcsim['Momg22'] = 2*np.pi * thcsim['mass'] * thcsim['f0'] # initial GW frequency M*omega_22 = 2*pi*M*f0 in geom. units
thcsim['massA'] = 1.364
thcsim['massB'] = 1.364
thcsim['madm'] = 2.703 # ADM mass (t=0) 
thcsim['jadm'] = 7.400 # ADM ang.mom. (t=0) 
thcsim['id_code']                  = 'LORENE'
thcsim['id_type']                  = 'Irrotational'
thcsim['id_mass']                  = 2.7
thcsim['id_rest_mass']             = 2.94554
thcsim['id_mass_ratio']            = 1.0
thcsim['id_ADM_mass']              = 2.67288
thcsim['id_ADM_angularmomentum']   = 7.01514
thcsim['id_gw_frequency_Hz']       = 663.58
thcsim['id_gw_frequency_Momega22'] = 0.0554514940011
thcsim['id_eos']                   = 'ABC'
thcsim['id_kappa2T']               = 159.0084296249798
thcsim['id_Lambda']                = 848.0449579998918
thcsim['id_eccentricity']          = None 
thcsim['id_mass_starA']            = 1.35
thcsim['id_rest_mass_starA']       = 1.47277
thcsim['id_spin_starA']            = 0, 0, 0
thcsim['id_LoveNum_kell_starA']    = 0.09996, 0.0269, 0.00984
thcsim['id_Lambdaell_starA']       = 848.0449579998921, 2001.0063178210328, 4584.234164607441
thcsim['id_mass_starB']            = 1.35
thcsim['id_rest_mass_starB']       = 1.47277
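
The conversion between the GW frequency in Hz and the dimensionless frequency can be checked against the `id_*` entries of the dictionary. The sketch below assumes $M\omega_{22} = 2\pi M f$ with the mass expressed in seconds; the helper `m_omega22` is a hypothetical name, and the solar-mass constant is the value quoted in the comment of the cell above.

```python
import math

MSUN_SEC = 4.925794970773135e-06  # solar mass in seconds, value quoted above

def m_omega22(mass_msun, f_hz):
    """Dimensionless (2,2) GW frequency M*omega_22 = 2*pi * M * f."""
    return 2.0 * math.pi * mass_msun * f_hz * MSUN_SEC

# id_mass = 2.7 and id_gw_frequency_Hz = 663.58 from the dictionary above
momega = m_omega22(2.7, 663.58)
print(momega)
```

The result agrees with the `id_gw_frequency_Momega22` entry of the dictionary to the precision of the inputs.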

Define directory to save CoRe output data

In [None]:
thcsim['pre-release-folder'] = './MySim_THC_135135/CoReDB' # folder with CoRe formatted files

# Create CoRe output folder if needed
os.makedirs(thcsim['pre-release-folder'], exist_ok = True)

Collect all $\Psi_4$ files and define a multipolar wave with the class `mwaves()`

In [None]:
fnames = [os.path.split(x)[1] for x in glob.glob('{}/{}'.format(thcsim['folder'],'mp_Psi4_l?_m?_r400.00.asc'))]

wm = mwaves(path = thcsim['folder'], code = 'cactus', filenames = fnames, 
            mass = thcsim['mass'], f0 = thcsim['f0'], ignore_negative_m=True)
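
To see which modes and radius a file list covers, the file names can be parsed with a regular expression. This is a minimal sketch; the naming scheme is inferred from the glob pattern above and is therefore an assumption, and `parse_mode_filename` is a hypothetical helper.

```python
import re

# Pattern inferred from the glob above; the naming scheme is an assumption
MODE_RE = re.compile(r'mp_Psi4_l(\d+)_m(-?\d+)_r([0-9.]+)\.asc$')

def parse_mode_filename(fname):
    """Return (l, m, radius) parsed from a multipole file name, or None."""
    match = MODE_RE.search(fname)
    if match is None:
        return None
    l, m, r = match.groups()
    return int(l), int(m), float(r)

print(parse_mode_filename('mp_Psi4_l2_m2_r400.00.asc'))
print(parse_mode_filename('mp_Psi4_l2_m-2_r400.00.asc'))
print(parse_mode_filename('notes.txt'))
```

Note that in a glob pattern `?` matches exactly one character, so `m?` in the pattern above selects only file names with a single-character `m` value; names such as `m-2` would not be collected.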

Show (2,2) strain.

In [None]:
h22 = wm.get(l=2, m=2)
fig = h22.show_strain()

Get strain from all modes

$h_+ - i h_\times = D_L^{-1}\sum_{\ell=2}^\infty\sum_{m=-\ell}^{\ell} h_{\ell m}(t)\,{}^{-2}Y_{\ell m}(\iota,\varphi)$

In [None]:
time, hplus, hcross = wm.hlm_to_strain()

import matplotlib.pyplot as plt
fig, ax = plt.subplots()
ax.plot(time, hplus, label=r'$h_+$')
ax.plot(time, hcross,'--',label=r'$h_{\times}$') 
ax.set_xlabel('time')
ax.grid()
plt.legend()
#fig.savefig("nrmodes2strain.pdf")
plt.show()
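
For a single instant of time, the mode sum restricted to the $(2,\pm 2)$ modes can be written out explicitly. The following sketch uses the closed-form spin-weight $-2$ harmonics for $\ell=2$, $m=\pm 2$; it only illustrates the formula and is not the implementation of `hlm_to_strain`. The function names are hypothetical.

```python
import cmath
import math

def ylm_s2_l2(m, iota, phi):
    """Spin-weight -2 spherical harmonic for l=2, m=+2 or m=-2."""
    norm = math.sqrt(5.0 / (64.0 * math.pi))
    if m == 2:
        return norm * (1.0 + math.cos(iota)) ** 2 * cmath.exp(2j * phi)
    if m == -2:
        return norm * (1.0 - math.cos(iota)) ** 2 * cmath.exp(-2j * phi)
    raise ValueError('only m = +2 and m = -2 are implemented in this sketch')

def strain_from_22(h22, h2m2, iota, phi, distance=1.0):
    """Return (h_plus, h_cross) from the (2,2) and (2,-2) mode values."""
    h = (h22 * ylm_s2_l2(2, iota, phi) + h2m2 * ylm_s2_l2(-2, iota, phi)) / distance
    return h.real, -h.imag  # because h = h_plus - i h_cross

# Face-on observer (iota = 0): only the (2,2) mode contributes
hp, hc = strain_from_22(1.0 + 0.0j, 0.0j, iota=0.0, phi=0.0)
print(hp, hc)
```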

Write text files for every mode of $\Psi_{4,\ell m}$ and $h_{\ell m}$.

In [None]:
for (l, m) in wm.modes:
    # Psi4 multipole
    psilm = wm.get(var='Psi4', l=l, m=m)
    psilm.write_to_txt('Psi4', thcsim['pre-release-folder'])

    # strain multipole
    hlm = wm.get(l=l, m=m)
    hlm.write_to_txt('h', thcsim['pre-release-folder'])

Write energetics to text files.

In [None]:
wm.energetics(thcsim['massA'], thcsim['massB'], thcsim['madm'], thcsim['jadm'], 
              path_out = thcsim['pre-release-folder'])

Above, the `thcsim` dictionary is (carefully) written using (some of) the keys for the CoRe DB metadata. The latter are stored in a `CoRe_md()` object, which basically contains a dictionary. Let's see what is inside:

In [None]:
from watpy.coredb.metadata import CoRe_md

md = CoRe_md() # initialized empty
print(md.path)
print(md.data)

#md.info() # Uncomment to see the information about each key of the metadata

To fill this object we can pass a dictionary like `thcsim`. We then use `CoRe_md()` to write the metadata to a text file, i.e. a `metadata.txt`:

In [None]:
md.update_fromdict(thcsim)
# md = CoRe_md(metadata = thcsim) # alternatively, (re-)initialize

# show the metadata
for k, v in md.data.items():
    print('{} = {}'.format(k,v))

# write
md.write(path = thcsim['pre-release-folder'], fname = 'metadata.txt')
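
To inspect a written file without watpy, a simple reader can be sketched as follows. It assumes the file consists of `key = value` lines, in the same form as the printout above; the actual layout produced by `md.write` may differ, and both `read_key_value_file` and the example file are hypothetical.

```python
import os
import tempfile

def read_key_value_file(path):
    """Parse `key = value` lines into a dict of strings."""
    data = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith('#') or '=' not in line:
                continue  # skip blanks, comments and malformed lines
            key, value = line.split('=', 1)
            data[key.strip()] = value.strip()
    return data

# Hypothetical file, written here only to make the sketch runnable
tmpdir = tempfile.mkdtemp()
fname = os.path.join(tmpdir, 'metadata_example.txt')
with open(fname, 'w') as f:
    f.write('# example\nid_eos = ABC\nid_mass = 2.7\n')

meta = read_key_value_file(fname)
print(meta)
```

All values are returned as strings; converting them to numbers or tuples is left to the caller.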