# Generate Simulated Data
-----

This notebook shows how to use the `rqpy.sim.PulseSim` class to generate simulated data.

Import the packages required for this demo.

In [1]:
import numpy as np
import pandas as pd
from glob import glob
from scipy import stats
import rqpy as rp
from cutbucket import CutUtils

Specify the paths to the RQs and to the raw data. The RQs are assumed to have been generated with `rqpy.process.rq` (see the `process_data.ipynb` demo in the `demos/processing/` folder). We load these RQs and pass them to the pulse simulation.

In [2]:
pathtorq = "/path/to/saved/rqs/"    # folder containing the pickled RQ files
pathtodata = "/path/to/raw/data/"   # folder containing the raw data
fs = 625e3                          # sample rate in Hz

In [3]:
filelist = sorted(glob(f"{pathtorq}*dF*.pkl"))
print(len(filelist))

4739


In [4]:
rq = pd.concat([pd.read_pickle(f) for f in filelist],
               ignore_index=True)

print(f"Series numbers included: {sorted(set(rq.seriesnumber))}")
print(f"The RQ DataFrame uses {rq.memory_usage(index=True, deep=True).sum()*1e-9:.2f} GB of RAM")

Series numbers included: [91812011525, 91812012114, 91812020849, 91812021102, 91812022128, 91812041814, 91812042314, 91812052152, 91812062018]
The RQ DataFrame uses 1.53 GB of RAM


If the desired cuts are stored in a repository using [CutBucket](https://github.com/ucbpylegroup/cutbucket), we can load them here. Otherwise, they need to be defined in this notebook from the RQs in the loaded DataFrame.

In [5]:
cut_repo = CutUtils("/path/to/cut/repository/", "data_id", lgcsync=True)

cgood_randoms = cut_repo.loadcut("cgood_randoms")

Connecting to GitHub Repository, please ensure that your ssh keys have been uploaded to GitHub account
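
If no cut repository is available, a cut can be built directly from the RQ DataFrame as a boolean mask with one entry per event. The following is only a minimal sketch on toy data: the column names `randomstrigger` and `baseline`, the values, and the threshold are hypothetical and are not taken from the real RQ files.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the RQ DataFrame. All columns except seriesnumber are
# hypothetical names chosen for this illustration.
rq_demo = pd.DataFrame({
    "seriesnumber": [91812011525] * 3 + [91812012114] * 3,
    "randomstrigger": [True, True, False, True, False, True],
    "baseline": [0.1, 5.0, 0.2, 0.15, 0.1, 0.3],
})

# A cut is a boolean array with one entry per event (row).
crandoms = rq_demo.randomstrigger.to_numpy(dtype=bool)
cbaseline = (rq_demo.baseline < 1.0).to_numpy()   # made-up threshold

# Cuts are combined with element-wise logical AND.
cgood_randoms_demo = crandoms & cbaseline

print(cgood_randoms_demo)
print(f"Events passing: {cgood_randoms_demo.sum()} of {len(rq_demo)}")
```

Because the mask has the same length as the DataFrame, it can be used to index the DataFrame directly or be combined with further masks.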


We now specify a pulse shape, as well as the parameters for generating the amplitudes and time delay values of the pulses.

In this case, the time delays are drawn from a Gaussian distribution centered at 0 with a standard deviation of 16 $\mu \mathrm{s}$, and the amplitudes are drawn from a uniform distribution from 0 to 0.6 $\mu \mathrm{A}$.
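
To make the two distributions concrete, the draws can be sketched with `scipy.stats` alone, independently of `rqpy`. This is an illustration under assumptions: the number of events and the seed are arbitrary, and it assumes that the amplitudes follow `scipy.stats.uniform`, whose `loc` and `scale` arguments describe the interval `[loc, loc + scale]`.

```python
import numpy as np
from scipy import stats

n_events = 10_000   # hypothetical number of simulated pulses
seed = 0            # fixed seed so the sketch is reproducible

# Gaussian time delays centered at 0 with a 16 us standard deviation.
tdelay = stats.norm.rvs(loc=0, scale=16e-6, size=n_events, random_state=seed)

# Uniform amplitudes on [loc, loc + scale] = [0, 0.6 uA].
amplitudes = stats.uniform.rvs(loc=0, scale=6e-7, size=n_events,
                               random_state=seed + 1)

print(f"tdelay: mean = {tdelay.mean():.2e} s, std = {tdelay.std():.2e} s")
print(f"amplitudes: min = {amplitudes.min():.2e} A, max = {amplitudes.max():.2e} A")
```

Note that for the uniform distribution `scale` is the width of the interval, not its upper edge; the two coincide here only because `loc=0`.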

In [6]:
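# Pulse template: 32500 samples at fs, with time constants of 20 us and 80 us
template_sim = rp.make_ideal_template(np.arange(32500)/fs, 20e-6, 80e-6)

for s in sorted(set(rq.seriesnumber)):
    print(f"Generating data for series number {s}")
    # Zero-pad the series number to 12 digits and split it into two parts
    # joined by an underscore (first 8 digits, then the last 4)
    snum_str = f"{s:012}"
    snum_str = snum_str[:8] + "_" + snum_str[8:]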
    
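    # Select only the good randoms belonging to this series, so that each
    # output folder holds events from its own series (this assumes that
    # cgood_randoms is a boolean array aligned with the rows of rq)
    cseries = cgood_randoms & (rq.seriesnumber == s).to_numpy()
    PS = rp.sim.PulseSim(rq, pathtodata, "mid.gz", template_sim, fs,
                         cut=cseries)
    PS.generate_sim_data("tdelay", distribution=stats.norm, loc=0, scale=16e-6)
    PS.generate_sim_data("amplitudes", loc=0, scale=6e-7)
    PS.run_sim(f"/path/to/save/data/{snum_str}/", channel="PBS1", det="Z6")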

Generating data for series number 91812011525
Generating data for series number 91812012114
Generating data for series number 91812020849
Generating data for series number 91812021102
Generating data for series number 91812022128
Generating data for series number 91812041814
Generating data for series number 91812042314
Generating data for series number 91812052152
Generating data for series number 91812062018
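
For intuition about the pulse shape used above, the following sketch builds a double-exponential pulse with a 20 $\mu \mathrm{s}$ rise time and an 80 $\mu \mathrm{s}$ fall time using only NumPy. It assumes that the ideal template has this functional form. The helper `double_exp_template` is hypothetical and not part of `rqpy`, and the library function may position or normalize the pulse differently (here the pulse starts at the first sample).

```python
import numpy as np

def double_exp_template(t, tau_rise, tau_fall):
    """Unit-height double-exponential pulse starting at t = 0 (illustrative helper)."""
    pulse = np.exp(-t / tau_fall) - np.exp(-t / tau_rise)
    return pulse / pulse.max()

fs = 625e3                   # sample rate in Hz, as above
t = np.arange(32500) / fs    # time axis of a 52 ms trace
tau_r, tau_f = 20e-6, 80e-6  # rise and fall times in seconds

template_demo = double_exp_template(t, tau_r, tau_f)

# Analytic peak time of exp(-t/tau_f) - exp(-t/tau_r), from setting the
# time derivative to zero.
t_peak = tau_r * tau_f / (tau_f - tau_r) * np.log(tau_f / tau_r)

print(f"Peak at sample {template_demo.argmax()}, "
      f"analytic peak at {t_peak * fs:.1f} samples")
```

Comparing the sampled peak position with the analytic value is a quick sanity check that the time axis and the time constants use consistent units.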
