# Data writers in QGL, Auspex, libaps2

It would be nice to have some documentation on how data is stored, processed, and passed around between the three software packages.  I see four situations where data is passed into or out of files.
  * Data written to `.aps2` files by `QGL`
  * Data read by `libaps2` and sent to the APS2
  * Data written by `Auspex` into `.auspex` data files
  * Data read by `Auspex`/`Qlab.jl` for analysis

These cover ~99% of the situations I can imagine in our normal data flow.  The rest of this document is an exploration and note-to-future-self about how this process works and how it could be changed in the future.

## QGL to .aps2 files

The first situation will require `QGL` to create sequence files for the APS2.  Here, we'll create these files and read them back into the workspace to see how they are packed. 

In [1]:
import QGL.config
from QGL import *

In [2]:
# a minimal example of a qubit control chain
cl = ChannelLibrary(":memory:")
q2 = cl.new_qubit("q2")

# specify the particulars for a rack of APS2s
ip_addresses = [f"192.168.1.{i}" for i in [23, 24, 25, 28]]
aps2 = cl.new_APS2_rack("Maxwell", ip_addresses, tdm_ip="192.168.1.11")
aps2.px("TDM").trigger_interval = 500e-6
cl.set_master(aps2.px("TDM"))

# initialize all four APS2s to the linear regime (range(1, 5) covers units 1 through 4)
for i in range(1, 5):
    aps2.tx(i).ch(1).I_channel_amp_factor = 0.5
    aps2.tx(i).ch(1).Q_channel_amp_factor = 0.5
    aps2.tx(i).ch(1).amp_factor = 1

# create a digitizer - X6 in this case
dig_1  = cl.new_X6("MyX6", address=0)
dig_1.record_length = 1024 + 256

AM2 = cl.new_source("AutodyneM2", "HolzworthHS9000", "HS9004A-492-1", 
                     power=16.0, frequency= 6.74621e9, reference="10MHz")
q2src = cl.new_source("q2source", "HolzworthHS9000", "HS9004A-492-2", 
                     power=16.0, frequency=5.0122e9, reference="10MHz")

cl.set_measure(q2, aps2.tx(2), dig_1.channels[1], gate=False, trig_channel=aps2.tx(2).ch("m2"), generator=AM2)
cl.set_control(q2, aps2.tx(4), generator=q2src)


In [3]:
seqs = [[X(q2),Y(q2),X(q2),MEAS(q2)]]

In [4]:
mf = compile_to_hardware(seqs, 'utils/file_doc')

Compiled 1 sequences.


So now we have `.aps2` files in the `utils` folder, each prefixed with `file_doc`, along with a `.json` metafile.  __mf__ is the file path to the `.json` metafile, which holds metadata about the sequence and records where the `.aps2` files are stored.  Note that the file extension is set by the driver used in `QGL`.  If you were using an APS1 or an APS3, you would see `.aps1` or `.aps3` files.  These files are structurally the same, just named differently for clarity.

In [5]:
mf

'/Users/mware/Github/AWGDir/utils/file_doc-meta.json'

In [6]:
import json

In [7]:
md = json.load(open(mf, "r"))
md

{'axis_descriptor': [{'name': 'segment',
   'partition': 1,
   'points': [1],
   'unit': None}],
 'database_info': {'db_provider': 'sqlite',
  'db_resource_name': ':memory:',
  'library_id': 1,
  'library_name': 'working'},
 'edges': [],
 'instruments': {'Maxwell_U2': '/Users/mware/Github/AWGDir/utils/file_doc-Maxwell_U2.aps2',
  'Maxwell_U4': '/Users/mware/Github/AWGDir/utils/file_doc-Maxwell_U4.aps2'},
 'measurements': ['M-q2'],
 'num_measurements': 1,
 'num_sequences': 1,
 'qubits': ['q2'],
 'receivers': {'RecvChan-MyX6-2': 1}}

In [8]:
APS2_control_file = md["instruments"]["Maxwell_U4"]
APS2_measure_file = md["instruments"]["Maxwell_U2"]

The metafile is just there for house-keeping across multiple `.aps2` files.  It also provides a record of what `QGL` created for each experiment.  The point of this notebook is to dive into the `.aps2` file structure itself.  We can start by looking at the function that creates the file: `write_sequence_file()` in `QGL/QGL/drivers/APS2Pattern.py`.  The relevant code is only ~20 lines long:
```python
with open(fileName, 'wb') as FID:
    FID.write(b'APS2')                     # target hardware
    FID.write(np.float32(4.0).tobytes())   # Version
    FID.write(np.float32(4.0).tobytes())   # minimum firmware version
    FID.write(np.uint16(2).tobytes())      # number of channels
    # FID.write(np.uint16([1, 2]).tobytes()) # channelDataFor
    FID.write(np.uint64(instructions.size).tobytes()) # instructions length
    FID.write(instructions.tobytes()) # instructions in uint64 form

    #Create the groups and datasets
    for chanct in range(2):
        #Write the waveformLib to file
        if wfInfo[chanct][0].size == 0:
            #If there are no waveforms, ensure that there is some element
            #so that the waveform group gets written to file.
            #TODO: Fix this in libaps2
            data = np.array([0], dtype=np.int16)
        else:
            data = wfInfo[chanct][0]
        FID.write(np.uint64(data.size).tobytes()) # waveform data length for channel
        FID.write(data.tobytes())
```
Here we can see the data is just byte-encoded and written to the file in a fixed order: a header, then the instructions, then the waveform data for each channel.  The header starts with some basic information about the hardware and minimum firmware.

In [9]:
print(b'APS2')

b'APS2'


In [10]:
np.float32(4.0).tobytes()

b'\x00\x00\x80@'

In [11]:
np.uint16(2).tobytes()

b'\x02\x00'

In [12]:
with open(APS2_control_file, "rb") as f:
    print(f.read())

b'APS2\x00\x00\x80@\x00\x00\x80@\x02\x00\x0b\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x80\x00\x91\x00\x00\x00\x00\x00/\x00\xa1\x00\x00\x00@\x00a\x00\xa1\x00\x00\x00\x00\x00@\x00!\x00\x00\x00\x05\x00\x00\x00\r\x0b\x00\x00\x00\x00\x01\x00\xa1\x06\x00\x00\x05\x00\x00\x00\r\x00\x00\x00\x05\x00\x00\x00\r#\x00\x00\x00\x00\x01\x00\xa1\x0c\x00\x00\x1d\x00 \x00\r\x00\x00\x00\x00\x00\x00\x00`4\x00\x00\x00\x00\x00\x00\x00t\x01W\x03\xb2\x05\x84\x08\xc2\x0bQ\x0f\x07\x13\xad\x16\x05\x1a\xcf\x1c\xd0\x1e\xdc\x1f\xdc\x1f\xd0\x1e\xcf\x1c\x05\x1a\xad\x16\x07\x13Q\x0f\xc2\x0b\x84\x08\xb2\x05W\x03t\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x004\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
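The first few bytes of that dump can be decoded by hand to confirm the order in which `write_sequence_file()` writes the header.  The sketch below uses only the standard-library `struct` module on the first 22 bytes printed above.  Little-endian byte order is an assumption; it matches the bytes shown (for example `b'\x02\x00'` for the channel count) but is really set by the machine that wrote the file.

```python
import struct

# First 22 bytes of the control file, copied from the dump above.
header = b'APS2\x00\x00\x80@\x00\x00\x80@\x02\x00\x0b\x00\x00\x00\x00\x00\x00\x00'

# '<' means little-endian with no padding (assumed, see note above):
# 4-byte tag, float32 version, float32 minimum firmware,
# uint16 channel count, uint64 instruction count.
HEADER_FORMAT = '<4sffHQ'

target, version, min_firmware, num_channels, num_instructions = struct.unpack(
    HEADER_FORMAT, header)

print(target, version, min_firmware, num_channels, num_instructions)
```

The instruction count decoded here is 11, which is the number of 64-bit instruction words that follow the header in this file.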

Of course, this is just the most basic look at what's in the file.  To pull out more useful information, `QGL` has some other functions for exploring the data inside.

In [13]:
QGL.drivers.APS2Pattern.read_sequence_file(APS2_control_file)

{'ch1': [[(1, 0.04541570015871078),
   (1, 0.10438285923574656),
   (1, 0.17800024417043095),
   (1, 0.2661457697472836),
   (1, 0.36747649859602),
   (1, 0.4786961298986693),
   (1, 0.5946770846050543),
   (1, 0.7087046758637529),
   (1, 0.8132096203149799),
   (1, 0.9003784641679893),
   (1, 0.9630081797094372),
   (1, 0.9957270174581858),
   (1, 0.9957270174581858),
   (1, 0.9630081797094372),
   (1, 0.9003784641679893),
   (1, 0.8132096203149799),
   (1, 0.7087046758637529),
   (1, 0.5946770846050543),
   (1, 0.4786961298986693),
   (1, 0.36747649859602),
   (1, 0.2661457697472836),
   (1, 0.17800024417043095),
   (1, 0.10438285923574656),
   (1, 0.04541570015871078),
   (1, 3.1146187425024613e-16),
   (1, -2.5844387463462135e-19),
   (1, -1.2216110335518833e-15),
   (1, 3.911873736643366e-15),
   (1, 2.8801850737119535e-15),
   (1, 4.678017320548365e-16),
   (1, 1.3403101491459762e-14),
   (1, 1.1111039982203213e-14),
   (1, 7.170440821592936e-15),
   (1, 1.7620036993235762e-15),


In [14]:
QGL.drivers.APS2Pattern.read_sequence_file(APS2_measure_file)

{'ch1': [[(24, 0.0),
   (24, 0.0),
   (24, 0.0),
   (1, 0.017946526675619582),
   (1, 0.0894884629471371),
   (1, 0.34550115980954704),
   (1, 0.7392259797338542),
   (1, 0.9383469661823953),
   (1, 0.9877914784519595),
   (1, 0.9976803809058723),
   (1, 0.9995116591380784),
   (1, 0.9998779147845196),
   (1, 0.9998779147845196),
   (1, 0.9998779147845196),
   (1, 0.9998779147845196),
   (1, 0.9998779147845196),
   (1, 0.9998779147845196),
   (1, 0.9998779147845196),
   (1, 0.9998779147845196),
   (1, 0.9998779147845196),
   (1, 0.9998779147845196),
   (1, 0.9998779147845196),
   (1, 0.9998779147845196),
   (1, 0.9998779147845196),
   (1, 0.9998779147845196),
   (1, 0.9998779147845196),
   (1, 0.9998779147845196),
   (1, 0.9998779147845196),
   (1, 1.0),
   (1, 1.0),
   (1, 1.0),
   (1, 1.0),
   (1, 1.0),
   (1, 1.0),
   (1, 1.0),
   (1, 1.0),
   (1, 1.0),
   (1, 1.0),
   (1, 1.0),
   (1, 1.0),
   (1, 1.0),
   (1, 1.0),
   (1, 1.0),
   (1, 1.0),
   (1, 1.0),
   (1, 1.0),
   (1, 1.0),
 

It might be instructive to see what the pulses should look like so we can confirm the data in the files is what was intended.

In [15]:
plot_pulse_files(mf)

VBox(children=(IntSlider(value=1, description='Segment', max=1, min=1), Figure(animation_duration=50, axes=[Ax…

Note that the data returned by `read_sequence_file` and shown in the plot is just the waveform data, not the sequence instruction data.  `QGL` has a variety of functions you can use to inspect the sequence data:

In [16]:
QGL.drivers.APS2Pattern.raw_instructions(APS2_control_file)

array([10448491872987906048, 11601324317152903168, 11601379293808033792,
        2377970971995799552,   936748722576949248, 11601273739618025483,
         936748722576949254,   936748722576949248, 11601273739618025507,
         936783907351691276,  6917529027641081856], dtype=uint64)

In [17]:
QGL.drivers.APS2Pattern.read_instructions(APS2_control_file)

[SYNC,
 MODULATION; write=1 | RESET_PHASE; nco_select=0xf,
 MODULATION; write=1 | SET_FREQ; nco_select=0x1, increment=0x40000000,
 WAIT,
 WFM; engine=3, write=1 | PLAY; TA_bit=0, count=5, addr=0,
 MODULATION; write=1 | MODULATE; nco_select=0x1, count=11,
 WFM; engine=3, write=1 | PLAY; TA_bit=0, count=5, addr=6,
 WFM; engine=3, write=1 | PLAY; TA_bit=0, count=5, addr=0,
 MODULATION; write=1 | MODULATE; nco_select=0x1, count=35,
 WFM; engine=3, write=1 | PLAY; TA_bit=1, count=29, addr=12,
 GOTO | target_addr=0]

Users will likely never need to inspect `.aps2` files in this detail, but the above illustrates what information is in the files.  See the documentation for more details on what the instructions actually do when decoded by the hardware.  The instructions above represent the simple X, Y, X sequence created earlier.
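The link between the raw 64-bit words and the decoded instruction names can be checked by hand.  Lining up the two outputs above suggests that the top four bits of each word select the instruction type.  The sketch below is built on that observation only: the opcode table is inferred from this one sequence, it is not the full instruction set, and `opcode_name` is a hypothetical helper rather than part of `QGL`.

```python
# Raw instruction words copied from the raw_instructions output above.
raw = [
    10448491872987906048, 11601324317152903168, 11601379293808033792,
    2377970971995799552, 936748722576949248, 11601273739618025483,
    936748722576949254, 936748722576949248, 11601273739618025507,
    936783907351691276, 6917529027641081856,
]

# Opcode names inferred by comparing the raw and decoded outputs;
# only the opcodes that appear in this sequence are listed (assumption).
OPCODE_NAMES = {0x0: "WFM", 0x2: "WAIT", 0x6: "GOTO", 0x9: "SYNC", 0xA: "MODULATION"}


def opcode_name(word):
    """Return the name for the top four bits of a 64-bit instruction word."""
    return OPCODE_NAMES.get(word >> 60, "UNKNOWN")


names = [opcode_name(w) for w in raw]
for word, name in zip(raw, names):
    print(f"0x{word:016x}  {name}")
```

The remaining bits carry the per-instruction fields (counts, addresses, NCO selection and so on), which `read_instructions` decodes for you.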

## Data read by libaps2

The data read by `libaps2` can be specified manually by the user or read from files like the `.aps2` files above with the `load_sequence_file()` function inside `libaps2`:
```cpp
// Read the header information
char junk[100];
uint64_t buff_length;
uint16_t num_chans;

std::fstream file(seqFile, std::ios::binary | std::ios::in);
if( !file ) throw APS2_SEQFILE_FAIL;

file.read(junk, 12); // Don't need this info
file.read(reinterpret_cast<char *> (&num_chans), sizeof(uint16_t));

// Start anew
clear_channel_data();

// Read the instructions
file.read(reinterpret_cast<char *> (&buff_length), sizeof(uint64_t));
vector<uint64_t> instructions;
instructions.resize(buff_length);
file.read(reinterpret_cast<char *> (instructions.data()), buff_length*sizeof(uint64_t));
write_sequence(instructions);

// Read the waveforms
vector<int16_t> waveform;
for (int chanct = 0; chanct < num_chans; chanct++) {
  file.read(reinterpret_cast<char *> (&buff_length), sizeof(uint64_t));
  waveform.resize(buff_length);
  file.read(reinterpret_cast<char *> (waveform.data()), buff_length*sizeof(int16_t));
  set_waveform(chanct, waveform);
}
```
To set the values manually, a user could use the `write_sequence` function to write the sequence instructions and the `set_waveform` function to write the waveform data.
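For comparison with the C++ above, here is a rough Python mirror of the same read order: skip 12 bytes, read the channel count, read the instructions, then read one waveform per channel.  It is a sketch and not part of `libaps2` or `QGL`.  `read_aps2_stream` is a hypothetical name, little-endian byte order is assumed, and the in-memory file is synthetic (the two instruction words are the SYNC and GOTO words from the earlier output, the waveform samples are made up).

```python
import io
import struct


def read_aps2_stream(stream):
    """Read instructions and per-channel waveforms in the order libaps2 does."""
    stream.read(12)  # target tag, version, minimum firmware: skipped as in the C++
    (num_chans,) = struct.unpack('<H', stream.read(2))
    (count,) = struct.unpack('<Q', stream.read(8))
    instructions = list(struct.unpack(f'<{count}Q', stream.read(8 * count)))
    waveforms = []
    for _ in range(num_chans):
        (count,) = struct.unpack('<Q', stream.read(8))
        waveforms.append(list(struct.unpack(f'<{count}h', stream.read(2 * count))))
    return instructions, waveforms


# Build a tiny synthetic file in memory; the waveform values are made up.
fake = io.BytesIO(
    b'APS2' + struct.pack('<ffH', 4.0, 4.0, 2)
    + struct.pack('<Q2Q', 2, 10448491872987906048, 6917529027641081856)
    + struct.pack('<Q3h', 3, 0, 100, -100)   # channel 1: three int16 samples
    + struct.pack('<Q1h', 1, 0)              # channel 2: the single-zero placeholder
)
instructions, waveforms = read_aps2_stream(fake)
print(instructions, waveforms)
```

The second channel shows why `write_sequence_file()` writes a single zero when a channel has no waveforms: the reader always expects a length followed by that many samples for every channel.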

## Data written by `Auspex`

The other end of the data taking process involves saving experimental data to file.  This data could have a very rich structure and will depend on how the data pipeline was constructed in `Auspex` and what was saved to file using the pipeline `Writer`.  The writer uses a data structure called __AuspexDataContainer__ specified in `auspex/data_format`.  It creates a directory with a `.auspex` extension where the data files are stored.  Directories were chosen deliberately so that multiple file writers can work at once.  The files inside the folder have the following structure:
```
folder/data_set_name/
                        data.dat
                        data_meta.json
```
The `.dat` file contains the actual data and the JSON file holds metadata about the data in the `.dat` file.  The data files are `numpy` memmaps, as explained in more detail in the next section.
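As a rough illustration of that layout, the sketch below walks a container folder and pairs each `.dat` file with its `_meta.json` neighbour.  It does not use `Auspex` at all.  `list_datasets` is a hypothetical helper, the container is a throwaway directory built on the spot, and the metadata content is a placeholder because the real keys are defined by `auspex/data_format`.

```python
import json
import tempfile
from pathlib import Path


def list_datasets(container):
    """Map 'group/dataset' to (.dat path, metadata dict or None) for a container folder."""
    found = {}
    for dat in sorted(Path(container).glob('*/*.dat')):
        meta_path = dat.with_name(dat.stem + '_meta.json')
        meta = json.loads(meta_path.read_text()) if meta_path.exists() else None
        found[f'{dat.parent.name}/{dat.stem}'] = (dat, meta)
    return found


# Build a throwaway container that mimics the layout above.
with tempfile.TemporaryDirectory() as tmp:
    group = Path(tmp) / 'example-0000.auspex' / 'q2-raw_int'
    group.mkdir(parents=True)
    (group / 'data.dat').write_bytes(b'\x00' * 16)
    # Placeholder metadata: the real schema is not reproduced here.
    (group / 'data_meta.json').write_text(json.dumps({'note': 'placeholder'}))
    datasets = list_datasets(group.parent)
    dataset_names = list(datasets)
    print(dataset_names)
```

Because each data set lives in its own pair of files, separate writers can append to separate data sets without contending for one shared file.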

## Data read by `Auspex`

The last situation is where data needs to be read from file after the experiment is over.  This could be necessary for plotting, analysis, etc.  `Auspex` has two main functions for doing this: `open_data` and `load_data`.

In [18]:
from auspex.analysis.helpers import open_data, load_data

In [19]:
auspex_data = load_data('../../test/data/test_data-0000.auspex')

In [20]:
auspex_data

{'q2-raw_int': {'data': {'data': array([-0.03480155+0.07538876j, -0.0365286 +0.07124131j,
          -0.0347432 +0.0652775j , -0.03676876+0.06965215j,
          -0.03290481+0.06655533j, -0.03316881+0.05601843j,
          -0.03321098+0.04771077j, -0.04027801+0.04940713j,
          -0.03636705+0.04388169j, -0.03038527+0.04576108j,
          -0.03787801+0.04439955j, -0.03236112+0.04230543j,
          -0.04050779+0.04489688j, -0.03577601+0.03958429j,
          -0.0406527 +0.03691186j, -0.04130238+0.03809405j,
          -0.0371349 +0.03233574j, -0.04126542+0.02633132j,
          -0.04183918+0.02636424j, -0.04314072+0.02570276j,
          -0.03159363+0.03175351j, -0.03809416+0.02639723j,
          -0.04292881+0.02568283j, -0.03711685+0.02149215j,
          -0.03787809+0.03078357j, -0.03833073+0.0198887j ,
          -0.03942336+0.02429585j, -0.03922277+0.01124447j,
          -0.03651746+0.01885406j, -0.04246189+0.01818306j,
          -0.04411308+0.02368582j, -0.03961586+0.01782727j,
          

In [21]:
# You can be more specific with the data and group name if you'd like with the open_data function

#data, desc, _ = open_data(0,'../../test/data/', groupname="q2-raw_int", datasetname="data", date = "")

# you can also pass no arguments and select the data with a window (tested on MacOS)
# data = open_data()

In [22]:
data

array([-0.03480155+0.07538876j, -0.0365286 +0.07124131j,
       -0.0347432 +0.0652775j , -0.03676876+0.06965215j,
       -0.03290481+0.06655533j, -0.03316881+0.05601843j,
       -0.03321098+0.04771077j, -0.04027801+0.04940713j,
       -0.03636705+0.04388169j, -0.03038527+0.04576108j,
       -0.03787801+0.04439955j, -0.03236112+0.04230543j,
       -0.04050779+0.04489688j, -0.03577601+0.03958429j,
       -0.0406527 +0.03691186j, -0.04130238+0.03809405j,
       -0.0371349 +0.03233574j, -0.04126542+0.02633132j,
       -0.04183918+0.02636424j, -0.04314072+0.02570276j,
       -0.03159363+0.03175351j, -0.03809416+0.02639723j,
       -0.04292881+0.02568283j, -0.03711685+0.02149215j,
       -0.03787809+0.03078357j, -0.03833073+0.0198887j ,
       -0.03942336+0.02429585j, -0.03922277+0.01124447j,
       -0.03651746+0.01885406j, -0.04246189+0.01818306j,
       -0.04411308+0.02368582j, -0.03961586+0.01782727j,
       -0.03935266+0.01531323j, -0.04433473+0.01292302j,
       -0.04045838+0.0166872j ,

In [23]:
# In some cases, you have a specific 'data' folder and all data is organized by date. In that case, you can use open_data
# in a way similar to:
#
# dat, desc, _ = open_data(25, '/Users/mware/IpythonNotebooks/explorations/Tomography/', groupname="q2-main", datasetname="data", date="200428")
#
# where the function looks for a folder named with the date and pulls out the dataset numbered '25'

The `.dat` files themselves are binary packed memmaps.  They can be accessed manually with `numpy.memmap`.

In [24]:
test_file = '../../test/data/test_data-0000.auspex/q2-raw_int/data.dat'

In [25]:
test_data = np.memmap(test_file, dtype=complex, mode='r')

In [26]:
test_data

memmap([-0.03480155+0.07538876j, -0.0365286 +0.07124131j,
        -0.0347432 +0.0652775j , -0.03676876+0.06965215j,
        -0.03290481+0.06655533j, -0.03316881+0.05601843j,
        -0.03321098+0.04771077j, -0.04027801+0.04940713j,
        -0.03636705+0.04388169j, -0.03038527+0.04576108j,
        -0.03787801+0.04439955j, -0.03236112+0.04230543j,
        -0.04050779+0.04489688j, -0.03577601+0.03958429j,
        -0.0406527 +0.03691186j, -0.04130238+0.03809405j,
        -0.0371349 +0.03233574j, -0.04126542+0.02633132j,
        -0.04183918+0.02636424j, -0.04314072+0.02570276j,
        -0.03159363+0.03175351j, -0.03809416+0.02639723j,
        -0.04292881+0.02568283j, -0.03711685+0.02149215j,
        -0.03787809+0.03078357j, -0.03833073+0.0198887j ,
        -0.03942336+0.02429585j, -0.03922277+0.01124447j,
        -0.03651746+0.01885406j, -0.04246189+0.01818306j,
        -0.04411308+0.02368582j, -0.03961586+0.01782727j,
        -0.03935266+0.01531323j, -0.04433473+0.01292302j,
        -0.040

In [27]:
np.size(test_data)

105
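To make "binary packed" concrete: with `dtype=complex` (that is, `complex128`) and no header, each value is stored as two 64-bit floats, real part first, so a file holding 105 values is 105 × 16 = 1680 bytes.  The sketch below reads such a file with only the standard library.  `read_complex_dat` is a hypothetical helper, little-endian byte order is assumed, and the file is a temporary one written on the spot with two values copied from the output above.

```python
import os
import struct
import tempfile


def read_complex_dat(path):
    """Read a flat file of little-endian complex128 values (pairs of float64)."""
    with open(path, 'rb') as f:
        raw = f.read()
    return [complex(re, im) for re, im in struct.iter_unpack('<dd', raw)]


# Round-trip two values from the output above through a temporary file.
values = [complex(-0.03480155, 0.07538876), complex(-0.0365286, 0.07124131)]
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'data.dat')
    with open(path, 'wb') as f:
        for v in values:
            f.write(struct.pack('<dd', v.real, v.imag))
    file_size = os.path.getsize(path)
    loaded = read_complex_dat(path)

print(loaded == values, file_size)
```

Note that the raw file carries no shape or dtype information, which is why the accompanying `_meta.json` file is needed to interpret anything beyond a flat array.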

Anyone using `Auspex` should never have to work at this level.  This breakdown of the file structures is presented only for reference.