# Data Capture

<a href="https://colab.research.google.com/github/ledatelescope/bifrost/blob/master/tutorial/06_data_capture.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open in Colab"></a>

Next we will look at how to use Bifrost to work with packetized data, either from the network or from packets recorded to a file.  This is done through the `bifrost.packet_capture` module.

**NOTE:** This section of the tutorial previews a new packet capture interface that is currently under development in the `ibverb-support` branch.

In [6]:
%%capture install_log
# Import bifrost, but attempt to auto-install if needed (and we're running on
# Colab). If something goes wrong, evaluate install_log.show() in a new block
# to retrieve the details.
try:
  import bifrost
except ModuleNotFoundError as exn:
  try:
    import google.colab
  except ModuleNotFoundError:
    raise exn
  !sudo apt-get -qq install exuberant-ctags libopenblas-dev librdmacm-dev software-properties-common build-essential
  !pip install -q contextlib2 pint simplejson scipy git+https://github.com/ctypesgen/ctypesgen.git
  ![ -d ~/bifrost/.git ] || git clone --branch ibverb-support https://github.com/ledatelescope/bifrost ~/bifrost
  !(cd ~/bifrost && ./configure --disable-hwloc && make -j all && sudo make install)
  import bifrost

In [8]:
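import json
import ctypes
import threading
from datetime import datetime, timezone

from bifrost.address import Address
from bifrost.udp_socket import UDPSocket
from bifrost.packet_capture import PacketCaptureCallback, UDPCapture

# seq_callback uses the names below, but this tutorial does not define them.
# The values given here are assumptions; adjust them to match your system.
ADP_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)  # assumed time tag epoch
FS = 196.0e6       # assumed sampling clock rate in Hz
CHAN_BW = 25.0e3   # assumed channel bandwidth in Hz

addr = Address('127.0.0.1', 10000)
sock = UDPSocket()
sock.bind(addr)
sock.timeout = 5.0

class CaptureOp(object):
    def __init__(self, log, oring, sock, nsrc=16, src0=0, max_size=9000, ntime_gulp=250, ntime_slot=25000, core=-1):
        self.log = log
        self.oring = oring
        self.sock = sock
        self.nsrc = nsrc
        self.src0 = src0
        self.max_size = max_size
        self.ntime_gulp = ntime_gulp
        self.ntime_slot = ntime_slot
        self.core = core
        self.shutdown_event = threading.Event()
        # Assumption: use the creation time as the reference time that
        # seq_callback converts into a time tag.
        self.utc_start = datetime.now(timezone.utc)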

    def shutdown(self):
        self.shutdown_event.set()

    def seq_callback(
        self, seq0, chan0, nchan, nsrc, time_tag_ptr, hdr_ptr, hdr_size_ptr):
        timestamp0 = int((self.utc_start - ADP_EPOCH).total_seconds())
        time_tag0 = timestamp0 * int(FS)
        time_tag = time_tag0 + seq0 * (int(FS) // int(CHAN_BW))
        print("++++++++++++++++ seq0     =", seq0)
        print("                 time_tag =", time_tag)
        time_tag_ptr[0] = time_tag
        hdr = {
            "time_tag": time_tag,
            "seq0": seq0,
            "chan0": chan0,
            "nchan": nchan,
            "cfreq": (chan0 + 0.5 * (nchan - 1)) * CHAN_BW,
            "bw": nchan * CHAN_BW,
            "nstand": nsrc * 16,
            "npol": 2,
            "complex": True,
            "nbit": 4,
            "axes": "time,chan,stand,pol",
        }
        print("******** CFREQ:", hdr["cfreq"])
        # ctypes.create_string_buffer() requires bytes, not str, on Python 3
        hdr_str = json.dumps(hdr).encode()
        self.header_buf = ctypes.create_string_buffer(hdr_str)
        hdr_ptr[0] = ctypes.cast(self.header_buf, ctypes.c_void_p)
        hdr_size_ptr[0] = len(hdr_str)
        return 0

    def main(self):
        seq_callback = PacketCaptureCallback()
        seq_callback.set_chips(self.seq_callback)
        with UDPCapture('chips', self.sock, self.nsrc, self.src0, self.max_size,
                        self.ntime_gulp, self.ntime_slot, sequence_callback=seq_callback,
                        core=self.core) as capture:
            while not self.shutdown_event.is_set():
                status = capture.recv()
        del capture

This block implements data capture of the [CHIPS format](https://github.com/jaycedowell/bifrost/blob/disk-readers/src/formats/chips.hpp#L36) from the network.  The snippet starts by creating the socket that will receive the data, using `bifrost.address.Address` and `bifrost.udp_socket.UDPSocket`.  The capture block looks similar to other blocks that we have looked at, but there are a few things to note:
 1. This block accepts many more keywords than the previous block.  These extra keywords are used to control the packet capture and data ordering when it is copied into the ring buffer.  They are:
  * `nsrc` - The number of packet sources to expect data from,
  * `src0` - The source ID number for the first packet source,
  * `max_size` - The maximum packet size to accept.  This is usually set to 9000 to allow for jumbo packets,
  * `ntime_gulp` - This controls the internal buffer size used by the packet capture.  Bifrost keeps two buffers open and releases them to the output ring as data from new gulps is received.
  * `ntime_slot` - The approximate number of packet sets (a packet from all `nsrc` sources) per second.  This is used by Bifrost to determine the boundaries in the gulps.
 2. There is an internal `threading.Event` instance that is used as a signal for telling the `CaptureOp` block to stop.  Without it the capture would run indefinitely.
 3. There is a `seq_callback` method that is called by Bifrost when the packet sequence changes.  This method accepts a format-specific number of arguments and hands back, through `hdr_ptr` and `hdr_size_ptr`, a JSON-packed header that is sent to the ring.
 4. The `main` method implements the packet capture by calling a collection of Bifrost classes:
  * First, a new `PacketCaptureCallback` instance is created and then the callback for the CHIPS format is set to `CaptureOp.seq_callback`.  This readies the method for Bifrost to use when the sequence changes.
  * Next, a new `UDPCapture` instance is created for the packet format with the relevant source/data parameters.  This is used as a context for the packet capture itself.
  * Finally, `UDPCapture.recv` is called repeatedly to receive and process packets.  This method returns an integer after a gulp has been released to the ring.  This integer stores the current state of the capture.
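
The header hand-off in `seq_callback` can be tried without Bifrost.  The sketch below is only an illustration: `HeaderHolder` and the stand-in pointers are hypothetical names, not part of the Bifrost API.  It shows why the buffer is stored on `self` (the caller only receives a raw pointer, so the Python object must stay alive) and why the JSON string is encoded to bytes first.

```python
import ctypes
import json


class HeaderHolder(object):
    """Hypothetical helper that mirrors the header handling in seq_callback."""

    def fill(self, hdr, hdr_ptr, hdr_size_ptr):
        # Serialize to bytes, since ctypes.create_string_buffer() rejects str
        # on Python 3.
        hdr_bytes = json.dumps(hdr).encode()
        # Store the buffer on self so that it is not garbage collected while
        # the caller still holds a raw pointer to it.
        self.header_buf = ctypes.create_string_buffer(hdr_bytes)
        hdr_ptr[0] = ctypes.cast(self.header_buf, ctypes.c_void_p)
        hdr_size_ptr[0] = len(hdr_bytes)
        return 0


# Stand-ins for the output arguments that the caller would normally provide.
hdr_ptr = ctypes.pointer(ctypes.c_void_p())
hdr_size_ptr = ctypes.pointer(ctypes.c_size_t())

hdr = {"seq0": 0, "chan0": 1000, "nchan": 132, "complex": True}
holder = HeaderHolder()
status = holder.fill(hdr, hdr_ptr, hdr_size_ptr)

# Read the header back the way a consumer of the raw pointer would.
raw = ctypes.string_at(hdr_ptr[0], hdr_size_ptr[0])
decoded = json.loads(raw.decode())
print(decoded == hdr)
```

If `header_buf` were a local variable instead, the pointer could outlive the buffer it refers to.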

As mentioned before, Bifrost can also read packets from a file using the `bifrost.packet_capture.DiskReader` class.  This works similarly to `UDPCapture`, but the packet format specifiers require extra information in order to read packets from disk.  For example, a CHIPS capture of 132 channels is specified as "chips" for `UDPCapture` but as "chips_132" for `DiskReader`.

## Writing Data

Related to this capture interface is the `bifrost.packet_writer` module.  This implements the reverse of the capture in that it takes data from a ring and writes it to the network or disk.

Let's look at an example of writing binary data in the [LWA TBN format](https://fornax.phys.unm.edu/lwa/trac/wiki/DP_Formats#TBNOutputInterface), a stream of 8+8-bit complex integers:

In [9]:
import time
import numpy
from bifrost.packet_writer import HeaderInfo, DiskWriter

with open('output.dat', 'wb') as fh:
    bfo = DiskWriter('tbn', fh, core=0)
    desc = HeaderInfo()
    desc.set_tuning(int(round(38e6 / 196e6 * 2**32)))
    desc.set_gain(20)
    
    time_tag = int(time.time()*196e6)
    
    data = numpy.random.randn(16, 512*10)
    data = data + 1j*numpy.random.randn(*data.shape)
    for i in range(16):
        data[i,:] *= 4
    data = bifrost.ndarray(data.astype(numpy.complex64))
    
    qdata = bifrost.ndarray(shape=data.shape, dtype='ci8')
    bifrost.quantize(data, qdata, scale=2)
    print('Input:')
    for i in range(5):
        print('  ', i, '@', 0, ':', data[0,i]*2, '->', qdata[0,i])
        
    qdata = qdata.reshape(16, -1, 512)
    qdata = qdata.transpose(1,0,2).copy()
    
    bfo.send(desc, time_tag, qdata.shape[0]*1960, 0, 1, qdata)
    
import struct
print('Output:')
with open('output.dat', 'rb') as fh:
    packet_header = fh.read(24)
    packet_payload = fh.read(512*2)
    packet_payload = struct.unpack('<1024b', packet_payload)
    i, q = packet_payload[0::2], packet_payload[1::2]
    print(list(zip(i,q))[:5])
    

Input:
   0 @ 0 : (-9.040182113647461+5.184663772583008j) -> (-9, 5)
   1 @ 0 : (-0.7387441396713257-1.1859489679336548j) -> (-1, -1)
   2 @ 0 : (14.541327476501465-2.5483381748199463j) -> (15, -3)
   3 @ 0 : (10.429906845092773+19.244178771972656j) -> (10, 19)
   4 @ 0 : (-1.082539439201355+6.93345832824707j) -> (-1, 7)
Output:
[(-9, 5), (-1, -1), (15, -3), (10, 19), (-1, 7)]


The flow here is:
 1. Opening a file in binary write mode and creating new `DiskWriter` and `HeaderInfo` instances.  `DiskWriter` is what actually writes the formatted data to disk and `HeaderInfo` is a metadata helper used to fill in the packet headers as they are written.
 2. Setting the "tuning" and "gain" parameters for the output headers.  These are values that are common to all of the packets written.
 3. Creating a time tag for the first sample and a collection of complex data that will go into the packets.
 4. Converting the complex data into the 8+8-bit integer format expected for TBN data.  The `DiskWriter` instances are data type-aware.
 5. Reshaping `qdata` so that it has axes of (packet number, output index, samples per packet).
 6. Actually writing the data to disk with `DiskWriter.send`.  This method takes in:
  * a `HeaderInfo` instance used to populate the header,
  * a starting time tag value,
  * a time tag increment to apply when moving to the next packet,
  * an output naming starting index,
  * an output naming increment to apply when moving to the next output name, and
  * the data itself.
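
The header values in steps 2 and 3 are plain arithmetic, which can be checked on its own.  The sketch below assumes that the `196e6` factor in the cell above is a clock rate in Hz and that the tuning word is the frequency expressed as a fraction of that clock scaled to 32 bits; the function names are hypothetical.

```python
FS = 196e6  # assumed clock rate in Hz, taken from the 196e6 factor used above


def freq_to_tuning_word(freq_hz, fs=FS):
    """Express a frequency as a fraction of fs scaled to a 32-bit integer."""
    return int(round(freq_hz / fs * 2**32))


def tuning_word_to_freq(word, fs=FS):
    """Invert freq_to_tuning_word, up to the rounding error."""
    return word * fs / 2**32


def seconds_to_time_tag(whole_seconds, fraction=0.0, fs=FS):
    """Convert seconds to clock ticks, keeping the whole seconds exact."""
    # Multiplying integers avoids the rounding that a single float product
    # suffers once the result no longer fits in a 53-bit mantissa.
    return int(whole_seconds) * int(fs) + int(round(fraction * fs))


word = freq_to_tuning_word(38e6)
print(word, tuning_word_to_freq(word))
print(seconds_to_time_tag(10, 0.5))
```

The tuning word resolves frequency in steps of `FS / 2**32`, so the round trip is only exact to within half a step.  Splitting the time into whole and fractional seconds follows the integer arithmetic that `seq_callback` uses in the capture example.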

After this we re-open the file and read in the data to verify that what is written matches what was put in.  Since the data is 8+8-bit complex this is easy to do with some information about the packet structure and the `struct` module.
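
The same check can be extended to every packet in a file.  This is a minimal sketch that treats the 24-byte header as opaque and assumes the 512-sample payload size used above; `iter_payloads` is a hypothetical helper, and the in-memory stream stands in for `output.dat` so that the example runs without Bifrost.

```python
import io
import struct

HEADER_SIZE = 24           # bytes skipped before each payload, as in the cell above
SAMPLES_PER_PACKET = 512   # complex samples per packet
PAYLOAD_SIZE = SAMPLES_PER_PACKET * 2  # one signed byte each for I and Q
PAYLOAD_FORMAT = '<%ib' % PAYLOAD_SIZE


def iter_payloads(fh):
    """Yield a list of (i, q) tuples for every complete packet in fh."""
    while True:
        header = fh.read(HEADER_SIZE)
        payload = fh.read(PAYLOAD_SIZE)
        if len(header) < HEADER_SIZE or len(payload) < PAYLOAD_SIZE:
            break  # end of file or a truncated final packet
        values = struct.unpack(PAYLOAD_FORMAT, payload)
        yield list(zip(values[0::2], values[1::2]))


# Build three synthetic packets: zeroed headers, payloads filled with (k, -k).
fake = io.BytesIO()
for k in range(3):
    fake.write(bytes(HEADER_SIZE))
    fake.write(struct.pack(PAYLOAD_FORMAT, *([k, -k] * SAMPLES_PER_PACKET)))
fake.seek(0)

packets = list(iter_payloads(fake))
print(len(packets), packets[2][:2])
```

To run this against the real file, pass the handle from `open('output.dat', 'rb')` to `iter_payloads` instead of `fake`.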

To write to the network rather than a file you would:
 1. Swap the open file handle for a `bifrost.udp_socket.UDPSocket` instance, and
 2. Trade `bifrost.packet_writer.DiskWriter` for `bifrost.packet_writer.UDPTransmit`.

## Adding a New Packet Format

Adding a new packet format to Bifrost is a straightforward task:
 1. Add a new `.hpp` file to `bifrost/src/formats`.  This file should contain:
  * the header format for the packet, 
  * a sub-class of the C++ `PacketDecoder` class that implements a packet validator,
  * a sub-class of the C++ `PacketProcessor` class that implements an unpacker to take the payload for a valid packet and place it in the correct position inside a ring buffer, and
  * *optionally*, a sub-class of the C++ `PacketHeaderFiller` class that can be used when creating packets from Bifrost.
 2. Add the new format to `packet_capture.hpp`.  This has three parts:
  * Add a new method to the C++ class `BFpacketcapture_callback_impl` to handle the sequence change callback.
  * Add a new sub-class of the C++ `BFpacketcapture_impl` class that implements the format and defines how Bifrost detects changes in the packet sequence.
  * Update the C++ function `BFpacketcapture_create` to expose the new packet format.
 3. Add the new format callback helper to `packet_capture.cpp`.
 4. Update `bifrost/packet_capture.h` to expose the new callback to the Python API.
 5. *Optionally*, add support for writing packets:
  * Add the new format to `packet_writer.hpp` by adding a new sub-class of the C++ `BFpacketwriter_impl` class.
  * Update the C++ `BFpacketwriter_create` function to expose the new packet format.
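
The division of labor between the validator and the unpacker in step 1 can be prototyped in Python before writing the C++.  The sketch below is purely conceptual: the header layout, the sync word, and the function names are all made up for illustration and do not correspond to any Bifrost format or to the `PacketDecoder`/`PacketProcessor` interfaces.

```python
import struct

# Hypothetical header: 4-byte sync word, 2-byte source ID, 2 bytes of padding,
# and an 8-byte sequence number, all big-endian.
HEADER = struct.Struct('>IHHQ')
SYNC_WORD = 0x5CDEC0DE  # made-up value
PAYLOAD_SIZE = 4        # bytes per packet payload in this toy format


def decode(packet, nsrc, src0):
    """Validator role: return (seq, src, payload), or None if invalid."""
    if len(packet) != HEADER.size + PAYLOAD_SIZE:
        return None
    sync, src, _, seq = HEADER.unpack_from(packet)
    src -= src0
    if sync != SYNC_WORD or not 0 <= src < nsrc:
        return None
    return seq, src, packet[HEADER.size:]


def process(buf, seq0, ntime, nsrc, decoded):
    """Unpacker role: copy the payload into its (time, source) slot in buf."""
    seq, src, payload = decoded
    time_index = seq - seq0
    if not 0 <= time_index < ntime:
        return False  # outside the span covered by this buffer
    offset = (time_index * nsrc + src) * PAYLOAD_SIZE
    buf[offset:offset + PAYLOAD_SIZE] = payload
    return True


def make_packet(seq, src, payload, sync=SYNC_WORD):
    return HEADER.pack(sync, src, 0, seq) + payload


nsrc, ntime, seq0 = 2, 2, 100
buf = bytearray(ntime * nsrc * PAYLOAD_SIZE)
arrivals = [
    make_packet(101, 1, b'DDDD'),
    make_packet(100, 0, b'AAAA'),
    make_packet(100, 1, b'ZZZZ', sync=0),  # bad sync word, rejected
    make_packet(100, 1, b'BBBB'),
    make_packet(101, 0, b'CCCC'),
    make_packet(100, 5, b'XXXX'),          # unknown source, rejected
]
for packet in arrivals:
    decoded = decode(packet, nsrc, src0=0)
    if decoded is not None:
        process(buf, seq0, ntime, nsrc, decoded)
print(bytes(buf))
```

Packets arrive out of order here, yet each payload lands in the slot given by its sequence number and source ID, which is the property the C++ unpacker needs to provide for the ring buffer.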