# Data Logging
Labbench provides several data logging capabilities oriented toward experiments that involve complex sweeps or test conditions. The general idea is to automatically log small details (device parameters, test conditions, git commit hashes, etc.) so that automation code can focus on the test procedure. The resulting logging system makes many implicit decisions but attempts to describe the resulting structure clearly. Its features include:
* Automatic logging of simple scalar parameters of {py:class}`labbench.Device` objects that are defined with {py:mod}`labbench.paramattr`
* Manual logging through simple dictionary mapping
* Consistent and automatic mapping from non-scalar types ({py:class}`pandas.DataFrame`, {py:func}`numpy.array`, long strings, files generated outside the data tree, etc.)
* Support for several output data types: {py:class}`labbench.CSVLogger`, {py:class}`labbench.HDFLogger`, and {py:class}`labbench.SQLiteLogger`

## Example: Logging `Device` objects
To get started, consider a simple loop:

In [1]:
import numpy as np
import labbench as lb
from labbench.testing.pyvisa_sim import SpectrumAnalyzer, PowerSensor

lb.visa_default_resource_manager('@sim')
lb.show_messages('info')

sensor = PowerSensor()
analyzer = SpectrumAnalyzer()

db = lb.CSVLogger(path=f"./{np.random.bytes(8).hex()}")
db.observe_paramattr([sensor, analyzer])

with sensor, analyzer, db:
    for freq in (5.8e9, 5.85e9, 5.9e9):
        analyzer.center_frequency = freq
        sensor.frequency = freq

        sensor.trigger()
        analyzer.trigger()

        data = {
            'analyzer_trace': analyzer.fetch(),
            'sensor_reading': sensor.fetch()[0],
        }

        db.new_row(data)

 INFO  2024-01-22 14:36:44,260.260 • aggregator: PowerSensor() named 'sensor' by introspection
 INFO  2024-01-22 14:36:44,260.260 • aggregator: SpectrumAnalyzer() named 'analyzer' by introspection
 INFO  2024-01-22 14:36:46,172.172 • aggregator: CSVLogger('d908ee6979c1792a') named 'db' by introspection


### Output data structure
Experimental results are populated as follows in a directory at the given path:

![image](csvlogger_folder_structure.png)

The root table in `outputs.csv` gives the high-level test conditions and results:

In [2]:
import pandas as pd

root = pd.read_csv(f'{db.path}/outputs.csv')
root

Unnamed: 0,analyzer_trace,sensor_reading,analyzer_center_frequency,sensor_frequency,sensor_trigger_count,db_host_time,db_host_log
0,0/analyzer_trace.csv,-52.617,5800000000.0,5800000000.0,200,2024-01-22 14:36:46.285057,0/db_host_log.json
1,1/analyzer_trace.csv,-52.617,5850000000.0,5850000000.0,200,2024-01-22 14:36:46.396226,1/db_host_log.json
2,2/analyzer_trace.csv,-52.617,5900000000.0,5900000000.0,200,2024-01-22 14:36:46.512491,2/db_host_log.json


The root table holds scalar test conditions and results, along with paths to files containing non-scalar data (arrays, tables, etc.) and long text strings. Examples here include the measurement trace from the spectrum analyzer (column `'analyzer_trace'`) and the host log JSON file (column `'db_host_log'`). For example, the first trace can be loaded with:

In [3]:
pd.read_csv(f"{db.path}/{root['analyzer_trace'][0]}")

Unnamed: 0,frequency,power_spectral_density
0,5.795000e+09,-52.617
1,5.795050e+09,-52.373
2,5.795101e+09,-52.724
3,5.795151e+09,-51.893
4,5.795201e+09,-52.270
...,...,...
195,5.804799e+09,-51.752
196,5.804849e+09,-53.065
197,5.804899e+09,-52.585
198,5.804950e+09,-51.861
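
The split between scalar values in the root table and non-scalar values in per-row files can be illustrated with a minimal sketch. This is not labbench's implementation: the helper `split_row`, its arguments, and the `index` column it writes are hypothetical names chosen for illustration, and only lists and tuples are treated as non-scalar here.

```python
import csv
import tempfile
from pathlib import Path


def split_row(row: dict, root_dir: Path, index: int) -> dict:
    """Return a root-table row in which non-scalar values are replaced by relative file paths.

    Hypothetical helper for illustration only; labbench handles many more types.
    """
    root_row = {}
    for name, value in row.items():
        if isinstance(value, (list, tuple)):
            # non-scalar: write the values to '<index>/<name>.csv' ...
            rel_path = Path(str(index)) / f"{name}.csv"
            target = root_dir / rel_path
            target.parent.mkdir(parents=True, exist_ok=True)
            with open(target, "w", newline="") as stream:
                writer = csv.writer(stream)
                writer.writerow(["index", name])
                writer.writerows(enumerate(value))
            # ... and keep only the relative path in the root table
            root_row[name] = rel_path.as_posix()
        else:
            # scalar: store the value directly in the root table
            root_row[name] = value
    return root_row


with tempfile.TemporaryDirectory() as tmp:
    root_row = split_row(
        {"analyzer_trace": [-52.617, -52.373, -52.724], "sensor_reading": -52.617},
        Path(tmp),
        index=0,
    )
    trace_exists = (Path(tmp) / root_row["analyzer_trace"]).exists()
```

The resulting `root_row` keeps `sensor_reading` as a number and replaces the trace with a path relative to the root directory, mirroring the layout of `outputs.csv` shown above.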


In [4]:
import json

with open(f"{db.path}/metadata.json", 'r') as stream:
    metadata = json.load(stream)

# metadata['device_objects']
metadata['field_name_sources']

{'sensor_isopen': {'object': 'sensor.isopen',
  'paramattr': '<labbench.paramattr.property.bool() as isopen>',
  'type': 'bool',
  'help': '\n\n`True` if the backend is ready for use',
  'label': ''},
 'sensor_resource': {'object': 'sensor.resource',
  'paramattr': '<labbench.paramattr.value.str(None) as resource>',
  'type': 'str',
  'help': 'device address or URI',
  'label': ''},
 'analyzer_isopen': {'object': 'analyzer.isopen',
  'paramattr': '<labbench.paramattr.property.bool() as isopen>',
  'type': 'bool',
  'help': '\n\n`True` if the backend is ready for use',
  'label': ''},
 'analyzer_resource': {'object': 'analyzer.resource',
  'paramattr': '<labbench.paramattr.value.str(None) as resource>',
  'type': 'str',
  'help': 'device address or URI',
  'label': ''},
 'db_host_log': {'object': 'db.host.log',
  'paramattr': '<labbench.paramattr.property.list() as log>',
  'type': 'list',
  'help': '\n\nGet the current host log contents.',
  'label': ''},
 'db_munge_force_relational': 
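
Because each entry in `field_name_sources` records the object that a column came from, the mapping can be used to trace columns back to their devices. The sketch below works on an abridged copy of the entries shown above (only the `object`, `type`, and `help` keys are kept); the helper name `fields_by_owner` is hypothetical.

```python
# abridged copy of the metadata['field_name_sources'] entries shown above
field_name_sources = {
    "sensor_isopen": {"object": "sensor.isopen", "type": "bool",
                      "help": "\n\n`True` if the backend is ready for use"},
    "sensor_resource": {"object": "sensor.resource", "type": "str",
                        "help": "device address or URI"},
    "analyzer_isopen": {"object": "analyzer.isopen", "type": "bool",
                        "help": "\n\n`True` if the backend is ready for use"},
    "analyzer_resource": {"object": "analyzer.resource", "type": "str",
                          "help": "device address or URI"},
    "db_host_log": {"object": "db.host.log", "type": "list",
                    "help": "\n\nGet the current host log contents."},
}


def fields_by_owner(sources: dict) -> dict:
    """Group column names by the first component of their 'object' path."""
    grouped = {}
    for column, info in sources.items():
        owner = info["object"].split(".", 1)[0]
        grouped.setdefault(owner, []).append(column)
    return grouped


grouped = fields_by_owner(field_name_sources)
```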

For a more systematic approach to analyzing the data, we may want to expand the root table based on the relational data files in one of these columns. A shortcut for this is provided by {py:func}`labbench.read_relational`:

In [5]:
lb.read_relational(
    f'{db.path}/outputs.csv',

    # the column containing paths to relational data tables.
    # the returned table places a .
    'analyzer_trace',

    # copy the fixed values of these columns into each relational data table
    ['sensor_frequency', 'sensor_reading']
)

Unnamed: 0,analyzer_trace,analyzer_trace_frequency,analyzer_trace_id,analyzer_trace_power_spectral_density,root_index,sensor_frequency,sensor_reading
0,0/analyzer_trace.csv,5.795000e+09,0,-52.617,0,5.800000e+09,-52.617
1,0/analyzer_trace.csv,5.795050e+09,1,-52.373,0,5.800000e+09,-52.617
2,0/analyzer_trace.csv,5.795101e+09,2,-52.724,0,5.800000e+09,-52.617
3,0/analyzer_trace.csv,5.795151e+09,3,-51.893,0,5.800000e+09,-52.617
4,0/analyzer_trace.csv,5.795201e+09,4,-52.270,0,5.800000e+09,-52.617
...,...,...,...,...,...,...,...
595,2/analyzer_trace.csv,5.904799e+09,195,-51.752,2,5.900000e+09,-52.617
596,2/analyzer_trace.csv,5.904849e+09,196,-53.065,2,5.900000e+09,-52.617
597,2/analyzer_trace.csv,5.904899e+09,197,-52.585,2,5.900000e+09,-52.617
598,2/analyzer_trace.csv,5.904950e+09,198,-51.861,2,5.900000e+09,-52.617


For each row in the root table, the expanded table contains a copy of the relational data table stored at that row's file path ending in `'analyzer_trace.csv'`.
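
The expansion described above can be sketched with the standard library alone. This is a simplified illustration, not the implementation of `labbench.read_relational`: the function `expand_relational` is hypothetical, the data values are placeholders, all values stay as strings, and the column naming only loosely mirrors the output shown above.

```python
import csv
import tempfile
from pathlib import Path


def expand_relational(root_csv: Path, path_column: str, fixed_columns: list) -> list:
    """Expand each root row into one output row per row of its relational table."""
    with open(root_csv, newline="") as stream:
        root_rows = list(csv.DictReader(stream))

    expanded = []
    for root_index, root_row in enumerate(root_rows):
        # relational paths are stored relative to the directory of the root table
        table_path = root_csv.parent / root_row[path_column]
        with open(table_path, newline="") as stream:
            for child in csv.DictReader(stream):
                out = {path_column: root_row[path_column], "root_index": root_index}
                # prefix relational columns with the path column name to avoid collisions
                out.update({f"{path_column}_{key}": value for key, value in child.items()})
                # copy the fixed root values onto every expanded row
                out.update({key: root_row[key] for key in fixed_columns})
                expanded.append(out)
    return expanded


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # build a tiny placeholder data tree: two root rows, two trace points each
    for i in range(2):
        (root / str(i)).mkdir()
        (root / str(i) / "analyzer_trace.csv").write_text(
            "frequency,power_spectral_density\n1,-52.6\n2,-52.4\n"
        )
    (root / "outputs.csv").write_text(
        "analyzer_trace,sensor_frequency\n"
        "0/analyzer_trace.csv,5.8e9\n"
        "1/analyzer_trace.csv,5.85e9\n"
    )
    expanded = expand_relational(root / "outputs.csv", "analyzer_trace", ["sensor_frequency"])
```

Each of the two root rows contributes two expanded rows, so `expanded` holds four dictionaries, each carrying the `sensor_frequency` of its parent row.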

In [6]:
import labbench as lb
from labbench.testing.pyvisa_sim import SpectrumAnalyzer, PowerSensor, SignalGenerator
import numpy as np
from shutil import rmtree

FREQ_COUNT = 3
DUT_NAME = "DUT 63"
DATA_PATH = './data'


# the labbench.testing devices support simulated pyvisa operations
lb.visa_default_resource_manager('@sim')

class Testbed(lb.Rack):
    sensor: PowerSensor = PowerSensor()
    analyzer: SpectrumAnalyzer = SpectrumAnalyzer()
    db: lb.CSVLogger = lb.CSVLogger(path=DATA_PATH)

    def open(self):
        # watch the devices' parameter attributes so they are logged with each row
        self.db.observe_paramattr(self.analyzer)
        self.db.observe_paramattr(self.sensor, always=['sweep_aperture'])

    def single(self, frequency: float):
        self.analyzer.center_frequency = frequency
        self.sensor.frequency = frequency

        self.sensor.trigger()
        self.analyzer.trigger()

        return dict(
            analyzer_trace=self.analyzer.fetch(),
            sensor_reading=self.sensor.fetch()[0]
        )

# remove prior data before we start
rmtree(Testbed.db.path, ignore_errors=True)

with Testbed() as rack:
    for freq in np.linspace(5.8e9, 5.9e9, FREQ_COUNT):
        rack.single(freq)

        # this could also go in single()
        rack.db.new_row(
            comments='try 1.21 GW for time-travel',
            dut = DUT_NAME,
        )