## Accessing DM-EFD data


In this notebook we demonstrate how to extract data from the DM-EFD using [aioinflux](https://aioinflux.readthedocs.io/en/stable/index.html), a Python client for InfluxDB, and proceed with data analysis using Pandas dataframes. 

This is complementary to the [Chronograf](https://test-chronograf-efd.lsst.codes) interface, which we use for time-series visualization.

In addition to `aioinflux`, you'll need to install `pandas`, `numpy` and `matplotlib` to run this notebook.

In [None]:
import matplotlib
%matplotlib notebook
from matplotlib import pyplot as plt
import aioinflux
import getpass
import pandas as pd

We'll access the DM-EFD instance deployed at the AuxTel lab in Tucson. You need to be on site or connected to the NOAO VPN. 

If you are familiar with the AuxTel lab environment, you might be able to authenticate using the generic `saluser`. Ping me on Slack (`@afausti`) if you run into any problems.

In [None]:
username = "saluser"
password = getpass.getpass(f"Password for {username}:")

The following configures the `aioinflux` Python client to connect to the DM-EFD InfluxDB instance. 

In [None]:
client = aioinflux.InfluxDBClient(host='test-influxdb-efd.lsst.codes',
                                  port=443,  # HTTPS port, passed as an integer
                                  ssl=True,
                                  username=username,
                                  password=password,
                                  db='efd')

Setting the client output to `'dataframe'` makes queries return Pandas dataframes, which are very convenient for data analysis. We also specify a time range for the data in `InfluxQL`. The window used here is the last 20 hours, but this may need to be widened depending on how recently data was taken.

In [None]:
client.output = 'dataframe'
time_span = 'time > now() - 20h'
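The time window can also be built programmatically. InfluxQL accepts absolute bounds written as RFC3339 strings in single quotes, in addition to the relative `now() - 20h` form. The helper names `relative_span` and `absolute_span` below are hypothetical and not part of `aioinflux`; this is a minimal sketch that assumes the datetimes passed in are already in UTC.

```python
from datetime import datetime


def relative_span(hours):
    """Return an InfluxQL clause selecting the last `hours` hours."""
    return f"time > now() - {int(hours)}h"


def absolute_span(start, end):
    """Return an InfluxQL clause selecting start <= time < end.

    `start` and `end` are datetime objects assumed to be in UTC.
    """
    fmt = "%Y-%m-%dT%H:%M:%SZ"  # RFC3339 timestamp with a literal Z suffix
    return f"time >= '{start.strftime(fmt)}' AND time < '{end.strftime(fmt)}'"


# Same window as the notebook default
print(relative_span(20))
# A fixed one-day window (dates chosen only for illustration)
print(absolute_span(datetime(2019, 6, 1), datetime(2019, 6, 2)))
```

Either string can be assigned to `time_span` and used in the `WHERE` clause of the queries.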

Query each of the measurements we may want to correlate later in the notebook. This could be done as a single query, but the result would be a dictionary of `DataFrames`, which I find less convenient to use than named variables corresponding to one `DataFrame` each.

In [None]:
df1 = await client.query(f'SELECT "temp1", "temp2", "temp3", "temp4", "temp5", "temp6", "ccdTemp0" FROM "efd"."autogen"."lsst.sal.ATCamera.wreb" WHERE {time_span}')

df2 = await client.query(f'SELECT "expTime", "numImages" FROM "efd"."autogen"."lsst.sal.ATCamera.command_takeImages" WHERE {time_span}')

df3 = await client.query(f'SELECT "analog_I" FROM "efd"."autogen"."lsst.sal.ATCamera.wrebPower" WHERE {time_span}')

shutter_close = await client.query(f'SELECT "priority" FROM "efd"."autogen"."lsst.sal.ATCamera.logevent_endShutterClose" WHERE {time_span}')

shutter_open = await client.query(f'SELECT "priority" FROM "efd"."autogen"."lsst.sal.ATCamera.logevent_endShutterOpen" WHERE {time_span}')

start_readout = await client.query(f'SELECT "priority", "private_sndStamp", "private_rcvStamp" FROM "efd"."autogen"."lsst.sal.ATCamera.logevent_startReadout" WHERE {time_span}')

end_readout = await client.query(f'SELECT "priority", "private_sndStamp", "private_rcvStamp" FROM "efd"."autogen"."lsst.sal.ATCamera.logevent_endReadout" WHERE {time_span}')

start_integration = await client.query(f'SELECT "priority", "private_sndStamp", "private_rcvStamp" FROM "efd"."autogen"."lsst.sal.ATCamera.logevent_startIntegration" WHERE {time_span}')
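The queries above all share the same shape, so the query string can be assembled by a small function. `build_query` is a hypothetical helper, not part of `aioinflux`; the defaults for the database (`efd`) and retention policy (`autogen`) are taken from the queries in this notebook. A sketch:

```python
def build_query(topic, fields, time_span, db="efd", rp="autogen"):
    """Assemble an InfluxQL SELECT for one topic.

    topic     -- measurement name, e.g. "lsst.sal.ATCamera.wrebPower"
    fields    -- list of field names to select
    time_span -- InfluxQL time condition used in the WHERE clause
    """
    # Field and measurement identifiers are double-quoted in InfluxQL
    field_list = ", ".join(f'"{name}"' for name in fields)
    return f'SELECT {field_list} FROM "{db}"."{rp}"."{topic}" WHERE {time_span}'


query = build_query("lsst.sal.ATCamera.wrebPower", ["analog_I"],
                    "time > now() - 20h")
print(query)
```

In the notebook, the result would be passed to the client in the same way as the literal strings, for example `df3 = await client.query(query)`.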

Select a telemetry stream, `analog_I`, and overplot log messages from the `start_integration`, `start_readout`, and `end_readout` log streams.

In [None]:
# Name the returned Axes `ax` so the `plt` module imported earlier is not shadowed
ax = df3.plot(y="analog_I")
# Green: start of integration
for i, r in start_integration.iterrows():
    ax.axvline(pd.Timestamp(r['private_sndStamp'], unit='s'), c='g', alpha=0.3)
# Blue: start of readout
for i, r in start_readout.iterrows():
    ax.axvline(pd.Timestamp(r['private_sndStamp'], unit='s'), c='b', alpha=0.3)
# Red: end of readout
for i, r in end_readout.iterrows():
    ax.axvline(pd.Timestamp(r['private_sndStamp'], unit='s'), c='r', alpha=0.3)
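The per-row `pd.Timestamp(..., unit='s')` conversion used for the vertical lines can also be applied to a whole column at once with `pd.to_datetime`. The sketch below uses a small synthetic dataframe as a stand-in for an event dataframe such as `start_integration`; the values are made up for illustration.

```python
import pandas as pd

# Synthetic stand-in for an event dataframe; the values are made up
events = pd.DataFrame({"private_sndStamp": [1560000000.0, 1560000030.0]})

# Convert the full column of epoch seconds to timestamps in one call
snd_times = pd.to_datetime(events["private_sndStamp"], unit="s")

# The timestamps can then be looped over directly when drawing lines,
# e.g. `for t in snd_times: ax.axvline(t, c='g', alpha=0.3)`
print(snd_times)
```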

Inspect time stamps.  There are three obvious ones: the timestamp from `InfluxDB`, `private_rcvStamp` from SAL, and `private_sndStamp` from SAL.

In [None]:
# Name the returned Axes `ax` so the `plt` module imported earlier is not shadowed
ax = df3.plot()
diff = []
for i, r in start_integration.iterrows():
    rcv = pd.Timestamp(r['private_rcvStamp'], unit='s')
    snd = pd.Timestamp(r['private_sndStamp'], unit='s')
    ax.axvline(i, c='g', alpha=0.3)    # InfluxDB timestamp (the dataframe index)
    ax.axvline(rcv, c='b', alpha=0.3)  # SAL private_rcvStamp
    ax.axvline(snd, c='r', alpha=0.3)  # SAL private_sndStamp
    # Receive time minus send time, in seconds
    diff.append((rcv - snd).total_seconds())
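Because `private_rcvStamp` and `private_sndStamp` are both used here as epoch seconds (they are converted with `unit='s'`), the same difference can be obtained by subtracting the two columns directly, without building `Timestamp` objects row by row. A minimal sketch on synthetic values, standing in for the `start_integration` dataframe:

```python
import pandas as pd

# Synthetic stand-in for `start_integration`; the values are made up
events = pd.DataFrame({
    "private_sndStamp": [1560000000.0, 1560000030.0],
    "private_rcvStamp": [1560000037.25, 1560000067.5],
})

# Column-wise difference, already in seconds
diff_seconds = events["private_rcvStamp"] - events["private_sndStamp"]
print(diff_seconds)
```

The resulting Series can be passed to `.hist()` directly, which avoids creating a separate dataframe from a list.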

Look at the histogram of timestamp differences for `private_rcvStamp` - `private_sndStamp`.

In [None]:
diff_df = pd.DataFrame(diff)
# Bin edges from 32.5 s to 41.5 s in 1 s steps
bins = [el + 32.5 for el in range(10)]
diff_df.hist(bins=bins)
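To read off the bin counts as numbers rather than from the plot, the same bin edges can be passed to `numpy.histogram`. The sketch below uses made-up differences in place of the real `diff` list; only the bin edges match the cell above.

```python
import numpy as np

# Same edges as the histogram cell: 32.5, 33.5, ..., 41.5 seconds
bins = np.arange(32.5, 42.0, 1.0)

# Synthetic differences in seconds, for illustration only
diffs = np.array([36.9, 37.0, 37.1, 37.2, 38.6])

# counts[k] is the number of values falling between edges[k] and edges[k + 1]
counts, edges = np.histogram(diffs, bins=bins)
for lo, hi, n in zip(edges[:-1], edges[1:], counts):
    print(f"{lo:.1f} to {hi:.1f} s: {n}")
```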