## Temperature latency


In this notebook we demonstrate how to extract data from the DM-EFD and analyze it using Pandas dataframes. Queries go through `lsst_efd_client`, which builds on [aioinflux](https://aioinflux.readthedocs.io/en/stable/index.html), a Python client for InfluxDB.

This is complementary to the [Chronograf](https://test-chronograf-efd.lsst.codes) interface, which we use for time-series visualization.

In addition to `aioinflux`, you'll need `lsst_efd_client`, `astropy`, `pandas`, `numpy`, `matplotlib` and `bokeh` to run this notebook; all of them are imported in the next cell.

In [None]:
import matplotlib
%matplotlib inline
from matplotlib import pylab as plt
from astropy.time import Time, TimeDelta
import numpy as np
import pandas as pd

from bokeh.plotting import figure, output_notebook, show
from bokeh.models import LinearAxis, Range1d, Span, Label
output_notebook()

from lsst_efd_client import EfdClient

We'll access the DM-EFD instance deployed at the AuxTel lab in Tucson. You need to be on site or connected to the NOAO VPN. 

To access the EFD, you will need to put a file called `.lsst/notebook_auth.yaml` in your home directory. It should be formatted as follows (substituting the appropriate values, of course). Ping Angelo or Simon on Slack (`@afausti`, `@ksk`) if you have any problems.

```yaml
test:
  username: "user"
  host: "endpoint.edu"
  password: "passwd"
```

In [None]:
client = EfdClient('int_efd')

Specify a time range. These must be `astropy.time` objects. We'll specify the end time and use an offset for the start time. This notebook looks at data for the 30 days before 20 March 2020.

In [None]:
t1 = Time('2020-03-20T00:00:00', scale='tai')
window = TimeDelta(30*24*3600, format='sec', scale='tai')

Query the relevant timestamp. I believe `private_sndStamp` is when the message is sent to DDS, so it is the earliest timestamp we have for weather data. The timestamp for when the measurement was recorded in InfluxDB is in the index of the returned data structure.

In [None]:
tstamps = await client.select_time_series("lsst.sal.Environment.weather", ["private_sndStamp", ], t1-window, t1)

Most operations work on `Timedelta` types, but not the `histogram` function, so we record the differences in seconds here. Also note that `private_sndStamp` is actually in TAI, but pandas doesn't know about TAI. The code below therefore subtracts 37 s, the TAI-UTC offset in effect since 1 January 2017, to put it on the same UTC scale as the InfluxDB index. Since we are only looking at time differences, the choice of scale doesn't otherwise matter (unless a leap second happens between two of our samples).

In [None]:
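# Latency in seconds between the InfluxDB index (UTC) and private_sndStamp (TAI; 37 s ahead of UTC).
deltas = []
for influx_stamp, snd_stamp in zip(tstamps.index.values, tstamps['private_sndStamp']):
    influx_time = pd.Timestamp(influx_stamp, tz="GMT")
    snd_time = pd.Timestamp(snd_stamp - 37, unit='s', tz="GMT")
    deltas.append((influx_time - snd_time).total_seconds())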
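
The same latency calculation can probably be done without a Python loop. The following is a minimal sketch on synthetic data, not on the EFD query result: it assumes the dataframe index is a timezone-aware UTC `DatetimeIndex` and that `private_sndStamp` holds TAI seconds since the Unix epoch. The names `example`, `example_deltas` and `TAI_MINUS_UTC` are made up for illustration.

```python
import numpy as np
import pandas as pd

# TAI - UTC in seconds; 37 s since 2017-01-01 (assumed constant over the window).
TAI_MINUS_UTC = 37.0

# Synthetic stand-in for the query result (an assumption, not real EFD data):
# the index plays the role of the InfluxDB timestamp in UTC, and the column
# holds the send time as TAI seconds since the Unix epoch.
index = pd.to_datetime(
    ["2020-03-01T00:00:00.250", "2020-03-01T00:00:01.400"], utc=True
)
snd_utc_seconds = np.array([1583020800.0, 1583020801.0])  # 2020-03-01T00:00:00/01 UTC
example = pd.DataFrame(
    {"private_sndStamp": snd_utc_seconds + TAI_MINUS_UTC}, index=index
)

# Convert the whole column to UTC timestamps at once, then subtract element-wise.
snd_times = pd.to_datetime(
    example["private_sndStamp"].to_numpy() - TAI_MINUS_UTC, unit="s", utc=True
)
example_deltas = np.asarray((example.index - snd_times).total_seconds())

print(example_deltas)
```

Because the result is already a NumPy array, the separate conversion with `np.array` would not be needed in this variant.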

In [None]:
deltas = np.array(deltas)

In [None]:
median = np.median(deltas)
mean = deltas.mean()

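Compute the histogram of latencies between 0 and 5 s. With `density=True`, the bin heights are normalized so that the histogram integrates to 1.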

In [None]:
hist, edges = np.histogram(deltas, density=True, bins=np.linspace(0, 5, 500))
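
To make the effect of these histogram arguments concrete, here is a small sketch on synthetic latencies (uniform random values, purely an assumption for illustration). It shows that 500 edges from `np.linspace` yield 499 bins, that samples outside the 0 to 5 s range are ignored, and that with `density=True` the bar heights times the bin widths sum to 1.

```python
import numpy as np

# Synthetic latencies in seconds (illustrative only); some fall above 5 s.
rng = np.random.default_rng(0)
sample = rng.uniform(0.0, 6.0, size=10_000)

# Same binning as in the notebook: 500 edges between 0 and 5 s.
bin_edges = np.linspace(0, 5, 500)
density, edges_out = np.histogram(sample, density=True, bins=bin_edges)

n_bins = len(density)                         # one fewer than the number of edges
area = np.sum(density * np.diff(edges_out))   # integral of the density over the bins
outside = np.count_nonzero((sample < 0) | (sample > 5))  # samples not counted

print(n_bins, area, outside)
```

One consequence is that the plotted heights are probability densities (per second), not sample counts, and that any latencies above 5 s do not appear in the plot even though they still contribute to the mean and median.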

In [None]:
p = figure(title='Latency between influx and snd for the Environment_weather subsystem', background_fill_color="#fafafa")
p.yaxis.axis_label = "Probability density"  # hist was computed with density=True
p.xaxis.axis_label = "Latency (s)"
p.quad(top=hist, bottom=0, left=edges[:-1], right=edges[1:],
       fill_color='navy', line_color='white', alpha=0.5)
annotation = Label(x=250, y=250, x_units='screen', y_units='screen',
                 text='mean=%.4fs median=%.4fs'%(mean, median),
                 border_line_color='black', border_line_alpha=1.0,
                 background_fill_color='white', background_fill_alpha=1.0)
p.add_layout(annotation)
show(p)