# Getting SQuaSH metrics into InfluxDB

InfluxDB is a time series database, Chronograf is the UI for exploratory analysis and dashboarding, and Kapacitor is the component responsible for alerting (it also handles time series processing in general). This investigation evaluates the InfluxDB + Chronograf + Kapacitor stack for LSST DM, both for implementing parts of SQuaSH and for monitoring the DM Engineering Facilities Database (DM-EFD).


## Getting started 

Check the [InfluxDB start guide](https://docs.influxdata.com/influxdb/v1.6/introduction/getting-started/) to learn the basics of interacting with the database and the [SQL to InfluxDB terminology crosswalk](https://docs.influxdata.com/influxdb/v1.6/concepts/crosswalk/) to understand how InfluxDB is different from a relational database.

## Points, measurements, tags and fields

**Points** are discrete samples of a metric. Points are written to InfluxDB using the "Line Protocol":

```
<measurement>[,<tag-key>=<tag-value>...] <field-key>=<field-value>[,<field2-key>=<field2-value>...] [unix-nano-timestamp]
```

**Measurements** act as a container for tags, fields, and the timestamp. Measurement names are strings that describe the data stored in the associated fields. A measurement is conceptually similar to an SQL table.

**Tags** are used as metadata, while **fields** correspond to your data. An important difference is that tags are indexed while fields are not, so consider turning a field into a tag if you filter on it often.
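To make the line protocol format concrete, here is a minimal sketch of a formatter for a single point. The helper name `to_line_protocol` is hypothetical, and the sketch assumes that names and tag values contain no spaces, commas, or equals signs, so it performs no escaping.

```python
def to_line_protocol(measurement, tags, fields, timestamp_ns=None):
    """Format one point as an InfluxDB line protocol string.

    Simplified sketch: no escaping of special characters is performed.
    """
    # Tags are optional; sorting them by key gives a stable output.
    tag_part = "".join(
        ",{}={}".format(key, tags[key]) for key in sorted(tags)
    )

    def format_field(value):
        # bool must be tested before int because bool is a subclass of int.
        if isinstance(value, bool):
            return "true" if value else "false"
        if isinstance(value, int):
            return "{}i".format(value)  # integers carry an "i" suffix
        if isinstance(value, float):
            return repr(value)
        return '"{}"'.format(value)  # strings are double-quoted

    field_part = ",".join(
        "{}={}".format(key, format_field(fields[key])) for key in sorted(fields)
    )
    line = "{}{} {}".format(measurement, tag_part, field_part)
    # The timestamp is optional; the server assigns one when it is omitted.
    if timestamp_ns is not None:
        line += " {}".format(timestamp_ns)
    return line


line = to_line_protocol(
    "cpu", {"host": "server01", "region": "us_west"}, {"value": 0.64}
)
print(line)  # cpu,host=server01,region=us_west value=0.64
```

Tags end up in the first, comma-separated group next to the measurement name, while fields form the second group after the space.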

InfluxDB is a schemaless database. You can add new measurements, tags, and fields at any time.

See also the InfluxDB [key concepts](https://docs.influxdata.com/influxdb/v1.6/concepts/key_concepts).


## Writing data to InfluxDB

There are several ways to write data to InfluxDB. Here we show three of them: the CLI client, the HTTP API, and the Python client.

1. Insert a single time series point using the [CLI](https://docs.influxdata.com/influxdb/v1.6/introduction/getting-started/#writing-and-exploring-data)

```
> USE mydb
> INSERT cpu,host=server01,region=us_west value=0.64
```

2. Insert a single time series point using the [HTTP API](https://docs.influxdata.com/influxdb/v1.6/guides/writing_data/)

```
$ curl -XPOST "http://localhost:8086/write?db=mydb" -d 'cpu,host=server01,region=us_west value=0.64'
```

3. Insert a single time series point using the [influxdb Python client](https://github.com/influxdata/influxdb-python)

```python
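from influxdb import InfluxDBClient

# A point is a dict with a measurement name, tags (indexed metadata),
# and fields (the actual values)
data = [
    {
        "measurement": "cpu",
        "tags": {
            "host": "server01",
            "region": "us_west"
        },
        "fields": {
            "value": 0.64
        }
    }
]

# Arguments: host, port, username, password, database
client = InfluxDBClient('localhost', 8086, 'root', 'root', 'mydb')
client.write_points(data)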
```
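The HTTP API call from item 2 can also be made from Python without the client library. The following is a minimal sketch using only the standard library; the helper names `build_write_request` and `write_line` are hypothetical, and it assumes the local instance and the `mydb` database used above.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_write_request(base_url, database, line):
    """Build a POST request for the InfluxDB 1.x /write endpoint."""
    url = "{}/write?{}".format(
        base_url.rstrip("/"), urlencode({"db": database})
    )
    # The request body is the point in line protocol format.
    return Request(url, data=line.encode("utf-8"), method="POST")


def write_line(base_url, database, line, timeout=5):
    """Send the point; InfluxDB answers 204 No Content on success."""
    request = build_write_request(base_url, database, line)
    with urlopen(request, timeout=timeout) as response:
        return response.status == 204


request = build_write_request(
    "http://localhost:8086",
    "mydb",
    "cpu,host=server01,region=us_west value=0.64",
)
print(request.get_method(), request.full_url)

# Uncomment once the local InfluxDB instance is running:
# write_line("http://localhost:8086", "mydb",
#            "cpu,host=server01,region=us_west value=0.64")
```

Separating request construction from sending makes it possible to inspect the URL and body without a running server.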

## Running InfluxDB + Chronograf + Kapacitor

You can use the `docker-compose` configuration in this repository to run a local instance of InfluxDB + Chronograf + Kapacitor:

In [None]:
%%bash
docker-compose up -d
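After the containers start, it can be useful to confirm that InfluxDB is accepting connections before writing data. A minimal sketch, assuming InfluxDB is exposed on `localhost:8086` as in the examples above; it relies on the `/ping` endpoint of the InfluxDB 1.x HTTP API, and the helper names are hypothetical.

```python
from urllib.request import urlopen


def ping_url(base_url):
    """Return the URL of the InfluxDB /ping endpoint."""
    return base_url.rstrip("/") + "/ping"


def influxdb_is_up(base_url="http://localhost:8086", timeout=2):
    """Return True if the /ping endpoint answers with 204 No Content."""
    try:
        with urlopen(ping_url(base_url), timeout=timeout) as response:
            return response.status == 204
    except OSError:
        # URLError, HTTPError, and socket timeouts are all OSError
        # subclasses: treat any of them as "not ready yet".
        return False


print(ping_url("http://localhost:8086/"))
```

Calling `influxdb_is_up()` in a short retry loop is one way to wait for the service right after `docker-compose up -d`.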

On the terminal, you can open the InfluxDB CLI with

```bash
$ docker-compose run influxdb-cli
```

or connect to Chronograf at http://localhost:8888. The most interesting Chronograf features are the "Data Explorer", "Dashboards", "Alerting", and the "Log Viewer".

SQuaSH metrics follow the concepts developed in [lsst.verify](https://sqr-019.lsst.io/). Here we present two approaches to inserting SQuaSH metrics into InfluxDB.

Approach #1: each metric in SQuaSH is an InfluxDB measurement 

In [None]:
from influxdb import InfluxDBClient

client = InfluxDBClient('localhost', 8086, 'root', 'root', 'squash')
client.create_database('squash')

In [None]:
import requests

# No trailing slash: the endpoint paths below start with "/"
SQUASH_API_URL = "https://squash-restful-api-demo.lsst.codes"
jobs = requests.get(SQUASH_API_URL + "/jobs").json()

In [None]:
for job_id in jobs['ids']:

    r = requests.get(SQUASH_API_URL + "/job/{}".format(job_id)).json()

    # Skip datasets we don't want
    if r['ci_dataset'] in ('unknown', 'decam'):
        continue

    print('Sending points for job {}...'.format(job_id))

    points = []

    for meas in r['measurements']:
        point = dict()
        # One point per metric: the metric name is the measurement
        point['measurement'] = meas['metric']
        point['tags'] = {'filter_name': r['meta']['filter_name'],
                         'dataset': r['ci_dataset']}
        point['time'] = r['date_created']
        point['fields'] = {'value': meas['value']}
        points.append(point)

    # Write all points of this job in a single request
    client.write_points(points)
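The shape of the points produced by approach #1 is easier to inspect without the network calls. Below is a minimal sketch that applies the same mapping to a single job document; the function name `points_per_metric` is hypothetical, and the sample job, including its metric names and values, is made up and only mirrors the keys the loop above reads.

```python
def points_per_metric(job):
    """Approach #1: one point per metric, metric name as the measurement."""
    tags = {
        "filter_name": job["meta"]["filter_name"],
        "dataset": job["ci_dataset"],
    }
    return [
        {
            "measurement": meas["metric"],
            "tags": tags,
            "time": job["date_created"],
            "fields": {"value": meas["value"]},
        }
        for meas in job["measurements"]
    ]


# Hypothetical job document with placeholder metric names and values.
sample_job = {
    "ci_dataset": "cfht",
    "date_created": "2018-08-01T00:00:00Z",
    "meta": {"filter_name": "r"},
    "measurements": [
        {"metric": "pkg_a.metric_1", "value": 5.2},
        {"metric": "pkg_a.metric_2", "value": 13.1},
    ],
}

points = points_per_metric(sample_job)
print([p["measurement"] for p in points])
```

With this layout, every metric becomes its own measurement with a single generic `value` field.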

Approach #2: each verification package in SQuaSH is an InfluxDB measurement

In [None]:
from influxdb import InfluxDBClient

client = InfluxDBClient('localhost', 8086, 'root', 'root', 'squash_2')
client.create_database('squash_2')

In [None]:
import requests

# No trailing slash: the endpoint paths below start with "/"
SQUASH_API_URL = "https://squash-restful-api-demo.lsst.codes"
jobs = requests.get(SQUASH_API_URL + "/jobs").json()

In [None]:
for job_id in jobs['ids']:

    r = requests.get(SQUASH_API_URL + "/job/{}".format(job_id)).json()

    # Skip datasets we don't want
    if r['ci_dataset'] in ('unknown', 'decam'):
        continue

    print('Sending points for job {}...'.format(job_id))

    points = []

    for meas in r['measurements']:
        point = dict()
        # The measurement is the verification package, i.e. the prefix
        # of the fully qualified metric name (<package>.<metric>)
        point['measurement'] = meas['metric'].split('.')[0]
        point['tags'] = {'filter_name': r['meta']['filter_name'],
                         'dataset': r['ci_dataset']}
        point['time'] = r['date_created']
        # The full metric name becomes the field key
        point['fields'] = {meas['metric']: meas['value']}
        points.append(point)

    # Write all points of this job in a single request
    client.write_points(points)
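The loop above sends one point per metric. Points that share the same measurement, tag set, and timestamp are merged by InfluxDB into a single point holding the union of their fields, so the same result can be built explicitly. A minimal sketch of that variant; the function name `points_per_package` is hypothetical, and the sample job with its metric names and values is made up.

```python
from collections import defaultdict


def points_per_package(job):
    """Approach #2 variant: one point per package, one field per metric."""
    # Collect the metrics of each package into one field set.
    fields_by_package = defaultdict(dict)
    for meas in job["measurements"]:
        package = meas["metric"].split(".")[0]
        fields_by_package[package][meas["metric"]] = meas["value"]

    tags = {
        "filter_name": job["meta"]["filter_name"],
        "dataset": job["ci_dataset"],
    }
    return [
        {
            "measurement": package,
            "tags": tags,
            "time": job["date_created"],
            "fields": fields,
        }
        for package, fields in fields_by_package.items()
    ]


# Hypothetical job document with placeholder metric names and values.
sample_job = {
    "ci_dataset": "cfht",
    "date_created": "2018-08-01T00:00:00Z",
    "meta": {"filter_name": "r"},
    "measurements": [
        {"metric": "pkg_a.metric_1", "value": 5.2},
        {"metric": "pkg_a.metric_2", "value": 13.1},
        {"metric": "pkg_b.metric_1", "value": 0.7},
    ],
}

points = points_per_package(sample_job)
print([p["measurement"] for p in points])
```

Grouping on the client side reduces the number of points sent per job while leaving the stored data unchanged.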

It turns out that the second approach fits the InfluxDB concepts better: each measurement then acts as a container for related fields (the metrics of one verification package), much like an SQL table. You can verify that by exploring the two InfluxDB databases just created, `squash` and `squash_2`, using [Chronograf](http://localhost:8888/sources/0/chronograf/data-explorer).

## Alerting

Alerting is done with [Kapacitor](https://docs.influxdata.com/kapacitor), which integrates with Chronograf.


TODO: demonstrate how to create alerting rules programmatically using the metric definitions and specifications from the SQuaSH API (similar to what we did when investigating Honeycomb).
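As a starting point for this TODO, here is a rough sketch of how an alerting rule could be generated from a threshold. It is not the planned implementation: the threshold value, the task id, and the placeholder metric name are hypothetical, the mapping from SQuaSH specifications to thresholds is left open, and the payload assumes the task definition format of the Kapacitor HTTP API (tasks are created by posting such a body to `/kapacitor/v1/tasks`).

```python
import json


def alert_tickscript(measurement, field, threshold):
    """Render a stream TICKscript that goes critical above a threshold."""
    return (
        "stream\n"
        "    |from()\n"
        "        .measurement('{measurement}')\n"
        "    |alert()\n"
        "        .crit(lambda: \"{field}\" > {threshold})\n"
    ).format(measurement=measurement, field=field, threshold=threshold)


def kapacitor_task(task_id, database, script, retention_policy="autogen"):
    """Build the JSON body describing a Kapacitor stream task."""
    return {
        "id": task_id,
        "type": "stream",
        # Database and retention policy the task reads from.
        "dbrps": [{"db": database, "rp": retention_policy}],
        "script": script,
        "status": "enabled",
    }


# Hypothetical threshold for a placeholder metric in the squash_2 database.
script = alert_tickscript("pkg_a", "pkg_a.metric_1", 10.0)
task = kapacitor_task("pkg_a_metric_1_alert", "squash_2", script)
print(json.dumps(task, indent=2))
```

The sketch only builds the task definition; sending it to Kapacitor and attaching alert handlers are left out.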

## Discussion

### InfluxDB

- Built-in HTTP API (but not RESTful)
- CLI
- InfluxDB Python client
- SQL-like query language
- Functions like mean(), median(), min() and max()
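To illustrate the SQL-like query language and its aggregation functions, here is a small sketch that builds an InfluxQL query string. The helper name `mean_query` and the measurement and field names are hypothetical placeholders; the time windows are arbitrary.

```python
def mean_query(measurement, field, window="1d", since="30d"):
    """Build an InfluxQL query averaging a field over fixed time windows."""
    return (
        'SELECT mean("{field}") FROM "{measurement}" '
        "WHERE time > now() - {since} GROUP BY time({window})"
    ).format(
        field=field, measurement=measurement, since=since, window=window
    )


query = mean_query("pkg_a", "pkg_a.metric_1")
print(query)
```

With the Python client, such a string could be passed to `client.query(query)`, and the rows read from the result with `get_points()`.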


### Chronograf

- Written in Go and React.js, implements most of Grafana's functionality but works with InfluxDB only.
- Explore mode: query builder is great, we can have multiple queries in a single graph
- Query builder: flexible, has the ability to edit and test the actual query before submission
- Easy to create dashboards, nice control of the graph properties, has a presentation mode
- [Template variables are great to customize dashboards](https://docs.influxdata.com/chronograf/v1.6/guides/dashboard-template-variables/)
- [You can export your dashboard definition to JSON](https://www.influxdata.com/blog/chronograf-dashboard-definitions/)
- Easy to configure alerting rules
- You can export data from the system (download CSV from the UI, from the [HTTP API](http://localhost:8888/docs#tag/measurements) or from the CLI)
- [Combine metrics and logs dashboard](https://docs.influxdata.com/chronograf/v1.6/guides/analyzing-logs/#logs-in-dashboards).
- Combine data from two different databases in the same plot/dashboard. That's a really important feature for correlating metrics from the DM-EFD and SQuaSH.

### Kapacitor

- Alerting rules
- Time series processing 
- [HTTP API](https://docs.influxdata.com/kapacitor/v1.5/working/api/)