# Logging and Reporting

## Table of contents
* [Parameters](#params)
* [Imports and setup](#imports)
* [Try every server](#every-server)
* [Report](#report)

<a class="anchor" id="params"></a>
## Parameters
The first code cell must contain parameters with string values for compatibility with Times Square.

See: https://rsp.lsst.io/v/usdfdev/guides/times-square/index.html
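
Because every parameter arrives as a string, numeric settings have to be converted before they are used. The following is a minimal sketch of that conversion, using the parameter names from the parameters cell; the helper `parse_params` and its positivity check are hypothetical and not part of this notebook.

```python
def parse_params(record_limit: str, response_timeout: str, read_timeout: str):
    """Convert string parameters into typed values.

    Hypothetical helper, for illustration only.
    """
    limit = int(record_limit)
    if limit <= 0:
        # Assumption: a non-positive limit is never intended.
        raise ValueError(f"record_limit must be positive, got {record_limit!r}")
    # requests accepts a (connect, read) tuple of timeouts in seconds.
    timeout = (float(response_timeout), float(read_timeout))
    return limit, timeout


limit, timeout = parse_params("9999", "3.05", "20")
print(limit, timeout)
```

Converting once, near the top of the notebook, keeps a malformed parameter from surfacing later as a confusing error inside the request loop.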

In [None]:
# Parameters
env = "tucson"  # usdf-dev, tucson, slac, summit
record_limit = "9999"
response_timeout = "3.05"  # seconds, how long to wait for a connection
read_timeout = "20"  # seconds, how long to wait for the server to send data

<a class="anchor" id="imports"></a>
## Imports and General Setup

In [None]:
from collections import defaultdict
from pprint import pp

import pandas as pd
import requests

In [None]:
limit = int(record_limit)
timeout = (float(response_timeout), float(read_timeout))

# Env list comes from the drop-down menu at the top of:
# https://rsp.lsst.io/v/usdfdev/guides/times-square/
envs = dict(
    # rubin_usdf_dev = '',
    # data_lsst_cloud = '',
    # usdf = '',
    # base_data_facility = '',
    summit="https://summit-lsp.lsst.codes",
    usdf_dev="https://usdf-rsp-dev.slac.stanford.edu",
    # rubin_idf_int = '',
    tucson="https://tucson-teststand.lsst.codes",
)
envs

<a class="anchor" id="every-server"></a>
## Try to access every Server, every Log in our list
We call the combination of a specific Server and specific Log a "service".
This is a first look, so we do not try to get a useful list of records.
Instead, we save a few pieces of data from each service; a more tailored web-service call is needed to get useful records. For each service, we save:
1. The number of records retrieved
1. The list of fields found in a record (we assume all records from a service have the same fields)
1. Up to two example records.
1. The [Facets](https://en.wikipedia.org/wiki/Faceted_search) of the service for all service fields that are not explicitly excluded.
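
As a rough illustration of the last item, the facets of a service can be computed by collecting the distinct values seen for each field that is not excluded. This is a sketch only: `compute_facets` is a hypothetical helper, and the sample records and their field values are made up rather than taken from a real service.

```python
def compute_facets(recs, ignore_fields=frozenset()):
    """Map each non-ignored field to the set of distinct values seen.

    Assumes every record has the same fields as the first one.
    List-valued entries are skipped rather than treated as facet values.
    """
    if not recs:
        return {}
    fields = set(recs[0].keys()) - set(ignore_fields)
    return {
        fld: {str(r[fld]) for r in recs if not isinstance(r[fld], list)}
        for fld in sorted(fields)
    }


# Made-up records for illustration; real records come from the web service.
sample = [
    {"id": "a1", "instrument": "LATISS", "is_human": True, "tags": ["x"]},
    {"id": "b2", "instrument": "LSSTCam", "is_human": True, "tags": []},
]
facets_demo = compute_facets(sample, ignore_fields={"id", "tags"})
print(facets_demo)
```

Fields such as `id` are excluded because nearly every record has a different value, so their facets would be as long as the record list itself.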

In [None]:
verbose = False
fields = defaultdict(set)  # fields[(env,log)] = {field1, field2, ...}
examples = defaultdict(list)  # examples[(env,log)] = [rec1, rec2]
results = defaultdict(
    dict
)  # results[(env,log)] = dict(server,url, ok, numfields, numrecs)
facets = defaultdict(
    dict
)  # facets[(env,log)] = dict(field) = set(value-1, value-2, ...)

# Limitation: the same ignore set is used for all logs.
ignore_fields = set(
    [
        "tags",
        "urls",
        "message_text",
        "id",
        "date_added",
        "obs_id",
        "day_obs",
        "seq_num",
        "parent_id",
        "user_id",
        "date_invalidated",
        "date_begin",
        "date_end",
        "time_lost",  # float
        #'systems','subsystems','cscs',  # values are lists, special handling
    ]
)
for env, server in envs.items():
    ok = True
    try:
        recs = None
        log = "exposurelog"
        #!url = f'{server}/{log}/messages?is_human=either&is_valid=either&offset=0&{limit=}'
        url = f"{server}/{log}/messages?is_human=either&is_valid=either&{limit=}"
        print(f"\nAttempt to get logs from {url=}")
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()
        recs = response.json()
        flds = set(recs[0].keys())
        if verbose:
            print(f"Number of {log} records: {len(recs):,}")
            print(f"Got {log} fields: {flds}")
            print(f"Example record: {recs[0]}")
        fields[(env, log)] = flds
        examples[(env, log)] = recs[:2]

        facflds = flds - ignore_fields
        # Fails when r[fld] is a list instead of a single value.
        # This happens occasionally; it is probably a bug in the data.
        facets[(env, log)] = {
            fld: set([str(r[fld]) for r in recs if not isinstance(r[fld], list)])
            for fld in facflds
        }
    except Exception as err:
        ok = False
        print(f"ERROR getting {log} from {env=} using {url=}: {err=}")
    numf = len(flds) if ok else 0
    numr = len(recs) if ok else 0
    results[(env, log)] = dict(
        ok=ok, server=server, url=url, numfields=numf, numrecs=numr
    )

    print()
    ok = True  # reset, so an exposurelog failure does not mark narrativelog as failed
    try:
        recs = None
        log = "narrativelog"
        #! url = f'{server}/{log}/messages?is_human=either&is_valid=true&offset=0&{limit=}'
        url = f"{server}/{log}/messages?is_human=either&is_valid=either&{limit=}"
        print(f"\nAttempt to get logs from {url=}")
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()
        recs = response.json()
        flds = set(recs[0].keys())
        if verbose:
            print(f"Number of {log} records: {len(recs):,}")
            print(f"Got {log} fields: {flds}")
            print(f"Example record: {recs[0]}")
        fields[(env, log)] = flds
        examples[(env, log)] = recs[:2]

        facflds = flds - ignore_fields
        # Fails when r[fld] is a list instead of a single value.
        # This happens occasionally; it is probably a bug in the data.
        # Look for bad facet values like: {'None', None}
        facets[(env, log)] = {
            fld: set([r[fld] for r in recs if not isinstance(r[fld], list)])
            for fld in facflds
        }
    except Exception as err:
        ok = False
        print(f"ERROR getting {log} from {env=} using {url=}: {err=}")
    numf = len(flds) if ok else 0
    numr = len(recs) if ok else 0
    results[(env, log)] = dict(
        ok=ok, server=server, url=url, numfields=numf, numrecs=numr
    )
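
The two `try` blocks above differ only in the log name, so the per-service logic could be factored into one function. The following is a sketch under stated assumptions, not the notebook's implementation: `probe_service` and `fake_fetch` are hypothetical names, and the network call is passed in as a `fetch` callable so the example runs without contacting a server.

```python
def probe_service(env, log, server, limit, fetch):
    """Fetch records for one (env, log) service and summarize the outcome.

    `fetch` is any callable that takes a URL and returns a list of dict
    records, raising an exception on failure. Hypothetical helper.
    """
    url = f"{server}/{log}/messages?is_human=either&is_valid=either&limit={limit}"
    # Start from a failure summary; overwrite it only on success.
    summary = dict(ok=False, server=server, url=url, numfields=0, numrecs=0)
    recs = []
    try:
        recs = fetch(url)
        flds = set(recs[0].keys())  # IndexError on an empty result
        summary.update(ok=True, numfields=len(flds), numrecs=len(recs))
    except Exception as err:
        recs = []
        print(f"ERROR getting {log} from {env=} using {url=}: {err=}")
    return summary, recs


def fake_fetch(url):
    """Stand-in for the network call: narrativelog fails, exposurelog works."""
    if "narrativelog" in url:
        raise ConnectionError("simulated outage")
    return [{"id": "a1", "message_text": "hello"}]


results_demo = {}
for log in ("exposurelog", "narrativelog"):
    summary, _ = probe_service(
        "tucson", log, "https://tucson-teststand.lsst.codes", 10, fake_fetch
    )
    results_demo[("tucson", log)] = summary
```

In the notebook, `fetch` could wrap `requests.get(url, timeout=timeout)` followed by `raise_for_status()` and `.json()`, as the cell above does. Because each call builds its own summary, a failure in one service cannot leak into the result recorded for the next.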

<a class="anchor" id="report"></a>
## Report
This report is aimed at developers; it is unlikely to be useful to astronomers.

<a class="anchor" id="ok_table"></a>
### Success/Failure table

In [None]:
show_columns = ["ok", "server", "numfields", "numrecs"]
df = pd.DataFrame(data=dict(results)).T.loc[:, show_columns]
print(f'Got results from {df["ok"].values.sum()} of {len(df)} env/logs')
df

<a class="anchor" id="field_names"></a>
### Field Names

In [None]:
print("Field names for each Environment/Log source:")
for (env, log), flds in fields.items():
    field_names = ", ".join(flds)
    print(f"\n{env}/{log}: {field_names}")
#!dict(fields)

<a class="anchor" id="facets"></a>
### Facets

In [None]:
#!dict(facets)
for (env, log), flds in facets.items():
    print(f"{env}/{log}:")
    for fld, vals in flds.items():
        print(f"  {fld}: \t{vals}")

<a class="anchor" id="examples"></a>
### Example Records

In [None]:
for (env, log), recs in examples.items():
    print(f"\n{env=}, {log=}: ")
    print("  Example records: ")
    pp(recs)