# Interacting with the exposure log from a notebook

## Introduction

The exposurelog service exposes an endpoint that, depending on how it is called, can perform various actions on the exposure log stored in the database.
The intent of this notebook is to show examples of how one can wrap the underlying calls to make the interaction more like other client libraries.
The service endpoint supports the following operations:
* Get messages -- by default this returns all the messages marked `is_valid`, but it can be configured to return all messages.
* Find a message by ID -- return the message with a particular message ID as a dictionary, if it exists.
* Add a message -- insert a new message in the database.  By default, the observation ID must already exist in the repository, but if `is_new` is set to `True` the service skips that check, on the assumption that the observation ID will show up in the repository eventually.
* Edit a message -- edit an existing message.  In reality, this creates a new message in the database, marks the old one with `is_valid=False`, and records the old message ID as the parent of the new message.
* Delete a message -- remove a message from the exposurelog by marking the record associated with the message ID as invalid.
* Search for messages -- there are many ways to search the messages, and constraints can be stacked together.
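As a rough orientation, the operations above correspond to HTTP requests as summarised in the sketch below. The mapping is read off the helper functions defined later in this notebook rather than taken from the service's own documentation, and `OPERATIONS` is a name made up for illustration.

```python
# Summary of how this notebook's helpers call the service (not an official API table).
# Paths are relative to the messages endpoint; '{id}' stands for a message ID.
OPERATIONS = {
    'get messages': ('GET', ''),
    'find message by id': ('GET', '{id}'),
    'add message': ('POST', ''),
    'edit message': ('PATCH', '{id}'),
    'delete message': ('DELETE', '{id}'),
    'search messages': ('GET', ''),  # constraints are sent as query parameters
}

for name, (method, path) in OPERATIONS.items():
    print(f'{name:20s} {method:6s} messages/{path}')
```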

## Set up some helper functions

In principle there can be lots of instances of the exposure log service.
The `ENDPOINT` variable configures which particular instance you wish to query.

In [None]:
from dataclasses import field, dataclass
from datetime import datetime
import os
import pandas as pd
import requests
from typing import List

ENDPOINT = 'https://base-lsp.lsst.codes/exposurelog/messages/'

This helper checks that a request returned the expected status code and raises an exception otherwise.

In [None]:
def check_resp(resp, success=200):
    if resp.status_code == success:
        return
    else:
        # Maybe try to get some info out of the response on failure
        raise ValueError(f'Request failed with code: {resp.status_code}')
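The comment in `check_resp` hints at pulling more detail out of a failed response. One possible way to do that is sketched below. `describe_failure` is a hypothetical helper, and it assumes only that the response object behaves like a `requests.Response` (`status_code`, `text`, `json()`). Whether the service actually returns a JSON body on errors is not confirmed here.

```python
def describe_failure(resp, max_chars=200):
    """Build an error string from a response-like object (hypothetical helper)."""
    try:
        # Assumption: error responses may carry a JSON body with details.
        detail = resp.json()
    except ValueError:
        # Body is not JSON: fall back to the (truncated) raw text.
        detail = resp.text[:max_chars]
    return f'Request failed with code {resp.status_code}: {detail}'
```

With this in place, `check_resp` could raise `ValueError(describe_failure(resp))` instead of reporting the status code alone.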

Add a message.
If a `user_id` is not specified, the username of this container is used, taken from the `JUPYTERHUB_USER` environment variable.
If a `user_agent` is not specified, a default is used indicating that the message is coming from a notebook running in nublado.

In [None]:
def add_message(obs_id, instrument, message_text, user_id=None, user_agent=None, is_new=False, is_human=True, exposure_flag='none'):
    if exposure_flag not in ['none', 'junk', 'questionable']:
        raise ValueError('The exposure_flag argument must be one of: none, junk, or questionable')
    data = {'obs_id': obs_id, 'instrument': instrument, 'message_text': message_text, 'is_new': is_new, 'is_human': is_human, 'exposure_flag': exposure_flag}
    if user_id:
        data['user_id'] = user_id
    else:
        data['user_id'] = os.environ['JUPYTERHUB_USER']
    if user_agent:
        data['user_agent'] = user_agent
    else:
        data['user_agent'] = 'notebook:nublado'
        
    resp = requests.post(ENDPOINT, json=data)
    check_resp(resp)
    return resp.json()
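Outside of a JupyterHub container the `JUPYTERHUB_USER` variable may not be set, in which case the lookup in `add_message` fails with a bare `KeyError`. A minimal sketch of a friendlier lookup follows. `resolve_user_id` is a hypothetical helper, and the `env` argument exists only so the function can be exercised without touching the real environment.

```python
import os


def resolve_user_id(user_id=None, env=None):
    """Return the explicit user_id, else the JupyterHub username (hypothetical helper)."""
    if user_id:
        return user_id
    env = os.environ if env is None else env
    try:
        return env['JUPYTERHUB_USER']
    except KeyError:
        # Give a clearer message than the bare KeyError.
        raise ValueError('No user_id given and JUPYTERHUB_USER is not set') from None
```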

Get all historical messages.
By default this will return a `pandas.DataFrame`, but can be configured to return a list of dictionaries instead.
Only valid messages are returned unless `all=True`, in which case both valid and invalid messages are returned.

In [None]:
def get_messages(all=False, as_dataframe=True):
    resp = requests.get(ENDPOINT)
    check_resp(resp)
    messages = resp.json()    
    if all:
        params = {'is_valid': False}
        resp = requests.get(ENDPOINT, params=params)
        check_resp(resp)
        messages += resp.json()
    if as_dataframe:
        return pd.DataFrame(messages)
    return messages
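Because the service limits the number of messages returned per request (50 by default, as noted in the search section of this notebook), retrieving every message may require paging. The sketch below shows one way to do that, assuming the `offset` and `limit` parameters behave as ordinary paging controls and that a page shorter than `limit` marks the end. `fetch_all` and `fetch_page` are hypothetical names.

```python
def fetch_all(fetch_page, page_size=50):
    """Collect every record by requesting successive pages.

    `fetch_page(offset, limit)` must return a list of at most `limit` records.
    """
    records = []
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        records.extend(page)
        if len(page) < page_size:
            # A short (or empty) page means there is nothing left to fetch.
            return records
        offset += page_size
```

Against the service, `fetch_page` might be something like `lambda offset, limit: requests.get(ENDPOINT, params={'offset': offset, 'limit': limit}).json()`, though that has not been tested here.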

If you know the ID of a specific message, retrieve that message as a dictionary.

In [None]:
def get_message_by_id(message_id):
    resp = requests.get(f'{ENDPOINT}{message_id}')
    check_resp(resp)
    return resp.json()

Edit a message.
None of the message fields are required, but specifying none of them will result in an exact copy of the message matching `message_id`.
By default this will raise an exception if an invalid message is being edited, but this can be overridden.

In [None]:
def edit_message(message_id, message_text=None, site_id=None, user_id=None, user_agent=None, is_human=None, exposure_flag=None, check_validity=True):
    resp = requests.get(f'{ENDPOINT}{message_id}')
    check_resp(resp)
    message = resp.json()
    if not message['is_valid'] and check_validity:
        raise ValueError(f'Message {message_id} is marked as invalid in the database.')
    data = {}
    loc_vars = locals()
    for k in ['message_text', 'site_id', 'user_id', 'user_agent', 'is_human', 'exposure_flag']:
        # Test against None so falsy values such as is_human=False are kept
        if loc_vars[k] is not None:
            data[k] = loc_vars[k]
        else:
            data[k] = message[k]
    resp = requests.patch(f'{ENDPOINT}{message_id}', json=data)
    check_resp(resp)
    return resp.json()

Delete the message matching `message_id`.

In [None]:
def delete_message(message_id):
    resp = requests.delete(f'{ENDPOINT}{message_id}')
    check_resp(resp, success=204)

Searching is a little more complicated since it can involve lots of parameters.
I chose to implement this as a class with a search method.
Simply instantiate the class passing the constraints you wish and call the `search()` method on the object.
By default it returns the 50 most recent valid messages, which matches the defaults of the service itself.

In [None]:
@dataclass
class MessageSearcher:
    site_ids: List[str] = None
    obs_id: str = None
    instruments: List[str] = None
    min_day_obs: int = None
    max_day_obs: int = None
    message_text: str = None
    user_ids: List[str] = None
    user_agents: List[str] = None
    is_human: bool = None
    is_valid: bool = True
    exposure_flags: str = None
    min_date_added: datetime = None
    max_date_added: datetime = None
    has_date_invalidated: bool = None
    min_date_invalidated: datetime = None
    max_date_invalidated: datetime = None
    has_parent_id: bool = None
    order_by: List[str] = None
    offset: int = 0
    limit: int = 50

    def search(self, as_dataframe=True):
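        # Send only the constraints that are set; test against None so that
        # falsy values such as is_valid=False still reach the service
        params = {k: v for k, v in self.__dict__.items() if v is not None}
        resp = requests.get(ENDPOINT, params=params)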
        check_resp(resp)
        if as_dataframe:
            return pd.DataFrame(resp.json())
        return resp.json()
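When turning constraints into query parameters, it matters to distinguish "not set" (`None`) from values that are falsy but meaningful, such as `False` or `0`. Date constraints also need a string form. The sketch below illustrates both points. `build_params` is a hypothetical helper, and the use of ISO 8601 strings for dates is an assumption about what the service accepts.

```python
from datetime import datetime


def build_params(constraints):
    """Turn a dict of search constraints into query parameters (hypothetical helper)."""
    params = {}
    for key, value in constraints.items():
        if value is None:
            continue  # unset constraint: let the service apply its default
        if isinstance(value, datetime):
            # Assumption: the service accepts ISO 8601 timestamps.
            value = value.isoformat()
        params[key] = value
    return params
```

For a dataclass instance such as a `MessageSearcher`, the input dictionary could come from `dataclasses.asdict(searcher)`.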

## Let's try things out

First, list all valid messages.

In [None]:
valid_messages = get_messages()
valid_messages.sort_values(by=['date_added'], ascending=False)  # Sort messages by when they are added with newer ones on top

Let's try adding a message.
We'll just give the minimum information.
Remember that we have to specify `is_new=True`, since we don't have a valid observation ID hanging around right now.
A copy of the message as it was ingested is returned.

> Note that the service does not currently validate instrument names, so we will have to be fairly rigorous about our naming conventions for the various instruments.

In [None]:
message = add_message('Testing Obs ID', 'AuxTel', 'This is the message text used by the demo notebook', is_new=True)
message

Now we can retrieve the message we just added.

In [None]:
new_message = get_message_by_id(message['id'])
new_message

Why not fix up the message a little?
Notice that the `parent_id` in the edited message points to the message we originally added.

In [None]:
# Use a name other than `edit_message` so the helper function is not overwritten
edited_message = edit_message(message['id'], message_text='An example of changing the message text after the fact')
edited_message

We can see both the parent message and the edited one by looking at the list of all messages.
Notice the original message still exists, but is now marked invalid.

In [None]:
all_messages = get_messages(all=True)
all_messages.sort_values(by=['date_added'], ascending=False)

It turns out we don't want that message after all.
Remember we edited the message, so we need to use the ID of the edited message, not the original we added.
Then list all messages to confirm it is now invalid.

In [None]:
delete_message(edited_message['id'])

In [None]:
all_messages = get_messages(all=True)
all_messages.sort_values(by=['date_added'], ascending=False)
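Since every edit creates a new message whose `parent_id` points at its predecessor, the full history of a message can be reconstructed by following those links. The sketch below does this over a list of message dictionaries. `edit_history` is a hypothetical helper, and it assumes that `parent_id` is `None` (or absent) for a message that was never edited.

```python
def edit_history(messages, message_id):
    """Return the chain of messages from `message_id` back to the original."""
    by_id = {m['id']: m for m in messages}
    chain = []
    seen = set()  # guards against accidental cycles in the data
    current = by_id.get(message_id)
    while current is not None and current['id'] not in seen:
        chain.append(current)
        seen.add(current['id'])
        # Assumption: parent_id is None (or missing) for an unedited message.
        current = by_id.get(current.get('parent_id'))
    return chain
```

The input list could come from `get_messages(all=True, as_dataframe=False)`, so that invalidated parents are included.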