## Analyze and Plot Details for LRAUV ESP Samples from CANON Campaigns
*Query databases directly for detailed information that STOQS API requests don't deliver*

Executing this Notebook requires a personal STOQS server.  It can be run from either a Docker installation or from a development Vagrant Virtual Machine. 

### Docker Instructions
Install and start the software as 
[detailed in the README](https://github.com/stoqs/stoqs#production-deployment-with-docker). (Note that on MacOS you will need to modify settings in `docker-compose.yml` and your `.env` files &mdash; look for comments referencing 'HOST_UID'.)
        
Then, from your `$STOQS_HOME/docker` directory start the Jupyter Notebook server pointing to MBARI's master STOQS database server. Note: firewall rules limit unprivileged access to such resources.

    docker-compose exec \
        -e DATABASE_URL=postgis://everyone:guest@kraken.shore.mbari.org:5432/stoqs \
        stoqs stoqs/manage.py shell_plus --notebook

A message is displayed giving a URL for you to use in a browser on your host, e.g.:

    http://127.0.0.1:8888/?token=<a_token_generated_upon_server_start>

In the browser window opened to this URL navigate to this file (`stoqs/contrib/notebooks/CANON_ESP_Sample_details.ipynb`) and open it. You will then be able to execute the cells and modify the code to suit your needs.

---

### Vagrant VM Instructions
Install and provision your VM as [detailed in the README](https://github.com/stoqs/stoqs#getting-started-with-a-stoqs-development-system).

Configure for connection to the campaign databases on MBARI's master STOQS database server. Note: firewall rules limit unprivileged access to such resources.


    cd $STOQS_HOME/stoqs
    ln -s mbari_campaigns.py campaigns.py
    export DATABASE_URL=postgis://everyone:guest@kraken.shore.mbari.org:5432/stoqs
    
Launch the Jupyter Notebook server on your VM with:

    cd $STOQS_HOME/stoqs/contrib/notebooks
    ../../manage.py shell_plus --notebook

A message is displayed giving a URL for you to use in a browser on your host, e.g.:

    http://127.0.0.1:8888/?token=<a_token_generated_upon_server_start>

Port 8888 on your VM is mapped to port 8887 on your host, so in a web browser on your host open the URL (using the `<a_token_generated_upon_server_start>` printed after the Jupyter Notebook server is started):

    http://127.0.0.1:8887/?token=<a_token_generated_upon_server_start>

Navigate to this file (`stoqs/contrib/notebooks/CANON_ESP_Sample_details.ipynb`) and open it. You will then be able to execute the cells and modify the code to suit your needs.
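
The port mapping described above (VM port 8888 forwarded to host port 8887) can be applied to the printed URL mechanically. The following is a minimal sketch using only the standard library; the helper name `host_url` and the token value `abc123` are hypothetical placeholders, not part of STOQS.

```python
from urllib.parse import urlsplit, urlunsplit

def host_url(vm_url: str, host_port: int = 8887) -> str:
    """Return vm_url with its port replaced by the forwarded host port."""
    parts = urlsplit(vm_url)
    # Rebuild the network location with the same hostname and the host-side port
    netloc = f"{parts.hostname}:{host_port}"
    return urlunsplit(parts._replace(netloc=netloc))

# 'abc123' stands in for the token printed when the Notebook server starts
print(host_url("http://127.0.0.1:8888/?token=abc123"))
```

Only the port changes; the path and the token query string are carried over untouched.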

In [None]:
import os
from collections import defaultdict
os.environ["DJANGO_ALLOW_ASYNC_UNSAFE"] = "true"

# Define CANON Campaigns for which there are LRAUV ESP Samples
dbs = {'CN17S': 'stoqs_canon_april2017',
       'CN18S': 'stoqs_canon_may2018',
       'CN18F': 'stoqs_canon_september2018',
       'CN19S': 'stoqs_canon_may2019',
       'CN19F': 'stoqs_canon_fall2019',
       'CN20S': 'stoqs_canon_july2020',
       'CN20F': 'stoqs_canon_october2020',
      }

# Save the ESP Sample identifiers (Activity name) in each Campaign
db_samples = defaultdict(list)
for cid, db in dbs.items():
    print(f"{cid}: {db}")
    for sample in (Sample.objects.using(db)
                   .filter(instantpoint__activity__platform__name__contains='ESP')
                   .values_list('instantpoint__activity__name', flat=True)):
        ##print(f"\t{sample}")
        db_samples[cid].append(sample)


In [None]:
for cid, samples in db_samples.items():
    print(f"{cid}")
    for sample in samples:
        print(f"\t{sample}")
        # Query the measurement locations for this sample once and reuse the result
        sample_locations = (Measurement.objects.using(dbs[cid])
                            .filter(instantpoint__activity__name=sample))
        for location in sample_locations:
            lon, lat, depth = (location.geom.x, location.geom.y, location.depth)
            print(f"{lon}, {lat}, {depth}")

        ##breakpoint()

In [None]:
acts = Activity.objects.using(db).filter(instantpoint__measurement__in=near_ts_loc)

In [None]:
acts.values_list('platform__name', flat=True).distinct()

In [None]:
pctds = acts.filter(platform__name='WesternFlyer_PCTD').order_by('startdate').distinct()
esps = acts.filter(platform__name='makai_ESP_Archive').order_by('startdate').distinct()

In [None]:
pctds

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt
plt.scatter([pctd.mappoint.x for pctd in pctds],
            [pctd.mappoint.y for pctd in pctds], c='b')
plt.scatter([esp.mappoint.x for esp in esps],
            [esp.mappoint.y for esp in esps], c='r')

In [None]:
%matplotlib inline

import matplotlib.pyplot as plt
import matplotlib.dates as mdates  # needed for the date formatters and locators below
from matplotlib import pylab
from numpy import arange
import operator

def plot_platforms(ax):
    plat_labels = []

    # Plot in order by platformtype name and platform name
    for ypos, plat in enumerate(
                        sorted(plat_start_ends.keys(),
                               key=operator.attrgetter('platformtype.name', 'name'))):
        plat_labels.append(f'{plat.name} ({plat.platformtype.name})')    
        for bdate, edate in plat_start_ends[plat]:
            dd = edate - bdate
            if dd < 1:
                dd = 1
            ax.barh(ypos+0.5, dd, left=bdate, height=0.8, 
                    align='center', color='#' + plat.color, alpha=1.0) 

    ax.set_title(Campaign.objects.using(db).get(id=1).description)
    ax.set_ylim(-0.5, len(plat_labels) + 0.5)
    ax.set_yticks(arange(len(plat_labels)) + 0.5)
    ax.set_yticklabels(plat_labels)

    ax.grid(True)
    # Format the time axis on the Axes that was passed in
    ax.xaxis.set_major_formatter(mdates.DateFormatter('%B %Y'))
    ax.xaxis.set_major_locator(mdates.MonthLocator())
    ax.xaxis.set_minor_locator(mdates.DayLocator())
    ax.figure.autofmt_xdate()

pylab.rcParams['figure.figsize'] = (15, 9)
fig, ax = plt.subplots()
plot_platforms(ax)
plt.show()

There appear to be 10 events measured by the Benthic Event Detectors. Let's find the start times for these events and use _k_-means clustering to group the BEDs event data start times into 10 clusters.

In [None]:
import numpy as np
from sklearn.cluster import KMeans
bed_starts = np.array(Activity.objects.using(db)
                              .filter(platform__name__contains='BED')
                              .values_list('startdate', flat=True)
                              .order_by('startdate'), dtype=np.datetime64)
km = KMeans(n_clusters=10).fit(bed_starts.reshape(-1, 1))

For each cluster, pick the earliest start time and construct begin and end times around it. These time windows tell the STOQS loader when to load ADCP data from all the moorings into the database.

In [None]:
events = {}
for bed_start in bed_starts:
    label = km.predict(bed_start.reshape(-1, 1))[0]
    if label not in events:
        events[label] = bed_start
    # Print the clusters of start times and tune n_clusters above to get the optimal set
    ##print(bed_start, label)
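
The clustering and earliest-start selection above can be tried without a database connection. The sketch below uses synthetic start times (illustrative values, not real BED data) and converts them to float hours before fitting, which is an assumption made here to keep the input plainly numeric; `n_clusters=3`, `n_init=10` and `random_state=0` are likewise choices for this example only.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic, sorted start times: three bursts separated by weeks (illustrative only)
starts = np.array([
    "2016-01-15T12:00", "2016-01-15T12:05", "2016-01-15T12:20",
    "2016-03-06T03:00", "2016-03-06T03:10",
    "2016-11-24T18:30", "2016-11-24T18:31", "2016-11-24T18:45",
], dtype="datetime64[s]")

# Express each start as float hours since the first one so KMeans sees plain numbers
hours = (starts - starts[0]).astype("timedelta64[s]").astype(float) / 3600.0
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(hours.reshape(-1, 1))

# Because starts is sorted, the first time seen for each label is that cluster's earliest
earliest = {}
for start, label in zip(starts, km.labels_):
    earliest.setdefault(int(label), start)

for label, start in sorted(earliest.items(), key=lambda kv: kv[1]):
    print(label, start)
```

The cluster labels themselves are arbitrary integers; only the grouping matters, which is why the result is keyed by label but ordered by time when printed.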

Print `Event()` instances of beginning and end times for use in [loadCCE_2015.py](https://github.com/stoqs/stoqs/blob/3a596e6791104054c676a0ba84e81ec02b7ca06b/stoqs/loaders/CCE/loadCCE_2015.py#L23-L32).

In [None]:
from datetime import datetime, timedelta
import matplotlib.dates as mdates

event_start_ends = defaultdict(list)

def print_Events(events, before, after, resolution):
    """Print an Event() line for each event and save its window for plotting."""
    for start in events.values():
        beg = start.astype(datetime) - before
        end = start.astype(datetime) + after
        # repr() yields 'datetime.datetime(...)'; drop the module prefix
        beg_dt = repr(beg).replace('datetime.', '')
        end_dt = repr(end).replace('datetime.', '')
        event_start_ends[resolution].append((mdates.date2num(beg),
                                             mdates.date2num(end)))
        print(f"        Event({beg_dt}, {end_dt}),")

# Low-resolution region: 1 day before to 2 days after the start of each event
before = timedelta(days=1)
after = timedelta(days=2)
print("lores_event_times = [")
print_Events(events, before, after, 'lores')
print("                    ]")

# High-resolution region: 4 hours before to 14 hours after the start of each event
before = timedelta(hours=4)
after = timedelta(hours=14)
print("hires_event_times = [")
print_Events(events, before, after, 'hires')
print("                    ]")
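
The window arithmetic and the `repr()` trick used above can be checked in isolation with the standard library. This is a small sketch on one made-up start time; the helper names `event_window` and `format_event` are hypothetical and exist only for this illustration.

```python
from datetime import datetime, timedelta

def event_window(start, before, after):
    """Return the (begin, end) datetimes surrounding an event start."""
    return start - before, start + after

def format_event(beg, end):
    """Format a window as an Event(...) line, dropping the 'datetime.' module prefix."""
    return f"Event({beg!r}, {end!r})".replace("datetime.", "")

# A made-up event start, not taken from any campaign database
start = datetime(2016, 1, 15, 12, 0)

# Low-resolution window: 1 day before to 2 days after
lores = format_event(*event_window(start, timedelta(days=1), timedelta(days=2)))
# High-resolution window: 4 hours before to 14 hours after
hires = format_event(*event_window(start, timedelta(hours=4), timedelta(hours=14)))

print(lores)
print(hires)
```

Stripping the `datetime.` prefix leaves text that can be pasted into a module that does `from datetime import datetime`.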

Plot timeline again, but this time with events as shaded regions across all the Platforms.

In [None]:
def plot_events(ax):
    for resolution in ('lores', 'hires'):
        for bdate, edate in event_start_ends[resolution]:
            # Widths are in days; enforce a 1-day minimum (hires windows span only 18 hours)
            dd = edate - bdate
            if dd < 1:
                dd = 1
            # Plot discovered events as translucent gray bars spanning all platforms
            ax.barh(0, dd, left=bdate, height=32, 
                    align='center', color='#000000', alpha=0.1) 

pylab.rcParams['figure.figsize'] = (15, 9)
fig, ax2 = plt.subplots()
plot_platforms(ax2)
plot_events(ax2)
plt.show()
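
As an alternative to the tall `barh` bars used above, Matplotlib's `Axes.axvspan` shades a vertical band across the full height of the axes regardless of the y-limits. The sketch below is a swapped-in technique, not what this notebook does, and it uses made-up event start times rather than values from a database.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripts; omit this line inside a notebook
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from datetime import datetime, timedelta

# Made-up event starts and the low-resolution window from the text
starts = [datetime(2016, 1, 15, 12), datetime(2016, 3, 6, 3)]
before, after = timedelta(days=1), timedelta(days=2)

fig, ax = plt.subplots(figsize=(15, 4))
spans = []
for start in starts:
    # axvspan takes x limits in data coordinates, here Matplotlib date numbers
    spans.append(ax.axvspan(mdates.date2num(start - before),
                            mdates.date2num(start + after),
                            color="#000000", alpha=0.1))

# Treat the x axis as dates and show a range that contains both events
ax.xaxis_date()
ax.set_xlim(mdates.date2num(datetime(2016, 1, 1)),
            mdates.date2num(datetime(2016, 4, 1)))
fig.autofmt_xdate()
```

With this approach there is no need to pick a bar height large enough to cover every platform row, and no minimum width is imposed on short windows.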