## NWB-Datajoint tutorial 2

**Note: make a copy of this notebook and run the copy to avoid git conflicts in the future**

This is the second in a multi-part tutorial on the NWB-Datajoint pipeline used in Loren Frank's lab, UCSF. It demonstrates how to curate the results of spike sorting.

Finish [tutorial 0](0_intro.ipynb) and [tutorial 1](1_spikesorting.ipynb) before proceeding.

Let's start by importing the `nwb_datajoint` package, along with a few others. 

In [None]:
import os
import numpy as np

import nwb_datajoint as nd

import warnings
warnings.simplefilter('ignore', category=DeprecationWarning)
warnings.simplefilter('ignore', category=ResourceWarning)

In [None]:
# We also import a bunch of tables so that we can call them easily
from nwb_datajoint.common import (RawPosition, HeadDir, Speed, LinPos, StateScriptFile, VideoFile,
                                  DataAcquisitionDevice, CameraDevice, Probe,
                                  DIOEvents,
                                  ElectrodeGroup, Electrode, Raw, SampleCount,
                                  LFPSelection, LFP, LFPBandSelection, LFPBand,
                                  SortGroup, SpikeSorting, SpikeSorter, SpikeSorterParameters,
                                  SpikeSortingWaveformParameters, SpikeSortingParameters,
                                  SpikeSortingMetrics, CuratedSpikeSorting,
                                  FirFilter,
                                  IntervalList, SortInterval,
                                  Lab, LabMember, Institution,
                                  BrainRegion,
                                  SensorData,
                                  Session, ExperimenterList,
                                  Subject,
                                  Task, TaskEpoch,
                                  Nwbfile, AnalysisNwbfile, NwbfileKachery, AnalysisNwbfileKachery)

In the previous tutorials, we have inserted and sorted our data. This tutorial will go through manual curation. First, a few words on the various tools that we will be using and the ways in which they come into play for curation:

[kachery-p2p](https://github.com/flatironinstitute/kachery-p2p): This tool allows us to store and share data conveniently by assigning a URI (uniform resource identifier; in this case, a 40-digit hexadecimal SHA-1 hash) to each file and enabling peer-to-peer download. The advantage of this approach is that a file can be identified without specifying its path, which may differ between machines. This facilitates sharing. To use `kachery-p2p`, a daemon must be running. On the Frank lab system, where we share a number of compute resources (e.g. the _virgas_), we have created a dedicated user (`kacheryuser`) who runs the daemon on `typhoon`. As long as you're in the `kachery-users` group (you can check this by logging onto one of our compute servers and typing `groups`), you will be able to use this daemon. In the context of curation, `kachery-p2p` is used to create a _feed_, which you can think of as an append-only log. The feed (like other files) is identified by its URI and is stored in the `kachery-storage` directory via `kachery-p2p`.
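
To make the idea of identifying a file by its content rather than its path concrete, here is a minimal sketch using only Python's standard `hashlib`. It does not call `kachery-p2p`; the helper name `sha1_uri` and the `sha1://` prefix are assumptions chosen for illustration.

```python
import hashlib


def sha1_uri(data: bytes) -> str:
    """Return a content-derived identifier built from the SHA-1 hash of `data`.

    Hypothetical helper for illustration; this is not the kachery-p2p API.
    """
    digest = hashlib.sha1(data).hexdigest()  # 40 hexadecimal characters
    return f"sha1://{digest}"


# Identical content gives the same identifier, wherever the bytes came from
uri_a = sha1_uri(b"hello")
uri_b = sha1_uri(b"hello")
# Any change to the content gives a different identifier
uri_c = sha1_uri(b"hello!")

print(uri_a)
print(uri_a == uri_b, uri_a == uri_c)
```

Because the identifier depends only on the bytes, two machines that store the same file under different paths still agree on what to call it.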

[labbox-ephys](https://github.com/flatironinstitute/labbox-ephys): This is a visualization tool for curating the results of spike sorting. At the end of spike sorting, a feed is created (the `curation_feed_uri` attribute in the `SpikeSorting` table) and the recording and its associated sorting objects are added to it. The feed is then considered a _workspace_ (this distinction is somewhat superficial; when you see _feed_, think of a log that can hold any information and is kept track of by `kachery-p2p`; when you see _workspace_, think of a feed in the context of spike sorting). You can add multiple recording and sorting objects to a single workspace. `labbox-ephys` then prepares the visualization for curation based on the information in the workspace, and appends a message to it whenever the user takes an action (e.g. giving a label to a unit).
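
As a rough mental model of a workspace, the toy class below implements an append-only log in plain Python. It is a sketch under stated assumptions, not the `kachery-p2p` or `labbox-ephys` API: the class `ToyFeed`, the message fields, and the label value are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class ToyFeed:
    """Hypothetical append-only log: messages can be added, never edited or removed."""

    _messages: List[Dict[str, Any]] = field(default_factory=list)

    def append(self, message: Dict[str, Any]) -> None:
        # Store a copy so later changes to the caller's dict cannot alter the log
        self._messages.append(dict(message))

    @property
    def messages(self) -> Tuple[Dict[str, Any], ...]:
        # Hand out copies in an immutable tuple; the log itself stays untouched
        return tuple(dict(m) for m in self._messages)


# A "workspace" is a feed whose messages describe recordings, sortings, and user actions
feed = ToyFeed()
feed.append({"action": "add_recording", "recording_id": "R1"})
feed.append({"action": "add_sorting", "recording_id": "R1", "sorting_id": "S1"})
feed.append({"action": "add_unit_label", "sorting_id": "S1", "unit_id": 3, "label": "accept"})

for index, message in enumerate(feed.messages):
    print(index, message)
```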

Curation can be done either via a web app or in Jupyter Lab. Here we will first explore curation with the Jupyter Lab widget.

In [None]:
# Define the name of the file that you copied and renamed; make sure it's something unique. 
nwb_file_name = 'beans20190718.nwb'
filename, file_extension = os.path.splitext(nwb_file_name)
# This is a copy of the original nwb file, except it doesn't contain the raw data (for storage reasons)
nwb_file_name2 = filename + '_' + file_extension

### Jupyter Lab widget

First, make sure that our results are stored in the `SpikeSorting` table.

In [None]:
SpikeSorting & {'nwb_file_name': nwb_file_name2}

In [None]:
# Import labbox-ephys, its Jupyter widgets, and spikeextractors
import labbox_ephys as le
import labbox_ephys_widgets_jp as lew
import spikeextractors as se
import numpy as np

# Load the workspace; fetch1 expects exactly one matching row in SpikeSorting
workspace_uri = (SpikeSorting & {'nwb_file_name': nwb_file_name2}).fetch1('curation_feed_uri')
workspace = le.load_workspace(workspace_uri=workspace_uri)

In [None]:
# View the workspace
workspace_view = lew.WorkspaceView(workspace=workspace)
display(workspace_view)

Once you open the workspace view, you will see the set of recordings that have been added to the workspace. Clicking on a recording takes you to a page with information about the recording and its associated sorting objects. Clicking on a sorting then takes you to the curation view. Try exploring the many visualization widgets. The most important are the `Units Table` and the Curation menu, which allow you to assign labels to units. The curation labels will persist even if you suddenly lose connection to the Jupyter Lab session, because each label is appended to the workspace as soon as the action is taken.
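
The persistence described above can be illustrated with a small sketch: every action is written to disk as soon as it happens, and the current labels are rebuilt by replaying the log. This is not how `labbox-ephys` is implemented; the JSON-lines file, the function names, and the action and label names are assumptions made for illustration.

```python
import json
import os
import tempfile


def record_action(log_path, action):
    """Append one action to the log file immediately (one JSON object per line)."""
    with open(log_path, "a") as f:
        f.write(json.dumps(action) + "\n")


def replay_labels(log_path):
    """Rebuild the current labels of each unit by replaying every logged action in order."""
    labels = {}
    with open(log_path) as f:
        for line in f:
            action = json.loads(line)
            unit_labels = labels.setdefault(action["unit_id"], set())
            if action["type"] == "add_label":
                unit_labels.add(action["label"])
            elif action["type"] == "remove_label":
                unit_labels.discard(action["label"])
    return labels


# Hypothetical curation session: each action reaches disk before the next one starts
log_path = os.path.join(tempfile.mkdtemp(), "curation_log.jsonl")
record_action(log_path, {"type": "add_label", "unit_id": 1, "label": "accept"})
record_action(log_path, {"type": "add_label", "unit_id": 2, "label": "noise"})
record_action(log_path, {"type": "remove_label", "unit_id": 2, "label": "noise"})
record_action(log_path, {"type": "add_label", "unit_id": 2, "label": "mua"})

# After a lost session, replaying the log from disk recovers the same state
labels = replay_labels(log_path)
print(labels)
```

Since nothing is held only in memory, a dropped connection loses at most the action that was in progress.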

In [None]:
# TODO
# get a new sorting extractor and recompute metrics; then populate downstream tables
# try opening the analysis NWB file as an NwbSortingExtractor
# check with Alessio about timestamps in recording and sorting extractors
# common analyses that involve LFP, e.g. filtering, phase histogram of spiking; Eric has some ripple detection code

### Web app

In addition to the Jupyter widget, the curation interface is available as a web application. The same recording and sorting pairs can be accessed via both methods.

To access the web app, go to https://sortingview.vercel.app/. Click on 'Start by selecting a backend provider.' Enter the URI for the Frank lab backend (`gs://labbox-franklab/sortingview-backends/franklab.json`) and click on 'Set backend provider'. Now click 'select a workspace.' This will take you to a page that lists all the workspaces that we have created. Your workspace should be named after your user name on our servers. Click on it and you should see the recordings that you added, similar to the workspace view in the Jupyter widget.

Note that you cannot yet set curation labels, because you have not authenticated yourself. We use Google accounts for this. First, contact Kyu and tell him which Google account you plan to use. He will then add you to the list of users who can modify the workspaces. Then go back to the first page of `sortingview` and click 'sign in using a Google account.' You should then be able to log in.