# DataJoint U24 - Workflow Array Electrophysiology

## Setup

First, let's change directories to find the `dj_local_conf` file.

In [1]:
import os

# Change to the upper-level folder to detect dj_local_conf.json
if os.path.basename(os.getcwd()) == 'notebooks':
    os.chdir('..')
assert os.path.basename(os.getcwd()) == 'workflow-array-ephys', (
    "Please move to the workflow directory")

# Import DataJoint only after changing directories so the local config is found.
# We'll be working with long tables, so limit the number of rows displayed.
import datajoint as dj
dj.config['display.limit'] = 10

If you haven't already populated the `lab`, `subject`, `session`, `probe`, and `ephys` schemas, please do so now with [04-automate](./04-automate-optional.ipynb). Note: exporting `ephys` data is currently only supported on the `no_curation` schema. 

In [2]:
from workflow_array_ephys.pipeline import lab, subject, session, probe, ephys
from workflow_array_ephys.export import (element_lab_to_nwb_dict, subject_to_nwb, 
                                         session_to_nwb, ecephys_session_to_nwb, 
                                         write_nwb)
from element_interface.dandi import upload_to_dandi

Connecting cbroz@dss-db.datajoint.io:3306


## Export to NWB

Each of the following elements has tools for interacting with NWB files: `element-lab`, `element-session`, `element-array-ephys`, and `element-interface`. We'll use the following keys for testing these functions:

In [3]:
lab_key={"lab": "LabA"}
protocol_key={"protocol": "ProtA"}
project_key={"project": "ProjA"}
session_key={"subject": "subject5",
             "session_datetime": "2018-07-03 20:32:28"}


### Element Lab

Because an NWB file must include session information, `element_lab_to_nwb_dict` cannot produce a file on its own; it only packages information from the Lab schema into a `dict`. This is helpful for a team that uses Element Lab but not the other Elements.

In [3]:
help(element_lab_to_nwb_dict)

Help on function element_lab_to_nwb_dict in module element_lab.export.nwb:

element_lab_to_nwb_dict(lab_key=None, project_key=None, protocol_key=None)
    Generate a dictionary object containing all relevant lab information used
        when generating an NWB file at the session level.
        All parameters optional, but should only specify one of respective type
    Use: mynwbfile = pynwb.NWBFile(identifier="your identifier",
                             session_description="your description",
                             session_start_time=session_datetime,
                             **element_lab_to_nwb_dict(
                                lab_key=key1,
                                project_key=key2,
                                protocol_key=key3))
    
    :param lab_key: Key specifying one entry in element_lab.lab.Lab
    :param project_key: Key specifying one entry in element_lab.lab.Project
    :param protocol_key: Key specifying one entry in element_lab.lab.Protocol
  

In [11]:
element_lab_to_nwb_dict(lab_key=lab_key, protocol_key=protocol_key, 
                        project_key=project_key)

{'institution': 'Example Uni',
 'lab': 'The Example Lab',
 'experiment_description': 'Example project to populate element-lab',
 'keywords': ['Example', 'Study'],
 'related_publications': ['arXiv:1807.11104', 'arXiv:1807.11104v1'],
 'protocol': 'ProtA',
 'notes': 'Protocol for managing data ingestion'}
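
As the docstring above suggests, the returned dictionary is meant to be unpacked into the `pynwb.NWBFile` constructor alongside the required session-level fields. The following is a minimal sketch of that merge in plain Python, so it runs without `pynwb`. `build_nwbfile_kwargs` is a hypothetical helper, not part of any Element, and the `identifier` and `session_description` values are placeholders.

```python
from datetime import datetime, timezone


def build_nwbfile_kwargs(required, lab_metadata):
    """Merge required session-level fields with optional lab metadata.

    Hypothetical helper for illustration; raises if a lab field would
    silently overwrite a required field.
    """
    overlap = set(required) & set(lab_metadata)
    if overlap:
        raise ValueError(f"Conflicting NWBFile fields: {sorted(overlap)}")
    return {**required, **lab_metadata}


# Values copied from the element_lab_to_nwb_dict output above
lab_metadata = {
    "institution": "Example Uni",
    "lab": "The Example Lab",
    "experiment_description": "Example project to populate element-lab",
    "keywords": ["Example", "Study"],
    "related_publications": ["arXiv:1807.11104", "arXiv:1807.11104v1"],
    "protocol": "ProtA",
    "notes": "Protocol for managing data ingestion",
}

# Placeholder session-level fields (assumed values, for illustration only)
required = {
    "identifier": "your identifier",
    "session_description": "your description",
    "session_start_time": datetime(2018, 7, 4, 1, 32, 28, tzinfo=timezone.utc),
}

nwbfile_kwargs = build_nwbfile_kwargs(required, lab_metadata)
print(sorted(nwbfile_kwargs))
```

With `pynwb` installed, the merged dictionary could then be passed along as `pynwb.NWBFile(**nwbfile_kwargs)`.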

### Element Animal

`subject_to_nwb` uses a session key to retrieve subject information and returns an NWB file with the subject fields populated. When packaging into an NWB file, `pynwb` displays a warning about timezone information: datetime fields are assumed to be in local time and will be converted to UTC.

In [19]:
subject_to_nwb(session_key=session_key)

  warn("Date is missing timezone information. Updating to local timezone.")


subject pynwb.file.Subject at 0x140613566479040
Fields:
  date_of_birth: 2020-01-01 00:00:00-06:00
  description: {"subject": "subject5", "sex": "F", "subject_birth_date": "2020-01-01", "subject_description": "rich", "line": null, "strain": null, "source": null}
  sex: F
  subject_id: subject5

### Element Session

`session_to_nwb` pulls the same information as above and adds the session experimenter and session time. The export process raises the same warning about timezone conversion to UTC.


In [21]:
session_to_nwb(session_key=session_key)

  warn("Date is missing timezone information. Updating to local timezone.")


root pynwb.file.NWBFile at 0x140613434487232
Fields:
  experimenter: ['User1']
  file_create_date: [datetime.datetime(2022, 5, 31, 13, 58, 25, 578725, tzinfo=tzlocal())]
  identifier: 9d2131bb-5747-4bf1-95f0-4b24b54b1968
  session_description: Successful data collection
  session_id: subject5_2018-07-03T20:32:28
  session_start_time: 2018-07-04 01:32:28+00:00
  subject: subject pynwb.file.Subject at 0x140613566479712
Fields:
  date_of_birth: 2020-01-01 00:00:00-06:00
  description: {"subject": "subject5", "sex": "F", "subject_birth_date": "2020-01-01", "subject_description": "rich", "line": null, "strain": null, "source": null}
  sex: F
  subject_id: subject5

  timestamps_reference_time: 2018-07-04 01:32:28+00:00
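
The `session_start_time` above differs from the `session_datetime` in the key because of the timezone handling described in the warning. A minimal sketch of that conversion using only the standard library follows. The fixed UTC-05:00 offset is an assumption inferred from the five-hour difference in the output above; `pynwb` itself uses the local timezone of the machine running the export.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the session was recorded at UTC-05:00 (the offset implied by
# the output above). A fixed offset keeps the example independent of the
# machine's local timezone settings.
local_tz = timezone(timedelta(hours=-5))

naive_start = datetime.strptime("2018-07-03 20:32:28", "%Y-%m-%d %H:%M:%S")
aware_start = naive_start.replace(tzinfo=local_tz)  # attach the assumed local zone
utc_start = aware_start.astimezone(timezone.utc)    # convert to UTC

print(utc_start)  # 2018-07-04 01:32:28+00:00
```

Storing timezone-aware datetimes upstream would avoid the warning, since nothing would then need to be assumed at export time.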


### Element Array Electrophysiology

`ecephys_session_to_nwb` provides a full export mechanism, returning an NWB file with raw data, spikes, and LFP. Optional arguments determine which pieces are exported. For demonstration purposes, we recommend limiting `end_frame`.


In [22]:
help(ecephys_session_to_nwb)

Help on function ecephys_session_to_nwb in module element_array_ephys.export.nwb.nwb:

ecephys_session_to_nwb(session_key, raw=True, spikes=True, lfp='source', end_frame=None, lab_key=None, project_key=None, protocol_key=None, nwbfile_kwargs=None)
    Main function for converting ephys data to NWB
    
    Parameters
    ----------
    session_key: dict
    raw: bool
        Whether to include the raw data from source. SpikeGLX and OpenEphys are supported
    spikes: bool
        Whether to include CuratedClustering
    lfp:
        "dj" - read LFP data from ephys.LFP
        "source" - read LFP data from source (SpikeGLX supported)
        False - do not convert LFP
    end_frame: int, optional
        Used to create small test conversions where large datasets are truncated.
    lab_key, project_key, and protocol_key: dictionaries used to look up optional additional metadata
    nwbfile_kwargs: dict, optional
        - If element-session is not being used, this argument is required an

In [4]:
nwbfile = ecephys_session_to_nwb(session_key=session_key,
                                 raw=True,
                                 spikes=True,
                                 lfp="dj",
                                 end_frame=100,
                                 lab_key=lab_key,
                                 project_key=project_key,
                                 protocol_key=protocol_key,
                                 nwbfile_kwargs=None)

  warn("Date is missing timezone information. Updating to local timezone.")
creating units table for paramset 0: 100%|██████████| 499/499 [00:41<00:00, 12.11it/s]
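
To get a feel for how small a test conversion with `end_frame=100` is, the arithmetic below estimates the duration of the truncated recording. It assumes that a frame is one sample per channel and that the raw data were sampled at 30 kHz. Both are assumptions for illustration and should be checked against your own recording metadata. `truncated_duration_s` is a hypothetical helper, not part of `element-array-ephys`.

```python
def truncated_duration_s(end_frame, sampling_rate_hz):
    """Return the duration in seconds covered by the first `end_frame` frames."""
    if end_frame < 0 or sampling_rate_hz <= 0:
        raise ValueError("end_frame must be >= 0 and sampling_rate_hz > 0")
    return end_frame / sampling_rate_hz


# Assumed sampling rate (hypothetical); replace with your recording's value
sampling_rate_hz = 30_000
duration_ms = truncated_duration_s(100, sampling_rate_hz) * 1000
print(f"{duration_ms:.2f} ms")
```

Under these assumptions the truncated raw data cover only a few milliseconds, which is why the conversion stays fast.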


In [5]:
nwbfile

root pynwb.file.NWBFile at 0x140297891486016
Fields:
  acquisition: {
    ElectricalSeries1 <class 'pynwb.ecephys.ElectricalSeries'>,
    ElectricalSeries2 <class 'pynwb.ecephys.ElectricalSeries'>
  }
  devices: {
    262716621 <class 'pynwb.device.Device'>,
    714000838 <class 'pynwb.device.Device'>
  }
  electrode_groups: {
    probe262716621_shank0 <class 'pynwb.ecephys.ElectrodeGroup'>,
    probe714000838_shank0 <class 'pynwb.ecephys.ElectrodeGroup'>
  }
  electrodes: electrodes <class 'hdmf.common.table.DynamicTable'>
  experiment_description: Example project to populate element-lab
  experimenter: ['User1']
  file_create_date: [datetime.datetime(2022, 5, 31, 15, 47, 41, 270996, tzinfo=tzlocal())]
  identifier: 172f2d3b-44c1-4ae1-8785-2d20d3df3db1
  institution: Example Uni
  keywords: ['Example' 'Study']
  lab: The Example Lab
  notes: Protocol for managing data ingestion
  processing: {
    ecephys <class 'pynwb.base.ProcessingModule'>
  }
  protocol: ProtA
  related_publicatio

`write_nwb` can then be used to write this file to disk. The following cell will include a timestamp in the filename.

In [6]:
import time

# Build a filename such as _test_20220531-154741.nwb under ./temp_nwb/
write_nwb(nwbfile, f'./temp_nwb/{time.strftime("_test_%Y%m%d-%H%M%S.nwb")}')
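
Whether `write_nwb` creates a missing output directory is not confirmed here, so it can be safer to prepare the path beforehand. The sketch below makes no assumption about `write_nwb` internals: it builds the same kind of timestamped filename and ensures the target directory exists. `timestamped_nwb_path` is a hypothetical helper, and a temporary directory stands in for `./temp_nwb/`.

```python
import tempfile
import time
from pathlib import Path


def timestamped_nwb_path(directory, prefix="_test_"):
    """Build a timestamped .nwb path, creating the directory if needed."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    return directory / f"{prefix}{time.strftime('%Y%m%d-%H%M%S')}.nwb"


# A temporary directory stands in for ./temp_nwb/ in this sketch
with tempfile.TemporaryDirectory() as tmp:
    nwb_path = timestamped_nwb_path(Path(tmp) / "temp_nwb")
    directory_created = nwb_path.parent.is_dir()
    print(nwb_path.name)
```

Because the timestamp has one-second resolution, two files written within the same second would share a name.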

## DANDI Export

The `element_interface.dandi` module includes the `upload_to_dandi` utility to support direct uploads. For more information, see the [DANDI documentation](https://www.dandiarchive.org/handbook/10_using_dandi/).

To upload, you'll need:
1. A DANDI account
2. A `DANDI_API_KEY`
3. A `dandiset_id`

These values can be added to your `dj.config` as follows:

In [None]:
dj.config['custom']['dandiset_id']="<six digits as string>" 
dj.config['custom']['dandi.api']="<40-character alphanumeric string>"

Storing these values in `dj.config` means you don't have to re-enter them each time you update your dandiset.
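
Before uploading, it can help to confirm that both values have the shape described by the placeholders above. The following is a minimal sketch under that assumption. `check_dandi_settings` is a hypothetical helper, a plain dictionary stands in for `dj.config['custom']`, and the example values are made up. Reading the key from a `DANDI_API_KEY` environment variable is one option that keeps it out of the notebook.

```python
import os
import re


def check_dandi_settings(custom):
    """Return a list of problems found in the DANDI entries of a config dict."""
    problems = []
    if not re.fullmatch(r"\d{6}", str(custom.get("dandiset_id", ""))):
        problems.append("dandiset_id should be a six-digit string")
    if not re.fullmatch(r"[A-Za-z0-9]{40}", str(custom.get("dandi.api", ""))):
        problems.append("dandi.api should be a 40-character alphanumeric string")
    return problems


# A plain dict stands in for dj.config['custom']; the values are made up.
custom = {
    "dandiset_id": "200178",
    # Fall back to a dummy key if DANDI_API_KEY is not set in the environment
    "dandi.api": os.environ.get("DANDI_API_KEY", "a" * 40),
}
print(check_dandi_settings(custom))
```

An empty list means both entries match the expected format; it does not confirm that the key is valid for the DANDI server.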

In [3]:
upload_to_dandi(
    data_directory="./temp_nwb/",
    dandiset_id=dj.config['custom']['dandiset_id'],
    staging=True,
    working_directory="./temp_nwb/",
    api_key=dj.config['custom']['dandi.api'],
    sync=False)

A newer version (0.40.0) of dandi/dandi-cli is available. You are using 0.39.4


PATH                 SIZE DONE    DONE% CHECKSUM STATUS MESSAGE   
dandiset.yaml                                    done   updated   
Summary:                  0 Bytes                1 done 1 updated 
                          <0.00%                                  


FileExistsError: File './temp_nwb/200178/test1.nwb' already exists