# MAP ML/ANN Inputs via DataJoint

Adapted from [Data Output Format Document](https://docs.google.com/document/d/1oY7ul-c9NYSmGexPRN90ZQbbFhPR1-Jy0Nlt-daquVs/edit)

## Setup

In [1]:
import os
import numpy as np

In [2]:
import datajoint as dj

dj.config['database.host'] = 'mesoscale-activity.datajoint.io'
dj.config['lab.database'] = 'map_lab'
dj.config['experiment.database'] = 'daveliu_experiment'
dj.config['ephys.database'] = 'daveliu_ephys'
dj.config['ccf.database'] = 'map_ccf'

In [3]:
# assumes map-ephys is installed via pip ('pip install -e .' from a git checkout) or is on sys.path
from pipeline import lab
from pipeline import experiment
from pipeline import ephys
from pipeline import ccf

Please enter DataJoint username: chris-vathes
Please enter DataJoint password: ········
Connecting chris-vathes@mesoscale-activity.datajoint.io:3306


## Session Dictionary

Because the Session Dictionary described in the reference document is built from sub-components, we start with sessions, build up each component's results, and combine them at the end.

The order of sections below departs from the reference document where needed to follow the schema structure.

- Session Dictionary:
  - Trial Dictionary: see below
  - Animal behavior per trial dictionary: see below
  - Neuron Dictionary: see below
  - Perturbation Dictionary: see below
  - General meta data: to discuss
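
As a rough sketch of the "combine at the end" step, the components listed above could be nested into one plain Python dictionary. The function name `assemble_session_dict` and the top-level keys are hypothetical; the reference document, not this sketch, defines the final layout.

```python
def assemble_session_dict(session_key, trials, behavior, neurons,
                          perturbations=None, meta=None):
    """Nest separately built components under one session-level dictionary.

    All key names here are illustrative assumptions, not a settled format.
    """
    return {
        'session': dict(session_key),   # primary key of the session
        'trials': trials,               # Trial Dictionary
        'behavior': behavior,           # Animal behavior per trial
        'neurons': neurons,             # Neuron Dictionary
        'perturbations': perturbations if perturbations is not None else {},
        'meta': meta if meta is not None else {},
    }


# Toy usage with placeholder component contents
session_dict = assemble_session_dict(
    {'subject_id': 412330, 'session': 32},
    trials={0: {'trial_instruction': 'left'}},
    behavior={0: {'outcome': 'hit'}},
    neurons={},
)
```

Keeping each component as its own argument lets the sections below be built and checked independently before they are combined.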

### Basic Queries / Fetches example for Session

In [4]:
_ = experiment.Session.describe()

# 
-> lab.Subject
session              : smallint                     # session number
---
session_date         : date                         # 
-> lab.Person
-> lab.Rig



In [5]:
experiment.Session()

subject_id  institution 6 digit animal ID,session  session number,session_date,username,rig
123457,1,2017-10-17,daveliu,RRig
123457,2,2017-10-18,daveliu,RRig
123457,3,2017-10-19,daveliu,RRig
123457,4,2017-10-20,daveliu,RRig
123457,5,2017-10-21,daveliu,RRig
123457,6,2017-10-23,daveliu,RRig
123457,7,2017-10-24,daveliu,RRig


### Filtering Session via subject_id / session

Bold fields in the table output above indicate the primary key. The subject and session used below were chosen because we know that session has spike data.

In [6]:
my_session = experiment.Session & {'subject_id': 412330, 'session': 32}
my_session

subject_id  institution 6 digit animal ID,session  session number,session_date,username,rig
412330,32,2018-07-16,daveliu,RRig


### Queries vs Fetches of Session

In [7]:
type(my_session)  # datajoint classes represent 'entity sets' in base and query
my_session_retrieved = my_session.fetch()  # actual data is 'fetched' (by default in ndarray)
my_session_retrieved_as_dict = my_session.fetch(as_dict=True)  # also can be retrieved as python dictionary
my_session_retrieved_key = my_session.fetch('KEY')  # fields can be listed, 'KEY' is special primary key selector
print('result:', type(my_session), '\n',
      'numpy:', type(my_session_retrieved), my_session_retrieved.shape, my_session_retrieved.dtype, '\n',
      'dict:', type(my_session_retrieved_as_dict), type(my_session_retrieved_as_dict[0]), '\n',
      'key dict:', type(my_session_retrieved_key), my_session_retrieved_key[0])

result: <class 'pipeline.experiment.Session'> 
 numpy: <class 'numpy.ndarray'> (1,) [('subject_id', '<i8'), ('session', '<i8'), ('session_date', 'O'), ('username', 'O'), ('rig', 'O')] 
 dict: <class 'list'> <class 'collections.OrderedDict'> 
 key dict: <class 'list'> {'subject_id': 412330, 'session': 32}


### Programmatic Queries

The session chosen 'magically' above could instead have been found programmatically, by restricting sessions to those that have trial spikes:

In [8]:
experiment.Session & ephys.TrialSpikes

subject_id  institution 6 digit animal ID,session  session number,session_date,username,rig
412330,32,2018-07-16,daveliu,RRig


In [9]:
my_session_key = (experiment.Session() & ephys.TrialSpikes()).fetch('KEY')[0]
print(my_session_key)

{'subject_id': 412330, 'session': 32}


In [10]:
my_session = experiment.Session & my_session_key
my_session

subject_id  institution 6 digit animal ID,session  session number,session_date,username,rig
412330,32,2018-07-16,daveliu,RRig


## Trial Dictionary

- Trial Dictionary:
  - Trial condition: array with one entry per trial, currently holding trial type (e.g., lick-left) and trial event times (pole up)
  - General meta data: to discuss

## Behavior Dictionary

- Animal behavior per-trial dictionary:
  - Response per trial in a way that is simple to compare to trial condition (lick-left, lick-right)
  - Detailed responses: currently just vector of lick times
  - General meta data: to discuss

### Get Trials related to Session

In [11]:
_ = experiment.SessionTrial.describe()

# 
-> Session
trial                : smallint                     # trial number
---
trial_uid            : int                          # unique across sessions/animals
start_time           : decimal(8,4)                 # (s) relative to session beginning



In [12]:
experiment.SessionTrial.definition
my_trials = experiment.SessionTrial & my_session
my_trials

subject_id  institution 6 digit animal ID,session  session number,trial  trial number,trial_uid  unique across sessions/animals,start_time  (s) relative to session beginning
412330,32,0,0,0.5
412330,32,1,1,0.5
412330,32,2,2,0.5
412330,32,3,3,0.5
412330,32,4,4,0.5
412330,32,5,5,0.5
412330,32,6,6,0.5


In [13]:
_ = experiment.TrialEvent.describe()

# 
-> BehaviorTrial
-> TrialEventType
trial_event_time     : decimal(8,4)                 # (s) from trial start, not session start
---
duration             : decimal(8,4)                 # (s)



In [14]:
experiment.TrialEvent & my_session

subject_id  institution 6 digit animal ID,session  session number,trial  trial number,trial_event_type,"trial_event_time  (s) from trial start, not session start",duration  (s)
412330,32,0,delay,2.6033,1.2
412330,32,0,go,3.8033,1.5
412330,32,0,presample,0.5,0.9033
412330,32,0,sample,1.4033,1.2
412330,32,1,delay,2.5311,1.2
412330,32,1,go,3.7311,1.5
412330,32,1,presample,0.5,0.8311


In [15]:
_ = experiment.ActionEvent.describe()

# 
-> BehaviorTrial
-> ActionEventType
action_event_time    : decimal(8,4)                 # (s) from trial start



In [16]:
experiment.ActionEvent & my_session

subject_id  institution 6 digit animal ID,session  session number,trial  trial number,action_event_type,action_event_time  (s) from trial start
412330,32,0,left lick,4.3176
412330,32,0,right lick,4.4009
412330,32,20,left lick,4.1313
412330,32,20,left lick,4.3911
412330,32,20,left lick,4.5296
412330,32,20,left lick,4.6736
412330,32,20,left lick,4.8327


In [17]:
my_action_events = (experiment.ActionEvent & my_session).fetch()

In [18]:
my_action_events[my_action_events['trial']==20][0:10]

array([(412330, 32, 20, 'left lick', Decimal('4.1313')),
       (412330, 32, 20, 'left lick', Decimal('4.3911')),
       (412330, 32, 20, 'left lick', Decimal('4.5296')),
       (412330, 32, 20, 'left lick', Decimal('4.6736')),
       (412330, 32, 20, 'left lick', Decimal('4.8327')),
       (412330, 32, 20, 'left lick', Decimal('5.5566')),
       (412330, 32, 20, 'left lick', Decimal('5.7901')),
       (412330, 32, 20, 'left lick', Decimal('5.9404')),
       (412330, 32, 20, 'left lick', Decimal('6.0967')),
       (412330, 32, 20, 'left lick', Decimal('6.2696'))],
      dtype=[('subject_id', '<i8'), ('session', '<i8'), ('trial', '<i8'), ('action_event_type', 'O'), ('action_event_time', 'O')])
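
The "vector of lick times" from the Behavior Dictionary can be derived from these action events by grouping on trial and event type. The sketch below assumes rows fetched with `fetch(as_dict=True)`; the helper name `lick_times_by_trial` is hypothetical.

```python
from collections import defaultdict
from decimal import Decimal


def lick_times_by_trial(action_rows):
    """Map trial -> {action_event_type: [sorted lick times in s from trial start]}."""
    grouped = defaultdict(lambda: defaultdict(list))
    for row in action_rows:
        grouped[row['trial']][row['action_event_type']].append(
            float(row['action_event_time']))
    # Convert nested defaultdicts to plain dicts with sorted time vectors
    return {trial: {kind: sorted(times) for kind, times in kinds.items()}
            for trial, kinds in grouped.items()}


# Rows copied from the ActionEvent output above
rows = [
    {'trial': 0, 'action_event_type': 'left lick', 'action_event_time': Decimal('4.3176')},
    {'trial': 0, 'action_event_type': 'right lick', 'action_event_time': Decimal('4.4009')},
    {'trial': 20, 'action_event_type': 'left lick', 'action_event_time': Decimal('4.1313')},
    {'trial': 20, 'action_event_type': 'left lick', 'action_event_time': Decimal('4.3911')},
    {'trial': 20, 'action_event_type': 'left lick', 'action_event_time': Decimal('4.5296')},
]
licks = lick_times_by_trial(rows)
```

Trials without any action events simply do not appear in the result, so a lookup should use `licks.get(trial, {})`.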

In [19]:
_ = experiment.BehaviorTrial.describe()

# 
-> SessionTrial
---
-> TaskProtocol
-> TrialInstruction
-> EarlyLick
-> Outcome



In [20]:
experiment.BehaviorTrial & my_session

subject_id  institution 6 digit animal ID,session  session number,trial  trial number,task  task type,task_protocol  task protocol,trial_instruction,early_lick,outcome
412330,32,0,audio delay,1,left,no early,hit
412330,32,1,audio delay,1,right,no early,ignore
412330,32,2,audio delay,1,right,no early,ignore
412330,32,3,audio delay,1,left,no early,ignore
412330,32,4,audio delay,1,right,no early,ignore
412330,32,5,audio delay,1,right,no early,ignore
412330,32,6,audio delay,1,right,no early,ignore
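
To get a "response per trial" that is directly comparable to the trial condition, one option is to derive it from `trial_instruction` and `outcome`. This sketch rests on assumptions not confirmed by the output above: that `hit` means the animal licked the instructed side, that a `miss` outcome exists and means it licked the opposite side, and that anything else (such as `ignore`) means no response. The helper name `response_per_trial` is hypothetical.

```python
def response_per_trial(behavior_rows):
    """Map trial -> 'left', 'right', or None (no response).

    Assumed semantics: 'hit' = instructed side, 'miss' = opposite side,
    any other outcome (e.g. 'ignore') = no response.
    """
    opposite = {'left': 'right', 'right': 'left'}
    responses = {}
    for row in behavior_rows:
        instruction, outcome = row['trial_instruction'], row['outcome']
        if outcome == 'hit':
            responses[row['trial']] = instruction
        elif outcome == 'miss':  # assumed label, not shown in the output above
            responses[row['trial']] = opposite[instruction]
        else:
            responses[row['trial']] = None
    return responses


# Rows copied from the BehaviorTrial output above
rows = [
    {'trial': 0, 'trial_instruction': 'left', 'early_lick': 'no early', 'outcome': 'hit'},
    {'trial': 1, 'trial_instruction': 'right', 'early_lick': 'no early', 'outcome': 'ignore'},
]
responses = response_per_trial(rows)
```

If the outcome labels differ in the actual lookup table, only the two string comparisons need to change.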


## Neuron Dictionary

- Neuron Dictionary:
  - Spike time cell: all spike times of this cell
  - General meta data: to discuss

In [21]:
ephys.Unit() & my_session

subject_id  institution 6 digit animal ID,session  session number,electrode_group  Electrode_group is like the probe,unit,unit_uid  unique across sessions/animals,unit_quality,unit_channel  channel on the electrode for which the unit has the largest amplitude,spike_times  (s),waveform  average spike waveform
412330,32,1,0,0,all,,=BLOB=,=BLOB=
412330,32,1,1,1,all,,=BLOB=,=BLOB=
412330,32,1,2,2,all,,=BLOB=,=BLOB=
412330,32,1,3,3,all,,=BLOB=,=BLOB=
412330,32,1,4,4,all,,=BLOB=,=BLOB=
412330,32,1,5,5,all,,=BLOB=,=BLOB=
412330,32,1,6,6,good,,=BLOB=,=BLOB=


In [22]:
my_units = (ephys.Unit() & my_session).fetch(order_by='unit ASC')

In [23]:
len(my_units), my_units[1]['unit'], my_units[-1:]['unit']

(409, 1, array([408]))

In [24]:
ephys.Unit.UnitTrial() & my_session

subject_id  institution 6 digit animal ID,session  session number,electrode_group  Electrode_group is like the probe,unit,trial  trial number
412330,32,1,0,39
412330,32,1,0,40
412330,32,1,0,41
412330,32,1,0,42
412330,32,1,0,43
412330,32,1,0,44
412330,32,1,0,45
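
When assembling the Neuron Dictionary it may help to know which trials each unit was recorded in. A minimal sketch, assuming `Unit.UnitTrial` rows fetched with `fetch(as_dict=True)`; the helper name `trials_by_unit` is hypothetical.

```python
from collections import defaultdict


def trials_by_unit(unit_trial_rows):
    """Map (electrode_group, unit) -> sorted list of trial numbers."""
    out = defaultdict(list)
    for row in unit_trial_rows:
        out[(row['electrode_group'], row['unit'])].append(row['trial'])
    return {unit: sorted(trials) for unit, trials in out.items()}


# Rows copied from the UnitTrial output above
rows = [
    {'subject_id': 412330, 'session': 32, 'electrode_group': 1, 'unit': 0, 'trial': 39},
    {'subject_id': 412330, 'session': 32, 'electrode_group': 1, 'unit': 0, 'trial': 40},
    {'subject_id': 412330, 'session': 32, 'electrode_group': 1, 'unit': 0, 'trial': 41},
]
unit_trials = trials_by_unit(rows)
```

The key includes `electrode_group` because unit numbers are only unique within an electrode group in the table above.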


## Stimulation Dictionary

**TBD** - Not implemented in experiment yet

- Stimulation per-trial dictionary:
  - Stim condition (categorical: no-stim, bi-lateral). Requires discussion for MAP
  - Detailed stim parameters (time, power, etc.)

## Populate / Data Export Example

### Queries to merge / combine: TBD if needed

In [25]:
my_export_schema_name = 'chris-vathes_demo_export'  # change accordingly
schema = dj.schema(my_export_schema_name)

In [26]:
import os
import numpy as np
import datajoint as dj

@schema
class MlAnnExport(dj.Computed):
    definition = '''
    -> experiment.Session
    '''
    key_source = experiment.Session() & ephys.TrialSpikes()
    
    def make(self, key):
        self.export_session(key)
        self.insert1(key)
        
    def export_session(self, key):
        print('exporting', key)
        my_session = experiment.Session & key
        
        # os.devnull is a placeholder target; a real export would write to a file,
        # e.g. os.path.join(base_path, 'session{}.npy'.format(key['session']))
        with open(os.devnull, 'wb') as f:
            np.save(f, (experiment.Session & key).fetch())
            
        with open(os.devnull, 'wb') as f:  # session_trial.npy
            np.save(f, (experiment.SessionTrial & key).fetch())
            
        with open(os.devnull, 'wb') as f:  # session_trial_event.npy
            np.save(f, (experiment.TrialEvent & key).fetch())
            
        with open(os.devnull, 'wb') as f:  # session_action_event.npy
            np.save(f, (experiment.ActionEvent & key).fetch())
            
        with open(os.devnull, 'wb') as f:  # session_behavior_trial.npy
            np.save(f, (experiment.BehaviorTrial & key).fetch())
            
        with open(os.devnull, 'wb') as f:  # session_unit.npy
            np.save(f, (ephys.Unit() & key).fetch(order_by='unit ASC'))
            
        with open(os.devnull, 'wb') as f:  # session_unit_trial.npy
            np.save(f, (ephys.Unit.UnitTrial() & key).fetch())
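
The export method above writes to `os.devnull` as a placeholder. For a real export, each `open()` call would need a distinct file path per session and table. The helper below is one possible way to build those paths; the function name `export_path`, the file naming scheme, and the base directory are all hypothetical.

```python
import os


def export_path(base_path, key, table_name):
    """Build a per-session, per-table .npy path from a session primary key.

    The naming scheme is an illustrative assumption, not a project convention.
    """
    filename = 'subject{subject_id}_session{session}_{table}.npy'.format(
        table=table_name, **key)
    return os.path.join(base_path, filename)


# Example with the session key used in this notebook and a made-up directory
path = export_path('map_export', {'subject_id': 412330, 'session': 32},
                   'session_trial')
```

Including `subject_id` in the file name avoids collisions, since session numbers are only unique within a subject.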

In [31]:
MlAnnExport()  # exported from previous run

subject_id  institution 6 digit animal ID,session  session number
,


In [32]:
MlAnnExport.delete()  # force re-export

About to delete:
Nothing to delete


In [33]:
MlAnnExport.populate()

exporting {'subject_id': 412330, 'session': 32}


In [34]:
MlAnnExport()

subject_id  institution 6 digit animal ID,session  session number
412330,32
