<a href="https://colab.research.google.com/github/ashimakeshava/NMA_marmots/blob/master/Behavioral_DF_Paula.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [7]:
#@title Data retrieval
import os, requests
import numpy as np
import pandas as pd
from pandas import DataFrame

fname = []
for j in range(3):
  fname.append('steinmetz_part%d.npz'%j)
url = ["https://osf.io/agvxh/download"]
url.append("https://osf.io/uv3mw/download")
url.append("https://osf.io/ehmw2/download")

for j in range(len(url)):
  if not os.path.isfile(fname[j]):
    try:
      r = requests.get(url[j])
    except requests.ConnectionError:
      print("!!! Failed to download data !!!")
    else:
      if r.status_code != requests.codes.ok:
        print("!!! Failed to download data !!!")
      else:
        with open(fname[j], "wb") as fid:
          fid.write(r.content)


In [3]:
#@title Data loading


alldat = np.array([])
for j in range(len(fname)):
  alldat = np.hstack((alldat, np.load('steinmetz_part%d.npz'%j, allow_pickle=True)['dat']))

# select just one of the recordings here. 11 is nice because it has some neurons in vis ctx. 
dat = alldat[11]
print(dat.keys())


dict_keys(['spks', 'wheel', 'pupil', 'lfp', 'response', 'response_time', 'bin_size', 'stim_onset', 'contrast_right', 'contrast_left', 'brain_area', 'brain_area_lfp', 'feedback_time', 'feedback_type', 'gocue', 'mouse_name', 'date_exp', 'trough_to_peak', 'waveform_w', 'waveform_u', 'active_trials', 'contrast_left_passive', 'contrast_right_passive', 'spks_passive', 'lfp_passive', 'pupil_passive', 'wheel_passive'])


alldat contains 39 sessions from 10 mice, data from Steinmetz et al, 2019. Time bins for all measurements are 10ms, starting 500ms before stimulus onset. The mouse had to determine which side has the highest contrast. For each dat = alldat[k], you have the following fields:


*   dat['mouse_name']: mouse name
*   dat['date_exp']: when a session was performed
*   dat['spks']: neurons by trials by time bins.
*   dat['brain_area']: brain area for each neuron recorded.
*   dat['contrast_right']: contrast level for the right stimulus, which is always contralateral to the recorded brain areas.
*   dat['contrast_left']: contrast level for left stimulus.
*   dat['gocue']: when the go cue sound was played.
*   dat['response_times']: when the response was registered, which has to be after the go cue. The mouse can turn the wheel before the go cue (and nearly always does!), but the stimulus on the screen won't move before the go cue.
*   dat['response']: which side the response was (-1, 0, 1). When the right-side stimulus had higher contrast, the correct choice was -1. 0 is a no go response.
*   dat['feedback_time']: when feedback was provided.
*   dat['feedback_type']: if the feedback was positive (+1, reward) or negative (-1, white noise burst).
*   dat['wheel']: exact position of the wheel that the mice uses to make a response, binned at 10ms.
*   dat['pupil']: pupil area (noisy, because pupil is very small) + pupil horizontal and vertical position.
*   dat['lfp']: recording of the local field potential in each brain area from this experiment, binned at 10ms.
*   dat['brain_area_lfp']: brain area names for the LFP channels.
*   dat['trough_to_peak']: measures the width of the action potential waveform for each neuron. Widths <=10 samples are "putative fast spiking neurons".
*   dat['waveform_w']: temporal components of spike waveforms. w@u reconstructs the time by channels action potential shape.
*   dat['waveform_u]: spatial components of spike waveforms.
*   dat['%X%_passive']: same as above for X = {spks, lfp, pupil, wheel, contrast_left, contrast_right} but for passive trials at the end of the recording when the mouse was no longer engaged and stopped making responses.

So the variables we need for the behavioral analysis are:
'mouse_name', 'date_exp', 'contrast_right', 'contrast_left', 'gocue', 'response_time','response', 'feedback_time', 'feedback_type'


gocue, response_time, feedback_time are all arrays of arrays (an array of 1 cell arrays) and refuse to be tranformed to a df straightforwardly, so we're going to convert them in this really stupid ghetto way, then add them back. Sorry for anyone looking at this.

In [8]:
behav_dat = pd.DataFrame()
cum_trial_num=0

for k in range(len(alldat)):
  temp=alldat[k]
  t1=temp['response_time']
  t2=temp['gocue']
  t3=temp['feedback_time']

  response_time, gocue, feedback_time=np.zeros(len(t1)),np.zeros(len(t2)),np.zeros(len(t3))
  trial=np.zeros(len(t1))
  cum_trial_num=cum_trial_num+len(t1)

  for i in range(0,len(t1)):
    response_time[i]=t1[i][0]
    gocue[i]=t1[i][0]
    feedback_time=t1[i][0]
    trial[i]=i
    i+=1

  #print(str(response_time.shape))

  your_keys=['mouse_name', 'date_exp', 'contrast_right', 'contrast_left',  'response', 'response_time', 'feedback_type']
  sess_behav = { your_key: temp[your_key] for your_key in your_keys }
  sess_behav["response_time"]= response_time 
  sess_behav["gocue"]=gocue
  sess_behav["feedback_time"]=feedback_time
  sess_behav["trial_num"]=trial

  #lengths = [len(v) for v in sess_behav.values()]
  #print(sess_behav.keys())
  sess_behav_long =pd.DataFrame.from_dict(sess_behav)
  #print(sess_behav_long.head())
  
  if k==0:
    behav_dat=sess_behav_long
  else:
    behav_dat=pd.concat([behav_dat, sess_behav_long], ignore_index=True)
  


behav_dat.tail()

Unnamed: 0,mouse_name,date_exp,contrast_right,contrast_left,response,response_time,feedback_type,gocue,feedback_time,trial_num
10045,Theiler,2017-10-11,0.25,1.0,0.0,2.297503,-1.0,2.297503,2.101029,338.0
10046,Theiler,2017-10-11,0.25,1.0,-1.0,1.158803,-1.0,1.158803,2.101029,339.0
10047,Theiler,2017-10-11,0.25,1.0,0.0,2.003709,-1.0,2.003709,2.101029,340.0
10048,Theiler,2017-10-11,0.25,1.0,0.0,2.076758,-1.0,2.076758,2.101029,341.0
10049,Theiler,2017-10-11,0.25,1.0,0.0,2.101029,-1.0,2.101029,2.101029,342.0


Awesome! Now we need to calculate the following variables:

*   Trial type (no-go, two image, one image, equal contrast)
  * no_go if contrast is 0 on both sides
  * one_image
  * two_image_unequal: two images presented and one higher than the other
  * two_image_equal: two images with equal contrast
*   Accuracy (was the response correct or not)
  * no_go trials are only correct if 0
  * two_image_equal trials are correct if -1 or 1
  * one_image and two_image_unequal trials are correct depending on the direction





In [9]:
##creating trial type variable
def conditions(s):
    if (s['contrast_right']==0 and s['contrast_left']==0):
        return "no_go"
    elif (s['contrast_right']==0 and s['contrast_left']>0):
        return "one_image"
    elif (s['contrast_left']==0 and s['contrast_right']>0):
        return "one_image"
    elif (s['contrast_left']!=0 and s['contrast_right']!=0):
        if (s['contrast_left']==s['contrast_right']):
          return "two_image_equal"
        else:
          return "two_image_unequal"
    else:
      return "???"

behav_dat['trial_type'] = behav_dat.apply(conditions, axis=1)

behav_dat.head()

Unnamed: 0,mouse_name,date_exp,contrast_right,contrast_left,response,response_time,feedback_type,gocue,feedback_time,trial_num,trial_type
0,Cori,2016-12-14,0.0,1.0,1.0,1.150204,1.0,1.150204,1.470381,0.0,one_image
1,Cori,2016-12-14,0.5,0.0,-1.0,1.399503,1.0,1.399503,1.470381,1.0,one_image
2,Cori,2016-12-14,0.5,1.0,1.0,0.949291,1.0,0.949291,1.470381,2.0,two_image_unequal
3,Cori,2016-12-14,0.0,0.0,0.0,2.266802,1.0,2.266802,1.470381,3.0,no_go
4,Cori,2016-12-14,1.0,0.5,1.0,0.816776,-1.0,0.816776,1.470381,4.0,two_image_unequal
