# Data Wrangling

Explore the experimental data and convert it to an HDDM-ready format (CSV).

## Fine-tune output for individual .mat files

In [12]:
import scipy.io as scio

# Load MATLAB structs as Python objects (attribute access) instead of numpy record arrays to make life a little more bearable
data = scio.loadmat('../data/data_18333.mat', struct_as_record=False)

For now, let's focus on the following:  
- Reaction time (rt)  
- Response (0/1)  
- Stimulus (1-4 for now)  

This mirrors the layout of `examples/hddm_simple.csv`, which was used to play around with the HDDM library.
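Before converting anything, the three fields listed above can be sanity-checked. The following is a minimal sketch, not part of the original notebook: `check_trials` is a hypothetical helper, and the rule that reaction times must be positive is an assumption. The `{0, 1}` and `1-4` ranges come from the list above.

```python
def check_trials(rt, response, stim):
    """Return a list of problems found in one subject's trial data."""
    problems = []
    if not (len(rt) == len(response) == len(stim)):
        problems.append("fields have different lengths")
    if any(t <= 0 for t in rt):  # assumption: reaction times are positive
        problems.append("non-positive reaction time")
    if any(r not in (0, 1) for r in response):
        problems.append("response outside {0, 1}")
    if any(s not in (1, 2, 3, 4) for s in stim):
        problems.append("stimulus outside 1-4")
    return problems


# Made-up trials for illustration only, not taken from the experiment
print(check_trials([1.19, 1.26, 1.70], [1, 1, 0], [1, 3, 4]))
print(check_trials([1.19, -0.5], [1, 2], [1, 3]))
```

An empty list means nothing suspicious was found. Anything else is worth a look before the data goes to HDDM.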

In [13]:
dat_struct = data['data'][0, 0]  # Actual data structure; the [0, 0] is owing to MATLAB weirdness (structs load as 1x1 arrays)

Before outputting to CSV, the data for each subject will go in a Python dictionary of the form key --> list. The plan is then to build a list of dictionaries covering all patients, with each dictionary holding the data gathered for one individual.
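As a rough sketch of that plan, the list of dictionaries could look like the following. The helper `make_subject`, the `subj_idx` key, and all values are made up for illustration. In particular, `subj_idx` is an assumption about how subjects might be labelled later, not something taken from the data files.

```python
def make_subject(subj_id, rt, response, stim):
    """Bundle one subject's trial data into a dictionary of plain lists."""
    return {
        'subj_idx': subj_id,  # hypothetical identifier key
        'rt': list(rt),
        'response': list(response),
        'stim': list(stim),
    }


# Made-up values; the real ones would come from each subject's .mat file
subjects = [
    make_subject(1, [1.19, 1.26], [1, 1], [2, 4]),
    make_subject(2, [0.98, 1.43], [1, 0], [1, 3]),
]

for subj in subjects:
    print(subj['subj_idx'], len(subj['rt']), "trials")
```

Plain lists keep each dictionary easy to inspect and to iterate over when writing rows later.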

In [32]:
"""
Conversion from convoluted numpy array that scipy.io spits out to a more
pythonic data structure.
Leverage python instead of numpy for data manipulation, since the use of
numpy isn't really necessary for this data.
"""
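subject = dict.fromkeys(['rt', 'response', 'stim'])

# rt1 appears to be stored as a single row, so tolist() gives [[...]]; take the one inner list
subject['rt'] = dat_struct.rt1.tolist()[0]
# perf1 and conditions1 appear to be stored as columns, so each row is a one-element list
subject['response'] = [x[0] for x in dat_struct.perf1.tolist()]
subject['stim'] = [x[0] for x in dat_struct.conditions1.tolist()]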

subject

{'response': [1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  0,
  1,
  1,
  1,
  1,
  1,
  0,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  0,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  0,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  0,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1,
  1],
 'rt': [2.4512633636236387,
  1.18735081395198,
  1.2590923631187252,
  1.6991102252864039,
  1.0747144389224559,
  1.0452650363113207,
  1.1684425454263874,
  1.1043389724582084,
  1.1543610957191959,
  1.474297757495151,
  1.4988657792455342,
  0.9227490303092054,
  1.2929393460972278,
  1.6773387458683828,
  1.356698573619724,
  1.6131281121852226,
  1.6206319541875018,
  2.4685545

Now that the data is in a desirable format, we can dump it to a CSV file.
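One possible way to do the dump, sketched with the standard-library `csv` module. The helper name `write_subject_csv`, the column order, and the example values are assumptions for illustration, and an in-memory buffer stands in for a real file so the snippet runs anywhere.

```python
import csv
import io


def write_subject_csv(subject, out_file):
    """Write one subject's trials as CSV, one row per trial."""
    fields = ['rt', 'response', 'stim']
    writer = csv.writer(out_file)
    writer.writerow(fields)  # header row
    # zip pairs up the i-th entry of each list to form the i-th trial row
    writer.writerows(zip(*(subject[f] for f in fields)))


# Made-up trials for illustration only
example = {'rt': [1.19, 1.26], 'response': [1, 0], 'stim': [2, 4]}

buffer = io.StringIO()  # stand-in for open('some_file.csv', 'w', newline='')
write_subject_csv(example, buffer)
print(buffer.getvalue())
```

When writing to a real file, opening it with `newline=''` lets the `csv` module control line endings itself.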