# Attention summary

In this notebook, we process the Datavyu attention coding of one subject into a summary table of attention durations per condition and attention code.

In [2]:
from datetime import datetime, timedelta
import pandas as pd
import numpy as np
import os


First we read in the data.

In [112]:
datadir = "/Users/jokedurnez/Documents/projects/projectsOngoing/accounts/Data/CAFE/Physio/"
subject = 'WI_AMP_005'

subcsv = os.path.join(datadir,
                      'Preliminary Physio Wristband Data for Mollie',
                      'Datavyu_Attention_csv',
                      '%s_Attention_SY.csv'%subject)
outdir = os.path.join(datadir,'preprocessed',subject)
data = pd.read_csv(subcsv)

datacode = data[['TV.ordinal','TV.onset',
                 'TV.offset','TV.code01']].dropna()
data = data[['Attention.ordinal','Attention.onset',
             'Attention.offset','Attention.code01']]

In [97]:
data.head()

Unnamed: 0,Attention.ordinal,Attention.onset,Attention.offset,Attention.code01
0,0,25262,28866,TO
1,1,28867,31144,O-Experimenter
2,2,31145,34000,TO
3,3,34001,36244,P
4,4,36245,38386,O-Shoes


In [98]:
datacode

Unnamed: 0,TV.ordinal,TV.onset,TV.offset,TV.code01
0,0.0,208896.0,929389.0,Baseline + NoTV
1,1.0,929390.0,1397535.0,ChildTV
2,2.0,1397536.0,1846642.0,AdultTV


The next step is to split the rows of the attention data that straddle a condition change. We first define the cuts (each condition onset plus the final offset), then for each cut find the row that contains it, and split that row in two at the cut.

In [113]:
# split cells where condition changes
# cut points: every condition onset plus the offset of the last condition
cuts = list(datacode['TV.onset']) + [datacode['TV.offset'].iloc[-1]]
for cut in cuts:
    # positional index of the attention row that strictly contains the cut
    chcell = np.where((cut > data['Attention.onset']) & (cut < data['Attention.offset']))[0]
    if len(chcell) == 0:
        # the cut falls on a row boundary or outside the data: nothing to split
        continue
    splitrow = data.iloc[chcell[0]]
    # first half: original onset up to the cut
    cellA = {
        'Attention.ordinal': splitrow['Attention.ordinal'],
        'Attention.onset': splitrow['Attention.onset'],
        'Attention.offset': int(cut),
        'Attention.code01': splitrow['Attention.code01']
    }
    # second half: cut up to the original offset; ordinal + 0.5 keeps the sort order
    cellB = {
        'Attention.ordinal': splitrow['Attention.ordinal'] + 0.5,
        'Attention.onset': int(cut),
        'Attention.offset': splitrow['Attention.offset'],
        'Attention.code01': splitrow['Attention.code01']
    }
    data = data.drop(data.index[chcell])
    # DataFrame.append was removed in pandas 2.0, so use pd.concat instead
    data = pd.concat([data, pd.DataFrame([cellA, cellB])], ignore_index=True)

data = data.sort_values(by='Attention.ordinal').reset_index(drop=True)

See, for example, how the row containing the first cut (at 208896) was split into rows 18 and 19:

In [100]:
data.iloc[17:22]

Unnamed: 0,Attention.ordinal,Attention.onset,Attention.offset,Attention.code01
17,17.0,169083,175202,O-Shoes
18,18.0,175203,208896,TO
19,18.5,208896,261732,TO
20,19.0,261733,264486,TO
21,20.0,264487,267274,O-Table
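
The splitting step can also be sketched on a toy table, which makes it easier to check by hand. This is a minimal illustration, not the notebook's actual cell: the helper `split_at_cut`, the shortened column names, and the toy values are all hypothetical.

```python
import pandas as pd

def split_at_cut(df, cut):
    """Split the row whose (onset, offset) strictly contains `cut` into two rows.

    Hypothetical helper for illustration; column names are shortened
    versions of the Attention.* columns used in the notebook.
    """
    inside = (df['onset'] < cut) & (cut < df['offset'])
    if not inside.any():
        # cut falls on a row boundary or outside the data: nothing to split
        return df
    row = df[inside].iloc[0]
    first = {'ordinal': row['ordinal'], 'onset': row['onset'],
             'offset': cut, 'code': row['code']}
    # ordinal + 0.5 places the second half right after the first when sorting
    second = {'ordinal': row['ordinal'] + 0.5, 'onset': cut,
              'offset': row['offset'], 'code': row['code']}
    out = pd.concat([df[~inside], pd.DataFrame([first, second])],
                    ignore_index=True)
    return out.sort_values('ordinal').reset_index(drop=True)

# toy data (made up): three consecutive attention intervals
toy = pd.DataFrame({
    'ordinal': [0, 1, 2],
    'onset':   [0, 100, 200],
    'offset':  [99, 199, 299],
    'code':    ['TO', 'P', 'TV'],
})
result = split_at_cut(toy, 150)
print(result)
```

As in the notebook output above, both halves share the cut time: the first half ends at the cut and the second half starts at it.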


Next we annotate all cells with the correct conditions...

In [114]:
for idx,row in datacode.iterrows():
    # annotate cells
    condtimes = (data['Attention.onset'] >= row['TV.onset']) & \
        (data['Attention.offset'] <= row['TV.offset'])
    data.loc[condtimes,'condition'] = row['TV.code01']
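
To make the effect of this annotation concrete, here is a small sketch on made-up data, using the same comparison logic as the cell above. The tables `attention` and `conditions` and their values are hypothetical. It shows that rows lying outside every condition window are left as `NaN`.

```python
import pandas as pd

# toy attention intervals (made up); the first two lie before any condition
attention = pd.DataFrame({
    'onset':  [0, 50, 100, 150],
    'offset': [49, 100, 150, 199],
})
# toy condition windows (made up)
conditions = pd.DataFrame({
    'onset':  [100, 150],
    'offset': [150, 199],
    'code':   ['NoTV', 'ChildTV'],
})

for _, cond in conditions.iterrows():
    # a row belongs to a condition only if it lies entirely inside its window
    inside = (attention['onset'] >= cond['onset']) & \
        (attention['offset'] <= cond['offset'])
    attention.loc[inside, 'condition'] = cond['code']

attention['duration'] = attention['offset'] - attention['onset']
print(attention)
```

Rows without a condition are excluded from any later `groupby` on `condition`, since pandas drops `NaN` group keys by default.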

In [115]:
# add durations
data['duration'] = data['Attention.offset'] - data['Attention.onset']

See below how the row containing the condition change was split, and how each row now carries its condition and duration:

In [116]:
data.iloc[17:22]

Unnamed: 0,Attention.ordinal,Attention.onset,Attention.offset,Attention.code01,condition,duration
17,17.0,169083,175202,O-Shoes,,6119
18,18.0,175203,208896,TO,,33693
19,18.5,208896,261732,TO,Baseline + NoTV,52836
20,19.0,261733,264486,TO,Baseline + NoTV,2753
21,20.0,264487,267274,O-Table,Baseline + NoTV,2787


Finally, we summarize the durations for each combination of condition and attention code.

In [120]:
grouped = data[['duration','condition','Attention.code01']] \
    .groupby(['condition','Attention.code01']) \
    .aggregate(['mean','count','median','sum'])
grouped.columns = ['mean','count','median','sum']
grouped

condition,Attention.code01,mean,count,median,sum
AdultTV,O-Hand,2532.5,2,2532.5,5065
AdultTV,O-Table,611.0,1,611.0,611
AdultTV,P,1129.5,4,900.0,4518
AdultTV,TO,8879.375,16,4912.0,142070
AdultTV,TV,19787.0,15,7139.0,296805
Baseline + NoTV,I,2005.0,4,2073.0,8020
Baseline + NoTV,M,985.0,1,985.0,985
Baseline + NoTV,M-Camera,7445.0,1,7445.0,7445
Baseline + NoTV,O-Book,2617.0,1,2617.0,2617
Baseline + NoTV,O-Floor,1381.666667,3,1461.0,4145
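
As a possible alternative to renaming the columns afterwards, selecting the `duration` column before aggregating yields flat column names directly. The sketch below uses made-up toy data with shortened, hypothetical column names.

```python
import pandas as pd

# toy data (made up): durations per condition and attention code
toy = pd.DataFrame({
    'condition': ['A', 'A', 'A', 'B'],
    'code':      ['TO', 'TO', 'TV', 'TO'],
    'duration':  [10, 30, 5, 7],
})

# selecting a single column first avoids the two-level column header
summary = (toy.groupby(['condition', 'code'])['duration']
              .agg(['mean', 'count', 'median', 'sum']))
print(summary)
```

The result is indexed by `(condition, code)`, so a single cell can be looked up with, for example, `summary.loc[('A', 'TO'), 'sum']`.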


In [122]:
os.makedirs(outdir, exist_ok=True)  # make sure the output directory exists
grouped.to_csv(os.path.join(outdir,"ATTENTION_%s_summary.csv"%(subject)))