# VERSION 1:

* What I would've done if I were doing the assignment
* Combination of __iterating__ through files to load them in, and then using __groupby__ to compute means
* End result: 15 lines of readable (to me!) code
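
As a rough illustration of the groupby step described above, here is a minimal sketch on a tiny made-up table. The column names mirror the ones used in this notebook, but the values are invented for the example and are not the real task data. It groups by two keys, averages `Stim.RT`, and uses `unstack()` to move `BlockType` into the columns.

```python
import pandas as pd

# Toy trial data (made-up values, NOT the real task data)
trials = pd.DataFrame({
    "StimType": ["Body", "Body", "Face", "Face", "Body", "Face"],
    "BlockType": ["0-Back", "2-Back", "0-Back", "2-Back", "0-Back", "2-Back"],
    "Stim.RT": [800.0, 1000.0, 700.0, 900.0, 900.0, 1100.0],
})

# Mean RT per (StimType, BlockType); unstack() turns BlockType into columns
rt_table = (
    trials.groupby(["StimType", "BlockType"])["Stim.RT"]
    .mean()
    .unstack()
)
print(rt_table)
```

The result has one row per `StimType` and one column per `BlockType`, which is the same shape as the RT and ACC tables printed by the cell below.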

In [36]:
import pandas as pd
from glob import glob
P_file_wildcard = '../data/WM*.txt'
participant_files = glob(P_file_wildcard)
dfs = []
for f in participant_files:
    tmp_df = pd.read_csv(f,sep='\t')
    tmp_df = tmp_df[tmp_df['Procedure[Block]'] == 'TrialsPROC']
    dfs.append(tmp_df)
allsubs_df = pd.concat(dfs,axis=0)
allsubs_df['pid'] = allsubs_df.HCPID.str.extract(r'(\d+)', expand=False) # note the issue with 856766 HCPID var
print('total trials per participant equals:',len(allsubs_df) / allsubs_df['pid'].nunique())
allsubs_df_corr = allsubs_df[allsubs_df['Stim.ACC']==1]
print('RT TABLE:\n',allsubs_df_corr.groupby(['StimType','BlockType'])['Stim.RT'].mean().unstack())
print('ACC TABLE:\n',allsubs_df.groupby(['StimType','BlockType'])['Stim.ACC'].mean().unstack())

total trials per participant equals: 160.0
RT TABLE:
 BlockType      0-Back       2-Back
StimType                          
Body       866.377540  1083.185613
Face       766.740053   993.550256
Place      786.328891  1019.625000
Tools      763.776128   975.646714
ACC TABLE:
 BlockType  0-Back  2-Back
StimType                 
Body       0.8860  0.7785
Face       0.9425  0.8805
Place      0.9380  0.8840
Tools      0.9090  0.8520
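
The `pid` column in the cell above comes from `str.extract`, which keeps the first run of digits found in each `HCPID` string. A minimal sketch of that behavior, using hypothetical ID strings (the real `HCPID` values in the data files may be formatted differently):

```python
import pandas as pd

# Hypothetical ID strings, invented for illustration only
ids = pd.Series(["100307_A", "211316", "id_414229"], name="HCPID")

# The capture group (\d+) grabs the first run of digits in each string;
# expand=False returns a Series rather than a one-column DataFrame
pid = ids.str.extract(r"(\d+)", expand=False)
print(pid.tolist())  # ['100307', '211316', '414229']
```

Rows whose string contains no digits would come back as `NaN`, so checking `pid.isna().sum()` is one way to spot malformed IDs.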


# VERSION 2:

* What I would've done if I were smarter
* Combination of __list comprehension__ to read through files, and then using __pivot tables__ to compute means
* End result: 10 (or fewer) lines of readable code
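
To show how the pivot-table route relates to the groupby route, here is a minimal sketch on made-up data (the values are invented, not taken from the task files). It passes `columns='BlockType'` to `pd.pivot_table`, which is one alternative to calling `.unstack()` afterwards, and compares the result with the groupby version.

```python
import pandas as pd

# Toy accuracy data (made-up values, NOT the real task data)
trials = pd.DataFrame({
    "StimType": ["Body", "Body", "Body", "Body", "Face", "Face", "Face", "Face"],
    "BlockType": ["0-Back", "0-Back", "2-Back", "2-Back"] * 2,
    "Stim.ACC": [1, 0, 1, 1, 1, 1, 0, 1],
})

# Route 1: groupby, mean, then unstack BlockType into columns
by_groupby = trials.groupby(["StimType", "BlockType"])["Stim.ACC"].mean().unstack()

# Route 2: pivot_table with BlockType given directly as the column key
by_pivot = pd.pivot_table(
    trials, index="StimType", columns="BlockType",
    values="Stim.ACC", aggfunc="mean",
)

print(by_pivot)
# Compare the two tables cell by cell
print((by_groupby.values == by_pivot.values).all())
```

Using `columns=` yields a flat column index, whereas `index=[...]` followed by `.unstack()` keeps the value name (`Stim.ACC`) as an extra header level, as in the output printed by the cell below.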

In [46]:
import pandas as pd
from glob import glob
P_full_path = '../data/WM*.txt'
allsubs_df = pd.concat([pd.read_csv(file, sep='\t') for file in glob(P_full_path)])
allsubs_df = allsubs_df[allsubs_df['Procedure[Block]'] == 'TrialsPROC']
allsubs_df['pid'] = allsubs_df.HCPID.str.extract(r'(\d+)', expand=False) # note the issue with 856766 HCPID var
print('total trials per participant equals:',len(allsubs_df) / allsubs_df['pid'].nunique())
print(pd.pivot_table(allsubs_df[allsubs_df['Stim.ACC']==1],index=['StimType','BlockType'], values = 'Stim.RT', aggfunc='mean').unstack())
print(pd.pivot_table(allsubs_df,index=['StimType','BlockType'], values = 'Stim.ACC', aggfunc='mean').unstack())

total trials per participant equals: 160.0
              Stim.RT             
BlockType      0-Back       2-Back
StimType                          
Body       866.377540  1083.185613
Face       766.740053   993.550256
Place      786.328891  1019.625000
Tools      763.776128   975.646714
          Stim.ACC        
BlockType   0-Back  2-Back
StimType                  
Body        0.8860  0.7785
Face        0.9425  0.8805
Place       0.9380  0.8840
Tools       0.9090  0.8520
