## Stimulus-response experiment analysis
The cells below load the raw data and run a preliminary analysis. Run each cell in sequence and follow the directions in each section.

In [1]:
# start up
import glob
import re
from os import listdir

import matplotlib
import numpy as np
import pandas as pd

import analyze_learning_dat as al  # local module that holds the analysis function

# show large DataFrames in full when inspecting them
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)

### Get all raw data
- Copy all of the raw data files into the *data* folder so they can be imported and their accuracies extracted.
- Use the following cell to collect all the file names in the variable *files*.
- These file names are then passed to the *analyze_learning_dat()* function to extract accuracies.

In [2]:
#grab files
files = glob.glob('./data/*.txt')
files

['./data/S-R_Learning_Task.2020-12-03-0852.data.9472c7e9-7615-4da5-b1e1-3507816f37fb.txt',
 './data/S-R_Learning_Task.2021-01-17-0253.data.b41c35e3-f6b8-405d-9f23-4586beb3beec.txt',
 './data/S-R_Learning_Task.2020-12-09-0745.data.1902ec48-1834-4634-af25-8d0743841c20.txt',
 './data/S-R_Learning_Task.2020-12-09-2235.data.dbf56a5a-1551-4b21-9d0e-81f658ae7326.txt',
 './data/S-R_Learning_Task.2021-01-21-2143.data.2677c116-6f38-48a8-a2b8-8a9da3899eec.txt',
 './data/S-R_Learning_Task.2020-12-06-0216.data.f0fd9254-0462-44c9-98c9-31c8b9bf10fb.txt',
 './data/S-R_Learning_Task.2021-01-10-1041.data.ca2793b8-ff8c-4a8e-ad90-e0c4d9e2233f.txt',
 './data/S-R_Learning_Task.2021-01-18-2006.data.69c08a5b-60b9-4e68-b6d3-8e6beb6c41ae.txt',
 './data/S-R_Learning_Task.2021-01-14-0040.data.55a88c1b-a561-4450-b657-a01d2042210f.txt',
 './data/S-R_Learning_Task.2021-01-27-0102.data.a8f08c33-a472-42b9-ae7c-4d8a09505f9b.txt',
 './data/S-R_Learning_Task.2020-12-03-1128.data.0f7bc6f7-2076-4170-9b42-fc165f3d45e7.txt',

### Analysis function
- The analysis function is in analyze_learning_dat.py. It is imported at the very beginning of this notebook as 'al'. See below for how it is used.
- You can pass a single file name (with its path) or a group of file names to analyze at once. The function creates a dictionary of all the subjects, keyed by the last six characters of each subject's unique identifier.

### Get the identifier keys for all processed files
- These identifier keys are just the last six characters of the unique subject identifiers (e.g. '6f37fb').
- Use the identifiers to index into the *allsubs* dictionary and make individual plots. 
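
To illustrate how a key relates to a file name, here is a minimal sketch that derives a six-character key from a file path. The helper `subject_key` is hypothetical; the real extraction happens inside *analyze_learning_dat()*, whose implementation is not shown here, so this is only an assumption that happens to agree with the keys printed below.

```python
from pathlib import Path


def subject_key(path, n_chars=6):
    """Return the last n_chars characters of the identifier in a data file name.

    Hypothetical helper: assumes file names end in '.<identifier>.txt'.
    """
    stem = Path(path).stem              # file name without the '.txt' extension
    identifier = stem.split('.')[-1]    # last dot-separated field is the identifier
    return identifier[-n_chars:]


example = './data/S-R_Learning_Task.2020-12-03-0852.data.9472c7e9-7615-4da5-b1e1-3507816f37fb.txt'
print(subject_key(example))
```

For the first file in the listing above this gives '6f37fb', which is also the first key in the dictionary.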

In [4]:
allsubs=al.analyze_learning_dat(files)
allsubs.keys()

dict_keys(['6f37fb', 'b3beec', '841c20', 'ae7326', '899eec', 'bf10fb', 'e2233f', '6c41ae', '42210f', '505f9b', '3d45e7', '5623af', '831922', '2b5be2', '782772', '44958d', 'b4494f', 'd4af05', '150278', '9dc05c', '0023c4', '4cb81b', '26668b', '536124', '27be3b', '5cd9b3', 'aa7ea9', '9a5023', 'f6ac91', '9a77f6', '6cd711', '567f54', '67c062', '6ed5d3', '77a5d0', '02920d', '7bf79c', '3ae3cc', '6720f9', 'aa8941', '2f7385', 'ae3463', '513534', '264198', '1e52ff', '7dd15b', '3e21d4', '1b19bb', '7a2728', '38c59c', '93ffec', '9acf95', '9ebcf7', '71a0e8', '1543fb', '223c73', '51cc49', '1f0024', 'a0a2dd', '57f3e2', '276e9b', '842417', 'cb9d13', '67f2cf', 'a621c2', '88e760', '292129', '598a27', '6e88c7', '6fe1cb', 'af7457', '53b63a', '24c706', '50819f', '4d9df2', 'b70f66', '8d3097', 'f6bfad', '8376d2', 'e7e832', '7c5838', '019aa2', '6de3f3', '6f0ecb', '827245', 'cff8cd', '1a551b', '3942a5', 'd11ef4', '1b4785', 'd588b5', '71772b', 'a5f11e', '78af73', '302d47', 'fe0f78', '20d7af', 'f99203', '67f835',

In [19]:
allsubs=al.analyze_learning_dat(files)
allsubs

{'6f37fb': {'n3': array([0.58333333, 1.        , 0.58333333, 0.75      , 1.        ,
         0.75      , 0.83333333, 0.83333333, 0.83333333, 0.91666667,
         1.        , 0.83333333, 1.        ]),
  'm3': array([0.25      , 0.58333333, 0.66666667, 0.66666667, 0.91666667,
         0.83333333, 0.91666667, 0.83333333, 0.91666667, 0.83333333,
         0.91666667, 1.        , 1.        ]),
  'n6': array([0.33333333, 0.44444444, 0.61111111, 0.61111111, 0.72222222,
         0.83333333, 0.72222222, 0.83333333, 0.77777778, 0.88888889,
         0.88888889, 0.88888889, 0.77777778]),
  'm6': array([0.27777778, 0.55555556, 0.66666667, 0.66666667, 0.5       ,
         0.44444444, 0.83333333, 0.83333333, 0.66666667, 0.66666667,
         0.72222222, 0.83333333, 0.88888889]),
  'test3': 0.8611111111111112,
  'test6': 0.6388888888888888},
 'b3beec': {'n3': array([0.5       , 0.75      , 0.83333333, 0.91666667, 1.        ,
         1.        , 1.        , 0.83333333, 0.83333333, 1.        ,
         

### Examine data
- You can examine the data here either by plotting or by looking at tables. Make sure you explore the pandas DataFrame methods such as *plot()* and *describe()*.
- Use the identifier keys to find individual subjects: **allsubs[*identifier key*]**. Converting a subject's entry to a pandas DataFrame allows quick and easy plotting and simple stats.
- Use an additional key to look at individual conditions: n3, n6, m3, m6. E.g.: **allsubs['6f37fb']['m6']**. Each subject also has *test3* and *test6* entries, which are single numbers.
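
As a sketch of the kind of inspection described above, the cell below rebuilds one subject's entry by hand (values copied from the *allsubs* output for subject '6f37fb') and summarizes the four learning curves. In the notebook you would use `allsubs['6f37fb']` directly. The names `subject`, `curves`, and `summary` are only illustrative.

```python
import numpy as np
import pandas as pd

# One subject's entry, copied from the allsubs output above (subject '6f37fb').
subject = {
    'n3': np.array([0.58333333, 1.0, 0.58333333, 0.75, 1.0, 0.75, 0.83333333,
                    0.83333333, 0.83333333, 0.91666667, 1.0, 0.83333333, 1.0]),
    'm3': np.array([0.25, 0.58333333, 0.66666667, 0.66666667, 0.91666667,
                    0.83333333, 0.91666667, 0.83333333, 0.91666667, 0.83333333,
                    0.91666667, 1.0, 1.0]),
    'n6': np.array([0.33333333, 0.44444444, 0.61111111, 0.61111111, 0.72222222,
                    0.83333333, 0.72222222, 0.83333333, 0.77777778, 0.88888889,
                    0.88888889, 0.88888889, 0.77777778]),
    'm6': np.array([0.27777778, 0.55555556, 0.66666667, 0.66666667, 0.5,
                    0.44444444, 0.83333333, 0.83333333, 0.66666667, 0.66666667,
                    0.72222222, 0.83333333, 0.88888889]),
    'test3': 0.8611111111111112,
    'test6': 0.6388888888888888,
}

# Keep only the learning curves (13 values each); the test scores are single numbers.
curve_names = ['n3', 'm3', 'n6', 'm6']
curves = pd.DataFrame({name: subject[name] for name in curve_names})

# describe() reports count, mean, std, min, quartiles, and max per condition.
summary = curves.describe()
print(summary.loc[['mean', 'min', 'max']])

# Uncomment to draw one line per condition (requires matplotlib):
# curves.plot(title='6f37fb', ylim=[0, 1.1])
```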



In [21]:
# Uncomment to plot every subject's curves:
# for k in allsubs.keys():
#     temp_sub = pd.DataFrame(allsubs[k])
#     temp_sub.plot(title=k, ylim=[0, 1.1])

# export all subjects (one row per subject) to JSON
sub_data = pd.DataFrame(allsubs)
sub_data.transpose().to_json('meta_learning_Collins_data_122020.JSON', orient='table')

## Analyses
Three-way ANOVA

The three factors are task phase (learning vs. testing), block size (3 vs. 6), and task condition (normal vs. meta):
- A = task phase
    - A1 = learning 
    - A2 = testing 
- B = block size 
    - B1 = size 3
    - B2 = size 6
- C = task condition
    - C1 = normal 
    - C2 = meta 



- Learning accuracy is the average over the last three trials.
- Testing accuracy is the average across all four iterations.
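
A minimal sketch of the learning-accuracy rule, assuming each array in *allsubs* is ordered in time so that its last three entries correspond to the last three trials (this ordering is an assumption, not something the output confirms). `learning_accuracy` is a hypothetical helper; the example values are the *n3* array of subject '6f37fb'.

```python
import numpy as np


def learning_accuracy(curve, n_last=3):
    """Mean of the last n_last entries of a learning curve (hypothetical helper)."""
    curve = np.asarray(curve, dtype=float)
    if curve.size < n_last:
        raise ValueError(f"need at least {n_last} values, got {curve.size}")
    return float(curve[-n_last:].mean())


# 'n3' curve for subject '6f37fb', copied from the allsubs output.
n3 = np.array([0.58333333, 1.0, 0.58333333, 0.75, 1.0, 0.75, 0.83333333,
               0.83333333, 0.83333333, 0.91666667, 1.0, 0.83333333, 1.0])

ln3 = learning_accuracy(n3)
print(round(ln3, 4))
```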

Purpose of the ANOVA
1. Examine whether there is a B × C interaction effect (block size × task condition).
2. Examine whether there is an interaction effect across all three factors.
3. If 1 holds, examine how the task condition (meta) affects both block sizes (3 and 6).
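
For reference, the questions above map onto terms of the standard full-factorial model. This is the textbook fixed-effects form, given as a sketch only; because every participant contributes to all cells, a repeated-measures analysis would add subject terms that are omitted here.

$$
Y_{ijkl} = \mu + \alpha_i + \beta_j + \gamma_k + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk} + (\alpha\beta\gamma)_{ijk} + \varepsilon_{ijkl}
$$

Here $\alpha_i$, $\beta_j$, and $\gamma_k$ are the main effects of factors A, B, and C, and $l$ indexes participants. Question 1 tests $H_0: (\beta\gamma)_{jk} = 0$ for all $j, k$; question 2 tests $H_0: (\alpha\beta\gamma)_{ijk} = 0$ for all $i, j, k$; question 3 corresponds to the simple effect of C at each level of B.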



I think we should have 8 groups (accuracies) for the ANOVA. They are:

- ln3 - learning normal 3 (only last 3 trials)
- ln6 - learning normal 6 (only last 3 trials)
- lm3 - learning meta 3 (only last 3 trials)
- lm6 - learning meta 6 (only last 3 trials)

? There are 3 blocks for each group, so we should get 9 attempts. Does accuracy mean the number of correct pushes out of 9?

? Does the stimulus matter? I mean, we do not need to know the accuracy for each stimulus, right?


- tn3 - testing normal 3 -> 3 (images) * 4 (attempts) * 3 (blocks) = 36 total
- tn6 - testing normal 6 -> 6 (images) * 4 (attempts) * 3 (blocks) = 72 total
- tm3 - testing meta 3 -> 36 total
- tm6 - testing meta 6 -> 72 total

- accuracy for 3 = # of correct pushes / 36
- accuracy for 6 = # of correct pushes / 72


Eventually each participant should have 8 numbers representing their performance, for the purpose of the ANOVA.
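
One possible layout for those 8 numbers is a long-format table with one row per participant and cell. The sketch below is an assumption about how we might organize it: `to_long` and the column names are hypothetical. The learning cells use the last three values of each curve for subject '6f37fb'. The per-condition test cells are left as NaN because the output shown above only contains *test3* and *test6*, not a normal/meta split.

```python
import numpy as np
import pandas as pd

PHASE = {'l': 'learning', 't': 'testing'}   # factor A
CONDITION = {'n': 'normal', 'm': 'meta'}    # factor C


def to_long(cells_by_subject):
    """Turn {subject: {'ln3': acc, ..., 'tm6': acc}} into one row per cell."""
    rows = []
    for subject, cells in cells_by_subject.items():
        for label, accuracy in cells.items():
            rows.append({
                'subject': subject,
                'phase': PHASE[label[0]],
                'block_size': int(label[2]),   # factor B
                'condition': CONDITION[label[1]],
                'accuracy': accuracy,
            })
    return pd.DataFrame(rows)


# Last three learning values for subject '6f37fb', copied from the allsubs output.
last3 = {
    'n3': [1.0, 0.83333333, 1.0],
    'm3': [0.91666667, 1.0, 1.0],
    'n6': [0.88888889, 0.88888889, 0.77777778],
    'm6': [0.72222222, 0.83333333, 0.88888889],
}
cells = {'l' + name: float(np.mean(vals)) for name, vals in last3.items()}

# Per-condition test accuracies are not in the output shown, so they stay NaN here.
cells.update({label: np.nan for label in ['tn3', 'tn6', 'tm3', 'tm6']})

long_table = to_long({'6f37fb': cells})
print(long_table)
```

A table in this shape is what repeated-measures routines usually expect. For example, `statsmodels.stats.anova.AnovaRM` takes a long-format DataFrame together with `depvar`, `subject`, and `within` arguments, though whether that is the right tool for our design is still to be decided.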



1. We need individual plots and bars for learning and testing (similar to yours)
2. ANOVA

I think we should prioritize 1 & 2, for the purpose of the thesis.

3. N-back
4. Self-report 
