**Python Tutorial: Level 6**
Source: https://kerblooee.github.io/pytutorial/
Created for the OVGU Cognitive Neuroscience Master's course H.3: Projektseminar Fall 2019
Updated August 2020
Taught by Reshanne Reeder

## PsychoPy - Response collection and saving data

We are already familiar with collecting keypresses using `event.waitKeys()`, but the `event` module can do a lot more than control the sequence of an experiment. We can also use it to store keypresses for response collection. To store a list of keys, simply assign the return value to a variable (here, `keys`):


In [1]:
from psychopy import event, visual, monitors, core

#mon = monitors.Monitor('myMonitor', width=35.56, distance=60)
#mon.setSizePix([1920, 1080])
win = visual.Window(size=(400,400), color=[-1,-1,-1])

my_text = visual.TextStim(win)

nTrials=10

for trial in range(nTrials):
    my_text.text = "Please make a keypress for trial " + str(trial)
    my_text.draw()
    win.flip()
    keys = event.waitKeys()
    print(keys)
    
#win.close()

['1']
['2']
['3']
['4']
['3']
['1']
['1']
['2']
['1']
['3']


Running this code, you will find that you are only able to make a single keypress for each trial. This is because we have made a keypress a necessary step to move on to the next trial (`waitKeys`). The experiment waits for any keypress, and as soon as the condition is met, it goes on to the next trial. You can see the name of the key when it is printed. Note that keypresses are always stored as strings.

If you want to make keypresses independent of the trial flow, use `event.getKeys`:


In [2]:
from psychopy import core, event

nTrials = 10
for trial in range(nTrials):
    core.wait(2)
    keys=event.getKeys()
    print(keys)

# win.close() # Assuming the window from above is still open, otherwise this will cause an error


[]
[]
['d', 'd', 'f', 'd', 'f', 'd', 'f']
['s', 'd', 'a', 's', 'd', 's', 'a']
['d', 'f', 'd', 's', 'f', 's', 'd', 'f']
['s', 'd', 's', 'a', 'd', 's', 'a', 'd']
['s', 'd', 'a', 'd', 's', 'a']
['s', 'a', 'd', 's', 'd', 's', 'a', 'd', 's', 'a', 'd', 's', 'a', 'f', 'e', 'a', 's', 'f', 'd', 's', 'x']
[]
[]


The difference between `event.waitKeys` and `event.getKeys` is that the former will stop the experiment until a key is pressed, whereas the latter is independent of trial flow. In the example above, you can make as many keypresses as you want within 2 seconds, and they will be recorded in a list. The list will be refreshed for each trial.

If you only want certain keypresses to count toward response collection, you can add a `keyList` as an argument in `event.getKeys` or `.waitKeys` (remember to code accepted keys as strings):


In [None]:
from psychopy import core, event

nTrials = 10
for trial in range(nTrials):
    core.wait(1)
    keys=event.getKeys(keyList=['1','2'])
    #draw some stuff
    #win flip
    #wait
    print(keys)

# win.close()

[]
[]
[]
['1', '2', '1', '2', '1']
['2', '1', '2', '1', '2', '1', '2', '1', '2']
['1', '2', '1', '2', '1', '2', '1', '2', '1', '2', '1']
['2', '1', '2', '1', '2', '1', '2', '1', '2']
['1', '2', '1', '2', '1', '2', '1']
[]
[]


Now only a 1 or 2 response will be recorded in keys. If you use `event.waitKeys`, only a 1 or 2 response will allow you to go on to a subsequent trial. This is good for controlling against accidental responses. This technique is also a good way for an experimenter to control the start of an experiment. For example:


In [None]:
from psychopy import core, event, visual, monitors

mon = monitors.Monitor('myMonitor', width=35.56, distance=60)
mon.setSizePix([1920, 1080])
win = visual.Window(monitor=mon, size=(400,400), color=[-1,-1,-1])

nTrials=10
my_text=visual.TextStim(win)

#To start the experiment, experimenter presses a "w" (arbitrary)
my_text.text = "Wait for experimenter to start"
my_text.draw()
win.flip()
event.waitKeys(keyList=['w'])

#win.close()


['w']


The experiment will only continue once you've pressed the correct button (and, if you uncomment `win.close()`, the window will only close at that point).

Although `getKeys` is more flexible than `waitKeys`, because `getKeys` records responses across the whole trial, it can be quite hard to control. For example, say you want to present an initial fixation, followed by a stimulus, and you only want to collect responses during stimulus presentation and not during the fixation. You can implement `getKeys` like this:


In [None]:
from psychopy import core, event, visual, monitors

mon = monitors.Monitor('myMonitor', width=35.56, distance=60)
mon.setSizePix([1920, 1080])
win = visual.Window(monitor=mon, size=(400,400), color=[-1,-1,-1])

nTrials=10
my_text=visual.TextStim(win)
fix=visual.TextStim(win, text='+')

for trial in range(nTrials):
    
    keys = event.getKeys(keyList=['1','2']) #put getkeys HERE??
    my_text.text = "trial %i" %trial #insert integer into the string with %i
    
    fix.draw()
    win.flip()
    core.wait(1) # Reduced from 2s for faster running in example
    
    my_text.draw()
    win.flip()
    core.wait(0.5) # Reduced from 1s for faster running in example
    
    print(keys) #which keys were pressed?
    
win.close()

[]
['1']
['2', '1']
[]
[]
[]
[]
['1']
[]
['2']


To give yourself a bit more control with this, you can add `event.clearEvents()` immediately preceding the point at which you want to start collecting responses. This function flushes any irrelevant keys that have been pressed and starts the key collection anew:


In [None]:
from psychopy import core, event, visual, monitors

mon = monitors.Monitor('myMonitor', width=35.56, distance=60)
mon.setSizePix([1920, 1080])
win = visual.Window(monitor=mon, size=(400,400), color=[-1,-1,-1])

nTrials=10
my_text=visual.TextStim(win)
fix=visual.TextStim(win, text='+')

for trial in range(nTrials):
    
    my_text.text = "trial %i" %trial #insert integer into the string with %i
    
    fix.draw()
    win.flip()
    core.wait(1)
    
    event.clearEvents() #clear events HERE, just before the stimulus appears
    
    my_text.draw()
    win.flip()
    core.wait(0.5)
    
    #call getKeys AFTER the stimulus so it returns keys pressed since the clear
    keys = event.getKeys(keyList=['1','2'])
    
    print(keys) #which keys were pressed?
    
win.close()


['2']
[]
[]
[]
['1', '2']
['2', '1']
['1', '2']
['1', '2', '1', '2']
['1']
['1']


You can see the empty lists where I tried to respond during fixation.
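To make the flushing logic concrete without opening a window, here is a toy model of a key buffer. `ToyKeyBuffer` and its methods are hypothetical stand-ins written for this illustration, not PsychoPy code; the real `event` module may differ in its details (for example, in what happens to keys that are not in `keyList`).

```python
class ToyKeyBuffer:
    """Hypothetical stand-in for a keyboard buffer (not PsychoPy's implementation)."""

    def __init__(self):
        self._buffer = []

    def press(self, key):
        # simulate the subject pressing a key at some point in time
        self._buffer.append(key)

    def clear_events(self):
        # throw away everything that has been pressed so far
        self._buffer = []

    def get_keys(self, key_list=None):
        # return the buffered keys (optionally filtered) and empty the buffer
        keys = [k for k in self._buffer if key_list is None or k in key_list]
        self._buffer = []
        return keys


buffer = ToyKeyBuffer()

# fixation period: the subject responds too early
buffer.press('1')
buffer.clear_events()  # flush just before the stimulus appears

# stimulus period: this is the response we actually want
buffer.press('2')
keys = buffer.get_keys(key_list=['1', '2'])
print(keys)  # ['2']
```

The early `'1'` never shows up in `keys` because it was flushed before the response window started.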

With `getKeys`, you are prone to recording multiple responses as seen above, because it will automatically collect as many responses as you make within the allotted time. So how do you only take the first response a subject makes during a trial (that is, only record the first response as the "true" response, and ignore subsequent responses)? You should add a separate response collector variable in this case:

```python
if keys: #if there are keypresses stored in keys
    sub_resp = keys[0] #only count the first keypress
```

Implemented in the full script, it looks like this:


In [None]:
from psychopy import core, event, visual, monitors

mon = monitors.Monitor('myMonitor', width=35.56, distance=60)
mon.setSizePix([1920, 1080])
win = visual.Window(monitor=mon, size=(400,400), color=[-1,-1,-1])

nTrials=10
my_text=visual.TextStim(win)
fix=visual.TextStim(win, text='+')

sub_resp = None # Initialize outside the loop for the first print

for trial in range(nTrials):
    
    keys = [] # Reset keys to empty list
    sub_resp = None # Reset the response so an old one never carries over
    my_text.text = "trial %i" %trial #insert integer into the string with %i
    
    fix.draw()
    win.flip()
    core.wait(1) # Reduced from 2s for example
    
    event.clearEvents() #clear events HERE, just before the stimulus appears
    
    my_text.draw()
    win.flip()
    core.wait(0.5) # Reduced from 1s for example
    
    #call getKeys AFTER the stimulus so it returns keys pressed since the clear
    keys = event.getKeys(keyList=['1','2'])
    
    print("keys that were pressed", keys) #which keys were pressed?
    
    if keys:
        sub_resp = keys[0] #only take first response
        
    print("response that was counted", sub_resp)    
    
win.close()


keys that were pressed ['1']
response that was counted 1
keys that were pressed ['1']
response that was counted 1
keys that were pressed ['1', '2', '1', '2', '1']
response that was counted 1
keys that were pressed ['2', '1', '2', '1', '2']
response that was counted 2
keys that were pressed ['2', '2']
response that was counted 2
keys that were pressed ['1', '2']
response that was counted 1
keys that were pressed ['2']
response that was counted 2
keys that were pressed ['1', '2', '1']
response that was counted 1
keys that were pressed ['1']
response that was counted 1
keys that were pressed ['1']
response that was counted 1
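The first-response rule can also be wrapped in a small helper so that a trial without a keypress yields `None`. The name `first_response` is made up for this sketch; it works on plain lists shaped like the ones printed above, so it can be tried without PsychoPy.

```python
def first_response(keys):
    """Return the first key in the list, or None if no key was pressed."""
    if keys:  # an empty list (or None) counts as "no response"
        return keys[0]
    return None


# lists shaped like the printed output above
print(first_response(['1', '2', '1', '2', '1']))  # 1
print(first_response(['2']))                      # 2
print(first_response([]))                         # None
```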


`getKeys` and `waitKeys` also have a `timeStamped` option, which in principle could be used to record response timing, but personally I have found that this is difficult to control. Instead, there are a couple of different ways of recording response times (the second way we will get to in the psychtoolbox portion of this tutorial). The first way, which I used before the days of PsychoPy3, combines clock timing (see level5) with the `maxWait` argument of `waitKeys`. `maxWait` sets the maximum amount of time (in seconds) that `waitKeys` will wait for a keypress; if no key is pressed in that time, it returns `None` and the script moves on. For example:

```python
#waits up to 2 seconds for a response, then continues if there is none
keys = event.waitKeys(maxWait=2, keyList=['1', '2'])
```

Then, you can record the exact time during a trial in which a participant made a response, using a clock, implemented like so:


In [None]:
from psychopy import core, event, visual, monitors

mon = monitors.Monitor('myMonitor', width=35.56, distance=60)
mon.setSizePix([1920, 1080])
win = visual.Window(monitor=mon, size=(400,400), color=[-1,-1,-1])

nTrials=10
my_text=visual.TextStim(win)

rt_clock = core.Clock()  # create a response time clock

for trial in range(nTrials):
    rt_clock.reset() #reset timing for every trial
    event.clearEvents(eventType='keyboard') #reset keys for every trial

    my_text.text = "trial %i" % trial
    my_text.draw()
    win.flip()

    keys = event.waitKeys(maxWait=2, keyList=['1', '2']) #waits for 2 seconds then continues
    if keys:
        print(rt_clock.getTime(), keys) #get time at which the subject made a keypress

win.close()


0.8851170539855957 ['1']
0.009994983673095703 ['2']
0.4510948657989502 ['2']
0.39842700958251953 ['1']
0.1160120964050293 ['2']
0.10191988945007324 ['1']
0.12199902534484863 ['2']
0.12213587760925293 ['1']
0.10178208351135254 ['2']
0.18268513679504395 ['1']


This code will present the stimulus for a maximum of 2 seconds, and print the response time for each trial. As it is coded now, the stimulus presentation time is pseudo-response-dependent: the response terminates a trial (if a response is made within 2 seconds), or the experiment continues after 2 seconds (if there is no keypress). This may look a little jarring or messy because stimulus presentation times are all different. So how do you make a stimulus appear for 2 seconds regardless of the response made, but still record response time accurately? One method is to add another (identical) stimulus for the remaining time following a response:

```python
    #waits for stimulus duration then continues
    keys = event.waitKeys(maxWait=2, keyList=['1', '2'])
    if keys:
        resp_time = rt_clock.getTime() #use getTime to determine the response time
        #stimulus duration minus however long it took the subject to respond
        remaining_time=2-resp_time
        my_text.draw()
        win.flip()
        core.wait(remaining_time)
```

Implemented in a functional snippet like this:


In [None]:
from psychopy import core, event, visual, monitors

mon = monitors.Monitor('myMonitor', width=35.56, distance=60)
mon.setSizePix([1920, 1080])
win = visual.Window(monitor=mon, size=(400,400), color=[-1,-1,-1])

nTrials=10
my_text=visual.TextStim(win)

rt_clock = core.Clock()  # create a response time clock

for trial in range(nTrials):
    rt_clock.reset() #reset timing for every trial
    event.clearEvents(eventType='keyboard') #reset keys for every trial

    my_text.text = "trial %i" % trial
    my_text.draw()
    win.flip()
    
    #waits for 2 seconds then continues
    keys = event.waitKeys(maxWait=2, keyList=['1', '2'])
    if keys:
        resp_time = rt_clock.getTime() #use getTime to determine the response time
        #stimulus duration minus however long it took the subject to respond
        remaining_time=2-resp_time
        my_text.draw()
        win.flip()
        core.wait(remaining_time)

win.close()




Now, regardless of how long it takes a subject to respond, the stimulus appears for the full 2 seconds, and response time is recorded at the moment the participant made a response.
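The arithmetic behind the remaining time is simple, but it can be worth guarding against a negative value (for instance, if the clock reads slightly more than 2 seconds by the time it is checked). The helper name `remaining_time` and the clamp to zero are additions for this sketch, not part of the script above.

```python
def remaining_time(stim_duration, resp_time):
    """Time (in seconds) the stimulus still has to stay on screen after a response."""
    # clamp at 0 so a late clock reading never produces a negative wait
    return max(0.0, stim_duration - resp_time)


print(remaining_time(2, 0.5))   # 1.5
print(remaining_time(2, 2.01))  # 0.0
```

In the script above, the result would be passed to `core.wait()` in place of `remaining_time=2-resp_time`.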

If you want to use a countdown timer or frame-based timing, recording responses is a little different. Because the presentation timing is dependent on a while loop (see level5), you will have to use `getKeys` instead of `waitKeys`, and instruct your experiment to "listen" for keypresses during the while loop. For example:


In [None]:
from psychopy import core, event, visual, monitors

mon = monitors.Monitor('myMonitor', width=35.56, distance=60)
mon.setSizePix([1920, 1080])
win = visual.Window(monitor=mon, size=(400,400), color=[-1,-1,-1])

nTrials=10
my_text=visual.TextStim(win)

rt_clock = core.Clock()  # create a response time clock
cd_timer = core.CountdownTimer() #add countdown timer

for trial in range(nTrials):
    rt_clock.reset()  # reset timing for every trial
    cd_timer.add(2) #add 2 seconds

    event.clearEvents(eventType='keyboard')  # reset keys for every trial
    while cd_timer.getTime() > 0: #for 2 seconds

        my_text.text = "trial %i" % trial
        my_text.draw()
        win.flip()

        keys = event.getKeys(keyList=['1', '2'])  #collect keypresses after first flip

        if keys:
            resp_time = rt_clock.getTime() #use getTime to determine the response time
            print(keys, resp_time) #print keys and response times

win.close()


['1'] 0.9140980243682861
['2'] 1.3682599067687988
['1'] 0.6841731071472168
['2'] 0.7842271327972412
['1'] 0.8676011562347412
['2'] 0.9844369888305664
['1'] 1.0844440460205078
['1'] 0.7508609294891357
['2'] 0.8343038558959961
['1'] 0.9511518478393555
['2'] 1.051271915435791
['1'] 0.4839608669281006
['2'] 0.5840470790863037
['1'] 0.684161901473999
['2'] 0.7842919826507568
['1'] 0.9010560512542725
['1'] 1.3515918254852295
['2'] 1.4516689777374268
['1'] 1.569648027420044
['2'] 1.651831865310669
['1'] 0.7012319564819336
['2'] 0.8177180290222168
['1'] 0.9182240962982178
['2'] 1.0678980350494385


This allows you to get rid of `core.wait()`, and also the repeated stimulus presentation for any "remaining time" following a response.

However, experiment timing with the `CountdownTimer` recruits that tricky `getKeys` function again for response collection, and if you test out your snippet (as shown above), you'll see you've got the problem of collecting multiple keys. You cannot simply add a `sub_resp` variable because the keys are collected independently for every frame now. So how do you only take the first response in this case? I use `count` to count up the number of responses made on a given trial:

```python
    count=-1 #reset the counter for every while loop
    while cd_timer.getTime() > 0: #for 2 seconds

        my_text.text = "trial %i" % trial
        my_text.draw()
        win.flip()

        keys = event.getKeys(keyList=['1', '2'])  #collect keypresses after first flip
        
        if keys:
            count=count+1 #count up the number of times a key is pressed
            
            if count == 0: #if this is the first time a key is pressed
                resp_time = rt_clock.getTime()
                sub_resp = keys      
```

Implemented in a functioning code snippet, it looks like this:


In [None]:
from psychopy import core, event, visual, monitors

mon = monitors.Monitor('myMonitor', width=35.56, distance=60)
mon.setSizePix([1920, 1080])
win = visual.Window(monitor=mon, size=(400,400), color=[-1,-1,-1])

nTrials=10
my_text=visual.TextStim(win)

rt_clock = core.Clock()  # create a response time clock
cd_timer = core.CountdownTimer() #add countdown timer

for trial in range(nTrials):
    rt_clock.reset()  # reset timing for every trial
    cd_timer.add(2) #add 2 seconds

    event.clearEvents(eventType='keyboard')  # reset keys for every trial
    
    count = -1 #start the counter for the while loop
    resp_time = None
    sub_resp = None
    
    while cd_timer.getTime() > 0: #for 2 seconds

        my_text.text = "trial %i" % trial
        my_text.draw()
        win.flip()

        keys = event.getKeys(keyList=['1', '2'])  #collect keypresses after first flip

        if keys:
            count=count+1 #count up the number of times a key is pressed
            
            if count == 0: #if this is the first time a key is pressed
                resp_time = rt_clock.getTime() #get RT for first response in that loop
                sub_resp = keys #get key for only the first response in that loop
                
    # Only print if a response was made
    if sub_resp:
        print(sub_resp, resp_time)

win.close()


['1'] 0.763812780380249
['1'] 0.8175280094146729
['2'] 1.0013179779052734
['2'] 1.3681600093841553
['2'] 1.0178210735321045
['2'] 1.0176191329956055
['2'] 0.7507009506225586
['2'] 0.6865460872650146
['2'] 1.2180027961730957
['2'] 0.90087890625


Now you only collect one response per trial.
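The same logic can be checked without a window by replaying per-frame key lists. In the sketch below, `frames` is invented data standing in for what `getKeys` might return on successive frames, together with the clock reading on that frame. Instead of a counter it tests `sub_resp is None` to mean "no response yet", which is an alternative to the `count` approach rather than the method used above.

```python
# (clock time, keys returned on that frame); invented values for illustration
frames = [
    (0.10, []),
    (0.45, ['2']),       # first response
    (0.80, ['1']),       # later responses should be ignored
    (1.20, ['2', '1']),
]

sub_resp = None
resp_time = None

for frame_time, keys in frames:
    # only record a response if none has been stored yet on this trial
    if keys and sub_resp is None:
        sub_resp = keys[0]
        resp_time = frame_time

print(sub_resp, resp_time)  # 2 0.45
```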

Finally, what if you need to exit in the middle of the experiment? It is good to have an escape route in case a subject needs to stop early, or there is any other unforeseen problem. In this case, you can tell your experiment to do different things depending on which button is pressed:

```python
        keys = event.getKeys(keyList=['1', '2', 'escape'])  #collect keypresses after first flip

        if keys:
            if 'escape' in keys: #if someone wants to escape the experiment
                win.close() #close the window
                core.quit() # Also typically used to stop PsychoPy
            else: #otherwise...
                resp_time = rt_clock.getTime()
                print(resp_time, keys)
```
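The decision itself (quit, record a response, or do nothing) can be separated from the PsychoPy calls, which makes it easy to test. The function `classify_keys` and its return labels are hypothetical names chosen for this sketch.

```python
def classify_keys(keys):
    """Decide what to do with the keys collected on one frame.

    Returns ('quit', None), ('response', key) or ('none', None).
    """
    if not keys:
        return ('none', None)
    if 'escape' in keys:  # escape wins even if other keys were pressed too
        return ('quit', None)
    return ('response', keys[0])


print(classify_keys(['1', 'escape']))  # ('quit', None)
print(classify_keys(['2']))            # ('response', '2')
print(classify_keys([]))               # ('none', None)
```

In the experiment loop, a `'quit'` result would be followed by `win.close()` and `core.quit()`, as in the snippet above.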

TO THE [KEYPRESS EXERCISES](Link_to_Keypress_Exercises)

BACK TO [TABLE OF CONTENTS](#Table-of-Contents)

---

## Recording data

It is important to adjust the output of your experiment so that it is as easy as possible for you to read, so you can catch errors and easily interpret stored results. This brings us to how to record data so that it can be saved easily. There are many ways to record data across an experiment, so it's up to you how you want to organize it all. I collect data in lists or dictionaries to save to different file formats. Let's start with lists. First, I pre-define lists of zeros that will be filled online:


In [None]:
nTrials=3
nBlocks=2

# Note: writing this as `[[0]*nTrials]*nBlocks` makes every block point to the
# same inner list, so filling one block would fill them all.
# A list comprehension creates an independent list for each block:
sub_resp = [[0 for _ in range(nTrials)] for _ in range(nBlocks)]

print(sub_resp)


[[0, 0, 0], [0, 0, 0]]


In the above example with `sub_resp`, I have created 2 lists of zeros (one for each block), each with a size of 3 (number of trials). These lists will then be filled during response collection:

```python
sub_resp = [[0 for _ in range(nTrials)] for _ in range(nBlocks)]

for block in range(nBlocks):
    #...
    for trial in range(nTrials):
        #...
        
        if keys:
            if 'escape' in keys:
                core.quit()
            else:
                sub_resp[block][trial] = keys[0] # Use keys[0] for the first key
```

Using indexing, you can fill the relevant keys that are pressed and the response time for every trial. If you exit the experiment early, any keypresses that have been made up until that point will be stored, but everything else will remain zeros.
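It is worth seeing why the pre-filled lists are best built with a list comprehension. The short sketch below compares the two ways of creating a blocks-by-trials list; the variable names `aliased` and `independent` are only for illustration.

```python
nTrials = 3
nBlocks = 2

# multiplication repeats a reference to the SAME inner list
aliased = [[0] * nTrials] * nBlocks
aliased[0][1] = '2'      # meant to fill block 0, trial 1 only
print(aliased)           # [[0, '2', 0], [0, '2', 0]]

# a list comprehension builds a new inner list for every block
independent = [[0 for _ in range(nTrials)] for _ in range(nBlocks)]
independent[0][1] = '2'
print(independent)       # [[0, '2', 0], [0, 0, 0]]
```

With the aliased version, a response stored in one block silently appears in every block, which would corrupt the saved data.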

As for recording accuracy, the correct response for a given trial will depend on your particular task. As an easy example, say you want your participants to do a simple math problem on each trial (sometimes used as an attention check for central visual focus):


In [None]:
math_problems = ['1+3=','1+1=','3-2=','4-1='] #write a list of simple arithmetic
solutions = [4,2,1,3] #write solutions
prob_sol = list(zip(math_problems,solutions))

print(prob_sol)


[('1+3=', 4), ('1+1=', 2), ('3-2=', 1), ('4-1=', 3)]


If you randomly select a problem from the list on a trial-by-trial basis, you can update `prob` (the problem that will be shown on that trial) and `corr_resp` (the correct response for that trial) in a trial-by-trial manner before presenting the stimuli:


In [None]:
import numpy as np

nBlocks=2
nTrials=4

math_problems = ['1+3=','1+1=','3-2=','4-1='] 
solutions = [4,2,1,3] 
prob_sol = list(zip(math_problems,solutions))

# Using list comprehension for safer list initialization
corr_resp = [[0 for _ in range(nTrials)] for _ in range(nBlocks)]
prob = [[0 for _ in range(nTrials)] for _ in range(nBlocks)]

for block in range(nBlocks):
    
    for trial in range(nTrials):
        #choose a random problem from the list
        selected_prob = prob_sol[np.random.choice(4)]
        prob[block][trial] = selected_prob
        #the solution is at index 1 in the zipped list
        corr_resp[block][trial] = selected_prob[1]
        
        print(prob[block][trial], corr_resp[block][trial])
        
        #draw stimulus here


('1+1=', 2) 2
('4-1=', 3) 3
('3-2=', 1) 1
('1+3=', 4) 4
('4-1=', 3) 3
('3-2=', 1) 1
('1+1=', 2) 2
('4-1=', 3) 3


Then, after the stimulus is presented (the math problem), to record subject accuracy, you compare the subject's response to the correct response:

```python
        #record subject accuracy
        #correct- remember keys are saved as strings
        if sub_resp[block][trial] == str(corr_resp[block][trial]):
            sub_acc[block][trial] = 1 #arbitrary number for accurate response
        #incorrect- remember keys are saved as strings              
        elif sub_resp[block][trial] != str(corr_resp[block][trial]):
            sub_acc[block][trial] = 2 #arbitrary number for inaccurate response 
                                    #(should be something other than 0 to distinguish 
                                    #from non-responses)
```

All together, in a functional snippet, this looks like:


In [None]:
from psychopy import core, event, visual, monitors
import numpy as np # need for np.random.choice

#monitor specs
mon = monitors.Monitor('myMonitor', width=35.56, distance=60)
mon.setSizePix([1920, 1080])
win = visual.Window(monitor=mon, size=(400,400), color=[-1,-1,-1])

#blocks, trials, stims, and clocks
nBlocks=2
nTrials=4
my_text=visual.TextStim(win)
rt_clock = core.Clock()  # create a response time clock
cd_timer = core.CountdownTimer() #add countdown timer

#prefill lists for responses (using list comprehension for safety)
sub_resp = [[0 for _ in range(nTrials)] for _ in range(nBlocks)]
sub_acc = [[0 for _ in range(nTrials)] for _ in range(nBlocks)]
prob = [[0 for _ in range(nTrials)] for _ in range(nBlocks)]
corr_resp = [[0 for _ in range(nTrials)] for _ in range(nBlocks)]
resp_time = [[0 for _ in range(nTrials)] for _ in range(nBlocks)] # Added for completion

#create problems and solutions to show
math_problems = ['1+3=','1+1=','3-2=','4-1='] #write a list of simple arithmetic
solutions = [4,2,1,3] #write solutions
prob_sol = list(zip(math_problems,solutions))

for block in range(nBlocks):
    for trial in range(nTrials):
        #what problem will be shown and what is the correct response?
        selected_prob = prob_sol[np.random.choice(4)]
        prob[block][trial] = selected_prob
        corr_resp[block][trial] = selected_prob[1]
        
        rt_clock.reset()  # reset timing for every trial
        cd_timer.add(1.5) # Reduced from 3s for example speed

        event.clearEvents(eventType='keyboard')  # reset keys for every trial
        
        count=-1 #for counting keys
        sub_resp_val = None # Temporary variable for response inside the loop
        
        while cd_timer.getTime() > 0: #for 1.5 seconds

            my_text.text = prob[block][trial][0] #present the problem for that trial
            my_text.draw()
            win.flip()

            #collect keypresses after first flip
            keys = event.getKeys(keyList=['1','2','3','4','escape'])

            if keys:
                if 'escape' in keys:
                    win.close()
                    core.quit()
                
                count=count+1 #count up the number of times a key is pressed

                if count == 0: #if this is the first time a key is pressed
                    #get RT for first response in that loop
                    resp_time[block][trial] = rt_clock.getTime()
                    #get key for only the first response in that loop
                    sub_resp_val = keys[0] # first key only (a string, not a list)
        
        # Save the collected response after the while loop finishes
        if sub_resp_val is not None:
            sub_resp[block][trial] = sub_resp_val
            
            #record subject accuracy
            #correct- remember keys are saved as strings
            if sub_resp[block][trial] == str(corr_resp[block][trial]):
                sub_acc[block][trial] = 1 #arbitrary number for accurate response
            #incorrect- remember keys are saved as strings              
            elif sub_resp[block][trial] != str(corr_resp[block][trial]):
                sub_acc[block][trial] = 2 #arbitrary number for inaccurate response 
                                        #(should be something other than 0 to distinguish 
                                        #from non-responses)
        else:
            # No response was made, accuracy remains 0 (or a specific non-response code)
            sub_resp[block][trial] = 'None'
            sub_acc[block][trial] = 0
                    
        #print results
        print('problem=', prob[block][trial], 'correct response=', 
              corr_resp[block][trial], 'subject response=',sub_resp[block][trial], 
              'subject accuracy=',sub_acc[block][trial])

win.close()


problem= ('1+1=', 2) correct response= 2 subject response= 2 subject accuracy= 1
problem= ('4-1=', 3) correct response= 3 subject response= 3 subject accuracy= 1
problem= ('3-2=', 1) correct response= 1 subject response= 2 subject accuracy= 2
problem= ('1+3=', 4) correct response= 4 subject response= 4 subject accuracy= 1
problem= ('4-1=', 3) correct response= 3 subject response= 3 subject accuracy= 1
problem= ('3-2=', 1) correct response= 1 subject response= 1 subject accuracy= 1
problem= ('1+1=', 2) correct response= 2 subject response= 2 subject accuracy= 1
problem= ('4-1=', 3) correct response= 3 subject response= 3 subject accuracy= 1


At the end of the experiment, you should have complete lists of everything you would like to save in a file:


In [None]:
# The results here will be based on the last run of the previous cell
print(prob)
print(corr_resp)
print(sub_resp)
print(sub_acc)


[[('1+1=', 2), ('4-1=', 3), ('3-2=', 1), ('1+3=', 4)], [('4-1=', 3), ('3-2=', 1), ('1+1=', 2), ('4-1=', 3)]]
[[2, 3, 1, 4], [3, 1, 2, 3]]
[['2', '3', '2', '4'], ['3', '1', '2', '3']]
[[1, 1, 2, 1], [1, 1, 1, 1]]
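Once the lists are complete, the accuracy codes can be summarized directly. As a minimal sketch, the following computes the proportion of correct trials per block from the `sub_acc` values printed above; the name `prop_correct` and the choice to count unanswered trials as not correct are assumptions made for this example.

```python
# accuracy codes as printed above: 1 = correct, 2 = incorrect, 0 = no response
sub_acc = [[1, 1, 2, 1], [1, 1, 1, 1]]

prop_correct = []
for block_acc in sub_acc:
    n_correct = block_acc.count(1)  # trials coded as correct
    # unanswered trials (0) stay in the denominator, so they count as not correct
    prop_correct.append(n_correct / len(block_acc))

print(prop_correct)  # [0.75, 1.0]
```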


TO THE [RECORDING DATA EXERCISES](Link_to_Recording_Data_Exercises)

BACK TO [TABLE OF CONTENTS](#Table-of-Contents)

---

## Saving data: csv

The filename for your data should be the filename you have defined at the beginning of your script (which we went over many moons ago in level4). The classic (but least flexible) way of saving data is using a comma-separated value (csv) file. To save these lists in a csv, you need to 1.) import the `csv` module, 2.) provide the filename and directory, 3.) collect the data to be saved, and 4.) open the save location and write the data:


In [None]:
import csv #1
import os

# 2 create filename and directory (using placeholder path for demonstration)
filename = 'savecsv_example.csv'
print(filename)

main_dir = os.getcwd() #define the main directory where experiment info is stored
# Simulating the creation of the target directory structure
data_path = os.path.join(main_dir,'exp','data')
os.makedirs(data_path, exist_ok=True)
#point to a data directory to save the output
data_dir = os.path.join(data_path, filename)
print(data_dir.replace(main_dir, "/path/to/your/current/directory"))

#3: a list of lists of all the data you want to save
data_as_list = [prob, corr_resp, sub_resp, sub_acc]

#4: mode='w' means 'write mode'. "sub_data" is arbitrary, but stay consistent
with open(data_dir, mode='w', newline='') as sub_data:
    #delimiter=',' for lists of values separated by commas
    data_writer = csv.writer(sub_data, delimiter=',')
    data_writer.writerow(data_as_list) #write

# Read back the saved file for demonstration (this will show the ugly output)
with open(data_dir, mode='r') as f:
    content = f.read()
    print("\nContent of the generated CSV file:\n")
    print(content)


savecsv_example.csv
/path/to/your/current/directory/exp/data/savecsv_example.csv


This will give you a file that looks like this (with one row):

```csv
"[[('1+1=', 2), ('3-2=', 1), ('1+3=', 4), ('4-1=', 3)], [('1+1=', 2), ('3-2=', 1), ('1+3=', 4), ('4-1=', 3)]]","[[2, 1, 4, 3], [2, 1, 4, 3]]","[['2', '1', '3', '3'], ['2', '1', '3', '3']]","[[1, 1, 2, 1], [1, 1, 2, 1]]"
```

This looks pretty ugly, and csv is pretty bare-bones in terms of aesthetics. You don't need to save all your experiment data in one file, of course. If you want, you can separate files by block or by type (probs, corr_resp, etc.). You can specify this when you are creating your csv files. For example:


In [None]:
import csv
import os

# Ensure the variables from the previous section are available (prob, corr_resp, etc.)

filename = 'savecsv_example2' #leave off the extension for now, since saving multiple files
main_dir = os.getcwd()
data_path = os.path.join(main_dir,'exp','data')
os.makedirs(data_path, exist_ok=True)
data_dir = os.path.join(data_path, filename)

# to save each data type individually with one block per row
data_as_list = [prob, corr_resp, sub_resp, sub_acc]

#add a types list for labelling
types = ['problem','correct_answer','subject_response','subject_accuracy']

#pair each data type with its label to cycle through filenames
for data, data_type in zip(data_as_list, types): #open each data type to save individually
    
    #add type to filename
    full_filename = data_dir + '_' + data_type + '.csv'
    with open(full_filename, mode='w', newline='') as sub_data:
        data_writer = csv.writer(sub_data, delimiter=',')
        for block in data: #loop over each block now
            # Note: For 'problem', the elements are tuples (e.g., ('1+1=', 2)), 
            # which CSV will handle by enclosing in quotes and separating by comma.
            data_writer.writerow(block) #write a new row for that block
            
print("4 files created in the data directory: ")
for t in types:
    print(f"  - {filename}_{t}.csv")


If done correctly, this should create 4 files (one for each data type), with 2 rows in each file (one for each block), with 4 data points per row (one for each trial).
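Another layout you might consider is a single "long" file with a header and one row per trial, which many analysis tools read directly. This is a sketch of an alternative, not the method used above: the column names are chosen for illustration, the values are copied from the lists printed in the recording section, and `io.StringIO` stands in for a real file so the example runs without touching the disk.

```python
import csv
import io

# values copied from the printed lists in the "Recording data" section
corr_resp = [[2, 3, 1, 4], [3, 1, 2, 3]]
sub_resp = [['2', '3', '2', '4'], ['3', '1', '2', '3']]
sub_acc = [[1, 1, 2, 1], [1, 1, 1, 1]]

out = io.StringIO()  # in-memory stand-in for open(filename, mode='w', newline='')
writer = csv.writer(out, delimiter=',')
writer.writerow(['block', 'trial', 'corr_resp', 'sub_resp', 'sub_acc'])  # header

for block in range(len(sub_resp)):
    for trial in range(len(sub_resp[block])):
        writer.writerow([block, trial, corr_resp[block][trial],
                         sub_resp[block][trial], sub_acc[block][trial]])

rows = out.getvalue().splitlines()
print(rows[0])  # block,trial,corr_resp,sub_resp,sub_acc
print(rows[3])  # 0,2,1,2,2
```

To write to disk instead, replace the `io.StringIO()` line with a `with open(...)` block like the ones above.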

TO THE [SAVE CSV EXERCISES](Link_to_Save_CSV_Exercises)

BACK TO [TABLE OF CONTENTS](#Table-of-Contents)

---

## Saving data: JSON

Another method of saving your data is the JSON file. JSON stores data as key-value pairs, so the most natural way to save your data is as a dictionary (or, as here, a list of dictionaries, one per trial). You can append data to dictionaries using an online method like I did with the lists, or you could transform your lists into dictionaries if you would rather stick with lists for online data collection. The latter is shown here:


In [None]:
nBlocks = 2

for block in range(nBlocks):
    #run experiment
    #...
    #save data
    data_as_dict = []
    for a,b,c,d in zip(prob[block], corr_resp[block], sub_resp[block], sub_acc[block]):
        #the names listed here do not need to be the same as the variable names
        data_as_dict.append({'problem':a,'corr_resp':b,'sub_resp':c,'sub_acc':d})
        
    # Only print for the first block as an example
    if block == 0:
        print(data_as_dict)        


[{'problem': ('1+1=', 2), 'corr_resp': 2, 'sub_resp': '2', 'sub_acc': 1}, {'problem': ('4-1=', 3), 'corr_resp': 3, 'sub_resp': '3', 'sub_acc': 1}, {'problem': ('3-2=', 1), 'corr_resp': 1, 'sub_resp': '2', 'sub_acc': 2}, {'problem': ('1+3=', 4), 'corr_resp': 4, 'sub_resp': '4', 'sub_acc': 1}]
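
The online alternative mentioned above, where a dictionary is appended as soon as each trial finishes, could look roughly like this. It is only a sketch: `problems` and `fake_keys` are hypothetical stand-ins for your trial list and for the keys returned by `event.waitKeys()`, and I am assuming the accuracy coding suggested by the printed output (1 = correct, 2 = incorrect).

```python
# Hypothetical trial list: (problem text, correct answer)
problems = [('1+1=', 2), ('4-1=', 3), ('3-2=', 1), ('1+3=', 4)]
# Hypothetical keypresses standing in for event.waitKeys() results
fake_keys = ['2', '3', '2', '4']

block_data = []  # one dictionary per trial, appended as the block runs

for problem, key in zip(problems, fake_keys):
    correct_answer = problem[1]
    # keypresses are strings, so compare against the answer as a string
    # assumed coding: 1 = correct, 2 = incorrect
    accuracy = 1 if key == str(correct_answer) else 2
    block_data.append({'problem': problem,
                       'corr_resp': correct_answer,
                       'sub_resp': key,
                       'sub_acc': accuracy})

print(block_data[2])
```

With this approach there is nothing to convert at the end of the block: `block_data` is already in the shape that gets saved.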


Then, you would use the following syntax to save the data to JSON:


In [None]:
import json
import os

nBlocks = 2
main_dir = os.getcwd()
data_path = os.path.join(main_dir,'exp','data')

for block in range(nBlocks):
    #run experiment
    #...

    #JSON files can be saved with a .txt or .json extension, I like to use .txt
    filename = 'savejson_example'
    data_dir = os.path.join(data_path, filename)

    data_as_dict = []
    for a,b,c,d in zip(prob[block], corr_resp[block], sub_resp[block], sub_acc[block]):
        #the names listed here do not need to be the same as the variable names
        data_as_dict.append({'problem':a,'corr_resp':b,'sub_resp':c,'sub_acc':d})
        
    output_filepath = data_dir + '_block%i.txt'%block
    with open(output_filepath, 'w') as outfile:
        json.dump(data_as_dict, outfile, indent=4)
        
    if block == 0:
        print(f"File '{os.path.basename(output_filepath)}' created.")

print("2 JSON files (for block 0 and 1) created in the data directory.")


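And you get output that looks something like this for each block (shown here on one line; with `indent=4`, the saved file spreads the same content over several indented lines):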

```json
[{"problem": ["1+1=", 2], "corr_resp": 2, "sub_resp": "2", "sub_acc": 1}, {"problem": ["3-2=", 1], "corr_resp": 1, "sub_resp": "1", "sub_acc": 1}, {"problem": ["1+3=", 4], "corr_resp": 4, "sub_resp": "3", "sub_acc": 2}, {"problem": ["4-1=", 3], "corr_resp": 3, "sub_resp": "3", "sub_acc": 1}]
```
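
You may have noticed that `problem` was a tuple in Python but appears in square brackets in the saved file. JSON has no tuple type, so tuples are written as arrays and come back as lists. A small round trip using `json.dumps` and `json.loads` (the string-based siblings of `json.dump` and `json.load`) shows this; the single trial here is just an illustrative example.

```python
import json

# One illustrative trial, with the problem stored as a tuple
trial = {'problem': ('1+1=', 2), 'corr_resp': 2, 'sub_resp': '2', 'sub_acc': 1}

json_text = json.dumps([trial])   # serialize to a JSON string
loaded = json.loads(json_text)    # parse it back into Python objects

print(json_text)
print(type(trial['problem']).__name__, '->', type(loaded[0]['problem']).__name__)
```

Everything else survives the trip unchanged: the keypress is still the string `'2'` and the numbers are still integers.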

That was a lot less scripting to get where you wanted, wasn't it? But just like learning how to calculate ANOVAs by hand in year 1 stats, learning how to save CSVs is a lesson in appreciating all the hard work that went into your ancestors' python.

TO THE [SAVE JSON EXERCISES](Link_to_Save_JSON_Exercises)

BACK TO [TABLE OF CONTENTS](#Table-of-Contents)

---

## Reading data: JSON

So why save your data in a dictionary in the first place? Saving your data as lists makes it easier to copy/paste saved data into Excel, for example. But my opinion is that loading data in Excel is rather messy and limited -- you are prone to lose bits of data here and there just by human error, and you can't do a lot of analyses in Excel, either. Saving your data as a dictionary makes it much nicer to read and analyze in Python. So for the final part of the tutorial, I will demonstrate how to load the saved dictionaries back into Python, using the JSON example.

Importing a dictionary back into Python in a readable, tabular format is as easy as using the "pandas" module (the Python data analysis library). First, import pandas:


In [None]:
import pandas as pd #shorten name for ease of reference
import os

main_dir = os.getcwd()
data_path = os.path.join(main_dir,'exp','data')
filename = 'savejson_example'
data_dir = os.path.join(data_path, filename)


When you then type "pd.", you can press tab and scroll through all the various options pandas contains. One of these is `read_json`:


In [None]:
#load the imported data as a variable (df)
df = pd.read_json(data_dir+'_block1.txt')
print(df)


     problem  corr_resp  sub_resp  sub_acc
0  [4-1=, 3]          3         3        1
1  [3-2=, 1]          1         1        1
2  [1+1=, 2]          2         2        1
3  [4-1=, 3]          3         3        1


Then, your data are loaded as a "DataFrame" object (`df`), which you can view in tabular format. Doesn't that look nice??
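
If you saved one file per block, as in the JSON example, you will probably want all blocks in one table at some point. Here is a minimal sketch of one way to do that with `pd.concat`. The trial values, the temporary folder, and the extra `block` column are my own additions for illustration, not part of the saved files above.

```python
import json
import os
import tempfile

import pandas as pd

# Hypothetical two-block data set, written to a temporary folder
blocks = [
    [{'problem': ['1+1=', 2], 'corr_resp': 2, 'sub_resp': '2', 'sub_acc': 1},
     {'problem': ['3-2=', 1], 'corr_resp': 1, 'sub_resp': '2', 'sub_acc': 2}],
    [{'problem': ['4-1=', 3], 'corr_resp': 3, 'sub_resp': '3', 'sub_acc': 1},
     {'problem': ['1+3=', 4], 'corr_resp': 4, 'sub_resp': '4', 'sub_acc': 1}],
]
tmp_dir = tempfile.mkdtemp()

frames = []
for block, trials in enumerate(blocks):
    path = os.path.join(tmp_dir, 'savejson_example_block%i.txt' % block)
    with open(path, 'w') as outfile:
        json.dump(trials, outfile)
    block_df = pd.read_json(path)   # load one block
    block_df['block'] = block       # remember which block each trial came from
    frames.append(block_df)

# stack the blocks on top of each other and renumber the rows
all_df = pd.concat(frames, ignore_index=True)
print(all_df)
```

Adding the `block` column before concatenating means you can still separate the blocks later, for example to compare accuracy across them.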

Because your data load as an object, each column is available as an attribute, so you can print a single column with `df.X`, where X is the column name (this is equivalent to `df['X']`):


In [None]:
print(df.problem)


0    [4-1=, 3]
1    [3-2=, 1]
2    [1+1=, 2]
3    [4-1=, 3]
Name: problem, dtype: object
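
For comparison, here is a small sketch of attribute access next to bracket access, using a hypothetical two-trial DataFrame. Bracket access is the more general form: it also works for column names that contain spaces or that clash with existing DataFrame attributes, and it accepts a list of names to select several columns at once.

```python
import pandas as pd

# Hypothetical two-trial data set
df = pd.DataFrame({
    'problem': [['4-1=', 3], ['3-2=', 1]],
    'corr_resp': [3, 1],
    'sub_resp': [3, 2],
    'sub_acc': [1, 2],
})

# attribute access and bracket access return the same column
same = df.sub_acc.equals(df['sub_acc'])
print(same)

# a list of names selects several columns and returns a DataFrame
print(df[['corr_resp', 'sub_resp']])
```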


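You can also display your data as a formatted table. In a notebook, a DataFrame left as the last expression of a cell is rendered as a table; here it is wrapped in `pd.DataFrame`, although `df` on its own works too (the aesthetics of results vary depending on the environment you are in):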


In [None]:
pd.DataFrame(df)


|   | problem | corr_resp | sub_resp | sub_acc |
|---|---|---|---|---|
| 0 | [4-1=, 3] | 3 | 3 | 1 |
| 1 | [3-2=, 1] | 1 | 1 | 1 |
| 2 | [1+1=, 2] | 2 | 2 | 1 |
| 3 | [4-1=, 3] | 3 | 3 | 1 |


You can also filter your data in different ways:


In [None]:
acc_trials = df.loc[df['sub_acc'] == 1] #show only trials on which subject was correct
print(acc_trials)


     problem  corr_resp  sub_resp  sub_acc
0  [4-1=, 3]          3         3        1
1  [3-2=, 1]          1         1        1
2  [1+1=, 2]          2         2        1
3  [4-1=, 3]          3         3        1
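
Since every trial in this block was correct, the filter above returns the whole table. With a hypothetical data set that contains an error, the effect is easier to see. The sketch below also combines two conditions with `&`; note that each condition needs its own parentheses.

```python
import pandas as pd

# Hypothetical block with one incorrect trial (sub_acc == 2)
df = pd.DataFrame({
    'corr_resp': [2, 3, 1, 4],
    'sub_resp': [2, 3, 2, 4],
    'sub_acc': [1, 1, 2, 1],
})

# only the trials on which the subject was incorrect
errors = df.loc[df['sub_acc'] == 2]
print(errors)

# correct trials whose correct answer was larger than 2
big_correct = df.loc[(df['sub_acc'] == 1) & (df['corr_resp'] > 2)]
print(big_correct)
```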


You can also calculate mean accuracy this way:


In [None]:
len(acc_trials)/len(df['sub_resp']) #proportion correct: trials with sub_acc == 1 divided by all trials


1.0
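
The same proportion can be computed without building a filtered table first: comparing a column to 1 gives a column of True/False values, and the mean of those is the proportion of True. A sketch with hypothetical data, including a `block` column that I have added so accuracy can also be computed per block:

```python
import pandas as pd

# Hypothetical data: 2 blocks x 4 trials, accuracy coded 1 = correct, 2 = incorrect
df = pd.DataFrame({
    'block': [0, 0, 0, 0, 1, 1, 1, 1],
    'sub_acc': [1, 1, 2, 1, 1, 2, 2, 1],
})

correct = df['sub_acc'] == 1                     # True where the subject was correct
overall = correct.mean()                         # proportion correct over all trials
per_block = correct.groupby(df['block']).mean()  # proportion correct in each block

print(overall)
print(per_block)
```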

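You can also compute correlations between columns of data. Note that a column that never varies (here `sub_acc`, because every response in this block was correct) has no defined correlation, so it shows up as NaN: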


In [None]:
cols = df[['corr_resp', 'sub_resp', 'sub_acc']] #numeric columns only
print("Pearson r:")
print(cols.corr(method='pearson'))
print("Spearman rho:")
print(cols.corr(method='spearman'))


Pearson r:
           corr_resp  sub_resp  sub_acc
corr_resp        1.0       1.0      NaN
sub_resp         1.0       1.0      NaN
sub_acc          NaN       NaN      NaN
Spearman rho:
           corr_resp  sub_resp  sub_acc
corr_resp        1.0       1.0      NaN
sub_resp         1.0       1.0      NaN
sub_acc          NaN       NaN      NaN
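
The NaN entries in this output are not an error in the data. They appear because `sub_acc` is the same on every trial, so its standard deviation is zero and the correlation is undefined. The sketch below reproduces this with hypothetical values that mirror the block above.

```python
import pandas as pd

# Hypothetical block in which every trial was correct
df = pd.DataFrame({
    'corr_resp': [3, 1, 2, 3],
    'sub_resp': [3, 1, 2, 3],
    'sub_acc': [1, 1, 1, 1],
})

spread = df['sub_acc'].std()     # zero spread: the column never changes
r = df.corr(method='pearson')

print(spread)
print(r)
```

Once a block contains at least one error, `sub_acc` varies and its correlations with the other columns become ordinary numbers.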


TO THE [READ DATA EXERCISES](Link_to_Read_Data_Exercises)

BACK TO [TABLE OF CONTENTS](#Table-of-Contents)

---

ONWARD TO: [Advanced: Misc tricks for improving your code](Link_to_Advanced_Level)