# Assignment 4: List Generation for Experiments
## Computational Methods in Psychology (and Neuroscience)
### Psychology 4500/7559 --- Fall 2020

# Objectives

Upon completion of this assignment, the student will have:

1. Read in a stimulus pool from a file.

2. Randomly generated lists to use in an experiment.

3. Written the lists out to files for future use.


# Assignment

* Write your code in a Jupyter notebook. First make a copy of this notebook and rename it so that your userid is in the title (e.g., A04_ListGen_mst3k).

## Design

Your assignment is to write a script that reads in a pool of stimuli
and creates lists of dictionaries that you will later present to
participants as part of an experiment.  

The script should be configurable such that you can specify different
numbers of lists and trials, along with other details specific to the
experiment you decide to do.

Each dictionary represents a trial and should contain all the
information necessary to identify the stimulus to be presented,
details about that stimulus, and the condition in which to present it.
This information will be experiment-specific, as outlined below.
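
As a rough illustration, a single trial for a word-based experiment might look like the dictionary below. The keys and values are a subset of those that appear in the Option 1 example output later in this notebook; your own experiment may need different or additional fields.

```python
# One trial, represented as a dictionary (only a subset of possible fields)
trial = {
    'description': 'wicked',   # the stimulus to present
    'valence': 'neg',          # a detail about that stimulus
    'novelty': 'target',       # studied item ('target') or new item ('lure')
    'cond': 'neg',             # condition of the list the trial belongs to
}

# A study or test list is then simply a list of such dictionaries
study_list = [trial]
print(study_list[0]['description'], study_list[0]['cond'])
```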

You have three options for your experiment.  Please select **one** of
the following experiments, keeping in mind that your next assignment
will be to code the experiment presentation and response collection
for the lists you generate from this assignment.

  
* ***When you are done, save this notebook as HTML (`File -> Download as -> HTML`) and upload it to the matching assignment on UVACollab.***  

## Generic Study/Test Block Function

In [2]:
import random
from csv import DictReader
import copy

# function to make a study/test block from the pools passed in
def gen_block(pools, cond, num_items):
    # fill the study list
    study_list = []
    
    # loop over pools
    for pool in pools:
        # loop over items to add from that pool
        # this will be num_items/num_types for mixed lists
        for i in range(num_items):
            study_item = pool.pop()
            study_item.update({'novelty': 'target', 
                               'cond': cond})
            study_list.append(study_item)

    # shuffle the study_list
    random.shuffle(study_list)
    
    # copy the study list to be the start of the test list
    test_list = copy.deepcopy(study_list)
    
    # loop over pools
    for pool in pools:
        # loop over items to add from that pool
        # this will be num_items/num_types for mixed lists
        for i in range(num_items):
            test_item = pool.pop()
            test_item.update({'novelty': 'lure', 
                              'cond': cond})
            test_list.append(test_item)
    
    # shuffle the test list
    random.shuffle(test_list)
    
    return {'study': study_list, 'test': test_list}

## Verification function

In [3]:
import numpy as np

# verification function
def verify_blocks(blocks, study_key='study', test_key='test', 
                  cond_key='cond', mixed_cond='mixed',
                  novelty_key='novelty', type_key='in_out'):
    # pull out the unique conditions from the first item in each study list
    conds = np.array([b[study_key][0][cond_key] for b in blocks])
    uconds = np.unique(conds)
    num_conds = len(uconds)
    print('Conds:', conds)
        
    # verify number of blocks is multiple of num_conds
    assert len(blocks) % num_conds == 0
    
    # verify each cond occurs the same number of times
    cond_counts = np.array([(conds==cond).sum() for cond in uconds])
    print('Cond Counts:', cond_counts)
    assert np.all((cond_counts - cond_counts[0])==0)

    # verify number of study items is always the same
    num_study = np.array([len(b[study_key]) for b in blocks])
    print('Num Study:', num_study)
    assert np.all((num_study - num_study[0])==0)

    # verify number of test items is always the same
    num_test = np.array([len(b[test_key]) for b in blocks])
    print('Num Test:', num_test)
    assert np.all((num_test - num_test[0])==0)
    
    # verify study block is half length of test block
    assert np.all((num_study*2 - num_test)==0)
    
    # do some checks on each block
    for b in blocks:
        if b[study_key][0][cond_key] == mixed_cond:
            # verify mixed lists
            # must have same number of each type
            types = np.array([item[type_key] for item in b[study_key]])
            utypes = np.unique(types)
            type_counts = np.array([(types == ut).sum() 
                                    for ut in utypes])
            assert np.all((type_counts - type_counts[0]) == 0)
            
    print('It passed all the tests!!!')
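
Note that `verify_blocks` accepts a `novelty_key` argument but does not use it in any check. The following is a minimal sketch of one possible additional check, not part of the original function: it confirms that the targets in each test list are exactly the studied items and that the lures are new and equal in number. The helper name `check_novelty`, the `id_key` parameter, and the toy block are all hypothetical; in practice `id_key` would be whichever field identifies a stimulus (for example `'description'` or `'filename'`).

```python
def check_novelty(blocks, id_key, study_key='study', test_key='test',
                  novelty_key='novelty'):
    """Check the targets and lures of each test list against its study list."""
    for b in blocks:
        studied = {item[id_key] for item in b[study_key]}
        targets = {item[id_key] for item in b[test_key]
                   if item[novelty_key] == 'target'}
        lures = {item[id_key] for item in b[test_key]
                 if item[novelty_key] == 'lure'}
        # the targets must be exactly the studied items
        assert targets == studied
        # lures must be new items, with one lure per target
        assert not (lures & studied)
        assert len(lures) == len(targets)
    return True

# toy block with hypothetical items, just to exercise the check
toy_blocks = [{
    'study': [{'word': 'a', 'novelty': 'target'},
              {'word': 'b', 'novelty': 'target'}],
    'test': [{'word': 'b', 'novelty': 'target'},
             {'word': 'c', 'novelty': 'lure'},
             {'word': 'a', 'novelty': 'target'},
             {'word': 'd', 'novelty': 'lure'}],
}]
print(check_novelty(toy_blocks, id_key='word'))
```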


## Option 1: Valence Study

The main question of this study is whether recognition memory for
words depends on the emotional or affective valence of those words.
Participants will study lists of positive (+), negative (-), and
neutral (~) words and then, after a short delay, they will be given a
recognition test over all the studied target words plus a matched set
of non-studied lures.  The stimuli are contained in three separate CSV
files:

- [Positive Pool](./pos_pool.csv)
- [Negative Pool](./neg_pool.csv)
- [Neutral Pool](./neu_pool.csv)

You will need to read these files in as lists of dictionaries (hint:
use the ``DictReader`` from the ``csv`` module that was covered in
class).  Use these pools to create lists of trials for two
experimental conditions: pure or mixed.  In the *pure* condition,
all of the trials should be words of the same valence (be sure to
have the same number of positive, negative, and neutral pure lists).
In the *mixed* condition, each list should contain an equal number of
positive, negative, and neutral words in *random* order (hint: use the
``shuffle`` function provided by the ``random`` module).
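
One thing to keep in mind is that ``DictReader`` returns every field as a string, so numeric columns need to be converted if you plan to do arithmetic on them. The sketch below uses an in-memory stand-in for a pool file, with a reduced set of column names and rounded values taken from the example output later in this notebook. The `NUMERIC_COLS` tuple is an assumption about which columns you would want as numbers.

```python
import io
from csv import DictReader

# In-memory stand-in for a pool file; a real pool would be read with open()
csv_text = ("description,valence_mean,word_frequency\n"
            "wicked,2.96,9\n"
            "maniac,3.76,4\n")

# columns to convert from strings to floats (an assumption for this sketch)
NUMERIC_COLS = ('valence_mean', 'word_frequency')

pool = []
for row in DictReader(io.StringIO(csv_text)):
    # every value arrives as a string; convert only the named columns
    for col in NUMERIC_COLS:
        row[col] = float(row[col])
    pool.append(row)

print(pool[0])
```

Converting only named columns, rather than anything that happens to parse as a number, avoids accidentally turning a word such as "nan" into a float.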

You will need to generate a matching test list for each study list
that includes all the studied items, plus a set of lures that match
the valence of the studied words.

Be sure to add in information to each trial dictionary that identifies
the word, its valence, the condition of the list, and whether it is a
target or a lure.  Feel free to add in more information if you would
like.


In [4]:
# config variables
pos_file = 'pos_pool.csv'
neg_file = 'neg_pool.csv'
neu_file = 'neu_pool.csv'

# number of pools
num_pools = 3

# number of items in pure lists (must be evenly divisible by num_pools)
num_items_pure = 9

# number of repetitions of each block type
num_reps = 3       

# verify these numbers make sense
num_items_mixed = num_items_pure // num_pools
assert num_items_mixed * num_pools == num_items_pure

In [5]:
# load in the pools (must add in valence)
def load_pool(filename, valence):
    # read each CSV row as a dict and tag it with its valence,
    # closing the file once all rows have been read
    with open(filename, 'r', newline='') as f:
        return [dict({'valence': valence}, **row) for row in DictReader(f)]

pos_pool = load_pool(pos_file, 'pos')
neg_pool = load_pool(neg_file, 'neg')
neu_pool = load_pool(neu_file, 'neu')

# print out number of items in each pool
print('pos_pool:', len(pos_pool))
print('neg_pool:', len(neg_pool))
print('neu_pool:', len(neu_pool))

# shuffle the pools
random.shuffle(pos_pool)
random.shuffle(neg_pool)
random.shuffle(neu_pool)

pos_pool: 301
neg_pool: 292
neu_pool: 208


In [6]:
# generate the blocks
blocks = []
for r in range(num_reps):
    # generate a pure pos block
    blocks.append(gen_block([pos_pool], 'pos', 
                            num_items_pure))
    
    # generate a pure neg block
    blocks.append(gen_block([neg_pool], 'neg', 
                            num_items_pure))
    
    # generate a pure neu block
    blocks.append(gen_block([neu_pool], 'neu', 
                            num_items_pure))
    
    # generate a mixed pos/neg/neu block
    blocks.append(gen_block([pos_pool, neg_pool, neu_pool], 
                            'mixed', num_items_mixed))

# shuffle the blocks
random.shuffle(blocks)

# let's see how many items we have left
print('pos_pool:', len(pos_pool))
print('neg_pool:', len(neg_pool))
print('neu_pool:', len(neu_pool))

len(blocks)

pos_pool: 229
neg_pool: 220
neu_pool: 136


12
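
The remaining pool sizes printed above follow directly from how `gen_block` draws items: each block pops its targets and then the same number of lures from every pool it is given. The helper below is a small sketch of that arithmetic, assuming the design used here (one pure block per pool plus one mixed block in each repetition). The function name `items_used_per_pool` is made up for illustration.

```python
def items_used_per_pool(num_items_pure, num_pools, num_reps):
    """Items drawn from each pool, assuming one pure block per pool and
    one mixed block in every repetition."""
    num_items_mixed = num_items_pure // num_pools
    # each block pops its targets and then the same number of lures
    pure_draw = 2 * num_items_pure
    mixed_draw = 2 * num_items_mixed
    return num_reps * (pure_draw + mixed_draw)

used = items_used_per_pool(num_items_pure=9, num_pools=3, num_reps=3)
print(used)        # 72
print(301 - used)  # 229, the remaining pos_pool size printed above
```

Comparing this number with the size of your smallest pool before generating blocks tells you whether `pool.pop()` will run out of items.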

In [13]:
blocks[0]

{'study': [{'valence': 'neg',
   'description': 'wicked',
   'word_no': '493',
   'valence_mean': '2.96',
   'valence_sd': '2.3700000000000001',
   'arousal_mean': '6.0899999999999999',
   'arousal_sd': '2.4399999999999999',
   'dominance_mean': '4.3600000000000003',
   'dominance_sd': '2.6499999999999999',
   'word_frequency': '9',
   'novelty': 'target',
   'cond': 'neg'},
  {'valence': 'neg',
   'description': 'maniac',
   'word_no': '862',
   'valence_mean': '3.7599999999999998',
   'valence_sd': '2.0',
   'arousal_mean': '5.3899999999999997',
   'arousal_sd': '2.46',
   'dominance_mean': '4.2199999999999998',
   'dominance_sd': '2.0699999999999998',
   'word_frequency': '4',
   'novelty': 'target',
   'cond': 'neg'},
  {'valence': 'neg',
   'description': 'loser',
   'word_no': '851',
   'valence_mean': '2.25',
   'valence_sd': '1.48',
   'arousal_mean': '4.9500000000000002',
   'arousal_sd': '2.5699999999999998',
   'dominance_mean': '3.02',
   'dominance_sd': '2.1699999999999999

In [117]:
verify_blocks(blocks, study_key='study', test_key='test', 
              cond_key='cond', mixed_cond='mixed',
              novelty_key='novelty', type_key='valence')

Conds: ['pos' 'mixed' 'mixed' 'neu' 'neu' 'mixed' 'pos' 'pos' 'neg' 'neg' 'neu'
 'neg']
Cond Counts: [3 3 3 3]
Num Study: [9 9 9 9 9 9 9 9 9 9 9 9]
Num Test: [18 18 18 18 18 18 18 18 18 18 18 18]
It passed all the tests!!!
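
The objectives also mention writing the lists out to files for future use, which the cells above do not show. One possible approach, sketched here under the assumption that JSON is an acceptable format (all values in the trial dictionaries are plain strings), is to dump the whole `blocks` list to a single file. The toy `blocks` data and the output path are hypothetical stand-ins.

```python
import json
import os
import tempfile

# toy stand-in for the `blocks` list built above (hypothetical values)
blocks = [{
    'study': [{'description': 'word1', 'novelty': 'target', 'cond': 'neg'}],
    'test': [{'description': 'word1', 'novelty': 'target', 'cond': 'neg'},
             {'description': 'word2', 'novelty': 'lure', 'cond': 'neg'}],
}]

# hypothetical output location; a real script would choose a meaningful path
out_path = os.path.join(tempfile.gettempdir(), 'blocks_example.json')

# write the blocks out ...
with open(out_path, 'w') as f:
    json.dump(blocks, f, indent=2)

# ... and read them back in to confirm nothing was lost
with open(out_path, 'r') as f:
    loaded = json.load(f)

print(loaded == blocks)
```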


## Option 2: Scene Study

This study will test whether recognition memory for indoor and outdoor
scenes is modulated by the structure of the study lists.
Specifically, participants will study lists in which indoor and outdoor
scenes are presented either in pure blocks or intermixed (similar to
the Valence study above).  The participants will then be given a
recognition test over all the studied target images plus a matched set
of non-studied lures.  The lists of available stimuli are in two CSV files:

- [Indoor Pool](./indoor.csv)
- [Outdoor Pool](./outdoor.csv)

You will need to read these files in as lists of dictionaries (hint:
use the ``DictReader`` from the ``csv`` module that was covered in
class).  For the actual experiment we will give you the images that
are referenced by the file names in these pools, but for the list
generation you do not need the images themselves and should identify
each image you will be presenting by its file name.  Use these pools
to create lists of trials for two experimental conditions: pure or
mixed.  In the *pure* condition, all of the trials should be images
from the same category (be sure to have the same number of indoor
and outdoor pure lists).  In the *mixed* condition, each
list should contain an equal number of indoor and outdoor
images in *random* order (hint: use the ``shuffle`` function provided
by the ``random`` module).

You will need to generate a matching test list for each study list
that includes all the studied items, plus a set of lures that match
the image categories from the studied items.

Be sure to add in information to each trial dictionary that identifies
the file name, the category of the image, the condition of the list,
and whether it is a target or a lure.


In [2]:
# config variables
indoor_file = 'indoor.csv'
outdoor_file = 'outdoor.csv'

# number of pools
num_pools = 2

# number of items in pure lists (must be evenly divisible by num_pools)
num_items_pure = 10

# number of repetitions of each block type
num_reps = 3       

# verify these numbers make sense
num_items_mixed = num_items_pure // num_pools
assert num_items_mixed * num_pools == num_items_pure

In [3]:
# load in the pools
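# (use `with` so each file is closed once its rows have been read)
with open(indoor_file, 'r', newline='') as f:
    indoor_pool = list(DictReader(f))
with open(outdoor_file, 'r', newline='') as f:
    outdoor_pool = list(DictReader(f))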
print('indoor:', len(indoor_pool))
print('outdoor:', len(outdoor_pool))

# shuffle the pools
random.shuffle(indoor_pool)
random.shuffle(outdoor_pool)

indoor: 335
outdoor: 309


In [4]:
# generate the blocks
blocks = []
for r in range(num_reps):
    # generate a pure indoor block
    blocks.append(gen_block([indoor_pool], 'indoor', 
                            num_items_pure))
    
    # generate a pure outdoor block
    blocks.append(gen_block([outdoor_pool], 'outdoor', 
                            num_items_pure))
    
    # generate a mixed indoor/outdoor block
    blocks.append(gen_block([indoor_pool, outdoor_pool], 'mixed', 
                            num_items_mixed))

# shuffle the blocks
random.shuffle(blocks)

# let's see how many items we have left
print('indoor:', len(indoor_pool))
print('outdoor:', len(outdoor_pool))

len(blocks)

indoor: 245
outdoor: 219


9

In [10]:
blocks[0]

{'study': [{'filename': 'in0217.jpg',
   'in_out': 'indoor',
   'novelty': 'target',
   'cond': 'indoor'},
  {'filename': 'in0339.jpg',
   'in_out': 'indoor',
   'novelty': 'target',
   'cond': 'indoor'},
  {'filename': 'in0036.jpg',
   'in_out': 'indoor',
   'novelty': 'target',
   'cond': 'indoor'},
  {'filename': 'in0311.jpg',
   'in_out': 'indoor',
   'novelty': 'target',
   'cond': 'indoor'},
  {'filename': 'in0343.jpg',
   'in_out': 'indoor',
   'novelty': 'target',
   'cond': 'indoor'},
  {'filename': 'in0024.jpg',
   'in_out': 'indoor',
   'novelty': 'target',
   'cond': 'indoor'},
  {'filename': 'in0263.jpg',
   'in_out': 'indoor',
   'novelty': 'target',
   'cond': 'indoor'},
  {'filename': 'in0157.jpg',
   'in_out': 'indoor',
   'novelty': 'target',
   'cond': 'indoor'},
  {'filename': 'in0207.jpg',
   'in_out': 'indoor',
   'novelty': 'target',
   'cond': 'indoor'},
  {'filename': 'in0183.jpg',
   'in_out': 'indoor',
   'novelty': 'target',
   'cond': 'indoor'}],
 'test': [

In [122]:
verify_blocks(blocks, study_key='study', test_key='test', 
              cond_key='cond', mixed_cond='mixed',
              novelty_key='novelty', type_key='in_out')

Conds: ['mixed' 'indoor' 'mixed' 'outdoor' 'indoor' 'mixed' 'outdoor' 'outdoor'
 'indoor']
Cond Counts: [3 3 3]
Num Study: [10 10 10 10 10 10 10 10 10]
Num Test: [20 20 20 20 20 20 20 20 20]
It passed all the tests!!!
