# Combining Data Types

In this section, we will learn how to combine simple data types into collections, such as lists and arrays. We will explore the advantages and disadvantages of different collection types and learn how to use them.

## Lists

Lists are ordered collections of elements. They can grow or shrink after creation, and a single list can hold elements of different types. While this makes lists very flexible, they may not be the most efficient collection type for all use cases.
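A minimal sketch of this flexibility. The list `example` is made up for illustration and is not part of the data used in the rest of this section:

```python
# Hypothetical example list, for illustration only
example = [713, 552]

example.append(473)        # Lists can grow after creation...
example.append("missed")   # ...and can hold mixed types (here: int and str)
example.pop()              # They can also shrink again (removes "missed")

print(example)
print(len(example))
```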

Here is an example with three lists of reaction times, one for each of three participants:

In [1]:
participant_1_RTs = [713, 552, 473, 143, 638, 311, 668, 937, 621, 459]
participant_2_RTs = [287, 750, 411, 410, 351, 1040, 1124, 891, 924, 664]
participant_3_RTs = [342, 1063, 131, 485, 480, 159, 60, 389, 375, 653]

To compare the reaction times on the 4th trial (index 3), we can use indexing:

In [2]:
print(participant_1_RTs[3], participant_2_RTs[3], participant_3_RTs[3])

143 410 485


Alternatively, we can use a list of lists, which can make the code cleaner:

In [3]:
#   Second (inner) index >  
#    0    1    2    3    4    5    6    7    8    9         First (outer) index \/
participants = [
    [713, 552, 473, 143, 638, 311, 668, 937, 621, 459],    # 0
    [287, 750, 411, 410, 351, 1040, 1124, 891, 924, 664],  # 1
    [342, 1063, 131, 485, 480, 159, 60, 389, 375, 653]     # 2
]

Now, to get a single value (e.g. the 4th trial from the 2nd participant), we need to provide two indices:

In [4]:
print(participants[1][3])

410


We can now also compare the participants with a loop, instead of typing out an index expression for each one:

In [5]:
for participant in participants:
    print(participant[3])

143
410
485
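The same loop can also collect the values into a new list instead of printing them. Below is a minimal sketch using a list comprehension. The data is repeated so the block runs on its own, and the name `fourth_trial` is an assumption chosen for illustration:

```python
participants = [
    [713, 552, 473, 143, 638, 311, 668, 937, 621, 459],
    [287, 750, 411, 410, 351, 1040, 1124, 891, 924, 664],
    [342, 1063, 131, 485, 480, 159, 60, 389, 375, 653],
]

# Take index 3 (the 4th trial) from every inner list
fourth_trial = [participant[3] for participant in participants]

print(fourth_trial)  # [143, 410, 485]
```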


## Arrays

Another collection type that is useful for data analysis is the array. Arrays are provided by the `numpy` package and differ from lists in ways that make them better suited to numerical work:

* Fixed size (arrays have no `append` method)
* Fixed type (all elements share the same type)
* Bulk computations on arrays are much faster
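A small sketch of what the first two points look like in practice. The arrays `rts` and `mixed` are made up for illustration:

```python
import numpy as np

# Hypothetical example arrays, for illustration only
rts = np.array([713, 552, 473])

# Fixed size: arrays have no append method. np.append builds a NEW array
# and leaves the original unchanged.
longer = np.append(rts, 143)
print(hasattr(rts, "append"))   # False
print(rts.shape, longer.shape)  # (3,) (4,)

# Fixed type: mixing ints and floats converts every element to float
mixed = np.array([713, 552.5, 473])
print(mixed.dtype)              # float64

# Bulk computation: one expression is applied to every element
print(rts / 1000)
```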

Here is an example of how to convert our list of lists to a numpy array:

In [6]:
import numpy as np

participants_array = np.array(participants)
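To check what the conversion produced, it can help to inspect the array's dimensions. The sketch below repeats the data so it runs on its own. Each inner list becomes a row, so the result has 3 rows (participants) and 10 columns (trials):

```python
import numpy as np

participants = [
    [713, 552, 473, 143, 638, 311, 668, 937, 621, 459],
    [287, 750, 411, 410, 351, 1040, 1124, 891, 924, 664],
    [342, 1063, 131, 485, 480, 159, 60, 389, 375, 653],
]
participants_array = np.array(participants)

print(participants_array.ndim)   # 2
print(participants_array.shape)  # (3, 10)
```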

With an array, comparing reaction times across participants, or getting all reaction times from a single participant, takes only one indexing expression:

In [7]:
print(participants_array[:, 3])  # 4th trial (index 3) from every participant
print(participants_array[1, :])  # All RTs from the 2nd participant (index 1)

[143 410 485]
[ 287  750  411  410  351 1040 1124  891  924  664]


We can also aggregate information, for example by taking each participant's mean or standard deviation across trials (`axis=1`):

In [8]:
print(participants_array.mean(axis=1))
print(participants_array.std(axis=1))

[551.5 685.2 413.7]
[209.76379573 290.45302546 276.67274893]
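One detail worth knowing: by default, numpy's `std` divides by the number of values (the population standard deviation, `ddof=0`). If the sample standard deviation is wanted instead, `ddof=1` can be passed. A minimal sketch, with the data repeated so it runs on its own; the variable names are assumptions for illustration:

```python
import numpy as np

participants_array = np.array([
    [713, 552, 473, 143, 638, 311, 668, 937, 621, 459],
    [287, 750, 411, 410, 351, 1040, 1124, 891, 924, 664],
    [342, 1063, 131, 485, 480, 159, 60, 389, 375, 653],
])

# Default: divide by N (population standard deviation)
population_sd = participants_array.std(axis=1)

# ddof=1: divide by N - 1 (sample standard deviation)
sample_sd = participants_array.std(axis=1, ddof=1)

print(population_sd)
print(sample_sd)
```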


Or, just as easily, the mean of each trial across participants (`axis=0`):

In [9]:
print(participants_array.mean(axis=0))

[447.33333333 788.33333333 338.33333333 346.         489.66666667
 503.33333333 617.33333333 739.         640.         592.        ]
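To make the meaning of the `axis` argument concrete, the sketch below recomputes both sets of means from the list of lists and checks them against the array results. The names `manual_participant_means` and `manual_trial_means` are assumptions for illustration:

```python
import numpy as np

participants = [
    [713, 552, 473, 143, 638, 311, 668, 937, 621, 459],
    [287, 750, 411, 410, 351, 1040, 1124, 891, 924, 664],
    [342, 1063, 131, 485, 480, 159, 60, 389, 375, 653],
]
participants_array = np.array(participants)

# axis=1 collapses the trial dimension: one mean per participant (row)
manual_participant_means = [sum(rts) / len(rts) for rts in participants]

# axis=0 collapses the participant dimension: one mean per trial (column)
n_trials = len(participants[0])
manual_trial_means = [
    sum(rts[trial] for rts in participants) / len(participants)
    for trial in range(n_trials)
]

print(np.allclose(manual_participant_means, participants_array.mean(axis=1)))  # True
print(np.allclose(manual_trial_means, participants_array.mean(axis=0)))        # True
```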


## Exercises

For these exercises, you will need to import the `log()` function from the `math` module. Then, try to calculate the log-RTs (the natural logarithm of each reaction time) using:

1. The three separate RT lists (hint: you may need to use several for-loops)
2. The `participants` list of lists (hint: two for-loops should be enough here!)
3. The `participants_array` array in numpy (hint: use `np.log` instead of Python's `log` function; you shouldn't need a loop!)

### Exercise 1

In [10]:
from math import log

# Type your code here, using participant_1_RTs, participant_2_RTs, and participant_3_RTs
print('Participant 1:')
for rt in participant_1_RTs:
    print(log(rt))

print('Participant 2:')
for rt in participant_2_RTs:
    print(log(rt))

print('Participant 3:')
for rt in participant_3_RTs:
    print(log(rt))

Participant 1:
6.569481420414296
6.313548046277095
6.159095388491933
4.962844630259907
6.45833828334479
5.739792912179234
6.504288173536645
6.842683282238422
6.431331081933479
6.129050210060545
Participant 2:
5.659482215759621
6.620073206530356
6.018593214496234
6.016157159698354
5.860786223465865
6.946975992135418
7.024649030453636
6.792344427470809
6.828712071641684
6.498282149476434
Participant 3:
5.834810737062605
6.968850378341948
4.875197323201151
6.184148890937483
6.173786103901937
5.0689042022202315
4.0943445622221
5.963579343618446
5.926926025970411
6.481577129276431


### Exercise 2

In [12]:
from math import log

# Type your code here, using participants
for i, participant_RTs in enumerate(participants):  # You can also do this without enumerate
    print(f'Participant {i + 1}:')
    for rt in participant_RTs:
        print(log(rt))

Participant 1:
6.569481420414296
6.313548046277095
6.159095388491933
4.962844630259907
6.45833828334479
5.739792912179234
6.504288173536645
6.842683282238422
6.431331081933479
6.129050210060545
Participant 2:
5.659482215759621
6.620073206530356
6.018593214496234
6.016157159698354
5.860786223465865
6.946975992135418
7.024649030453636
6.792344427470809
6.828712071641684
6.498282149476434
Participant 3:
5.834810737062605
6.968850378341948
4.875197323201151
6.184148890937483
6.173786103901937
5.0689042022202315
4.0943445622221
5.963579343618446
5.926926025970411
6.481577129276431


### Exercise 3

In [13]:
import numpy as np

# Type your code here, using participants_array
print(np.log(participants_array))

[[6.56948142 6.31354805 6.15909539 4.96284463 6.45833828 5.73979291
  6.50428817 6.84268328 6.43133108 6.12905021]
 [5.65948222 6.62007321 6.01859321 6.01615716 5.86078622 6.94697599
  7.02464903 6.79234443 6.82871207 6.49828215]
 [5.83481074 6.96885038 4.87519732 6.18414889 6.1737861  5.0689042
  4.09434456 5.96357934 5.92692603 6.48157713]]
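As a final check, one could confirm that the loop-based and the array-based approaches give the same values. This is a sketch rather than part of the exercise solutions. The data is repeated so it runs on its own, and `log_rts_lists` and `log_rts_array` are names assumed for illustration:

```python
from math import log

import numpy as np

participants = [
    [713, 552, 473, 143, 638, 311, 668, 937, 621, 459],
    [287, 750, 411, 410, 351, 1040, 1124, 891, 924, 664],
    [342, 1063, 131, 485, 480, 159, 60, 389, 375, 653],
]

# List-based: nested comprehension, equivalent to two for-loops
log_rts_lists = [[log(rt) for rt in rts] for rts in participants]

# Array-based: np.log is applied to every element at once
log_rts_array = np.log(np.array(participants))

# Compare with a tolerance, since floating-point results may differ slightly
print(np.allclose(log_rts_lists, log_rts_array))  # True
print(log_rts_array.shape)                        # (3, 10)
```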
