# Combining Data Types into Collections

In this section, we will learn how to combine simple data types into collections, such as lists and arrays. We will explore the advantages and disadvantages of different collection types and learn how to use them.

## Lists

Lists are ordered collections whose size can change after creation and whose elements can be of any type. This makes lists very flexible, but they are not the most efficient collection type for every use case.
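As a small illustration of that flexibility, the sketch below appends to a list and mixes element types. The names `reaction_times` and `mixed`, and the `"missed"` entry, are made up for this example.

```python
# A list can grow after it has been created.
reaction_times = [713, 552, 473]
reaction_times.append(143)  # the list now holds four elements

# A list can also hold elements of different types
# (hypothetical values, for illustration only).
mixed = [713, "missed", 47.3, None]

print(len(reaction_times))
print([type(item).__name__ for item in mixed])
```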

Here is an example with three lists of reaction times, one per participant:

In [5]:
participant_1_RTs = [713, 552, 473, 143, 638, 311, 668, 937, 621, 459]
participant_2_RTs = [287, 750, 411, 410, 351, 1040, 1124, 891, 924, 664]
participant_3_RTs = [342, 1063, 131, 485, 480, 159, 60, 389, 375, 653]

To compare the reaction times on the 4th trial (index 3), we can use indexing:

In [2]:
print(participant_1_RTs[3], participant_2_RTs[3], participant_3_RTs[3])

143 410 485


Alternatively, we can use a list of lists, which can make the code cleaner:
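A minimal sketch of such a list of lists, reusing the three lists from above. The name `participants` matches the variable used in the array example later in this section; how the original cell built it is an assumption.

```python
participant_1_RTs = [713, 552, 473, 143, 638, 311, 668, 937, 621, 459]
participant_2_RTs = [287, 750, 411, 410, 351, 1040, 1124, 891, 924, 664]
participant_3_RTs = [342, 1063, 131, 485, 480, 159, 60, 389, 375, 653]

# One outer list whose elements are the per-participant lists.
participants = [participant_1_RTs, participant_2_RTs, participant_3_RTs]

print(len(participants))     # number of participants
print(len(participants[0]))  # number of trials for the first participant
```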

Now, to get a single value (e.g. 4th trial from 2nd participant), we need to provide 2 indices:
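For instance, assuming `participants` is the list of lists described above (rebuilt inline here so the snippet runs on its own), the first index selects the participant and the second selects the trial:

```python
participants = [
    [713, 552, 473, 143, 638, 311, 668, 937, 621, 459],
    [287, 750, 411, 410, 351, 1040, 1124, 891, 924, 664],
    [342, 1063, 131, 485, 480, 159, 60, 389, 375, 653],
]

# Both indices are zero-based: [1] is the 2nd participant, [3] is the 4th trial.
value = participants[1][3]
print(value)  # 410
```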

We can also compare them now, using loops, without having to manually type something out for each participant:
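One way to write such a loop is sketched below. The loop variable `rts` and the helper list `fourth_trial` are names chosen for this example, and `participants` is rebuilt inline so the snippet is self-contained.

```python
participants = [
    [713, 552, 473, 143, 638, 311, 668, 937, 621, 459],
    [287, 750, 411, 410, 351, 1040, 1124, 891, 924, 664],
    [342, 1063, 131, 485, 480, 159, 60, 389, 375, 653],
]

# Visit each participant's list in turn and pick out the 4th trial (index 3).
for number, rts in enumerate(participants, start=1):
    print(f"participant {number}: {rts[3]}")

# The same comparison collected into a single list.
fourth_trial = [rts[3] for rts in participants]
print(fourth_trial)  # [143, 410, 485]
```

The loop keeps working unchanged if more participants are added to `participants`.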

## Arrays

Another collection type that is useful for data analysis is the array. Arrays are provided by the `numpy` package and differ from lists in three ways, trading some flexibility for speed:

* Fixed size (no in-place appending)
* Fixed type (every element has the same type)
* Bulk computations on arrays are much faster
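The sketch below illustrates the fixed type and the bulk computations on a few values. Converting to seconds assumes the reaction times are in milliseconds, which the text does not state. The names `rts`, `in_seconds`, and `mixed` are chosen for this example.

```python
import numpy as np

rts = np.array([713, 552, 473, 143])

# Every element shares one dtype (an integer type here).
print(rts.dtype)

# Mixing an int and a float yields a single float dtype for the whole array.
mixed = np.array([713, 47.3])
print(mixed.dtype)

# A bulk computation applies to all elements at once, with no explicit loop.
# Assumption: the values are milliseconds, so dividing by 1000 gives seconds.
in_seconds = rts / 1000
print(in_seconds)

# The list equivalent needs a loop or comprehension.
print([rt / 1000 for rt in [713, 552, 473, 143]])
```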

Here is an example of how to convert our list of lists to a numpy array:

In [8]:
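import numpy as np

# Collect the three per-participant lists defined earlier into one list of lists
participants = [participant_1_RTs, participant_2_RTs, participant_3_RTs]
participants_array = np.array(participants)  # see also the arrays_exercises notebook
print(participants_array)

[[ 713  552  473  143  638  311  668  937  621  459]
 [ 287  750  411  410  351 1040 1124  891  924  664]
 [ 342 1063  131  485  480  159   60  389  375  653]]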

With an array, it becomes very simple to compare reaction times across participants on a given trial, or to get all reaction times from a single participant (e.g. `participants_array[0]`).

In [7]:
print(participants_array[0, 3])  # 1st participant, 4th trial
print(participants_array[:, 3])  # 4th trial, all participants

143
[143 410 485]

We can also do other operations to aggregate information, like taking the participant mean or standard deviation:

In [None]:
print(participants_array.mean(axis=1))  # mean per participant, across trials
print(participants_array.mean(axis=0))  # mean per trial, across participants

The trial means come just as easily: `axis=0` averages over participants and leaves one value per trial.
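A short sketch of these aggregations, with the array built inline so the block runs on its own. The result names (`trial_means`, `participant_means`, `participant_sds`) are chosen for this example.

```python
import numpy as np

participants_array = np.array([
    [713, 552, 473, 143, 638, 311, 668, 937, 621, 459],
    [287, 750, 411, 410, 351, 1040, 1124, 891, 924, 664],
    [342, 1063, 131, 485, 480, 159, 60, 389, 375, 653],
])

# axis=0 collapses the participant dimension: one mean per trial.
trial_means = participants_array.mean(axis=0)

# axis=1 collapses the trial dimension: one value per participant.
participant_means = participants_array.mean(axis=1)
participant_sds = participants_array.std(axis=1)

print(trial_means.shape)        # (10,)
print(participant_means.shape)  # (3,)
print(participant_sds)
```

Note that `std` computes the population standard deviation by default (`ddof=0`); pass `ddof=1` if the sample standard deviation is wanted.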