In this notebook, we test the performance of different methods of: adding samples to a growing list; copying lists; and converting those copies to Numpy arrays. The Enobio32 EEG cap transmits data at 500 samples per second, and other caps have higher sampling rates. In order to accomodate high sampling rates, it is imperative to use the fastest method of growing a list of data and of copying that list for analysis using MNE-Python.

In [1]:
import random
from copy import deepcopy

import numpy as np
import pandas as pd

# Adding samples of data

In [7]:
# Setup code

N_REPETITIONS = 15000
# To accomodate data tranmission at 500 Hz, 15000 samples
# represents 30 seconds of data.

channels = ['Fp1','Fp2','AF3','AF4','F3','F4','F7','F8','FC5','FC6','T7','T8',
'FC1','FC2','C3','C4','CP5','CP6','P7','P8','CP1','CP2','P3','P4','01','02',
'PO3','PO4','Oz','Pz','Cz','Fz']

a = [[] for _ in channels]
b = [round(random.random(), 1) for _ in channels]

def add_col((x,y)):
    x.append(y)

df = pd.DataFrame(index=channels)
b_pandas = pd.Series(b, index=df.index)

## Adding one sample of data

### By column (no need for transposing)

In [8]:
%timeit for x,y in zip(a,b): x.append(y)

100000 loops, best of 3: 7.83 µs per loop


In [9]:
%timeit map(add_col, zip(a,b))

100000 loops, best of 3: 10.9 µs per loop


In [10]:
%timeit df[0] = b_pandas

The slowest run took 11.27 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 59 µs per loop


In [11]:
%timeit df[0] = b

The slowest run took 4.89 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 82.7 µs per loop


### By row (we would have to transpose later)

In [12]:
%timeit channels.append(b)

The slowest run took 12.11 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 3: 177 ns per loop


## Adding many samples of data and converting to ndarray

In [13]:
%%timeit
a = [[] for _ in channels]
for i in xrange(N_REPETITIONS):
    for x,y in zip(a,b): 
        x.append(y)
ndarray = np.array(a)

1 loop, best of 3: 8.67 s per loop


In [14]:
%%timeit
a = [[] for _ in channels]
for i in xrange(N_REPETITIONS):
    map(add_col, zip(a,b))
ndarray = np.array(a)

1 loop, best of 3: 8.41 s per loop


In [15]:
%%timeit
a = [[] for _ in channels]
for i in xrange(N_REPETITIONS):
    df[i] = b
ndarray = df.values

1 loop, best of 3: 6.55 s per loop


In [16]:
%%timeit
a = [[] for _ in channels]
for i in xrange(N_REPETITIONS):
    df[i] = b_pandas
ndarray = df.values

1 loop, best of 3: 6.18 s per loop


In [17]:
%%timeit
a = []
for i in xrange(N_REPETITIONS): 
    a.append(b)
ndarray = np.array(a).T

100 loops, best of 3: 15 ms per loop


# Copying nested list and converting to ndarray

In [2]:
# Setup code (similar to setup code above)

N_REPETITIONS = 100000
# To accomodate data tranmission at 500 Hz, 100000 samples
# represents 200 seconds of data.

channels = ['Fp1','Fp2','AF3','AF4','F3','F4','F7','F8','FC5','FC6','T7','T8',
'FC1','FC2','C3','C4','CP5','CP6','P7','P8','CP1','CP2','P3','P4','01','02',
'PO3','PO4','Oz','Pz','Cz','Fz']

fake_eeg_data = []
fake_sample = [round(random.random(), 1) for _ in channels]

# Generate fake EEG data that resembles the structure of the real data
# We will copy this data and convert it to type ndarray.
for i in xrange(N_REPETITIONS):
    fake_eeg_data.append(fake_sample)

In [3]:
%%timeit
# copy.deepcopy() method
copy = deepcopy(fake_eeg_data)
ndarray = np.array(copy).T

10 loops, best of 3: 158 ms per loop


In [4]:
%%timeit
# Preallocate numpy array and copy list into that array
ndarray = np.zeros(shape=(len(fake_eeg_data), len(channels)))
for i, row  in enumerate(fake_eeg_data):
    ndarray[i] = row[:]
ndarray = ndarray.T

10 loops, best of 3: 165 ms per loop


In [5]:
%%timeit
# Nested list comprehension
copy = [[item for item in row] for row in fake_eeg_data]
ndarray = np.array(copy).T

1 loop, best of 3: 440 ms per loop


In [6]:
%%timeit
# List comprehension and slicing
copy = [row[:] for row in fake_eeg_data]
ndarray = np.array(copy).T

10 loops, best of 3: 131 ms per loop
