In [1]:
import sys
import os
sys.path.insert(0, os.path.abspath('../../'))
test_data_path = '../../test/test_data/'
import neurone_loader
import logging
# route the lazy-loading debug messages through logger.warning so they show up in the output
neurone_loader.lazy.logger.debug = neurone_loader.lazy.logger.warning

# Lazy loading


Raw EEG recordings can be quite large, so this package is designed around memory limits
and the bottlenecks caused by long loading times from disk.

Therefore, most actions that require loading data from disk into memory are executed lazily, meaning:

1. the data is loaded from disk when you access it for the first time
2. the data remains in memory and can be accessed very fast subsequently

To make working with the data more comfortable, creating containers and loading metadata, on the other hand, happens immediately.
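
The behaviour described above can be sketched with a small stand-in class. This is only an illustration of the general lazy-loading pattern, not the package's actual implementation: `LazyContainer`, `_load_from_disk`, and `load_count` are hypothetical names, and the "disk read" is faked with a list.

```python
class LazyContainer:
    """Hypothetical stand-in for a container whose data is loaded lazily."""

    def __init__(self, path):
        # Cheap: only remember where the data lives, do not read it yet.
        self.path = path
        self._data = None
        self.load_count = 0  # counts how often the expensive load ran

    def _load_from_disk(self):
        # Placeholder for an expensive read; a real loader would parse files here.
        self.load_count += 1
        return [0.0] * 1000

    @property
    def data(self):
        # First access triggers the load; later accesses reuse the cached result.
        if self._data is None:
            self._data = self._load_from_disk()
        return self._data


container = LazyContainer('some/path')  # nothing loaded yet
first = container.data                  # triggers the (fake) load
second = container.data                 # served from memory
```

Constructing the container stays cheap because the expensive work is deferred to the first access of `data`.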

In [2]:
from neurone_loader import Recording

In [3]:
# fast: only relevant metadata is loaded from disk
%time rec = Recording(test_data_path)

CPU times: user 16.3 ms, sys: 382 µs, total: 16.6 ms
Wall time: 15.4 ms


In [4]:
%%time
# fast: metadata is already in memory
print(f'Sessions: {len(rec.sessions)}')
print(f'Sampling rate: {rec.sampling_rate}Hz')

Sessions: 2
Sampling rate: 5000Hz
CPU times: user 302 µs, sys: 0 ns, total: 302 µs
Wall time: 189 µs


In [5]:
%%time
# this is slow: the session data needs to be retrieved from disk first
print(f'Session 1 shape: {rec.sessions[0].data.shape}')

(Lazy) loading Session.data
(Lazy) loading Phase.data
(Lazy) loading Phase.data
(Lazy) loading Phase.data
(Lazy) loading Phase.data


Session 1 shape: (2504369, 138)
CPU times: user 10.2 s, sys: 2.78 s, total: 13 s
Wall time: 4.05 s


In [6]:
%%time
# this will be faster because the data is already in memory
print(f'Session 1 shape again: {rec.sessions[0].data.shape}')

Session 1 shape again: (2504369, 138)
CPU times: user 949 µs, sys: 242 µs, total: 1.19 ms
Wall time: 77.5 µs


As you can see above, the container object can be constructed and used in a very memory- and time-efficient way.
Reading the actual session data, which can take a long time and may consume a lot of memory, only happens when the data is actually needed.
On subsequent accesses, the already loaded data is retrieved from memory, which is much faster.

To free memory, the loaded data can be cleared again using the `.clear_data()` method.

In [7]:
rec.clear_data()
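
How clearing might propagate through nested containers can be sketched as follows. The log output above suggests that a session's data is assembled from its phases, so this sketch assumes a container simply forwards the call to its children. `LazyPhase` and `LazySession` are hypothetical classes written for illustration and do not reflect the package's real internals.

```python
class LazyPhase:
    """Hypothetical leaf container holding lazily loaded data."""

    def __init__(self, n_samples):
        self.n_samples = n_samples
        self._data = None

    @property
    def data(self):
        # Load on first access (faked here with a list of zeros).
        if self._data is None:
            self._data = [0.0] * self.n_samples
        return self._data

    def clear_data(self):
        # Drop the cached data; the next access to .data loads it again.
        self._data = None


class LazySession:
    """Hypothetical parent container that forwards clearing to its phases."""

    def __init__(self, phases):
        self.phases = phases

    def clear_data(self):
        for phase in self.phases:
            phase.clear_data()


session = LazySession([LazyPhase(10), LazyPhase(20)])
_ = session.phases[0].data  # load only the first phase
loaded_before = [p._data is not None for p in session.phases]
session.clear_data()
loaded_after = [p._data is not None for p in session.phases]
```

In this sketch, clearing is not destructive: accessing `data` afterwards simply triggers a fresh load.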

TODO: preloading