# Overview of MNE-Python's data structures

In [69]:
from __future__ import print_function  # must come before all other imports
import os.path as op

import mne
import numpy as np

## `Info`: recording information
The `Info` data object is created when data is imported into MNE-Python and contains details such as:

 - date, subject information, and other recording details
 - the sampling rate
 - information about the data channels (name, type, position, etc.)
 - digitized points
 - sensor–head coordinate transformation matrices

and so forth. See the [reference documentation]() for a complete list of all data fields. Once created, this object is passed around throughout the data analysis pipeline.

It behaves as a nested Python dictionary:

In [60]:
# Load an example dataset, which of course contains an info object 
raw = mne.io.RawFIF(op.join(mne.datasets.sample.data_path(), 'MEG', 'sample', 'sample_audvis_raw.fif'))

# List all the fields in the info object
print(raw.info.keys())

Opening raw data file /Users/rodin/BRU/mne-python/examples/MNE-sample-data/MEG/sample/sample_audvis_raw.fif...
    Read a total of 3 projection items:
        PCA-v1 (1 x 102)  idle
        PCA-v2 (1 x 102)  idle
        PCA-v3 (1 x 102)  idle
Current compensation grade : 0
    Range : 25800 ... 192599 =     42.956 ...   320.670 secs
Ready.
Adding average EEG reference projection.
1 projection items deactivated
['acq_stim', 'ch_names', 'lowpass', 'buffer_size_sec', 'hpi_results', 'dev_ctf_t', 'projs', 'meas_date', 'meas_id', 'subject_info', 'sfreq', 'filename', 'chs', 'events', 'dev_head_t', 'line_freq', 'proj_id', 'description', 'highpass', 'hpi_subsystem', 'comps', 'custom_ref_applied', 'experimenter', 'file_id', 'proj_name', 'nchan', 'bads', 'hpi_meas', 'dig', 'ctf_head_t', 'acq_pars']


In [17]:
# Obtain the sampling rate of the data
print(raw.info['sfreq'], 'Hz')

600.614990234 Hz


In [18]:
# List all information about the first data channel
print(raw.info['chs'][0])

{'loc': array([-0.1066    ,  0.0464    , -0.0604    , -0.0127    ,  0.0057    ,
       -0.99990302, -0.186801  , -0.98240298, -0.0033    , -0.98232698,
        0.18674099,  0.013541  ], dtype=float32), 'kind': 1, 'unit_mul': 0, 'coil_trans': array([[-0.0127    , -0.186801  , -0.98232698, -0.1066    ],
       [ 0.0057    , -0.98240298,  0.18674099,  0.0464    ],
       [-0.99990302, -0.0033    ,  0.013541  , -0.0604    ],
       [ 0.        ,  0.        ,  0.        ,  1.        ]]), 'ch_name': 'MEG 0113', 'coil_type': 3012, 'coord_frame': 1, 'logno': 113, 'cal': 3.1600000394149674e-09, 'eeg_loc': None, 'range': 0.00030517578125, 'scanno': 1, 'unit': 201}


### Creating custom `Info` objects

Normally, `Info` objects are created by the various [data import functions](http://martinos.org/mne/dev/python_reference.html#reading-raw-data). However, if you wish to create one from scratch, you can use the [`create_info`](http://martinos.org/mne/stable/generated/mne.create_info.html#mne.create_info) function to initialize the minimum set of required fields. Further fields can be assigned later, as one would with a regular dictionary.

In [56]:
# Names for each channel
channel_names = ['MEG1', 'MEG2', 'EEG1', 'EEG2', 'EOG']

# The type (mag, grad, eeg, eog, misc, ...) of each channel
channel_types = ['grad', 'grad', 'eeg', 'eeg', 'eog']

# The sampling rate of the recording
sfreq = 1000  # in Hertz

# Initialize required fields
info = mne.create_info(channel_names, sfreq, channel_types)

# Add some more information
info['description'] = 'My custom dataset'
info['bads'] = ['EEG2']  # Names of bad channels

print(info)

<Info | 14 non-empty fields
    bads : list | EEG2
    ch_names : list | MEG1, MEG2, EEG1, EEG2, EOG
    chs : list | 5 items (EOG: 1, EEG: 2, GRAD: 2)
    comps : list | 0 items
    custom_ref_applied : bool | False
    description : str | 17 items
    dev_head_t : dict | 3 items
    events : list | 0 items
    hpi_meas : list | 0 items
    hpi_results : list | 0 items
    meas_date : numpy.ndarray | 1970-01-01 01:00:00
    nchan : int | 5
    projs : list | 0 items
    sfreq : float | 1000.0
    acq_pars : NoneType
    acq_stim : NoneType
    ctf_head_t : NoneType
    dev_ctf_t : NoneType
    dig : NoneType
    experimenter : NoneType
    file_id : NoneType
    filename : NoneType
    highpass : NoneType
    hpi_subsystem : NoneType
    line_freq : NoneType
    lowpass : NoneType
    meas_id : NoneType
    proj_id : NoneType
    proj_name : NoneType
    subject_info : NoneType
>


## `Raw`: continuous data

Continuous data is stored in objects of type `Raw`. The core data structure is simply a 2D numpy array (channels × samples, `._data`) combined with an `Info` object (`.info`). The data matrix becomes available once the data is loaded into memory, for example by passing `preload=True`:

In [63]:
# Load an example dataset
raw = mne.io.RawFIF(op.join(mne.datasets.sample.data_path(), 'MEG', 'sample', 'sample_audvis_raw.fif'), preload=True)

# Give the size of the data matrix
print('channels x samples:', raw._data.shape)

Opening raw data file /Users/rodin/BRU/mne-python/examples/MNE-sample-data/MEG/sample/sample_audvis_raw.fif...
    Read a total of 3 projection items:
        PCA-v1 (1 x 102)  idle
        PCA-v2 (1 x 102)  idle
        PCA-v3 (1 x 102)  idle
Current compensation grade : 0
    Range : 25800 ... 192599 =     42.956 ...   320.670 secs
Ready.
Adding average EEG reference projection.
1 projection items deactivated
Reading 0 ... 166799  =      0.000 ...   277.714 secs...
[done]
channels x samples: (376, 166800)


### Creating custom `Raw` objects

To create a `Raw` object from scratch, you can use the `RawArray` class, which implements raw data that is backed by a numpy array. Its constructor simply takes the data matrix and `Info` object:

In [70]:
# Generate some random data
data = np.random.randn(5, 1000)

# Initialize an info structure
info = mne.create_info(
    ch_names=['MEG1', 'MEG2', 'EEG1', 'EEG2', 'EOG'],
    ch_types=['grad', 'grad', 'eeg', 'eeg', 'eog'],
    sfreq=100
)

custom_raw = mne.io.RawArray(data, info)
print(custom_raw)

Creating RawArray with float64 data, n_channels=5, n_times=1000
    Range : 0 ... 999 =      0.000 ...     9.990 secs
Ready.
<RawArray  |  n_channels x n_times : 5 x 1000>


## `Epochs`: epoched data

The `Epochs` object wraps a `Raw` object and exposes the underlying data as epochs. The data is represented as a 3D numpy array (epochs × channels × samples), combined with information about the events and the usual `Info` structure.

Information about the events is given by a combination of an `events` matrix and an `event_id` dictionary. The matrix has three columns that denote the event onset (in samples), the value of the status channel immediately before the event, and the event code (an integer), respectively. This matrix is usually constructed from a status channel through the [`find_events`](http://martinos.org/mne/stable/generated/mne.find_events.html#mne.find_events) function. The dictionary assigns descriptive labels to the event codes of interest and is usually specified manually.

In [81]:
# Load a dataset that contains events
raw = mne.io.RawFIF(op.join(mne.datasets.sample.data_path(), 'MEG', 'sample', 'sample_audvis_raw.fif'))

# Construct the event matrix from the status channel in the recording
events = mne.find_events(raw)

# Show the number of events (number of rows)
print('Number of events:', len(events))

# Show all unique event codes (3rd column)
print('Unique event codes:', np.unique(events[:, 2]))

# Specify event codes of interest with descriptive labels
event_id = dict(left=1, right=2)

# Expose the raw data as epochs, cut from -0.1 s to 1.0 s relative to the event onsets 
epochs = mne.Epochs(raw, events, event_id, tmin=-0.1, tmax=1)
print(epochs)

Opening raw data file /Users/rodin/BRU/mne-python/examples/MNE-sample-data/MEG/sample/sample_audvis_raw.fif...
    Read a total of 3 projection items:
        PCA-v1 (1 x 102)  idle
        PCA-v2 (1 x 102)  idle
        PCA-v3 (1 x 102)  idle
Current compensation grade : 0
    Range : 25800 ... 192599 =     42.956 ...   320.670 secs
Ready.
Adding average EEG reference projection.
1 projection items deactivated
Reading 0 ... 166799  =      0.000 ...   277.714 secs...
[done]
320 events found
Events id: [ 1  2  3  4  5 32]
Number of events: 320
Unique event codes: [ 1  2  3  4  5 32]
145 matching events found
Created an SSP operator (subspace dimension = 4)
4 projection items activated
<Epochs  |  n_events : 145 (good & bad), tmin : -0.1 (s), tmax : 1 (s), baseline : (None, 0),
 'left': 72, 'right': 73>
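
To make the epochs × channels × samples layout concrete, here is a plain numpy sketch of how windows around event onsets can be cut from a continuous data matrix. It is an illustration of the idea only, not MNE-Python's implementation, and all values (sampling rate, onsets, codes) are made up.

```python
import numpy as np

sfreq = 100  # sampling rate in Hz (illustrative)
n_channels, n_samples = 2, 1000
continuous = np.arange(n_channels * n_samples, dtype=float)
continuous = continuous.reshape(n_channels, n_samples)

# Event matrix: onset (in samples), previous status channel value, event code
events = np.array([
    [200, 0, 1],
    [400, 0, 2],
    [600, 0, 1],
])

tmin, tmax = -0.1, 1.0
start = int(round(tmin * sfreq))  # offset of the first sample (negative)
stop = int(round(tmax * sfreq))   # offset of the last sample

# Cut one window per event and stack them: epochs x channels x samples
epochs_data = np.stack([
    continuous[:, onset + start:onset + stop + 1]
    for onset in events[:, 0]
])
print(epochs_data.shape)

# Keep only the epochs that belong to event code 1
code_1 = epochs_data[events[:, 2] == 1]
print(code_1.shape)
```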


### Creating custom `Epochs` objects

To create an `Epochs` object from scratch, you can use the `EpochsArray` class, which uses a numpy array directly without wrapping a raw object.

In [86]:
# Generate some random data: 10 epochs, 5 channels, 111 samples per epoch
data = np.random.randn(10, 5, 111)

# Initialize an info structure
info = mne.create_info(
    ch_names=['MEG1', 'MEG2', 'EEG1', 'EEG2', 'EOG'],
    ch_types=['grad', 'grad', 'eeg', 'eeg', 'eog'],
    sfreq=100
)

# Create an event matrix: 10 events with alternating event codes.
# Columns: onset (in samples), previous value of the status channel, event code
events = np.array([
    [0, 1, 1],
    [1, 1, 2],
    [2, 1, 1],
    [3, 1, 2],       
    [4, 1, 1],
    [5, 1, 2],
    [6, 1, 1],
    [7, 1, 2],
    [8, 1, 1],
    [9, 1, 2],
])

# More information about the event codes: subject was either smiling or frowning
event_id = dict(smiling=1, frowning=2)

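# Trials were cut from -0.1 to 1.0 seconds; only tmin is passed, because the
# end time follows from the number of samples (111) and the sampling rate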
tmin = -0.1

# Create epochs object
custom_epochs = mne.EpochsArray(data, info, events, tmin, event_id)
print(custom_epochs)

10 matching events found
Adding average EEG reference projection.
Created an SSP operator (subspace dimension = 1)
1 projection items activated
No baseline correction applied...
No baseline correction applied...
No baseline correction applied...
No baseline correction applied...
No baseline correction applied...
No baseline correction applied...
No baseline correction applied...
No baseline correction applied...
No baseline correction applied...
No baseline correction applied...
0 bad epochs dropped
<EpochsArray  |  n_events : 10 (all good), tmin : -0.1 (s), tmax : 1.0 (s), baseline : None,
 'frowning': 5, 'smiling': 5>


## `Evoked`: evoked potential
The result of averaging epochs, known as the evoked or event-related potential, is stored in an `Evoked` object.

## `STC`: source space representation (cortex only)

The inverse solution, computed by the various source estimation algorithms, is stored as a `SourceEstimate` object, commonly abbreviated as `stc`.

## Source space (volumetric)

## Dipole fit