# <p align="center">Introduction to
![Neo](./neo_material/neo_logo.png)
<p align="center">Representing electrophysiology data in Python
Based on tutorial material by Julia Sprenger

## Data sources in electrophysiology

![Data sources](./neo_material/ephys_data_sources.png)


## Data modalities in electrophysiology

![Data modalities](./neo_material/ephys_data_modalities.png)

...and many more file formats to capture those

**Goal:** Treat electrophysiology and other data types in a common framework to
- allow analyses across modalities
- facilitate reproducible analysis
- simplify scientific workflows

...without inventing yet another file format standard?

##  Neo - Generic representation of common ephys modalities
- [Neo on PyPI](https://pypi.org/project/neo/) & [GitHub](https://github.com/NeuralEnsemble/python-neo)
- [Neo documentation](https://neo.readthedocs.io)
- Standardized representation of ephys data during runtime
- Efficient handling of large data arrays thanks to `numpy`
- Description of physical units and unit conversion thanks to `quantities`

In [1]:
import neo
import numpy as np
import quantities as pq

## Data Classes
- `AnalogSignal`: Continuous data sampled in **regular** intervals

*essential metadata*: physical unit of samples, time stamps of the samples (first timestamp & sampling interval)

In [2]:
anasig = neo.AnalogSignal(np.zeros((50,2)), units='uV',
                sampling_rate=10000*pq.Hz,
                t_start=120*pq.ms)

Note the dimensions of an `AnalogSignal` object: (`time`, `channel`)

#### Accessing metadata

In [3]:
print(f'units: {anasig.units}')
print(f'sampling_rate: {anasig.sampling_rate}')
print(f'sampling_period: {anasig.sampling_period.simplified}')
print(f't_start & t_stop: {anasig.t_start, anasig.t_stop}')
print(f'times: {anasig.times}')

units: 1.0 uV
sampling_rate: 10000.0 Hz
sampling_period: 0.0001 s
t_start & t_stop: (array(120.) * ms, array(125.) * ms)
times: [120.  120.1 120.2 120.3 120.4 120.5 120.6 120.7 120.8 120.9 121.  121.1
 121.2 121.3 121.4 121.5 121.6 121.7 121.8 121.9 122.  122.1 122.2 122.3
 122.4 122.5 122.6 122.7 122.8 122.9 123.  123.1 123.2 123.3 123.4 123.5
 123.6 123.7 123.8 123.9 124.  124.1 124.2 124.3 124.4 124.5 124.6 124.7
 124.8 124.9] ms



![AnalogSignal](./neo_material/base_schematic_0.svg)


## Data Classes
- `AnalogSignal`: Continuous data sampled in **regular** intervals
- `IrregularlySampledSignal`: Continuous data sampled in **irregular** intervals
- `ImageSequence`: Continuous 2D **frames** sampled in regular intervals

![ImageSequence](./neo_material/base_schematic_2.svg)

## Data Classes
- `AnalogSignal`: Continuous data sampled in **regular** intervals
- `IrregularlySampledSignal`: Continuous data sampled in **irregular** intervals
- `ImageSequence`: Continuous 2D **image frames** sampled in regular intervals
- `SpikeTrain`: Time point data (& optional waveform snippet)

*essential metadata*: time values, physical units of times, (& waveform sampling rate, waveform offset to corresponding time value)

In [4]:
st = neo.SpikeTrain([1, 4, 5.7], units='ms', name='#001', t_start=0*pq.ms, t_stop=300*pq.ms)
print(f'spiketrain: {st}')
print(f't_start & t_stop: {st.t_start, st.t_stop}')

spiketrain: [1.  4.  5.7] ms
t_start & t_stop: (array(0.) * ms, array(300.) * ms)


![SpikeTrain](./neo_material/base_schematic_3.svg)

## Data Classes
- `AnalogSignal`: Continuous data sampled in **regular** intervals
- `IrregularlySampledSignal`: Continuous data sampled in **irregular** intervals
- `ImageSequence`: Continuous 2D **image frames** sampled in regular intervals
- `SpikeTrain`: Time point data (& optional waveform snippet)
- `Event`: Experiment reference time points (e.g. trigger, trial start, ...)
- `Epoch`: Experiment reference time ranges (e.g. trial, stimulation, ...)

![Epoch](./neo_material/base_schematic_5.svg)

### Additional metadata attributes
- human readable label of objects via `name` attribute
- custom metadata annotations via `annotations` and `array_annotations` attributes (set with `.annotate()` and `.array_annotate()`)
- `Event` and `Epoch` provide `labels` to name each time point / time period

In [5]:
anasig.name = 'electrode 1A'
anasig.annotate(signal_quality='good')
anasig.array_annotate(channel_id=[1,2])
print(f'name: {anasig.name}')
print(f'annotations: {anasig.annotations}')
print(f'number of channels: {anasig.shape[-1]}')
print(f'array_annotations: {anasig.array_annotations}')

name: electrode 1A
annotations: {'signal_quality': 'good'}
number of channels: 2
array_annotations: {'channel_id': array([1, 2])}


### Neo objects provide utility functions
Some useful utility attributes and methods of Neo data objects are
- `.times` to get array of corresponding time values
- `.time_slice()` to crop to a specific time range
- `.merge()` to combine multiple objects of the same type
- `.concatenate()` to append multiple signal objects
- `.downsample()` to create a new signal with a lower sampling rate
- `.magnitude` to strip all neo features and extract the underlying numpy array
- `.units` to extract the physical unit of the signal
- `.dimensionality.latex` to extract a string representation of the physical unit
- check out Python's built-in `help()` in Jupyter to discover more in the documentation!

## Relations between data objects - Container Classes
- `ChannelView`: select a subset of channels of a signal, e.g. all even channels of an `AnalogSignal`
- **`Segment`**: contains data objects with a shared clock, e.g. a trial
- `Group`: groups data objects logically (no common clock required, e.g. `SpikeTrain`s of a neuronal unit)
- **`Block`**: contains all objects of a recording

![Block](./neo_material/base_schematic.svg)

### Neo structure
```
*Neo objects*                        *Object access*

Block 0                              block
    Segment 0                        block.segments[0]
        AnalogSignal 0               block.segments[0].analogsignals[0]
        AnalogSignal 1               block.segments[0].analogsignals[1]
        SpikeTrain 0                 block.segments[0].spiketrains[0]
        SpikeTrain 1                 block.segments[0].spiketrains[1]
        SpikeTrain 2                 block.segments[0].spiketrains[2]
 
    Segment 1                        block.segments[1]
        AnalogSignal 2               block.segments[1].analogsignals[0]
        AnalogSignal 3               block.segments[1].analogsignals[1]
        SpikeTrain 3                 block.segments[1].spiketrains[0]
        SpikeTrain 4                 block.segments[1].spiketrains[1]
        SpikeTrain 5                 block.segments[1].spiketrains[2]

    Group 0                          block.groups[0]
        AnalogSignal 0               block.groups[0].analogsignals[0]
        AnalogSignal 2               block.groups[0].analogsignals[1]
        SpikeTrain 0                 block.groups[0].spiketrains[0]
        SpikeTrain 3                 block.groups[0].spiketrains[1]
        
```

### Neo Class Overview
![neo_uml](./neo_material/simple_generated_diagram_with_channelview.svg)

# Automatic generation of Neo objects
<img src="./neo_material/IODiagram.svg" alt="IODiagram" style="width: 600px;"/>

### Loading a recording session
- generation of a complete neo structure requires only **2 lines of code** and the name of the original recording system.

In [6]:
recording_folder = './openephys_example/OpenEphys_SampleData_1'
io = neo.io.OpenEphysIO(recording_folder)
block = io.read_block()
block

Block with 1 segments, 1 groups
annotations: {'openephys_version': '0.4'}
file_origin: './openephys_example/OpenEphys_SampleData_1'
# segments (N=1)
0: Segment with 1 analogsignals, 1 events, 1 spiketrains
   annotations: {'openephys_version': '0.4',
     'date_created': "'3-Oct-2018 131650'",
     'openephys_segment_index': 1}
   # analogsignals (N=1)
   0: AnalogSignal with 2 channels of length 423936; units uV; datatype float32 
      name: 'Signals CH'
      annotations: {'stream_id': 'CH'}
      sampling rate: 40000.0 Hz
      time: 1.2992 s to 11.8976 s

### Accessing data objects
- child objects can be accessed via the corresponding attribute, named after the class in lowercase plural, e.g. `.segments`, `.analogsignals`, `.spiketrains`
- child objects are stored in *lists*

In [7]:
# accessing the list of Segments
block.segments
# accessing the first AnalogSignal of a single Segment
segment = block.segments[0]
segment.analogsignals[0]

AnalogSignal with 2 channels of length 423936; units uV; datatype float32 
name: 'Signals CH'
annotations: {'stream_id': 'CH'}
sampling rate: 40000.0 Hz
time: 1.2992 s to 11.8976 s

Inspecting a spiketrain object

In [8]:
spiketrain = segment.spiketrains[0]
spiketrain

SpikeTrain name: 'STp106.0n0#0' annotations: {'id': 'STp106.0n0#0'}

In [9]:
print(f'Number of spikes: {len(spiketrain)}')
spiketrain.times[:20]

Number of spikes: 454


array([1.30545 , 1.403875, 1.415325, 1.422425, 1.428775, 1.45465 ,
       1.4658  , 1.5147  , 1.51905 , 1.521725, 1.546   , 1.563975,
       1.566525, 1.568325, 1.573275, 1.58005 , 1.588125, 1.59505 ,
       1.599125, 1.654425]) * s

### Dealing with large datasets
How to load only required data

- some IOs are based on the `RawIO` concept for efficient reading of data
- `RawIO`s require additional symmetries in the dataset for efficient loading
- *lazy* data objects (ProxyObjects) can be loaded using `io.read_block(`**`lazy=True`**`)`
- ProxyObjects provide a `.load(time_slice=(t_start, t_stop))` method that loads the requested data into memory and returns a complete Neo data object.

In [10]:
recording_folder = './openephys_example/OpenEphys_SampleData_1'
io = neo.io.OpenEphysIO(recording_folder)
block = io.read_block(lazy=True)
lazy_anasig = block.segments[0].analogsignals[0]
lazy_anasig

AnalogSignalProxy name: 'Signals CH' annotations: {'stream_id': 'CH'}

#### Loading data from a Proxy object
- ProxyObjects contain metadata and shape information


In [11]:
print(f'signal shape: {lazy_anasig.shape}')
print(f'signal sampling rate: {lazy_anasig.sampling_rate}')
print(f'signal annotations: {lazy_anasig.annotations}')

signal shape: (423936, 2)
signal sampling rate: 40000.0 Hz
signal annotations: {'stream_id': 'CH'}


- data of a specific channel and time range can be loaded selectively into a new neo object

In [12]:
anasig = lazy_anasig.load(time_slice=(5*pq.s,6*pq.s), channel_indexes=[0])

In [13]:
print(f'signal shape: {anasig.shape}')
print(f'signal sampling rate: {anasig.sampling_rate}')
print(f'signal time points: {anasig.times}')
print(f'signal values: {anasig.magnitude}')

signal shape: (40000, 1)
signal sampling rate: 40000.0 Hz
signal time points: [5.       5.000025 5.00005  ... 5.999925 5.99995  5.999975] s
signal values: [[13.45]
 [13.1 ]
 [10.8 ]
 ...
 [12.3 ]
 [13.05]
 [14.15]]


### How to save data?

- selected open formats are supported for writing
 - **NIX**<sup>1</sup>
 - NWB<sup>1</sup>
 - MATLAB<sup>2</sup>
 - ASCII<sup>2</sup>
 - NumPy Pickle<sup>3</sup>
 
1. Support of neo-compatible files
2. Does not capture complete set of metadata
3. Strong dependency on NumPy and Neo version

In [14]:
filename = 'my_first_neo_dataset.nix'
# mode 'ow' (overwrite) replaces the file if it already exists
with neo.io.NixIO(filename, 'ow') as io:
    io.write_block(block)

### Why two NixIOs in Neo?
**`neo.io.NixIO`**
- non-RawIO implementation
- can read and write arbitrary Neo object structures

`neo.io.NixIOFr`
- RawIO implementation
- can very efficiently read specific Neo-Nix files that contain symmetric objects

# Questions?



Keep in mind these overviews to help you navigate Neo structures.

<table><tr>
<td> <img src="./neo_material/simple_generated_diagram_with_channelview.svg" width="500"/> </td>
<td> <img src="./neo_material/base_schematic.svg" width="500"/> </td>
</tr></table>