McsPyDataTools Tutorial for CMOS MEA file format<a id='Top'></a>
=======================

- ### <a href='#McsPy'>The McsPy module for Mcs CMOS MEA file handling</a>
- ### <a href='#Mcs-HDF5'>Structure of the Mcs-HDF5 CMOS MEA files</a>

- ### <a href='#McsData Module'>McsData Classes and Inheritance</a>
-------------------------------------------------------------------------------------------
  
- <a href='#Accessing your Data with McsData'>Accessing your Data with McsData</a>
----------------------------------------------------------------------------------
 - ### <a href='#Req'>Requirements</a>
 - ### <a href='#naming'>Naming</a>
 - ### <a href='#acquisition'>Acquisition</a>
 - ### <a href='#channelStream'>ChannelStream</a>
 - ### <a href='#sensorStream'>SensorStream</a>
 - ### <a href='#eventStream'>EventStream</a>

The McsPy module for Mcs CMOS MEA file handling<a id='McsPy'></a>
---------------------------------------------------
With the ```h5py``` package, a powerful tool for accessing HDF5 files in Python already exists. This toolbox builds upon h5py by subclassing its central ```h5py.Group``` and ```h5py.Dataset``` classes as ```McsGroup``` and ```McsDataset```, respectively. Thus the McsPy classes feature all attributes and methods you might be used to from working with ```h5py```, and simply extend them with MCS-specific features. So if you are new to HDF5 in Python, you can always refer to the h5py documentation and discussions. Likewise, if you have worked with ```h5py``` previously, you will find yourself in a familiar environment.
If you prefer to work with h5py functionality at any point in your analysis, feel free to retrieve the underlying h5py object from the McsPy object's ```.h5py_object``` attribute:

```python
    h5py_object = self.h5py_object
```

Structure of the HDF5 MCS-CMOS-MEA file system<a id='Mcs-HDF5'></a>
---------------------------------------------------

An MCS-CMOS-MEA file system typically consists of two separate files: an HDF5 MCS-CMOS-MEA RawData (RD) file and an HDF5 MCS-CMOS-MEA ProcessedData (PD) file.
The RD file holds all raw data generated in a CMOS MEA experiment, i.e. mainly different MCS data streams. The corresponding file extension is '.cmcr'.
The PD file contains all data generated by post-processing raw data with MCS software. Results lie in their respective subgroups Filter Tool, STA Explorer, Spike Explorer, and Spike Sorter. Furthermore, MCS tools make use of HDF5's ability to mount HDF5 files into each other: the PD file mounts the RD file into the "Acquisition" subgroup of its own hierarchy tree. Thus, provided the link in the PD file correctly points to the RD file, this toolbox provides a set of intuitive access tools for both RD and PD via just the PD file. The corresponding file extension is '.cmtr'.
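
Since the two file types are told apart by their extension, a quick check before loading can catch a wrong path early. The helper below is only a minimal sketch built on the standard library; `classify_mcs_file` and `MCS_FILE_KINDS` are hypothetical names and not part of McsPy.

```python
from pathlib import Path

# Extensions as given in the text above; the labels are plain descriptive strings.
MCS_FILE_KINDS = {
    ".cmcr": "RawData (RD)",
    ".cmtr": "ProcessedData (PD)",
}


def classify_mcs_file(path):
    """Return a label for the file kind based on its extension (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    if suffix not in MCS_FILE_KINDS:
        raise ValueError(f"Not a recognised MCS-CMOS-MEA extension: {suffix!r}")
    return MCS_FILE_KINDS[suffix]


print(classify_mcs_file("recording.cmcr"))
print(classify_mcs_file("recording.cmtr"))
```

The check looks only at the file name. It does not open the file or verify that a PD file's link to its RD file is valid.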

As the McsPy toolbox works with the underlying, strictly hierarchical HDF5 structure, the origin of every data exploration with the McsPy toolbox is the ```McsData``` object. As the docstrings of the class already imply, this class was designed to hold the information of a complete HDF5 MCS-CMOS-MEA file system.

```python
    data = McsData('path to your data')
```

We highly recommend the supplementary use of the HDF Group's **HDFView** software to help visualize and understand the structure of HDF5 files. This can make accessing the data **MUCH** easier.

<a href='#Top'>Back to index</a>

McsData Classes and Inheritance <a id='McsData Module'></a>
---------------------------------------------------------------------------------------

Generally, the McsPy toolbox creates a structure that, upon navigation through the file, reflects the HDF5 MCS-CMOS-MEA file structure. For example, just as the HDF5 MCS-CMOS-MEA file system holds raw data in the "Acquisition" subgroup, the McsData object has an attribute ```data.Acquisition```. Therefore, you can refer to the following graphical representation of the HDF5 MCS-CMOS-MEA file hierarchy for easy navigation through the Python objects.

Note: The subgroups of the root will not be accessible if they do not exist in the loaded MCS-CMOS-MEA file system.

<a id='file_structure_graphic'></a>
<img src="./Cmos_Hierarchy_short.png">

Upon initialization with the path to your data

```python
    data = McsData('path to your data')
```

member methods of this class will check whether the provided file meets the version requirements for further processing. This is necessary because not only may the way MCS programs handle HDF5-formatted files change, but the file format itself can undergo changes.

Afterwards, all information about the data stored in the file is retrieved from the HDF5 attributes, decoded, and saved as a dictionary in the attribute ```.attributes```. The ```.attributes``` dictionary is created for every McsPy object.

```python
    data.attributes
```

An access request on one of the subgroups or datasets (i.e. Acquisition, Filter Tool, STA Explorer, Spike Explorer, Spike Sorter) readies the respective data, e.g.

```python
    acquisition_data = data.Acquisition
```

This instantiates a new McsPy object holding all data about the requested subgroup.





<a href='#Top'>Back to index</a>

## Accessing your Data with McsData<a id='Accessing your Data with McsData'></a>

Now that the general structure of an HDF5 file and the McsPy package with its McsData class is clear, we can walk through some quick and easy examples of how to access and visualize your data.


Navigation through McsPy objects implements two central concepts:
1. Groups work like dictionaries or classes.
2. Datasets work like numpy arrays.
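
To make these two concepts concrete without needing a data file, here is a toy sketch. `ToyGroup` is a hypothetical stand-in and not the McsPy implementation. It only shows what it means for a child to be reachable both by key and by attribute, with a plain list standing in for a sliceable dataset.

```python
class ToyGroup:
    """Toy stand-in for a group; NOT the McsPy implementation."""

    def __init__(self, **children):
        self._children = dict(children)

    def __getitem__(self, key):
        # Dictionary-style access: group["name"]
        return self._children[key]

    def __getattr__(self, name):
        # Attribute-style access: group.name
        # (__getattr__ is only called when normal attribute lookup fails.)
        try:
            return self.__dict__["_children"][name]
        except KeyError:
            raise AttributeError(name) from None

    def keys(self):
        return list(self._children)


samples = [3, 1, 4, 1, 5, 9]  # a plain list standing in for a dataset
root = ToyGroup(Acquisition=ToyGroup(ChannelData_1=samples))

same_object = root["Acquisition"] is root.Acquisition
first_three = root.Acquisition.ChannelData_1[0:3]
print(same_object, first_three)
```

A real dataset additionally supports multi-dimensional slicing, which a plain list does not.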

### Naming<a id='naming'></a>

Wherever possible McsPy names instances and attributes as found in the MCS-CMOS-MEA file system. However, some systematic substitutions in group, dataset, and attribute naming are necessary to ensure python compatibility:


| **HDF5 MCS-CMOS-MEA** | **Mcs python toolbox**     |
|-----------------------|----------------------------|
| whitespace            | _                          |
| .                     | _                          |
| ,                     | _                          |
| @                     | at                         |
| (                     | (character removed)        |
| )                     | (character removed)        |
| :                     | (character removed)        |
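
The substitutions in the table can be sketched as a small function. This is an illustration of the mapping only, under the assumption that each character is replaced independently; `to_python_name` is a hypothetical helper and not necessarily how McsPy implements the renaming internally.

```python
# Substitutions taken from the table above ("whitespace" is handled separately below).
SUBSTITUTIONS = {
    ".": "_",
    ",": "_",
    "@": "at",
    "(": "",
    ")": "",
    ":": "",
}


def to_python_name(hdf5_name):
    """Apply the table's substitutions character by character (hypothetical helper)."""
    return "".join(
        "_" if ch.isspace() else SUBSTITUTIONS.get(ch, ch)
        for ch in hdf5_name
    )


for name in ("Filter Tool", "STA Explorer", "Spike Sorter"):
    print(name, "->", to_python_name(name))
```

Under this mapping, a group named "Filter Tool" would be reachable as `Filter_Tool`.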

### Requirements <a id='Req'></a>

So let's dig in and get a feel for working with MCS data.

First some modules need to be imported:

In [None]:
# These are the imports of the McsData module
import sys, importlib
sys.path.append('D:\\Programming\\McsDataManagement\\McsPyDataTools\\McsPyDataTools')
import McsPy
import McsPy.McsCMOSMEA

#import McsPy.McsData
#import McsPy.McsCMOS
#from McsPy import ureg, Q_

# matplotlib.pyplot will be used in these examples to generate the plots visualizing the data
import matplotlib.pyplot as plt
from matplotlib.figure import Figure
from matplotlib.widgets import Slider, AxesWidget
# These adjustments only need to be made so that the plot gets displayed inside the notebook
%matplotlib inline
# %config InlineBackend.figure_formats = {'png', 'retina'}

# numpy is numpy ...
import numpy as np

# bokeh adds more interactivity to the plots within notebooks. Adds toolbar at the top-right corner of the plot.
# Allows zooming, panning and saving of the plot
import bokeh.io
#import bokeh.mpl
import bokeh.plotting

#import widgets
from ipywidgets import *

Other Python applications running in the background can sometimes interfere with this notebook. To make sure that all plots are created correctly, it is best to exit any other Python-related processes.

Next we need to access the data file by initializing an instance of the McsData class from the McsData module with the path to the file:

In [None]:
cmosmea_multi_talent = McsPy.McsCMOSMEA.McsData('..\\TestData\\V200-SensorRoi-3Aux-Dig-Stim2-DiginEvts-5kHz.cmcr')

The relative file path points to the folder TestData, which sits next to the folder in which this notebook resides (`..` refers to the parent folder).

**Note**: The ```McsData``` call determines the type of MCS HDF5 file system you have passed and intentionally returns an instance of an appropriate class. So do not be confused if the returned object is not an instance of ```McsData```, as you might have expected.

To check whether we got access to the file, we can simply print the object. This gives a rough overview of the contents of the CMOS MEA RawData or CMOS MEA ProcessedData file. In general, all McsGroup objects provide some information about themselves and a table of all subgroups or datasets (which you can check against the graphic) upon printing. The table also provides the McsPy access name of all contents.

In [None]:
print(cmosmea_multi_talent)

The root object holds information about the **Date** of the recording, the **Program** which was used as well as its **Version**.

Secondly, feel free to browse the ```self.attributes``` on any instance of ```McsGroup``` or ```McsDataset``` for more detailed information. And finally, you can call ```McsGroup.tree(self)``` for an 'indent-tree' of the current HDF5 Group and all its descendants. Feel free to check the output against the graphic <a href='#file_structure_graphic'>above</a>.

**Note**: As file trees may become very large very quickly, you are advised to print the instance rather than the tree once you are familiar with the MCS HDF5 file system.

In [None]:
print(cmosmea_multi_talent.tree())
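
If you have not seen an 'indent-tree' before, the toy sketch below shows the general idea on a nested dictionary. It does not reproduce the exact output format of ```tree()```. `indent_tree` and `toy_file` are hypothetical, and the entries are just names that appear elsewhere in this tutorial.

```python
def indent_tree(node, name="/", depth=0):
    """Return the lines of an indented tree for a nested dict (toy sketch)."""
    lines = ["    " * depth + name]
    # A dict stands for a group; anything else is treated as a leaf (dataset).
    if isinstance(node, dict):
        for child_name, child in node.items():
            lines.extend(indent_tree(child, child_name, depth + 1))
    return lines


toy_file = {
    "Acquisition": {
        "STG_Waveform": {"ChannelData_1": None, "ChannelMeta": None},
        "Sensor_Data": {"SensorData_1_1": None},
    }
}

tree_lines = indent_tree(toy_file)
print("\n".join(tree_lines))
```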

<a href='#Top'>Back to index</a>

### Accessing Acquisition Data<a id='acquisition'></a>

From the table we see that we can access the raw data, which is stored in the 'Acquisition' group of the MCS HDF5 file, by simply accessing the ```Acquisition``` attribute on our ```cmosmea_multi_talent``` instance. Let's go ahead and get a glimpse of the raw data streams in the file.

In [None]:
print(cmosmea_multi_talent.Acquisition)

#### Channel Streams<a id='channelStream'></a>
We can navigate further and start working with some channel data.

In [None]:
print(cmosmea_multi_talent.Acquisition.STG_Waveform)

The Channel Stream object printed above contains the two datasets ```ChannelData_1``` and ```ChannelMeta```. The objects we obtain upon access are not arrays, but instances of subclasses of ```h5py.Dataset```. However, an ```h5py.Dataset``` can be accessed, sliced, and manipulated just like a numpy array: it has a shape, a size, and a data type. For more information on working with Datasets please refer to the h5py <a href='https://readthedocs.org/projects/h5py/'>documentation</a>.

In [None]:
print('Shape:'.ljust(10)+str(cmosmea_multi_talent.Acquisition.STG_Waveform.ChannelData_1.shape))
print('Size:'.ljust(10)+str(cmosmea_multi_talent.Acquisition.STG_Waveform.ChannelData_1.size))
print('Type:'.ljust(10)+str(cmosmea_multi_talent.Acquisition.STG_Waveform.ChannelData_1.dtype))

Now, let's go ahead and visualize the signal of a channel we recorded:

In [None]:
channeldata_1 = cmosmea_multi_talent.Acquisition.STG_Waveform.ChannelData_1
exponent = cmosmea_multi_talent.Acquisition.STG_Waveform.ChannelData_1.Meta['Exponent'][0]
channel_id = cmosmea_multi_talent.Acquisition.STG_Waveform.ChannelData_1.Meta['ChannelID'][0]

plt.figure(figsize=(0.681*20, 0.328*20))
plt.plot(channeldata_1[0,2000:3000])
plt.ylabel('Voltage [Ve'+str(exponent)+']')
plt.xlabel('Time in timesteps')
plt.title('Signal recorded by Channel '+str(channel_id), fontweight='bold')
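
The plot above uses sample indices on the x-axis. If you prefer seconds, the indices can be divided by the sampling rate. The sketch below assumes a rate of 5 kHz, which is only a guess based on the "5kHz" in the example file name. Check the actual rate of the stream before relying on it. `sample_times` is a hypothetical helper.

```python
# ASSUMPTION: 5 kHz, guessed from the "5kHz" in the example file name.
SAMPLING_RATE_HZ = 5000


def sample_times(start, stop, rate_hz=SAMPLING_RATE_HZ):
    """Return the time in seconds of each sample index in range(start, stop)."""
    return [index / rate_hz for index in range(start, stop)]


times = sample_times(2000, 3000)
print(times[0], times[-1], len(times))
```

Passing `times` as the first argument to `plt.plot`, followed by the same data slice, would then give a time axis in seconds.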

#### Sensor Streams<a id='sensorStream'></a>
Alternatively, we can investigate a Sensor Stream.

In [None]:
print(cmosmea_multi_talent.Acquisition.Sensor_Data)

We take our sensor data and create a simple slider to step through the block of sensor data in time:

In [None]:
%matplotlib notebook

images = cmosmea_multi_talent.Acquisition.Sensor_Data.SensorData_1_1
num_of_images = images.shape[0]-1

fig = plt.figure(figsize=(0.681*10, 0.328*10))
ax = fig.add_subplot(1,1,1)
fig.show()

def updateFrame(Frame):
    ax.imshow(images[Frame,::], cmap="gray")
              
interact(updateFrame, Frame=widgets.IntSlider(min=0,max=num_of_images,step=1,value=0))

<a href='#Top'>Back to index</a>

#### EventStream<a id='eventStream'></a>

An EventStream can hold a wide array of events predefined by the user, ranging from the beginning, end, or duration of a treatment to periodically recurring stimuli.

<a href='#Top'>Back to index</a>