In [None]:
%matplotlib inline

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Processing data

This notebook shows how `H5Data.apply()` can be used with pre-defined functions to process raw array data.

In [None]:
from e11 import H5Data
from e11.tools import add_column_index
from e11.process import vrange, total

In [None]:
# read file
import os 
fil = os.path.join(os.getcwd(), 'example_data', 'array_data.h5')
h5 = H5Data(fil)

## Vrange

Here we apply the `vrange` function to measure the vertical range of the array data.
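
To make "vertical range" concrete, here is a minimal NumPy sketch. It assumes the vertical range of a trace is its maximum minus its minimum, taken along each row. The helper `vertical_range` is hypothetical and stands in for `vrange` for illustration only; the actual implementation in `e11.process` may differ (for example in its return type).

```python
import numpy as np

def vertical_range(arr, axis=1):
    """Max minus min along `axis` (hypothetical stand-in for `vrange`)."""
    arr = np.asarray(arr)
    return arr.max(axis=axis) - arr.min(axis=axis)

# two synthetic traces: rows are repeats, columns are samples
traces = np.array([[0.0, 1.0, -2.0, 0.5],
                   [3.0, 3.0, 3.0, 3.0]])
ranges = vertical_range(traces)  # one value per row
```

A flat trace gives a range of zero, so the result is a simple measure of signal amplitude.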

In [None]:
rng, info = h5.apply(vrange, 'OSC_0', h5.squids, info=True)
rng.head()

In [None]:
rng.describe()

In [None]:
# information about the processing
info

In [None]:
# plot
rng.reset_index().plot(subplots=True)

#output
plt.show()

## Total 

Similarly, we can apply the `total` function to measure the sum of array data inside a given window.
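
The windowed sum can be sketched with plain NumPy slicing. This sketch assumes `window` is a half-open `(start, stop)` pair of sample indices along each row; the helper `window_total` is hypothetical, and the convention used by `total` in `e11.process` may differ.

```python
import numpy as np

def window_total(arr, window):
    """Sum each row over columns start..stop-1 (hypothetical stand-in for `total`)."""
    start, stop = window
    return np.asarray(arr)[:, start:stop].sum(axis=1)

# two synthetic traces with six samples each
traces = np.arange(12).reshape(2, 6)
totals = window_total(traces, window=(1, 4))  # sums columns 1, 2 and 3
```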

In [None]:
tot = h5.apply(total, 'OSC_0', h5.squids, window=(100, 300))
tot.head()

In [None]:
# plot
tot.reset_index().plot(subplots=True)

#output
plt.show()

## Direct processing

These functions can also be applied to array data directly (i.e., without using `H5Data.apply()`).

Reading data from disk is usually the slowest part of processing. Direct processing may therefore be the most efficient technique if the same data needs to be processed multiple times, e.g., for optimisation.
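
As an illustration of this load-once, process-many pattern, the sketch below scans several candidate windows over an array that is already in memory. The data is synthetic and the candidate windows are arbitrary; in practice `dat` would come from a single read of the file.

```python
import numpy as np

# synthetic stand-in for array data that has been read from disk once
generator = np.random.default_rng(0)
dat = generator.normal(size=(10, 500))
dat[:, 200:250] += 1.0  # add a signal in samples 200..249

# re-process the same in-memory array for each candidate window
candidates = [(0, 50), (100, 150), (200, 250), (400, 450)]
mean_totals = {w: dat[:, w[0]:w[1]].sum(axis=1).mean() for w in candidates}

# pick the window with the largest mean summed signal
best = max(mean_totals, key=mean_totals.get)
```

Each pass over `candidates` reuses `dat`, so the cost of reading the file is paid only once.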

The disadvantage is that squid information is lost when the data is concatenated, so this is probably only a good option for known subsets of the data.
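
The loss of squid information can be seen in a small NumPy sketch. The per-squid arrays here are synthetic, and keeping a parallel label array is only one possible workaround, not something the `e11` package is known to provide.

```python
import numpy as np

# hypothetical per-squid arrays: rows are repeats, columns are samples
per_squid = {3: np.zeros((2, 5)), 4: np.ones((3, 5))}

# concatenating along the first axis discards which rows came from which squid
dat = np.concatenate(list(per_squid.values()), axis=0)

# one workaround: build a label array with one squid value per row
counts = [a.shape[0] for a in per_squid.values()]
labels = np.repeat(list(per_squid.keys()), counts)
```

Here `dat` has five rows and `labels` records the squid for each of them, so results computed on `dat` can still be grouped by squid afterwards.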

In [None]:
dat = h5.array('OSC_0', squids=[3, 4])
df = vrange(dat)
df.head()

In [None]:
# plot
df.plot()

#output
plt.show()