# Pipeline-based processing in pypillometry

In [1]:
import sys
sys.path.insert(0,"..")
import pypillometry as pp

`pypillometry` implements a pipeline-like approach in which each operation executed on a `PupilData` object returns a (modified) copy of that object. This enables the "chaining" of commands as follows:

In [2]:
d=pp.PupilData.from_file("../data/test.pd")\
    .blinks_detect()\
    .blinks_merge()\
    .lowpass_filter(3)\
    .downsample(50)

This command loads a data file (`test.pd`), detects blinks in the signal, merges short, successive blinks together, applies a 3 Hz low-pass filter and finally downsamples the signal to 50 Hz. The final result of this processing pipeline is stored in the object `d`.

Here, for better readability, we put each operation on a separate line. For that to work, we need to tell Python that the statement continues on the next line, which we achieve by putting a backslash `\` at the end of each (non-final) line.
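As an alternative to backslashes, Python also continues a statement across lines when the whole expression is wrapped in parentheses. The sketch below illustrates this together with the copy-returning pattern described above. It uses a small hypothetical class, `ToySignal`, with made-up methods `scale()` and `shift()`; none of these names are part of `pypillometry`, they only stand in for a `PupilData`-like object.

```python
import copy

class ToySignal:
    """Hypothetical stand-in for a PupilData-like object (not part of pypillometry)."""

    def __init__(self, values):
        self.values = list(values)
        self.history = []

    def _copy_with(self, values, step):
        # Every operation works on a deep copy, so the original stays untouched.
        new = copy.deepcopy(self)
        new.values = values
        new.history.append(step)
        return new

    def scale(self, factor):
        return self._copy_with([v * factor for v in self.values], f"scale({factor})")

    def shift(self, offset):
        return self._copy_with([v + offset for v in self.values], f"shift({offset})")

raw = ToySignal([1, 2, 3])

# Parentheses let the chain span several lines without trailing backslashes.
result = (
    raw
    .scale(2)
    .shift(1)
)

print(result.values)   # [3, 5, 7]
print(result.history)  # ['scale(2)', 'shift(1)']
print(raw.values)      # [1, 2, 3] (the original object is unchanged)
```

Because each method returns a new object, intermediate results can be kept in their own variables and compared at any point in the chain.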

We can get a useful summary of the dataset and the operations applied to it by simply printing it:

In [3]:
print(d)

PupilData(test_ro_ka_si_hu_wa_nu_fa_we, 331.3KiB):
 n                 : 6001
 nmiss             : 117.2
 perc_miss         : 1.9530078320279955
 nevents           : 56
 nblinks           : 24
 ninterpolated     : 0.0
 blinks_per_min    : 11.998000333277787
 fs                : 50
 duration_minutes  : 2.0003333333333333
 start_min         : 4.00015
 end_min           : 6.0
 baseline_estimated: False
 response_estimated: False
 History:
 *
 └ reset_time()
  └ blinks_detect()
   └ sub_slice(4,6,units=min)
    └ drop_original()
     └ blinks_detect()
      └ blinks_merge()
       └ lowpass_filter(3)
        └ downsample(50)



We see that the sampling rate, the number of datapoints and more are printed automatically, along with the history of all operations applied to the dataset. This information can also be retrieved separately, in a form suitable for further processing, using the `summary()` function and the `history` attribute.

In [4]:
d.summary()

{'name': 'test_ro_ka_si_hu_wa_nu_fa_we',
 'n': 6001,
 'nmiss': 117.2,
 'perc_miss': 1.9530078320279955,
 'nevents': 56,
 'nblinks': 24,
 'ninterpolated': 0.0,
 'blinks_per_min': 11.998000333277787,
 'fs': 50,
 'duration_minutes': 2.0003333333333333,
 'start_min': 4.00015,
 'end_min': 6.0,
 'baseline_estimated': False,
 'response_estimated': False}
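Since the summary is a plain dictionary, it can be used directly in ordinary Python code. The following sketch copies a few of the values shown above into a literal dictionary (so that it runs without the data file) and performs a simple data-quality check. The 5% threshold and the variable name `max_missing_percent` are assumptions chosen for illustration, not a recommendation from `pypillometry`.

```python
# A subset of the values printed by d.summary() above, copied by hand.
summary = {
    "name": "test_ro_ka_si_hu_wa_nu_fa_we",
    "n": 6001,
    "nmiss": 117.2,
    "perc_miss": 1.9530078320279955,
    "nblinks": 24,
    "fs": 50,
}

# Hypothetical quality criterion: accept datasets with at most 5% missing samples.
max_missing_percent = 5.0
usable = summary["perc_miss"] <= max_missing_percent

# perc_miss is the share of missing samples expressed in percent.
recomputed = summary["nmiss"] / summary["n"] * 100

print(f"{summary['name']}: {summary['perc_miss']:.2f}% missing, usable={usable}")
```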

In [5]:
d.history

[{'funcstring': 'reset_time()',
  'funcname': 'reset_time',
  'args': (),
  'kwargs': {}},
 {'funcstring': 'blinks_detect()',
  'funcname': 'blinks_detect',
  'args': (),
  'kwargs': {}},
 {'funcstring': 'sub_slice(4,6,units=min)',
  'funcname': 'sub_slice',
  'args': (4, 6),
  'kwargs': {'units': 'min'}},
 {'funcstring': 'drop_original()',
  'funcname': 'drop_original',
  'args': (),
  'kwargs': {}},
 {'funcstring': 'blinks_detect()',
  'funcname': 'blinks_detect',
  'args': (),
  'kwargs': {}},
 {'funcstring': 'blinks_merge()',
  'funcname': 'blinks_merge',
  'args': (),
  'kwargs': {}},
 {'funcstring': 'lowpass_filter(3)',
  'funcname': 'lowpass_filter',
  'args': (3,),
  'kwargs': {}},
 {'funcstring': 'downsample(50)',
  'funcname': 'downsample',
  'args': (50,),
  'kwargs': {}}]

In [6]:
dd=pp.create_fake_pupildata(ntrials=10)

In [7]:
d.apply_history(dd)

IndexError: boolean index did not match indexed array along dimension 0; dimension is 0 but corresponding boolean dimension is 1

In [13]:
dd=dd.downsample(100).lowpass_filter(10)

In [17]:
dd.apply_history(dd)

PupilData(fake_pihodolo_fi_de_lu_re, 1.5MiB):
 n                 : 2415
 nmiss             : 0.0
 perc_miss         : 0.0
 nevents           : 20
 nblinks           : 0
 ninterpolated     : 0.0
 blinks_per_min    : 0.0
 fs                : 100
 duration_minutes  : 0.4025
 start_min         : 7.500064686639967e-05
 end_min           : 0.4023618036278195
 baseline_estimated: False
 response_estimated: False
 History:
 *
 └ downsample(100)
  └ lowpass_filter(10)
   └ downsample(100)
    └ lowpass_filter(10)

In [6]:
d.apply_history(d)

TypeError: string indices must be integers

The `PupilData` object also stores the complete history of the operations applied to the dataset, which allows that history to be transferred to a new dataset.

In [30]:
getattr(d, "reset_time")

<bound method PupilData.reset_time of PupilData(test_re_mo_vi_fa_bu_mi_ma_mi, 331.0KiB):
 n                 : 6001
 nmiss             : 117.2
 perc_miss         : 1.9530078320279955
 nevents           : 56
 nblinks           : 24
 ninterpolated     : 0.0
 blinks_per_min    : 11.998000333277787
 fs                : 50
 duration_minutes  : 2.0003333333333333
 start_min         : 4.00015
 end_min           : 6.0
 baseline_estimated: False
 response_estimated: False
 History:
 *
 └ reset_time()
  └ blinks_detect()
   └ sub_slice(4,6,units=min)
    └ drop_original()
     └ blinks_detect()
      └ blinks_merge()
       └ lowpass_filter(3)
        └ downsample(50)
>
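Looking up a method by name with `getattr`, as in the cell above, is the building block for replaying a stored history on another object. The sketch below shows the general idea on a hypothetical `ToyData` class whose history records use the same `funcname`/`args`/`kwargs` keys as the list shown earlier. The class, its methods `scale()` and `clip()`, and the helper `replay_history()` are all invented for illustration; this is not how `apply_history()` is implemented in `pypillometry`, only one plausible way such a mechanism could work.

```python
import copy

class ToyData:
    """Hypothetical object that records each operation (not part of pypillometry)."""

    def __init__(self, values):
        self.values = list(values)
        self.history = []

    def _step(self, values, funcname, args, kwargs):
        # Return a modified copy and append a record of the call to its history.
        new = copy.deepcopy(self)
        new.values = values
        new.history.append({"funcname": funcname, "args": args, "kwargs": kwargs})
        return new

    def scale(self, factor):
        return self._step([v * factor for v in self.values], "scale", (factor,), {})

    def clip(self, lower=0, upper=10):
        clipped = [min(max(v, lower), upper) for v in self.values]
        return self._step(clipped, "clip", (), {"lower": lower, "upper": upper})

def replay_history(source, target):
    """Apply every operation recorded in source.history to target, in order."""
    out = target
    for record in source.history:
        method = getattr(out, record["funcname"])  # look up the method by name
        out = method(*record["args"], **record["kwargs"])
    return out

a = ToyData([1, 5, 9]).scale(2).clip(upper=10)
b = replay_history(a, ToyData([3, 4, 6]))

print(a.values)  # [2, 10, 10]
print(b.values)  # [6, 8, 10]
```

Storing the positional and keyword arguments separately, rather than only the printable call string, is what makes this kind of replay possible.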