In [1]:
import numpy as np
from tqdm.notebook import tqdm

from neurodsp.sim import sim_oscillation, sim_powerlaw
from neurodsp.filt import filter_signal
from neurodsp.spectral import compute_spectrum

from fooof import FOOOF
from bycycle import Bycycle

from ndspflow.core.workflow import WorkFlow

# WorkFlow Class

`WorkFlow` objects wrap analyses that are typically functionally oriented into a unified object. This allows for reproducible analyses and a clear order of operations, from the raw input array to the final output measure. This class consists of four types of nodes:

1. __Input__ : Defines the raw input numpy array, including simulations, reading BIDS structure, or custom classes or functions to read in raw binary data.
    
2. __Transformations__ : Defines the order of preprocessing operations used to manipulate the raw input array. Any function that accepts a y-array and returns a y-array, with or without an x-array, is supported. This allows interfacing with signal processing packages such as `scipy`, `numpy`, `mne`, `neurodsp`, etc. Examples: filtering, re-referencing, ICA, frequency domain transformations, etc.

3. __Models__ : Defines the model class that is used to fit or extract values out of the transformed array. Models should be initialized and contain a `fit` method that accepts array definitions. Examples: `fooof`, `bycycle`.

4. __Forks__ : Forks split the workflow into multiple streams. A fork may optionally have additional transforms (e.g. one model may need power spectra, whereas another requires time series). A `fit` method call must follow every fork.
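
The four node types can be sketched without `ndspflow` at all. The snippet below is a toy illustration only: `simulate_sine`, `rectify`, `square`, and `MeanModel` are hypothetical names, and plain Python lists stand in for NumPy arrays to keep it dependency-free. It shows the contract each node is expected to satisfy, not how `WorkFlow` is implemented.

```python
import math


def simulate_sine(n_samples, fs, freq):
    """Input node (hypothetical): return a raw signal as a list of floats."""
    return [math.sin(2 * math.pi * freq * i / fs) for i in range(n_samples)]


def rectify(sig):
    """Transform node (hypothetical): y-array in, y-array out."""
    return [abs(v) for v in sig]


def square(sig):
    """A second transform, used only inside one fork."""
    return [v * v for v in sig]


class MeanModel:
    """Model node (hypothetical): any initialized object exposing `fit`."""

    def fit(self, sig):
        self.mean_ = sum(sig) / len(sig)
        return self


# Input followed by a transform shared by both forks
sig = simulate_sine(n_samples=1000, fs=1000, freq=10)
shared = rectify(sig)

# Fork A: an extra transform, then a fit
model_a = MeanModel().fit(square(shared))

# Fork B: fit the shared signal directly
model_b = MeanModel().fit(shared)
```

Both forks start from the same transformed signal, which mirrors the requirement that every fork ends in its own `fit` call.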

## Overview

Below, an entire example workflow is defined and executed. Afterwards, we break down each method call in greater detail.

In [2]:
# Settings
n_seconds = 10
fs = 1000
seeds = list(range(4))
freq_range = (1, 100)

# Initialize
wf = WorkFlow(seeds=seeds)

# 1. Define np.array input
wf.simulate(sim_powerlaw, n_seconds, fs, exponent=-2)
wf.simulate(sim_oscillation, n_seconds, fs, freq=10, operator='add')
wf.simulate(sim_oscillation, n_seconds, fs, freq=20, operator='add')

# 2. Transform input
wf.transform(filter_signal, fs, 'lowpass', 100, remove_edges=False)

# 3a. Define spectral fit within a fork
wf.fork()
wf.transform(compute_spectrum, fs)
wf.fit(FOOOF(verbose=False), freq_range)

# 3b. Define time domain fit within a fork
wf.fork()
wf.fit(Bycycle(), fs, freq_range)

# Execute workflow
wf.run(return_attrs=['self', 'aperiodic_params_', 'peak_params_'],
       n_jobs=-1, progress=tqdm)

# Access results
wf.results

[[<fooof.objs.fit.FOOOF at 0x7fb34f9e7670>,
  <bycycle.objs.fit.Bycycle at 0x7fb34f9e7bb0>],
 [<fooof.objs.fit.FOOOF at 0x7fb34f9e7970>,
  <bycycle.objs.fit.Bycycle at 0x7fb34f9f5190>],
 [<fooof.objs.fit.FOOOF at 0x7fb34f9e7310>,
  <bycycle.objs.fit.Bycycle at 0x7fb34f9f55b0>],
 [<fooof.objs.fit.FOOOF at 0x7fb34f9e7d00>,
  <bycycle.objs.fit.Bycycle at 0x7fb34f9f5790>]]

## Initialization

When initializing a `WorkFlow` object, arbitrary keyword arguments required by one of the sub-classes may be passed. In the case below, we set the `seeds` attribute, required by the `Simulation` sub-class. X-axis values may optionally be defined here, if required by the model.

In [3]:
# Initialize
wf = WorkFlow(seeds=seeds)
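
To illustrate why seeds matter, here is a minimal sketch that assumes each seed yields one independent, reproducible realization of the input. That assumption matches the four result rows shown for four seeds, but it is not taken from the `ndspflow` source. `sim_noise` is a hypothetical stand-in for a simulation function.

```python
import random


def sim_noise(n_samples, seed):
    """Hypothetical simulation: Gaussian noise drawn from a seeded generator."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]


seeds = list(range(4))

# One realization per seed
signals = [sim_noise(5, seed) for seed in seeds]

# Re-running with the same seeds reproduces the same realizations
repeat = [sim_noise(5, seed) for seed in seeds]
```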

## 1. Input
Next we define the array input. In this case, it is a series of simulations. Multiple simulation calls may be stacked to create the input array. Any simulation function may be used as long as a single array is returned. The way arrays are combined is defined by the `operator` argument. The general form follows:

`.simulate(func, *args, operator, **kwargs)`.

In [4]:
# 1. Define np.array input
wf.simulate(sim_powerlaw, n_seconds, fs, exponent=-2)
wf.simulate(sim_oscillation, n_seconds, fs, freq=10, operator='add')
wf.simulate(sim_oscillation, n_seconds, fs, freq=20, operator='add')
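
The stacking behaviour can be pictured as an element-wise combination of each new simulation with the running signal. The sketch below assumes `operator='add'` means element-wise addition. `combine`, `sim_sine`, and `sim_constant` are hypothetical helpers, and looking up the operator by name in the standard `operator` module is an illustrative choice rather than a description of what `ndspflow` does.

```python
import math
import operator


def sim_constant(n_seconds, fs, value):
    """Hypothetical simulation: a constant offset."""
    return [value] * int(n_seconds * fs)


def sim_sine(n_seconds, fs, freq):
    """Hypothetical simulation: a unit-amplitude sine wave."""
    n_samples = int(n_seconds * fs)
    return [math.sin(2 * math.pi * freq * i / fs) for i in range(n_samples)]


def combine(current, new, op_name):
    """Combine two equal-length signals element-wise (assumed semantics)."""
    op = getattr(operator, op_name)  # e.g. 'add' -> operator.add
    return [op(a, b) for a, b in zip(current, new)]


# The first call defines the signal; later calls are stacked onto it
sig = sim_constant(1, 100, value=0.5)
sig = combine(sig, sim_sine(1, 100, freq=10), 'add')
sig = combine(sig, sim_sine(1, 100, freq=20), 'add')
```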

## 2. Transform
The raw input array may be transformed using any function, as long as a single array is returned. The `axis` argument is used for an ndarray with more than two dimensions and specifies which axis to iterate over to apply the transform. The general form follows:

`.transform(func, *args, axis, **kwargs)`.

In [5]:
# 2. Transform input
wf.transform(filter_signal, fs, 'lowpass', 100, remove_edges=False)
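
The call above suggests that the array is passed as the first argument of the transform, followed by the stored `*args` and `**kwargs`, which is consistent with `filter_signal` taking the signal first. The following sketch makes that forwarding explicit. `apply_transform` and `moving_average` are hypothetical, lists stand in for arrays, and only iteration over the first axis is shown.

```python
def moving_average(sig, width):
    """Toy transform: y-array in, y-array of the same length out."""
    half = width // 2
    out = []
    for i in range(len(sig)):
        window = sig[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out


def apply_transform(y, func, *args, axis=None, **kwargs):
    """Hypothetical helper: forward y plus stored arguments to func."""
    if axis is None:
        return func(y, *args, **kwargs)
    if axis == 0:
        # Iterate over the first axis and transform each row separately
        return [func(row, *args, **kwargs) for row in y]
    raise NotImplementedError("this sketch only iterates over the first axis")


sig = [0.0, 3.0, 0.0, 3.0, 0.0]

# Single signal: the transform is applied once
smoothed = apply_transform(sig, moving_average, 3)

# Stack of signals: the transform is applied row by row
batch = apply_transform([sig, sig], moving_average, 3, axis=0)
```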

## 3. Model
The model to be fit is then defined. This should be a class with a `fit` method that accepts a y-array, optionally an x-array if defined during `WorkFlow` initialization, and any args or kwargs required by the model's `fit` method. The workflow forks below continue from the `transform` method call above.

In [6]:
# 3a. Define spectral fit within a fork
wf.fork()
wf.transform(compute_spectrum, fs)
wf.fit(FOOOF(verbose=False), freq_range)

In [7]:
# 3b. Define time domain fit within a fork
wf.fork()
wf.fit(Bycycle(), fs, freq_range)
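
A custom model only needs to follow the same shape as `FOOOF` or `Bycycle` in the calls above: it is initialized up front and exposes a `fit` method whose first argument is the y-array. The class below is a hypothetical example of that interface, assuming no x-array was defined. Storing results in attributes with a trailing underscore is a convention borrowed from `aperiodic_params_` and `peak_params_`, not a documented requirement.

```python
class PeakToPeak:
    """Hypothetical model: measures the peak-to-peak amplitude of a signal."""

    def __init__(self, scale=1.0):
        # Settings are fixed at initialization, before the workflow runs
        self.scale = scale
        self.amplitude_ = None
        self.duration_ = None

    def fit(self, sig, fs):
        """Fit on a y-array; extra arguments (here fs) follow the array."""
        self.amplitude_ = self.scale * (max(sig) - min(sig))
        self.duration_ = len(sig) / fs
        return self


model = PeakToPeak()
model.fit([0.0, 2.0, -1.0, 1.0], 2)
```

Under these assumptions, such a model might be registered with something like `wf.fit(PeakToPeak(), fs)`.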

## Execute

Lastly, the `WorkFlow` is executed in parallel using the `run` method. The `return_attrs` argument is used to transfer any attribute of the model class (here a `FOOOF` object), including the model itself, to the `results` attribute of the `WorkFlow` class. Below we return both the model and attributes of the model.

In [8]:
# Execute workflow
wf.run(return_attrs=['self', 'aperiodic_params_', 'peak_params_'],
       n_jobs=-1, progress=tqdm)

# Access results
wf.results

[[<fooof.objs.fit.FOOOF at 0x7fb34fa15b80>,
  <bycycle.objs.fit.Bycycle at 0x7fb3a4971130>],
 [<fooof.objs.fit.FOOOF at 0x7fb34fa15a90>,
  <bycycle.objs.fit.Bycycle at 0x7fb34fa15c10>],
 [<fooof.objs.fit.FOOOF at 0x7fb34fa15b50>,
  <bycycle.objs.fit.Bycycle at 0x7fb3a4971790>],
 [<fooof.objs.fit.FOOOF at 0x7fb34fa15700>,
  <bycycle.objs.fit.Bycycle at 0x7fb34fa15850>]]
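
The printed output suggests that `results` is a nested list indexed first by seed and then by fork. The exact layout is not spelled out here, so treat that as an assumption. The sketch below uses a hypothetical `ToyModel` in place of fitted `FOOOF` and `Bycycle` objects to show how a fitted attribute could be gathered across seeds from such a structure.

```python
class ToyModel:
    """Hypothetical stand-in for a fitted model object."""

    def __init__(self, name, value):
        self.name = name
        self.value_ = value  # plays the role of a fitted attribute


# Assumed layout: results[seed_index][fork_index]
results = [
    [ToyModel('spectral', float(seed)), ToyModel('time', seed * 10.0)]
    for seed in range(4)
]

# Gather one attribute from the first fork across all seeds
spectral_values = [row[0].value_ for row in results]

# Gather the same attribute from the second fork
time_values = [row[1].value_ for row in results]
```

If the layout holds, `wf.results[0][0].aperiodic_params_` would give the aperiodic parameters of the `FOOOF` fit for the first seed.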