In this tutorial we will create a common spatial pattern (CSP) pipeline using MindPype.

In [None]:
# add the parent directory to the path so that the mindpype library can be imported
import os
import sys
sys.path.append(os.path.dirname(os.path.abspath('.')))

In [None]:
import mindpype as mp
import numpy as np

The first step in creating a pipeline is to create a session, which serves as a sandbox for all components in the pipeline. After creating the session we will create a graph, which represents the pipeline itself: the processing nodes and the data edges that connect them.

In [None]:
# create a session and a graph
session = mp.Session.create()
trial_graph = mp.Graph.create(session)

CSP requires initialization data for training. We will therefore randomly generate training data and labels, then create tensors from these generated values using the ```create_from_data()``` factory method.

In [None]:
# Create random initialization (training) data and labels
training_data = np.random.random((120,12,500))
labels = np.asarray([0]*60 + [1]*60)


# Create tensors from the data and labels
init_data = mp.Tensor.create_from_data(session,training_data)
init_labels = mp.Tensor.create_from_data(session,labels)
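
As a quick sanity check on the generated arrays, the sketch below rebuilds them with plain NumPy and confirms that there is one label per trial and that the two classes are balanced. It assumes the axes are ordered (trials, channels, samples), which is inferred from the shapes used in this tutorial rather than stated by it; the seeded generator is only there to make the sketch reproducible.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seeded only so the sketch is reproducible

# Assumed axis order: trials x channels x samples
n_trials, n_channels, n_samples = 120, 12, 500
training_data = rng.random((n_trials, n_channels, n_samples))
labels = np.asarray([0] * 60 + [1] * 60)

# There must be exactly one label per trial
assert labels.shape == (n_trials,)

# Count how many trials belong to each class
class_counts = np.bincount(labels)
print(training_data.shape, class_counts)
```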


Next we will create the input and output data containers for the graph. Our pipeline takes input data and outputs a predicted label, so we will create a tensor object for the input and a scalar object for the output.

Then we will create virtual tensors to hold any intermediate values that are calculated throughout the pipeline, using the ```create_virtual()``` method. These intermediate values are only required in the process of completing a calculation and we do not need to access them later, so the virtual type is ideal: it provides temporary storage and enables us to free up memory.

In [None]:
# Create an input tensor with dummy data
input_tensor = mp.Tensor.create_from_data(session, np.random.randn(12, 500))

# Create a scalar that will be populated with the classifier label
classifier_label = mp.Scalar.create_from_value(session,-1)

# Create intermediate (virtual) tensors for the intermediate steps of the pipeline
intermediate_tensors = [mp.Tensor.create_virtual(session),
                        mp.Tensor.create_virtual(session),
                        mp.Tensor.create_virtual(session),
                        mp.Tensor.create_virtual(session)]

Next we will create a filter. We first set the parameter values for the filter, then use these values to create a Butterworth filter with the ```create_butter()``` factory method.

In [None]:
# create filter parameters
order = 4
bandpass = (8,35) # passband edges in Hz
fs = 250 # sampling rate in Hz

# Create a filter object using the parameters
filter_obj = mp.Filter.create_butter(session,order,bandpass,btype='bandpass',fs=fs,implementation='sos')
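
To see what a filter with these parameters does to a single trial, here is a minimal sketch using SciPy directly. It assumes that the `order`, `bandpass`, `btype`, and `fs` arguments above carry the same meaning as in `scipy.signal.butter`, and that `implementation='sos'` corresponds to SciPy's `output='sos'`; this tutorial does not confirm how MindPype builds the filter internally.

```python
import numpy as np
from scipy import signal

order = 4
bandpass = (8, 35)  # passband edges in Hz
fs = 250            # sampling rate in Hz

# Design a Butterworth band-pass filter as second-order sections
sos = signal.butter(order, bandpass, btype='bandpass', fs=fs, output='sos')

# Filter one dummy trial (12 channels x 500 samples) along the time axis
trial = np.random.randn(12, 500)
filtered = signal.sosfilt(sos, trial, axis=1)
print(sos.shape, filtered.shape)
```

The second-order-sections form is generally preferred over transfer-function coefficients for band-pass designs because it is numerically better behaved.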

We will use an LDA classifier to predict the output labels, so we will create a classifier object using the ```create_LDA()``` factory method.

In [None]:
# Create a classifier object
classifier = mp.Classifier.create_LDA(session)

Now we will add the associated kernels to our graph using their ```add_to_graph()``` factory methods. Each kernel that we add to the graph represents a node that will execute a process.

For our CSP pipeline, we will first filter the data. Then we will pass the filtered data (which is stored in a virtual tensor, intermediate_tensors[0]) to our CSP kernel, which will calculate our spatial filters and apply them to the data. Next, we will apply the variance and log kernels to the spatially filtered data (which is stored in a virtual tensor, intermediate_tensors[1]) to extract features. Finally, we will use an LDA classifier to predict the output label.

In [None]:
# add the processing nodes to the graph using the factory methods
filter_node = mp.kernels.FilterKernel.add_to_graph(trial_graph,input_tensor,filter_obj,intermediate_tensors[0], axis = 1)

CSP_node = mp.kernels.CommonSpatialPatternKernel.add_to_graph(trial_graph, intermediate_tensors[0], intermediate_tensors[1], init_data, init_labels, 2)
var_node = mp.kernels.VarKernel.add_to_graph(trial_graph, intermediate_tensors[1], intermediate_tensors[2], axis = 1)
log_node = mp.kernels.LogKernel.add_to_graph(trial_graph, intermediate_tensors[2], intermediate_tensors[3])
LDA_node = mp.kernels.ClassifierKernel.add_to_graph(trial_graph, intermediate_tensors[3], classifier, classifier_label)
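
The CSP, variance, and log steps can be sketched in plain NumPy as follows. This is a textbook formulation of CSP (whitening followed by diagonalization of one class covariance), not necessarily how MindPype implements its kernel. It also assumes that the trailing `2` passed to the CSP node above is the number of spatial filters to keep, and it skips the band-pass filtering step for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((120, 12, 500))   # trials x channels x samples
y = np.asarray([0] * 60 + [1] * 60)

def mean_covariance(trials):
    # Average the channel covariance matrix over all trials of one class
    return np.mean([np.cov(t) for t in trials], axis=0)

c0 = mean_covariance(X[y == 0])
c1 = mean_covariance(X[y == 1])

# Whiten the composite covariance, then diagonalize class 0 in the whitened space
evals, evecs = np.linalg.eigh(c0 + c1)
whitener = evecs / np.sqrt(evals)          # each column scaled by 1/sqrt(eigenvalue)
d, u = np.linalg.eigh(whitener.T @ c0 @ whitener)
W = whitener @ u                           # columns are spatial filters, ascending eigenvalue

# Keep the two extreme filters (assumed meaning of the "2" in the CSP node)
filters = W[:, [0, -1]].T                  # 2 x channels

# Apply the spatial filters, then take the log of the variance over time
projected = filters @ X                    # trials x 2 x samples
features = np.log(np.var(projected, axis=2))
print(features.shape)
```

Each trial is reduced to one log-variance value per spatial filter, which is the feature vector a classifier such as LDA would then receive.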

With all of our nodes added to our graph, we will then verify the graph using the ```verify()``` method. Verifying the graph orders the nodes for execution and ensures that the inputs and outputs of each processing node are appropriately typed and sized.

In [None]:
# verify the graph (i.e. schedule the nodes) and ensure the inputs and outputs are connected properly
trial_graph.verify()
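
To illustrate what ordering the nodes for execution means, the toy sketch below derives an execution order from the data edges alone, using the standard library's `graphlib`. The node and edge names are hypothetical stand-ins for this tutorial's graph, and the approach is an illustration of the idea, not MindPype's actual scheduler.

```python
from graphlib import TopologicalSorter

# Hypothetical node name -> (input edges, output edges), deliberately listed out of order
nodes = {
    "LDA":    (["t3"], ["label"]),
    "log":    (["t2"], ["t3"]),
    "CSP":    (["t0"], ["t1"]),
    "var":    (["t1"], ["t2"]),
    "filter": (["input"], ["t0"]),
}

# Map each edge to the node that produces it
producer = {out: name for name, (_, outs) in nodes.items() for out in outs}

# A node depends on whichever nodes produce its inputs
dependencies = {
    name: {producer[edge] for edge in ins if edge in producer}
    for name, (ins, _) in nodes.items()
}

execution_order = list(TopologicalSorter(dependencies).static_order())
print(execution_order)  # ['filter', 'CSP', 'var', 'log', 'LDA']
```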

After our graph has been verified, the next step is to initialize it. This step is required for pipelines containing methods that need to be trained or fit. In our pipeline the CSP and LDA nodes require training/initialization, so we will call the ```initialize()``` method on our graph. Note that when we created the graph nodes, we only specified initialization data for the CSP node and did not explicitly provide training data for the classifier node. In such cases, MindPype identifies nodes that require training data but were not explicitly given a reference to any at node creation. During graph verification, MindPype automatically searches through the upstream nodes (i.e., nodes executed earlier) to identify sources of initialization data. If an upstream node with an initialization data input is found, MindPype will propagate that data through the graph to produce the required training data for the downstream node.

For example, in this pipeline, MindPype will flag that the LDA node requires initialization data but has not been explicitly provided any. It will then search upstream through the nodes preceding it to find any that were provided initialization data (i.e., check the log node, then the var node, then the CSP node). It will find that the CSP node has initialization data and stop searching. When `trial_graph.initialize()` is called, the CSP node will be initialized, and then the initialization data will be propagated through the graph (i.e., the CSP, variance, and log transformations will be applied) until it reaches the LDA node. This transformed initialization data will be used to initialize the LDA classifier.

Note that any node can be provided initialization data even if the node itself does not require initialization. In this graph, for example, we could have provided the initialization data to the filter node instead and the initialization data would have been propagated through the graph to initialize both the CSP and LDA nodes.
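
The upstream search described above can be pictured with a small toy model. Everything in this sketch (the `ToyNode` class, its fields, and `find_init_source`) is hypothetical and exists only to illustrate the idea; it is not MindPype's implementation or API.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ToyNode:
    name: str
    needs_init: bool = False
    init_data: Optional[str] = None   # stand-in for a tensor of training data

# Nodes in execution order, mirroring this tutorial's graph
chain = [
    ToyNode("filter"),
    ToyNode("CSP", needs_init=True, init_data="init_data"),
    ToyNode("var"),
    ToyNode("log"),
    ToyNode("LDA", needs_init=True),
]

def find_init_source(chain, index):
    """Walk upstream from chain[index] to the nearest node holding init data."""
    for upstream in range(index, -1, -1):
        if chain[upstream].init_data is not None:
            return upstream
    return None

lda_index = 4
source = find_init_source(chain, lda_index)

# Nodes whose transformations are applied to the init data before it reaches LDA
applied = [node.name for node in chain[source:lda_index]]
print(chain[source].name, applied)  # CSP ['CSP', 'var', 'log']
```

If the initialization data were attached to the filter node instead, the same search would stop at `filter`, and the data would pass through the filter before reaching the CSP and LDA nodes.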

In [None]:
# Since the graph contains nodes that must be initialized/trained, we must call initialize() before running the graph
trial_graph.initialize()

We are now ready to run our pipeline. To run the graph for the provided input data, we use the ```execute()``` method.

In [None]:
# RUN!
trial_graph.execute()
# print the value of the most recent trial
print("Trial {}, Predicted label = {}\n".format(1, classifier_label.data))


print("Test Passed =D")