In [None]:
%matplotlib inline


Design Matrix Eshin
===================

This tutorial illustrates how to use the Design_Matrix class to flexibly create design matrices that can then be used with the Brain_Data class to perform univariate regression. Design Matrices can be thought of as "enhanced" pandas dataframes; they can do everything a pandas dataframe is capable of, with some added features.




Load and Manipulate an Onsets File
-----------------------------------

Nltools provides basic file-reading support for 2- or 3-column formatted onsets files.
Users can look at the onsets_to_dm() function as a template for building more complex file readers, or to see additional features.
Here we simply point to an onsets file in which each event lasted exactly 1 TR, provide some basic experiment metadata, add an intercept, and get back a basic design matrix.



In [None]:
from nltools.utils import get_resource_path
from nltools.file_reader import onsets_to_dm
from nltools.data import Design_Matrix
import os


onsetsFile = os.path.join(get_resource_path(),'onsets_example.txt')
dm = onsets_to_dm(onsetsFile, TR=2.0, runLength=160, sort=True,
                    addIntercept=True)
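
To make concrete what such a reader produces, here is a minimal, dependency-free sketch that builds one indicator column per condition. It assumes onsets are given in seconds and that every event lasts one TR. The function name `onsets_to_indicator_columns` and the example events are hypothetical and not part of nltools; the real onsets_to_dm() may differ in its details.

```python
def onsets_to_indicator_columns(events, tr, run_length, add_intercept=True):
    """Build {column name: list of 0/1 values} with one row per TR.

    events: iterable of (onset_in_seconds, condition_name) pairs (assumed format).
    """
    conditions = sorted({name for _, name in events})
    columns = {name: [0] * run_length for name in conditions}
    for onset, name in events:
        row = int(round(onset / tr))  # convert seconds to a TR index
        if 0 <= row < run_length:
            columns[name][row] = 1    # each event lasts exactly one TR
    if add_intercept:
        columns["intercept"] = [1] * run_length
    return columns


# Hypothetical events: (onset in seconds, condition)
events = [(0.0, "A"), (4.0, "B"), (10.0, "A")]
design = onsets_to_indicator_columns(events, tr=2.0, run_length=8)
print(design["A"])  # [1, 0, 0, 0, 0, 1, 0, 0]
```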

The class stores basic metadata, including the convolution function (the default is the Glover HRF), whether convolution has been performed, and whether the model contains a constant term.



In [None]:
print(dm.info())

We can also easily visualize the design matrix.



In [None]:
dm.heatmap()

We can also add nth-order polynomial terms. In this case we'll add a linear term to capture linear trends.
By default the class will add all lower-order polynomials, but it is smart enough to recognize that we already have a constant, so that term won't be duplicated.



In [None]:
dmpoly = dm.addpoly(1)
dmpoly.heatmap()
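
The idea of adding lower-order terms without duplicating the constant can be sketched in plain Python. The helper `add_poly_columns`, its column names, and the scaling of time to [-1, 1] are illustrative assumptions, not the class's actual implementation.

```python
def add_poly_columns(columns, n_rows, order, has_intercept):
    """Return a copy of `columns` with polynomial terms 0..order added.

    The order-0 (constant) term is skipped when an intercept already exists.
    """
    out = dict(columns)
    # Time axis scaled to [-1, 1] so higher-order terms stay on a similar scale
    t = [2 * i / (n_rows - 1) - 1 for i in range(n_rows)]
    for k in range(order + 1):
        if k == 0 and has_intercept:
            continue  # avoid duplicating the constant term
        out[f"poly_{k}"] = [x ** k for x in t]
    return out


cols = {"intercept": [1.0] * 5}
with_linear = add_poly_columns(cols, n_rows=5, order=1, has_intercept=True)
print(sorted(with_linear))    # ['intercept', 'poly_1']
print(with_linear["poly_1"])  # [-1.0, -0.5, 0.0, 0.5, 1.0]
```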

We can also easily perform convolution, and the class is smart enough to ignore all constant and polynomial columns.



In [None]:
dm = dm.convolve()
print(dm.info())
dm.heatmap()
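
As a rough illustration of what convolution does to each regressor, the sketch below convolves event columns with a sampled response kernel while passing constant and polynomial columns through untouched. The kernel is a generic double-gamma shape with illustrative parameters, not necessarily the Glover parameters nltools uses, and the names `double_gamma_hrf` and `convolve_columns` are hypothetical.

```python
import math


def double_gamma_hrf(tr, duration=32.0):
    """Sample a generic double-gamma response every `tr` seconds.

    Shape parameters (6 and 16) and the 1/6 undershoot ratio are illustrative.
    """
    def gamma_pdf(t, shape):
        return t ** (shape - 1) * math.exp(-t) / math.gamma(shape)

    n = int(duration / tr)
    return [gamma_pdf(i * tr, 6) - gamma_pdf(i * tr, 16) / 6 for i in range(n)]


def convolve_columns(columns, kernel, skip=("intercept",)):
    """Convolve each column with `kernel`, truncated to the original length.

    Columns named in `skip`, and any `poly_*` columns, are copied unchanged.
    """
    out = {}
    for name, values in columns.items():
        if name in skip or name.startswith("poly_"):
            out[name] = list(values)
            continue
        out[name] = [
            sum(values[i - j] * kernel[j] for j in range(min(i + 1, len(kernel))))
            for i in range(len(values))
        ]
    return out


columns = {"A": [1, 0, 0, 0, 0, 0, 0, 0], "intercept": [1] * 8}
kernel = double_gamma_hrf(tr=2.0)
convolved = convolve_columns(columns, kernel)
print(convolved["intercept"])  # [1, 1, 1, 1, 1, 1, 1, 1]
```

Because the event column holds a single impulse at the first TR, its convolved version is simply the first eight samples of the kernel, while the intercept is left as it was.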

Load and Z-score a Covariates File
----------------------------------

Now we're going to handle a covariates file that's been generated by a preprocessing routine.
First we'll read in the text file using pandas and convert it to a design matrix.
To be explicit about the metadata, we're going to change some default attributes during conversion.



In [None]:
import pandas as pd

covariatesFile = os.path.join(get_resource_path(),'covariates_example.csv')
cov = pd.read_csv(covariatesFile)
cov = Design_Matrix(cov, hasIntercept=False)
cov.heatmap()

The class has several methods for basic data scaling and manipulation. Others can likely be found in pandas core functionality.
Here we fill NaN values with 0 and z-score all columns except the last. Because the class has all of pandas' functionality, method chaining is built in.



In [None]:
cov = cov.fillna(0).zscore(cov.columns[:-1])
cov.heatmap()
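
The same fill-then-scale sequence can be written out by hand to show the arithmetic. This sketch assumes z-scoring uses the sample standard deviation (ddof=1, the pandas default); the helper names and the example covariates are hypothetical.

```python
import math


def fill_missing(values, fill=0.0):
    """Replace None/NaN entries with `fill`."""
    return [
        fill if v is None or (isinstance(v, float) and math.isnan(v)) else v
        for v in values
    ]


def zscore(values):
    """Center to mean 0 and scale by the sample standard deviation (ddof=1)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


# Hypothetical covariates: one motion parameter with a gap, one spike regressor
covariates = {"motion_x": [1.0, float("nan"), 3.0, 2.0], "spike": [0, 0, 1, 0]}

# Fill every column first, then z-score all columns except the last
scaled = {name: fill_missing(col) for name, col in covariates.items()}
for name in list(scaled)[:-1]:
    scaled[name] = zscore(scaled[name])

print(scaled["spike"])  # [0, 0, 1, 0]
```

Leaving the last column unscaled matters for indicator-style regressors such as spikes, whose 0/1 coding is usually meant to be kept as is.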

Concatenate Multiple Design Matrices
------------------------------------

A really nice feature of this class is simplified but intelligent matrix concatenation. Here it's trivial to horizontally concatenate our convolved onsets and covariates while keeping our column names and order.



In [None]:
full = dm.append(cov,axis=1)
full.heatmap()

We can also intelligently concatenate design matrices vertically to handle, say, different experimental runs or subjects. The method lets the user indicate which columns to keep separate during concatenation and which to treat as extensions along the first dimension. By default the class will keep constant terms separate.



In [None]:
dm2 = dm.append(dm, axis=0, separate=True)
dm2.heatmap()

Specific columns can also be treated as separate (e.g. separate run spikes, polynomial terms, conditions of interest, etc.).
As an example, we treat our first experimental regressor as different across our two design matrices.
Notice that the class also preserves column ordering as best it can.



In [None]:
dm2 = dm.append(dm, axis=0, separate=True, uniqueCols=['BillyRiggins'])
dm2.heatmap()
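
The logic behind both vertical concatenations above can be sketched without nltools: columns marked as separate get one zero-padded copy per run, and all other columns are extended along the row dimension. The function `stack_runs` and the `0_`/`1_` column prefixes are hypothetical choices for illustration and may not match the names the class generates.

```python
def stack_runs(run_a, run_b, separate=("intercept",)):
    """Vertically stack two runs stored as {column name: list of values}.

    Columns named in `separate` are split into run-specific copies padded
    with zeros; all other columns are extended. Assumes both runs share
    the same columns.
    """
    n_a = len(next(iter(run_a.values())))
    n_b = len(next(iter(run_b.values())))
    out = {}
    for name in run_a:  # dicts preserve insertion order, so column order is kept
        if name in separate:
            out[f"0_{name}"] = list(run_a[name]) + [0] * n_b
            out[f"1_{name}"] = [0] * n_a + list(run_b[name])
        else:
            out[name] = list(run_a[name]) + list(run_b[name])
    return out


run = {"A": [1, 0, 0], "B": [0, 1, 0], "intercept": [1, 1, 1]}
stacked = stack_runs(run, run, separate=("intercept", "A"))
print(list(stacked))    # ['0_A', '1_A', 'B', '0_intercept', '1_intercept']
print(stacked["1_A"])   # [0, 0, 0, 1, 0, 0]
```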