This is a quick background to get you up to speed with electrocorticography and the kinds of tasks that are run with it. I'll describe the data that I've provided in these tutorials so that you can get a feel for what kinds of analyses we cover.

# This tutorial
This collection of tutorial notebooks is meant to give a brief overview of encoding and decoding models in human electrophysiology, with concrete examples. It's also a reference for how to implement these models with widely available tools in Python.

I will broadly cover three topics, each with its own notebook. They are meant to be run in order:

1. [Feature Extraction](./notebooks/FeatureExtraction.ipynb) covers some common features that can be extracted from natural speech. These will be stored and used to fit models in subsequent notebooks.
1. [Fitting Models](./notebooks/FittingModels.ipynb) covers how to fit regression and classification models using these features, both for encoding and decoding. It also briefly covers visualizing the model coefficients, as well as how they are affected by regularization.
1. [Prediction and Validation](./notebooks/PredictionAndValidation.ipynb) covers how to use these models to make predictions on new data points, as well as some best practices and tips for fitting models properly using cross-validation. Finally, it covers using these methods to tune hyperparameters.

In addition, here are a few extra notebooks that are related to what is discussed in the paper:

1. [Comparing Encoding and Decoding Models](./notebooks/SimulateEncodingAndDecoding.ipynb) briefly compares the STRF (spectro-temporal receptive field) obtained by fitting an encoding model with the one obtained by fitting a decoding model.
1. [Miscellaneous examples](./notebooks/Miscellaneous.ipynb) describes some extra concepts that help in understanding encoding and decoding models.

Bundled with this package is also a tiny collection of helper functions called `modelingtools`. These do some important things under the hood, and are included in order to make the notebooks easier to read. We recommend that you investigate these functions to understand what they are doing.

To do this, you can type the function's name in an open cell, followed by the `?` symbol. When you run the cell, it will display the documentation for the function. If you put `??` instead, it will display the code for that function. For example:
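The function names inside `modelingtools` aren't listed here, so this sketch uses the standard library's `json.dumps` as a stand-in (an assumption made purely for illustration). Because `?` and `??` only work inside IPython / Jupyter, the IPython form is shown in comments, and the code itself uses the rough plain-Python equivalents from the `inspect` module.

```python
import inspect
import json

# In a notebook cell, the IPython shorthand would be:
#   json.dumps?    (show the documentation)
#   json.dumps??   (show the source code)
# `json.dumps` is only a stand-in here; swap in any `modelingtools` function.

# Roughly what `?` displays: the function's docstring
doc = inspect.getdoc(json.dumps)

# Roughly what `??` displays: the function's source code
source = inspect.getsource(json.dumps)

# Print the first line of each to keep the output short
print(doc.splitlines()[0])
print(source.splitlines()[0])
```

The same pattern should work for any function written in Python; functions implemented in compiled code have documentation but no retrievable source.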

# Packages we'll use
We'll focus on a few packages for fitting models in Python. There are many more out there, but this is a quick overview of some particularly useful ones:

* [MNE-Python](http://martinos.org/mne/stable/index.html) is a powerful tool for storing, analyzing, and visualizing electrophysiology data in Python. We'll rely heavily on this package for all analyses. There is extensive documentation on how to use MNE. Check out their tutorials [here](http://martinos.org/mne/stable/tutorials.html) and a collection of examples (with code) [here](http://martinos.org/mne/stable/auto_examples/index.html).
* [Scikit-Learn](http://scikit-learn.org/stable/) is the most extensive machine learning library in Python. It implements most battle-tested machine learning algorithms, and has a great community that is always adding new features and improving functionality. They also have excellent tutorials that cover more general machine learning principles, as well as their API.
* [numpy](http://www.numpy.org/) is a package for both representing multi-dimensional arrays of data, and performing many useful computations on that data. It is the foundation for many excellent packages in Python.
* [matplotlib](http://matplotlib.org/) is the most fully featured package for data visualization and plotting in Python. It can interact with plots at a fairly low level, giving the user lots of flexibility over the look of their visualization.

All of these packages are open-source and supported by the scientific (academic and otherwise) community.
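
Before running the notebooks, it can help to confirm that these packages are installed. The snippet below is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical, and the distribution names are the ones these projects are published under on PyPI (note `scikit-learn`, not `sklearn`).

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(distributions):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in distributions:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# The four packages described above
versions = installed_versions(['mne', 'scikit-learn', 'numpy', 'matplotlib'])
for name, ver in versions.items():
    print(name, ver if ver is not None else 'not installed')
```

This only reads package metadata and does not import anything, so it runs quickly and will not fail if one of the packages is missing.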


# ECoG Background
ECoG stands for electrocorticography, a method for recording electrical activity in the human brain. Unlike **electroencephalography** (EEG), which records from the scalp, ECoG uses electrodes placed directly on the surface of a subject's brain. This is generally possible only because the subject is already undergoing surgery for intractable epilepsy.

Here's a sample of what an ECoG grid looks like, and what the data looks like within the grid:

![ECoG Sample](https://upload.wikimedia.org/wikipedia/commons/9/9c/Human_Electrocorticographic_%28ECoG%29_Signals.jpg)

<a href='https://commons.wikimedia.org/wiki/File:Human_Electrocorticographic_(ECoG)_Signals.jpg'>Image Credit</a>

The panels in the picture above show:

(A) the placement of several ECoG electrodes on a single patient.

(B) a surgical photo of the ECoG placement.

(C) activation maps for several tasks, representing neuronal activity in response to the onset of stimuli in each task.

(D) the same maps in C mapped onto a 3D version of the brain.

# This task
The data included in this tutorial were collected from one patient who passively listened to sentences in the [TIMIT corpus](https://catalog.ldc.upenn.edu/LDC93S1). This is a collection of spoken English sentences, read by speakers of many different dialects. The sentences were chosen in order to maximize the diversity of linguistic qualities such as spectrotemporal properties, phonemes, and word usage.

Here's a sample of three sentences:

In [1]:
from IPython.display import Audio
stim_folder = './raw_data/sample_stimuli/'

In [2]:
Audio(filename=stim_folder + 'sample_timit_1.wav')

In [3]:
Audio(filename=stim_folder + 'sample_timit_2.wav')

In [4]:
Audio(filename=stim_folder + 'sample_timit_3.wav')

As you can hear, each sentence has many non-overlapping words, which themselves contain linguistic features of differing complexity. The human brain likely responds to *all* of these features at one point or another, and encoding models are a great way to tease apart these differences using a single, naturalistic stimulus set.

For comparison, here are two stimuli that have often been used to probe the brain's response to *low-level* acoustic features (e.g., spectrotemporal features). The first is a pink noise stimulus, which mimics only the dropoff in power with increasing frequency that is found in natural sound. The second is a *ripple-modulated* noise stimulus that attempts to more closely reflect the spectro-temporal structure present in natural sound.

In [5]:
# Pink noise
Audio(filename=stim_folder + 'sample_pinknoise.wav')

In [6]:
# Ripple stimulus
Audio(filename=stim_folder + 'sample_ripple.wav')
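
To make the idea of a frequency power dropoff concrete, here is a minimal sketch of how pink noise could be synthesized with numpy. This is not how the bundled `sample_pinknoise.wav` was made: the function name `make_pink_noise`, the 16 kHz sampling rate, and the approach of reshaping the spectrum of white noise are all assumptions chosen for illustration.

```python
import numpy as np


def make_pink_noise(n_samples, seed=0):
    """Return approximately 1/f ('pink') noise, scaled to the range [-1, 1]."""
    rng = np.random.default_rng(seed)

    # Start from white noise, which has equal power at all frequencies
    white = rng.standard_normal(n_samples)
    spectrum = np.fft.rfft(white)
    freqs = np.fft.rfftfreq(n_samples)

    # Power proportional to 1/f means amplitude proportional to 1/sqrt(f).
    # The zero-frequency (DC) bin is left at zero to avoid dividing by zero.
    scale = np.zeros_like(freqs)
    scale[1:] = 1.0 / np.sqrt(freqs[1:])

    # Back to the time domain, then normalize the peak amplitude to 1
    pink = np.fft.irfft(spectrum * scale, n=n_samples)
    return pink / np.max(np.abs(pink))


sfreq = 16000  # Hz; an assumed sampling rate, not taken from the data
pink = make_pink_noise(2 * sfreq)  # two seconds of noise
```

In a notebook, `Audio(pink, rate=sfreq)` should let you listen to the result. Note that nothing here adds structure over time, which is the gap that ripple-modulated noise tries to fill.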

We use natural speech because it is more akin to what humans experience on a daily basis, and because it is much more complex than these artificial stimuli in terms of high-level features (e.g., phonemes) and their interactions with the low-level features (e.g., spectro-temporal features). This natural speech passive listening stimulus set will be used in the tutorials.

There are three tutorials in total (see above), and they are just a taste of the kinds of analyses used in encoding / decoding models. Our goal is not completeness, but to point the user in the right direction in order to begin using these methods in their own research.

We encourage the reader to run these tutorials and to learn more about the open-source tools that they use. 

# Credits
These materials were supported by the National Defense Science and Engineering Graduate Fellowship (NDSEG) as well as by the Moore-Sloan Foundation (via the Berkeley Institute for Data Science).

Finally, this tutorial (and much of our scientific literature) would not be possible without countless hours of work from the open-source (and open-science) community. This is a collection of researchers, students, programmers, and citizens who are dedicated to making science more efficient, more open, more productive, and more fun. This is a constant source of inspiration and a reminder of the fantastic scientific community we have.