Code for processing brain data
encoding_pipeline cicling2019_publication Apr 4, 2019

Encoding fMRI Data

This repository contains code for evaluating language-brain encoding experiments. The experiments are described in our paper:


We provide readers for four datasets:

  • The Words Data by Mitchell et al. (2008)
  • The Alice Data by Brennan et al. (2016)
  • The Harry Potter Data by Wehbe et al. (2018)
  • The Stories Data by Dehghani et al. (2017). The data has not yet been published by the authors. Please contact them directly.


For a simple start, look at

In the paper, we report results from two experimental pipelines, one for isolated stimuli (e.g., single words) and one for continuous stimuli (e.g., a book chapter).
You can run them as follows:

  • python3
  • python3

This will re-run all experiments in the paper, which takes a long time.

You might want to run the experiments for a single subject first. To do so, set yourpipeline.subject_ids = [1], or use another subject ID for which you have downloaded the data. If you want to better understand the fMRI data structure, have a look at

Language Models

We provide a class for adding a language model, along with implementations for querying an ELMo model (Peters et al., 2018) and a random language model.
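To illustrate what a random language model baseline can look like, here is a minimal sketch. The class name RandomLanguageModel and its methods are hypothetical and are not taken from this repository; the sketch only assumes that a language model maps each token to a fixed-size vector.

```python
import zlib

import numpy as np


class RandomLanguageModel:
    """Hypothetical baseline: maps every word to a fixed random vector."""

    def __init__(self, dim=8, seed=0):
        self.dim = dim
        self.seed = seed
        self._cache = {}

    def get_word_embedding(self, word):
        # Derive a per-word seed so the same word always gets the same vector.
        if word not in self._cache:
            word_seed = zlib.crc32(word.encode("utf-8")) + self.seed
            rng = np.random.default_rng(word_seed)
            self._cache[word] = rng.standard_normal(self.dim)
        return self._cache[word]

    def get_sentence_embeddings(self, tokens):
        # One row per token, shape (len(tokens), dim).
        return np.stack([self.get_word_embedding(t) for t in tokens])
```

Because the vectors carry no linguistic information, such a model serves as a lower bound when comparing encoding results.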

Mapping Model

  • The mapping model is standard ridge regression.
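As a rough sketch of what a ridge regression mapping does, the following uses the closed-form solution in NumPy rather than the Sklearn implementation. The function names fit_ridge and predict, the array shapes, and the default alpha are assumptions made for illustration only.

```python
import numpy as np


def fit_ridge(X, Y, alpha=1.0):
    """Fit a ridge mapping from stimulus features to voxel activations.

    X: (n_stimuli, n_features) language model representations.
    Y: (n_stimuli, n_voxels) brain responses.
    """
    # Center both sides so the intercept is not penalized.
    x_mean = X.mean(axis=0)
    y_mean = Y.mean(axis=0)
    Xc = X - x_mean
    Yc = Y - y_mean
    n_features = X.shape[1]
    # Closed form: W = (Xc^T Xc + alpha * I)^-1 Xc^T Yc
    W = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(n_features), Xc.T @ Yc)
    b = y_mean - x_mean @ W
    return W, b


def predict(X, W, b):
    # Predicted voxel activations for new stimuli.
    return X @ W + b
```

In practice, sklearn.linear_model.Ridge fits the same kind of model and is the more convenient choice; alpha would normally be tuned on held-out data.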


Evaluation

We provide code for three common evaluation procedures:

  • pairwise evaluation
  • voxel-wise evaluation
  • representational similarity analysis
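The three procedures listed above can be sketched as follows. This is a simplified illustration, not the code in this repository: the function names are hypothetical, the pairwise measure assumes cosine distance, and the RSA score compares dissimilarity matrices with Pearson correlation (Spearman correlation is also commonly used).

```python
from itertools import combinations

import numpy as np


def cosine_distance(a, b):
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))


def pairwise_accuracy(pred, true):
    """Fraction of stimulus pairs where the correct matching of predictions
    to observed scans is closer than the swapped matching."""
    pairs = list(combinations(range(len(true)), 2))
    correct = 0
    for i, j in pairs:
        matched = cosine_distance(pred[i], true[i]) + cosine_distance(pred[j], true[j])
        swapped = cosine_distance(pred[i], true[j]) + cosine_distance(pred[j], true[i])
        correct += int(matched < swapped)
    return correct / len(pairs)


def voxelwise_correlation(pred, true):
    """Pearson correlation between predicted and observed activity,
    computed separately for each voxel (column)."""
    pc = pred - pred.mean(axis=0)
    tc = true - true.mean(axis=0)
    denom = np.linalg.norm(pc, axis=0) * np.linalg.norm(tc, axis=0)
    return (pc * tc).sum(axis=0) / denom


def rsa_score(a, b):
    """Correlate the stimulus-by-stimulus dissimilarity matrices of two
    representations (rows are stimuli)."""
    rdm_a = 1.0 - np.corrcoef(a)
    rdm_b = 1.0 - np.corrcoef(b)
    upper = np.triu_indices(len(a), k=1)  # each pair once, no diagonal
    return np.corrcoef(rdm_a[upper], rdm_b[upper])[0, 1]
```

All three functions expect arrays with one row per stimulus; pred and true must have the same shape, while a and b in rsa_score may differ in the number of columns.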


Requirements

  • NumPy
  • scikit-learn
  • AllenNLP (for ELMo)
  • spaCy for tokenization; download the model with: python -m spacy download en_core_web_lg
  • pandas, matplotlib, seaborn and nilearn, in case you want to plot the results.