
Text2fMRI: Brain Encoding Models using LLMs


Text2fMRI is a lightweight, tutorial-driven framework for building brain encoding models. It demonstrates how to predict whole-brain fMRI responses solely from video transcripts using Large Language Models (LLMs).

Originally developed for the Advanced Neuroimaging course at the Max Planck Institute for Human Cognitive and Brain Sciences, this repository serves as both an educational guide and a competitive baseline model for the Algonauts 2025 Challenge.


🚀 Quick Start

The easiest way to explore this project is via Google Colab. No local installation is required, and you get access to free GPUs.

Click here to launch the interactive tutorial

In this notebook, you will:

  1. Extract semantic features from text using a frozen LLM (e.g., Qwen-0.5B).
  2. Align text features to fMRI repetition time (TR) windows.
  3. Train a linear encoding model (or a lightweight Transformer) to map semantics to brain activity.
  4. Visualize predictions across different brain regions (Visual, Auditory, Language networks).
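Steps 2 and 3 above can be sketched in a few lines of NumPy. This is an illustrative toy, not the notebook's actual code: the variable names, array shapes, and the TR length are assumptions, and the notebook may use a different pooling scheme or regularization.

```python
import numpy as np

def pool_features_to_trs(word_feats, word_times, n_trs, tr=1.49):
    """Mean-pool word-level LLM features into TR windows.

    word_feats: (n_words, dim) feature matrix; word_times: onset in seconds.
    TRs with no words get a zero row. tr=1.49 s is an assumed repetition time.
    """
    pooled = np.zeros((n_trs, word_feats.shape[1]))
    for t in range(n_trs):
        mask = (word_times >= t * tr) & (word_times < (t + 1) * tr)
        if mask.any():
            pooled[t] = word_feats[mask].mean(axis=0)
    return pooled

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression mapping features X (TRs x dim)
    to brain responses Y (TRs x voxels). Returns W of shape (dim, voxels)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ Y)

# Toy data: 100 words with random 8-d features, 20 TRs, 5 "voxels"
rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 8))
times = np.sort(rng.uniform(0, 20 * 1.49, size=100))
X = pool_features_to_trs(feats, times, n_trs=20)
Y = rng.normal(size=(20, 5))
W = fit_ridge(X, Y, alpha=10.0)
pred = X @ W  # predicted fMRI responses, shape (20, 5)
```

The ridge penalty matters here because LLM feature dimensions are typically much larger than the number of TRs, so an unregularized fit would overfit badly.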

🧠 Model Description

Despite its lightweight architecture (approx. 52M trainable parameters), Text2fMRI achieves state-of-the-art (SOTA) level performance in auditory and language-selective cortices.

  • Input: Text transcripts from movies/videos (No audio or pixel data used).
  • Backbone: Uses feature extraction from pre-trained causal LLMs.
  • Dataset: Trained on the CNeuroMods dataset (Friends and Movie10).
  • Efficiency: Designed to run on consumer hardware while beating standard baselines.
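Encoding-model performance of the kind claimed above is conventionally scored as the Pearson correlation between predicted and measured BOLD time courses, computed independently for each voxel. A minimal sketch with synthetic data (shapes and names are illustrative, not taken from this repository):

```python
import numpy as np

def voxelwise_correlation(y_true, y_pred):
    """Pearson r between measured and predicted responses, one value per voxel.

    Both inputs have shape (n_trs, n_voxels); returns shape (n_voxels,).
    """
    yt = y_true - y_true.mean(axis=0)
    yp = y_pred - y_pred.mean(axis=0)
    num = (yt * yp).sum(axis=0)
    den = np.sqrt((yt ** 2).sum(axis=0) * (yp ** 2).sum(axis=0))
    return num / den

# Synthetic check: a prediction that is the truth plus moderate noise
rng = np.random.default_rng(0)
truth = rng.normal(size=(200, 4))                # 200 TRs, 4 voxels
noisy = truth + 0.5 * rng.normal(size=(200, 4))  # correlated prediction
r = voxelwise_correlation(truth, noisy)          # per-voxel r, well above chance
```

Mapping these per-voxel scores back onto the cortical surface is what produces the region-wise visualizations (visual, auditory, language networks) mentioned in the tutorial.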

💻 Local Installation

If you prefer to run this locally instead of on Colab, this project uses a pyproject.toml for dependency management.

  1. Clone the repository:

    git clone https://github.com/ShreyDixit/Text2fMRI.git
    cd Text2fMRI
  2. Install dependencies: We recommend using pip or uv.

    # Using uv
    uv sync

    # Or using pip (editable install from pyproject.toml)
    pip install -e .

📚 Course Context

This material was originally created for the Brain Encoding Models lecture/workshop (2025). It bridges the gap between modern NLP (Transformers) and Neuroscience, showing how artificial neural networks can serve as proxy models for biological brains.

Citation

If you use this notebook or code in your research or coursework, please cite:

@software{dixit_2026_text2fmri,
  author       = {Dixit, Shrey},
  title        = {{Text2fMRI: Brain Encoding Models using LLMs (Course Materials)}},
  year         = 2026,
  publisher    = {Zenodo},
  version      = {v0.1.1},
  doi          = {10.5281/zenodo.18369791},
  url          = {https://doi.org/10.5281/zenodo.18369791}
}
