giocoal/algonauts2023-image-fMRI-encoding-model

Code for the Master's Thesis "Deep Neural Encoding Models of the Human Visual Cortex to Predict fMRI Responses to Natural Visual Scenes".

Research Internship - MSc in Data Science - University of Milano-Bicocca - Imaging and Vision Laboratory.

This repository contains my submission to the Algonauts Project 2023 Challenge (id: giorgiocarbone).


Table of contents

  • Introduction
  • Dataset
  • Requirements
  • Status
  • Contact

Introduction

One of the main objectives of computational neuroscience is to comprehend the biological mechanisms that enable humans to perceive, process, and understand complex visual scenes. Visual neural encoding models are computational models that mimic the hierarchical processes underlying the human visual system and aim to explain the relationship between visual stimuli and corresponding neural activations evoked in the human visual cortex. A visual encoder can serve as a structured system for testing biological hypotheses concerning how visual information is processed, represented, and organized in the human brain.

The main objective of this thesis is to develop a comprehensive voxel-based and subject-specific image-fMRI neural encoding model of the human visual cortex based on Deep Neural Networks (DNNs) and transfer learning for the prediction of local neural blood oxygen level-dependent (BOLD) functional magnetic resonance imaging (fMRI) responses to complex visual stimuli.

We applied a two-step linearizing strategy to visual encoding, based on two separate computational models. The first performs the non-linear feature mapping of the stimulus image into its latent representations, employing pre-trained computer vision DNNs as feature extractors. The second performs the linear activity mapping of the visual features into the BOLD response amplitudes of the individual voxels: Principal Component Analysis (PCA) reduces the dimensionality of the visual features, and independent ridge regression models map the PCA components to the activity of each voxel.
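A minimal sketch of this two-step pipeline is shown below. The model, layer name, PCA size, ridge penalty, and the train_images / test_images / y_train inputs are illustrative assumptions, not the exact thesis configuration:

```python
import torch
from torchvision.models import alexnet, AlexNet_Weights
from torchvision.models.feature_extraction import create_feature_extractor
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge

# Step 1: non-linear feature mapping with a pre-trained DNN
# (AlexNet and the "features.5" node are illustrative choices).
weights = AlexNet_Weights.IMAGENET1K_V1
extractor = create_feature_extractor(alexnet(weights=weights).eval(),
                                     return_nodes=["features.5"])
preprocess = weights.transforms()

def extract_features(images):
    """images: list of PIL images -> (n_images, n_features) NumPy array."""
    with torch.no_grad():
        batch = torch.stack([preprocess(img) for img in images])
        feats = extractor(batch)["features.5"]
    return feats.flatten(start_dim=1).numpy()

# train_images / test_images and the betas y_train, shape (n_train,
# n_voxels), are assumed to be loaded elsewhere.
X_train = extract_features(train_images)
X_test = extract_features(test_images)

# PCA reduces the dimensionality of the visual features.
pca = PCA(n_components=100).fit(X_train)
X_train_pc, X_test_pc = pca.transform(X_train), pca.transform(X_test)

# Step 2: linear activity mapping. With a 2-D target, sklearn's Ridge
# fits one independent linear model per voxel (per column).
ridge = Ridge(alpha=1e4).fit(X_train_pc, y_train)
pred = ridge.predict(X_test_pc)  # predicted betas, (n_test, n_voxels)
```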

Furthermore, in order to meet the criteria of mappability and predictivity that characterize a good encoding model, we adopted a ROI-wise and mixed encoding strategy: voxels belonging to different regions of interest (ROIs, groups of voxels that share functional properties) are encoded by separate models, so as to maximize accuracy both across the entire visual cortex and within individual ROIs. To determine the best feature mapping method for each ROI, we tested the extraction of visual features from layers at varying depths of several pre-trained Convolutional Neural Networks (AlexNet, ZFNet, RetinaNet, EfficientNet-B2, VGG-16, VGG-19) and Vision Transformers (ViTs), characterized by different training parameters (training goal, training dataset, and learning method). During this testing phase, a similarity and functional alignment between the hierarchical architecture of the pre-trained DNNs and the structure of the visual cortex emerged, a result that motivated the use of the ROI-wise strategy.
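The ROI-wise selection can be sketched as a validation loop: for each ROI, every candidate (model, layer) feature space is scored and the best one is kept. Here feature_spaces (name to PCA-reduced train/validation features), roi_masks (ROI name to boolean voxel mask), and the y_train / y_val betas are hypothetical placeholders:

```python
import numpy as np
from sklearn.linear_model import Ridge

def voxelwise_corr(y_true, y_pred):
    """Pearson correlation per voxel (per column), vectorized."""
    yt = y_true - y_true.mean(axis=0)
    yp = y_pred - y_pred.mean(axis=0)
    return (yt * yp).sum(axis=0) / (
        np.linalg.norm(yt, axis=0) * np.linalg.norm(yp, axis=0))

# feature_spaces: {"alexnet/features.5": (X_train_pc, X_val_pc), ...}
# roi_masks: {"V1v": boolean mask over voxel columns, ...}
best_space = {}
for roi, mask in roi_masks.items():
    scores = {}
    for name, (X_tr, X_va) in feature_spaces.items():
        ridge = Ridge(alpha=1e4).fit(X_tr, y_train[:, mask])
        r = voxelwise_corr(y_val[:, mask], ridge.predict(X_va))
        scores[name] = np.median(r)
    best_space[roi] = max(scores, key=scores.get)  # best layer for this ROI
```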

In predicting the neural responses to the images of the Algonauts Project 2023 Challenge test set, the proposed model achieves an overall accuracy score of 0.52, expressed as the Median Noise Normalized Squared Correlation (MNNSC) across all voxels of the cortical surfaces of all subjects, outperforming the baseline model proposed by the challenge organizers (which achieved a score of 0.41). These results demonstrate the effectiveness of mixed, ROI-wise, deep, and transfer-learning-based approaches to image-fMRI visual encoding modeling.
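For reference, a sketch of the score as described above: each voxel's squared Pearson correlation is divided by that voxel's noise-ceiling estimate (the challenge distributes such per-voxel estimates), and the median is taken over voxels; aggregation across subjects is omitted here.

```python
import numpy as np

def mnnsc(y_true, y_pred, noise_ceiling):
    """Median Noise Normalized Squared Correlation.

    y_true, y_pred: (n_stimuli, n_voxels); noise_ceiling: (n_voxels,),
    the maximum squared correlation attainable given measurement noise.
    """
    yt = y_true - y_true.mean(axis=0)
    yp = y_pred - y_pred.mean(axis=0)
    r = (yt * yp).sum(axis=0) / (
        np.linalg.norm(yt, axis=0) * np.linalg.norm(yp, axis=0))
    return np.median(r ** 2 / noise_ceiling)
```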

Dataset

The thesis project was developed using the Algonauts Project 2023 Challenge dataset, a large collection of eight subjects' fMRI responses to visual scenes. During the fMRI scans, each subject viewed 9,000-10,000 color images of natural scenes, and the corresponding activations for the 39,548 voxels of the visual cortex were encoded as betas: single-value estimates of the amplitude of the BOLD fMRI response that indirectly represent the stimulus-evoked activation or deactivation of the neurons within a voxel.
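Assuming the challenge's per-subject directory layout (the paths below are illustrative; adapt them to your local copy of the dataset), the betas are distributed as plain NumPy arrays, one per hemisphere:

```python
import os
import numpy as np

data_dir = "algonauts_2023_data/subj01"  # hypothetical local path
fmri_dir = os.path.join(data_dir, "training_split", "training_fmri")

# One (n_stimuli, n_voxels) array of betas per hemisphere.
lh = np.load(os.path.join(fmri_dir, "lh_training_fmri.npy"))
rh = np.load(os.path.join(fmri_dir, "rh_training_fmri.npy"))
betas = np.concatenate([lh, rh], axis=1)  # both hemispheres, all voxels
```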

Requirements

  • Python 3.9.16
  • CUDA Toolkit 11.6
  • cuDNN 8.3.2
  • Pillow 9.2.0
  • NiBabel 5.2.0
  • Nilearn 0.10.3
  • Plotly 5.14.1
  • torch 1.13.0
  • torchvision 0.14.0
  • Transformers 4.31.0
  • PyTorchCV 0.0.67
  • EfficientNet-PyTorch 0.7.1
  • matplotlib 3.5.2
  • numpy 1.22.4
  • pandas 1.5.3
  • scikit-learn 1.1.1
  • scipy 1.7.3
  • tqdm 4.64.1
  • torchmetrics 0.11.4

Status

Project is: Done

Contact

Feel free to contact me!
