# Bayesian Data Analysis for Neuroscience Data

## Introduction

This notebook is part of a 20-week internship project carried out at Ulster University.
The main goal of the project is to make advanced Bayesian statistical models more accessible
to experimental neuroscientists through user-friendly code, tutorials, and examples.

Specifically, this notebook applies existing Bayesian models to neuroscience datasets
using Python libraries.

## Objectives

1. Apply existing Bayesian models to a neuroscience dataset from scratch,
   documenting each step so that a beginner can follow it.

2. Design a simple and reproducible analysis pipeline using PyMC.

3. Produce clear, well-documented code that can later be integrated into
   an interactive tutorial or a web application.

## Author

- Mathis DA SILVA
- Ulster University Internship (July–December 2025)
- Supervisors: Dr. Cian O'Donnell & Dr. Conor Houghton

## References

- [Dataset from "Classification of psychedelics and psychoactive drugs based on brain-wide imaging of cellular c-Fos expression"](https://www.nature.com/articles/s41467-025-56850-6#Sec25)
- [Hierarchical Bayesian modeling of multi-region brain cell count data](https://elifesciences.org/reviewed-preprints/102391v1)
- [Statistical Rethinking 2023 PDF](https://civil.colorado.edu/~balajir/CVEN6833/bayes-resources/RM-StatRethink-Bayes.pdf)
- [Statistical Rethinking 2023 Videos](https://www.youtube.com/watch?v=FdnMWdICdRs&list=PLDcUM9US4XdPz-KxHM4XHt7uUVGWWVSus)

We start by importing the libraries used in this notebook so far.

In [1]:
import math
import numpy as np
import pymc as pm
import pandas as pd



We will use the dataset from the paper "Classification of psychedelics and psychoactive drugs based on brain-wide imaging of cellular c-Fos expression".



In [2]:
dataset = pd.read_excel('data/dataset_neuroscience_vo.xlsx')

dataset

Unnamed: 0,abbreviation,region name,brain area,5MEO1 count,5MEO2 count,5MEO3 count,5MEO4 count,5MEO5 count,5MEO6 count,5MEO7 count,...,PSI7 count,PSI8 count,SAL1 count,SAL2 count,SAL3 count,SAL4 count,SAL5 count,SAL6 count,SAL7 count,SAL8 count
0,FRP,Frontal pole cerebral cortex,Cortex,9574,7781,17598,4425,7428,8302,4288,...,3367,4342,7404,4925,12521,10363,4562,14383,789,6067
1,ILA,Infralimbic area,Cortex,12138,6742,28070,1685,15612,17191,6061,...,7591,5778,9665,8049,10853,2844,15747,15412,11667,21630
2,ORBl,Orbital area lateral part,Cortex,48129,45849,120147,28655,40438,54206,39938,...,8291,14603,56825,30618,58755,14705,26686,59049,5192,36019
3,ORBm,Orbital area medial part,Cortex,17225,8551,34163,6330,14908,23250,7993,...,4878,7177,13035,16101,14017,7855,14478,30999,10051,28545
4,ORBvl,Orbital area ventrolateral part,Cortex,32690,24460,58132,16015,24182,31926,15148,...,5081,9116,37775,26349,33593,10743,21789,37644,9320,23276
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
317,,,,,,,,,,,...,,,,,,,,,,
318,,,,,,,,,,,...,,,,,,,,,,
319,,,,2874169,2741911,6084926,2172441,2213387,3073534,3023897,...,1727876,2008088,3433408,2408396,2498773,916741,2420846,1841269,1139141,3856055
320,,,,,,,,,,,...,,,,,,,,,,


#### Notes:

The cell above loads the dataset. Its first three descriptive columns identify each brain region: abbreviation, full name, and broader brain area. Each remaining column holds the counts for one mouse, and the mice are grouped by drug treatment, such as **MDMA**, **Ketamine**, and **Fluoxetine**.
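The grouping of mouse columns by treatment can be recovered from the column names. The sketch below is a minimal illustration, assuming every mouse column follows the pattern visible in the output above (a group prefix, a mouse number, then ` count`, as in `5MEO1 count`). The helper name `group_count_columns` and the example column list are hypothetical and are not part of the original notebook.

```python
import re
from collections import defaultdict

# Assumed naming pattern: "<group><mouse number> count", e.g. "5MEO1 count".
COLUMN_PATTERN = re.compile(r"^(?P<group>.+?)(?P<mouse>\d+) count$")


def group_count_columns(columns):
    """Map each treatment-group prefix to its list of per-mouse column names."""
    groups = defaultdict(list)
    for name in columns:
        match = COLUMN_PATTERN.match(str(name))
        if match:  # descriptive columns such as "region name" do not match
            groups[match.group("group")].append(name)
    return dict(groups)


# A short, hand-written subset of column names used only for illustration.
example_columns = [
    "abbreviation", "region name", "brain area",
    "5MEO1 count", "5MEO2 count", "PSI7 count", "SAL1 count", "SAL8 count",
]
groups = group_count_columns(example_columns)
print({group: len(cols) for group, cols in groups.items()})
# prints: {'5MEO': 2, 'PSI': 1, 'SAL': 2}
```

With the notebook's own data, the same helper could be called as `group_count_columns(dataset.columns)`. If some group prefixes in the full file do not follow the assumed pattern, the regular expression would need adjusting.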

There are **64 mice** in total, each with one value per brain region. Each value is the number of cells expressing c-Fos, a marker of **neuronal activity**. The dataset covers **315 brain regions**.
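The displayed table ends with rows that have no region label, so the raw row count is larger than the number of regions. One way to check the stated counts is sketched below on a small made-up frame with the same layout. It assumes that a row is a real brain region only if its `abbreviation` is present, and that every column ending in ` count` is one mouse. The function name `summarize_layout` and the toy values are hypothetical.

```python
import numpy as np
import pandas as pd

# Toy frame mimicking the layout of `dataset`; the numbers are made up.
toy = pd.DataFrame(
    {
        "abbreviation": ["FRP", "ILA", np.nan, np.nan],
        "region name": ["Frontal pole cerebral cortex", "Infralimbic area", np.nan, np.nan],
        "brain area": ["Cortex", "Cortex", np.nan, np.nan],
        "SAL1 count": [10.0, 20.0, np.nan, 30.0],
        "SAL2 count": [5.0, 15.0, np.nan, 20.0],
    }
)


def summarize_layout(df):
    """Return (cleaned frame, number of regions, number of mice)."""
    # Keep only rows that describe a named brain region.
    regions = df.dropna(subset=["abbreviation"]).reset_index(drop=True)
    # Treat every column whose name ends in " count" as one mouse.
    count_columns = [c for c in regions.columns if str(c).endswith(" count")]
    return regions, len(regions), len(count_columns)


regions, n_regions, n_mice = summarize_layout(toy)
print(n_regions, n_mice)
# prints: 2 2
```

Calling `summarize_layout(dataset)` on the real data should report 315 regions and 64 mice if the description above holds. That result has not been verified here.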