Here, we investigate sex differences in structural MRI-derived measures of brain volume in both humans and mice.
This repository contains all of the data, code, and text necessary to rerun the analyses in the preprint: Elisa Guma, Antoine Beauchamp, Siyuan Liu, Elizabeth Levitis, Jacob Ellegood, Linh Pham, Rogier B Mars, Armin Raznahan*, Jason P Lerch* (2023). Comparative neuroimaging of sex differences in human and mouse brain anatomy. Preprint: https://doi.org/10.1101/2023.08.23.554334 (*equal contribution).
Article accessible here: https://www.biorxiv.org/content/10.1101/2023.08.23.554334v1
We use cross-species structural magnetic resonance imaging to carry out the first comparative neuroimaging study of sex-biased neuroanatomical organization of the human and mouse brain.
In the input data folder, you will find the raw and pre-processed human and mouse data used for this study.
The human data come from the Human Connectome Project (HCP) 1200 release (3T T1-weighted 0.7 mm³ sMRIs from healthy young adults, 597 females/496 males aged 22-35 years).
HCP_demographics_QC.csv: demographics file
df_HCP_volumes_clean.csv (also available as RDS): aggregated volumes used for analysis, including cortex (Glasser), subcortex (aseg), amygdala and hippocampal subfields, brainstem subdivisions, and hypothalamus volumes. These were generated from the data files in the raw_data_human folder, which also contains the R script used to clean the data.
HCP_homologs_aggregated.csv (also available as RDS): aggregated volume for homologous brain regions, derived from the script found in the analysis_scripts/cross_species/ folder.
mz_twins_1.csv: contains info for twin and sibling pairs for HCP
Volumes_150.RData: mouse volumes in tree form as RData
Allen_hierarchy_definitions.json: required to create a tree and ascribe hierarchy to regions (i.e., parents, children, siblings, etc)
DSURQE_40micron_average.mnc: average brain file for DSURQE atlas useful for plotting
DSURQE_40micron_labels.mnc: label file for DSURQE atlas
DSURQE_40micron_R_mapping.mnc: mapping of labels from DSURQE atlas
full_mouse_sampletree.RDS: mouse data in tree form as an RDS
Mouse_demographics: demographics for mice
Mouse_homologs_aggregated_postcombat.csv (also available as RDS): volumes from aggregate homologous regions, derived from the script found in the analysis_scripts/cross_species/ folder.
mouse_volumes_tree.RData: mouse volumes in tree form
Variance_analysis_mouse_2.csv and Variance_analysis_mouse_noTTV_2.csv: required to run the variance analysis
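To illustrate what "ascribing hierarchy to regions" with Allen_hierarchy_definitions.json involves, here is a minimal Python sketch (the repository's own code is in R). It assumes a nested layout in which every node carries a `name` and a `children` list; that layout, the toy region names, and the helper functions `build_parent_map` and `siblings` are hypothetical and are not taken from the actual file or scripts.

```python
import json

# Toy hierarchy. The nested "name"/"children" layout is an assumption,
# not the verified schema of Allen_hierarchy_definitions.json.
TOY_HIERARCHY = json.loads("""
{
  "name": "root",
  "children": [
    {"name": "cortex", "children": [
      {"name": "region_a", "children": []},
      {"name": "region_b", "children": []}
    ]},
    {"name": "subcortex", "children": [
      {"name": "region_c", "children": []}
    ]}
  ]
}
""")


def build_parent_map(node, parent=None, parents=None):
    """Walk the tree depth-first and record each node's parent name."""
    if parents is None:
        parents = {}
    parents[node["name"]] = parent
    for child in node.get("children", []):
        build_parent_map(child, node["name"], parents)
    return parents


def siblings(name, parents):
    """Return the other nodes that share a parent with `name`."""
    parent = parents[name]
    return sorted(n for n, p in parents.items() if p == parent and n != name)


parent_of = build_parent_map(TOY_HIERARCHY)
print(parent_of["region_a"])            # parent of region_a
print(siblings("region_a", parent_of))  # nodes sharing that parent
```

Once a parent map like this exists, volumes of child regions can be summed upward to any level of the hierarchy, which is the kind of operation a region tree makes convenient.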
Raw data files:
Mouse_expression_matrix.csv: raw gene expression data from Allen Mouse Brain Atlas for all genes
Human_expression_matrix.csv: raw gene expression data from Allen Human Brain Atlas for all genes
MouseHumanGeneHomologs_edited.csv: list of homologous genes for humans and mice
Aggregated_human_gene_expression.csv (also available as .RDS): averaged human gene expression data for homologous brain regions
Chromosome info:
Human_chromosome_genes.csv: links human gene names to location on chromosomes
Mouse_chromosome_fullinfo.csv: links mouse gene names to location on chromosomes
Chromosome_key_allen.csv: required for the mouse data
Weights_for_aggregation_human.csv: information required to compute the weighted gene average based on the volume of subregions in the human atlas
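The volume-weighted gene average that Weights_for_aggregation_human.csv supports can be sketched in a few lines of Python. This is only an illustration of the general idea (larger subregions contribute more to the regional average); the function name, the subregion labels, and the numbers are hypothetical, and the repository's R scripts may compute the weights differently.

```python
def volume_weighted_mean(expression, volumes):
    """Average expression over subregions, weighting each by its volume."""
    if expression.keys() != volumes.keys():
        raise ValueError("expression and volumes must cover the same subregions")
    total_volume = sum(volumes.values())
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    return sum(expression[r] * volumes[r] for r in expression) / total_volume


# Hypothetical values for one gene in two subregions of one parent region.
expression = {"subregion_1": 2.0, "subregion_2": 4.0}
volumes = {"subregion_1": 300.0, "subregion_2": 100.0}

weighted = volume_weighted_mean(expression, volumes)
unweighted = sum(expression.values()) / len(expression)
print(weighted, unweighted)
```

In this toy case the weighted value is pulled toward the larger subregion, whereas the plain mean treats both subregions equally.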
Aggregated data:
Aggregated_mouse_gene_expression.csv (also available as .RDS): averaged mouse gene expression data for homologous brain regions
Transposed_homologous_human_genes_weighted.csv (also available as .RDS): averaged homologous human gene expression data for homologous brain regions
Transposed_homologous_mouse_genes.csv (also available as .RDS): averaged homologous mouse gene expression data for homologous brain regions
Other analysis files:
Df_corr_expression.csv: correlation of all homologous genes across homologous brain regions
X_chromosome_genes.csv: list of X-chromosome genes that are also homologous
tree_tools.R: tools required to prune the tree, used by the script in the analysis_scripts/cross_species/ folder.
Sex hormone files:
Homologous sex hormone genes: sex_hormone_genes_short.csv
Homologous androgen genes: sex_hormone_genes_male_short.csv
Homologous estrogen and progesterone genes: sex_hormone_genes_female_short.csv
The scripts in the analysis_scripts folder perform all analyses included in the manuscript and are also used to generate the tables and figures.
Cross_species:
Sex-diffs-homologous-ROIs.Rmd: aggregates volumes for homologous brain regions in humans and mice; runs ComBat harmonization on the mouse data to account for differences between studies; runs a linear model in each species testing for sex differences, covarying for total brain volume, age, and Euler number in humans, and for total brain volume, age, and background strain in mice; and correlates the resulting effect sizes for sex across all regions, and across cortical and non-cortical regions separately.
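The final step described above, correlating per-region sex effect sizes between species, can be sketched as follows. This is a Python illustration rather than the repository's R code: the effect sizes, the cortical flags, and the choice of Pearson correlation are assumptions made for the example.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical sex effect sizes for six homologous regions (same order in both lists).
human_effects = [0.2, -0.1, 0.4, 0.0, 0.3, -0.2]
mouse_effects = [0.1, -0.2, 0.3, 0.1, 0.2, -0.1]
is_cortical = [True, True, True, False, False, False]  # hypothetical labels


def subset(values, flags, keep):
    """Keep the values whose flag equals `keep`."""
    return [v for v, f in zip(values, flags) if f == keep]


r_all = pearson_r(human_effects, mouse_effects)
r_cortex = pearson_r(subset(human_effects, is_cortical, True),
                     subset(mouse_effects, is_cortical, True))
r_noncortex = pearson_r(subset(human_effects, is_cortical, False),
                        subset(mouse_effects, is_cortical, False))
print(r_all, r_cortex, r_noncortex)
```

Keeping the two effect-size lists in the same region order is what makes the comparison meaningful; in practice a join on a shared region name would be a safer way to guarantee that alignment.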
Gene_expression_scripts
Gene-resampling.Rmd: script used to generate a null distribution as a control for gene subset analyses
Homologous_gene_analysis.Rmd: script used to evaluate whether the cross-species similarity of neuroanatomical sex differences is related to the cross-species similarity of homologous gene expression patterns across homologous brain regions
Human_expression_data_cleanup.Rmd: script used to filter gene expression data to homologous brain regions, and to filter those genes to only include homologous genes.
Mouse_expression_data_cleanup.Rmd: script used to filter gene expression data to homologous brain regions, and to filter those genes to only include homologous genes.
Sex-hormone-gene-lists.Rmd: script used to query the Gene Ontology database for a list of sex hormone genes (androgen, estrogen, and progesterone). The script also filters the list to homologous genes.
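The idea behind a resampling control for gene subset analyses, as in Gene-resampling.Rmd, can be sketched in Python. The sketch draws random gene subsets of the same size as the subset of interest and asks how often they score at least as high. The per-gene statistic, the number of resamples, the one-sided test, and all function names are assumptions for illustration; the actual script may differ in each of these.

```python
import random


def resampling_null(values, subset_size, n_resamples, seed=0):
    """Mean statistic of random gene subsets drawn without replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [
        sum(rng.sample(values, subset_size)) / subset_size
        for _ in range(n_resamples)
    ]


def empirical_p(observed, null):
    """One-sided p-value: share of null draws >= observed, with +1 correction."""
    exceed = sum(1 for v in null if v >= observed)
    return (exceed + 1) / (len(null) + 1)


# Hypothetical per-gene statistic for 100 homologous genes.
all_gene_values = [i / 100 for i in range(100)]
# Hypothetical subset of interest (for example, a sex hormone gene list).
subset_values = all_gene_values[-5:]

observed = sum(subset_values) / len(subset_values)
null = resampling_null(all_gene_values, len(subset_values), n_resamples=1000)
p_value = empirical_p(observed, null)
print(observed, p_value)
```

Matching the random subsets to the size of the real subset keeps the null distribution comparable, since the mean of a small subset varies more than the mean of a large one.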
Human_anatomy
Human_total_regional_volume_analysis.Rmd: script used to run linear model to look at sex differences in total and regional brain volume and variance in volume.
Human_analysis_notwinpairs.Rmd: script used to assess whether including twin pairs affects the sex-difference maps
Mouse_anatomy
Mouse_total_regional_volume_analysis.Rmd: script used to run linear model to look at sex differences in total and regional brain volume and variance in volume.
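As a rough illustration of the kind of linear model these anatomy scripts run, the Python sketch below fits a regional volume on sex plus covariates by ordinary least squares and reads off the sex coefficient. It is not the repository's R code: the covariates (total brain volume and age) are borrowed from the cross-species model description, the data are noise-free toy values, and the function names are hypothetical. The coefficient shown is unstandardized; converting it to a standardized effect size is not covered here.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_ols(design, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical subjects: (is_male, total_brain_volume, age).
subjects = [(0, 1.00, 22), (1, 1.20, 25), (0, 1.10, 30),
            (1, 1.30, 35), (0, 1.05, 28), (1, 1.15, 23)]
# Coefficients used to generate the toy volumes: intercept, sex, TBV, age.
TRUE_BETA = [10.0, 2.0, 0.5, 0.1]

design = [[1.0, float(male), tbv, float(age)] for male, tbv, age in subjects]
volumes = [sum(b * x for b, x in zip(TRUE_BETA, row)) for row in design]

beta = fit_ols(design, volumes)
sex_effect = beta[1]  # adjusted male-minus-female difference in volume
print(sex_effect)
```

Because the toy volumes are generated without noise, the fit recovers the generating coefficients up to floating-point error; with real data one would also want standard errors and multiple-comparison correction across regions.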