Code for reproducing results in the paper "Topological gene-expression networks recapitulate brain anatomy and function"


TDA summary for gene-expression: Mapper Algorithm in 2D

Authors: Alice Patania, Pierluigi Selvaggi, Mattia Veronese, Ottavia Dipasquale, Paul Expert, and Giovanni Petri

Abstract

Understanding how gene expression translates to and affects human behaviour is one of the ultimate aims of neuroscience. In this paper, we present a pipeline based on Mapper, a topological simplification tool, to produce and analyze gene co-expression data. We first validate the method by reproducing key results from the literature on the Allen Human Brain Atlas, and the correlations between resting-state fMRI and gene co-expression maps. We then analyze a dopamine-related gene set and find that the co-expression networks produced by Mapper return a structure that matches the well-known anatomy of the dopaminergic pathway. Our results suggest that topological network descriptions can be a powerful tool to explore the relationships between genetic pathways and their association with brain function and its perturbation due to illness and/or pharmacological challenge.

DISCLAIMER: Running all the scripts in this repository produces the list of all the results found in the paper, but not the figures or the standard exploratory analysis (i.e. the histograms and KS tests). I am willing to change this if anyone needs it: write to me or open an issue.

To do list:

  • put up the datasets
  • make a parameters selection script
  • make all code into scripts that can be run from command line
  • make a script to compute the agreement matrix
  • make the shortest path script
  • write a tutorial on how to run all the code
  • find a way to host the datasets that are too big for git
  • add the list of sample ids used by Richiardi et al. in their paper
  • add dependencies

Content:

  1. data:
    • normalized dataset: Download the data used in the study here.
    • the two lists of genes used in the study
      • dopamine.txt
      • richiardi.txt
  2. code:
    • MapperTools.py: All the functions needed to build the graph
    • parameters.py: Computes the statistics used for the choice of parameters.
      Takes as input the dataset id (dopamine, richiardi, or full) and saves the statistics as a csv file in the folder output.
    • selection.py: Selects the optimal parameters using the output from parameters.py.
      Takes as input the dataset id (dopamine, richiardi, or full) and saves the parameters as a txt file in the folder output.
    • run.py: Builds the graph for the optimal parameters found by selection.py.
      Takes as input the dataset id (dopamine, richiardi, or full) and saves the adjacency matrix and node information as 2 pickled dictionaries in the folder output.
    • agreement_matrix.py: Computes the agreement matrix for the different graphs built by run.py.
      Takes as input the dataset id (dopamine, richiardi, or full) and saves the matrix as a pickled pandas DataFrame in the folder output.
    • shortest_path.py: Computes the shortest path from the nodes containing samples of the VTA and substantia nigra to the rest of the brain.
      Takes as input the dataset id (dopamine, richiardi, or full) and saves the information for each node as a pickled dictionary in the folder output.
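
For orientation, the general Mapper construction with a two-dimensional lens can be sketched in a few lines. This is a generic illustration of the technique, not the code in MapperTools.py: the function names, the single-linkage clustering step, and the parameters n_intervals, overlap and eps are all assumptions made for this example. The idea is to cover the 2D lens values with overlapping rectangles, cluster the samples that fall in each rectangle, and connect two clusters whenever they share a sample.

```python
from itertools import combinations, product


def overlapping_intervals(lo, hi, n, overlap):
    """Cover [lo, hi] with n equal intervals; neighbours share a fraction `overlap`."""
    length = (hi - lo) / (n - (n - 1) * overlap)
    step = length * (1 - overlap)
    # The last upper bound is pinned to `hi` to avoid losing the maximum to rounding.
    return [(lo + i * step, hi if i == n - 1 else lo + i * step + length)
            for i in range(n)]


def euclidean(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5


def single_linkage(ids, points, eps):
    """Connected components of the graph linking points closer than eps (union-find)."""
    parent = {i: i for i in ids}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in combinations(ids, 2):
        if euclidean(points[a], points[b]) < eps:
            parent[find(a)] = find(b)
    clusters = {}
    for i in ids:
        clusters.setdefault(find(i), set()).add(i)
    return list(clusters.values())


def mapper_2d(points, lens, n_intervals=3, overlap=0.3, eps=1.0):
    """points: coordinate tuples; lens: one (f1, f2) pair of filter values per point.

    Returns (nodes, edges): nodes is a list of sets of point indices,
    edges is a set of index pairs of nodes that share at least one point.
    """
    xs = overlapping_intervals(min(u for u, _ in lens), max(u for u, _ in lens),
                               n_intervals, overlap)
    ys = overlapping_intervals(min(v for _, v in lens), max(v for _, v in lens),
                               n_intervals, overlap)
    nodes = []
    for (x0, x1), (y0, y1) in product(xs, ys):
        inside = [i for i, (u, v) in enumerate(lens)
                  if x0 <= u <= x1 and y0 <= v <= y1]
        nodes.extend(single_linkage(inside, points, eps))
    edges = {(a, b) for a, b in combinations(range(len(nodes)), 2)
             if nodes[a] & nodes[b]}
    return nodes, edges
```

In this sketch the overlap between rectangles is what creates edges: a sample that lies in two rectangles ends up in two clusters, which are then linked.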

Citing

If you make use of this work in your research, please cite the following paper:

Patania, Alice, Pierluigi Selvaggi, Mattia Veronese, Ottavia DiPasquale, Paul Expert, and Giovanni Petri. "Topological gene-expression networks recapitulate brain anatomy and function." bioRxiv (2018): 476382.

Bibtex

@article{patania2018topological,
title={Topological gene-expression networks recapitulate brain anatomy and function},
author={Patania, Alice and Selvaggi, Pierluigi and Veronese, Mattia and DiPasquale, Ottavia and Expert, Paul and Petri, Giovanni},
journal={bioRxiv},
pages={476382},
year={2018},
publisher={Cold Spring Harbor Laboratory}
}

How to reproduce the results:

$ python parameters.py name_gene_list
$ python selection.py name_gene_list
$ python run.py name_gene_list
$ python agreement_matrix.py name_gene_list

where name_gene_list is one of dopamine, richiardi, or full.
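
The four steps can also be chained from a small driver. The sketch below is not part of the repository: pipeline_commands and run_pipeline are hypothetical helper names, the script file names are the ones given in the Content list, and it assumes the driver is started from the code folder so that the scripts are found by relative path.

```python
import subprocess
import sys

# Script order as listed in this README; each script takes the gene-list name.
STEPS = ["parameters.py", "selection.py", "run.py", "agreement_matrix.py"]
VALID_NAMES = ("dopamine", "richiardi", "full")


def pipeline_commands(name_gene_list, python=sys.executable):
    """Return the command lines for the full pipeline, in execution order."""
    if name_gene_list not in VALID_NAMES:
        raise ValueError("name_gene_list must be one of: " + ", ".join(VALID_NAMES))
    return [[python, script, name_gene_list] for script in STEPS]


def run_pipeline(name_gene_list):
    """Run every step in order; check_call raises if a step exits with an error."""
    for command in pipeline_commands(name_gene_list):
        print("Running:", " ".join(command))
        subprocess.check_call(command)
```

Calling run_pipeline("dopamine") then runs the same four commands listed above, stopping at the first step that fails so that later scripts never read incomplete output.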

The file shortest_path.py can be run on any output from run.py and selection.py. In the paper we only looked at the outcomes for the dopamine-related Mapper graphs, but it can be run on any other output.
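
As an illustration of the kind of computation involved, shortest paths from a set of source nodes to every other node of a graph can be obtained with a multi-source breadth-first search. The sketch below is not the code in shortest_path.py: it assumes an unweighted graph stored as a plain dictionary mapping each node to its neighbours, which may differ from the pickled format written by run.py, and the function name is hypothetical.

```python
from collections import deque


def multi_source_shortest_paths(adjacency, sources):
    """Number of edges from the nearest source to each reachable node.

    adjacency: dict mapping a node to an iterable of its neighbours (assumed format).
    sources: nodes to start from, e.g. the nodes containing the samples of interest.
    Nodes that cannot be reached from any source are absent from the result.
    """
    distance = {s: 0 for s in sources}
    queue = deque(distance)
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return distance


# Toy graph: a path 0-1-2-3 plus an isolated node 4.
toy_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2], 4: []}
print(multi_source_shortest_paths(toy_graph, [0, 3]))
```

Starting the search from all sources at once gives, for each node, the distance to its closest source in a single pass over the graph.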

Dependencies

An up-to-date Python 3.5 distribution with the standard packages provided by the Anaconda distribution is required.
In particular, the code was tested with:
pandas (version) etc
