Jupyter notebooks for reproducing the analyses in the paper 'A novel independence test for somatic alterations in cancer shows that biology drives mutual exclusivity but chance explains most co-occurrence'

scanisius/discover-notebooks

DISCOVER supplementary notebooks

The Jupyter notebooks below contain all the code required to reproduce the figures and results of the paper 'A novel independence test for somatic alterations in cancer shows that biology drives mutual exclusivity but chance explains most co-occurrence'.

Running the notebooks requires Jupyter, IPython, and several Python packages. The easiest way to install these dependencies is with Miniconda or Anaconda. The following command creates a conda environment containing all packages required to execute the notebooks.

conda create -n discover-notebooks -c http://ccb.nki.nl/software/discover/repos/conda \
    corclust==0.1 \
    discover==0.9 \
    matplotlib==1.5.1 \
    networkx==1.11 \
    numpy==1.10.4 \
    pandas==0.17.1 \
    pytables==3.2.2 \
    scipy==0.17.0 \
    statsmodels==0.6.1 \
    notebook \
    ipykernel

The notebook named Group test requires a few additional packages, which can be installed with the following command.

conda install -n discover-notebooks -c http://ccb.nki.nl/software/discover/repos/conda -c r -c msys2 \
    switching==0.1 \
    ccomet-with-timeout==1.0.2 \
    rpy2 \
    ipyparallel

Next, activate the newly created environment and start the Jupyter notebook server using the following two commands. Replace <notebook-dir> with the location of the .ipynb files after unzipping the downloaded file.

source activate discover-notebooks
jupyter notebook --notebook-dir=<notebook-dir>

On Windows, the first command should be replaced by:

activate discover-notebooks
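
Once the environment is active, a quick check along the following lines can confirm that the pinned package versions were actually installed. This is only a sketch and not part of the notebooks: the helper name check_versions and the PINNED mapping are made up for illustration, and the mapping covers only the packages whose import names are well known (note that the pytables package is imported as tables).

```python
import importlib

# Versions copied from the conda create command above.
# Keys are import names, which can differ from conda package names
# (pytables is imported as "tables").
PINNED = {
    "matplotlib": "1.5.1",
    "networkx": "1.11",
    "numpy": "1.10.4",
    "pandas": "0.17.1",
    "tables": "3.2.2",
    "scipy": "0.17.0",
    "statsmodels": "0.6.1",
}


def check_versions(pinned):
    """Return {module: (expected, found)}; found is None if the import fails."""
    report = {}
    for module_name, expected in pinned.items():
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            report[module_name] = (expected, None)
            continue
        report[module_name] = (expected, getattr(module, "__version__", None))
    return report


if __name__ == "__main__":
    for name, (expected, found) in sorted(check_versions(PINNED).items()):
        status = "ok" if found == expected else "MISMATCH"
        print("{:<12} expected {:<8} found {!s:<8} {}".format(
            name, expected, found, status))
```

Any line marked MISMATCH points to a package that is missing or was resolved to a different version than the one pinned above.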

Simulated data analyses

  • Pairwise analyses of simulated data

    Compares the Binomial, Fisher's exact and DISCOVER tests on simulated data.

  • Group test

    Compares the DISCOVER group test to six alternative methods (CoMEt, MEGSA, MEMo, muex, mutex, and TiMEx) on simulated data.
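
For readers unfamiliar with the two baseline tests in the pairwise notebook, the following standard-library sketch shows how a one-sided Binomial test and a one-sided Fisher's exact test for mutual exclusivity can be computed for a single gene pair. It is an illustration under stated assumptions, not the code used in the notebooks: the function names and the example counts are hypothetical, the Binomial variant shown (co-occurrence probability equal to the product of the two alteration frequencies) is one common formulation, math.comb requires Python 3.8 or later, and the DISCOVER test itself is not reproduced here.

```python
from math import comb


def fisher_exact_less(both, only_a, only_b, neither):
    """One-sided Fisher's exact test for mutual exclusivity.

    Returns P(overlap <= both) under the hypergeometric null, i.e. with
    the number of tumours altered in each gene held fixed.
    """
    n = both + only_a + only_b + neither
    n_a = both + only_a          # tumours with gene A altered
    n_b = both + only_b          # tumours with gene B altered
    lowest = max(0, n_a + n_b - n)   # smallest possible overlap
    favourable = sum(
        comb(n_a, k) * comb(n - n_a, n_b - k)
        for k in range(lowest, both + 1)
    )
    return favourable / comb(n, n_b)


def binomial_less(both, n_a, n_b, n):
    """One-sided Binomial test for mutual exclusivity.

    Assumes every tumour has the same co-occurrence probability
    p = (n_a / n) * (n_b / n) and returns P(X <= both), X ~ Binomial(n, p).
    """
    p = (n_a / n) * (n_b / n)
    return sum(
        comb(n, k) * p ** k * (1 - p) ** (n - k)
        for k in range(both + 1)
    )


if __name__ == "__main__":
    # Hypothetical counts: 100 tumours, each gene altered in 30, overlap of 2.
    print("Fisher:  ", fisher_exact_less(both=2, only_a=28, only_b=28, neither=42))
    print("Binomial:", binomial_less(both=2, n_a=30, n_b=30, n=100))
```

Both functions treat all tumours as interchangeable, so neither accounts for tumours differing in their overall alteration rates.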

Pan-cancer analyses

  • Download PanCan12 data

    Downloads the mutation and copy number data for the TCGA PANCAN12 studies.

  • Gene selection

    Selects the genes for use in the pairwise analyses.

  • Pairwise analysis

    Performs pairwise co-occurrence and mutual exclusivity analyses.

  • Within-chromosome co-occurrence analysis

    Tests for co-occurrences between genes located on the same chromosome, to assess whether the DISCOVER test detects these 'positive controls'.

  • STRING enrichment

    Determines the overlap of mutually exclusive gene pairs with the STRING functional interaction network.

  • MSigDb group tests

    Identifies significantly mutually exclusive gene sets based on predefined gene sets extracted from MSigDb.

  • De novo gene set identification

    Detects de novo mutually exclusive gene sets based on correlation clustering of pairwise mutual exclusivities.
