This repository contains the analysis scripts used for the paper: "Influence of node abundance on signaling network state and dynamics analyzed by mass cytometry" Xiao-Kang Lun, Vito RT Zanotelli, James D Wade, Denis Schapiro, Marco Tognetti, Nadine Dobberstein & Bernd Bodenmiller
The package can be installed with the Python package manager pip. Using a virtual environment or an environment manager such as Anaconda is highly recommended. The code is compatible with Python 2.7. Installation was tested mainly on Ubuntu 14.04, but was also found to work on OS X and Windows 7.
Clone the GitHub repository:
git clone https://github.com/BodenmillerGroup/Adnet.git
Install it with pip:
pip install -e ./Adnet
The dependencies ('numpy', 'scipy', 'pandas', 'matplotlib', 'seaborn', 'configparser2', 'nose', 'argparse') should be installed automatically.
The analysis consists of a main workflow, which is configured through a '.ini' configuration file and expects the data to be organized in a specific folder structure. IPython notebooks are used for the downstream analysis and further visualization of the results.
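If the automatic installation is in doubt, a quick import check can confirm which dependencies are available. The following is a minimal sketch, not part of the package: it assumes that each import name matches the package name listed above, and it leaves out 'configparser2' because its import name is not stated here.

```python
import importlib


def missing_modules(names):
    """Return the names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


# Assumption: import names equal the package names from the list above.
DEPENDENCIES = ["numpy", "scipy", "pandas", "matplotlib", "seaborn",
                "nose", "argparse"]

if __name__ == "__main__":
    print("Missing modules:", missing_modules(DEPENDENCIES))
```

An empty list means every listed module could be imported in the active environment.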
File structure requirements:
The overall folder structure should be as follows:
- Experiment: a folder corresponding to one CyTOF experiment, which can contain multiple barcoding plates.
    - AcquisitionA: a folder corresponding to a CyTOF acquisition barcoding plate. Contains:
        - AcquisitionA.csv: a CSV file with the metadata; see the example for the exact structure.
        - gated: a folder containing the data
            - xy_rowcolumn_xy.fcs: FCS files with '_'-separated fields in the name, including the rowcolumn information.
    - AcquisitionB: identical structure
- name_dict.csv: a comma-separated file with 2 columns: old, new
    - old: column name as used in the .fcs file
    - new: renamed column name as it should be used in the analysis plots. Non-ASCII characters can cause problems.
- config.ini: a configuration file with all the parameters used for the analysis. Please look at the specifications in the example file 'example/config_documentation.ini'.
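The layout described above can be checked before starting an analysis. The sketch below is a hypothetical helper, not part of the package, written for Python 3. It assumes that the metadata file is named after its acquisition folder, that the FCS files sit directly in 'gated', and that name_dict.csv starts with a header row 'old,new'. The example file shows the authoritative structure.

```python
import csv
import os


def parse_fcs_name(filename):
    """Split an FCS file name into its '_'-separated fields."""
    stem, ext = os.path.splitext(os.path.basename(filename))
    if ext.lower() != ".fcs":
        raise ValueError("not an .fcs file: %s" % filename)
    return stem.split("_")


def check_acquisition(folder):
    """Return a list of problems found in one acquisition folder."""
    problems = []
    name = os.path.basename(os.path.normpath(folder))
    # Assumption: the metadata CSV carries the folder's name.
    if not os.path.isfile(os.path.join(folder, name + ".csv")):
        problems.append("missing metadata file %s.csv" % name)
    gated = os.path.join(folder, "gated")
    if not os.path.isdir(gated):
        problems.append("missing 'gated' folder")
    elif not any(f.lower().endswith(".fcs") for f in os.listdir(gated)):
        problems.append("no .fcs files in 'gated'")
    return problems


def read_name_dict(lines):
    """Map old to new column names from an iterable of CSV lines.

    Assumption: the first line is the header 'old,new'.
    """
    return {row["old"]: row["new"] for row in csv.DictReader(lines)}
```

An empty list from `check_acquisition` means the folder matches the assumed layout. `read_name_dict` accepts an open file handle, for example `open("name_dict.csv", newline="")`.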
After installation of the package (above) the analysis can be run as follows:
python -m adnet.adnet_analysis /pathto/config.ini
Depending on the settings of the config.ini file, the following output will be generated:
- outfolder: folder defined in the config.ini
    - config.ini: a copy of the config.ini used for the analysis
    - bin_dat: a pandas pickle file containing the summary statistics generated by the analysis. Can be loaded with pandas.read_pickle("bin_dat")
    - complete_dat: a pandas pickle file containing the single-cell data used by the analysis. Can be loaded with pandas.read_pickle("complete_dat")
    - Cutoff.pdf: a histogram showing the chosen cutoff in relation to the negative controls
    - Plots as png, pdf: which plots are generated depends strongly on the configuration specified in the config.ini file.
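After a run, it can be useful to see which of the outputs listed above were actually written. The following sketch is a hypothetical helper, not part of the package. It assumes that the files sit directly in the output folder under the names given above.

```python
import os

# File names taken from the output list above.
EXPECTED_OUTPUTS = ("config.ini", "bin_dat", "complete_dat")


def summarize_outfolder(outfolder):
    """Report which documented outputs exist and list the png/pdf files."""
    present = {name: os.path.isfile(os.path.join(outfolder, name))
               for name in EXPECTED_OUTPUTS}
    files = os.listdir(outfolder) if os.path.isdir(outfolder) else []
    plots = sorted(f for f in files if f.lower().endswith((".png", ".pdf")))
    return present, plots
```

If 'bin_dat' and 'complete_dat' are reported as present, they can be loaded with `pandas.read_pickle`, as described in the list.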
In the root directory is an 'example' folder which contains configuration files for the 3 analyses from the paper.
Please change the paths in the configuration file to match the current repository location, or make sure you are in the example folder.
Afterwards the analysis can be run as follows (assuming you are in the example folder, otherwise adapt the path):
python -m adnet.adnet_analysis ./config_xxx.ini
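To run several example configurations in sequence, the command above can be wrapped in a short script. This is a minimal sketch, not part of the package. The configuration file names passed in are placeholders, and the script assumes the package is installed in the interpreter that runs it.

```python
import subprocess
import sys


def build_commands(config_paths, python=sys.executable):
    """Return one 'python -m adnet.adnet_analysis <config>' command per path."""
    return [[python, "-m", "adnet.adnet_analysis", path]
            for path in config_paths]


def run_configs(config_paths):
    """Run the analysis for each configuration file, stopping on the first error."""
    for cmd in build_commands(config_paths):
        subprocess.check_call(cmd)
```

Calling `run_configs` from inside the example folder with the relative paths of the chosen configuration files mirrors the manual command shown above.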
Runs the bpR2 analysis for the 20 overexpressions. For data storage reasons, only 2 replicates of 1 overexpression group are included in this repository. However, the other folders are already prepared, and the FCS files from the data repository simply need to be copied in. To activate the other folders, just uncomment the folder section.
Main analysis allvsall
Calculates bpR2 and correlation for all pairwise marker combinations. The generated data can be used for correlation heatmaps (see Notebooks). Not all data included.
Runs the bpR2 analysis for the mutation data. All data included.
Mutations analysis allvsall
Calculates bpR2 and correlation for all pairwise marker combinations. The generated data can be used for correlation heatmaps (see Notebooks). All data included.
Runs the bpR2 analysis for the comparison of differentially tagged overexpressions, i.e. Flag-N, Flag-C, GFP-N, GFP-C. All data included.
The figures from the paper can be reproduced with the following code:
Fig 1: no actual data shown
- a) via cytobank
- b) from Example 'Mutation analysis', plots 'Trends_EGF_overexpression_marker'
- c) via cytobank
- d) via cytobank
- e) & f) from notebooks/correlation_heatmaps.ipynb
- g) & h) no code provided
Fig 3: no code provided
- a)-h) from example 'Main analysis', plots 'EGF_overexpression_...', 'Trends_EGF_overexpression_marker_BinsoverTP'
- i) from notebooks/kinetic_analysis.ipynb
Table 2: from notebooks/SIGNOR_analysis.ipynb
Sup. Figure 3:
- e) from notebooks/tag_comparison.ipynb
Sup. Figure 9:
- b)-e) from example 'Mutation analysis' plots
Sup. Figure 10:
- c) from example 'Main analysis', plot 'cutoff'
Sup. Figure 11:
- a) from notebooks/readout_comparison.ipynb
Sup. Figure 13:
- a) & b) from notebooks/supplementary_fig13_heatmaps.ipynb
Sup. Figure 14: from notebooks/kinetic_analysis.ipynb
- Supplementary File 1, 2, 4: from example 'Main analysis'
- Supplementary File 3: from notebooks/correlation_heatmaps.ipynb
This repository uses code from the following projects:
Matplotlib: John D. Hunter. Matplotlib: A 2D Graphics Environment, Computing in Science & Engineering, 9, 90-95 (2007), DOI:10.1109/MCSE.2007.55
SciPy: Jones E, Oliphant T, Peterson P, et al. SciPy: Open Source Scientific Tools for Python, 2001-, http://www.scipy.org/ [Online; accessed 2017-01-04].
Numpy: Stéfan van der Walt, S. Chris Colbert and Gaël Varoquaux. The NumPy Array: A Structure for Efficient Numerical Computation, Computing in Science & Engineering, 13, 22-30 (2011), DOI:10.1109/MCSE.2011.37
Pandas: Wes McKinney. Data Structures for Statistical Computing in Python, Proceedings of the 9th Python in Science Conference, 51-56 (2010)
IPython: Fernando Pérez and Brian E. Granger. IPython: A System for Interactive Scientific Computing, Computing in Science & Engineering, 9, 21-29 (2007), DOI:10.1109/MCSE.2007.53