Single-cell based prognosis modeling identifies new breast cancer survival subtypes by cell-cell interactions
This is the GitHub repository for the project "Single-cell based prognosis modeling identifies new breast cancer survival subtypes by cell-cell interactions" by Shashank Yadav, Shu Zhou, Bing He, Lana Garmire et al. It contains the code and data for generating Figures 1-5 in the paper.
Installing the R kernel for Jupyter
install.packages('IRkernel')
IRkernel::installspec() # register the kernel in the current R installation

Use Bioconductor to install R packages.
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("*PackageName*") # replace *PackageName* with the package to install

Use the package manager pip to install Python packages.
pip install numpy

The directories are as follows:
- 1_HMResult contains the heatmaps in Figure 1 of the manuscript.
- 2_PrognosisPred contains the prognosis prediction modeling, which uses single-cell phenotype features to construct cox-nnet models.
- 5_SankeyResult contains Sankey plots describing how the NMF-defined classes intersect with the clinicopathological classification.
- 5_AtypicalMatching contains class label transfer and comparison with the TCGA-BRCA and METABRIC datasets.
- Figures contains Figures 1-7 shown in the paper.
- fig4_voilin/phen27_1 contains the subfigures of Fig. 4.
- Pydata contains the Python data used for 5_sankey.ipynb and 4_Violin.ipynb.
- rds contains the .rds and .rdata data used for 1_HMOverview.ipynb, 3_NMFPlot.ipynb and 3_4_HMCircos.ipynb.
The other files are as follows.
- 1_HMOverview.ipynb contains the plotting process for the general data heatmaps.
- 2_ProgPlot.ipynb contains the values from combining different sets of information from CP, TMI, and TCI.
- 3_NMFPlot.ipynb contains the result of NMF clustering and the consensus map.
- 3_NMFScore.ipynb contains the visualization procedure for the silhouette and cophenetic scores.
- 3_kaplanmeier_OS.ipynb contains the Kaplan-Meier plots for the NMF clustering and for Grade, ER, PR, and HER2 types.
- 3_4_HMCircos.ipynb contains the heatmaps of Grade, ER, PR, HER2 and cell-cell interaction features for the NMF clusters, and the Circos plots that show the correlation between features associated with each subpopulation. The export dimensions were enlarged to make Figure3e.4; the annotations were added later in Adobe Photoshop for better explanation.
- 4_Violin.ipynb contains scoring and profiling for the seven clusters based on various cell phenotypes and cell-cell interaction features.
- 5_sankey.ipynb contains the procedures for constructing the Sankey plots.
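Before running the notebooks, it can help to confirm that a local clone has the layout described above. The following is a minimal sketch, not part of the repository: the helper name `missing_entries` is hypothetical, and it assumes the listed directories and notebooks sit directly under the repository root. `fig4_voilin/phen27_1` is left out because its parent location is not stated here.

```python
from pathlib import Path

# Names copied from the directory and file lists in this README.
# Assumption: all of them live directly under the repository root.
EXPECTED_DIRS = [
    "1_HMResult",
    "2_PrognosisPred",
    "5_SankeyResult",
    "5_AtypicalMatching",
    "Figures",
    "Pydata",
    "rds",
]
EXPECTED_NOTEBOOKS = [
    "1_HMOverview.ipynb",
    "2_ProgPlot.ipynb",
    "3_NMFPlot.ipynb",
    "3_NMFScore.ipynb",
    "3_kaplanmeier_OS.ipynb",
    "3_4_HMCircos.ipynb",
    "4_Violin.ipynb",
    "5_sankey.ipynb",
]


def missing_entries(repo_root="."):
    """Return the expected directories and notebooks not found under repo_root."""
    root = Path(repo_root)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [n for n in EXPECTED_NOTEBOOKS if not (root / n).is_file()]
    return missing


if __name__ == "__main__":
    # Print one line per missing entry; print nothing if the layout is complete.
    for name in missing_entries():
        print("missing:", name)
```

Running the script from the repository root prints one line per missing entry and nothing if everything is in place.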
- For .ipynb files: use JupyterLab to execute the code.
- For .py files: python3 *filename*.py

- Shu Zhou - https://github.com/Sukumaru
This project is licensed under the GNU General Public License v3.0. See the LICENSE.md file for details.