Deep learning and transfer learning identify breast cancer survival subtypes from single-cell imaging data
This is the GitHub repository for the project "Deep learning and transfer learning identify breast cancer survival subtypes from single-cell imaging data" by Shashank Yadav, Shu Zhou, Bing He, Lana Garmire et al. It contains the code and data for generating Figures 1-5 in the paper.
Installing the R kernel for Jupyter
install.packages('IRkernel')
IRkernel::installspec() # to register the kernel in the current R installation
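To confirm that the kernel was registered, one option is to inspect the output of `jupyter kernelspec list --json`. The sketch below assumes the kernel was installed under IRkernel's default name `ir`; the helper names are illustrative and not part of this repository.

```python
import json
import subprocess


def registered_kernels(kernelspec_json):
    """Return the sorted kernel names found in `jupyter kernelspec list --json` output."""
    data = json.loads(kernelspec_json)
    return sorted(data.get("kernelspecs", {}))


def has_r_kernel(kernelspec_json, name="ir"):
    """Check for the R kernel; 'ir' is assumed to be the registered name."""
    return name in registered_kernels(kernelspec_json)


if __name__ == "__main__":
    try:
        result = subprocess.run(
            ["jupyter", "kernelspec", "list", "--json"],
            capture_output=True, text=True, check=True,
        )
        print("R kernel registered:", has_r_kernel(result.stdout))
    except (FileNotFoundError, subprocess.CalledProcessError) as err:
        # Jupyter is missing from PATH or the command failed.
        print("Could not query Jupyter:", err)
```

If the check reports that the kernel is missing, rerunning `IRkernel::installspec()` from the R installation you intend to use is the usual first step.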
Use Bioconductor to install R packages.
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("*PackageName*")
Use the pip package manager to install Python packages.
pip install numpy
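Before opening the notebooks, it can help to check which Python modules are still missing. The following is a minimal sketch: the module list is hypothetical (this README names only NumPy) and should be extended with whatever the notebooks actually import.

```python
from importlib.util import find_spec

# Hypothetical list: only NumPy is named in this README.
# Use top-level import names, which can differ from the pip package names.
REQUIRED_MODULES = ["numpy"]


def missing_modules(modules):
    """Return the modules that cannot be found in the current environment."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules, try: pip install " + " ".join(missing))
    else:
        print("All listed modules are importable.")
```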
The directories are as follows:
- 1_HMResult: contains the heatmaps shown in Figure 1 of the manuscript.
- 2_PrognosisPred: contains the prognosis prediction modeling, which constructs cox-nnet models from single-cell phenotype features.
- 5_SankeyResult: contains Sankey plots describing how the NMF-defined classes intersect with the clinicopathological classification.
- 5_AtypicalMatching: contains class label transfer and comparison with the TCGA-BRCA and METABRIC datasets.
- Figures: contains Figures 1-5 shown in the paper.
- SCP_subgroups: contains the data used for plotting Supplementary Figure 1.
- fig4_voilin/phen27_1: contains subfigures of Figure 4.
- Pydata: contains Python data used for 5_sankey.ipynb and 4_Violin.ipynb.
- rds: contains .rds and .rdata data used for 1_HMOverview.ipynb, 3_NMFPlot.ipynb and 3_4_HMCircos.ipynb.
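After cloning, a quick way to confirm that the checkout is complete is to test for the directories listed above. This is a minimal sketch; the directory names are copied from this README, while the function name and the assumption that they sit directly under the repository root are illustrative.

```python
from pathlib import Path

# Directory names as listed in this README, assumed relative to the repository root.
EXPECTED_DIRS = [
    "1_HMResult", "2_PrognosisPred", "5_SankeyResult", "5_AtypicalMatching",
    "Figures", "SCP_subgroups", "fig4_voilin/phen27_1", "Pydata", "rds",
]


def missing_dirs(repo_root, expected=EXPECTED_DIRS):
    """Return the expected directories that do not exist under repo_root."""
    root = Path(repo_root)
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs(".")
    print("Missing directories:", missing if missing else "none")
```

Run it from the repository root; an empty result means every listed directory was found.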
The other files are as follows.
- 1_HMOverview.ipynb: contains the plotting process for the general data heatmaps.
- 2_ProgPlot.ipynb: contains the values obtained when combining different sets of information from CP, TMI, and TCI.
- 3_NMFPlot.ipynb: contains the results of NMF clustering and the consensus map.
- 3_NMFScore.ipynb: contains the visualization procedure for the silhouette and cophenetic scores.
- 3_kaplanmeier_OS.ipynb: contains the Kaplan-Meier plots for the NMF clusters and for the Grade, ER, PR, and HER2 types.
- 3_4_HMCircos.ipynb: contains the heatmaps of Grade, ER, PR, HER2 and cell-cell interaction features for the NMF clusters, and Circos plots demonstrating the correlation between features associated with each subpopulation. The export dimensions were enlarged to make Figure3e. 4s annotations were added later in Adobe Photoshop for better explanation.
- 4_Violin.ipynb: contains scoring and profiling for the seven clusters based on various cell phenotypes and cell-cell interaction features.
- 5_sankey.ipynb: contains the procedure for constructing Sankey plots.
- Sup_Sankey_Scp.ipynb: contains the procedure for constructing Sankey plots that compare the SCP subgroups with our clusters.
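The notebook names carry numeric prefixes, and one file (3_4_HMCircos.ipynb) carries two. The sketch below groups the notebooks by those prefixes. Reading a prefix as the figure number it contributes to is an assumption, not something this README states, and the helper names are hypothetical.

```python
import re
from collections import defaultdict

# Notebook names as listed in this README.
NOTEBOOKS = [
    "1_HMOverview.ipynb", "2_ProgPlot.ipynb", "3_NMFPlot.ipynb",
    "3_NMFScore.ipynb", "3_kaplanmeier_OS.ipynb", "3_4_HMCircos.ipynb",
    "4_Violin.ipynb", "5_sankey.ipynb", "Sup_Sankey_Scp.ipynb",
]


def leading_numbers(filename):
    """Return the leading underscore-separated integers of a file name."""
    match = re.match(r"(?:\d+_)+", filename)
    if not match:
        return []
    return [int(n) for n in match.group(0).rstrip("_").split("_")]


def group_by_prefix(filenames):
    """Map each numeric prefix (or 'other') to the notebooks that carry it."""
    groups = defaultdict(list)
    for name in filenames:
        for key in leading_numbers(name) or ["other"]:
            groups[key].append(name)
    return dict(groups)


if __name__ == "__main__":
    for key, names in group_by_prefix(NOTEBOOKS).items():
        print(key, names)
```

A notebook with two prefixes appears in both groups, and files without a numeric prefix fall under "other".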
- For .ipynb files: use JupyterLab to execute the code.
- For .py files: run
python3 *filename*.py
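For unattended runs, the two cases can be wrapped in a small launcher. This is a sketch only: it swaps in `jupyter nbconvert --execute` for notebooks instead of an interactive Jupyter session, which assumes nbconvert is installed (this README does not mention it), and `build_command` is a hypothetical helper.

```python
import subprocess
import sys
from pathlib import Path


def build_command(path):
    """Return an argument list that runs a .py script or executes a .ipynb notebook."""
    suffix = Path(path).suffix
    if suffix == ".py":
        # Run the script with the current Python interpreter.
        return [sys.executable, str(path)]
    if suffix == ".ipynb":
        # Assumption: nbconvert is available; --inplace overwrites the notebook with its outputs.
        return ["jupyter", "nbconvert", "--to", "notebook",
                "--execute", "--inplace", str(path)]
    raise ValueError(f"Unsupported file type: {path}")


if __name__ == "__main__" and len(sys.argv) > 1:
    # Execute the file given as the first command-line argument.
    subprocess.run(build_command(sys.argv[1]), check=True)
```

Notebooks that use the R kernel still need that kernel registered for this to work.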
- Shu Zhou - https://github.com/Sukumaru
This project is licensed under the GNU General Public License v3.0; see the LICENSE.md file for details.