Scripts related to results presented in Dufva and Pölönen et al. Immunogenomic landscape of hematological malignancies.
If you use the data, analysis, or results, please cite:
- Dufva and Pölönen et al. https://doi.org/10.1016/j.ccell.2020.06.002
- Data DOI: 10.7303/syn21991014
If you use Hemap data, also cite:
- Hemap: An interactive online resource for characterizing molecular phenotypes across hematologic malignancies. Petri Pölönen, Juha Mehtonen, Jake Lin, Thomas Liuksiala, Sergei Häyrynen, Susanna Teppo, Artturi Mäkinen, Ashwini Kumar, Disha Malani, Virva Pohjolainen, Kimmo Porkka, Caroline A. Heckman, Patrick May, Ville Hautamäki, Kirsi J. Granberg, Olli Lohi, Matti Nykter and Merja Heinäniemi. Cancer Res, April 2 2019. DOI: 10.1158/0008-5472.CAN-18-2970
If you use other publicly available data sets that were analyzed here (TCGA AML/DLBCL, Chapuy et al., Tyner et al., etc.), please also cite the original research.
If you are only interested in the data sets analyzed here, check datasets_synapseID.txt for the Synapse accession codes.
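Individual data sets can be fetched in a loop once the accession codes are known. The following is a minimal sketch, not part of the repository: it assumes that synapseclient is installed and logged in, and that each relevant line of the ID file starts with a Synapse ID (for example syn12345678) followed by optional text. The actual file layout may differ, so check it first. The function name fetch_ids and the DRY_RUN switch are illustrative.

```shell
#!/bin/sh
# Sketch only. Assumes (not confirmed by the repository) that each data line
# of the ID file starts with a Synapse ID such as syn12345678.
DRY_RUN="${DRY_RUN:-1}"   # illustrative switch: 1 = print commands only

fetch_ids() {
    # $1: path to the file that lists Synapse IDs
    while read -r id _rest || [ -n "$id" ]; do
        case "$id" in
            syn[0-9]*)
                echo "+ synapse get $id"
                if [ "$DRY_RUN" = "0" ]; then
                    # Redirect stdin so the command cannot consume the ID file.
                    synapse get "$id" < /dev/null
                fi
                ;;
            *) ;;   # skip headers, blank lines, and anything that is not an ID
        esac
    done < "$1"
}

# Example: fetch_ids datasets_synapseID.txt
```

With the default DRY_RUN=1 the function only prints the commands it would run; set DRY_RUN=0 to download.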
- Get Synapse credentials: https://www.synapse.org
- Access Synapse project syn21991014
- Download project data:
  - Input files individually (see scripts for filenames and download from https://www.synapse.org/#!Synapse:syn21991014/files/) (Recommended)
  - Programmatic access (synapse, check synapseID_Filename.txt for accession codes):
      pip install synapseclient
      synapse get synapseID
  - Synapse bulk (70 GB):
      pip install synapseclient
      synapse get syn21991014 -r
- Clone the git project:
    git clone https://github.com/systemsgenomics/ImmunogenomicLandscape-BloodCancers.git
- Install the required R packages
- Run the analysis:
  - Set the working directory to the folder with the data: setwd("path/data")
  - Modify the GIT_HOME variable in the R script (the folder where the git repository is cloned). GIT_HOME typically points to common_scripts, which contains various statistical and visualisation tools that were used in the analysis.
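The bulk download and clone steps above can be chained in one script. This is a minimal sketch using only the commands listed in the steps. The run helper, the setup_project function and the DRY_RUN switch are illustrative names, not part of the repository. It assumes Synapse credentials are already configured, and the bulk download is about 70 GB.

```shell
#!/bin/sh
# Sketch only: chains the commands from the setup steps.
# DRY_RUN, run and setup_project are illustrative names.
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = execute them

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

setup_project() {
    run pip install synapseclient
    # Bulk download of the whole project (about 70 GB).
    run synapse get syn21991014 -r
    run git clone https://github.com/systemsgenomics/ImmunogenomicLandscape-BloodCancers.git
}

setup_project
```

Run as is, the script only prints the three commands. Running it with DRY_RUN=0 executes them in the current directory, after which the R packages still need to be installed and GIT_HOME set in the R scripts.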
The processed and intermediate files for these scripts can be downloaded to reproduce the analysis. Inputs for these scripts have been generated using the scripts under preprocessing.
Fig1C_AML_cytscore_flow_RNAseq_comparison.R (Fig1C)
Fig1_plots.R (Figure 1F-H)
Statistical_analysis_Cytolytic_Score_development.R (FigureS1B, G-I)
Fig2_microenvironment_analysis.R (TableS2, Fig2A-B, FigS2C)
FigS2AB_microenvironment_analysis_GSEA_GSVA.R (FigS2A-B)
FigS2DE_microenvironment_validation_CLL_AML.R (FigS2D-E)
Fig3A_DLBCL_cytscore_oncoprint.R (Fig3A)
Fig3BC_DLBCL_cytscore_boxplots.R (Fig3B-C)
Fig3D_TCGA_AML_cytscore_oncoprint.R (Fig3D)
Fig3_FigS3_MDSsignature.R (Fig3E, FigS3D)
Fig3_plots_scRNA.R (Fig3F-H and K, FigS3K and O)
Statistical_analysis_DE_analysis_MDSlike.R (TableS3 tab)
Statistical_analysis_scRNA_MDSlike_analysis.R (TableS3 tabs)
Statistical_analysis_Szabo_TCell_analysis.R (Fig3I, FigS3M)
Statistical_analysis_Yang_NKCell_analysis.R (Fig3J, FigS3N)
FigS3L.R (TableS3 tab, FigS3L)
Statistical_analysis_HLAII_Score_development.R (FigS4A-B)
Fig4D_TCGA_AML_complexheatmap_CIITA.R (Fig4D)
Statistical_analysis_FIMM_AML_RRBS.R (Fig4G and H, FigS4L)
FigS4C_AML_HLAIIscore_flow_RNAseq_comparison.R (FigS4C)
FigS4D_TCGA_AML_global_hypermethylation.R (FigS4D)
FigS4EF_CIITA_methylation_validation_ERRBS.R (FigS4E-F)
FigS4K_CCLE_CIITA_methylation.R (FigS4K)
FigS4J_CIITA_methylation_validation_GSE49031.R (FigS4J)
Fig5A_ligands_heatmap.R (Fig5A)
Fig5_DE_analysis_costim.R (Fig5B, FigS5B, Table S5 tabs)
Fig5C_TCGA_AML_ligand_correlation_volcanoplot.R (Fig5C)
Fig5D_S5G_DLBCL_GSE98588_ligand_boxplots.R (Fig5D, FigS5G)
Fig5E_PDL1_IHC_boxplot.R (Fig5E)
Fig5F_VISTA_IHC_boxplot.R (Fig5F)
Fig5G_CD70_CRISPR_T cell_stimulation.R (Fig5G)
FigS5C.R (FigS5C)
FigS5DE_TCGA_ligand_methylation_AML_DLBCL_comparison.R (FigS5D-E)
Fig6B_S6F_CGA_tSNEplot_hemap.R (Fig6B, FigS6F)
Fig6C_S6B_CGA_Hemap.R (Fig6C, FigS6B)
Fig6D_CCLE_CGA_heatmap.R (Fig6D)
Fig6F_CoMMpass_CGA_boxplot.R (Fig6F)
Fig6G_CoMMpass_CGA_oncoprint.R (Fig6G)
Fig6H_DLBCL_GSE98588_CGA_oncoprint.R (Fig6H)
Fig6H_FigS6H_CGA_heatmap_GSE98588.R (Fig6H, FigS6H, FigS6I)
FigS6G_CGA_GSEA_hemapMM.R (FigS6G)
FigS6A_GTEx_CGA_heatmap.R (FigS6A)
FigS6C_Hemap_CGA_dotplots.R (FigS6C)
FigS6D_TCGA_antigen_methylation.R (FigS6D)
FigS6E_CoMMpass_CGA_heatmap.R (FigS6E)
Statistical_analysis_CGA_discovery_Hemap.R
Fig7_Univariate_Coxph_survival.R (TableS7)
Fig7_univariate_survival_forestplot.R (Fig7A, Fig7B-D, FigS7A-B)
Fig7_multivariable_regression_eNet_survival.R (Fig7E-F, FigS7C-G)
Statistical_analysis_correlations_BeatAML.R (TableS3-5)
Statistical_analysis_correlations_CoMMpass.R (TableS4-6)
Statistical_analysis_correlations_GSE98588_DLBCL.R (TableS3-6)
Statistical_analysis_correlations_Reddy.R (TableS3-5)
Statistical_analysis_correlations_TCGA_AML.R (TableS3-5)
Statistical_analysis_correlations_TCGA_DLBCL.R
The preprocessing scripts below are for reference only. To run them, the raw, processed, and input data would first have to be downloaded and processed. Check the publication for data accession codes.
Preprocessing_CIBERSORT_MCPcounter.R
Preprocessing_normalize_hguarray_GSE98588.R
Preprocessing_Hemap_featurematrix_generation.R (TableS1)
Preprocessing_REDDY_DLBCL_featurematrix_generation.R
Preprocessing_TCGA_AML_featurematrix_generation.R
Preprocessing_TCGA_DLBCL_featurematrix_generation.R
Preprocessing_coMMpass_featurematrix_generation.R
Preprocessing_GSE98588_DLBCL_featurematrix_generation.R
Preprocessing_PanALL.R
Preprocessing_MM_subtyping.R
Preprocessing_DLBCL_subtyping.R
Preprocessing_ALL_subtyping.R
Preprocessing_AML_subtyping.R
Preprocessing_TCGA_AML_add_meth_probes.R
Preprocessing_TCGA_meth_data_genelist.R
Preprocessing_FIMM_AML_RRBS.R
Preprocessing_AML_RRBS_meth_de_analysis.R
Preprocessing_scRNA_CLL_GSE111014.R
Preprocessing_scRNA_FIMM_AML.R
Preprocessing_scRNA_Galen_AML.R
Preprocessing_scRNA_HCA.R
Preprocessing_scRNA_PB_Citeseq.R
Preprocessing_scRNA_Szabo_Tcells_dataprocessing.R
Preprocessing_scRNA_Yang_NK_dataprocessing.R
Preprocessing_scRNA_integrate_Tcells_NKcells.R
R version 3.6.0 (2019-04-26)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
Matrix products: default
BLAS/LAPACK: /usr/lib64/R/lib/libRblas.so
locale:
[1] LC_CTYPE=C LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] grid parallel stats4 stats graphics grDevices utils
[8] datasets methods base
other attached packages:
[1] tidyr_1.0.0
[2] dplyr_0.8.3
[3] cowplot_1.0.0
[4] tibble_2.1.3
[5] stringr_1.4.0
[6] survminer_0.4.6
[7] survcomp_1.34.0
[8] prodlim_2018.04.18
[9] Seurat_3.1.1.9021
[10] RnBeads_2.2.0
[11] plyr_1.8.4
[12] methylumi_2.30.0
[13] minfi_1.30.0
[14] bumphunter_1.26.0
[15] locfit_1.5-9.1
[16] iterators_1.0.12
[17] foreach_1.4.7
[18] Biostrings_2.52.0
[19] XVector_0.24.0
[20] SummarizedExperiment_1.14.1
[21] DelayedArray_0.10.0
[22] BiocParallel_1.18.1
[23] FDb.InfiniumMethylation.hg19_2.2.0
[24] org.Hs.eg.db_3.8.2
[25] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
[26] GenomicFeatures_1.36.4
[27] reshape2_1.4.3
[28] scales_1.0.0
[29] illuminaio_0.26.0
[30] matrixStats_0.54.0
[31] gridExtra_2.3
[32] gplots_3.0.1.1
[33] fields_10.0
[34] maps_3.3.0
[35] spam_2.4-0
[36] dotCall64_1.0-0
[37] ff_2.2-14
[38] bit_1.1-14
[39] cluster_2.1.0
[40] MASS_7.3-51.4
[41] GenomicRanges_1.36.0
[42] GenomeInfoDb_1.20.0
[43] RColorBrewer_1.1-2
[44] openxlsx_4.1.4
[45] multipanelfigure_2.0.2
[46] mclust_5.4.5
[47] Matrix_1.2-17
[48] Hmisc_4.2-0
[49] Formula_1.2-3
[50] survival_2.44-1.1
[51] lattice_0.20-38
[52] GSVA_1.32.0
[53] ggridges_0.5.1
[54] ggrastr_0.1.7
[55] ggpubr_0.2.3
[56] future_1.14.0
[57] forestplot_1.9
[58] checkmate_1.9.4
[59] magrittr_1.5
[60] EnhancedVolcano_1.3.5
[61] ggrepel_0.8.1
[62] ggplot2_3.2.1
[63] edgeR_3.26.8
[64] limma_3.40.6
[65] data.table_1.12.4
[66] ComplexHeatmap_2.0.0
[67] circlize_0.4.7
[68] caTools_1.17.1.2
[69] AnnotationDbi_1.46.1
[70] IRanges_2.18.2
[71] S4Vectors_0.22.0
[72] Biobase_2.44.0
[73] BiocGenerics_0.30.0
[74] methylSig_0.1
[75] CePa_0.7.0
[76] viridis_0.5.1