-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create ComplexHeatmaps for Pilot Data plate 1 #27
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Large diffs are not rendered by default.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,165 @@ | ||
suppressPackageStartupMessages(library(ComplexHeatmap)) | ||
suppressPackageStartupMessages(library(dplyr)) | ||
|
||
# Set paths and constants | ||
input_data_dir <- file.path("..", "..", "..", "4_processing_features", "data") | ||
output_figure_dir <- "figures" | ||
|
||
cp_heatmap_file_noext <- file.path(output_figure_dir, "cp_complex_heatmap") | ||
dp_heatmap_file_noext <- file.path(output_figure_dir, "dp_complex_heatmap") | ||
|
||
# Set heatmap colors | ||
well_cols = c( | ||
"C6" = "#E1DAAE", | ||
"C7" = "#FF934F", | ||
"D6" = "#CC2D35", | ||
"D7" = "#058ED9", | ||
"E6" = "#848FA2", | ||
"E7" = "#2D3142", | ||
"F6" = "#FFC857", | ||
"F7" = "#5f7a12" | ||
) | ||
genotype_cols = c( | ||
"Null" = "#785EF0", | ||
"WT" = "#DC267F" | ||
) | ||
|
||
# Load data | ||
cp_file <- file.path(input_data_dir, "nf1_sc_norm_fs_cellprofiler.csv.gz") | ||
|
||
cp_df <- readr::read_csv( | ||
cp_file, | ||
col_types = readr::cols( | ||
.default="d", | ||
Metadata_WellRow="c", | ||
Metadata_WellCol="c", | ||
Metadata_Well="c", | ||
Metadata_gene_name="c", | ||
Metadata_genotype="c" | ||
) | ||
) %>% dplyr::select(-...1) # Drop index col | ||
|
||
print(dim(cp_df)) | ||
head(cp_df, 3) | ||
|
||
# Split metadata and feature data | ||
cp_metadata_df <- cp_df %>% dplyr::select(tidyr::starts_with("Metadata")) | ||
cp_meta_cols <- colnames(cp_metadata_df) | ||
cp_df <- cp_df %>% dplyr::select(-!!cp_meta_cols) | ||
|
||
# Calculate correlation matrix from feature data | ||
cp_cor_matrix <- t(cp_df) %>% cor() | ||
|
||
print(dim(cp_cor_matrix)) | ||
head(cp_cor_matrix, 3) | ||
|
||
ht <- Heatmap( | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So since you make a correlation matrix in R and then make a heatmap, why is the one that I created in Python so different to this one? I see there is a lot of code to make this heatmap so is R better equipped to make a better correlation heatmap? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The scale in python is different from the scale in R. ComplexHeatmap auto scales, which is a caveat to interpretation. It's also possible that the dendrogram ordering is different in python |
||
cp_cor_matrix, | ||
name = "Pearson\nCorrelation", | ||
column_dend_side = "top", | ||
|
||
clustering_method_columns = "average", | ||
clustering_method_rows = "average", | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What is this clustering method for? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is the method for hierarchical clustering (determines how the dendrogram is ordered). See https://jokergoo.github.io/ComplexHeatmap/reference/Heatmap.html |
||
|
||
top_annotation = HeatmapAnnotation( | ||
Genotype = cp_metadata_df$Metadata_genotype, | ||
CellCount = anno_barplot( | ||
cp_metadata_df$Metadata_number_of_singlecells, | ||
height = unit(1, "cm") | ||
), | ||
Well = cp_metadata_df$Metadata_Well, | ||
|
||
col = list( | ||
Genotype = genotype_cols, | ||
Well = well_cols | ||
) | ||
) | ||
) | ||
|
||
draw(ht) | ||
|
||
# Save heatmap to file | ||
pdf(paste0(cp_heatmap_file_noext, ".pdf")) | ||
draw(ht) | ||
dev.off() | ||
|
||
png(paste0(cp_heatmap_file_noext, ".png"), width = 6.5, height = 6, units = "in", res = 500) | ||
draw(ht) | ||
dev.off() | ||
|
||
|
||
|
||
# Load data | ||
dp_file <- file.path(input_data_dir, "nf1_sc_norm_fs_deepprofiler_nuc.csv.gz") | ||
|
||
dp_df <- readr::read_csv( | ||
dp_file, | ||
col_types = readr::cols( | ||
.default="d", | ||
Metadata_Plate="c", | ||
Metadata_Well="c", | ||
Metadata_Site="c", | ||
Metadata_Plate_Map_Name="c", | ||
Metadata_DNA="c", | ||
Metadata_ER="c", | ||
Metadata_Actin="c", | ||
Metadata_Genotype="c", | ||
Metadata_Genotype_Replicate="c", | ||
Metadata_Model="c" | ||
) | ||
) | ||
|
||
print(dim(dp_df)) | ||
head(dp_df, 3) | ||
|
||
# Split metadata and feature data | ||
dp_metadata_df <- dp_df %>% dplyr::select(tidyr::starts_with("Metadata")) | ||
dp_meta_cols <- colnames(dp_metadata_df) | ||
dp_meta_cols <- c(dp_meta_cols, c("Location_Center_X", "Location_Center_Y")) | ||
|
||
dp_df <- dp_df %>% dplyr::select(-!!dp_meta_cols) | ||
|
||
# Calculate number of single cells per well in DP data | ||
dp_metadata_df <- dp_metadata_df %>% | ||
dplyr::group_by(Metadata_Well) %>% | ||
dplyr::add_tally(name = "Metadata_cell_count") | ||
|
||
# Calculate correlation matrix from feature data | ||
dp_cor_matrix <- t(dp_df) %>% cor() | ||
|
||
print(dim(dp_cor_matrix)) | ||
head(dp_cor_matrix, 3) | ||
|
||
ht <- Heatmap( | ||
dp_cor_matrix, | ||
name = "Pearson\nCorrelation", | ||
column_dend_side = "top", | ||
|
||
clustering_method_columns = "average", | ||
clustering_method_rows = "average", | ||
|
||
top_annotation = HeatmapAnnotation( | ||
Genotype = dp_metadata_df$Metadata_Genotype, | ||
CellCount = anno_barplot( | ||
dp_metadata_df$Metadata_cell_count, | ||
height = unit(1, "cm") | ||
), | ||
Well = dp_metadata_df$Metadata_Well, | ||
|
||
col = list( | ||
Genotype = genotype_cols, | ||
Well = well_cols | ||
) | ||
) | ||
) | ||
|
||
draw(ht) | ||
|
||
# Save heatmap to file | ||
pdf(paste0(dp_heatmap_file_noext, ".pdf")) | ||
draw(ht) | ||
dev.off() | ||
|
||
png(paste0(dp_heatmap_file_noext, ".png"), width = 6.5, height = 6, units = "in", res = 500) | ||
draw(ht) | ||
dev.off() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How did you decide on these color specifically? Would it not be easy to use a cmap or preset color palette to use?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could also be an R related thing I don't know about now that I think of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I decided on these colors for a separate project 😂 (https://github.com/broadinstitute/profiling-resistance-mechanisms/blob/master/3.resistance-signature/scripts/nbconverted/7.profile-heatmaps.r)
I picked them because they are compatible with different kinds of color blindness. I don't remember exactly, but I think I used https://colorbrewer2.org