In [2]:
library(treeio)
library(ggtree)
library(ggplot2)
library(ape)
library(ComplexHeatmap)
library(cowplot)
library(circlize)
library(ggtreeExtra)
library(ggnewscale)

treeio v1.26.0 For help: https://yulab-smu.top/treedata-book/


ggtree v3.10.1

In [3]:
path = '/data3/wangkun/mtsim_res/20240903/const_100/34468/'
gen = 30
lines <- readLines(paste0(path, "tree_color_", gen, ".txt"))
# Extract the tip_label and category
data <- do.call(rbind, lapply(lines, function(line) {
  parts <- strsplit(line, " ")[[1]]
  tip_label <- parts[1]
  category <- parts[2]
  return(c(tip_label, category))
}))
# Convert to data.frame
df <- data.frame(tip_label = data[, 1], category = data[, 2], stringsAsFactors = FALSE)
# Regex: any character, "_<gen>", anything, then a ".nwk" suffix
nwk_files <- list.files(path, pattern = paste0("._", gen, ".*\\.nwk$"), full.names = FALSE)
file <- nwk_files[1]
tree <- read.tree(paste0(path, file))
tree_plot <- ggtree(tree, layout = 'fan', open.angle = 0, branch.length = 'none', size = 0.5)
# Attach each tip's category from df, matched by tip label
is_tip <- tree_plot$data$isTip == TRUE
tree_plot$data[is_tip, 'cluster'] <- df$category[match(tree_plot$data$label[is_tip], df$tip_label)]

# Each category string is passed to the fill scale as its own colour value
categories <- unique(df$category)
color_vector <- setNames(categories, categories)

options(repr.plot.width = 6, repr.plot.height = 6)
plot <- tree_plot +
    new_scale_fill() +
    geom_fruit(
        geom = geom_tile,
        mapping = aes(fill = cluster),
        pwidth = max(tree_plot$data$x) * 0.25,
        offset = 0.15
    ) +
    scale_fill_manual(
        name = "cluster",
        breaks = categories,
        values = color_vector
    ) +
    theme(
        legend.position = "none",
        plot.margin = unit(c(0, 0, 0, 0), "cm")
    )

ggsave(paste0("../figs/tree_100/", 'test.pdf'), plot = plot, width = 6, height = 6, device = cairo_pdf, units = "in", limitsize = FALSE)




[1m[22mScale for [32my[39m is already present.
Adding another scale for [32my[39m, which will replace the existing scale.
“[1m[22mThe column of [32mx[39m aesthetic only have one unique value with `geom = geom_tile`,
and the `width` of `geom_tile()` is not provided, the `pwidth` will be as
`width`.”


In [13]:
path = '/data3/wangkun/mtsim_res/20240903/const_100/34468/'
for (gen in c(30, 130, 330)) {
    lines <- readLines(paste0(path, "tree_color_", gen, ".txt"))
    # Extract the tip_label and category
    data <- do.call(rbind, lapply(lines, function(line) {
      parts <- strsplit(line, " ")[[1]]
      tip_label <- parts[1]
      category <- parts[2]
      return(c(tip_label, category))
    }))
    # Convert to data.frame
    df <- data.frame(tip_label = data[, 1], category = data[, 2], stringsAsFactors = FALSE)
    # Regex: any character, "_<gen>", anything, then a ".nwk" suffix
    nwk_files <- list.files(path, pattern = paste0("._", gen, ".*\\.nwk$"), full.names = FALSE)
    for (file in nwk_files) {
        tree <- read.tree(paste0(path, file))
        tree_plot <- ggtree(tree,layout = 'fan', open.angle = 0,branch.length='none',size=0.5)
        tree_plot$data[tree_plot$data$isTip == TRUE, 'cluster'] <- df$category[match(tree_plot$data$label[tree_plot$data$isTip == TRUE], df$tip_label)]
        
        categories <- unique(df$category)
        color_vector <- setNames(categories, categories)
        
        options(repr.plot.width=6, repr.plot.height=6)
        plot <- tree_plot +
             new_scale_fill() +
             geom_fruit(
                 geom=geom_tile,
                 mapping=aes(fill=cluster),
                 pwidth=max(tree_plot$data$x)*0.25,
                 offset=0.15
             ) + scale_fill_manual(
                 name="cluster",
                 breaks=categories,
                 values=color_vector
             ) + theme(
                 legend.position = "none",
                 plot.margin = unit(c(0, 0, 0, 0), "cm")
             )

        ggsave(paste0("../figs/tree_100/", sub("\\.nwk$", ".pdf", file)), plot = plot, width = 6, height = 6, device = cairo_pdf, units = "in", limitsize = FALSE)
    }
}


[1m[22mScale for [32my[39m is already present.
Adding another scale for [32my[39m, which will replace the existing scale.
“[1m[22mThe column of [32mx[39m aesthetic only have one unique value with `geom = geom_tile`,
and the `width` of `geom_tile()` is not provided, the `pwidth` will be as
`width`.”
[1m[22mScale for [32my[39m is already present.
Adding another scale for [32my[39m, which will replace the existing scale.
“[1m[22mThe column of [32mx[39m aesthetic only have one unique value with `geom = geom_tile`,
and the `width` of `geom_tile()` is not provided, the `pwidth` will be as
`width`.”
[1m[22mScale for [32my[39m is already present.
Adding another scale for [32my[39m, which will replace the existing scale.
“[1m[22mThe column of [32mx[39m aesthetic only have one unique value with `geom = geom_tile`,
and the `width` of `geom_tile()` is not provided, the `pwidth` will be as
`width`.”
[1m[22mScale for [32my[39m is already present.
Adding another scal

In [14]:
path = '/data3/wangkun/mtsim_res/20240903/const_10/243093/'
for (gen in c(30, 130, 330)) {
    lines <- readLines(paste0(path, "tree_color_", gen, ".txt"))
    # Extract the tip_label and category
    data <- do.call(rbind, lapply(lines, function(line) {
      parts <- strsplit(line, " ")[[1]]
      tip_label <- parts[1]
      category <- parts[2]
      return(c(tip_label, category))
    }))
    # Convert to data.frame
    df <- data.frame(tip_label = data[, 1], category = data[, 2], stringsAsFactors = FALSE)
    # Regex: any character, "_<gen>", anything, then a ".nwk" suffix
    nwk_files <- list.files(path, pattern = paste0("._", gen, ".*\\.nwk$"), full.names = FALSE)
    for (file in nwk_files) {
        tree <- read.tree(paste0(path, file))
        tree_plot <- ggtree(tree,layout = 'fan', open.angle = 0,branch.length='none',size=0.5)
        tree_plot$data[tree_plot$data$isTip == TRUE, 'cluster'] <- df$category[match(tree_plot$data$label[tree_plot$data$isTip == TRUE], df$tip_label)]
        
        categories <- unique(df$category)
        color_vector <- setNames(categories, categories)
        
        options(repr.plot.width=6, repr.plot.height=6)
        plot <- tree_plot +
             new_scale_fill() +
             geom_fruit(
                 geom=geom_tile,
                 mapping=aes(fill=cluster),
                 pwidth=max(tree_plot$data$x)*0.25,
                 offset=0.15
             ) + scale_fill_manual(
                 name="cluster",
                 breaks=categories,
                 values=color_vector
             ) + theme(
                 legend.position = "none",
                 plot.margin = unit(c(0, 0, 0, 0), "cm")
             )

        ggsave(paste0("../figs/tree_10/", sub("\\.nwk$", ".pdf", file)), plot = plot, width = 6, height = 6, device = cairo_pdf, units = "in", limitsize = FALSE)
    }
}
# # Remove blank edges using grid
# pdf("test_no_blank_edges.pdf", width = 6, height = 6)
# grid.draw(plot)
# dev.off()

[1m[22mScale for [32my[39m is already present.
Adding another scale for [32my[39m, which will replace the existing scale.
“[1m[22mThe column of [32mx[39m aesthetic only have one unique value with `geom = geom_tile`,
and the `width` of `geom_tile()` is not provided, the `pwidth` will be as
`width`.”
[1m[22mScale for [32my[39m is already present.
Adding another scale for [32my[39m, which will replace the existing scale.
“[1m[22mThe column of [32mx[39m aesthetic only have one unique value with `geom = geom_tile`,
and the `width` of `geom_tile()` is not provided, the `pwidth` will be as
`width`.”
[1m[22mScale for [32my[39m is already present.
Adding another scale for [32my[39m, which will replace the existing scale.
“[1m[22mThe column of [32mx[39m aesthetic only have one unique value with `geom = geom_tile`,
and the `width` of `geom_tile()` is not provided, the `pwidth` will be as
`width`.”
[1m[22mScale for [32my[39m is already present.
Adding another scal

In [37]:
path = '/data3/wangkun/mtsim_res/20240903/const_100/292066/'
# path =  '/data3/wangkun/mtsim_res/20240903/const_10/243093/'

for (gen in c(30, 130, 330)) {
    lines <- readLines(paste0(path, "tree_color_", gen, ".txt"))
    data <- do.call(rbind, lapply(lines, function(line) {
      parts <- strsplit(line, " ")[[1]]
      tip_label <- parts[1]
      category <- parts[2]
      return(c(tip_label, category))
    }))
    df <- data.frame(tip_label = data[, 1], category = data[, 2], stringsAsFactors = FALSE)
    

    
    # Regex: any character, "_<gen>", anything, then a ".nwk" suffix
    nwk_files <- list.files(path, pattern = paste0("._", gen, ".*\\.nwk$"), full.names = FALSE)
    for (file in nwk_files) {
        # file = nwk_files[1]
        
        lines_mt <- readLines(paste0(path, "rf_color_", file, ".txt"))
        data_mt <- do.call(rbind, lapply(lines_mt, function(line) {
          parts <- strsplit(line, " ")[[1]]
          tip_label <- parts[1]
          category <- parts[2]
          return(c(tip_label, category))
        }))
        df_mt <- data.frame(tip_label = data_mt[, 1], category = data_mt[, 2], stringsAsFactors = FALSE)
        
        tree <- read.tree(paste0(path, file))
        tree_plot <- ggtree(tree, layout = 'fan', open.angle = 0, branch.length = 'none', size = 0.5)
        tree_plot$data[tree_plot$data$isTip == TRUE, 'cluster'] <- df$category[match(tree_plot$data$label[tree_plot$data$isTip == TRUE], df$tip_label)]
        tree_plot$data[tree_plot$data$isTip == TRUE, 'cluster_mt'] <- df_mt$category[match(tree_plot$data$label[tree_plot$data$isTip == TRUE], df_mt$tip_label)]
        
        categories <- unique(df$category)
        categories_mt <- unique(df_mt$category)
        
        options(repr.plot.width = 6, repr.plot.height = 6)
        plot <- tree_plot
        # Files without 'gt' in their name get an extra inner ring coloured by cluster_mt
        if (!grepl('gt', file)) {
            plot <- plot +
                new_scale_fill() +
                geom_fruit(
                    geom = geom_tile,
                    mapping = aes(fill = cluster_mt),
                    pwidth = max(tree_plot$data$x) * 0.25,
                    offset = 0.15
                ) +
                scale_fill_manual(
                    name = "cluster_mt",
                    breaks = categories_mt,
                    values = categories_mt
                )
        }
        # Every tree gets the ring coloured by cluster
        plot <- plot +
            new_scale_fill() +
            geom_fruit(
                geom = geom_tile,
                mapping = aes(fill = cluster),
                pwidth = max(tree_plot$data$x) * 0.25,
                offset = 0.25
            ) +
            scale_fill_manual(
                name = "cluster",
                breaks = categories,
                values = categories
            ) +
            theme(
                legend.position = "none",
                plot.margin = unit(c(0, 0, 0, 0), "cm")
            )
        # print(plot)
        ggsave(paste0("../figs/tree_100/", sub("\\.nwk$", ".pdf", file)), plot = plot, width = 6, height = 6, device = cairo_pdf, units = "in", limitsize = FALSE)
    }
}


[1m[22mScale for [32my[39m is already present.
Adding another scale for [32my[39m, which will replace the existing scale.
“[1m[22mThe column of [32mx[39m aesthetic only have one unique value with `geom = geom_tile`,
and the `width` of `geom_tile()` is not provided, the `pwidth` will be as
`width`.”
[1m[22mScale for [32my[39m is already present.
Adding another scale for [32my[39m, which will replace the existing scale.
“[1m[22mThe column of [32mx[39m aesthetic only have one unique value with `geom = geom_tile`,
and the `width` of `geom_tile()` is not provided, the `pwidth` will be as
`width`.”
“[1m[22mThe column of [32mx[39m aesthetic only have one unique value with `geom = geom_tile`,
and the `width` of `geom_tile()` is not provided, the `pwidth` will be as
`width`.”
[1m[22mScale for [32my[39m is already present.
Adding another scale for [32my[39m, which will replace the existing scale.
“[1m[22mThe column of [32mx[39m aesthetic only have one unique val

In [36]:
# path = '/data3/wangkun/mtsim_res/20240903/const_100/292066/'
path =  '/data3/wangkun/mtsim_res/20240903/const_10/243093/'

for (gen in c(30, 130, 330)) {
    lines <- readLines(paste0(path, "tree_color_", gen, ".txt"))
    data <- do.call(rbind, lapply(lines, function(line) {
      parts <- strsplit(line, " ")[[1]]
      tip_label <- parts[1]
      category <- parts[2]
      return(c(tip_label, category))
    }))
    df <- data.frame(tip_label = data[, 1], category = data[, 2], stringsAsFactors = FALSE)
    

    
    # Regex: any character, "_<gen>", anything, then a ".nwk" suffix
    nwk_files <- list.files(path, pattern = paste0("._", gen, ".*\\.nwk$"), full.names = FALSE)
    for (file in nwk_files) {
        # file = nwk_files[1]
        
        lines_mt <- readLines(paste0(path, "rf_color_", file, ".txt"))
        data_mt <- do.call(rbind, lapply(lines_mt, function(line) {
          parts <- strsplit(line, " ")[[1]]
          tip_label <- parts[1]
          category <- parts[2]
          return(c(tip_label, category))
        }))
        df_mt <- data.frame(tip_label = data_mt[, 1], category = data_mt[, 2], stringsAsFactors = FALSE)
        
        tree <- read.tree(paste0(path, file))
        tree_plot <- ggtree(tree, layout = 'fan', open.angle = 0, branch.length = 'none', size = 0.5)
        tree_plot$data[tree_plot$data$isTip == TRUE, 'cluster'] <- df$category[match(tree_plot$data$label[tree_plot$data$isTip == TRUE], df$tip_label)]
        tree_plot$data[tree_plot$data$isTip == TRUE, 'cluster_mt'] <- df_mt$category[match(tree_plot$data$label[tree_plot$data$isTip == TRUE], df_mt$tip_label)]
        
        categories <- unique(df$category)
        categories_mt <- unique(df_mt$category)
        
        options(repr.plot.width = 6, repr.plot.height = 6)
        plot <- tree_plot
        # Files without 'gt' in their name get an extra inner ring coloured by cluster_mt
        if (!grepl('gt', file)) {
            plot <- plot +
                new_scale_fill() +
                geom_fruit(
                    geom = geom_tile,
                    mapping = aes(fill = cluster_mt),
                    pwidth = max(tree_plot$data$x) * 0.25,
                    offset = 0.15
                ) +
                scale_fill_manual(
                    name = "cluster_mt",
                    breaks = categories_mt,
                    values = categories_mt
                )
        }
        # Every tree gets the ring coloured by cluster
        plot <- plot +
            new_scale_fill() +
            geom_fruit(
                geom = geom_tile,
                mapping = aes(fill = cluster),
                pwidth = max(tree_plot$data$x) * 0.25,
                offset = 0.25
            ) +
            scale_fill_manual(
                name = "cluster",
                breaks = categories,
                values = categories
            ) +
            theme(
                legend.position = "none",
                plot.margin = unit(c(0, 0, 0, 0), "cm")
            )
        # print(plot)
        ggsave(paste0("../figs/tree_10/", sub("\\.nwk$", ".pdf", file)), plot = plot, width = 6, height = 6, device = cairo_pdf, units = "in", limitsize = FALSE)
    }
}


[1m[22mScale for [32my[39m is already present.
Adding another scale for [32my[39m, which will replace the existing scale.
“[1m[22mThe column of [32mx[39m aesthetic only have one unique value with `geom = geom_tile`,
and the `width` of `geom_tile()` is not provided, the `pwidth` will be as
`width`.”
[1m[22mScale for [32my[39m is already present.
Adding another scale for [32my[39m, which will replace the existing scale.
“[1m[22mThe column of [32mx[39m aesthetic only have one unique value with `geom = geom_tile`,
and the `width` of `geom_tile()` is not provided, the `pwidth` will be as
`width`.”
“[1m[22mThe column of [32mx[39m aesthetic only have one unique value with `geom = geom_tile`,
and the `width` of `geom_tile()` is not provided, the `pwidth` will be as
`width`.”
[1m[22mScale for [32my[39m is already present.
Adding another scale for [32my[39m, which will replace the existing scale.
“[1m[22mThe column of [32mx[39m aesthetic only have one unique val