Add prune_cutree_to_dendlist? #61

Closed
talgalili opened this issue Jan 10, 2018 · 6 comments

Comments

@talgalili
Owner

Answers:
https://stackoverflow.com/questions/48167369/r-getting-subtrees-from-dendrogram-based-on-cutree-labels/48180908#48180908

# install.packages("dendextend")

library(dendextend)
dend <- iris[,-5] %>% dist %>% hclust %>% as.dendrogram %>% 
  set("labels_to_character")
dend <- dend %>% color_branches(k=5)

# plot(dend)

prune_cutree_to_dendlist <- function(dend, k) {
  clusters <- cutree(dend,k, order_clusters_as_data = FALSE)
  # unique_clusters <- unique(clusters) # could also be 1:k but it would be less robust
  # k <- length(unique_clusters)
  # for(i in unique_clusters) { 
  dends <- vector("list", k)
  for(i in 1:k) { 
    leaves_to_prune <- labels(dend)[clusters != i]
    dends[[i]] <- prune(dend, leaves_to_prune)
    
  }
  
  class(dends) <- "dendlist"
  
  dends
}

prunned_dends <- prune_cutree_to_dendlist(dend, 5)
sapply(prunned_dends, nleaves)

par(mfrow = c(2,3))
plot(dend, main = "original dend")
sapply(prunned_dends, plot)

@lucacerone

lucacerone commented Jan 10, 2018

Hi Tal, thanks again for this function! Just one more question.

Say I want to make a heatmap of cluster 1 using heatmap.2.
I get the subset with

sub_data = data[clusters == 1, ]
# now I get the relevant dendrogram as output by your function
sub_dend = prunned_dends[[1]]

What about the ordering? Your function uses order_clusters_as_data = FALSE. How can I rearrange the dendrogram so that, when I pass it as the row dendrogram to heatmap.2, it actually corresponds to the data? (Apologies for the silly question, I am very new to dendextend.)

@talgalili
Owner Author

Hi @lucacerone
It is better to subset the data and re-run hclust on it (or just let heatmap.2 do the trick). Pruned trees have values in their leaves that may lead to odd behavior, so using them directly is not advised. Also, have a look at the heatmaply R package for an interactive version of heatmap.2. Cheers.
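
For reference, the "subset and re-cluster" route could look roughly like the sketch below. It is only an illustration: it assumes the iris data from the example above, the variable names (full_hc, sub_data) are made up for this sketch, and the heatmap is drawn only if gplots happens to be installed.

```r
# Cluster the full data once and assign every row to one of 5 clusters
full_hc  <- hclust(dist(iris[, -5]))
clusters <- cutree(full_hc, k = 5)

# Keep only the rows of cluster 1 and build a fresh dendrogram from them
sub_data <- as.matrix(iris[clusters == 1, -5])
sub_dend <- as.dendrogram(hclust(dist(sub_data)))

# The new dendrogram is indexed 1..nrow(sub_data), so it matches sub_data directly
if (requireNamespace("gplots", quietly = TRUE)) {
  gplots::heatmap.2(sub_data, Rowv = sub_dend, trace = "none")
}
```

Because the subset is clustered again from scratch, the arrangement of rows may differ from what the full heatmap showed.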

@lucacerone

Hi @talgalili, what I actually wanted to do (which also prompted my original question on Stack Overflow) is to zoom in "as is" on a heatmap. If I re-run hclust, some sub-patterns that appear in the heatmap of the full data can change position, and in a presentation it is harder to explain that what people are looking at is exactly the same as before, just rearranged.

I'll surely have a look at heatmaply soon :)

@talgalili
Owner Author

Hi @lucacerone, the best way to zoom into the heatmap (unless you use heatmaply, which allows interactive zooming) is to pick the subset of items in the cluster and run the heatmap with that sub-dendrogram and the matching subset of the data. Here is example code for how to do it using the new get_subdendrograms function:


# needed packages:
# install.packages("gplots")
# install.packages("viridis")
# install.packages("devtools")
# devtools::install_github('talgalili/dendextend') # dendextend from github

# define dendrogram object to play with:
library(dendextend)
dend <- iris[,-5] %>% dist %>% hclust %>% as.dendrogram %>%
  set("labels_to_character") %>% color_branches(k=5)
dend_list <- get_subdendrograms(dend, 5)

# Plotting the result
par(mfrow = c(2,3))
plot(dend, main = "Original dendrogram")
sapply(dend_list, plot)

# plot a heatmap of only one of the sub dendrograms
par(mfrow = c(1,1))
library(gplots)
sub_dend <- dend_list[[1]] # get the sub dendrogram
# check the size of the sub dendrogram
nleaves(sub_dend)
length(order.dendrogram(sub_dend))
# get the subset of the data, keeping the rows in their original order
# so that the rank()-based order below points each leaf at its own row
subset_iris <- as.matrix(iris[sort(order.dendrogram(sub_dend)),-5])
# update the dendrogram's internal order so as not to cause an error in heatmap.2
order.dendrogram(sub_dend) <- rank(order.dendrogram(sub_dend))
heatmap.2(subset_iris, Rowv = sub_dend, trace = "none", col = viridis::viridis(100))
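
One way to check that the rows of the data line up with the leaves of the sub dendrogram is to compare row names with leaf labels before plotting. The snippet below is just a sketch of that check: it assumes the installed dendextend provides get_subdendrograms and that the sub dendrogram's leaves still hold the original row numbers; leaf_rows, sub_data and aligned are names invented for this illustration.

```r
library(dendextend)

dend     <- as.dendrogram(hclust(dist(iris[, -5])))
sub_dend <- get_subdendrograms(dend, 5)[[1]]

# Original row numbers of the leaves, in left-to-right leaf order
leaf_rows <- order.dendrogram(sub_dend)

# Keep the data subset in its original row order
sub_data <- as.matrix(iris[sort(leaf_rows), -5])

# Re-index the leaves to 1..n so that each leaf points at its row in sub_data
order.dendrogram(sub_dend) <- as.integer(rank(leaf_rows))

# After reordering the rows by the dendrogram order, row i should carry
# the label of leaf i
aligned <- all(rownames(sub_data)[order.dendrogram(sub_dend)] == labels(sub_dend))
aligned
```

If aligned comes back FALSE, the heatmap rows would not correspond to the branches drawn next to them.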



@lucacerone

Thank you so much Tal! This is really helpful!
Cheers!

@talgalili
Owner Author

talgalili commented Jan 22, 2018 via email
