Feature Request: Reporting the intersection results #85
Comments
I would also appreciate this feature! |
Yes, I'd love this feature, too and would be very keen to get the lists of elements for intersections. |
I would also love to have this feature! Or if anyone (@rotifergirl @sivarajankarunanithi @janstrauss1) has found a workaround, please let me know! |
The only work-around I've found is
This way, |
I have found a work-around using a web tool developed by the group of Yves van de Peer at the University of Ghent, Belgium. It prints the lists of elements that are in each intersection or are unique to a certain list. The tool is accessible at http://bioinformatics.psb.ugent.be/webtools/Venn/. |
@janstrauss1 Thanks for that solution! |
@ngehlenborg @JakeConway @alexsb |
While it would be best if upset() would return a container with the row numbers of the members of each set, in the meantime, I wrote some code using
For convenience, if you only have your data in the form of a named list (like me) here is a modified version of the
|
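For readers who only have a named list, the idea of a membership table that keeps the element names can be sketched in base R as follows. This is only an illustration, not the code posted above: `membership_table` is a hypothetical helper name, and it does not call UpSetR at all.

```r
# Hypothetical helper (not part of UpSetR): build a 0/1 membership table
# from a named list, keeping the element names as row names.
membership_table <- function(sets) {
  elements <- unique(unlist(sets, use.names = FALSE))
  # One column per set: 1L if the element is in that set, 0L otherwise
  mat <- vapply(sets, function(s) as.integer(elements %in% s),
                integer(length(elements)))
  # vapply() returns a plain vector when there is only one element,
  # so rebuild the matrix shape and attach row and column names explicitly
  mat <- matrix(mat, nrow = length(elements),
                dimnames = list(elements, names(sets)))
  as.data.frame(mat)
}

sets <- list(one = c("a", "b", "c"), two = c("a", "d"), three = c("a", "c"))
tab <- membership_table(sets)
tab
```

With the names kept as row names, a lookup such as `rownames(tab)[tab$one == 1 & tab$three == 1]` returns the elements shared by `one` and `three`.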
Also interested in this feature. @docmanny's solution worked very well for me in the meantime though, thanks a lot!! |
This feature would be very helpful. |
Hi! First post here, hope I make myself clear :) To find the list of intersected values I used "Reduce" and "combn". I reused code from a previous post for this (https://stackoverflow.com/questions/24748170/finding-all-possible-combinations-of-vector-intersections). The code:
Where "v" is the It's important to note that UpSetR only plots the values unique to each intersection. This means that the intersection of lists 2, 3 and 4 contains only the values found in exactly those three lists, not the values that also appear in list 1 (the intersection of lists 1, 2, 3 and 4). |
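The difference between plain intersections and the exclusive ones that UpSetR plots can be sketched with `combn` and `Reduce`. This is an illustrative reconstruction under assumptions, not the code from the post: `all_intersections` and its `exclusive` argument are hypothetical names, and the input is assumed to be a named list of character vectors.

```r
sets <- list(one   = c("a", "b", "c", "e"),
             two   = c("a", "b", "d"),
             three = c("a", "c", "d", "f"))

# Hypothetical helper: intersect every combination of sets.
# exclusive = TRUE keeps only elements found in no other set,
# which matches the counts UpSetR plots.
all_intersections <- function(sets, exclusive = FALSE) {
  nms <- names(sets)
  # All combinations of set names, from single sets up to all sets
  combos <- unlist(lapply(seq_along(nms),
                          function(k) combn(nms, k, simplify = FALSE)),
                   recursive = FALSE)
  out <- lapply(combos, function(cmb) {
    shared <- Reduce(intersect, sets[cmb])
    if (exclusive) {
      # Drop anything that also occurs in a set outside this combination
      others <- unlist(sets[setdiff(nms, cmb)], use.names = FALSE)
      shared <- setdiff(shared, others)
    }
    shared
  })
  names(out) <- vapply(combos, paste, character(1), collapse = ":")
  out
}

inclusive <- all_intersections(sets)
exclusive <- all_intersections(sets, exclusive = TRUE)
inclusive[["one:two"]]   # "a" "b"
exclusive[["one:two"]]   # "b"  ("a" is also in three)
```

An exclusive intersection can be empty even when the plain one is not: here `exclusive[["two"]]` has length zero because every member of `two` also appears in another set.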
I tried using the great Tried first with the movies DF
and later I did this:
But... the docmanny FX
So, am I missing something? Any suggestions welcome...
|
I also think this would be very useful. |
+1 I'd love it too |
@sfd99 Hi! Sorry I just saw the @ mention, but I do have an answer for you if it still helps. Note that the
However, this does:
You can also see in the UpSetR graph that Goblin and Unicorn only appear together in larger sets, so there are no members that are exclusive to that intersection. As a result, Hope that helps! |
Maybe I overlooked it, but having played around with the above solutions, I was still missing an all-in-one function that produces a list of all occurring group combinations. Note that it works best with a named matrix created by the modified fromList() function. The function below is a bit bulky because of the comments, which show intermediate results to help understand what's going on:
overlapGroups <- function (listInput, sort = TRUE) {
  # listInput could look like this:
  # $one
  # [1] "a" "b" "c" "e" "g" "h" "k" "l" "m"
  # $two
  # [1] "a" "b" "d" "e" "j"
  # $three
  # [1] "a" "e" "f" "g" "h" "i" "j" "l" "m"
  listInputmat <- fromList(listInput) == 1
  #     one  two three
  # a  TRUE TRUE  TRUE
  # b  TRUE TRUE FALSE
  #...
  # condensing matrix to unique combinations of elements
  listInputunique <- unique(listInputmat)
  grouplist <- list()
  # going through all unique combinations and collecting the elements for each in a list
  for (i in seq_len(nrow(listInputunique))) {
    currentRow <- listInputunique[i,]
    myelements <- which(apply(listInputmat,1,function(x) all(x == currentRow)))
    attr(myelements, "groups") <- currentRow
    grouplist[[paste(colnames(listInputunique)[currentRow], collapse = ":")]] <- myelements
    # myelements then looks like this, e.g. for the group exclusive to "three":
    # attr(,"groups")
    #   one   two three
    # FALSE FALSE  TRUE
    #  f  i
    # 12 13
  }
  if (sort) {
    grouplist <- grouplist[order(lengths(grouplist), decreasing = TRUE)]
  }
  # save element list to facilitate access using an index in case rownames are not named
  attr(grouplist, "elements") <- unique(unlist(listInput))
  return(grouplist)
}
How to use it (use case):
library(UpSetR)
# example of list input (list of named vectors)
listInput <- list(one = letters[ c(1, 2, 3, 5, 7, 8, 11, 12, 13) ],
two = letters[ c(1, 2, 4, 5, 10) ],
three = letters[ c(1, 5, 6, 7, 8, 9, 10, 12, 13) ])
### that's pretty much all that's needed..
li <- overlapGroups(listInput)
###
# list of all elements:
attr(li, "elements")
# [1] "a" "b" "c" "e" "g" "h" "k" "l" "m" "d" "j" "f" "i"
# which elements are in the biggest group?
li[1]
# $`one:three`
# g h l m
# 5 6 8 9
# attr(,"groups")
# one two three
# TRUE FALSE TRUE
names(li[[1]])
# [1] "g" "h" "l" "m"
attr(li, "elements")[li[[1]]]
# [1] "g" "h" "l" "m"
# full list
li
# $`one:three`
# g h l m
# 5 6 8 9
# attr(,"groups")
# one two three
# TRUE FALSE TRUE
#
# $`one:two:three`
# a e
# 1 4
# attr(,"groups")
# one two three
# TRUE TRUE TRUE
#
##### cut out a bit #####
# $`two:three`
# j
# 11
# attr(,"groups")
# one two three
# FALSE TRUE TRUE
#
# attr(,"elements")
# [1] "a" "b" "c" "e" "g" "h" "k" "l" "m" "d" "j" "f" "i" |
Hey everyone, I was also struggling with this, so I came up with a solution too. I'm sure it could be more elegant, but I think it does the trick (improvements welcome, of course!). In my case, docmanny's solution did not work. I work with a long list of genes (>800) across several disease types (mutated vs. not mutated). The function I provide takes a data frame as input, with each row a gene and each column a disease (i.e. a set). It is not designed to handle any other columns, so please exclude those that contain extra annotations (for example, in movies.csv, columns like AvgRating should be excluded). The user can provide sets of interest to be included in or excluded from the intersection of interest (or both).
Hope it helps, David |
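The include/exclude idea described above can be sketched in base R as follows. This is not David's function, only a minimal illustration under assumptions: `rows_in_intersection` is a hypothetical name, the data frame is assumed to hold only 0/1 set columns, and the gene names are assumed to be stored as row names.

```r
# Hypothetical helper: names of rows that are 1 in every `include` column
# and 0 in every `exclude` column of a 0/1 data frame.
rows_in_intersection <- function(df, include, exclude = character(0)) {
  stopifnot(all(c(include, exclude) %in% names(df)))
  # A row qualifies if it is a member of all the included sets ...
  in_all <- rowSums(df[, include, drop = FALSE] == 1) == length(include)
  # ... and of none of the excluded sets (trivially true if none are given)
  in_none <- if (length(exclude) > 0) {
    rowSums(df[, exclude, drop = FALSE] == 1) == 0
  } else {
    rep(TRUE, nrow(df))
  }
  rownames(df)[in_all & in_none]
}

# Made-up example data: rows are genes, columns are diseases
genes <- data.frame(
  diseaseA = c(1, 1, 0, 1),
  diseaseB = c(1, 0, 1, 1),
  diseaseC = c(0, 0, 1, 1),
  row.names = c("gene1", "gene2", "gene3", "gene4")
)

rows_in_intersection(genes, include = c("diseaseA", "diseaseB"))
# "gene1" "gene4"
rows_in_intersection(genes, include = c("diseaseA", "diseaseB"),
                     exclude = "diseaseC")
# "gene1"
```

Columns named in neither argument are ignored, so leaving `exclude` empty gives the plain intersection rather than the exclusive one that UpSetR plots.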
This would be a valuable addition. I have been using the "Vennerable" package to get intersections for any number of groups.
"temp" will contain all intersection values. Hope this is helpful. ST |
@docmanny, yours was a great suggestion! On my dataset, however, it doesn't work, even though my data is formatted similarly.
@docmanny do you have any idea where the problem might be? |
@efr3m It was actually a really subtle error born from an assumption on my part. I assumed people would give dataframes with integer values, but 1 and 0 can also be numeric. Because in Use this variant of the function instead:
You should get:
|
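The exact check that failed is not shown in this thread, so the following is only an illustration of how integer and numeric 0/1 columns can diverge under a strict type test, and of one possible way to normalize them. `to_integer_columns` is a hypothetical helper name.

```r
# The same 0/1 table stored two ways
as_int <- data.frame(A = c(1L, 0L), B = c(1L, 1L))
as_num <- data.frame(A = c(1, 0),   B = c(1, 1))

sapply(as_int, class)  # both columns "integer"
sapply(as_num, class)  # both columns "numeric"

# A strict comparison tells the two apart, a value comparison does not
identical(1L, 1)  # FALSE
1L == 1           # TRUE

# Hypothetical helper: coerce every column to integer so that both
# representations are treated alike downstream
to_integer_columns <- function(df) {
  df[] <- lapply(df, as.integer)
  df
}
converted <- to_integer_columns(as_num)
sapply(converted, class)  # both columns "integer"
```

Any code that selects set columns by testing for integer type would skip the columns of `as_num` but accept those of `converted`.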
Awesome! Very helpful |
Hi! Also struggling with this one. I found some useful hints here! I ended up doing a Tidyverse approach using the After calling the UpSetR::fromList() I obtain a data frame with 0 and 1 values, similarly to the
To get entries overlapping in all:
To get entries only present in A:
If we want to preserve the row id it could be included in a separate column.
|
@Jakob37 Great. But recently, In your 1st example, I get: d %>% filter_at(vars(c("A", "B", "C"), ~.==1) + ) Error: Formula shorthand must be wrapped in |
@sfd99 Thank you for your comment! I think I made a typo. It should be an additional end parenthesis i.e. '"C"))' instead of '"C")' . This one runs for me (also using dplyr 0.8.5):
I corrected it in the example above. Thanks! Edit: Regarding the |
Works 100% now. |
last quick question,
good! A result like this: |
Hi! If I understand you correctly you want rows where B and C exclusively are 1. For this I did a separate filtering step.
Or more general (for any non-A/B column, even if there are more).
Hope this helps! |
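Since `filter_at()` has been superseded in dplyr, the same exclusive filter can also be sketched with `if_all()`. This is an illustration under assumptions, not the code posted above: it assumes dplyr 1.0.4 or later, `exclusive_rows` is a hypothetical helper name, and the data frame `d` with its `id` column is made up for the example.

```r
library(dplyr)  # assumes dplyr >= 1.0.4, which provides if_all()

# Made-up 0/1 table with a separate id column
d <- data.frame(
  id = c("x1", "x2", "x3", "x4"),
  A  = c(1, 0, 0, 1),
  B  = c(1, 1, 1, 0),
  C  = c(1, 1, 0, 0),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows that are 1 in every column of `in_sets`
# and 0 in every other set column listed in `sets`.
exclusive_rows <- function(data, sets, in_sets) {
  out_sets <- setdiff(sets, in_sets)
  data %>%
    filter(if_all(all_of(in_sets), ~ .x == 1),
           if_all(all_of(out_sets), ~ .x == 0))
}

exclusive_rows(d, sets = c("A", "B", "C"), in_sets = c("B", "C"))$id
# "x2"
```

Passing the set columns explicitly keeps annotation columns such as `id` out of the comparison.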
Very clear, works perfectly. |
Further simplify output of overlapGroups
|
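One possible simplification, sketched here under assumptions, is to turn each index vector into a plain character vector of element names. The object `li` below is a hand-built stand-in that mimics part of the overlapGroups() output shown earlier in the thread, so the snippet runs without UpSetR.

```r
# Hand-built stand-in for part of the overlapGroups() result shown earlier
li <- list(
  `one:three`     = c(g = 5L, h = 6L, l = 8L, m = 9L),
  `one:two:three` = c(a = 1L, e = 4L),
  `two:three`     = c(j = 11L)
)
attr(li, "elements") <- c("a", "b", "c", "e", "g", "h", "k",
                          "l", "m", "d", "j", "f", "i")

# Look each index up in the stored element vector; this also works
# when the index vectors themselves carry no names
simple <- lapply(li, function(idx) attr(li, "elements")[idx])
simple[["one:three"]]
# "g" "h" "l" "m"

# Optional long format: one row per element, with its group in column `ind`
long <- stack(simple)
```

The long format is convenient for writing the result to a file, for example with `write.csv(long, ...)`.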
Javier Herrero nailed what was asked for brilliantly, check here: https://stackoverflow.com/questions/65027133/extract-intersection-list-from-upset-object Cheers, |
I couldn't find a way to extract the intersecting values, for instance to report the names of movies that fall under more than one genre (action, thriller, drama). At least for me, extracting the names that fall in just one category, or in more than one specific category, would be a nice feature for the package.