-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Missing overlaps with type='start' or type='end' in findOverlaps #46
Comments
From
Plotting the ranges should help clarify the behaviour (see below). suppressPackageStartupMessages(library(IRanges))
# Taken from IRanges vignette
plotRanges <- function(x, xlim = x, main = deparse(substitute(x)), col = "black",
sep = 0.5, ...) {
height <- 1
if (is(xlim, "IntegerRanges"))
xlim <- c(min(start(xlim)), max(end(xlim)))
bins <- disjointBins(IRanges(start(x), end(x) + 1))
plot.new()
plot.window(xlim, c(0, max(bins)*(height + sep)))
ybottom <- bins * (sep + height) - height
rect(start(x) - 0.5, ybottom, end(x) + 0.5, ybottom + height, col = col, ...)
title(main)
axis(1)
}
# reprex from https://github.com/Bioconductor/IRanges/issues/46#issue-1450599624
query <- IRanges(start=638752, end=639346)
subject <- IRanges(start=c(635246, 636459, 637282, 638191, 639306,
640686, 641814, 643721, 643992, 645102,
645499),
end=c(636343, 637097, 638187, 639150, 640670, 641798,
643592, 643978, 644858, 645485, 647505))
# Plot regions (query in red, subject alternating in black and grey).
plotRanges(
c(query, subject),
col = c(rep("red", length(query)), rep(c("black", "grey"), length(subject) / 2 + 1))) Created on 2022-11-16 with reprex v2.0.2 Session infosessionInfo()
#> R version 4.2.2 Patched (2022-11-10 r83330)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 20.04.5 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3
#>
#> locale:
#> [1] LC_CTYPE=en_AU.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_AU.UTF-8 LC_COLLATE=en_AU.UTF-8
#> [5] LC_MONETARY=en_AU.UTF-8 LC_MESSAGES=en_AU.UTF-8
#> [7] LC_PAPER=en_AU.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C
#>
#> attached base packages:
#> [1] stats4 stats graphics grDevices utils datasets methods
#> [8] base
#>
#> other attached packages:
#> [1] IRanges_2.32.0 S4Vectors_0.36.0 BiocGenerics_0.44.0
#>
#> loaded via a namespace (and not attached):
#> [1] rstudioapi_0.14 xml2_1.3.3 knitr_1.40 magrittr_2.0.3
#> [5] R6_2.5.1 rlang_1.0.6 fastmap_1.1.0 stringr_1.4.1
#> [9] highr_0.9 httr_1.4.4 tools_4.2.2 xfun_0.34
#> [13] cli_3.4.1 withr_2.5.0 htmltools_0.5.3 yaml_2.3.6
#> [17] digest_0.6.30 lifecycle_1.0.3 purrr_0.3.5 fs_1.5.2
#> [21] curl_4.3.3 mime_0.12 glue_1.6.2 evaluate_0.18
#> [25] rmarkdown_2.18 reprex_2.0.2 stringi_1.7.8 compiler_4.2.2 BTW Questions like these are better asked on https://support.bioconductor.org/. |
Hi @PeteHaitch |
Hi folks 😃
I recently encountered an odd behaviour in the
findOverlaps
function.I have a single query for which I want to find overlapping genes, stratified by the type of overlap. So, I want to know, if the overlap is within the gene or overlapping the C or N-terminus. Therefore, I used the
type
parameters in thefindOverlap
function.I first checked if there are any overlaps and found two genes with overlap, with my region overlapping the start of one and the end of another one of the two hits. However, when I run it with
type='end'
or'start'
, I do not get the expected overlaps.I tried to understand the underlying code, but it seems to call some
C
function which I cannot really figure out, unfortunately.This should be a reproducible example with a subset of the gene coordinates (using
IRanges_2.32.0
andR 4.2.1
):This is the output:
Just to clarify, the subjects
4
and5
should be returned as overlaps with the query. This is the query:and this the expected overlaps:
I hope I am not missing something obvious, but it seems like the function would not return the expected results.
Your help will be much appreciated!
Cheers,
Jakob
The text was updated successfully, but these errors were encountered: