Hello, bioinformaticians.
I want to write a function called findoverlaps, similar to findOverlaps in Bioconductor (R). Here is what I have tried so far.
using GenomicFeatures
using DataFrames
function number_interval(tp::Tuple)
    # Unpack Tuple.
    (i, interval) = tp
    # Setup numbered metadata.
    new_metadata = (
        i = i,
        original = GenomicFeatures.metadata(interval)
    )
    # Create new interval with numbered metadata.
    return Interval(
        seqname(interval),
        leftposition(interval),
        rightposition(interval),
        strand(interval),
        new_metadata
    )
end
function findoverlaps(query, subject)
    query_numbered = query |> enumerate .|> number_interval
    subject_numbered = subject |> enumerate .|> number_interval
    df = Vector{Tuple{Int64, Int64}}()
    for (q, r) in eachoverlap(query_numbered, subject_numbered)
        result = (
            GenomicFeatures.metadata(q).i,
            GenomicFeatures.metadata(r).i
        )
        push!(df, result)
    end
    rename!(DataFrame(df), [:queryHits, :subjectHits])
end
col = [
    Interval("chr1", 10628, 10683, '?', "abc")
    Interval("chr1", 10643, 10779, '?', "abc")
    Interval("chr1", 10645, 10748, '?', "abc")
    Interval("chr1", 10648, 10786, '?', "abc")
] |> IntervalCollection
hhh = [
    Interval("chr1", 10631, 10638)
    Interval("chr1", 10633, 10635)
    Interval("chr1", 10636, 10650)
    Interval("chr1", 10638, 10649)
    Interval("chr1", 10641, 10651)
] |> IntervalCollection
I ran findoverlaps and it returned the following.
julia> overlap=findoverlaps(col,hhh)
14×2 DataFrame
 Row │ queryHits  subjectHits
     │ Int64      Int64
─────┼────────────────────────
   1 │         1            1
   2 │         1            2
   3 │         1            3
   4 │         1            4
   5 │         1            5
   6 │         2            3
   7 │         2            4
   8 │         2            5
   9 │         3            3
  10 │         3            4
  11 │         3            5
  12 │         4            3
  13 │         4            4
  14 │         4            5
However, I cannot work out how to support minoverlap=5. For example, the first interval in hhh overlaps the first interval in col by more than 5 positions, so that pair should be reported as (index in col, index in hhh). The second interval in hhh overlaps it by fewer than 5 positions, so it should not appear in the final DataFrame. The function above seems to behave like minoverlap=1. What should I do to solve this problem? Thanks to everyone for helping me!
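To make the requirement concrete, here is a minimal sketch of the filtering rule in plain Julia, without GenomicFeatures. It assumes closed, 1-based coordinates on a single chromosome, so the overlap width of two intervals is `min(right ends) - max(left ends) + 1`. The names `overlap_width` and `hits_with_minoverlap` are hypothetical helpers for illustration, not part of any package, and the brute-force double loop is only there to keep the example self-contained.

```julia
# Number of positions shared by the closed intervals [aleft, aright] and
# [bleft, bright]. A value of zero or less means they do not overlap.
overlap_width(aleft, aright, bleft, bright) =
    min(aright, bright) - max(aleft, bleft) + 1

# Hypothetical helper: return (queryHit, subjectHit) index pairs whose
# overlap width is at least `minoverlap`. Intervals are (left, right) tuples.
function hits_with_minoverlap(query, subject; minoverlap = 1)
    hits = Tuple{Int, Int}[]
    for (qi, (qleft, qright)) in enumerate(query)
        for (si, (sleft, sright)) in enumerate(subject)
            if overlap_width(qleft, qright, sleft, sright) >= minoverlap
                push!(hits, (qi, si))
            end
        end
    end
    return hits
end

# Same coordinates as col and hhh above, all on chr1.
col_ranges = [(10628, 10683), (10643, 10779), (10645, 10748), (10648, 10786)]
hhh_ranges = [(10631, 10638), (10633, 10635), (10636, 10650),
              (10638, 10649), (10641, 10651)]

hits = hits_with_minoverlap(col_ranges, hhh_ranges; minoverlap = 5)
println(hits)
```

With minoverlap = 5 this keeps 10 of the 14 pairs: (1, 2) is dropped because the overlap is only 3 positions, and every pair involving the fourth interval of col is dropped because those overlaps are 2 to 4 positions wide. In the findoverlaps function above, the same test could presumably be applied inside the eachoverlap loop by passing leftposition(q), rightposition(q), leftposition(r) and rightposition(r) to overlap_width and skipping the push! when the width is below the threshold; I have not verified that against GenomicFeatures itself.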