Adding an overlaps function to IntervalList. #877

Merged
merged 2 commits into from Jun 7, 2017

Conversation

Projects
None yet
3 participants
Contributor

nh13 commented May 22, 2017

Finds all intervals in the first list that overlap any interval in the
second list.

@yfarjoun would you have a chance to review? I know @tfenne is on vacation this week and I know you are familiar with this code. Otherwise, I can ask @lbergelson or @jacarey to review?

Relates to nh13/picard#7

@nh13 nh13 Adding an overlaps function to IntervalList.
Finds all intervals in the first list that overlap any interval in the
second list.
3111147

yfarjoun was assigned by nh13 May 22, 2017

codecov-io commented May 22, 2017 edited

Codecov Report

Merging #877 into master will increase coverage by 0.109%.
The diff coverage is 100%.

@@               Coverage Diff               @@
##              master      #877       +/-   ##
===============================================
+ Coverage     65.006%   65.115%   +0.109%     
- Complexity      7229      7283       +54     
===============================================
  Files            528       528               
  Lines          31894     31999      +105     
  Branches        5444      5478       +34     
===============================================
+ Hits           20733     20836      +103     
- Misses          9013      9021        +8     
+ Partials        2148      2142        -6
Impacted Files Coverage Δ Complexity Δ
...c/main/java/htsjdk/samtools/util/IntervalList.java 67.846% <100%> (+2.933%) 66 <8> (+8) ⬆️
.../main/java/htsjdk/tribble/index/AbstractIndex.java 51.675% <0%> (+0.096%) 35% <0%> (+5%) ⬆️
...main/java/htsjdk/samtools/SAMRecordSetBuilder.java 89.113% <0%> (+0.806%) 64% <0%> (+2%) ⬆️
...c/main/java/htsjdk/samtools/fastq/FastqReader.java 82.432% <0%> (+1.351%) 26% <0%> (+2%) ⬆️
src/main/java/htsjdk/samtools/util/StringUtil.java 67.692% <0%> (+1.538%) 67% <0%> (+2%) ⬆️
src/main/java/htsjdk/samtools/util/Interval.java 63.462% <0%> (+1.923%) 23% <0%> (+1%) ⬆️
...a/htsjdk/samtools/cram/encoding/rans/Decoding.java 85.714% <0%> (+3.571%) 12% <0%> (+1%) ⬆️
...samtools/util/AsyncBlockCompressedInputStream.java 76% <0%> (+4%) 13% <0%> (+1%) ⬆️
...in/java/htsjdk/tribble/index/tabix/TabixIndex.java 81.416% <0%> (+4.308%) 70% <0%> (+32%) ⬆️
+ * overlap with any interval in the second lists.
+ */
+ public static IntervalList overlaps(final Collection<IntervalList> lists1, final Collection<IntervalList> lists2) {
+ final SAMFileHeader header = lists1.iterator().next().getHeader().clone();
@yfarjoun

yfarjoun May 26, 2017

Contributor

will break on empty list....but then you'll have no header to work with...you could then grab the header from an intervalList in lists2...but that might be empty as well..

Do you really need Collections of IntervalLists? Wouldn't it be better to make this robust on overlaps(final IntervalList lhs, final IntervalList rhs) and have the caller addAll outside?

@nh13

nh13 Jun 3, 2017

Contributor

If you look at the rest of the "actions", you'll see either they return null (intersection) or throw an exception (concatenate). And then some of the other actions (ex. union) depend on these. So I think it's fine the way it is (albeit I need to add a check for an empty list).

+ }
+ final OverlapDetector<Interval> detector = new OverlapDetector<>(0, 0);
+ for (final Interval interval : overlapIntervals.sorted().uniqued()) {
+ detector.addLhs(interval, interval);
@yfarjoun

yfarjoun May 26, 2017

Contributor

can put the same (dummy) Object in the second position of addLhs. Might be more efficient that way?

@nh13

nh13 Jun 3, 2017

Contributor

Ok, but it does feel like a pre-mature optimization.

+ final IntervalList one_overlaps_three = new IntervalList(fileHeader);
+
+ two_overlaps_one.add(new Interval("1", 50, 150));
+ //two_overlaps_one.add(new Interval("1", 301, 500));
@yfarjoun

yfarjoun May 26, 2017

Contributor

why the commented lines?

@nh13

nh13 Jun 3, 2017 edited

Contributor

Will add this comment:

 // NB: commented lines below are there to show the intervals in the first list that will not be in the resulting list
+ };
+ }
+
+ @DataProvider(name = "overlapsData")
@yfarjoun

yfarjoun May 26, 2017

Contributor

please add a test with an empty collection on the lhs and another on the rhs.

@nh13

nh13 Jun 3, 2017

Contributor

Done.

nh13 referenced this pull request in nh13/picard Jun 3, 2017

Closed

Supporting auxiliary baits for bait re-design #7

@nh13 nh13 @yfarjoun's fixups
b5de09c
Contributor

nh13 commented Jun 3, 2017

@yfarjoun back to you

Contributor

nh13 commented Jun 7, 2017

@yfarjoun bump

@yfarjoun

thanks for humoring me.

@nh13 nh13 merged commit dcd20ff into master Jun 7, 2017

4 of 5 checks passed

codecov/changes 2 files have unexpected coverage changes not visible in diff.
Details
codecov/patch 100% of diff hit (target 65.006%)
Details
codecov/project 65.115% (+0.109%) compared to 1b27398
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details

nh13 deleted the nh_overlaps_interval_list branch Jun 7, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment