Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
storage: Handle gaps in VCFs and multiple overlaps at FillGaps operation. #877
Showing 9 changed files with 289 additions and 100 deletions.
247 changes: 168 additions & 79 deletions
...adoop-core/src/main/java/org/opencb/opencga/storage/hadoop/variant/gaps/FillGapsTask.java
Large diffs are not rendered by default.
13 changes: 13 additions & 0 deletions
...encga-storage-hadoop/opencga-storage-hadoop-core/src/test/resources/gaps/file1.genome.vcf
    @@ -0,0 +1,13 @@
    ##fileformat=VCFv4.2
    ##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
    ##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Filtered basecall depth used for site genotyping">
    ##INFO=<ID=END,Number=1,Type=Integer,Description="End position of the region described in this record">
    #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT s1
    1 1 . N . . . END=10003 GT:DP .:.
    1 10004 . C . . . END=10010 GT:DP 0/0:3
    1 10011 . ATTT A 2 . . GT:DP 0/1:40
    1 10015 . A . . . END=10020 GT:DP 0/0:7
    1 10020 . A T 2 . . GT:DP 0/1:41
    1 10021 . A . . . END=10030 GT:DP 0/0:7
    1 10031 . T TAAA 1 . . GT:DP 0/1:42
    1 10032 . A . . . END=10043 GT:DP 0/0:5

15 changes: 15 additions & 0 deletions
...encga-storage-hadoop/opencga-storage-hadoop-core/src/test/resources/gaps/file2.genome.vcf
    @@ -0,0 +1,15 @@
    ##fileformat=VCFv4.2
    ##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
    ##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Filtered basecall depth used for site genotyping">
    ##INFO=<ID=END,Number=1,Type=Integer,Description="End position of the region described in this record">
    ##GAPS=1:10015-10030 with 1:10020:A:T
    ##MULTI_OVERLAP=1:10013:T:C and 1:10014:A:T with 1:10011:ATTT:A
    ##INSERTION_GAP=1:10031:T:TAAA does not overlap with any from here
    #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT s2
    1 1 . N . . . END=10003 GT:DP .:.
    1 10004 . C . . . END=10012 GT:DP 0/0:3
    1 10013 . T C 2 . . GT:DP 0/1:30
    1 10014 . T A 2 . . GT:DP 0/1:31
    1 10031 . T G 1 . . GT:DP 0/1:32
    1 10032 . A G 1 . . GT:DP 0/1:33
    1 10033 . A . . . END=10043 GT:DP 0/0:5
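
The `##GAPS` header in file2.genome.vcf says that the file has no record covering 1:10015-10030, which is where file1 reports the variant 1:10020:A:T. The sketch below checks that claim against the records listed above. It is an independent illustration, not the logic in FillGapsTask.java, whose diff is not shown here; the names `FILE2_RECORDS`, `find_gaps`, and `overlaps` are hypothetical, and the end of each variant record is assumed to be `POS + len(REF) - 1`.

```python
from typing import List, Tuple

# (POS, END) of each record in file2.genome.vcf, chromosome 1.
# Reference blocks take END from INFO/END; for variant records the end
# is assumed to be POS + len(REF) - 1 (all REF alleles here are 1 base).
FILE2_RECORDS: List[Tuple[int, int]] = [
    (1, 10003),
    (10004, 10012),
    (10013, 10013),
    (10014, 10014),
    (10031, 10031),
    (10032, 10032),
    (10033, 10043),
]


def find_gaps(records: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Return the intervals left uncovered between the given records."""
    gaps = []
    covered_until = 0  # last covered position; VCF coordinates are 1-based
    for start, end in sorted(records):
        if start > covered_until + 1:
            gaps.append((covered_until + 1, start - 1))
        covered_until = max(covered_until, end)
    return gaps


def overlaps(a: Tuple[int, int], b: Tuple[int, int]) -> bool:
    """True if two closed intervals share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


gaps = find_gaps(FILE2_RECORDS)
print(gaps)  # [(10015, 10030)]

# file1's SNV 1:10020:A:T occupies the single position 10020.
print(overlaps((10020, 10020), gaps[0]))  # True
```

The single gap found, 10015-10030, matches the `##GAPS` header, and the file1 variant at 10020 falls inside it.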