Commit
Sai Chen committed Aug 22, 2018
1 parent d664550
commit 6d40c6a
Showing 13 changed files with 114 additions and 58 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,7 @@ | ||
# Created by .ignore support plugin (hsz.mobi) | ||
# PyCharm | ||
.DS_Store | ||
.idea/ | ||
Polaris.wiki | ||
Polaris.archive.*.tar.bz2 | ||
ena_submissions/* | ||
ena_submissions/* |
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
File renamed without changes
File renamed without changes
File renamed without changes
File renamed without changes
File renamed without changes
File renamed without changes.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,98 @@ | ||
# Cascadia v0.6 Release Notes (June 2018) | ||
|
||
# Table of Contents | ||
- [Overview](#overview) | ||
- [Dataset Summary](#dataset-summary) | ||
- [Validation Scheme](#validation-scheme) | ||
- [Merging Scheme](#merging-scheme) | ||
|
||
## Overview | ||
|
||
A **hg38** truth set of simple deletions and insertions built from: | ||
|
||
- Manta deletion and insertion calls from NovaSeq NA12877 & NA12878 (NSV4 pipeline on hg38) | ||
- This part is the same as in v0.5, except for a minor VCF format fix | ||
|
||
- Refined Sniffles (v1.0.8) deletion and insertion calls from PacBio NA12878 on hg38 | ||
- Insertion sequence was assembled and refined from PacBio + ONT reads | ||
|
||
- Copy number variants and large deletions curated from the population and the Platinum Genome pedigree on *hg19*, derived from Manta/Canvas and Sniffles deletion calls | ||
- Please see Xiao's CNV truth set repository for details: | ||
- https://git.illumina.com/xchen2/CNVTruthSet | ||
|
||
The hg19 truth set is generated by lifting over the hg38 truth set. | ||
|
||
12,374 (98%) passed entries and 24,405 (88%) failed entries were successfully converted to hg19. | ||
|
||
The release VCF contains genotypes for NA12877 and NA12878 (re-genotyped with our targeted graph genotyper, *Paragraph*). | ||
|
||
Illumina cluster users can find a VCF containing genotypes for the full Platinum Genome pedigree on the Illumina cluster. | ||
|
||
### Unvalidated variants | ||
|
||
In addition, VCFs with "all_merge.include_unvalidated" in their names contain unvalidated variants from the Sniffles v1.0.8 calls. These are: | ||
|
||
- All calls on chrX, chrY and mitochondria | ||
|
||
- Inversions, duplications and translocations | ||
|
||
These unvalidated variants are labeled *UNVALIDATED* in their filter fields. | ||
|
||
We have not yet established a robust pipeline for validating these variants; we plan to validate them in a future release. | ||
|
||
### Data format | ||
|
||
Variants that pass our pedigree and population checks are labeled *PASS* in their filter fields. | ||
|
||
Variants that fail any filter are labeled with the specific filter name(s) in their filter fields. | ||
|
||
The *SOURCE* key in the *INFO* field indicates where the variant originally comes from. Unvalidated variants have no *SOURCE* key because they all come from Sniffles. | ||
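
The labeling rules above can be read off a VCF record roughly as follows. This is a minimal sketch, not the release's own tooling: it assumes standard tab-separated VCF columns (FILTER in column 7, INFO in column 8) and semicolon-separated filter names. The function names, the `FAIL` status label, and the example `SOURCE` value are hypothetical.

```python
def parse_info(info):
    """Turn a VCF INFO string such as 'SVTYPE=DEL;SOURCE=Manta' into a dict."""
    fields = {}
    if info in ("", "."):
        return fields
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True  # flag-style keys carry no value
    return fields


def classify(vcf_line):
    """Return (status, filter_names, source) for one VCF data line."""
    cols = vcf_line.rstrip("\n").split("\t")
    filters = cols[6].split(";")   # FILTER column
    info = parse_info(cols[7])     # INFO column

    if filters == ["PASS"]:
        status = "PASS"
    elif "UNVALIDATED" in filters:
        status = "UNVALIDATED"
    else:
        status = "FAIL"            # filters then lists the failed filter name(s)

    source = info.get("SOURCE")
    if source is None and status == "UNVALIDATED":
        # Per the notes, unvalidated variants all come from Sniffles.
        source = "Sniffles"
    return status, filters, source
```

For example, `classify("chr1\t1000\t.\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;SOURCE=Manta")` yields the status, the filter list, and the `SOURCE` value for that record.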
|
||
## Dataset Summary | ||
|
||
### Merged variants partitioned by SVLEN and type | ||
|
||
| SV Type | INS | INS | DEL | DEL | CNV | CNV | | ||
|:----------------------|:---------:|:---------:|:---------:|:---------:|:---------:|:---------:| | ||
| Filter | PASS | FAIL | PASS | FAIL | PASS | FAIL | | ||
| _L_ \< 50 | 905 | 1,934 | 1,178 | 1,390 | 0 | 0 | | ||
| 50 \< _L_ \< 100 bp | 1,503 | 3,571 | 1,939 | 3,553 | 0 | 0 | | ||
| 100 \< _L_ \< 1kb | 2,240 | 4,785 | 2,389 | 4,417 | 0 | 0 | | ||
| 1kb \< _L_ \< 10kb | 18 | 155 | 1,978 | 2,930 | 105 | 139 | | ||
| _L_ > 10kb | 0 | 0 | 337 | 1,065 | 89 | 219 | | ||
| __Total__ | __4,666__ | __10,445__| __7,821__ | __13,355__| __194__ | __358__ | | ||
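
A partition like the one in this table could be tallied along the following lines. This is only a sketch under stated assumptions: the notes do not say how lengths exactly equal to a bin edge (50, 100, 1kb, 10kb) are handled, so the code places them in the upper bin, and it takes the absolute value of SVLEN because deletions commonly carry a negative SVLEN in VCF. The names `length_bin` and `tally` are hypothetical.

```python
from collections import Counter

# (exclusive upper bound in bp, label); edge handling is an assumption.
BINS = [
    (50, "L < 50"),
    (100, "50 < L < 100 bp"),
    (1_000, "100 < L < 1kb"),
    (10_000, "1kb < L < 10kb"),
]


def length_bin(svlen):
    """Map an SVLEN value (possibly negative for deletions) to a size-bin label."""
    length = abs(svlen)
    for upper, label in BINS:
        if length < upper:
            return label
    return "L > 10kb"


def tally(records):
    """Count records per (size bin, SV type, PASS/FAIL).

    records: iterable of (svtype, svlen, passed) tuples.
    """
    counts = Counter()
    for svtype, svlen, passed in records:
        counts[(length_bin(svlen), svtype, "PASS" if passed else "FAIL")] += 1
    return counts
```

Each cell of the table then corresponds to one key of the returned counter, for example `("100 < L < 1kb", "INS", "PASS")`.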
|
||
### **Non-reference** calls in NA12878 partitioned by SVLEN and type | ||
|
||
| SV Type | INS | DEL | CNV | | ||
|:----------------------|:---------:|:---------:|:---------:| | ||
| _L_ \< 50 | 878 | 1,169 | 0 | | ||
| 50 \< _L_ \< 100 bp | 1,274 | 1,521 | 0 | | ||
| 100 \< _L_ \< 1kb | 1,958 | 1,926 | 0 | | ||
| 1kb \< _L_ \< 10kb | 17 | 626 | 19 | | ||
| _L_ > 10kb | 0 | 140 | 11 | | ||
| __Total__ | __4,127__ | __5,382__ | __30__ | | ||
|
||
|
||
|
||
### Merged variants partitioned by SVLEN, type and source | ||
|
||
| SV Type | INS | INS | DEL | DEL | | ||
|:----------------------|:--------------|:--------------|:--------------|:--------------| | ||
| Source | Manta PASS | Sniffles PASS | Manta PASS | Sniffles PASS | | ||
| _L_ \< 50 | 6 (24%) | 901 (32%) | 7 (100%) | 1,178 (46%) | | ||
| 50 \< _L_ \< 100 bp | 1,151 (47%) | 620 (22%) | 1,598 (44%) | 726 (32%) | | ||
| 100 \< _L_ \< 1kb | 1,451 (55%) | 1,189 (25%) | 2,256 (45%) | 1,492 (47%) | | ||
| 1kb \< _L_ \< 10kb | 1 (100%) | 18 (10%) | 736 (29%) | 267 (66%) | | ||
| _L_ > 10kb | 0 (n/a) | 0 (n/a) | 156 (17%) | 0 (n/a) | | ||
| __Total__ |__2,606 (51%)__|__2,728 (26%)__|__4,753 (39%)__|__3,663 (43%)__| | ||
|
||
With refined Sniffles calls as input, v0.6 significantly improves the validation rate for Sniffles calls: 2,604 validated insertions and a validation rate of 26%, compared with 589 validated insertions and a validation rate of 5% in v0.5. | ||
|
||
## Merging Scheme | ||
|
||
For simple SVs validated by Paragraph, we followed the same merging scheme for deletions as [v0.5](../v0.5/README.md). | ||
|
||
For CNVs, we do not try to merge them with simple SVs. | ||
|
||
For large deletions labeled as PASS, we prioritize Paragraph validated ones over Xiao's CNV truth set. |
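
The priority rule for large PASS deletions might be sketched as below. This is an illustration only: the notes do not describe how a Paragraph-validated call is matched to an entry in the CNV truth set, so the sketch assumes candidates have already been grouped under a shared, hypothetical `event_id`. The source labels `"Paragraph"` and `"CNVTruthSet"` and the function name are also assumptions.

```python
# Lower number wins; labels are hypothetical stand-ins for the two sources.
PRIORITY = {"Paragraph": 0, "CNVTruthSet": 1}


def pick_representatives(candidates):
    """Keep one PASS record per event, preferring Paragraph-validated calls.

    candidates: iterable of dicts with keys 'event_id', 'source', 'filter'.
    Returns a dict mapping event_id to the chosen candidate.
    """
    best = {}
    for cand in candidates:
        if cand["filter"] != "PASS":
            continue  # the rule applies only to deletions labeled PASS
        current = best.get(cand["event_id"])
        if current is None or PRIORITY[cand["source"]] < PRIORITY[current["source"]]:
            best[cand["event_id"]] = cand
    return best
```

An event seen only in the CNV truth set is still kept; the Paragraph-validated record replaces it only when both sources report the same event.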