Visualization of genome track #441
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -16,6 +16,13 @@ export type Variant = { | |
position: number; | ||
ref: string; | ||
alt: string; | ||
id: string; | ||
//this is the largest allele frequency for a single VCF entry | ||
//a single VCF entry might contain more than one variant, like the example below | ||
//20 1110696 rs6040355 A G,T 67 PASS NS=2;DP=10;AF=0.333,0.667;AA=T;DB | ||
majorFrequency: ?number; | ||
//this is the smallest allele frequency for a single VCF entry | ||
minorFrequency: ?number; | ||
vcfLine: string; | ||
} | ||
|
||
|
@@ -41,12 +48,33 @@ function extractLocusLine(vcfLine: string): LocusLine { | |
|
||
function extractVariant(vcfLine: string): Variant { | ||
var parts = vcfLine.split('\t'); | ||
var maxFrequency = null; | ||
var minFrequency = null; | ||
if (parts.length>=8){ // INFO is the 8th tab-separated column (index 7) | |
What if this is a multi-sample VCF and the first sample is not relevant for the user? I am asking because I know that MuTect can sometimes order the samples so that the normal sample comes before the tumor one.

As far as I know, a VCF can contain multiple variants for a single nucleotide. This can happen in two ways:
Regarding the first issue: there is nothing you can do in terms of visualization apart from providing the information in the popup. We could think about coloring or some fancy way of highlighting the situation, but in my opinion it's not intuitive. When you have two (or more) overlapping variants, I would also suggest putting that in the popup. I will fix my PR to handle this situation (right now the popup shows only the first element that matches the click). If the behaviour is not what the user expected, I think the best way to go is to suggest that the user filter the VCF file and provide the filtered results. Anyway, this reminds me of one other issue: when you present a gene variant, it should span the whole modified region. Right now every variant is visualized as a rectangle on the first nucleotide only.

Gotcha! I think your solution regarding multiple variants makes sense for the first pass, as long as we provide the additional information back to the callback function so users have a way to know when that is the case. Another option is to parse the variant header and give the developer a way to use information only from a particular column. For example the

I wouldn't go that far. In most cases users know what data they visualize, and when they know that there are two overlapping data sets they should separate them into different tracks (at least this is what I would do). |
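The remark above about a variant spanning the whole modified region can be illustrated with a small sketch. It relies only on the VCF convention that POS is 1-based and that the REF allele covers positions POS through POS + length(REF) - 1. The function name `variantSpan` is hypothetical and is not part of this PR; how the resulting interval maps onto the track's drawing coordinates is left open.

```javascript
// Hypothetical helper, not part of this PR: computes the reference
// interval (1-based, inclusive) covered by a variant's REF allele.
function variantSpan(variant) {
  var start = variant.position;
  // REF occupies ref.length consecutive reference bases starting at POS.
  var stop = variant.position + variant.ref.length - 1;
  return {start: start, stop: stop};
}

// A single-nucleotide variant covers one base.
var snv = variantSpan({position: 9386385, ref: 'G', alt: 'T'});
// A deletion written as REF=ACGT, ALT=A covers four reference bases.
var deletion = variantSpan({position: 100, ref: 'ACGT', alt: 'A'});
console.log(snv, deletion);
```

With such an interval, a renderer could draw a rectangle from `start` to `stop` instead of marking only the first nucleotide.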
||
var params = parts[7].split(';'); | ||
for (var i=0;i<params.length;i++) { | ||
var param = params[i]; | ||
if (param.startsWith("AF=")) { | ||
I think

I took it from the standard definition: http://samtools.github.io/hts-specs/VCFv4.3.pdf

Cool - thanks for checking that! Was just curious whether we should be more inclusive, but looks like not 👍 |
||
maxFrequency = 0.0; | ||
minFrequency = 1.0; | ||
var frequenciesStrings = param.substr(3).split(","); | ||
for (var j=0;j<frequenciesStrings.length;j++) { | ||
var currentFrequency = parseFloat(frequenciesStrings[j]); | ||
maxFrequency = Math.max(maxFrequency, currentFrequency); | ||
minFrequency = Math.min(minFrequency, currentFrequency); | ||
} | ||
} | ||
} | ||
} | ||
|
||
return { | ||
contig: parts[0], | ||
position: Number(parts[1]), | ||
id: parts[2], | ||
ref: parts[3], | ||
alt: parts[4], | ||
majorFrequency: maxFrequency, | ||
minorFrequency: minFrequency, | ||
vcfLine | ||
}; | ||
} | ||
|
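As a standalone illustration of the `AF` handling in `extractVariant` above, here is a minimal sketch that can be run outside the library. The helper name `extractAlleleFrequencies` is an assumption made for this example and does not exist in the PR. Unlike the diff, it skips values that do not parse as numbers, which is also an assumption rather than the PR's behaviour. The sample record is the one quoted in the `Variant` type comments, with tabs restored.

```javascript
// Hypothetical helper, not part of this PR: returns the largest and
// smallest AF value found in the INFO column of one VCF record.
function extractAlleleFrequencies(vcfLine) {
  var none = {majorFrequency: null, minorFrequency: null};
  var parts = vcfLine.split('\t');
  // INFO is the 8th column, so at least 8 fields are required.
  if (parts.length < 8) return none;

  var afParam = parts[7].split(';').find(function(p) {
    return p.startsWith('AF=');
  });
  if (afParam === undefined) return none;

  // One frequency per ALT allele, comma-separated; drop unparsable values.
  var frequencies = afParam.slice(3).split(',')
    .map(function(s) { return parseFloat(s); })
    .filter(function(f) { return !Number.isNaN(f); });
  if (frequencies.length === 0) return none;

  return {
    majorFrequency: Math.max.apply(null, frequencies),
    minorFrequency: Math.min.apply(null, frequencies)
  };
}

var line = '20\t1110696\trs6040355\tA\tG,T\t67\tPASS\tNS=2;DP=10;AF=0.333,0.667;AA=T;DB';
var result = extractAlleleFrequencies(line);
console.log(result); // { majorFrequency: 0.667, minorFrequency: 0.333 }
```

Records without an `AF` entry, or with fewer than eight columns, yield `null` for both fields, matching the nullable `?number` fields on `Variant`.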
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,75 @@ | ||
/** | ||
* @flow | ||
*/ | ||
'use strict'; | ||
|
||
import {expect} from 'chai'; | ||
|
||
import pileup from '../../main/pileup'; | ||
import dataCanvas from 'data-canvas'; | ||
import {waitFor} from '../async'; | ||
|
||
import ReactTestUtils from 'react-addons-test-utils'; | ||
|
||
describe('VariantTrack', function() { | ||
var testDiv = document.getElementById('testdiv'); | ||
|
||
beforeEach(() => { | ||
testDiv.style.width = '700px'; | ||
dataCanvas.RecordingContext.recordAll(); | ||
}); | ||
|
||
afterEach(() => { | ||
dataCanvas.RecordingContext.reset(); | ||
// avoid pollution between tests. | ||
testDiv.innerHTML = ''; | ||
}); | ||
var {drawnObjects} = dataCanvas.RecordingContext; | ||
|
||
function ready() { | ||
return testDiv.getElementsByTagName('canvas').length > 0 && | ||
drawnObjects(testDiv, '.variants').length > 0; | ||
} | ||
|
||
it('should render variants', function() { | ||
var variantClickedData = null; | ||
var variantClicked = function (data) { | ||
variantClickedData = data; | ||
}; | ||
var p = pileup.create(testDiv, { | ||
range: {contig: '17', start: 9386380, stop: 9537390}, | ||
tracks: [ | ||
{ | ||
viz: pileup.viz.genome(), | ||
data: pileup.formats.twoBit({ | ||
url: '/test-data/test.2bit' | ||
}), | ||
isReference: true | ||
}, | ||
{ | ||
data: pileup.formats.vcf({ | ||
url: '/test-data/test.vcf' | ||
}), | ||
viz: pileup.viz.variants(), | ||
options: {onVariantClicked: variantClicked}, | ||
} | ||
] | ||
}); | ||
|
||
return waitFor(ready, 2000) | ||
.then(() => { | ||
var variants = drawnObjects(testDiv, '.variants'); | ||
expect(variants.length).to.be.equal(1); | ||
var canvasList = testDiv.getElementsByTagName('canvas'); | ||
var canvas = canvasList[1]; | ||
expect(variantClickedData).to.be.null; | ||
|
||
//check clicking on variant | ||
ReactTestUtils.Simulate.click(canvas,{nativeEvent: {offsetX: -0.5, offsetY: -15.5}}); | ||
|
||
expect(variantClickedData).to.not.be.null; | ||
p.destroy(); | ||
}); | ||
}); | ||
|
||
}); |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
##fileformat=VCFv4.1 | ||
##source=VarScan2 | ||
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total depth of quality bases"> | ||
##INFO=<ID=SOMATIC,Number=0,Type=Flag,Description="Indicates if record is a somatic mutation"> | ||
##INFO=<ID=SS,Number=1,Type=String,Description="Somatic status of variant (0=Reference,1=Germline,2=Somatic,3=LOH, or 5=Unknown)"> | ||
##INFO=<ID=SSC,Number=1,Type=String,Description="Somatic score in Phred scale (0-255) derived from somatic p-value"> | ||
##INFO=<ID=GPV,Number=1,Type=Float,Description="Fisher's Exact Test P-value of tumor+normal versus no variant for Germline calls"> | ||
##INFO=<ID=SPV,Number=1,Type=Float,Description="Fisher's Exact Test P-value of tumor versus normal for Somatic/LOH calls"> | ||
##FILTER=<ID=str10,Description="Less than 10% or more than 90% of variant supporting reads on one strand"> | ||
##FILTER=<ID=indelError,Description="Likely artifact due to indel reads at this position"> | ||
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype"> | ||
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality"> | ||
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read Depth"> | ||
##FORMAT=<ID=RD,Number=1,Type=Integer,Description="Depth of reference-supporting bases (reads1)"> | ||
##FORMAT=<ID=AD,Number=1,Type=Integer,Description="Depth of variant-supporting bases (reads2)"> | ||
##FORMAT=<ID=FREQ,Number=1,Type=String,Description="Variant allele frequency"> | ||
##FORMAT=<ID=DP4,Number=4,Type=Integer,Description="Strand read counts: ref/fwd, ref/rev, var/fwd, var/rev"> | ||
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT NORMAL TUMOR | ||
20 61795 . G T . PASS DP=81;SS=1;SSC=2;GPV=4.6768E-16;SPV=5.4057E-1;AF=0.7 GT:GQ:DP:RD:AD:FREQ:DP4 0/1:.:44:22:22:50%:16,6,9,13 0/1:.:37:18:19:51.35%:10,8,10,9 | ||
20 62731 . C A,G . PASS DP=68;SS=1;SSC=1;GPV=1.4855E-11;SPV=7.5053E-1;AF=0.4,0.5 GT:GQ:DP:RD:AD:FREQ:DP4 0/1:.:32:17:15:46.88%:9,8,9,6 0/1:.:36:21:15:41.67%:8,13,8,7 | ||
20 61731 . C A,G,T . PASS DP=68;SS=1;SSC=1;GPV=1.4855E-11;SPV=7.5053E-1;AF=0.4,0.6,0.3 GT:GQ:DP:RD:AD:FREQ:DP4 0/1:.:32:17:15:46.88%:9,8,9,6 0/1:.:36:21:15:41.67%:8,13,8,7 |
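The test file above has two sample columns, `NORMAL` and `TUMOR`, which is the multi-sample situation raised in the review discussion. A minimal sketch of the idea floated there (reading the `#CHROM` header so a caller can pick one sample by name) might look like this. The function names `sampleNames` and `sampleField` are hypothetical and are not part of pileup.js or this PR. The sketch assumes the standard VCF layout, in which sample columns start at the 10th field, after `FORMAT`.

```javascript
// Hypothetical helpers, not part of this PR.

// Sample names are the header fields after the nine fixed VCF columns.
function sampleNames(headerLine) {
  return headerLine.split('\t').slice(9);
}

// Returns the value of one FORMAT key (e.g. 'FREQ') for a named sample,
// or null if the sample, the key, or the value is missing.
function sampleField(headerLine, recordLine, sample, key) {
  var sampleIdx = sampleNames(headerLine).indexOf(sample);
  if (sampleIdx < 0) return null;

  var parts = recordLine.split('\t');
  if (parts.length < 10 + sampleIdx) return null;

  var keyIdx = parts[8].split(':').indexOf(key);
  if (keyIdx < 0) return null;

  var values = parts[9 + sampleIdx].split(':');
  return keyIdx < values.length ? values[keyIdx] : null;
}

var header = '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tNORMAL\tTUMOR';
var record = '20\t61795\t.\tG\tT\t.\tPASS\tDP=81;AF=0.7\t' +
    'GT:GQ:DP:RD:AD:FREQ:DP4\t0/1:.:44:22:22:50%:16,6,9,13\t0/1:.:37:18:19:51.35%:10,8,10,9';

var tumorFreq = sampleField(header, record, 'TUMOR', 'FREQ');
var normalFreq = sampleField(header, record, 'NORMAL', 'FREQ');
console.log(tumorFreq, normalFreq); // 51.35% 50%
```

Looking samples up by name rather than by position avoids depending on the order in which a variant caller writes the normal and tumor columns.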
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
##fileformat=VCFv4.1 | ||
##source=VarScan2 | ||
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total depth of quality bases"> | ||
##INFO=<ID=SOMATIC,Number=0,Type=Flag,Description="Indicates if record is a somatic mutation"> | ||
##INFO=<ID=SS,Number=1,Type=String,Description="Somatic status of variant (0=Reference,1=Germline,2=Somatic,3=LOH, or 5=Unknown)"> | ||
##INFO=<ID=SSC,Number=1,Type=String,Description="Somatic score in Phred scale (0-255) derived from somatic p-value"> | ||
##INFO=<ID=GPV,Number=1,Type=Float,Description="Fisher's Exact Test P-value of tumor+normal versus no variant for Germline calls"> | ||
##INFO=<ID=SPV,Number=1,Type=Float,Description="Fisher's Exact Test P-value of tumor versus normal for Somatic/LOH calls"> | ||
##FILTER=<ID=str10,Description="Less than 10% or more than 90% of variant supporting reads on one strand"> | ||
##FILTER=<ID=indelError,Description="Likely artifact due to indel reads at this position"> | ||
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype"> | ||
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality"> | ||
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read Depth"> | ||
##FORMAT=<ID=RD,Number=1,Type=Integer,Description="Depth of reference-supporting bases (reads1)"> | ||
##FORMAT=<ID=AD,Number=1,Type=Integer,Description="Depth of variant-supporting bases (reads2)"> | ||
##FORMAT=<ID=FREQ,Number=1,Type=String,Description="Variant allele frequency"> | ||
##FORMAT=<ID=DP4,Number=4,Type=Integer,Description="Strand read counts: ref/fwd, ref/rev, var/fwd, var/rev"> | ||
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT NORMAL TUMOR | ||
17 9386385 . G T . PASS DP=81;SS=1;SSC=2;GPV=4.6768E-16;SPV=5.4057E-1;AF=0.7 GT:GQ:DP:RD:AD:FREQ:DP4 0/1:.:44:22:22:50%:16,6,9,13 0/1:.:37:18:19:51.35%:10,8,10,9 |
can you remove this stylesheet since we are not making use of it anymore?
Never mind - let me do that to save some time.