Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Visualization of genome track #441

Merged
merged 13 commits into from
May 17, 2017
Merged

Conversation

piotr-gawron
Copy link
Contributor

@piotr-gawron piotr-gawron commented Nov 2, 2016

This PR contains 3 main things:

  1. fix on the issue with popups when clicking on variant in Variant track - right now when you click on variant in variant track you will get an error message in a console (supposed outcome should be popup with information about gene variant)
  2. popups for gene variants use bootstrap model window to present detailed information about variant from vcf file
  3. when visualizing vcf file I added possibility to use variant bar height as an indicator of allel frequency - this is something requested by biologist at our center - this is optional, not used by default

Few comments regaring changes:

  • I decided to use bootstrap as a popup implementation because it was the easiest chocie for me. If you don't like the solution then I can use any other if you provide me with links to some examples/documentation. One of the reasons why you might not like it is the fact that it requires bootstrap css to be included (if not the modal popup is useless). Of course this can be included in pileup css if you want to keep the whole css stuff in one file, but I'm not sure what are you rstandards here.

  • user (programmer) can decide if he wants to use allel frequency as a hight of variant bar by providng options parameter when creating gene variant view. This requires a lot of passing config parameters: pileup->Root->VisualizationWrapper->VariantTrack. I couldn't find better solution, so if you know one please let me know and I will refactor.

{
    viz: pileup.viz.variants(),
    data: pileup.formats.vcf({
      url: '/test-data/snv.chr17.vcf'
    }),
    options: {variantHeightByFrequency: true},
    name: 'Variants'
  }
  • I'm not sure why some of my commits from June are in this PR (I couldn't find any changes that wasn't introduced this week in the code)

This change is Reviewable

…o attach dynamically generated data (instead from files it will be taken from a string source that represents content of the file)
* fix on popup when clicking on variant (up to now there was a console error)
* popup is a bootstrap modal window (not just old alert function)
* if possible (and requested by user), gene variant height depends on the allel frequency of specific variant
@coveralls
Copy link

Coverage Status

Coverage increased (+0.2%) to 91.86% when pulling f32698c on piotr-gawron:vcf-allel-vis into 0182b22 on hammerlab:master.

Copy link
Member

@armish armish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for this PR, @piotr-gawron! This is a great improvement over the standard variant track and the variant details dialog.

However, pulling in bootstrap seems like an overkill just to show a dialog for the variants. Although it provides nice UI elements, this will come with the need to always include bootstrap CSS in all pileup.js applications, which might not be desired by the developer for the particular tool.

We were originally trying to keep the size of the pileup.js library as small as possible (hence the current custom/barebones UI elements) and bootstrap-react is not the tiniest UI library.

I know that we should have probably done this earlier, but I think a better way to approach this issue would be to allow users to register callback functions on key events (such as clicks). This way, pileup.js could be more environment-agnostic and more customizable from the developer perspective. If we had this before, you could have just register your own functions that render bootstrap modals with variant info in them without the need to change the pileup.js code.

I am more than happy to implement this if you are not planning to do it.

The custom variant box height is a nice addition, so we can always keep that part. I have a few minor comments regarding the code that deals with the parsing.

Let me know what you think and sorry that it took me so long to get to code review.

Keep these PRs coming, please! We do appreciate them.

ref: parts[3],
alt: parts[4],
significantFrequency: frequency,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we call this significantFrequency again?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason is that single vcf entry can contain information about more than one gene variant:
20 1110696 rs6040355 A G,T 67 PASS NS=2;DP=10;AF=0.333,0.667;AA=T;DB
In such situation we need to decide which allel frequency value use for the visualization.
I know about two strategies, but there might be more:

  • take the smalles
  • take the biggest (this one is currently implemented)

The alternative to significantFrequency field is to store all the 'allel frequency' values and allow user to decide. But again we have to agree about the possible strategies.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. Can we name those two alternatives as minor vs major allele frequency just to be in line with the common population biology terminology? It would also be great to mention why pick either of these in the comments right at the top of the code block so that future developers won't get confused.

var params = parts[7].split(';');
for (var i=0;i<params.length;i++) {
var param = params[i];
if (param.startsWith("AF=")) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think AF is an almost standard annotation for a majority of the VCF variants out there, but do you know if there are other abbreviations that people use as a synonym? If so, it might worth looking for those.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took it from standard definition: http://samtools.github.io/hts-specs/VCFv4.3.pdf
I'm not aware of any other abbreviation that might refer to the frequency.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool - thanks for checking that! Was just curious whether we should be more inclusive, but looks like not 👍

@@ -41,12 +43,28 @@ function extractLocusLine(vcfLine: string): LocusLine {

function extractVariant(vcfLine: string): Variant {
var parts = vcfLine.split('\t');
var frequency = null;
if (parts.length>=7){
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if this is a multi-sample VCF and the first sample is not relevant for the user? I am asking this; because I know that mutect can sometimes order the samples in a way that the normal sample comes before the tumor one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I know vcf can contain multiple variants regarding single nucleotide. It can be achieved in two ways:

  1. one entry in vcf can correspond to more than one variant, take a look at this entry:
    20 1234567 microsat1 GTC G,GTCT 50 PASS NS=3;DP=9;AA=G GT:GQ:DP 0/1:35:4 0/2:17:2 1/1:40:3
  2. two entries can refer to the same region in the genome (I think this is what you mentioned in the comment)

Regarding the first issue - there is nothing you can do in term of visualization apart from providing information in the popup. We could think about coloring or some fancy way of highlighting the situation, but in my opinion it's not intuitive.

When you have two (or more) overlaying variants I would also suggest to put it in the popup. I will fix my PR to handle this situation (right now you have only first element that match click in the popup).

If the behaviour is not the one user expected I think the best way to go is suggest user to filter data from vcf file and provide filtered results.

Anyway, this reminds me about one other issue: when you present gene variant it should span through the whole modified region. Right now every variant is visualized by rectangle on the first nucleotide only.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gotcha! I think your solution regarding multiple variants makes sense for the first pass as long as we provide the additional information back to the callback function so users will have a way to know if such is the case.

Another option is to parse the variant header and provide the developer a way to only use information from a particular column. For example the NORMAL TUMOR column names in the example VCF files. AFAIK, those column names are arbitrary, so we can let the developer determine which one we pick for visualization. Do you think this makes sense?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't go that far. In most cases users know what data they visualize and when they know that there are two overlapping data sets they should separate them in different tracks (at least this is what I would do).

@piotr-gawron
Copy link
Contributor Author

piotr-gawron commented Nov 10, 2016

I understand - when library is light it's much easier to maintain (and it's more flexible).

I will reimplement it accordingly - will remove bootstrap from dependencies and correct example with the registered callback function.

However, there is still a question of passing options to pileup. Do you agree with the methodology that I used? (passing configuration options from top classes to bottom classes).

@armish
Copy link
Member

armish commented Nov 10, 2016

thanks for the clarifications, @piotr-gawron! I think the way the way you implemented things for passing options to visualization tracks look reasonable. If this proves not that useful in the future, we can change the things to better accommodate other use cases.

* bootstrap library removed
* when user click on gene variant instead of poup user defined function will be called (if such method is not defined log information will appear in the console)
* allel frequency is parsed in a way that maximum and minimum values are stored in vcf variant entry
* user can define strategy that should be used when hight of the gene variant bars should depend on the alle frequency, two defined strategies are:
- take max value of allel frequency for vcf entry
- take min value of allel frequency for vcf entry
@piotr-gawron
Copy link
Contributor Author

piotr-gawron commented Nov 11, 2016

I reimplemented the fucntionality as disscussed.

Below you can find my original comment which now is a little obsolete :). Due to some issues with nested default properties I had to modify, and flow doesn't fail anymore.

However, I still think that you missuse props field in the react component:
you shouldn't specify prop value for typechecking. Here you can find article about propTypes and defaultProps standard used in react: https://facebook.github.io/react/docs/typechecking-with-proptypes.html
The way you do it prevents setting default values to the prop.


My original comment:

Let me first explain why travis failed:
flow library doesn't support get/set properties in ES2015, you can find more information about it here: facebook/flow#572

I used get method because I needed to specify default prop value for allel frequency strategy and this is the right way to do. Alternatively you can do something like: http://stackoverflow.com/a/36778923/1127920. However this extracts specificiation of the prop default outside of the class and I don't like the approach.

I noticed also one more issue with your code:
you shouldn't specify prop value for typechecking. Here you can find article about propTypes and defaultProps standard used in react: https://facebook.github.io/react/docs/typechecking-with-proptypes.html
The way you did it prevented setting default value to the prop.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.1%) to 91.535% when pulling 8f50dd4 on piotr-gawron:vcf-allel-vis into 0182b22 on hammerlab:master.

@piotr-gawron
Copy link
Contributor Author

@armish I see that my PR has some pending requestes changes. Can you tell me what is it, or point to the point on github, because I cannot find it...

@@ -3,6 +3,7 @@
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<link rel="stylesheet" href="../style/pileup.css" />
<link rel="stylesheet" href="demo.css" />
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/latest/css/bootstrap.min.css">
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you remove this stylesheet since we are not making use of it anymore?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never mind - let me do that to save some time.

@armish armish merged commit 6399869 into hammerlab:master May 17, 2017
armish added a commit that referenced this pull request May 17, 2017
@piotr-gawron
Copy link
Contributor Author

@armish
Thanks for merging. Do you plan by any chance to release it?
It would make my life easier if I could include npm package with merged changes :).

@piotr-gawron piotr-gawron deleted the vcf-allel-vis branch May 2, 2019 12:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants