Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added Variants and Features GA4GH, with GA4GH example file #464

Merged
merged 3 commits into from
Jul 23, 2017

Conversation

akmorrow13
Copy link
Collaborator

@akmorrow13 akmorrow13 commented Jul 4, 2017

This PR contains the following:

  • GA4GH Variants
  • GA4GV Features
  • Features Viz Track
  • GA4GH Example File for Features and Variants

What's left for this PR:

  • Add a GA4GH Reads example on the GA4GH example page.
  • There is an issue with using sinon server and dataCanvas at the same time. Therefore, the tests for FeatureTrack will fail. We can either add in the Feature JSON and rewrite the tests on the JSON datasource, or add these tests in a separate PR once the JSON formats are done.
  • Test in python to verify GA4GH schemas

This change is Reviewable

Copy link
Member

@armish armish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for all these changes, @akmorrow13! I have a few minor comments here and there but this looks great overall!

re: failing tests: I would rather not have any of them failing if you can think of a quick workaround for the JSON issue. I also don't want this to be blocked by the JSON features, so you can decide on the way to work around the issue in a way that won't take too much of your time and won't compromise the tests for now.

Again - thanks a lot for the amazing work! Can't wait to make a new release featuring all of these new additions.

@@ -98,6 +98,8 @@ To play with the demo, start an [http-server][hs]:

Then open [http://localhost:8080/examples/index.html](http://localhost:8080/examples/index.html) in your browser of choice.

To view integration with GA4GH schemas, view [http://localhost:8080/examples/ga4gh-example.html](http://localhost:8080/examples/ga4gh-example.html).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kudos for isolating all these GA4GH integration parts into a separate example 🙇

{
viz: pileup.viz.variants(),
data: pileup.formats.GAVariant({
endpoint: 'http://1kgenomes.ga4gh.org',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would you mind extracting this and the following endpoint as a variable that is defined at the top of the file with some pointers to the 1kgenomes server and a simple explanation to why this server is special? That would make it easier for a random person to compare this with the standard data example.

// Expand range to begin and end on multiples of size
expand(size: number, zeroBased: boolean): Interval {
var minimum = zeroBased ? 0 : 1;
var roundDown = x => x - x % size;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a simple example to what this line would do as a comment? And maybe encapsulate the modulo operation in parentheses just for clarity?

This is just me thinking loud: the expand name seems a bit misleading here: I first thought that this was just expanding the region by the given size, but this might actually not expand one of the ends if either the start or stop are already multiples of a size - e.g.: (10, 20) -> expand(5) -> (10, 20). Maybe using round in the name would be better but I leave it up to you since this was what it was named originally.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok sounds good! I will rename it. Also, I am still not sure on the zeroBased parameter. Another option is changing zeroBased to an integer, and passing that in as the minimum. So instead you would have:

round(size: number, minimum: number): Interval {
    var roundDown = x => x - x % size;
    var newStart = Math.max(minimum, roundDown(this.start)),
        newStop = roundDown(this.stop + size - 1);
    return new Interval(newStart, newStop);
}

This is pretty minor, but it could be cleaner.

@@ -77,7 +70,7 @@ function createFromBamFile(remoteSource: BamFile): AlignmentDataSource {
return Q.when();
}

interval = expandRange(interval);
interval = interval.expand(BASE_PAIRS_PER_FETCH, ZERO_BASED);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

// Flow type for export.
export type BigBedSource = {
rangeChanged: (newRange: GenomeRange) => void;
getGenesInRange: (range: ContigInterval<string>) => Gene[];
getFeaturesInRange: (range: ContigInterval<string>) => Gene[];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for renaming this — feature would better speak to the bioinformaticians in this context.

@@ -52,6 +41,9 @@ function create(spec: GA4GHSpec): AlignmentDataSource {
var coveredRanges: ContigInterval<string>[] = [];

function addReadsFromResponse(response: Object) {
if (response.alignments === undefined) {
return;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curious why/when this happens. I would have expected the alignments to be an empty array for the semantics of it, but is this how GA4GH server responds when no reads match the criteria?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This occurs when you query a region without data. For example, for variants, if you query the GA4GH server:
curl --data '{"variantSetId": "WyIxa2dlbm9tZXMiLCJ2cyIsInBoYXNlMy1yZWxlYXNlIl0", "end": "10200", "pageSize": "0", "pageToken": "", "start": "10000", "callSetIds": ["WyIxa2dlbm9tZXMiLCJ2cyIsInBoYXNlMy1yZWxlYXNlIiwiSEcwMDA5NiJd"], "referenceName": "10"}' --header 'Content-Type: application/json' http://1kgenomes.ga4gh.org/variants/search

It will return {}

Therefore, if trying to call .variants (or .alignments in the alignment case), on the empty JSON object, the code will error.

@@ -126,7 +126,7 @@ class VariantTrack extends React.Component {
if (variants && variants.length>0) {
var data = [];
for (var i=0;i<variants.length;i++) {
data.push({id: variants[i].id,vcfLine:variants[i].vcfLine});
data.push({id:variants[i].id,vcfLine:variants[i].vcfLine,ref:variants[i].ref,alt:variants[i].alt});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry for the styling inconsistency here; if you had a chance, would you mind fixing the style here? OK if you don't get to it:

// ...

for (var i=0; i < variants.length; i++) {
  data.push({
    id: variants[i].id,
    vcfLine: variants[i].vcfLine,
    ref: variants[i].ref,
    alt: variants[i].alt
  });
}

// ...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Styling nits are appreciated! Will fix!

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.3%) to 90.663% when pulling a0c0a25 on akmorrow13:otherGA4GH into 3859912 on hammerlab:master.

@akmorrow13
Copy link
Collaborator Author

@armish Thanks for all the comments! I decided that it would be easiest to just add the JSON stuff required to pass the FeatureTrack tests, so those should all be in! Let me know if there are any other comments we should address.

One thing to note is the Flow issues: In VcfDataSource.js and BigBedDataSource.js I had to add an export flag to make Flow happy with some imports. Although this is not the cleanest solution, it seems to be a bug with jsxhint that can only be fixed this way.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.3%) to 90.663% when pulling 0bff6d2 on akmorrow13:otherGA4GH into 3859912 on hammerlab:master.

@armish
Copy link
Member

armish commented Jul 23, 2017

thank you, @akmorrow13! LGTM.

@armish armish merged commit 5c78282 into hammerlab:master Jul 23, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants