-
Notifications
You must be signed in to change notification settings - Fork 64
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added Variants and Features GA4GH, with GA4GH example file #464
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for all these changes, @akmorrow13! I have a few minor comments here and there but this looks great overall!
re: failing tests: I would rather not have any of them failing if you can think of a quick workaround for the JSON issue. I also don't want this to be blocked by the JSON features, so you can decide on the way to work around the issue in a way that won't take too much of your time and won't compromise the tests for now.
Again - thanks a lot for the amazing work! Can't wait to make a new release featuring all of these new additions.
@@ -98,6 +98,8 @@ To play with the demo, start an [http-server][hs]: | |||
|
|||
Then open [http://localhost:8080/examples/index.html](http://localhost:8080/examples/index.html) in your browser of choice. | |||
|
|||
To view integration with GA4GH schemas, view [http://localhost:8080/examples/ga4gh-example.html](http://localhost:8080/examples/ga4gh-example.html). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kudos for isolating all these GA4GH integration parts into a separate example 🙇
examples/data-ga4gh.js
Outdated
{ | ||
viz: pileup.viz.variants(), | ||
data: pileup.formats.GAVariant({ | ||
endpoint: 'http://1kgenomes.ga4gh.org', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would you mind extracting this and the following endpoint
as a variable that is defined at the top of the file with some pointers to the 1kgenomes
server and a simple explanation to why this server is special? That would make it easier for a random person to compare this with the standard data example.
src/main/Interval.js
Outdated
// Expand range to begin and end on multiples of size | ||
expand(size: number, zeroBased: boolean): Interval { | ||
var minimum = zeroBased ? 0 : 1; | ||
var roundDown = x => x - x % size; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a simple example to what this line would do as a comment? And maybe encapsulate the modulo operation in parentheses just for clarity?
This is just me thinking loud: the expand
name seems a bit misleading here: I first thought that this was just expanding the region by the given size, but this might actually not expand one of the ends if either the start
or stop
are already multiples of a size - e.g.: (10, 20) -> expand(5) -> (10, 20)
. Maybe using round
in the name would be better but I leave it up to you since this was what it was named originally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok sounds good! I will rename it. Also, I am still not sure on the zeroBased
parameter. Another option is changing zeroBased
to an integer, and passing that in as the minimum. So instead you would have:
round(size: number, minimum: number): Interval {
var roundDown = x => x - x % size;
var newStart = Math.max(minimum, roundDown(this.start)),
newStop = roundDown(this.stop + size - 1);
return new Interval(newStart, newStop);
}
This is pretty minor, but it could be cleaner.
src/main/sources/BamDataSource.js
Outdated
@@ -77,7 +70,7 @@ function createFromBamFile(remoteSource: BamFile): AlignmentDataSource { | |||
return Q.when(); | |||
} | |||
|
|||
interval = expandRange(interval); | |||
interval = interval.expand(BASE_PAIRS_PER_FETCH, ZERO_BASED); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
src/main/sources/BigBedDataSource.js
Outdated
// Flow type for export. | ||
export type BigBedSource = { | ||
rangeChanged: (newRange: GenomeRange) => void; | ||
getGenesInRange: (range: ContigInterval<string>) => Gene[]; | ||
getFeaturesInRange: (range: ContigInterval<string>) => Gene[]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for renaming this — feature
would better speak to the bioinformaticians in this context.
@@ -52,6 +41,9 @@ function create(spec: GA4GHSpec): AlignmentDataSource { | |||
var coveredRanges: ContigInterval<string>[] = []; | |||
|
|||
function addReadsFromResponse(response: Object) { | |||
if (response.alignments === undefined) { | |||
return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
curious why/when this happens. I would have expected the alignments
to be an empty array for the semantics of it, but is this how GA4GH
server responds when no reads match the criteria?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This occurs when you query a region without data. For example, for variants, if you query the GA4GH server:
curl --data '{"variantSetId": "WyIxa2dlbm9tZXMiLCJ2cyIsInBoYXNlMy1yZWxlYXNlIl0", "end": "10200", "pageSize": "0", "pageToken": "", "start": "10000", "callSetIds": ["WyIxa2dlbm9tZXMiLCJ2cyIsInBoYXNlMy1yZWxlYXNlIiwiSEcwMDA5NiJd"], "referenceName": "10"}' --header 'Content-Type: application/json' http://1kgenomes.ga4gh.org/variants/search
It will return {}
Therefore, if trying to call .variants
(or .alignments
in the alignment case), on the empty JSON object, the code will error.
src/main/viz/VariantTrack.js
Outdated
@@ -126,7 +126,7 @@ class VariantTrack extends React.Component { | |||
if (variants && variants.length>0) { | |||
var data = []; | |||
for (var i=0;i<variants.length;i++) { | |||
data.push({id: variants[i].id,vcfLine:variants[i].vcfLine}); | |||
data.push({id:variants[i].id,vcfLine:variants[i].vcfLine,ref:variants[i].ref,alt:variants[i].alt}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry for the styling inconsistency here; if you had a chance, would you mind fixing the style here? OK if you don't get to it:
// ...
for (var i=0; i < variants.length; i++) {
data.push({
id: variants[i].id,
vcfLine: variants[i].vcfLine,
ref: variants[i].ref,
alt: variants[i].alt
});
}
// ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Styling nits are appreciated! Will fix!
@armish Thanks for all the comments! I decided that it would be easiest to just add the JSON stuff required to pass the FeatureTrack tests, so those should all be in! Let me know if there are any other comments we should address. One thing to note is the Flow issues: In VcfDataSource.js and BigBedDataSource.js I had to add an export flag to make Flow happy with some imports. Although this is not the cleanest solution, it seems to be a bug with jsxhint that can only be fixed this way. |
thank you, @akmorrow13! LGTM. |
This PR contains the following:
What's left for this PR:
This change is![Reviewable](https://camo.githubusercontent.com/23b05f5fb48215c989e92cc44cf6512512d083132bd3daf689867c8d9d386888/68747470733a2f2f72657669657761626c652e696f2f7265766965775f627574746f6e2e737667)