-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Migrate variant browser to postgres #528
Conversation
…many other things
…t_minimal_representation as it seems unused
…. Most hacks can be removed once database is fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've gone over most of the python code. None of the JS so far.
Apart from the specific comments I have left I have a few general ones.
-
The beacon style dataset reference parsing. Wouldn't it be better to put that in a function and call that in all handlers that get the dataset name. That way we are both future proof and it's easier to find all the places to fix the beacon-version parsing bugs.
-
There's a lot of
[x for x in db.query.dicts()]
, esp. in the lookups module, is this really necessary? In one or two cases maybe, but I feel that the conversion from peewee objects into dicts should be made as late as possible. Maybe this requires a larger refactor than what we should do at the moment, but I would like to make sure. -
This is even more general. Should the exception handling happen close to where the exceptions happen or should the lookups just let them pass and then we deal with them in the handlers. The latter approach would (possibly) reduce the amount of exception handling. I'm not super convinced either way, but as it is now the exceptions are handled in both the lookup and the handler (although, there exceptions are the case when there's a
None
return from the underlying lookup function).
For example GetGene
in browser_handlers.py. Line 89 checks wheter gene in None
, which only happens in case get_gene
in lookups.py
(line 196) gets an exception. To me this looks like we are doing double work.
(db.Coverage.pos <= end_pos) & | ||
(db.Coverage.chrom == chrom) & | ||
(db.Coverage.dataset_version == dataset_version.id)) | ||
.dicts())] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can't you just return the db.Coverage...where()
witihout .dicts()
and without a list-comprehension?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The list comprehension is needed; otherwise it only returns a SelectQuery
object that needs to be executed on the database.
return db.Gene.select().where((db.Gene.gene_id == gene_id) & | ||
(db.Gene.reference_set == ref_set)).dicts().get() | ||
except db.Gene.DoesNotExist: | ||
return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't this be in just one try:
with two except clauses instead, would be a bit clearer to me at least.
(db.Gene.start <= stop_pos) & | ||
(db.Gene.stop >= start_pos) & | ||
(db.Gene.chrom == chrom)).dicts() | ||
return [gene for gene in gene_query] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, I'm not really sure you need to do the dicts()
and list comprehension here.
if variant['rsid']: | ||
if not str(variant['rsid']).startswith('rs'): | ||
variant['rsid'] = 'rs{}'.format(variant['rsid']) | ||
return variant |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this method should instead be:
variant = get_raw_variant(dataset, pos, chrom, ref, alt, ds_version)
if variant and 'rsid' in variant and not str(variant['rsid']).startswith('rs'):
variant['rsid'] = f"{rs{variant['rsid']}"
return variant
(the main point was the updated ifs, not the f-string)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Went through the js as well.
<p>Your region is too large. Please submit a region of at most 100 kb.</p> | ||
<p>If you require larger regions, please see our <a href="/dataset/{{ ctrl.dataset.shortName }}/download">Download</a> page for raw data.</p> | ||
</div> | ||
<div ng-if='ctrl.region.statusText=="The region is too large"'> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about comparing with a string in this way here. Isn't there a statusCode for too large region?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm using 400 (bad request
) for too large region. 400 is also used for problems with parsing the region (which in my opinion also fits bad request
). Thus the need to check the actual text. But maybe I should change it to ===
?
|
013e547
to
61c7be5
Compare
af2e9ff
to
421c279
Compare
b779b2b
to
40b10cf
Compare
…the same reference set
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very good. Just a small comment about the parse_beacon_dataset
.
9a86973
to
5c17392
Compare
5c17392
to
cab3370
Compare
Describe the pull request:
Pull request long description:
Migration of the variant browser from mongo to postgres. Also includes a lot of bug fixes in both backend and frontend. A lot of browser tests are also added.
Changes made:
A lot.