New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Neostore Zip datatype support #4178

Merged
merged 11 commits into from Jan 2, 2018

Conversation

@zipho
Contributor

zipho commented Jun 10, 2017

Changes to the neostore datatype, adding support for the zip archive neostore datatype. This also includes the converter which unpacks the noestore zip ready for processing.

@galaxybot galaxybot added the triage label Jun 10, 2017

@galaxybot galaxybot added this to the 17.09 milestone Jun 10, 2017

@@ -99,36 +121,73 @@ def __init__(self, **kwd):
self.add_composite_file('neostore.counts.db.a', is_binary=True)
self.add_composite_file('neostore.counts.db.b', is_binary=True)

This comment has been minimized.

@pvanheus

pvanheus Jun 11, 2017

Contributor

Can we make this optional=True because I've seen Neo4j datastores where this file is missing.

This comment has been minimized.

@zipho

zipho Jun 11, 2017

Contributor

Thanks @pvanheus, that is implemented as per suggestion.

self.add_composite_file('neostore.schemastore.db', is_binary=True)
self.add_composite_file('neostore.schemastore.db.id', is_binary=True)
self.add_composite_file('neostore.transaction.db.0', is_binary=True)
class Neo4jDBzip(Neo4j, Data):

This comment has been minimized.

@pvanheus

pvanheus Jun 12, 2017

Contributor

Isn't it simpler to make Neo4jDBzip a descendant of Binary (e.g. see https://github.com/galaxyproject/galaxy/blob/dev/lib/galaxy/datatypes/binary.py#L86)? It is a single file, so it doesn't need to be a composite type, right?

This comment has been minimized.

@zipho

zipho Jun 12, 2017

Contributor

Though a vaild point, I model the neostore.zip datatype on the Fphe, Lped , Pphe datatypes which inherit from the RGenetics and all are composite datatypes with single file upload. Hence Neo4jDBzip class inherits functions from the Neo4j class.

This comment has been minimized.

@pvanheus

pvanheus Jun 13, 2017

Contributor

Noted, thanks!

@@ -5,6 +5,7 @@
# to be reflected in galaxy.web.controllers.tool_runner and galaxy.tools
from __future__ import print_function
import logging

This comment has been minimized.

@pvanheus

pvanheus Jun 13, 2017

Contributor

This is triggering the flake8 I100 message which is causing build failures. From https://pypi.python.org/pypi/flake8-import-order:

"In general stdlib comes first, then 3rd party, then local packages, and that each group is individually alphabetized" - so that's

import codecs
import gzip
import logging
import os

This comment has been minimized.

@zipho

zipho Jun 13, 2017

Contributor

Your comment is noted, the "import" statement was committed unintentional. I have removed the line, but your input is noted.

@pvanheus

This comment has been minimized.

Contributor

pvanheus commented Jun 13, 2017

@jmchilton this is a follow on to #2605 that will allow easier upload. Can you take a look please?

@jmchilton

This comment has been minimized.

Member

jmchilton commented Jun 14, 2017

@galaxybot test this

@jmchilton

This comment has been minimized.

Member

jmchilton commented Jun 14, 2017

I was almost +1 on this - but it seems to be that based on this PR Galaxy is so close to supporting direct uploads of composite files as zips the way it is. There is some magic in that converters output definition right that takes a directory of files and turns it into a zipped datatype - and I know there is a lot of logic for looking at the selected datatype in upload.py as well as looking at if the file appears compressed or not. Can we glue these pieces together - can the upload.py code say -

if I am looking at a composite datatype and this file is compressed:
    unzip the file to a directory
    have the composite datatype consume the directory and produce a dataset

Are y'all coming to the GCC again? How about this - try to take a look at doing this, if you can't get it before the GCC we will try to find some time to hack on it, and if there isn't time during the GCC I'll revisit this PR and probably merge in early July?

@jmchilton

This comment has been minimized.

Member

jmchilton commented Jun 14, 2017

That above comment was assuming this datatype is only a hack to get zip uploads to work - if there is some good use case for compressed neostore databases in Galaxy histories - by all means I'm happy to merge this PR as is.

@zipho

This comment has been minimized.

Contributor

zipho commented Jun 15, 2017

@jmchilton Thanks for the feedback John. Yes, we are coming to GCC this year. I think this would make a good candidate to work on for the hackathon.

@jmchilton

This comment has been minimized.

Member

jmchilton commented Sep 11, 2017

Were making some big changes to composite uploads and such that may be helpful - I hope it is okay if I delay this for a bit longer. We will definitely revisit it during the next release cycle.

@jmchilton jmchilton modified the milestones: 18.01, 17.09 Sep 11, 2017

@jmchilton

This comment has been minimized.

Member

jmchilton commented Dec 12, 2017

@galaxybot test this

@jmchilton

This comment has been minimized.

Member

jmchilton commented Dec 12, 2017

Merged this against the latest dev branch and tweaked the binary handling, I'll merge this if the tests look good.

@martenson

This comment has been minimized.

Member

martenson commented Jan 2, 2018

This is marked as WIP so I am pushing it to 18.05.

@martenson martenson modified the milestones: 18.01, 18.05 Jan 2, 2018

@jmchilton jmchilton modified the milestones: 18.05, 18.01 Jan 2, 2018

@jmchilton jmchilton merged commit 75283c7 into galaxyproject:dev Jan 2, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
@jmchilton

This comment has been minimized.

Member

jmchilton commented Jan 2, 2018

Thanks @martenson - I meant to merge this a few weeks ago but forgot. Thanks for the contribution @zipho!

@martenson martenson added status/review and removed status/WIP labels Jan 2, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment