Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the trackhub composite datatype #2348

Merged
merged 9 commits into from Jul 22, 2016

Conversation

Projects
None yet
9 participants
@remimarenco
Copy link
Contributor

commented May 12, 2016

Hi!

Some of you are maybe aware we are building, at goeckslab, a workflow for genome annotation which is very tied with Galaxy: G-OnRamp.

One of the tools of G-OnRamp workflow is called HubArchiveCreator and allows to visualize tracks generated inside Galaxy, onto UCSC TrackHub.

To enable this, I needed to add a datatype, and a display_application.

With @jgoecks, we though it could be really useful to have TrackHub UCSC / G-OnRamp compatibility being directly available on Galaxy installation, so here is the PR :).

Here is a print screen of the result:
screen shot 2016-05-11 at 10 00 45 pm

To test the datatype:

(Remove the .txt extension => GitHub does not accept .fa or .bed)

  • Run HubArchiveCreator with dbia3.fa as reference genome and:
    • Format: BED
    • Bed Choice: BED Simple Repeats
    • Bed Simple Repeats (Bed4+12) File: trfBig.bed

This image might help:
screen shot 2016-05-11 at 10 08 45 pm

And you're done! You have huba datatype installed, and you can visualize:

  • BED
  • BED simple repeats
  • GTF
  • GFF3
  • BigWig
  • Bam

files into UCSC TrackHub :)

@galaxybot galaxybot added the triage label May 12, 2016

@galaxybot galaxybot added this to the 16.07 milestone May 12, 2016

@@ -0,0 +1,6 @@
<display id="ucsc_huba" version="1.0.0" name="display at Track Hub UCSC">
<link id="main" name="main">
<url>https://genome.ucsc.edu/cgi-bin/hgHubConnect?hubUrl=${qp($hub_file.url + '/myHub/hub.txt')}&amp;hgHub_do_firstDb=on&amp;hgHub_do_redirect=on&amp;hgHubConnect.remakeTrackHub=on</url>

This comment has been minimized.

Copy link
@bgruening

bgruening May 12, 2016

Member

Should this be configurable?
There are many UCSC instances which could be used.

This comment has been minimized.

Copy link
@remimarenco

remimarenco May 12, 2016

Author Contributor

I will be happy to set this configurable, how do we set this up?

@@ -0,0 +1,59 @@
"""
HubAssembly datatype
"""

This comment has been minimized.

Copy link
@bgruening

bgruening May 12, 2016

Member

Can we squeeze this datatype into an other file? tracks.py maybe?

This comment has been minimized.

Copy link
@remimarenco

remimarenco May 12, 2016

Author Contributor

Sure, sounds logical!

if version.VERSION_MAJOR <= "16.01":
from galaxy.datatypes.images import Html
else:
from galaxy.datatypes.text import Html

This comment has been minimized.

Copy link
@bgruening

bgruening May 12, 2016

Member

Oo, why is this needed?
Can we have one central import to avoid such things in the future?

This comment has been minimized.

Copy link
@remimarenco

remimarenco May 12, 2016

Author Contributor

Technically, importing from galaxy.datatyes.images should work in any versions because of this.
But I preferred to use the text lib, to drop the images one in the future (once we drop support for < 16.04...so in a long time heh).

How do you think we should solve this?

This comment has been minimized.

Copy link
@nsoranzo

nsoranzo May 12, 2016

Member

You don't need to care for older Galaxy versions, since this PR will be merged in the dev branch and the datatype released only in Galaxy 16.07.

This comment has been minimized.

Copy link
@remimarenco

remimarenco May 13, 2016

Author Contributor

Ok, I was afraid about the case where somebody wanted to use the datatype on 16.01 or earlier (and it was my case for various reasons). So as you wish :)

This comment has been minimized.

Copy link
@bgruening

bgruening Jun 3, 2016

Member

@remimarenco this still needs to be addressed, isn't it?

This comment has been minimized.

Copy link
@nsoranzo

nsoranzo Jun 3, 2016

Member

New features are not cherry-picked. If you will backport locally, than you can also add the workaround!

This comment has been minimized.

Copy link
@remimarenco

remimarenco Jun 3, 2016

Author Contributor

I don't get why theses lines prevent the merging in Galaxy codebase. Does it harm something? Just to know so I won't do the same errors in the future.

This comment has been minimized.

Copy link
@jxtx

jxtx Jun 3, 2016

Contributor

I believe the argument is that since datatypes are distributed within the Galaxy code, this case should never happen. It's dead code.

This comment has been minimized.

Copy link
@jmchilton

jmchilton Jun 3, 2016

Member

We distribute datatypes but they are a plugin - I'd be totally willing to include work arounds like this in datatypes code in core.

This comment has been minimized.

Copy link
@remimarenco

remimarenco Jun 3, 2016

Author Contributor

Thanks for the explanation @jxtx. I got that point, and as I said, I had the case where I needed my datatype whereas I was not in 16.04.

But I guess this is a corner case, so I understand :).

Thanks for the support @jmchilton, I removed it as it seems to be a majority not in favor :).

'<html><head><title>Files for Composite Dataset (%s)</title></head><p/>\
This composite dataset is composed of the following files:<p/><ul>' % (
self.file_ext)]
for composite_name, composite_file in self.get_composite_files( dataset=dataset ).iteritems():

This comment has been minimized.

Copy link
@bgruening

bgruening May 12, 2016

Member

just items()

This comment has been minimized.

Copy link
@dannon

dannon May 12, 2016

Member

Why items() over iteritems(), out of curiosity? Use of the generator seems fine practice here to me.

This comment has been minimized.

Copy link
@bgruening

bgruening May 12, 2016

Member

@dannon we do try to port Galaxy slowly to Python3 isn't it? I was assuming this is more portable and speed is not a concern in this call for python2 installations.
But yes it is not a must.

This comment has been minimized.

Copy link
@dannon

dannon May 12, 2016

Member

Ahh, I see, you were thinking about python3. Good to have the rationale here instead of simply 'just items()' :)

Definitely not a requirement to change this now @remimarenco, up to you.

This comment has been minimized.

Copy link
@bgruening

bgruening May 12, 2016

Member

Sorry, was a breakfast-review ;)

This comment has been minimized.

Copy link
@remimarenco

remimarenco May 12, 2016

Author Contributor

I like the idea of being more Python 3 compatible, will change it! :)

@bgruening

This comment has been minimized.

Copy link
Member

commented May 12, 2016

@remimarenco good to start a day with such a great idea! Thanks a lot!
I would maybe rename the datatype to make it more obvious that it is UCSC specific.

@remimarenco

This comment has been minimized.

Copy link
Contributor Author

commented May 12, 2016

Thanks for all the reviews @bgruening and @dannon! Will take care of this soon :)

remimarenco added some commits May 14, 2016

Following advices: renamed the extension to 'trackhub', renamed the h…
…uba.xml to trackhub.xml, changed iteritems to items, removed hubAssembly.py to port it into tracks.py

remimarenco added a commit to goeckslab/hub-archive-creator that referenced this pull request May 16, 2016

remimarenco added a commit to goeckslab/hub-archive-creator that referenced this pull request May 16, 2016

@remimarenco remimarenco changed the title Add the composite datatype Add the trackhub composite datatype May 18, 2016

remimarenco and others added some commits May 18, 2016

Update datatypes_conf.xml.sample
Typo in the extension

@dannon dannon referenced this pull request May 18, 2016

Merged

Remove blank lines in file. #1

remimarenco added some commits May 18, 2016

Merge pull request #1 from dannon/huba_datatype
Remove blank lines in file.
<display id="ucsc_trackhub" version="1.0.0" name="display at Track Hub UCSC">
<link id="main" name="main">
<url>https://genome.ucsc.edu/cgi-bin/hgHubConnect?hubUrl=${qp($hub_file.url + '/myHub/hub.txt')}&amp;hgHub_do_firstDb=on&amp;hgHub_do_redirect=on&amp;hgHubConnect.remakeTrackHub=on</url>
<param type="data" name="hub_file" url="galaxy_${DATASET_HASH}" allow_extra_files_access="True" strip_https="True"/>

This comment has been minimized.

Copy link
@blankenberg

blankenberg Jun 3, 2016

Member

Is strip_https="True" actually needed here?

This comment has been minimized.

Copy link
@remimarenco

remimarenco Jun 3, 2016

Author Contributor

I just followed the wiki about this.

What is advised for this / what are the cases?

This comment has been minimized.

Copy link
@blankenberg

blankenberg Jun 3, 2016

Member

It swaps https to http (when Galaxy is under https) for the url that Galaxy gives to the external display application. At one point the UCSC genome browser was not able to fetch https URLs; I do not think this is the case any longer.

This comment has been minimized.

Copy link
@remimarenco

remimarenco Jun 3, 2016

Author Contributor

Ok perfect! I will change also that on the wiki then?

remimarenco added a commit to goeckslab/hub-archive-creator that referenced this pull request Jun 4, 2016

@jmchilton

This comment has been minimized.

Copy link
Member

commented Jul 21, 2016

@galaxybot test this

@jmchilton jmchilton merged commit 40fde4e into galaxyproject:dev Jul 22, 2016

4 checks passed

api test Build finished. 224 tests run, 0 skipped, 0 failed.
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
framework test Build finished. 111 tests run, 0 skipped, 0 failed.
Details
toolshed test Build finished. 582 tests run, 0 skipped, 0 failed.
Details
@jmchilton

This comment has been minimized.

Copy link
Member

commented Jul 22, 2016

Thanks @remimarenco - good luck with G-OnRamp. Any progress on testing composite datatypes?

nsoranzo added a commit to nsoranzo/galaxy that referenced this pull request Jul 22, 2016

@nsoranzo

This comment has been minimized.

Copy link
Member

commented Jul 22, 2016

@jmchilton This was merged with 3 unaddressed comments, see #2646.

@remimarenco

This comment has been minimized.

Copy link
Contributor Author

commented Jul 25, 2016

@jmchilton @nsoranzo Thank you for taking care of this. (Yes @jmchilton, workshop about it tomorrow, for two days! Exciting :) )

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.