Command to convert tool_dependencies.xml recipes into shell scripts #303

Closed
peterjc opened this Issue Sep 18, 2015 · 22 comments

4 participants
@peterjc
Contributor

peterjc commented Sep 18, 2015

Related to #19 (testing tool_dependencies.xml without a tool shed), I would like to be able to run an install recipe from a tool_dependencies.xml file locally and/or turn it into a simple shell script for the current platform.

(The platform specific actions could be turned into bash if statements if preferred)

This seems to overlap with https://github.com/jmchilton/shed2tap

e.g. https://github.com/peterjc/pico_galaxy/blob/master/tools/effectiveT3/tool_dependencies.xml

<?xml version="1.0"?>
<tool_dependency>
    <package name="effectiveT3" version="1.0.1">
        <install version="1.0">
            <actions>
                <!-- Set environment variable so Python script knows where to look -->
                <action type="set_environment">
                    <environment_variable name="EFFECTIVET3" action="set_to">$INSTALL_DIR</environment_variable>
                </action>
                <!-- Main JAR file -->
                <action type="shell_command">wget http://effectors.csb.univie.ac.at/sites/eff/files/others/TTSS_GUI-1.0.1.jar</action>
                <!-- If using action type download_file will need to move the file,
                <action type="move_file"><source>TTSS_GUI-1.0.1.jar</source><destination>$INSTALL_DIR/</destination></action>
                -->
                <!-- Three model JAR files -->
                <action type="make_directory">$INSTALL_DIR/module</action>
                <action type="shell_command">wget http://effectors.csb.univie.ac.at/sites/eff/files/others/TTSS_ANIMAL-1.0.1.jar</action>
                <action type="move_file"><source>TTSS_ANIMAL-1.0.1.jar</source><destination>$INSTALL_DIR/module/</destination></action>        
                <action type="shell_command">wget http://effectors.csb.univie.ac.at/sites/eff/files/others/TTSS_PLANT-1.0.1.jar</action>
                <action type="move_file"><source>TTSS_PLANT-1.0.1.jar</source><destination>$INSTALL_DIR/module/</destination></action>
                <action type="shell_command">wget http://effectors.csb.univie.ac.at/sites/eff/files/others/TTSS_STD-1.0.1.jar</action>
                <action type="move_file"><source>TTSS_STD-1.0.1.jar</source><destination>$INSTALL_DIR/module/</destination></action>
                <action type="shell_command">wget http://effectors.csb.univie.ac.at/sites/eff/files/others/TTSS_STD-2.0.1.jar</action>
                <action type="move_file"><source>TTSS_STD-2.0.1.jar</source><destination>$INSTALL_DIR/module/</destination></action>
            </actions>
        </install>
        <readme>
Downloads effectiveT3 v1.0.1 and the three models from http://effectors.org/ aka http://effectors.csb.univie.ac.at/
        </readme>
    </package>
</tool_dependency>

Would become something like this (assuming already in install directory as per XML convention):

#!/bin/bash
#House keeping: strict bash mode, etc
set -euo pipefail
export INSTALL_DIR=$PWD
#Start of conversion from XML recipe:
echo "Installing effectiveT3 version 1.0.1"
export EFFECTIVET3=$INSTALL_DIR
wget http://effectors.csb.univie.ac.at/sites/eff/files/others/TTSS_GUI-1.0.1.jar
mkdir $INSTALL_DIR/module
wget http://effectors.csb.univie.ac.at/sites/eff/files/others/TTSS_ANIMAL-1.0.1.jar
mv TTSS_ANIMAL-1.0.1.jar $INSTALL_DIR/module/
wget http://effectors.csb.univie.ac.at/sites/eff/files/others/TTSS_PLANT-1.0.1.jar
mv TTSS_PLANT-1.0.1.jar $INSTALL_DIR/module/
wget http://effectors.csb.univie.ac.at/sites/eff/files/others/TTSS_STD-1.0.1.jar
mv TTSS_STD-1.0.1.jar $INSTALL_DIR/module/
wget http://effectors.csb.univie.ac.at/sites/eff/files/others/TTSS_STD-2.0.1.jar
mv TTSS_STD-2.0.1.jar $INSTALL_DIR/module/

I would then be able to run this within TravisCI with the advantages that the install recipe is not repeated (tool_dependencies.xml and .travis.yml) and moreover I would actually be able to test tool_dependencies.xml, e.g. peterjc/pico_galaxy@243311c
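
The conversion requested above could be approached roughly as follows. This is only a minimal sketch, not planemo's implementation: the function name `recipe_to_bash` is hypothetical, it handles only the four action types used in the effectiveT3 example, and it emits `mkdir -p` rather than plain `mkdir` as a defensive choice.

```python
import xml.etree.ElementTree as ET


def recipe_to_bash(xml_text):
    """Translate a small subset of tool_dependencies.xml actions into bash lines.

    Hypothetical helper for illustration; unsupported action types raise
    NotImplementedError rather than being silently skipped.
    """
    root = ET.fromstring(xml_text)
    lines = ["#!/bin/bash", "set -euo pipefail"]
    for package in root.iter("package"):
        lines.append('echo "Installing %s version %s"'
                     % (package.get("name"), package.get("version")))
        for action in package.iter("action"):
            kind = action.get("type")
            if kind == "shell_command":
                lines.append(action.text.strip())
            elif kind == "make_directory":
                lines.append("mkdir -p %s" % action.text.strip())
            elif kind == "move_file":
                lines.append("mv %s %s" % (action.findtext("source").strip(),
                                           action.findtext("destination").strip()))
            elif kind == "set_environment":
                for var in action.iter("environment_variable"):
                    if var.get("action") != "set_to":
                        raise NotImplementedError(
                            "Only set_to is handled in this sketch")
                    lines.append("export %s=%s" % (var.get("name"), var.text.strip()))
            else:
                raise NotImplementedError(
                    "No bash translation for action type %r" % kind)
    return lines


if __name__ == "__main__":
    demo = """<tool_dependency>
      <package name="demo" version="1.0">
        <install version="1.0"><actions>
          <action type="make_directory">$INSTALL_DIR/module</action>
          <action type="shell_command">wget http://example.org/a.jar</action>
        </actions></install>
      </package>
    </tool_dependency>"""
    print("\n".join(recipe_to_bash(demo)))
```

XML comments in the recipe are dropped by the default parser, so the commented-out move_file action in the example would not appear in the output.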

@erasche

Member

erasche commented Sep 18, 2015

+1


@erasche

Member

erasche commented Sep 18, 2015

Alternatively there is the shed2tap code if installing from a brew recipe would be acceptable

@bgruening

Member

bgruening commented Sep 18, 2015

ping @davebx;
As far as I know he was looking at this already. We had this idea some months ago, to make migration to brew (or whatever we end up using) easier.

@peterjc

Contributor Author

peterjc commented Sep 18, 2015

I'm out of time now, but having spent some time this afternoon hacking https://github.com/jmchilton/shed2tap I think I can turn @jmchilton's Action.to_ruby() method into something to produce a bash script.

That might be enough for a stand alone tool, or a new planemo command - but waiting to hear from @davebx etc about how best to proceed to avoid duplication of effort.

@jmchilton

Member

jmchilton commented Sep 18, 2015

Just a heads up (maybe way too late): the newest shed2tap code is actually in planemo itself, at https://github.com/galaxyproject/planemo/blob/master/planemo/shed2tap/base.py

@peterjc

Contributor Author

peterjc commented Sep 21, 2015

Thanks @jmchilton. https://github.com/jmchilton/shed2tap has an extensive to_ruby() method on the base Action class (essentially a large switch statement), but there is nothing similar in https://github.com/galaxyproject/planemo/blob/master/planemo/shed2tap/base.py, which instead has a far more complete hierarchy of Action subclasses. I would think adding small to_ruby() or to_bash() methods to each Action subclass would make sense here?

e.g. https://github.com/peterjc/planemo/tree/shed2bash
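
As a rough illustration of the "small to_bash() method per subclass" idea, here is a sketch. The class names and constructors below are hypothetical stand-ins and are not claimed to match the classes in planemo's shed2tap/base.py.

```python
class BaseAction(object):
    """Hypothetical base class: subclasses override to_bash()."""

    action_type = None

    def to_bash(self):
        # Fail loudly for action types nobody has translated yet.
        raise NotImplementedError(
            "No to_bash defined for Action[type=%s]" % self.action_type)


class ShellCommandAction(BaseAction):
    action_type = "shell_command"

    def __init__(self, command):
        self.command = command

    def to_bash(self):
        # A shell command is already bash, so pass it through unchanged.
        return [self.command]


class MakeDirectoryAction(BaseAction):
    action_type = "make_directory"

    def __init__(self, directory):
        self.directory = directory

    def to_bash(self):
        return ["mkdir -p %s" % self.directory]


class MoveFileAction(BaseAction):
    action_type = "move_file"

    def __init__(self, source, destination):
        self.source = source
        self.destination = destination

    def to_bash(self):
        return ["mv %s %s" % (self.source, self.destination)]


if __name__ == "__main__":
    steps = [
        MakeDirectoryAction("$INSTALL_DIR/module"),
        ShellCommandAction("wget http://example.org/a.jar"),
        MoveFileAction("a.jar", "$INSTALL_DIR/module/"),
    ]
    for step in steps:
        print("\n".join(step.to_bash()))
```

Each method returns a list of lines so that an action needing several commands can be expressed without changing the caller.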

@jmchilton

Member

jmchilton commented Sep 21, 2015

This might be the most updated thing I was working on... jmchilton@52f7866.
https://github.com/jmchilton/planemo/commits/shed2tap

Whatever you get working is fine. My code is sprawled all over it seems and that is my own fault so I will adapt it to whatever you get into planemo :).

@jmchilton

Member

jmchilton commented Sep 21, 2015

I was thinking of implementing a visitor pattern for ruby/bash conversion, but to_bash or to_ruby will be fine also.
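
For comparison, a visitor-style layout might look something like the sketch below. All names here are hypothetical. The point is only that the action classes stay free of output-specific code, and each output format (bash here, with a plain-text summary standing in for a second format such as Ruby) lives in its own visitor class.

```python
class Action(object):
    """Hypothetical parsed recipe step: a type name plus its text payload."""

    def __init__(self, action_type, text):
        self.action_type = action_type
        self.text = text

    def accept(self, visitor):
        # Dispatch on the action type, e.g. visit_shell_command.
        method = getattr(visitor, "visit_" + self.action_type, None)
        if method is None:
            raise NotImplementedError(
                "%s cannot handle Action[type=%s]"
                % (type(visitor).__name__, self.action_type))
        return method(self)


class BashVisitor(object):
    """Turns each action into a list of bash lines."""

    def visit_shell_command(self, action):
        return [action.text]

    def visit_make_directory(self, action):
        return ["mkdir -p %s" % action.text]


class SummaryVisitor(object):
    """A second output format, added without touching the Action class."""

    def visit_shell_command(self, action):
        return ["Run: %s" % action.text]

    def visit_make_directory(self, action):
        return ["Create directory: %s" % action.text]


def render(actions, visitor):
    """Apply one visitor to every action and flatten the results."""
    lines = []
    for action in actions:
        lines.extend(action.accept(visitor))
    return lines


if __name__ == "__main__":
    recipe = [
        Action("make_directory", "$INSTALL_DIR/module"),
        Action("shell_command", "wget http://example.org/a.jar"),
    ]
    print("\n".join(render(recipe, BashVisitor())))
    print("\n".join(render(recipe, SummaryVisitor())))
```

The trade-off against per-class to_bash() methods is the usual one: visitors make a new output format a single new class, while per-class methods make a new action type a single new class.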

@peterjc

Contributor Author

peterjc commented Sep 21, 2015

My first attempt is using to_bash on the action classes...

@peterjc

Contributor Author

peterjc commented Sep 21, 2015

My first example for effectiveT3 seems to work - but that is a simple tool_dependencies.xml file, perhaps unusually simple.

The download_by_url action type and friends are proving tricky. The problem is the Galaxy magic in lib/tool_shed/galaxy_install/tool_dependencies/recipe/step_handler.py, where the .extract method of class CompressedFile works out the common prefix of a tarball's contents in order to change into that directory, e.g.

<action type="download_by_url">ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.2.30/ncbi-blast-2.2.30+-x64-linux.tar.gz</action>

should become:

$ wget ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.2.30/ncbi-blast-2.2.30+-x64-linux.tar.gz
$ tar -zxvf ncbi-blast-2.2.30+-x64-linux.tar.gz
$ cd ncbi-blast-2.2.30+

I'm almost wondering if something like this would be simplest (which can call the same Galaxy code):

$ planemo download_by_url ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.2.30/ncbi-blast-2.2.30+-x64-linux.tar.gz
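
To make the "common prefix" behaviour concrete, here is a sketch of how the top-level directory of a tarball could be detected with the standard library. It is an approximation written for illustration and is not Galaxy's CompressedFile.extract logic; the function names are hypothetical.

```python
import io
import tarfile


def common_top_level_dir(tar):
    """Return the single top-level directory shared by all members, else None."""
    tops = set()
    for member in tar.getmembers():
        name = member.name
        while name.startswith("./"):
            name = name[2:]
        if not name or name == ".":
            continue
        parts = name.split("/")
        if len(parts) == 1 and not member.isdir():
            # A regular file sits at the top level, so there is nothing to cd into.
            return None
        tops.add(parts[0])
    return tops.pop() if len(tops) == 1 else None


def make_demo_tar(names):
    """Build a small in-memory .tar.gz holding one-byte files with these names."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name in names:
            info = tarfile.TarInfo(name)
            info.size = 1
            tar.addfile(info, io.BytesIO(b"x"))
    buf.seek(0)
    return tarfile.open(fileobj=buf, mode="r:gz")


if __name__ == "__main__":
    demo = make_demo_tar(["pkg-1.0/bin/tool", "pkg-1.0/README"])
    print(common_top_level_dir(demo))
```

With the directory name known at script-generation time, the generated bash only needs a plain `cd` line after the `tar` command.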
@peterjc

Contributor Author

peterjc commented Sep 21, 2015

Another niggle, consider a recipe which boils down to something like this:

#!/bin/bash
#downloads stuff, then sets a new environment variable like NEW_TOOL,
#or edits an existing environment variable like PATH
export NEW_TOOL=/some/path

If executed directly, like ./example.sh or bash example.sh, then we lose access to the environment variable $NEW_TOOL. Alternatively, source example.sh or . example.sh do expose the new/changed environment variables, but then any exit (or any failure, if using strict bash mode with set -euo pipefail or similar) terminates the user's shell session.

I think this means we need to turn the tool_dependencies.xml file(s) into an install.sh file (or similarly named file) to be run once, plus a second shell script which only sets the environment variables, to be run via source prior to running the tool tests via the dependency mechanisms. Galaxy calls those env.sh, doesn't it?
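
The two-script split described above might be generated along these lines. This is a sketch only: the `(kind, text)` step representation and the function name are assumptions made for illustration.

```python
def split_recipe(steps):
    """Split recipe steps into install-script lines and env-script lines.

    steps is a list of (kind, text) tuples where kind is "env" for an
    environment variable assignment (NAME=value) and anything else is
    treated as an ordinary install command.
    """
    # Run once, as a child process, so strict mode and exit are safe here.
    install_lines = ["#!/bin/bash", "set -euo pipefail"]
    # Meant to be sourced: no strict mode and no exit, so a problem cannot
    # terminate the user's shell session.
    env_lines = []
    for kind, text in steps:
        if kind == "env":
            env_lines.append("export %s" % text)
        else:
            install_lines.append(text)
    return install_lines, env_lines


if __name__ == "__main__":
    install, env = split_recipe([
        ("command", "wget http://example.org/a.jar"),
        ("env", "NEW_TOOL=/some/path"),
    ])
    print("\n".join(install))
    print("\n".join(env))
```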

@bgruening

Member

bgruening commented Sep 21, 2015

@peterjc I like your second idea. And yes Galaxy calls the env.sh files before executing a tool.
Thanks for working on this. I think this will make so many things easier.

@peterjc

Contributor Author

peterjc commented Sep 22, 2015

I'm increasingly finding I am reimplementing things already in the Galaxy Tool Shed code (with the risk of potentially interpreting the XML recipe slightly differently, which on the plus side could highlight some ambiguities in the recipe format).

e.g. turning the environment variable actions into env.sh entries is done in https://github.com/galaxyproject/galaxy/blob/dev/lib/tool_shed/galaxy_install/tool_dependencies/recipe/env_file_builder.py

Planemo already bundles part of the Galaxy Python library under planemo_ext/, so might adding planemo_ext/tool_shed/galaxy_install/tool_dependencies/recipe/env_file_builder.py etc. be a practical way forward?

@jmchilton

Member

jmchilton commented Sep 22, 2015

@peterjc That directory aims to be a subset of Galaxy's code base, so feel free to bring stuff over. The stuff should be sufficiently isolated though. galaxy.util also isn't yet a true subset, so be careful about that as well.

@peterjc

Contributor Author

peterjc commented Sep 23, 2015

I have a plan for download_by_url and the related action types which is consistent with producing as simple a bash script as possible.

While generating the bash script, I will download the file (to a temp directory by default) where I can examine it to determine how to decompress it and what folder (if any) Galaxy would automatically change into. This might use a bundled copy of lib/tool_shed/galaxy_install/tool_dependencies/recipe/step_handler.py.

To avoid the overheads and waste of repeated downloads, the key information (decompression method and folder to change into) can be cached. I am planning to use the MD5 hash of the URL as the key. e.g. "~/.planemo/dependency_downloads/%s.json" % md5(url)

In the context of continuous integration with TravisCI, I plan to re-use the cached downloaded files. i.e. include an if statement to link to the cached file if present.
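
A sketch of the MD5-keyed metadata cache proposed above follows. The JSON field names used in the demo ("compression", "directory") and the helper names are assumptions; only the path template and the use of the URL's MD5 hash come from the plan itself.

```python
import hashlib
import json
import os
import tempfile

DEFAULT_CACHE_DIR = "~/.planemo/dependency_downloads"


def cache_path(url, cache_dir=DEFAULT_CACHE_DIR):
    """Return the JSON path for this URL, keyed by the MD5 hash of the URL."""
    digest = hashlib.md5(url.encode("utf-8")).hexdigest()
    return os.path.join(os.path.expanduser(cache_dir), "%s.json" % digest)


def load_cached_info(url, cache_dir=DEFAULT_CACHE_DIR):
    """Return the cached dict for this URL, or None if nothing is cached."""
    path = cache_path(url, cache_dir)
    if not os.path.isfile(path):
        return None
    with open(path) as handle:
        return json.load(handle)


def save_cached_info(url, info, cache_dir=DEFAULT_CACHE_DIR):
    """Record what was learned about the download so it need not be refetched."""
    path = cache_path(url, cache_dir)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as handle:
        json.dump(info, handle)


if __name__ == "__main__":
    demo_dir = tempfile.mkdtemp()
    demo_url = "http://example.org/pkg-1.0.tar.gz"
    save_cached_info(demo_url, {"compression": "gz", "directory": "pkg-1.0"}, demo_dir)
    print(load_cached_info(demo_url, demo_dir))
```

MD5 is used here purely as a filename-safe key for the URL, not for any security purpose.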

peterjc added a commit to peterjc/galaxy_blast that referenced this issue Sep 24, 2015

No NCBI 32 bit binaries as of BLAST+ 2.2.31
Spotted while working on a planemo extension to parse
tool_dependencies.xml files into bash install scripts:
galaxyproject/planemo#303
@peterjc

Contributor Author

peterjc commented Sep 24, 2015

I have a working prototype planemo depbash command here: https://github.com/peterjc/planemo/tree/depbash

This is hard coded to produce a single dep_install.sh file and a matching env.sh, combining all the tool_dependencies.xml files processed (you can recurse over a folder). All of them use $INSTALL_DIR as their destination, which is a problem if you have name clashes between tool binaries (e.g. multiple versions of BLAST+).

Right now it uses a single flat folder $DOWNLOAD_CACHE (defaulting to ./download_cache) to cache downloads (nothing clever with checksums), so that the decompression and folder structure can be determined while generating the shell script. The dep_install.sh will also use this cache so that in a continuous integration setup the file is only fetched once.
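
The "fetch once, then reuse" behaviour could be emitted as a bash fragment roughly like this. It is a sketch, not the output of the actual prototype: the function name is hypothetical, and it assumes $DOWNLOAD_CACHE is set to an absolute path when the generated script runs, so that the symlink stays valid after a later `cd`.

```python
import posixpath
from urllib.parse import urlparse


def cached_download_bash(url):
    """Return bash lines that fetch url once into $DOWNLOAD_CACHE, then link it."""
    filename = posixpath.basename(urlparse(url).path)
    if not filename:
        raise ValueError("Cannot work out a filename from %r" % url)
    cached = '"$DOWNLOAD_CACHE/%s"' % filename
    return [
        'mkdir -p "$DOWNLOAD_CACHE"',
        # Only hit the network when the file is not already cached.
        "if [[ ! -f %s ]]; then" % cached,
        '    wget -O %s "%s"' % (cached, url),
        "fi",
        # Expose the cached file under its plain name in the working directory.
        'ln -s -f %s "%s"' % (cached, filename),
    ]


if __name__ == "__main__":
    print("\n".join(cached_download_bash(
        "https://github.com/bcgsc/abyss/releases/download/1.9.0/abyss-1.9.0.tar.gz")))
```

As in the prototype described above, the cache key is just the filename, so two different URLs ending in the same filename would collide.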

Example usage assuming you don't have to worry about multiple tools clashing:

$ planemo depbash -r ~/my_tools/
$ bash dep_install.sh
$ source env.sh
$ planemo test -r ~/my_tools

Note this does nothing about resolving dependencies!

This is able to parse all my tool_dependencies.xml in https://github.com/peterjc/galaxy_blast , https://github.com/peterjc/pico_galaxy and https://github.com/peterjc/galaxy_mira

Not all the actions are supported yet, e.g.

$ planemo depbash --fail_fast -r ../tools-iuc ../tools-devteam/ ; echo "Returned $?"
...
Processing requirements from /mnt/galaxy/repositories/tools-iuc/packages/package_abyss_1_9_0/tool_dependencies.xml
Downloading https://github.com/bcgsc/abyss/releases/download/1.9.0/abyss-1.9.0.tar.gz
Error processing /mnt/galaxy/repositories/tools-iuc/packages/package_abyss_1_9_0/tool_dependencies.xml - No to_bash defined for Action[type=set_environment_for_install]
...
Error processing one or more tool_dependencies.xml files.
Returned 1
@erasche

Member

erasche commented Sep 24, 2015

@peterjc awesome!

@peterjc

Contributor Author

peterjc commented Sep 25, 2015

Successful TravisCI usage with galaxy_mira to install MIRA 3.4.1.1, 4.0.2 and 4.9.5 via planemo depbash rather than a manual install recipe:

peterjc/galaxy_mira@b71e8a4
https://travis-ci.org/peterjc/galaxy_mira/builds/82117820

In the above, TravisCI ran no tests as nothing had changed compared to the Test Tool Shed. Here's the following test run, where I requested that all the tests be run (magic keyword in the git commit):

peterjc/galaxy_mira@70cad4e
https://travis-ci.org/peterjc/galaxy_mira/builds/82123804

See also #7 where I described the planemo + TravisCI approach I'm trying on this galaxy_mira branch.

peterjc added a commit that referenced this issue Oct 7, 2015

Adding new command: planemo dependency_script
This is a squashed commit of pull request #310 for issue #303,
for converting tool_dependencies.xml install recipes into bash
scripts.

There is a lot that could be done better, or added, including:

- refactor to use a visitor pattern instead of my .to_bash() methods
- expand the command line API with options for paths and filenames
- setting defaults like download cache via ~/.planemo.yml
- complete the action coverage (especially the R/Python/Perl environments)
- avoid collisions in the download cache which currently assumes unique filenames

However, this is enough to help with automating dependency
installation in a continuous integration setup like TravisCI.
@peterjc

Contributor Author

peterjc commented Oct 7, 2015

Should we leave this open for finishing some of the missing functionality as of f798c7e or file separate issues?

@jmchilton

Member

jmchilton commented Oct 7, 2015

Whichever you'd prefer, but my vote is for new issues, I like churn :).

@peterjc

Contributor Author

peterjc commented Oct 7, 2015

OK. I've filed issues for what I consider to be the top priorities.

@peterjc

Contributor Author

peterjc commented Oct 12, 2015

Quoting myself from earlier in this discussion: I'm increasingly finding I am reimplementing things already in the Galaxy Tool Shed code (with the risk of potentially interpreting the XML recipe slightly differently, which on the plus side could highlight some ambiguities in the recipe format).

Here's an example of the kind of ambiguity I was expecting: #321 and galaxyproject/galaxy#896
