Command to compare repositories in different Tool Sheds #27
Comments
Python script doing the above quick-and-dirty hack using
A related task is comparing the current directory (possibly under git version control) with a remote Tool Shed, e.g. to see whether you need to upload a new tar-ball or not. However, this runs into the issues flagged in #26 about how to determine which files to look at.
I updated my gist so the Python script can now also compare a remote repository to local files (ignoring local files not present in the remote repository, which is generally what I need with my Galaxy Tool development setup). Sample output, perhaps too verbose, comparing my development repository to the Test Tool Shed (identical bar trivial differences in
Sample output comparing the Test Tool Shed and main Tool Shed repositories, showing I might want to push the v0.0.9 release to the main Tool Shed which is missing several updates:
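The local-versus-remote comparison described in this comment (walk the remote repository's files and ignore anything that exists only locally) could be sketched roughly as follows. This is not the code from the gist: the function name `compare_remote_to_local`, the helper `_write`, and the demo file names are all hypothetical, and the sketch assumes the remote repository has already been downloaded and unpacked into a directory.

```python
import filecmp
import os
import tempfile


def compare_remote_to_local(remote_dir, local_dir):
    """Return (missing, changed) relative paths, walking only the remote tree.

    Files that exist only under local_dir are never visited, so they are
    ignored, as described in the comment above. (Hypothetical helper.)
    """
    missing, changed = [], []
    for root, _dirs, files in os.walk(remote_dir):
        for name in files:
            remote_path = os.path.join(root, name)
            rel = os.path.relpath(remote_path, remote_dir)
            local_path = os.path.join(local_dir, rel)
            if not os.path.isfile(local_path):
                missing.append(rel)
            elif not filecmp.cmp(remote_path, local_path, shallow=False):
                # shallow=False forces a byte-by-byte comparison
                changed.append(rel)
    return sorted(missing), sorted(changed)


def _write(path, text):
    """Create parent directories and write a small text file (demo only)."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as handle:
        handle.write(text)


# Demo on throwaway directories with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    remote = os.path.join(tmp, "remote")
    local = os.path.join(tmp, "local")
    _write(os.path.join(remote, "tool.xml"), "<tool version='0.0.9'/>\n")
    _write(os.path.join(remote, "README.rst"), "same\n")
    _write(os.path.join(remote, "only_remote.txt"), "x\n")
    _write(os.path.join(local, "tool.xml"), "<tool version='0.0.10'/>\n")
    _write(os.path.join(local, "README.rst"), "same\n")
    _write(os.path.join(local, ".gitignore"), "*.pyc\n")  # local-only, ignored
    missing, changed = compare_remote_to_local(remote, local)

print("missing locally:", missing)
print("changed:", changed)
```

Walking only the remote tree sidesteps the question raised in #26 of which local files to consider, at the cost of never reporting a new local file that has not been uploaded yet.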
Thanks for laying this all out Peter - this is definitely in scope and something I wanted to work on so this is perfect. Things are a bit hectic right now - but I am definitely going to look at this in detail at some point soon. Thanks again!
Inspired by script from @peterjc - https://gist.github.com/peterjc/13653e6907d75c470d01.

By default compares the local changes against the main Tool Shed repository defined by [.][tool][_]shed.yml, but with command line options can be made to do all sorts of comparisons. Some of these are demonstrated below:

Default against main tool shed:

```
% planemo shed_diff
wget -q --recursive -O - 'https://toolshed.g2.bx.psu.edu/repository/download?repository_id=b6b97c236de89252&changeset_revision=default&file_type=gz' | tar -xzf - -C /tmp/tool_shed_diff_CuRq5U/_toolshed_ --strip-components 1
mkdir "/tmp/tool_shed_diff_CuRq5U/_local_"; tar -xzf "/tmp/tmpdVW07c" -C "/tmp/tool_shed_diff_CuRq5U/_local_"; rm -rf /tmp/tmpdVW07c
cd "/tmp/tool_shed_diff_CuRq5U"; diff -r _local_ _toolshed_
diff -r _local_/count_covariates.xml _toolshed_/count_covariates.xml
7d6
< <version_command>echo "A REALLY OLD OPEN SOURCE VERSION OF GATK"</version_command>
diff -r _local_/tool_dependencies.xml _toolshed_/tool_dependencies.xml
4c4
< <repository name="package_gatk_1_4" owner="devteam" prior_installation_required="False" />
---
> <repository changeset_revision="ec95ec570854" name="package_gatk_1_4" owner="devteam" prior_installation_required="False" toolshed="http://toolshed.g2.bx.psu.edu" />
7c7
< <repository name="package_samtools_0_1_18" owner="devteam" prior_installation_required="False" />
---
> <repository changeset_revision="171cd8bc208d" name="package_samtools_0_1_18" owner="devteam" prior_installation_required="False" toolshed="http://toolshed.g2.bx.psu.edu" />
```

Check local diff against test tool shed.

```
% planemo shed_diff --shed_target testtoolshed
/home/john/workspace/planemo/.venv/local/lib/python2.7/site-packages/requests-2.4.3-py2.7.egg/requests/packages/urllib3/connectionpool.py:730: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.org/en/latest/security.html (This warning will only appear once by default.)
  InsecureRequestWarning)
wget -q --recursive -O - 'https://testtoolshed.g2.bx.psu.edu/repository/download?repository_id=4dd15c58c2ade087&changeset_revision=default&file_type=gz' | tar -xzf - -C /tmp/tool_shed_diff_LWnNZt/_testtoolshed_ --strip-components 1
mkdir "/tmp/tool_shed_diff_LWnNZt/_local_"; tar -xzf "/tmp/tmpNKEpuO" -C "/tmp/tool_shed_diff_LWnNZt/_local_"; rm -rf /tmp/tmpNKEpuO
cd "/tmp/tool_shed_diff_LWnNZt"; diff -r _local_ _testtoolshed_
diff -r _local_/count_covariates.xml _testtoolshed_/count_covariates.xml
7d6
< <version_command>echo "A REALLY OLD OPEN SOURCE VERSION OF GATK"</version_command>
diff -r _local_/tool_dependencies.xml _testtoolshed_/tool_dependencies.xml
4c4
< <repository name="package_gatk_1_4" owner="devteam" prior_installation_required="False" />
---
> <repository changeset_revision="0cc94f66d00e" name="package_gatk_1_4" owner="devteam" prior_installation_required="False" toolshed="http://testtoolshed.g2.bx.psu.edu" />
7c7
< <repository name="package_samtools_0_1_18" owner="devteam" prior_installation_required="False" />
---
> <repository changeset_revision="c0f72bdba484" name="package_samtools_0_1_18" owner="devteam" prior_installation_required="False" toolshed="http://testtoolshed.g2.bx.psu.edu" />
```

Check difference between test and main for this repository.

```
% planemo shed_diff --shed_target_source testtoolshed
/home/john/workspace/planemo/.venv/local/lib/python2.7/site-packages/requests-2.4.3-py2.7.egg/requests/packages/urllib3/connectionpool.py:730: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.org/en/latest/security.html (This warning will only appear once by default.)
  InsecureRequestWarning)
wget -q --recursive -O - 'https://toolshed.g2.bx.psu.edu/repository/download?repository_id=b6b97c236de89252&changeset_revision=default&file_type=gz' | tar -xzf - -C /tmp/tool_shed_diff_Aa9wj3/_toolshed_ --strip-components 1
wget -q --recursive -O - 'https://testtoolshed.g2.bx.psu.edu/repository/download?repository_id=4dd15c58c2ade087&changeset_revision=default&file_type=gz' | tar -xzf - -C /tmp/tool_shed_diff_Aa9wj3/_testtoolshed_ --strip-components 1
cd "/tmp/tool_shed_diff_Aa9wj3"; diff -r _testtoolshed_ _toolshed_
diff -r _testtoolshed_/tool_dependencies.xml _toolshed_/tool_dependencies.xml
4c4
< <repository changeset_revision="0cc94f66d00e" name="package_gatk_1_4" owner="devteam" prior_installation_required="False" toolshed="http://testtoolshed.g2.bx.psu.edu" />
---
> <repository changeset_revision="ec95ec570854" name="package_gatk_1_4" owner="devteam" prior_installation_required="False" toolshed="http://toolshed.g2.bx.psu.edu" />
7c7
< <repository changeset_revision="c0f72bdba484" name="package_samtools_0_1_18" owner="devteam" prior_installation_required="False" toolshed="http://testtoolshed.g2.bx.psu.edu" />
---
> <repository changeset_revision="171cd8bc208d" name="package_samtools_0_1_18" owner="devteam" prior_installation_required="False" toolshed="http://toolshed.g2.bx.psu.edu" />
```

Ignore YAML file and just check difference between main and test tool shed for arbitrary repository.

```
% planemo shed_diff --owner peterjc --name blast_rbh --shed_target_source testtoolshed
/home/john/workspace/planemo/.venv/local/lib/python2.7/site-packages/requests-2.4.3-py2.7.egg/requests/packages/urllib3/connectionpool.py:730: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.org/en/latest/security.html (This warning will only appear once by default.)
  InsecureRequestWarning)
wget -q --recursive -O - 'https://toolshed.g2.bx.psu.edu/repository/download?repository_id=d5dd1c5d2070513e&changeset_revision=default&file_type=gz' | tar -xzf - -C /tmp/tool_shed_diff_II0eAD/_toolshed_ --strip-components 1
wget -q --recursive -O - 'https://testtoolshed.g2.bx.psu.edu/repository/download?repository_id=c053d26daf6271bf&changeset_revision=default&file_type=gz' | tar -xzf - -C /tmp/tool_shed_diff_II0eAD/_testtoolshed_ --strip-components 1
cd "/tmp/tool_shed_diff_II0eAD"; diff -r _testtoolshed_ _toolshed_
diff -r _testtoolshed_/tools/blast_rbh/blast_rbh.py _toolshed_/tools/blast_rbh/blast_rbh.py
35c35
< print "BLAST RBH v0.1.6"
---
> print "BLAST RBH v0.1.5"
110c110
< if blast_type not in ["blastp", "blastp-fast", "blastp-short"]:
---
> if blast_type not in ["blastp", "blastp-short"]:
332c332
< sys.stderr.write("Warning: Sequences with tied best hits found, you may have duplicates/clusters\n")
---
> sys.stderr.write("Warning: Sequencies with tied best hits found, you may have duplicates/clusters\n")
diff -r _testtoolshed_/tools/blast_rbh/blast_rbh.xml _toolshed_/tools/blast_rbh/blast_rbh.xml
1c1
< <tool id="blast_reciprocal_best_hits" name="BLAST Reciprocal Best Hits (RBH)" version="0.1.6">
---
> <tool id="blast_reciprocal_best_hits" name="BLAST Reciprocal Best Hits (RBH)" version="0.1.5">
48d47
< <option value="blastp-fast">blastp-fast - Uses longer words as described by Shiryev et al (2007)</option>
167c166
< <param name="nucl_type" value="blastp-fast"/>
---
> <param name="nucl_type" value="blastp"/>
diff -r _testtoolshed_/tools/blast_rbh/README.rst _toolshed_/tools/blast_rbh/README.rst
65d64
< v0.1.6 - Offer the new blastp-fast task added in BLAST+ 2.2.30.
diff -r _testtoolshed_/tools/blast_rbh/tool_dependencies.xml _toolshed_/tools/blast_rbh/tool_dependencies.xml
4c4
< <repository changeset_revision="268128adb501" name="package_biopython_1_64" owner="biopython" toolshed="https://testtoolshed.g2.bx.psu.edu" />
---
> <repository changeset_revision="5477a05cc158" name="package_biopython_1_64" owner="biopython" toolshed="https://toolshed.g2.bx.psu.edu" />
7c7
< <repository changeset_revision="f69b90d89b62" name="package_blast_plus_2_2_30" owner="iuc" toolshed="https://testtoolshed.g2.bx.psu.edu" />
---
> <repository changeset_revision="0fe5d5c28ea2" name="package_blast_plus_2_2_30" owner="iuc" toolshed="https://toolshed.g2.bx.psu.edu" />
```

Closes #27.
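The first two steps visible in the logged commands (building the download URL, then unpacking the tarball with `--strip-components 1`) might look roughly like this in Python. It is only an illustration of what the log shows, not planemo's implementation. The names `download_url`, `extract_stripped`, and `_demo_archive` are hypothetical, the URL layout is copied from the `wget` lines, and the assumption that every archive member sits under a single top-level directory is inferred from the `--strip-components 1` flag. The demo builds a small archive in memory instead of contacting a Tool Shed.

```python
import io
import os
import tarfile
import tempfile
from urllib.parse import urlencode

MAIN_SHED = "https://toolshed.g2.bx.psu.edu"


def download_url(shed_url, repository_id, changeset_revision="default"):
    """Build a download URL shaped like the ones in the logged wget calls."""
    query = urlencode([
        ("repository_id", repository_id),
        ("changeset_revision", changeset_revision),
        ("file_type", "gz"),
    ])
    return "%s/repository/download?%s" % (shed_url.rstrip("/"), query)


def extract_stripped(archive_bytes, dest):
    """Unpack a .tar.gz into dest, dropping each member's first path component.

    Roughly mirrors `tar -xzf - -C dest --strip-components 1`, for regular
    files only. Returns the sorted relative paths written.
    """
    os.makedirs(dest, exist_ok=True)
    extracted = []
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as tar:
        for member in tar.getmembers():
            parts = member.name.split("/", 1)
            if len(parts) < 2 or not member.isfile():
                continue  # top-level entry, directory, or special file
            rel = parts[1]
            if ".." in rel.split("/"):
                continue  # never write outside dest
            target = os.path.join(dest, *rel.split("/"))
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with open(target, "wb") as out:
                out.write(tar.extractfile(member).read())
            extracted.append(rel)
    return sorted(extracted)


def _demo_archive():
    """Create a tiny in-memory .tar.gz with a made-up top-level directory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, text in [
            ("blast_rbh-abc/tools/blast_rbh.py", "print('x')\n"),
            ("blast_rbh-abc/README.rst", "readme\n"),
        ]:
            data = text.encode()
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


url = download_url(MAIN_SHED, "b6b97c236de89252")
with tempfile.TemporaryDirectory() as tmp:
    names = extract_stripped(_demo_archive(), os.path.join(tmp, "_toolshed_"))

print(url)
print(names)
```

Once both sides are unpacked into sibling directories such as `_local_` and `_toolshed_`, the final step in the log is a plain `diff -r` between them.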
I'm not sure if this falls under the planemo scope, but posting it here for discussion at least.
As part of my workflow of initially releasing tools on the Test Tool Shed, and then if there are no problems with the functional test, uploading them to the main Tool Shed, I would like a "ToolShed diff" command which could be used as follows:
I would like this to output something along the following lines (a bit of a hack using command line tools hg and diff to fetch and compare the files from the ToolShed), here showing a harmless diff in the dependencies:
In terms of the tool command line API, alternative ways to specify the tool sheds might make sense here too. I'd probably set up an alias like this for the typical case where the same author ID owns both:
This would help greatly in spotting when I have forgotten to push an update from the Test Tool Shed to the main Tool Shed.
However, it would be nice to compare any two tools (e.g. alternative versions of a wrapper from two different authors) which would work with the full URL style.