[WIP] Update cd-hit to 4.8.1, add cd-hit-454, cd-hit-dup, cd-hit-lap #3328

Open
wants to merge 20 commits into
base: main
@stephanflemming stephanflemming commented Nov 21, 2020

FOR CONTRIBUTOR:

  • - I have read the CONTRIBUTING.md document and this tool is appropriate for the tools-iuc repo.
  • - License permits unrestricted use (educational + commercial)
  • - This PR adds a new tool or tool collection
  • - This PR updates an existing tool or tool collection
  • - This PR does something else (explain below)

Update: cd-hit, cd-hit-2d, cd-hit-est, cd-hit-est-2d to 4.8.1
Add: cd-hit-454, cd-hit-dup, cd-hit-lap


The current version of this wrapper (4.6.8.1) is not installed on usegalaxy.eu, usegalaxy.org, or usegalaxy.org.au, but several older wrappers are available:

https://github.com/galaxyproject/tools-devteam/tree/master/tools/cd_hit_dup

  • cd-hit-dup
  • cd-hit-auxtools, 0.5-2012-03-07-fix-dan-gh-0.0.1
  • Nov 10, 2015

https://github.com/galaxyproject/tools-iuc/blob/master/tools/cdhit/cd_hit.xml

  • cd-hit, cd-hit-est, cd-hit-2d, cd-hit-est-2d
  • cd-hit, 4.6.8
  • Oct 15, 2018

https://github.com/ASaiM/galaxytools/tree/master/tools/cdhit/

  • cd-hit, cd-hit-est
  • cd-hit, 4.6.4
  • May 4, 2018

And there is an additional tool for cd-hit outputs: https://github.com/ASaiM/galaxytools/tree/master/tools/format_cd_hit_output/ . I am not sure if this is still required and didn't include it here.

There are recipes available for cd-hit and cd-hit-auxtools. Both are based on the same project repository. Maybe it's a good idea to merge them?

The following subcommands are mentioned in the tool wiki but skipped here:

  • cd-hit-clstr_2_blm8: runs, but produces no output
  • cd-hit-clstr_2_blm8.pl: runs, but produces no output
  • cd-hit-div: not needed in Galaxy
  • cd-hit-div.pl: see cd-hit-div
  • cd-hit-otu: does not run
  • cd-hit-para.pl: no longer supported since multi-threaded cd-hit became available
  • cd-hit-2d-para.pl: no longer supported since multi-threaded cd-hit became available
  • h-cd-hit: does not run
  • psi-cd-hit: does not run

@stephanflemming stephanflemming changed the title [WIP] Update cd-hit to 4.8.1 [WIP] Update cd-hit to 4.8.1, add cd-hit-454, cd-hit-dup, cd-hit-lap Nov 21, 2020

@bernt-matthias bernt-matthias left a comment

Looks really good already.

I guess we should remove https://github.com/ASaiM/galaxytools/tree/master/tools/cdhit once this is merged?

</xml>
<xml name="citations">
<citations>
<citation type="doi">10.1093/bioinformatics/17.3.282</citation>

indentation seems off here... and throughout the file

@@ -0,0 +1,169 @@
<tool id="cd_hit_454" name="CD-HIT 454" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@">

please add a recent profile

edam_topics, edam_operations and xrefs would be nice to have as well

-gap $gap
-gap-ext $gapext
$out.bak
-M \${GALAXY_MEMORY_MB:-0}

Is 0 a useful default?
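
For context on this question: `${GALAXY_MEMORY_MB:-0}` is ordinary shell parameter expansion, so the tool receives `-M 0` whenever GALAXY_MEMORY_MB is unset or empty. The cd-hit usage text describes `-M 0` as "no memory limit", so the fallback may be intentional, though that is worth confirming against the 4.8.1 help output. A minimal sketch of the expansion follows; `memory_flag` is a hypothetical helper used only for illustration, and 8192 is an arbitrary example value.

```shell
# memory_flag is a hypothetical helper, not part of the wrapper.
# The body is wrapped in parentheses so it runs in a subshell and
# leaves the caller's environment untouched.
memory_flag() (
    if [ -n "$1" ]; then
        GALAXY_MEMORY_MB="$1"      # simulate a job with a memory allocation
    else
        unset GALAXY_MEMORY_MB     # simulate a job without one
    fi
    # Same expansion as in the wrapper: fall back to 0 when unset or empty.
    echo "-M ${GALAXY_MEMORY_MB:-0}"
)

flag_unset=$(memory_flag "")
flag_set=$(memory_flag 8192)

echo "no allocation:   $flag_unset"
echo "8192 MB granted: $flag_set"
```

If an unlimited default is not wanted, the same expansion accepts any other fallback value in place of 0.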


<token name="@LOG@"><![CDATA[
#if $out.log
|& tee -a '$out_log'

I think we need set -eo pipefail
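
To illustrate why this matters: a pipeline's exit status is by default that of its last command, so piping into `tee` reports success even when the tool fails. The sketch below assumes bash; `failing_tool` is a hypothetical stand-in for a cd-hit call that exits non-zero, and `2>&1 |` is used as the long form of bash's `|&`.

```shell
# failing_tool is a hypothetical stand-in for a cd-hit call that exits non-zero.
failing_tool() {
    echo "simulated failure" >&2
    return 1
}

log_file=$(mktemp)

# Without pipefail the pipeline's status is tee's status, so the failure is hidden.
status_without=0
( set +o pipefail; failing_tool 2>&1 | tee -a "$log_file" >/dev/null ) || status_without=$?

# With pipefail the pipeline fails if any command in it fails.
status_with=0
( set -o pipefail; failing_tool 2>&1 | tee -a "$log_file" >/dev/null ) || status_with=$?

rm -f "$log_file"

echo "without pipefail: $status_without"
echo "with pipefail:    $status_with"
```

Without the option, Galaxy would see exit code 0 from the logged command and could mark a failed job as successful.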

@bernt-matthias

What is the state here? Could you fix the conflicts?

Once this is in we could also deprecate cd-hit-dup in the devteam repo (xref galaxyproject/tools-devteam#616)
