Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mothur Unique.seqs -- fails when an optional input is not provided at ORG #5163

Closed
jennaj opened this issue Feb 27, 2023 · 17 comments
Closed
Assignees

Comments

@jennaj
Copy link
Member

jennaj commented Feb 27, 2023

MTS https://toolshed.g2.bx.psu.edu/view/iuc/mothur_unique_seqs/d466580cd706
Tool_id toolshed.g2.bx.psu.edu/repos/iuc/mothur_unique_seqs/mothur_unique_seqs/1.39.5.0

Is the version of the tool installed different between the servers? Appears to be the same but with different behavior, and is why I am posted here.

This tool form might also need to be adjusted: an input name/count is not required for the "name" output choice but it is required for the "count" output choice. Was this introduced when the option for "count" output was added in? The form doesn't change based on that toggle, and probably should.

If the form is updated, could we please have the "name" option be the default instead of "count" -- that is how students following the tutorial use it first: https://training.galaxyproject.org/training-material/topics/metagenomics/tutorials/mothur-miseq-sop/tutorial.html#optimize-files-for-computation

One of the test cases reports an error with automated tests: mvdbeek/usegalaxy-tests#79

Message from the error:

Job Information
Galaxy Tool ID: toolshed.g2.bx.psu.edu/repos/iuc/mothur_unique_seqs/mothur_unique_seqs/1.39.5.0
Command Line
set -o pipefail; export TERM=vt100; ln -s 'None' names.dat && ln -s '/corral4/main/objects/7/2/7/dataset_72710128-ce30-4034-95ac-81412e5ad681.dat' fasta.dat && echo 'unique.seqs( name=names.dat, fasta=fasta.dat, format=count )' | sed 's/ //g' | mothur | tee mothur.out.log
Tool Standard Output
�[H�[J mothur v.1.39.5 Last updated: 3/20/2017 by Patrick D. Schloss Department of Microbiology & Immunology University of Michigan http://www.mothur.org When using, please cite: Schloss, P.D., et al., Introducing mothur: Open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl Environ Microbiol, 2009. 75(23):7537-41. Distributed under the GNU General Public License Type 'help()' for information on the commands that are available For questions and analysis support, please visit our forum at https://www.mothur.org/forum Type 'quit()' to exit program mothur > unique.seqs(name=names.dat,fasta=fasta.dat,format=count) Unable to open names.dat. Trying default /usr/local/bin/mothur/names.dat Unable to open /usr/local/bin/mothur/names.dat. Trying mothur's location /usr/local/bin/mothurnames.dat Unable to open /usr/local/bin/mothurnames.dat [ERROR]: did not complete unique.seqs. mothur > quit ************************************************************ ************************************************************ ************************************************************ Detected 1 [ERROR] messages, please review. ************************************************************ ************************************************************ ************************************************************

@natefoo
Copy link
Member

natefoo commented Mar 28, 2023

This could occur only on .org because of being run in a container. /usr/local/bin/mothur/names.dat would be a container path. I'm not sure if this is related but it seems to indicate that those files should be generated - on other infrastructure such as .eu it may successfully write to the conda dir (which is bad, but would work), but inside the container it fails.

@natefoo
Copy link
Member

natefoo commented Mar 28, 2023

Okay, I should have read a bit more closely - the /usr/local/bin/mothur/names.dat path is only a fallback. The actual issue is caused by the names.dat symlink pointing to None:

[g2main@galaxy-web-03 49203482]$ ls -lh working/names.dat
lrwxrwxrwx. 1 g2main G-803372 4 Mar 28 16:01 working/names.dat -> None
[g2main@galaxy-web-03 49203482]$ grep names.dat tool_script.sh
set -o pipefail; export TERM=vt100;  ln -s 'None' names.dat && ln -s '/corral4/main/objects/a/d/b/dataset_adbdf650-a1a8-4569-86ee-e01f34eccc24.dat' fasta.dat &&  echo 'unique.seqs( name=names.dat, fasta=fasta.dat, format=name )' | sed 's/ //g' | mothur | tee mothur.out.log

However, that is indeed the case with the particular bug report history I tried - there is no selectable mothur.names or mothur.count file in the history. This input is optional, so the tool can still run, but seems not to function correctly in this case.

@natefoo
Copy link
Member

natefoo commented Mar 29, 2023

usegalaxy.eu is experiencing this now after upgrading to 23.0. The issue is with the handling of the optional names input when it is left blank/unspecified. Prior to 23.0, the conditional in the tool that evaluated names would return false for both branches and neither the name or count param would be included in the unique.seqs() call, which is the correct behavior when names is not specified.

In 23.0, the first branch is evaluating true even when the names input is unspecified, likely due to the following:

galaxy.security.object_wrapper WARNING 2023-03-29 13:54:36,344 [pN:main.1,p:201504,tN:WSGI_1] Unable to create dynamic subclass SafeStringWrapper__galaxy.model.none_like.None__NoneType__NotImplementedType__Number__SafeStringWrapper__ToolParameterValueWrapper__bool__bytearray__ellipsis for <class 'galaxy.model.none_like.NoneDataset'>, None: type() doesn't support MRO entry resolution; use types.new_class()

@jennaj
Copy link
Member Author

jennaj commented Mar 29, 2023

Thanks Nate, that block is where I thought the problem was too.

ping @shiltemann and @bernt-matthias -- could either of you help? Or, suggest a someone else? This will impact the Smorg event -- would only work at AU (22.05) and only if they don't upgrade.

@mvdbeek mvdbeek self-assigned this Mar 29, 2023
@mvdbeek
Copy link
Member

mvdbeek commented Mar 29, 2023

I don't think the warning is related, but it should be straightforward to bisect the change. I'll take a look tomorrow

@natefoo
Copy link
Member

natefoo commented Mar 29, 2023

I'm already bisecting, I'll let you know what I find.

@bernt-matthias
Copy link
Contributor

I'm trying as well .. can't reproduce so far :(

<tool id="data_optional" name="data select" version="1.0.0">
    <command><![CDATA[
        echo INPUT $names >> '$output' &&
        echo ISOFTYPE $names.is_of_type("text") >> '$output'
        #if $names.is_of_type("text")
            echo "IF_IS_TRUE"
        #end if
    ]]></command>
    <inputs>
        <param name="names" type="data" format="mothur.names,mothur.count_table" optional="true"/>
    </inputs>
    <outputs>
        <data name="output" format="txt"/>
    </outputs>
    <tests>
        <test>
            <output name="output">
                <assert_contents>
                    <has_text text="INPUT None" />
                    <has_line line="ISOFTYPE False" />
                    <has_line line="IF_IS_TRUE" negate="true" />
                </assert_contents>
            </output>
        </test>
    </tests>
</tool>

@bernt-matthias
Copy link
Contributor

Fancy.. if I change the if to #if $names.is_of_type("mothur.names") it fails

@natefoo
Copy link
Member

natefoo commented Mar 29, 2023

The first breaking commit is 130b97b539689591b20c941a7ba907f75f9944a0 #14356

@bernt-matthias
Copy link
Contributor

Same here 130b97b539689591b20c941a7ba907f75f9944a0

@natefoo
Copy link
Member

natefoo commented Mar 29, 2023

Specifically this bit:

diff --git a/lib/galaxy/tools/evaluation.py b/lib/galaxy/tools/evaluation.py
index 2e112cd710..ae211ecf0e 100644
--- a/lib/galaxy/tools/evaluation.py
+++ b/lib/galaxy/tools/evaluation.py
@@ -330,7 +330,7 @@ class ToolEvaluator:
                 dataset = input_values[input.name]
                 wrapper_kwds = dict(
                     datatypes_registry=self.app.datatypes_registry,
-                    tool=self,
+                    tool=self.tool,
                     name=input.name,
                     compute_environment=self.compute_environment,
                 )
@@ -343,7 +343,7 @@ class ToolEvaluator:
                 wrapper_kwds = dict(
                     datatypes_registry=self.app.datatypes_registry,
                     compute_environment=self.compute_environment,
-                    tool=self,
+                    tool=self.tool,
                     name=input.name,
                 )
                 wrapper = DatasetCollectionWrapper(job_working_directory, dataset_collection, **wrapper_kwds)
@@ -390,7 +390,7 @@ class ToolEvaluator:
             if not isinstance(param_dict_value, (DatasetFilenameWrapper, DatasetListWrapper)):
                 wrapper_kwds = dict(
                     datatypes_registry=self.app.datatypes_registry,
-                    tool=self,
+                    tool=self.tool,
                     name=name,
                     compute_environment=self.compute_environment,
                 )

@bernt-matthias
Copy link
Contributor

Yep, undoing this seemed to fix it for me.

@natefoo
Copy link
Member

natefoo commented Mar 29, 2023

It'll presumably break type checking though... @nsoranzo?

@bernt-matthias
Copy link
Contributor

xref galaxyproject/galaxy#15870 .. so far the test case .. will add the potential fix once we see the test failing ..

This particular change implemented in the commit seems correct but still broke something. My next guess is that in the following part tool.inputs always raised an exception and ext was set to "data".

diff --git a/lib/galaxy/tools/wrappers.py b/lib/galaxy/tools/wrappers.py
index ecf61b4e24..ecc2637877 100644
--- a/lib/galaxy/tools/wrappers.py
+++ b/lib/galaxy/tools/wrappers.py
@@ -41,6 +41,7 @@ if TYPE_CHECKING:
     from galaxy.model.metadata import MetadataCollection
     from galaxy.tools import Tool
     from galaxy.tools.parameters.basic import (
+        BaseDataToolParameter,
         SelectToolParameter,
         ToolParameter,
     )
@@ -337,11 +338,16 @@ class DatasetFilenameWrapper(ToolParameterValueWrapper):
         formats: Optional[List[str]] = None,
     ) -> None:
         if not dataset:
-            try:
-                # TODO: allow this to work when working with grouping
-                ext = tool.inputs[name].extensions[0]  # type: ignore[union-attr]
-            except Exception:
-                ext = "data"
+            ext = "data"
+            if tool is not None and name is not None:
+                try:
+                    tool_input = tool.inputs[name]
+                    if TYPE_CHECKING:
+                        assert isinstance(tool_input, BaseDataToolParameter)
+                    # TODO: allow this to work when working with grouping
+                    ext = tool_input.extensions[0]
+                except Exception:
+                    pass
             self.dataset = cast(
                 DatasetInstance,
                 wrap_with_safe_string(

@natefoo
Copy link
Member

natefoo commented Mar 29, 2023

I believe you're right, if you apply the tools/evaluation.py changes and then force the extension to data (obviously the wrong fix, but just to confirm your guess), the tool works:

diff --git a/lib/galaxy/tools/wrappers.py b/lib/galaxy/tools/wrappers.py
index ecf61b4e24..8e23375a5e 100644
--- a/lib/galaxy/tools/wrappers.py
+++ b/lib/galaxy/tools/wrappers.py
@@ -337,11 +337,7 @@ class DatasetFilenameWrapper(ToolParameterValueWrapper):
         formats: Optional[List[str]] = None,
     ) -> None:
         if not dataset:
-            try:
-                # TODO: allow this to work when working with grouping
-                ext = tool.inputs[name].extensions[0]  # type: ignore[union-attr]
-            except Exception:
-                ext = "data"
+            ext = "data"
             self.dataset = cast(
                 DatasetInstance,
                 wrap_with_safe_string(

@bernt-matthias
Copy link
Contributor

I would say that

 ext = tool.inputs[name].extensions[0]

just makes no sense (it would assume the first dataset format defined by the parameter) .. and our luck was that it never worked so far .. until @nsoranzo fixed it :)
I will adapt my test such that it verifies it (I have two allowed formats .. for the first is_of_type should get True and for the second False).
The NoneDataset that is assigned to self.dataset is only used for getting the extension and is_of_type. So if a data parameter is left empty we will have "data" as ext.

I will try to replace the part with ext = "data" .. which would also fix the TODO by removing it ;)

bernt-matthias added a commit to bernt-matthias/galaxy that referenced this issue Mar 29, 2023
bernt-matthias added a commit to bernt-matthias/galaxy that referenced this issue Mar 29, 2023
bernt-matthias added a commit to bernt-matthias/galaxy that referenced this issue Mar 29, 2023
mvdbeek pushed a commit to mvdbeek/galaxy that referenced this issue Mar 30, 2023
mvdbeek added a commit to mvdbeek/galaxy that referenced this issue Mar 30, 2023
Tools may check for `is_of_type` if optional datasets to determine if
the dataset has been provided.

We should probably also drop the `tool` and `name` arguments that aren't
consumed anywhere.
mvdbeek added a commit to mvdbeek/galaxy that referenced this issue Mar 30, 2023
Tools may check for `is_of_type` if optional datasets to determine if
the dataset has been provided.

We should probably also drop the `tool` and `name` arguments that aren't
consumed anywhere.
mvdbeek pushed a commit to mvdbeek/galaxy that referenced this issue Mar 30, 2023
mvdbeek added a commit to mvdbeek/galaxy that referenced this issue Mar 30, 2023
Tools may check for `is_of_type` if optional datasets to determine if
the dataset has been provided.

We should probably also drop the `tool` and `name` arguments that aren't
consumed anywhere.
@natefoo
Copy link
Member

natefoo commented Mar 30, 2023

The fix was merged in galaxyproject/galaxy#15871 and has been deployed to usegalaxy.org, I have tested a previously failing job and it is working now.

@natefoo natefoo closed this as completed Mar 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants