BUG: Multiple values for 'n_jobs' or 'threads' #18

ChrisKeefe · 2020-06-11T17:37:35Z

Attempts to diagnose and fix a bug initially discovered here, in which the beta diversity n_jobs and threads parameters are passed multiple values when called through the framework with 'auto'.

thermokarst · 2020-06-11T17:52:48Z

I was doing a whole-repo review this morning, as promised, and this bit of code looks like it has issues, which are probably related to the CI error:

q2-diversity-lib/q2_diversity_lib/_util.py

Lines 93 to 99 in 2995d07

    
    if cpus_requested == 'auto':
        # remove 'auto' from args to prevent 'multiple values' TypeError...
        argslist = list(args)
        argslist.remove('auto')
        return_args = tuple(argslist)
        # ...then inject number of available cpus
        return wrapped_function(*return_args, **kwargs, **{param_name: cpus})

Not sure if this is just a copy-and-paste error, but the original error you linked to above says:

Multiple values for argument thread

while the CI error is:

TypeError: unweighted_unifrac() got multiple values for argument 'threads'

Note the difference between the two errors, in particular the offending variable name (thread vs threads).

thermokarst · 2020-06-11T17:59:27Z

Here is a candidate for an idiomatic solution: https://docs.python.org/3.6/library/inspect.html#inspect.Signature.replace
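The commit that later fixes the error is described as modifying BoundArguments.arguments, which comes from the same inspect machinery. A minimal sketch of that approach follows, assuming a decorator named resolve_auto_cpus and a stand-in unweighted_unifrac with a (table, phylogeny, threads) signature; both are hypothetical, os.cpu_count is used only to keep the sketch standard-library-only, and this is not the code from the linked branch.

```python
import functools
import inspect
import os


def resolve_auto_cpus(param_name):
    """Replace the string 'auto' for `param_name` with a CPU count.

    Hypothetical decorator for illustration; not the plugin's real helper.
    """
    def decorator(func):
        sig = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # bind() maps every positional and keyword argument to its
            # parameter name, so it does not matter how the caller passed it
            bound = sig.bind(*args, **kwargs)
            bound.apply_defaults()
            if bound.arguments[param_name] == 'auto':
                # Overwrite the value in place instead of appending a second one
                bound.arguments[param_name] = os.cpu_count() or 1
            # bound.args / bound.kwargs are rebuilt from bound.arguments
            return func(*bound.args, **bound.kwargs)
        return wrapper
    return decorator


@resolve_auto_cpus('threads')
def unweighted_unifrac(table, phylogeny, threads=1):
    # Stand-in body: just report the thread count that was received
    return threads
```

Because the value is replaced inside the bound arguments, the wrapped function receives exactly one value for threads whether 'auto' arrived positionally or by keyword.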

thermokarst · 2020-06-11T18:35:45Z

Here is a draft implementation, in case you get stuck: ChrisKeefe/q2-diversity-lib@multiple_values...thermokarst:multiple_values_thermokarst

I wanted to get my thoughts in order, which wound up producing a functional result, so I figured I would share.


ChrisKeefe · 2020-06-11T19:34:35Z

Here is a draft implementation, in case you get stuck: ChrisKeefe/q2-diversity-lib@multiple_values...thermokarst:multiple_values_thermokarst

This is much prettier (and more direct) than whatever I was doing with argument order convention/keyword-only arguments. The docs, as usual, spell it out. Thanks!

q2_diversity_lib/_util.py

q2_diversity_lib/tests/test_beta.py

thermokarst · 2020-06-11T20:01:42Z

q2_diversity_lib/tests/test_beta.py

+        self.unweighted_unifrac_thru_framework(self.table_as_artifact,
+                                               self.tree_as_artifact,
+                                               threads=2)
+        self.unweighted_unifrac_thru_framework(self.table_as_artifact,


Good tests! Do you have a similar test, for checking if n_jobs == auto works as expected?

Hoo boy am I glad you asked. New commit with breaking test incoming. Something very messy is happening. I suspect it's related to the fact that everything is being unpacked into positional arguments in our return statement, but I'm really not sure.

Some notes:
jaccard fails with TypeError: jaccard() got multiple values for argument 'n_jobs'
unifrac methods don't trigger test failures, however...
when jaccard fails, a stderr message is exposed that comes from unifrac and makes me think it's getting a memory address or something instead of the integer we're trying to deliver:

------------------------------------ Captured stderr call -------------------------------------
More threads were requested than stripes. Using 342021232 threads.

The number in the error message is a different large integer (positive or negative) every time.
These stderr lines appear only when there is a test that uses auto. Integer arguments are clean, which doesn't seem to jibe with my current hypothesis on why this is happening. 🤷‍♂️

I think you might've found a bug somewhere in the framework or test harness! First off, the main problem is in the test data itself:

self.jaccard_thru_framework(self.table_as_artifact, self.tree_as_artifact, n_jobs=2)

Note, you are passing in a tree as the second param, even though this method doesn't take a tree. That appears to be your second n_jobs (which is what I think might be a bug in the framework).

While playing with this, my next thought was, "well, is this somehow related to the decorator signature munging?" - I removed the decorators and tested, the answer appears to be "no". Put another way, I don't see how this could be related to your decorator utils (yet). This test fails either way, decorated or not.

Applying the following patch to 8dfdb0b has a 100% passing test suite:

diff --git a/q2_diversity_lib/tests/test_util.py b/q2_diversity_lib/tests/test_util.py
index 13cacab..8db7a63 100644
--- a/q2_diversity_lib/tests/test_util.py
+++ b/q2_diversity_lib/tests/test_util.py
@@ -181,10 +181,8 @@ class ValidateRequestedCPUsTests(TestPluginBase):
                                                self.tree_as_artifact,
                                                threads='auto')
         self.jaccard_thru_framework(self.table_as_artifact,
-                                    self.tree_as_artifact,
                                     n_jobs=2)
-        # self.jaccard_thru_framework(self.table_as_artifact,
-        #                             self.tree_as_artifact,
-        #                             n_jobs='auto')
+        self.jaccard_thru_framework(self.table_as_artifact,
+                                    n_jobs='auto')
         # If we get here, then it ran without error
         self.assertTrue(True)

Here's a MWE:

import qiime2
from qiime2.plugins import demux
dmx = qiime2.Artifact.load('../data/moving-pictures/demux.qza')
demux.actions.summarize(dmx, dmx, dmx, dmx, n=dmx)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-2-87df907aed63> in <module>
      2 from qiime2.plugins import demux
      3 dmx = qiime2.Artifact.load('../data/moving-pictures/demux.qza')
----> 4 demux.actions.summarize(dmx, dmx, dmx, dmx, n=dmx)

TypeError: summarize() got multiple values for argument 'n'

To clarify my thinking RE a bug in the framework: I know that this "got multiple values" business is just Python doing what Python does, but I think there is a guard or two that we might be missing, although I'm not sure.
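For reference, the same kind of message can be reproduced in plain Python with no framework involved. This is a minimal sketch: the jaccard function below is a hypothetical stand-in with an assumed (table, n_jobs) signature, not the plugin's real one.

```python
def jaccard(table, n_jobs=1):
    # Hypothetical stand-in; only the parameter layout matters here
    return n_jobs


try:
    # 'tree' is an extra positional argument, so it lands in the n_jobs slot;
    # the keyword n_jobs=2 then tries to fill the same slot a second time
    jaccard('table', 'tree', n_jobs=2)
except TypeError as err:
    message = str(err)

print(message)
```

The surplus positional argument is reported as a duplicate n_jobs rather than as "too many arguments", which is why the stray tree in the test call looks like a second n_jobs.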

Totally agreed on this - seems like we should be doing a better job of checking argument counts somewhere. I'll go poking around with this in a little bit.

For now, I've figured out what's causing the stderr messages from unifrac; I'm just not sure why they display the numbers that they do. In every case where that stderr output appears, we were working with a small (3-sample) table, which unifrac can handle using a maximum of two threads. Using a larger table removes the issue, as does requesting a smaller number of threads. The 'auto' parameter had nothing to do with it, beyond the fact that it was requesting 3 threads, one more than the max.

I'm wondering:

Why is unifrac reporting using 46530796 or -132493215 threads? ✔️

Should we be "catching" this error somehow, and passing a warning along to users? ❌

PR raised on unifrac ✔️


ChrisKeefe · 2020-06-12T23:12:10Z

Quick summary for posterity:
Based on this and this, @thermokarst hypothesized that psutil might be picking up the host machine's CPU count rather than the VM's CPU count. Some digging into the psutil issue tracker (1, 2) suggests that psutil is currently unable to generate accurate physical CPU counts on systems with multiple sockets. This may be fixed here, but for now there's not much we can do about it.

Sooooo, I'm patching psutil in the breaking tests temporarily to prevent failure. Issue incoming to keep this on the radar.
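As a rough illustration of what pinning the CPU count in a test can look like, here is a sketch using unittest.mock. It is written under loud assumptions: resolve_cpus is a hypothetical helper, and os.cpu_count stands in for the psutil call so the example runs with only the standard library. The actual tests patch psutil, and their exact patch target is not shown in this thread.

```python
import os
import unittest
from unittest import mock


def resolve_cpus(requested):
    # Hypothetical helper: turn 'auto' into the CPU count reported by the OS
    if requested == 'auto':
        return os.cpu_count() or 1
    return requested


class ResolveCPUsTests(unittest.TestCase):
    def test_auto_uses_patched_count(self):
        # Pin the reported CPU count so the test does not depend on the
        # machine (or VM host) it happens to run on
        with mock.patch('os.cpu_count', return_value=2):
            self.assertEqual(resolve_cpus('auto'), 2)

    def test_integer_passes_through(self):
        self.assertEqual(resolve_cpus(4), 4)
```

The patch is scoped to the with block, so the real CPU lookup is restored as soon as the assertion has run.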

ChrisKeefe added 2 commits June 11, 2020 10:31

Adds breaking test of cpu_request through framework

d553c18

MAINT: import/consistency cleanup

770006b

ChrisKeefe marked this pull request as draft June 11, 2020 17:37

ChrisKeefe added 3 commits June 11, 2020 11:50

Improves variable names

86a20ab

Co-authored by: thermokarst <matthewrdillon@gmail.com>

SQUASH: more var names

9823460

Co-authored by thermokarst <mathewrdillon@gmail.com>

BUG: fixes multiple values error by modifying BoundArguments.arguments

dbe3098

Co-authored-by: thermokarst <matthewrdillon@gmail.com>

thermokarst suggested changes Jun 11, 2020


ChrisKeefe added 8 commits June 11, 2020 14:15

TIDY: cleans up logic in _util.py

987fc70

Adds breaking test for n_jobs passed "auto"

3e27a29

LINT: removes unnecessary imports

8dfdb0b

fixes bad tests

703c407

Adds targeted breaking test for more-threads-than-features

00a8acb

NAME: clarifies test name

7f5872c

MAINT: Tidies up

f6a7a66

HMMM: is stderror breaking GHActions?

fb21d89

thermokarst reviewed Jun 12, 2020


ChrisKeefe added 7 commits June 12, 2020 14:24

removes psutil patches from failing tests

de312cb

TEMP: display stdout/stderr captures

90c8859

HACK: diagnosis through amputation

3d51eb6

HACK: glue one test back on

21e3646

Revert "HACK: glue one test back on"

eee1ab4

This reverts commit 21e3646.

Revert "HACK: diagnosis through amputation"

1eba024

This reverts commit 3d51eb6.

HACK: re-instate patches to smooth over psutil issues

9460a47

thermokarst approved these changes Jun 12, 2020


ChrisKeefe mentioned this pull request Jun 12, 2020

Remove psutil patches from test_util.py "through-the-framework" tests #20

Open

ChrisKeefe marked this pull request as ready for review June 12, 2020 23:12

ChrisKeefe mentioned this pull request Jun 12, 2020

NEW: Adds core beta diversity measures #6

Merged

thermokarst merged commit 8f4e655 into qiime2:master Jun 15, 2020

ChrisKeefe deleted the multiple_values branch November 30, 2020 00:50
