Checks for multioutput plugins and loop dependencies #312

darrylmasson · 2020-09-02T15:03:05Z

What is the problem / what does the code in this PR do

If the first dependency of a loop plugin is produced by a multi-output plugin, the automatic determination of loop_over fails. In that case loop_over = self.deps[self.depends_on[0]].data_kind (plugin.py L610) evaluates to a dictionary rather than a string, which then fails four lines further down when kwargs is dereferenced.

Can you briefly describe how it works?

This PR checks whether the first dependency is provided by a multi-output plugin (i.e. whether its data_kind is a dictionary) and, if so, asks the user to be more specific. Fixes #311.

WenzDaniel · 2020-09-02T23:09:11Z

Looks good. Maybe add an MWE here, since it took me quite some time to get deps initialized:

plugins = st._get_plugins(targets=('events',), run_id='0')  # Generates plugins up to events
loop_over = plugins['records'].deps[plugins['records'].depends_on[0]].data_kind

returns a dict

plugins = st._get_plugins(targets=('events',), run_id='0')  # Generates plugins up to events
loop_over = plugins['records'].deps[plugins['records'].depends_on[0]].data_kind
loop_over[plugins['records'].depends_on[0]]

returns 'raw_records'
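For anyone without a strax context at hand, the same behaviour can be sketched with plain Python stand-ins. FakeReader and FakeRecords are hypothetical classes, not strax plugins, and 'other_output' is a made-up data type. The sketch also assumes that the failing lookup is a plain kwargs[loop_over].

```python
class FakeReader:
    """Hypothetical stand-in for a multi-output plugin."""
    # For a multi-output plugin, data_kind maps each data type to its kind.
    data_kind = {'raw_records': 'raw_records', 'other_output': 'other_kind'}


class FakeRecords:
    """Hypothetical stand-in for a plugin that depends on FakeReader."""
    depends_on = ('raw_records',)

    def __init__(self):
        self.deps = {'raw_records': FakeReader()}


plugin = FakeRecords()
first_dep = plugin.depends_on[0]

# Same expression as in plugin.py: this is a dict, not a string.
loop_over = plugin.deps[first_dep].data_kind

# Assumed form of the later lookup: a dict cannot be used as a dict key.
kwargs = {'raw_records': [1, 2, 3]}
try:
    kwargs[loop_over]
    lookup_failed = False
except TypeError:
    lookup_failed = True

# Indexing with the dependency name gives the data kind as a string.
kind = loop_over[first_dep]
```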

WenzDaniel · 2020-09-02T23:12:23Z

strax/plugin.py

@@ -608,6 +608,9 @@ def compute(self, **kwargs):
            loop_over = self.loop_over
        else:
            loop_over = self.deps[self.depends_on[0]].data_kind
+            if isinstance(loop_over, dict):


You have to change this line to isinstance(loop_over, (dict, immutabledict)), since the DAQReader has an immutabledict as its data_kind. See:

plugins = st._get_plugins(targets=('events',), run_id='0')
plugins['records'].deps[plugins['records'].depends_on[0]].data_kind
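To illustrate why a plain dict check can miss a read-only mapping, here is a small sketch using the standard library's types.MappingProxyType as a stand-in. It is not the immutabledict package, and that immutabledict does not subclass dict is an assumption taken from the comment above. Checking against collections.abc.Mapping is one possible alternative, not something proposed in this PR.

```python
from collections.abc import Mapping
from types import MappingProxyType

# Read-only mapping used as a stand-in for an immutable data_kind.
data_kind = MappingProxyType({'raw_records': 'raw_records'})

# The plain dict check does not recognise the read-only mapping ...
is_dict = isinstance(data_kind, dict)

# ... while the abstract Mapping check covers dict and read-only mappings.
is_mapping = isinstance(data_kind, Mapping)
plain_dict_is_mapping = isinstance({'raw_records': 'raw_records'}, Mapping)
```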

WenzDaniel · 2020-09-02T23:13:28Z

strax/plugin.py

@@ -608,6 +608,9 @@ def compute(self, **kwargs):
            loop_over = self.loop_over
        else:
            loop_over = self.deps[self.depends_on[0]].data_kind
+            if isinstance(loop_over, dict):
+                loop_over = loop_over[self.depends_on[0]]
+                # We can't have nested dictionaries here, can we?


I think you showed that we cannot, so maybe remove this comment.

JoranAngevaare

Thanks Darryl for spotting this. The error is indeed very uninformative, and unpacking it from a multi-output plugin like this will help in recognizing when something is off or badly constructed.

Maybe it's good to add that the high-level problem is that this plugin wasn't designed for these kinds of low-level datatypes. You will (with anything below the non-overlapping peaklets) run into this assertion error:

strax/strax/plugin.py

Line 619 in a19cc21

assert np.all(base[1:]['time'] >= strax.endtime(base[:-1])), \

Your base needs to be split and sorted by time. Records have overlapping time information and therefore aren't suitable for this kind of plugin.
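As a rough illustration of what that assertion requires, here is a NumPy sketch. The explicit endtime field and the toy arrays are assumptions for the example, and strax.endtime is replaced by reading that field directly.

```python
import numpy as np

# Toy dtype: each entry has a start time and an (exclusive) end time.
dtype = [('time', np.int64), ('endtime', np.int64)]


def is_non_overlapping(base):
    """True if every entry starts at or after the end of the previous one."""
    return bool(np.all(base[1:]['time'] >= base[:-1]['endtime']))


# Sorted and non-overlapping, like the base a loop plugin expects.
peak_like = np.array([(0, 10), (10, 25), (30, 40)], dtype=dtype)

# Overlapping in time, like records from different channels.
record_like = np.array([(0, 10), (5, 15), (12, 20)], dtype=dtype)

peak_ok = is_non_overlapping(peak_like)
record_ok = is_non_overlapping(record_like)
```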

JoranAngevaare · 2020-09-03T07:49:52Z

strax/plugin.py

@@ -608,6 +608,9 @@ def compute(self, **kwargs):
            loop_over = self.loop_over
        else:
            loop_over = self.deps[self.depends_on[0]].data_kind
+            if isinstance(loop_over, dict):
+                loop_over = loop_over[self.depends_on[0]]
+                # We can't have nested dictionaries here, can we?


Perhaps you can be on the safe side by making a check just below:

if not isinstance(loop_over, str):
    raise TypeError(f'Trying to loop over {loop_over} which is not a string? Please add loop_over = <base> to your plugin')

darrylmasson · 2020-09-03T08:25:04Z

Maybe it's good to add that the high-level problem is that this plugin wasn't designed for these kinds of low-level datatypes.

So what's the high-level solution we want to aim for? Prevent loop plugins from depending on records or lower (or other overlapping data types)?

JoranAngevaare · 2020-09-03T14:32:18Z

Hi Darryl,
Thanks, I think your solution is fine. For peaklets, for example, this should work (although presumably slowly). We just shouldn't expect this plugin to work for things like records.

The lines I suggested were in addition to what you added before.

Combining these would lead to:

...
if isinstance(loop_over, (dict, immutabledict)):
    loop_over = loop_over[self.depends_on[0]]
if not isinstance(loop_over, str):
    raise TypeError("Please add \"loop_over = <base>\""
                            " to your plugin definition")
...

WenzDaniel · 2020-09-03T14:49:20Z

Would your proposal not make the lines above obsolete?

if not isinstance(loop_over, str):
    raise TypeError("Please add \"loop_over = <base>\""
                            " to your plugin definition")

Would always be raised when

if isinstance(loop_over, (dict, immutabledict)):
    loop_over = loop_over[self.depends_on[0]]

is true. Hence we could remove it.

JoranAngevaare · 2020-09-03T14:54:17Z

Would your proposal not make the lines above obsolete?
if not isinstance(loop_over, str):
    raise TypeError("Please add \"loop_over = <base>\""
                            " to your plugin definition")
Would always be raised when
if isinstance(loop_over, (dict, immutabledict)):
    loop_over = loop_over[self.depends_on[0]]
is true. Hence we could remove it.

No, you redefine loop_over on the line just above it.
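A standalone sketch of the combined logic shows why the TypeError is not raised after a successful unpacking. resolve_loop_over is a hypothetical helper written for this illustration, and only dict is checked so that the example runs without the immutabledict package.

```python
def resolve_loop_over(data_kind, first_dep):
    """Hypothetical helper mirroring the two checks discussed above."""
    loop_over = data_kind
    if isinstance(loop_over, dict):
        # Multi-output plugin: pick the kind of the first dependency.
        loop_over = loop_over[first_dep]
    if not isinstance(loop_over, str):
        # Only reached if the unpacked value is still not a string.
        raise TypeError('Please add "loop_over = <base>" to your plugin definition')
    return loop_over


# The dict is unpacked to a string first, so no error is raised.
from_multi_output = resolve_loop_over({'raw_records': 'raw_records'}, 'raw_records')

# A single-output plugin already provides a string.
from_single_output = resolve_loop_over('peaks', 'peaks')

# A nested dictionary would still be caught by the second check.
try:
    resolve_loop_over({'raw_records': {'nested': 'kind'}}, 'raw_records')
    nested_raised = False
except TypeError:
    nested_raised = True
```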

darrylmasson · 2020-09-04T10:47:16Z

If we do go for something that looks like this:

if hasattr(self, 'loop_over'):
    loop_over = self.loop_over
else:
    loop_over = self.deps[self.depends_on[0]].data_kind
if isinstance(loop_over, (dict, immutabledict)):
    loop_over = loop_over[self.depends_on[0]]
if not isinstance(loop_over, str):
    raise TypeError("message")

The second isinstance will only do anything in the situation where a plugin is particularly convoluted. I would argue to skip the first check. The second is a catch-all case that should handle any ambiguous situation.

JoranAngevaare · 2020-09-08T07:52:29Z

The first isinstance changes loop_over and is not a check as such. I don't follow your reasoning, as you will end up with the TypeError for multi-output plugins if you skip the first isinstance.

darrylmasson · 2020-09-08T12:14:03Z

That's my point - the second isinstance is a catch-all that should handle any ambiguous situation, and multioutput plugins are at least somewhat ambiguous in this regard. Rather than individually handle these cases, only having if not isinstance(loop_over, str) both makes the code easier to maintain and also deals with any possible future features that could affect this in some way.

JoranAngevaare · 2020-09-08T13:21:48Z

There is nothing wrong with multi-output plugins per se, but perhaps you are right that, rather than doing the thinking for the analysts, just accepting a simple format (and throwing an error otherwise) is clearer.
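The catch-all variant discussed here could look roughly like the sketch below. Dep, LoopSketch and ExplicitLoopSketch are hypothetical classes for illustration, and the exact code that was merged may differ.

```python
class Dep:
    """Hypothetical dependency that only carries a data_kind."""

    def __init__(self, data_kind):
        self.data_kind = data_kind


class LoopSketch:
    """Hypothetical plugin with only the catch-all string check."""
    depends_on = ('raw_records',)

    def __init__(self, deps):
        self.deps = deps

    def infer_loop_over(self):
        if hasattr(self, 'loop_over'):
            loop_over = self.loop_over
        else:
            loop_over = self.deps[self.depends_on[0]].data_kind
        if not isinstance(loop_over, str):
            # Anything ambiguous is left for the plugin author to resolve.
            raise TypeError('Please add "loop_over = <base>" to your plugin definition')
        return loop_over


class ExplicitLoopSketch(LoopSketch):
    # The plugin author states the base explicitly.
    loop_over = 'raw_records'


deps = {'raw_records': Dep({'raw_records': 'raw_records'})}

try:
    LoopSketch(deps).infer_loop_over()
    raised = False
except TypeError:
    raised = True

explicit = ExplicitLoopSketch(deps).infer_loop_over()
```

With this variant, a plugin whose first dependency comes from a multi-output plugin gets the TypeError until loop_over is set explicitly on the class.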

WenzDaniel

Neat

Checks for multioutput plugins

a19cc21

darrylmasson requested a review from JoranAngevaare September 2, 2020 15:03

WenzDaniel requested changes Sep 2, 2020

JoranAngevaare reviewed Sep 3, 2020

Throws instead of figuring it out

3853894

JoranAngevaare approved these changes Sep 8, 2020

Cleanup

be6a896

WenzDaniel approved these changes Sep 9, 2020

WenzDaniel merged commit 06fd303 into AxFoundation:master Sep 9, 2020
