Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Signals and ATP v2 #98

Merged
merged 36 commits into from
Sep 21, 2023
Merged

Signals and ATP v2 #98

merged 36 commits into from
Sep 21, 2023

Conversation

jaredoconnell
Copy link
Contributor

@jaredoconnell jaredoconnell commented Aug 25, 2023

Changes introduced with this PR

This PR adds signals to the schema types, and adds support for ATP v2.

For backwards compatibility, the old format of the step decorator is available for plugins that don't require signals.
For plugins that require signals, or for steps that benefit from their own object, you use @step_with_signals instead of @step, and you pass in two items. The first is the constructor (or a lambda that calls the constructor) for the object that has the step and signal methods, and the second is the function names for the signals. The signals are then processed later.

I needed to make some compromises to workaround Python's limitations.
For passing in associated signals with steps, I needed to pass in method names then load the methods later instead of passing in references to the methods, because the decorator is called before the class is fully defined.


By contributing to this repository, I agree to the contribution guidelines.

@jaredoconnell jaredoconnell marked this pull request as ready for review September 18, 2023 16:03
src/arcaflow_plugin_sdk/atp.py Show resolved Hide resolved
src/arcaflow_plugin_sdk/test_atp.py Show resolved Hide resolved
src/arcaflow_plugin_sdk/atp.py Show resolved Hide resolved
@jdowni000
Copy link
Contributor

This looks great to me. Let's handle the things Matt found. Great work.

@jaredoconnell jaredoconnell merged commit 44b451e into main Sep 21, 2023
4 checks passed
@jaredoconnell jaredoconnell deleted the signals branch September 21, 2023 16:15
Copy link
Contributor

@webbnh webbnh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that this is merged, I can comment with impunity (especially since I'm about to go on PTO and won't be able to resolve the "conversations"...). 😆

I didn't actually finish the review, but I'm out of time, so here's what I got.

Most of my comments are trivialities which you can take for what they are worth, but I did find a couple of things worth looking closely at:

  • There are a couple of places where you are flushing stdin which seems potentially problematic.
  • There is a place where you're doing a CV wait which looks improperly structured (although, it's possible that it doesn't matter as much in Python as it does in a "real language" 😉).

Other items of somewhat lesser note:

  • Catching SystemExit is a questionable thing to do. With your changes, it doesn't look like it's needed any more, but, if it is, it should be closely examined to see if it's the "right" mechanism.
  • There's some I/O redirection which looks weird, to me (but, maybe it's just one more "special thing" that Python can do...).
  • The error handling for CBORDecodeError seems inconsistent.

Cheers!

Comment on lines -19 to +22
import signal
import sys
import typing
import threading
import signal
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you keep your import lists ordered? (E.g., you might want to try out isort.)

Should the import of enum be in this block?

Comment on lines +37 to +39
WORK_DONE = 1
SIGNAL = 2
CLIENT_DONE = 3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using enum.auto() instead of explicit values.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we need to sync these between the go and Python SDK, I think it makes sense to make these explicit.

Comment on lines +93 to +94
"""
signal.signal(signal.SIGINT, signal_handler) # Ignore sigint. Only care about arcaflow signals.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why use a custom signal handler (i.e., signal_handler) instead of using the standard signal.SIG_IGN to ignore the signal?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I'll need to refactor this at some point.

Comment on lines +95 to +97
if os.isatty(self.stdout.fileno()):
print("Cannot run plugin in ATP mode on an interactive terminal.")
return 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is curious...why is this the case? Do/should you have similar restrictions on stderr or stdin?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The plugins communicate over stdout and stdin

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood, but does that mean that there should never be a terminal connected there (not even for interactive testing)? (Is it even possible that this could happen?) That is, is there actually value to performing this check?

If so, then shouldn't you be checking stdin as well?

if message["config"] is None:
stderr.write("Work start message is missing the 'config' field.")
# Serialize then send HelloMessage
start_hello_message = HelloMessage(2, plugin_schema)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 2 here looks like a "magic number"...shouldn't that be a symbolic reference to an externally-defined constant? Or, perhaps, the whole HelloMessage(2, plugin_schema) expression should be encapsulated in a reference?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should create a version constant.

Comment on lines +5574 to +5581
def __call__(
self,
step_data: StepObjectT,
params: SignalDataT,
):
"""
:param params: Input data parameter for the signal handler.
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the docstring is describing params, shouldn't it describe step_data, as well?

Comment on lines +5673 to +5679
# Create a map to populate
signal_handlers_map = {}
for handler_method_name in self.signal_handler_method_names:
# Retrieve the object attributes, which will be the schemas
handler = getattr(object_instance, handler_method_name)
signal_handlers_map[handler.id] = handler
self.signal_handlers = signal_handlers_map
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using a dictionary comprehension:

        self.signal_handlers = {
            handler.id: getattr(object_instance, handler_method_name)
            for handler_method_name in self.signal_handler_method_names
        }

Comment on lines +5689 to +5692
def get_step(self, step_id: str):
if step_id not in self.steps:
raise NoSuchStepException(step_id)
return self.steps[step_id]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If step_id not being in self.steps is an "exceptional" condition, consider coding it as such:

    def get_step(self, step_id: str):
        try:
            return self.steps[step_id]
        except KeyError as e:
            raise NoSuchStepException(e.args[0]) from None

Alternatively, avoid doing the lookup twice:

    def get_step(self, step_id: str):
        v = self.steps.get(step_id)
        if v is None:
            raise NoSuchStepException(step_id)
        return v

Similarly for get_signal().

Comment on lines +5759 to +5762
if not step.object_data_ready:
with step.object_cv:
# wait to be notified of it being ready. Test this by adding a sleep before the step call.
step.object_cv.wait()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code looks questionable: it probably should hold the object_cv lock when checking the object_data_ready value, and it should be prepared for spurious wake-ups. So, something more like this would be appropriate:

        with step.object_cv:
            while not step.object_data_ready:
                step.object_cv.wait()

(I would omit the code comment: it's obvious that we are waiting, it's clear what we are waiting for, and comments on how to test this belong in the test module not here....)

Comment on lines +27 to +29
@dataclass
class cancelInput:
pass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why make this a @dataclass? It has no fields....

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the practice for all classes that we're serializing/unserializing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants