Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dynamic workflows, activities, signals, and queries #89

Merged
merged 4 commits into from
Jun 16, 2023

Conversation

cretz
Copy link
Member

@cretz cretz commented Jun 15, 2023

What was changed

  • Added support for [Activity(Dynamic = true)] which gets called on unmatched activity in worker with single IRawValue[] parameter
  • Added support for [Workflow(Dynamic = true)] which gets called on unmatched workflow in worker with single IRawValue[] parameter
  • Added support for [WorkflowSignal(Dynamic = true)] which gets called on unmatched signal in workflow with string signal name and IRawValue[] parameter
  • Added support for [WorkflowQuery(Dynamic = true)] which gets called on unmatched query in workflow with string query name and IRawValue[] parameter
  • Added ActivityExecutionContext.PayloadConverter for use by activities
  • Added Workflow.PayloadConverter for use by workflows
  • Fixed double-encoding issue for schedule workflow updates
  • Added IPayloadConverter.ToRawValue and IPayloadConverter.ToValue<T> extensions to assist in conversion to/from raw values respectively
  • Removed embedded payload converter from RawValue/IRawValue - we make the converters otherwise available and this was needed because people may need to construct these values and the converter was only there for convenience
  • Added Workflow.DynamicSignal and Workflow.DynamicQuery as properties that can be set from within a workflow

Checklist

  1. Closes [Feature Request] Support dynamic workflows, activities, signals, and queries #27

@cretz cretz requested a review from a team June 15, 2023 12:46
var availStr = string.Join(", ", avail);
throw new ApplicationFailureException(
$"Activity {act.Context.Info.ActivityType} is not registered on this worker," +
$" available activities: {availStr}",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Not related to this specific PR) Just a reminder that listing activities in failure details is now problematic. Server will report warning at 2KB, and truncate the Failure if exceeding 4KB (see code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ug, yeah that's the size of the entire failure proto - message, stack trace, recursive causes and all. What should we do for big stack traces? What should we do for large cause chains?

Seems it's not listing activities is problematic, but large failures are problematic and I wonder if we need a holistic solution. How am I supposed to know whether I can list more or less activities just because the stack trace happens to be smaller?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, the problem is not specifically with listing activity names. Yet, the fact that the server may penalize us on for sending the list of activity names if that exceeds some threshold means that this is not a reliable way to provide that kind of feedback to user. We'll need to find an alternative for listing registered activity/workflow/signal/queries types AND a better way to deal with activities failures containing large stack traces.

Also, I'd say that Failure should not contain stack trace for these kind of errors, if that's possible. There is no user code on the stack at this point, so nothing they should need to trace.

But really, that don't need to be resolved today.

Copy link
Member Author

@cretz cretz Jun 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

server may penalize us on for sending the list of activity names if that exceeds some threshold

Yeah, I need to probably put some upper bound on this

We'll need to find an alternative for listing registered activity/workflow/signal/queries types

temporalio/features#51

I'd say that Failure should not contain stack trace for these kind of errors, if that's possible

It's possible, but not very transparent. As an open source library I like showing where all exceptions occur even in our own code.

set
{
if (value != null && !value.Dynamic)
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why require that the dynamic query function has the Dynamic attribute if they are explicitely assigning to DynamicQuery? Seems redundant...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are different things. [WorkflowQuery] is declaration-site method made into a query. But we also offer (for DSLs and other use cases) a way to set handlers from code at runtime of the workflow. [Workflow(Dynamic = true)] is the recommended way, but DynamicQuery property is an optional runtime alternative.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I got that. What I'm saying is that if the user is explicitly calling the setter to set handler at runtime, why is it required that provided function do have the Dynamic attribute?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not required to have that (or even have an attribute). This is a reference to a property getter that just checks the absence of the name, it is completed unrelated to the attribute. From the test for that setter:

            Workflow.DynamicQuery = WorkflowQueryDefinition.CreateWithoutAttribute(
                null, (string queryName, IRawValue[] args) =>
                {
                    Events.Add($"query-{queryName}: " +
                        Workflow.PayloadConverter.ToValue<string>(args.Single()));
                    return "done";
                });

{
// If it's not null, send _all_ buffered signals. We will copy all from
// buffered, clear buffered, and send all to apply signal
var signals = bufferedSignals.ToList();
Copy link
Contributor

@mjameswh mjameswh Jun 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is technically incorrect, as the Workflow could change/unregister the DynamicSignal handler or register named signal handler from a signal handler, causing reentrant calls to either this loop or the one in named query handler registration. In these case, ApplySignalWorkflow might push back signals to the queue, which might scramble buffered signals relative order. See this commit for example cases.

This should instead be something like:

    while (signal = bufferedSignals.tryTakeFirst()) {
        ApplySignalWorkflow(signal);
    }

Copy link
Member Author

@cretz cretz Jun 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the approach you show there is more confusing but no more accurate. I intentionally take all of the signals off at handler registration time and clear the list before I even start processing them. This is very clear logic to the user setting the handler. You set a signal handler, your handler gets all buffered signals guaranteed. No maybe about it. Same for setting a named handler. It's the same problem that exists in many places when mutating a collection while iterating it and is exactly why we flush the collection before iterating on it.

With my logic, it is clear what happens when, say, you have psuedo code:

SetDynamicSignalHandler(HandleDynamicSignals)
SetSpecificSignalHandler("Signal1", HandleSignal1Signals)

The first one gets all buffered signals and the second one gets 0 buffered signals. I have documented this behavior and it is clear. But if I had to change the documentation to "start the draining process from the beginning when registered" it would be much less clear.

If you change handlers inside of some code somewhere else, doesn't affect that you chose to have all your buffered signals delivered to the original one since you know that is a registration time behavior.

Copy link
Member Author

@cretz cretz Jun 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also start all of these tasks immediately, just like if a workflow task came in with all of these signals present on it

// Send all buffered signals that apply to this one. We are going to collect these by
// iterating list in reverse matching predicate while removing, then iterating in
// reverse
var buffered = new List<SignalWorkflow>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See related comment above

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way signals are drained is very much by intention. It is confusing behavior to remove from this one by one.

Copy link
Contributor

@mjameswh mjameswh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor comments. Only one thing that really need to be addressed (risk of mangling buffered signals if signal handlers are added/modified/removed from a signal handler (ie. reentrant case)), but overall LGTM.

@cretz cretz merged commit b9edda0 into temporalio:main Jun 16, 2023
6 checks passed
@cretz cretz deleted the dynamic branch June 16, 2023 14:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature Request] Support dynamic workflows, activities, signals, and queries
2 participants