Dynamic workflows, activities, signals, and queries #89

cretz · 2023-06-15T12:46:46Z

What was changed

Added support for [Activity(Dynamic = true)] which gets called on unmatched activity in worker with single IRawValue[] parameter
Added support for [Workflow(Dynamic = true)] which gets called on unmatched workflow in worker with single IRawValue[] parameter
Added support for [WorkflowSignal(Dynamic = true)] which gets called on unmatched signal in workflow with string signal name and IRawValue[] parameter
Added support for [WorkflowQuery(Dynamic = true)] which gets called on unmatched query in workflow with string query name and IRawValue[] parameter
Added ActivityExecutionContext.PayloadConverter for use by activities
Added Workflow.PayloadConverter for use by workflows
Fixed double-encoding issue for schedule workflow updates
Added IPayloadConverter.ToRawValue and IPayloadConverter.ToValue<T> extensions to assist in conversion to/from raw values respectively
Removed embedded payload converter from RawValue/IRawValue - we make the converters otherwise available and this was needed because people may need to construct these values and the converter was only there for convenience
Added Workflow.DynamicSignal and Workflow.DynamicQuery as properties that can be set from within a workflow

Checklist

Closes [Feature Request] Support dynamic workflows, activities, signals, and queries #27

mjameswh · 2023-06-15T15:25:16Z

src/Temporalio/Worker/ActivityWorker.cs

+                        var availStr = string.Join(", ", avail);
+                        throw new ApplicationFailureException(
+                            $"Activity {act.Context.Info.ActivityType} is not registered on this worker," +
+                            $" available activities: {availStr}",


(Not related to this specific PR) Just a reminder that listing activities in failure details is now problematic. Server will report warning at 2KB, and truncate the Failure if exceeding 4KB (see code.

Ug, yeah that's the size of the entire failure proto - message, stack trace, recursive causes and all. What should we do for big stack traces? What should we do for large cause chains?

Seems it's not listing activities is problematic, but large failures are problematic and I wonder if we need a holistic solution. How am I supposed to know whether I can list more or less activities just because the stack trace happens to be smaller?

Indeed, the problem is not specifically with listing activity names. Yet, the fact that the server may penalize us on for sending the list of activity names if that exceeds some threshold means that this is not a reliable way to provide that kind of feedback to user. We'll need to find an alternative for listing registered activity/workflow/signal/queries types AND a better way to deal with activities failures containing large stack traces.

Also, I'd say that Failure should not contain stack trace for these kind of errors, if that's possible. There is no user code on the stack at this point, so nothing they should need to trace.

But really, that don't need to be resolved today.

server may penalize us on for sending the list of activity names if that exceeds some threshold

Yeah, I need to probably put some upper bound on this

We'll need to find an alternative for listing registered activity/workflow/signal/queries types

temporalio/features#51

I'd say that Failure should not contain stack trace for these kind of errors, if that's possible

It's possible, but not very transparent. As an open source library I like showing where all exceptions occur even in our own code.

mjameswh · 2023-06-15T15:30:33Z

src/Temporalio/Worker/WorkflowInstance.cs

+            set
+            {
+                if (value != null && !value.Dynamic)
+                {


Why require that the dynamic query function has the Dynamic attribute if they are explicitely assigning to DynamicQuery? Seems redundant...

They are different things. [WorkflowQuery] is declaration-site method made into a query. But we also offer (for DSLs and other use cases) a way to set handlers from code at runtime of the workflow. [Workflow(Dynamic = true)] is the recommended way, but DynamicQuery property is an optional runtime alternative.

Yeah, I got that. What I'm saying is that if the user is explicitly calling the setter to set handler at runtime, why is it required that provided function do have the Dynamic attribute?

It is not required to have that (or even have an attribute). This is a reference to a property getter that just checks the absence of the name, it is completed unrelated to the attribute. From the test for that setter:

Workflow.DynamicQuery = WorkflowQueryDefinition.CreateWithoutAttribute( null, (string queryName, IRawValue[] args) => { Events.Add($"query-{queryName}: " + Workflow.PayloadConverter.ToValue<string>(args.Single())); return "done"; });

mjameswh · 2023-06-15T15:36:28Z

src/Temporalio/Worker/WorkflowInstance.cs

+                {
+                    // If it's not null, send _all_ buffered signals. We will copy all from
+                    // buffered, clear buffered, and send all to apply signal
+                    var signals = bufferedSignals.ToList();


This is technically incorrect, as the Workflow could change/unregister the DynamicSignal handler or register named signal handler from a signal handler, causing reentrant calls to either this loop or the one in named query handler registration. In these case, ApplySignalWorkflow might push back signals to the queue, which might scramble buffered signals relative order. See this commit for example cases.

This should instead be something like:

while (signal = bufferedSignals.tryTakeFirst()) { ApplySignalWorkflow(signal); }

I think the approach you show there is more confusing but no more accurate. I intentionally take all of the signals off at handler registration time and clear the list before I even start processing them. This is very clear logic to the user setting the handler. You set a signal handler, your handler gets all buffered signals guaranteed. No maybe about it. Same for setting a named handler. It's the same problem that exists in many places when mutating a collection while iterating it and is exactly why we flush the collection before iterating on it.

With my logic, it is clear what happens when, say, you have psuedo code:

SetDynamicSignalHandler(HandleDynamicSignals) SetSpecificSignalHandler("Signal1", HandleSignal1Signals)

The first one gets all buffered signals and the second one gets 0 buffered signals. I have documented this behavior and it is clear. But if I had to change the documentation to "start the draining process from the beginning when registered" it would be much less clear.

If you change handlers inside of some code somewhere else, doesn't affect that you chose to have all your buffered signals delivered to the original one since you know that is a registration time behavior.

We also start all of these tasks immediately, just like if a workflow task came in with all of these signals present on it

mjameswh · 2023-06-15T15:51:15Z

src/Temporalio/Worker/WorkflowInstance.cs

+            // Send all buffered signals that apply to this one. We are going to collect these by
+            // iterating list in reverse matching predicate while removing, then iterating in
+            // reverse
+            var buffered = new List<SignalWorkflow>();


See related comment above

The way signals are drained is very much by intention. It is confusing behavior to remove from this one by one.

mjameswh

A few minor comments. Only one thing that really need to be addressed (risk of mangling buffered signals if signal handlers are added/modified/removed from a signal handler (ie. reentrant case)), but overall LGTM.

cretz added 3 commits June 14, 2023 10:31

Support for raw values as input/output

c87af65

Dynamic activities, workflows, signals, and queries

d97f7f2

More tests for dynamism

c296172

cretz requested a review from a team June 15, 2023 12:46

Re-disable code-style

7d65cb5

mjameswh reviewed Jun 15, 2023

View reviewed changes

mjameswh approved these changes Jun 15, 2023

View reviewed changes

cretz merged commit b9edda0 into temporalio:main Jun 16, 2023
6 checks passed

cretz deleted the dynamic branch June 16, 2023 14:41

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Dynamic workflows, activities, signals, and queries #89

Dynamic workflows, activities, signals, and queries #89

cretz commented Jun 15, 2023

mjameswh Jun 15, 2023

cretz Jun 15, 2023

mjameswh Jun 15, 2023

cretz Jun 15, 2023 •

edited

Loading

mjameswh Jun 15, 2023

cretz Jun 15, 2023

mjameswh Jun 15, 2023

cretz Jun 15, 2023

mjameswh Jun 15, 2023 •

edited

Loading

cretz Jun 15, 2023 •

edited

Loading

cretz Jun 15, 2023 •

edited

Loading

mjameswh Jun 15, 2023

cretz Jun 15, 2023

mjameswh left a comment

Dynamic workflows, activities, signals, and queries #89

Dynamic workflows, activities, signals, and queries #89

Conversation

cretz commented Jun 15, 2023

What was changed

Checklist

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cretz Jun 15, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mjameswh Jun 15, 2023 • edited Loading

Choose a reason for hiding this comment

cretz Jun 15, 2023 • edited Loading

Choose a reason for hiding this comment

cretz Jun 15, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mjameswh left a comment

Choose a reason for hiding this comment

cretz Jun 15, 2023 •

edited

Loading

mjameswh Jun 15, 2023 •

edited

Loading

cretz Jun 15, 2023 •

edited

Loading

cretz Jun 15, 2023 •

edited

Loading