Task.is_descendant_of? can't deal with multiple parents #184

Closed
HoneyryderChuck opened this issue May 26, 2022 · 7 comments

@HoneyryderChuck

I've stumbled into this recently: we rely on Task.is_descendant_of?, but we also support parallel tasks with a Merge. Tasks in Spiff only have one parent, which means that some of our use-cases fail to traverse back up the tree, because only one of the previous paths into a merge can be traversed.

I'm working around it with a separate implementation that starts at the "possible parent", and traverses the task tree downwards.
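
A minimal sketch of that kind of downward traversal, for illustration only. The `Task` class, `connect`, and both helper functions here are hypothetical stand-ins, not SpiffWorkflow's own classes or the actual workaround code; the only assumption carried over from the description is that a task has a single "parent" and a list of "children".

```python
from collections import deque


class Task:
    """Toy stand-in for a workflow task: a single parent, many children."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []

    def connect(self, child):
        # Record the edge in "children"; only the first caller becomes "parent".
        self.children.append(child)
        if child.parent is None:
            child.parent = self


def is_descendant_of(task, possible_parent):
    """Walk downwards from possible_parent through "children" looking for task."""
    seen = set()
    queue = deque(possible_parent.children)
    while queue:
        current = queue.popleft()
        if current is task:
            return True
        if id(current) in seen:
            continue
        seen.add(id(current))
        queue.extend(current.children)
    return False


def is_descendant_via_parent(task, possible_parent):
    """Upward walk over the single "parent" link, for comparison."""
    current = task.parent
    while current is not None:
        if current is possible_parent:
            return True
        current = current.parent
    return False


# start splits into a and b, which both feed merge.
start, a, b, merge = Task("start"), Task("a"), Task("b"), Task("merge")
start.connect(a)
start.connect(b)
a.connect(merge)
b.connect(merge)

print(is_descendant_via_parent(merge, b))  # False: merge.parent is a
print(is_descendant_of(merge, b))          # True: found through b.children
```

The upward walk misses `b` because `merge` only remembers `a` as its parent, while the downward walk finds `merge` through `b.children`.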

@essweine essweine self-assigned this May 26, 2022
@HoneyryderChuck
Author

Just adding that the lack of support for multiple parents results in other bugs further downstream.

Say two tasks are connected to a third: both of them have the third task in "children", but the third task only records one of them as "parent". In the exclusive choice "sync children" logic, depending on where it goes, one of those two tasks may disappear from the task tree, while a reference to it is still kept in the third task's "parent".

This is not a problem if the workflow continues executing in memory, as the next tasks to be unblocked are usually based on "children", and the other task still has a reference to a task, even if it's not in the task tree. However, if you serialize and deserialize it again, that task.parent will now be None. That's because it was serialized with the id of the task that disappeared, and on deserialization that id can't be found in the workflow task tree, so the reference is left empty.
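
The effect can be reproduced with a toy model. This is a sketch under assumptions, not SpiffWorkflow's serializer: `Task`, `serialize`, and `deserialize` are hypothetical, and the only behaviour modelled is "store the parent's id, resolve it against the tree when loading".

```python
class Task:
    """Toy task holding an id and a direct reference to its parent."""

    def __init__(self, task_id, parent=None):
        self.id = task_id
        self.parent = parent


def serialize(tree):
    # Each task is stored with the id of its parent, not the parent itself.
    return {t.id: {"parent": t.parent.id if t.parent else None} for t in tree}


def deserialize(data):
    tasks = {task_id: Task(task_id) for task_id in data}
    for task_id, state in data.items():
        # An id that is no longer in the tree cannot be resolved, so the
        # lookup falls back to None.
        tasks[task_id].parent = tasks.get(state["parent"])
    return tasks


start = Task("start")
a = Task("a", parent=start)
b = Task("b", parent=start)
merge = Task("merge", parent=b)

# "b" has been dropped from the task tree, but merge.parent still points at it.
tree = [start, a, merge]
print(merge.parent.id)  # b

restored = deserialize(serialize(tree))
print(restored["merge"].parent)  # None
```

In memory the reference to `b` survives; after one round trip it is gone, because only tasks that are still in the tree get rebuilt.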

This then breaks certain codepaths relying on the existence of parent, such as this one, which will eventually be run once the task is reachable.

@essweine
Contributor

essweine commented May 27, 2022

It's expected that a task (as opposed to a task spec) would only have one parent.

Spiff makes a distinction between task specs (which represent a specification and may have multiple inputs and outputs) and tasks (the instantiation, where each task creates new tasks for its children, so each task has only one parent).

There is not a one-to-one correspondence between tasks and task specs. A task spec can have an arbitrary number of tasks associated with it (one for each time it's reached during a workflow).

A task can actually only be reached by one other task, hence the single parent. If two different task specs are connected to the same task spec, two different tasks will be created for the child, one for each instance where the child task was reached. So if you needed to know all the paths that were followed into a particular task spec for a particular workflow instance, you would need to search for all the tasks with a spec matching the child spec and check the ancestors of each matching task.
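
As a rough illustration of that search, here is a sketch over a toy task tree. The `Task` class, the `spec_name` attribute, and the helpers `iter_tasks`, `ancestors`, and `paths_into` are hypothetical names made up for this example; they are not SpiffWorkflow's API.

```python
class Task:
    """Toy task: one parent, many children, tagged with its spec's name."""

    def __init__(self, spec_name, parent=None):
        self.spec_name = spec_name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def iter_tasks(root):
    # Depth-first walk over the whole task tree.
    stack = [root]
    while stack:
        task = stack.pop()
        yield task
        stack.extend(reversed(task.children))


def ancestors(task):
    # Follow the single parent link up to the root, then put it in root-first order.
    path = []
    current = task.parent
    while current is not None:
        path.append(current.spec_name)
        current = current.parent
    return path[::-1]


def paths_into(root, spec_name):
    # One ancestry per task whose spec matches the child spec.
    return [ancestors(t) for t in iter_tasks(root) if t.spec_name == spec_name]


# The "merge" spec is reached twice, so it gets two tasks, one under each branch.
root = Task("start")
a = Task("a", parent=root)
b = Task("b", parent=root)
Task("merge", parent=a)
Task("merge", parent=b)

print(paths_into(root, "merge"))  # [['start', 'a'], ['start', 'b']]
```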

@HoneyryderChuck
Author

Both the Merge and the Join tasks have multiple previous tasks.

@danfunk
Collaborator

danfunk commented May 28, 2022

@HoneyryderChuck - we'd really like to see what you are working on. We don't have a lot of people working outside of BPMN. Do you think you would be willing to get on a video call with us next week?

@essweine
Contributor

I second the request for a demo. I don't understand why relying on the task specs to find the ancestors won't work for you.

@HoneyryderChuck
Author

If we schedule a call, I'm not allowed to show it running, or any data, as that's against my employer's policy. I can however show you a few snippets and describe how that fits together.

But long story short, we're using a custom JSON schema to describe workflows, which we then "transform" to a dict representation that SPIFF can receive to build workflow specs. We then use SPIFF workflows to orchestrate runs, where each task runs in celery jobs (or not, they have several different behaviours); before and after each transition, the workflow is deserialized and serialized, so as to persist the "current state" in a database.

I'm not sure if it's still interesting for you to schedule a call after this description, as I'd imagine that, minus the focus on BPMN, should be as conventional as what other SPIFF users are using.

@essweine
Contributor

essweine commented Jun 1, 2022

I'd still be interested in seeing even snippets -- all the work I've done has been related to BPMN so I don't really know how people are using the non-BPMN capabilities (up until now, I guess I would have even added "or if" to that statement). I figured you probably must be generating workflows automatically somehow (you could generate BPMN but it's definitely not designed for that).

I am still not sure why you can't use the workflow spec rather than the task to determine how you got to a particular state in a workflow. Following the task ancestry will give you one path that got you to a particular task spec; if the task spec was executed more than once via different paths, there will be a task associated with each path, whose ancestry you can follow to determine all the ways you got there; if you need to know all possible paths that could reach a task spec, examining the workflow spec is the way to do that.
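
To make the last case concrete, here is a sketch of enumerating every possible path into a spec by walking the spec graph backwards through its inputs. The workflow spec is modelled as a plain dict mapping each spec name to the names of its input specs; that representation and the `possible_paths` helper are assumptions for illustration, not how Spiff stores or exposes a workflow spec.

```python
def possible_paths(inputs, start, target):
    """Return every path from start to target in a spec graph.

    inputs maps a spec name to the list of spec names that feed into it.
    """
    paths = []

    def walk(spec, suffix):
        if spec in suffix:
            # Already on this path: stop, so loops in the spec don't recurse forever.
            return
        path = [spec] + suffix
        if spec == start:
            paths.append(path)
            return
        for previous in inputs.get(spec, []):
            walk(previous, path)

    walk(target, [])
    return paths


# "merge" has two inputs, unlike a task, which has a single parent.
inputs = {"a": ["start"], "b": ["start"], "merge": ["a", "b"]}

print(possible_paths(inputs, "start", "merge"))
# [['start', 'a', 'merge'], ['start', 'b', 'merge']]
```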

@danfunk danfunk closed this as completed Jun 9, 2022