Adam Donahue edited this page Jul 8, 2015 · 6 revisions

About Spiff Workflow

Spiff Workflow is a workflow engine implemented in pure Python. It is based on the excellent work of the Workflow Patterns initiative. Its main design goals are the following:

  • Directly support as many of the patterns of workflowpatterns.com as possible.
  • Map those patterns into workflow elements that are easy to understand by a user in a workflow GUI editor.
  • Provide a clean Python API.

You can find a list of supported workflow patterns below.

General Concept

The process of using Spiff Workflow involves the following steps:

  1. Write a workflow specification. A specification may be written using XML (example), JSON, or Python (example).
  2. Run the workflow using the Python API. Example code for running the workflow:
from SpiffWorkflow.specs import *
from SpiffWorkflow import Workflow

spec = WorkflowSpec()
# (Add tasks to the spec here.)

wf = Workflow(spec)

Specification vs. Workflow Instance

One critical concept to know about SpiffWorkflow is the difference between a TaskSpec and a Task, and between a WorkflowSpec and a Workflow.

To understand how to handle a running workflow, consider the following process:

Choose product -> Choose amount -> Produce product A
                              `--> Produce product B

As you can see, in this case the process resembles a simple tree. Choose product, Choose amount, Produce product A, and Produce product B are all specific kinds of task specifications, and the whole process is a workflow specification.

But when you execute the workflow, the path taken does not necessarily have the same shape. For example, if the user chooses to produce 3 items of product A, the path taken looks like the following:

Choose product -> Choose amount -> Produce product A
                              |--> Produce product A
                              `--> Produce product A

This is the reason why you will find two different categories of objects in Spiff Workflow:

  • Specification objects (WorkflowSpec and TaskSpec) represent the workflow definition, and
  • derivation tree objects (Workflow and Task) model the task tree that represents the state of a running workflow.
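
The split between the two categories can be illustrated with a small, self-contained sketch. It does not use the SpiffWorkflow API: ToyTaskSpec and ToyTask are hypothetical stand-ins invented for this example, and the way the amount of 3 expands into three tasks is an assumption made only to mirror the diagram above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class ToyTaskSpec:
    """One node of the workflow definition (hypothetical stand-in for TaskSpec)."""
    name: str
    outputs: List["ToyTaskSpec"] = field(default_factory=list)


@dataclass(eq=False)
class ToyTask:
    """One node of the derivation tree; it points back to the spec that produced it."""
    spec: ToyTaskSpec
    parent: Optional["ToyTask"] = None
    children: List["ToyTask"] = field(default_factory=list)

    def add_child(self, spec: ToyTaskSpec) -> "ToyTask":
        child = ToyTask(spec, parent=self)
        self.children.append(child)
        return child


# The specification: a fixed tree with one node per kind of task.
produce_a = ToyTaskSpec("Produce product A")
produce_b = ToyTaskSpec("Produce product B")
choose_amount = ToyTaskSpec("Choose amount", [produce_a, produce_b])
choose_product = ToyTaskSpec("Choose product", [choose_amount])

# One run: the user picks product A with an amount of 3.
root = ToyTask(choose_product)
amount_task = root.add_child(choose_amount)
for _ in range(3):
    amount_task.add_child(produce_a)

leaf_names = [task.spec.name for task in amount_task.children]
print(leaf_names)
```

The specification still contains exactly one "Produce product A" node, while the derivation tree of this run contains three tasks that all point to it.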

Defining a Workflow

The WorkflowSpec and TaskSpec classes are used to define a workflow. SpiffWorkflow has many types of TaskSpecs, such as Join, Split, Execute, and Wait; all of them are derived from TaskSpec. The specs can be serialized and deserialized to a variety of formats.

A WorkflowSpec is built by chaining TaskSpecs together in a tree. You can either assemble a workflow using Python objects (see the example linked above) or load it from XML as follows:

from SpiffWorkflow.storage import XmlSerializer

serializer = XmlSerializer()
xml_file = 'my_workflow.xml'
# Read the whole specification; the file is closed when the block exits.
with open(xml_file) as f:
    xml_data = f.read()
spec = serializer.deserialize_workflow_spec(xml_data, xml_file)

(Passing the filename to the deserializer is optional, but improves error messages.)

For a full list of all TaskSpecs, see the SpiffWorkflow.specs module. All classes have full API documentation. To better understand how each individual subtype of TaskSpec works, look at the workflow patterns website, especially the Flash animations showing how each type of task works.

Note: The TaskSpec classes named "ThreadXXXX" create logical threads based on the model in http://www.workflowpatterns.com. There is no Python threading implemented.

Running a workflow

To run the workflow, create an instance of the Workflow class as follows:

from SpiffWorkflow import Workflow

spec = ... # see above

wf = Workflow(spec)

The Workflow object then represents the state of this particular instance of the running workflow. In other words, it includes the derivation tree and the data, by holding a tree that is composed of Task objects. All changes in the progress or state of a workflow are always reflected in one (or more) of the Task objects. Each Task has a state, and can hold data.

Hint: To visualize the state of a running workflow, you may use the Workflow.dump() method to print the task tree to stdout.

Some tasks change their state automatically based on internal or environmental changes. Other tasks may need to be triggered by you, the user. The latter kind of task can, for example, be completed by an explicit call to the Python API.


Understanding the task states

The following task states exist:


The states are reached in a strict order and the lines in the diagram show the possible state transitions.

The order of these state transitions is violated only in one case: A Trigger task may add additional work to a task that was already COMPLETED, causing it to change the state back to FUTURE.

  • MAYBE means that the task will possibly, but not necessarily, run at a future time. Whether it will run cannot yet be determined, for example because the execution still depends on the outcome of an ExclusiveChoice task in the path that leads towards it.

  • LIKELY is like MAYBE, except it is considered to have a higher probability of being reached because the path leading towards it is the default choice in an ExclusiveChoice task.

  • FUTURE means that the processor has predicted that this path will be taken and that this task will, at some point, definitely run (unless the task is explicitly set to CANCELLED, which cannot be predicted). If a task is waiting on predecessors to run, then it is in the FUTURE state (not WAITING).

  • WAITING means that the task is in the process of doing its work and has not finished. When the work is finished, the task goes to the READY state. WAITING is an optional state.

  • READY means "the preconditions for marking this task as complete are met".

  • COMPLETED means that the task is done.

  • CANCELLED means that the task was explicitly cancelled, for example by a CancelTask operation.
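
As a rough illustration of the ordering rule and its single exception, the states can be modelled as an ordered enumeration. This is a sketch under stated assumptions, not the code in Task.py: ToyState, its numeric values, and may_transition are hypothetical, and the check allows any forward move, which is probably more permissive than the real transition diagram.

```python
from enum import IntEnum


class ToyState(IntEnum):
    """Hypothetical ordering; only the state names come from the list above."""
    MAYBE = 1
    LIKELY = 2
    FUTURE = 3
    WAITING = 4
    READY = 5
    COMPLETED = 6
    CANCELLED = 7


def may_transition(old: ToyState, new: ToyState, triggered: bool = False) -> bool:
    """Return True if moving from `old` to `new` respects the forward-only rule."""
    if old in (ToyState.COMPLETED, ToyState.CANCELLED):
        # A finished task only changes again when a Trigger adds new work,
        # and then only from COMPLETED back to FUTURE.
        return triggered and old is ToyState.COMPLETED and new is ToyState.FUTURE
    if new is ToyState.CANCELLED:
        # Assumption: explicit cancellation is possible from any unfinished state.
        return True
    # Otherwise the task only moves forward; the optional WAITING may be skipped.
    return new > old


print(may_transition(ToyState.FUTURE, ToyState.READY))
print(may_transition(ToyState.COMPLETED, ToyState.FUTURE, triggered=True))
```

Keeping the Trigger exception behind an explicit flag makes the one permitted backward move easy to spot in the sketch.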

Associating data with a workflow

The difference between specification objects and derivation tree objects is also important when choosing how to store data in a workflow. Spiff Workflow supports storing data in two ways:

  • Task spec data is stored in the TaskSpec object. In other words, if a task causes a task spec data value to change, that change is visible to all other Task instances in the derivation tree that use the same TaskSpec object.
  • Task data is local to the Task object, but is carried along to the children of each Task object in the derivation tree.
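
The difference between the two kinds of storage can be shown with a toy model. DataSpec and DataTask are hypothetical classes written for this example, not SpiffWorkflow classes, and the sketch assumes that "carried along" means each child starts with a copy of its parent's data.

```python
class DataSpec:
    """Hypothetical stand-in for a TaskSpec: one data dict shared per spec."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class DataTask:
    """Hypothetical stand-in for a Task: local data, seeded from the parent."""

    def __init__(self, spec, parent=None):
        self.spec = spec
        # Assumption: children receive a copy, so their changes stay local.
        self.data = dict(parent.data) if parent is not None else {}


produce = DataSpec("Produce product A")
parent = DataTask(DataSpec("Choose amount"))
parent.data["amount"] = 3

first = DataTask(produce, parent)
second = DataTask(produce, parent)

first.data["serial"] = 1            # task data: local to `first`
first.spec.data["colour"] = "red"   # task spec data: shared through `produce`

print(second.data)
print(second.spec.data)
```

Here the second task inherits the amount from its parent and sees the colour set through the shared spec, but never sees the serial number that is local to the first task.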

Developer's details

A derivation tree is created from the spec using a hierarchy of Task objects (not TaskSpecs, but each Task points to the TaskSpec that generated it).

Think of a derivation tree as a tree of execution paths (some, but not all, of which will end up executing). Each Task object is basically a node in the derivation tree. Each task in the tree links back to its parent (there are no connection objects). The processing is done by walking down the derivation tree one Task at a time and moving the task (and its children) through the sequence of states towards completion. The states are documented in Task.py.
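
A minimal sketch of such a walk is shown below. WalkTask, run_all, and path_to_root are hypothetical names made up for this example; the real engine moves tasks through more states and decides per TaskSpec which children actually run, which this sketch leaves out.

```python
class WalkTask:
    """Hypothetical derivation tree node; not the real Task class."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # link back to the parent, no connection object
        self.children = []
        self.state = "FUTURE"
        if parent is not None:
            parent.children.append(self)


def run_all(task, order):
    """Complete `task`, then walk down into its children, depth-first."""
    task.state = "COMPLETED"
    order.append(task.name)
    for child in task.children:
        run_all(child, order)


def path_to_root(task):
    """Follow the parent links upwards and return the names on the way."""
    names = []
    while task is not None:
        names.append(task.name)
        task = task.parent
    return names


root = WalkTask("Choose product")
amount = WalkTask("Choose amount", root)
first = WalkTask("Produce product A", amount)
second = WalkTask("Produce product A", amount)

order = []
run_all(root, order)
print(order)
print(path_to_root(second))
```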

You can serialize and deserialize specs. Open standards like OpenWFE are supported, and others can be coded in easily. You can also serialize and deserialize a running workflow (it will pull in its spec as well).

There's a decent eventing model that allows you to tie in to and receive events (for each task, you can get event notifications from its TaskSpec). The events correspond with how the processing is going in the derivation tree, not necessarily how the workflow as a whole is moving. See TaskSpec.py for docs on events.
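
To make the point about per-task notifications concrete, here is a small observer sketch. EventSpec, connect, emit, and the event name "completed" are assumptions for illustration only; see TaskSpec.py for the events that actually exist.

```python
from collections import defaultdict


class EventSpec:
    """Hypothetical task spec that lets callers subscribe to named events."""

    def __init__(self, name):
        self.name = name
        self._listeners = defaultdict(list)

    def connect(self, event, callback):
        """Register `callback` for the event called `event`."""
        self._listeners[event].append(callback)

    def emit(self, event, task_id):
        """Notify every listener that the task `task_id` reached `event`."""
        for callback in self._listeners[event]:
            callback(self.name, task_id)


received = []
produce_a = EventSpec("Produce product A")
produce_a.connect("completed", lambda spec, task_id: received.append((spec, task_id)))

# Three tasks in the derivation tree share the one spec, so the listener
# fires three times even though the spec appears once in the definition.
for task_id in (1, 2, 3):
    produce_a.emit("completed", task_id)

print(received)
```

Subscribing on the spec rather than on each task means a listener is registered once and still hears about every task derived from that spec.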

You can nest workflows (using the SubWorkflowSpec).

The serialization code is well written, which makes it easy to add new formats if we need to support them.

Other documentation

API documentation is currently embedded in the Spiff Workflow source code and is not yet available in a prettier form.

If you need more help, please drop by our mailing list.

Supported Workflow Patterns

Hint: All examples are located here.

Control-Flow Patterns

  1. Sequence [control-flow/sequence.xml]
  2. Parallel Split [control-flow/parallel_split.xml]
  3. Synchronization [control-flow/synchronization.xml]
  4. Exclusive Choice [control-flow/exclusive_choice.xml]
  5. Simple Merge [control-flow/simple_merge.xml]
  6. Multi-Choice [control-flow/multi_choice.xml]
  7. Structured Synchronizing Merge [control-flow/structured_synchronizing_merge.xml]
  8. Multi-Merge [control-flow/multi_merge.xml]
  9. Structured Discriminator [control-flow/structured_discriminator.xml]
  10. Arbitrary Cycles [control-flow/arbitrary_cycles.xml]
  11. Implicit Termination [control-flow/implicit_termination.xml]
  12. Multiple Instances without Synchronization [control-flow/multi_instance_without_synch.xml]
  13. Multiple Instances with a Priori Design-Time Knowledge [control-flow/multi_instance_with_a_priori_design_time_knowledge.xml]
  14. Multiple Instances with a Priori Run-Time Knowledge [control-flow/multi_instance_with_a_priori_run_time_knowledge.xml]
  15. Multiple Instances without a Priori Run-Time Knowledge [control-flow/multi_instance_without_a_priori.xml]
  16. Deferred Choice [control-flow/deferred_choice.xml]
  17. Interleaved Parallel Routing [control-flow/interleaved_parallel_routing.xml]
  18. Milestone [control-flow/milestone.xml]
  19. Cancel Task [control-flow/cancel_task.xml]
  20. Cancel Case [control-flow/cancel_case.xml]
  22. Recursion [control-flow/recursion.xml]
  23. Transient Trigger [control-flow/transient_trigger.xml]
  24. Persistent Trigger [control-flow/persistent_trigger.xml]
  25. Cancel Region [control-flow/cancel_region.xml]
  26. Cancel Multiple Instance Task [control-flow/cancel_multi_instance_task.xml]
  27. Complete Multiple Instance Task [control-flow/complete_multiple_instance_activity.xml]
  28. Blocking Discriminator [control-flow/blocking_discriminator.xml]
  29. Cancelling Discriminator [control-flow/cancelling_discriminator.xml]
  30. Structured Partial Join [control-flow/structured_partial_join.xml]
  31. Blocking Partial Join [control-flow/blocking_partial_join.xml]
  32. Cancelling Partial Join [control-flow/cancelling_partial_join.xml]
  33. Generalized AND-Join [control-flow/generalized_and_join.xml]
  34. Static Partial Join for Multiple Instances [control-flow/static_partial_join_for_multi_instance.xml]
  35. Cancelling Partial Join for Multiple Instances [control-flow/cancelling_partial_join_for_multi_instance.xml]
  36. Dynamic Partial Join for Multiple Instances [control-flow/dynamic_partial_join_for_multi_instance.xml]
  37. Acyclic Synchronizing Merge [control-flow/acyclic_synchronizing_merge.xml]
  38. General Synchronizing Merge [control-flow/general_synchronizing_merge.xml]
  39. Critical Section [control-flow/critical_section.xml]
  40. Interleaved Routing [control-flow/interleaved_routing.xml]
  41. Thread Merge [control-flow/thread_merge.xml]
  42. Thread Split [control-flow/thread_split.xml]
  43. Explicit Termination [control-flow/explicit_termination.xml]

Workflow Data Patterns

  1. Task Data [data/task_data.xml]
  2. Block Data [data/block_data.xml]
  9. Task to Task [data/task_to_task.xml]
  10. Block Task to Sub-Workflow Decomposition [data/block_to_subworkflow.xml]
  11. Sub-Workflow Decomposition to Block Task [data/subworkflow_to_block.xml]

Specs that have no corresponding workflow pattern on workflowpatterns.com

  • Execute - spawns a subprocess and waits for the results
  • Transform - executes commands that can be used for data transforms
  • Celery - executes a Celery task (see http://celeryproject.org/)