Workflow Definition

Achal Aggarwal edited this page Oct 3, 2016 · 3 revisions

Arbiter workflows are defined in YAML. A given Arbiter workflow definition is tightly coupled with the configuration file(s) that will be used to generate it. Before writing workflows with Arbiter, you should create a configuration file defining the action types available for use. See the Configuration page for details on how to define a configuration file.

Dependencies between Actions and Parallelism

In an Oozie XML workflow, the author is responsible for ordering the actions so that the dependencies between them are satisfied, and parallelism is managed by manually inserting fork/join pairs. Arbiter handles both automatically. In an Arbiter workflow, each action explicitly lists the actions it depends on, and Arbiter orders the actions to satisfy those dependencies. It also inserts fork/join pairs so that actions run in parallel when possible.
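To make the ordering idea concrete, here is a minimal sketch in Python that groups actions into levels from their dependency lists. It is not Arbiter's implementation, and the function name `parallel_levels` is hypothetical. It only illustrates the general technique (a level-by-level topological sort), using the action names and dependency lists from the example workflow on this page.

```python
def parallel_levels(dependencies):
    """Group action names into levels.

    Every action in a level has all of its dependencies in earlier levels,
    so the actions within one level could in principle run in parallel.
    Illustrative only; this is not Arbiter's own algorithm.
    """
    remaining = {name: set(deps) for name, deps in dependencies.items()}

    # Every dependency must be the name of an action in the workflow.
    for name, deps in remaining.items():
        unknown = deps - set(remaining)
        if unknown:
            raise ValueError(f"{name} depends on unknown actions: {sorted(unknown)}")

    levels, done = [], set()
    while remaining:
        # An action is ready once everything it depends on has been placed.
        ready = sorted(n for n, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError(f"dependency cycle among: {sorted(remaining)}")
        levels.append(ready)
        done.update(ready)
        for name in ready:
            del remaining[name]
    return levels


# Dependency lists copied from the example workflow on this page.
example_dependencies = {
    "email_campaign_stats": [],
    "trans_email_overview": ["email_campaign_stats", "user_language"],
    "email_overview": ["email_campaign_stats", "user_language"],
    "user_language": [],
    "transactional_lifecycle_email_stats": [],
    "move-temp-to-final": [],
}

for level in parallel_levels(example_dependencies):
    print(level)
```

In this sketch the four actions with no dependencies form the first level, and trans_email_overview and email_overview form the second. How Arbiter actually places fork/join pairs in the generated XML may differ from this simple grouping.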

Example Workflow

This workflow uses the action types defined in the example configuration on the Configuration page.

---
name: email-rollups
errorHandler:
  name: screamapillar
  type: screamapillar
  recipients: fake_email
  sender: fake_email
actions:
  - name: email_campaign_stats
    type: rollup
    rollup_file: zz_email_campaign_stats.sql
    category: regular
    dependencies: []
  - onlyIf: "import-trans-email eq true"
    name: trans_email_overview
    type: rollup
    rollup_file: trans_email_overview.sql
    category: regular
    dependencies: [email_campaign_stats, user_language]
  - name: email_overview
    type: rollup
    rollup_file: zz_email_overview.sql
    category: regular
    dependencies: [email_campaign_stats, user_language]
  - name: user_language
    type: rollup
    rollup_file: user_language.sql
    category: regular
    dependencies: []
  - name: transactional_lifecycle_email_stats
    type: rollup
    rollup_file: transactional_lifecycle_email_stats.sql
    category: regular
    dependencies: []
  - name: move-temp-to-final
    type: fs
    onlyIf: "${wf:conf('some.variable') eq 'false'}"
    elem: {
      delete: {path: "/foo/bar/final"},
      move: {
        source: "/bar/foo/temp",
        target: "/foo/bar/final"
      }
    }
    dependencies: []
    forceError: kill

The name element is required and defines the name of the generated workflow.

Error Handler

The errorHandler element is optional. If specified, Arbiter will ensure the flow of execution passes through this action before the workflow exits, successfully or not. This can be used to send an email with the workflow status, for example.

The name element defines the name of the error handler action in the generated XML. The type element specifies the type of this action. It must match the name of an action type specified in the configuration file. Both of these elements are required.

In this example, the sender and recipients values are interpolated into the default arguments specified in the configuration file. Which additional properties of this kind are available depends on how the given action type is defined in the configuration file.
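As a rough illustration of what interpolation means here, the following Python sketch substitutes workflow properties into a set of default arguments. The `$$key$$` placeholder syntax, the shape of `default_args`, and the function name `interpolate` are all assumptions made for this sketch. The Configuration page is the authority on the real format.

```python
import re

# Assumed placeholder syntax for this sketch: $$property_name$$
PLACEHOLDER = re.compile(r"\$\$([A-Za-z0-9_-]+)\$\$")


def interpolate(default_args, properties):
    """Return a copy of default_args with placeholders filled from properties.

    Placeholders that have no matching property are left untouched.
    """
    def fill(value):
        return PLACEHOLDER.sub(
            lambda match: str(properties.get(match.group(1), match.group(0))),
            value,
        )

    return {key: [fill(value) for value in values]
            for key, values in default_args.items()}


# Hypothetical default arguments for an error handler action type.
default_args = {"arg": ["--to", "$$recipients$$", "--from", "$$sender$$"]}

# Properties taken from the errorHandler element of the example workflow.
properties = {"recipients": "fake_email", "sender": "fake_email"}

print(interpolate(default_args, properties))
```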

Actions

The actions element defines the actions that make up this workflow. The following is an example action:

  - name: trans_email_overview
    type: rollup
    rollup_file: trans_email_overview.sql
    category: regular
    dependencies: [email_campaign_stats, user_language]

The name element defines the name of this action in the generated XML. It is also the name other actions use to declare a dependency on this action. The type element specifies the type of this action. It must match the name of an action type specified in the configuration file. Both of these elements are required.

The dependencies element lists, by name, the actions on which this action depends. This element is also required, but if the action has no dependencies the empty list [] can be used.
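The requirements described above can be expressed as a small check over the parsed actions list. This is a sketch only, written against plain Python dictionaries. The function name `validate_actions`, the `known_types` parameter, and the error messages are hypothetical and do not reflect Arbiter's own validation or output.

```python
REQUIRED_ACTION_KEYS = ("name", "type", "dependencies")


def validate_actions(actions, known_types):
    """Return a list of problems found in a list of action dictionaries.

    known_types stands in for the action type names defined in the
    configuration file.
    """
    errors = []
    names = {action["name"] for action in actions if "name" in action}

    for index, action in enumerate(actions):
        label = action.get("name", f"<action #{index}>")

        # name, type and dependencies must all be present.
        for key in REQUIRED_ACTION_KEYS:
            if key not in action:
                errors.append(f"{label}: missing required element '{key}'")

        # type must match an action type from the configuration file.
        if "type" in action and action["type"] not in known_types:
            errors.append(f"{label}: type '{action['type']}' is not defined in the configuration")

        # Each dependency must be the name of another action in the workflow.
        for dep in action.get("dependencies", []):
            if dep not in names:
                errors.append(f"{label}: dependency '{dep}' is not the name of any action")

    return errors


actions = [
    {"name": "user_language", "type": "rollup", "dependencies": []},
    {"name": "email_overview", "type": "rollup",
     "dependencies": ["email_campaign_stats", "user_language"]},
]

# email_campaign_stats is deliberately left out to show a reported problem.
for problem in validate_actions(actions, known_types={"rollup", "fs"}):
    print(problem)
```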

The rollup_file and category values are interpolated into the default arguments specified in the configuration file. Which additional properties of this kind are available depends on how the given action type is defined in the configuration file.
