Apache Oozie

What is Oozie

Oozie is an extensible, scalable and reliable system to define, manage, schedule, and execute complex Hadoop workloads via web services. More specifically, this includes:

  * XML-based declarative framework to specify a job or a complex workflow of dependent jobs.
  * Support for different types of jobs, such as Hadoop Map-Reduce, Pipes, Streaming, Pig, Hive and custom Java applications.
  * Workflow scheduling based on frequency and/or data availability.
  * Monitoring capability, automatic retry and failure handling of jobs.
  * Extensible and pluggable architecture to allow arbitrary grid programming paradigms.
  * Authentication, authorization, and capacity-aware load throttling to allow multi-tenant software as a service.

Oozie Overview

Oozie is a server-based workflow engine specialized in running workflow jobs with actions that run Hadoop Map/Reduce and Pig jobs.

Oozie is a Java web application that runs in a Java servlet container.

For the purposes of Oozie, a workflow is a collection of actions (e.g. Hadoop Map/Reduce jobs, Pig jobs) arranged in a control dependency DAG (Directed Acyclic Graph). A "control dependency" from one action to another means that the second action can't run until the first action has completed.
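
As a rough illustration of what a control dependency DAG implies for execution order, the following minimal Java sketch computes one valid order with a standard topological sort. The class `WorkflowDagSketch`, its methods and the action names are hypothetical and exist only for this example; this is not Oozie's engine code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of a workflow as a control dependency DAG (not Oozie code). */
class WorkflowDagSketch {
    // Action name -> names of the actions that must complete before it may run.
    private final Map<String, List<String>> dependsOn = new LinkedHashMap<>();

    /** Registers an action together with the actions it has to wait for. */
    void addAction(String name, String... prerequisites) {
        dependsOn.put(name, Arrays.asList(prerequisites));
    }

    /** Returns one valid execution order, or throws if no such order exists. */
    List<String> executionOrder() {
        // Number of prerequisites each action is still waiting for.
        Map<String, Integer> pending = new LinkedHashMap<>();
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, List<String>> e : dependsOn.entrySet()) {
            pending.put(e.getKey(), e.getValue().size());
            if (e.getValue().isEmpty()) {
                ready.add(e.getKey());
            }
        }

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String done = ready.poll();
            order.add(done);
            // Completing an action releases every action that was waiting on it.
            for (Map.Entry<String, List<String>> e : dependsOn.entrySet()) {
                if (e.getValue().contains(done)) {
                    int left = pending.merge(e.getKey(), -1, Integer::sum);
                    if (left == 0) {
                        ready.add(e.getKey());
                    }
                }
            }
        }

        if (order.size() != dependsOn.size()) {
            throw new IllegalStateException("Cycle or unknown prerequisite in workflow");
        }
        return order;
    }
}
```

For a chain registered as `addAction("ingest")`, `addAction("pig-clean", "ingest")` and `addAction("mr-aggregate", "pig-clean")`, `executionOrder()` returns `[ingest, pig-clean, mr-aggregate]`. A graph containing a cycle is rejected, which is the "acyclic" part of the definition.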

Oozie workflow definitions are written in hPDL (an XML Process Definition Language similar to JBoss jBPM jPDL).

Oozie workflow actions start jobs in remote systems (e.g. Hadoop, Pig). When an action completes, the remote system calls back Oozie to report the completion, and Oozie then proceeds to the next action in the workflow.

Oozie workflows contain control flow nodes and action nodes.

Control flow nodes define the beginning and the end of a workflow (start, end and fail nodes) and provide a mechanism to control the workflow execution path (decision, fork and join nodes).

Action nodes are the mechanism by which a workflow triggers the execution of a computation/processing task. Oozie provides support for different types of actions: Hadoop map-reduce, Hadoop file system, Pig, SSH, HTTP, email and Oozie sub-workflow. Oozie can be extended to support additional types of actions.

Oozie workflows can be parameterized (using variables like ${inputDir} within the workflow definition). When submitting a workflow job, values for the parameters must be provided. If properly parameterized (e.g. using different output directories), several identical workflow jobs can run concurrently.
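
To make the idea of parameterization concrete, the sketch below resolves `${name}` placeholders against values supplied at submission time and fails when a value is missing. It is a simplified stand-in written for this README, not Oozie's own resolution mechanism; the class `ParameterSketch`, the method `resolve` and the sample paths are assumptions.

```java
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical placeholder resolver illustrating workflow parameters (not Oozie code). */
class ParameterSketch {
    // Matches simple placeholders such as ${inputDir}; expressions are out of scope here.
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$\\{([A-Za-z_][A-Za-z0-9_]*)\\}");

    /** Replaces every placeholder in the text with the value supplied for it. */
    static String resolve(String text, Properties params) {
        Matcher m = PLACEHOLDER.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String name = m.group(1);
            String value = params.getProperty(name);
            if (value == null) {
                // Mirrors the rule that every parameter needs a value at submission.
                throw new IllegalArgumentException("No value supplied for parameter: " + name);
            }
            // quoteReplacement keeps '$' and '\' in the value from being interpreted.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Two submissions of the same definition can then differ only in their parameter values, for example one with `outputDir` set to `/data/out/run1` and another with `/data/out/run2`, so their outputs do not collide when the jobs run at the same time.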

Documentation:
The Oozie web service is bundled with detailed built-in documentation.

More information can be found at:

Oozie Quick Start:

Supported Hadoop Versions:

This version of Oozie was primarily tested against Hadoop 0.20.205.x. It will not work on earlier versions of Hadoop, such as 0.20.x and 0.21.


If you have any questions/issues, please send an email to:


Subscribe using the link: