Mirror of Apache Oozie

Apache Oozie

What is Oozie

Oozie is an extensible, scalable and reliable system to define, manage, schedule, and execute complex Hadoop workloads via web services. More specifically, this includes:

  * XML-based declarative framework to specify a job or a complex workflow of dependent jobs.
  * Support for different types of jobs such as Hadoop Map-Reduce, Pipes, Streaming, Pig, Hive and custom Java applications.
  * Workflow scheduling based on frequency and/or data availability.
  * Monitoring capability, automatic retry and failure handling of jobs.
  * Extensible and pluggable architecture to allow arbitrary grid programming paradigms.
  * Authentication, authorization, and capacity-aware load throttling to allow multi-tenant software as a service.

Oozie Overview

Oozie is a server-based workflow engine specialized in running workflow jobs with actions that run Hadoop Map/Reduce and Pig jobs.

Oozie is a Java web application that runs in a Java servlet container.

For the purposes of Oozie, a workflow is a collection of actions (e.g. Hadoop Map/Reduce jobs, Pig jobs) arranged in a control dependency DAG (Directed Acyclic Graph). A "control dependency" from one action to another means that the second action can't run until the first action has completed.
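
The control dependency rule can be illustrated with a small Python sketch. This is not how Oozie is implemented: the action names and the `control_deps` mapping are hypothetical, and the standard library's `graphlib` module (Python 3.9+) stands in for the engine's ordering logic.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical workflow: each action maps to the set of actions it must wait for.
control_deps = {
    "ingest": set(),
    "clean": {"ingest"},
    "aggregate": {"clean"},
    "report": {"aggregate"},
}

def execution_order(deps):
    """Return one valid run order; raise ValueError if the graph has a cycle."""
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError as exc:
        # The second element of args holds the detected cycle.
        raise ValueError(f"not a DAG: {exc.args[1]}") from exc

order = execution_order(control_deps)
print(order)
```

Because every action appears only after the actions it depends on, a cycle makes the workflow impossible to order, which is why the graph must be acyclic.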

Oozie workflow definitions are written in hPDL (an XML Process Definition Language similar to JBoss jBPM jPDL).

Oozie workflow actions start jobs in remote systems (e.g. Hadoop, Pig). Upon action completion, the remote system calls back Oozie to notify it that the action has completed. At this point Oozie proceeds to the next action in the workflow.
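
As a rough illustration of this callback-driven progression, the following Python sketch models a linear chain of actions that advances only when a completion notification arrives. The `WorkflowSketch` class, its methods and the action names are all hypothetical; a real engine would submit jobs to a remote system and receive the callback over the network.

```python
class WorkflowSketch:
    """Toy model: hand an action to a remote system, advance only when notified."""

    def __init__(self, actions):
        self.actions = list(actions)  # linear chain, for simplicity
        self.position = -1            # index of the action currently running
        self.started = []             # log of actions handed to the remote system

    def start(self):
        self._start_next()

    def _start_next(self):
        self.position += 1
        if self.position < len(self.actions):
            # A real engine would submit a job to a remote system here.
            self.started.append(self.actions[self.position])

    def on_callback(self, action_name):
        """Invoked by the (simulated) remote system when an action completes."""
        if self.finished or action_name != self.actions[self.position]:
            raise ValueError(f"unexpected callback for {action_name!r}")
        self._start_next()

    @property
    def finished(self):
        return self.position >= len(self.actions)

wf = WorkflowSketch(["mr-job", "pig-job"])
wf.start()
wf.on_callback("mr-job")   # first action done, second one is started
wf.on_callback("pig-job")  # second action done, workflow is finished
print(wf.started, wf.finished)
```

The engine never polls in this model; nothing happens between `start()` and the first callback, which mirrors the notification-based hand-off described above.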

Oozie workflows contain control flow nodes and action nodes.

Control flow nodes define the beginning and the end of a workflow (start, end and kill nodes) and provide a mechanism to control the workflow execution path (decision, fork and join nodes).

Action nodes are the mechanism by which a workflow triggers the execution of a computation/processing task. Oozie provides support for different types of actions: Hadoop map-reduce, Hadoop file system, Pig, SSH, HTTP, email and Oozie sub-workflow. Oozie can be extended to support additional types of actions.

Oozie workflows can be parameterized (using variables like ${inputDir} within the workflow definition). When submitting a workflow job, values for the parameters must be provided. If properly parameterized (e.g. using different output directories), several identical workflow jobs can run concurrently.
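
The idea of parameterization can be sketched in Python as plain `${name}` substitution. This is only an approximation for illustration, not Oozie's actual expression handling; the `resolve` helper, the `outputDir` parameter and the paths are hypothetical.

```python
import re

# Matches ${name} placeholders such as ${inputDir}.
PARAM = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")

def resolve(text, params):
    """Replace every ${name} in text; fail if any value is missing."""
    missing = sorted({name for name in PARAM.findall(text) if name not in params})
    if missing:
        raise KeyError(f"missing workflow parameters: {', '.join(missing)}")
    return PARAM.sub(lambda m: params[m.group(1)], text)

template = "input=${inputDir} output=${outputDir}"

# Two submissions of the same definition, kept apart by their output directories.
job_a = resolve(template, {"inputDir": "/data/in", "outputDir": "/data/out/a"})
job_b = resolve(template, {"inputDir": "/data/in", "outputDir": "/data/out/b"})
print(job_a)
print(job_b)
```

Giving each submission its own output directory is what lets otherwise identical jobs run side by side without overwriting each other's results.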

Documentation:
The Oozie web service is bundled with detailed built-in documentation.

More information can be found at:

Oozie Quick Start:

Supported Hadoop Versions:

This version of Oozie was primarily tested against Hadoop 0.20.205.x. It will not work on earlier versions of Hadoop such as 0.20.x and 0.21.


If you have any questions/issues, please send an email to:

Subscribe using the link:
