Mirror of Apache Oozie
asalamon74 OOZIE-3326 [action] Sqoop Action should support tez delegation tokens…
… for hive-import (bgoerlitz, dionusos via asalamon74)
Latest commit ca9eee9 Feb 22, 2019
Type Name Latest commit message Commit time
bin OOZIE-3395 [build] Migration from FindBugs to SpotBugs (kmarton via a… Feb 22, 2019
builds OOZIE-1000 Remove Yahoo branding from docs, tests, etc (rkanter via v… Sep 28, 2012
client OOZIE-3395 [build] Migration from FindBugs to SpotBugs (kmarton via a… Feb 22, 2019
core OOZIE-3395 [build] Migration from FindBugs to SpotBugs (kmarton via a… Feb 22, 2019
distro Changed version to 5.2.0-SNAPSHOT Oct 4, 2018
docs OOZIE-2949 Escape quotes whitespaces in Sqoop <command> field (asalam… Feb 1, 2019
examples OOZIE-3368 [fluent-job] CredentialsRetrying example does not compile … Oct 15, 2018
fluent-job OOZIE-3395 [build] Migration from FindBugs to SpotBugs (kmarton via a… Feb 22, 2019
minitest Changed version to 5.2.0-SNAPSHOT Oct 4, 2018
server OOZIE-3427 [core] Use best practices in HTTP response headers (asalam… Feb 7, 2019
sharelib OOZIE-3326 [action] Sqoop Action should support tez delegation tokens… Feb 22, 2019
src/main OOZIE-3438 Copy only apache-jsp server dependencies to distro (asalam… Feb 21, 2019
tools OOZIE-3395 [build] Migration from FindBugs to SpotBugs (kmarton via a… Feb 22, 2019
webapp OOZIE-3431 [web UI] Oozie web UI should not serve image from http://e… Feb 15, 2019
zookeeper-security-tests Changed version to 5.2.0-SNAPSHOT Oct 4, 2018
.gitignore OOZIE-1968 Building modules independently (shwethags) Aug 19, 2014
LICENSE.txt OOZIE-685 Update License file with 3rd party license information. (Mo… Feb 8, 2012
NOTICE.txt OOZIE-3150 Remove references to not present dependencies within NOTIC… Mar 26, 2018
README.md OOZIE-2734 amend [docs] Switch from TWiki to Markdown (asalamon74 via… Sep 14, 2018
pom.xml OOZIE-3395 [build] Migration from FindBugs to SpotBugs (kmarton via a… Feb 22, 2019
release-log.txt OOZIE-3326 [action] Sqoop Action should support tez delegation tokens… Feb 22, 2019
source-headers.txt OOZIE-2352 Unportable shebang in shell scripts (dbist13 via andras.pi… May 25, 2018

README.md

Apache Oozie

What is Oozie

Oozie is an extensible, scalable and reliable system to define, manage, schedule, and execute complex Hadoop workloads via web services. More specifically, this includes:

  • XML-based declarative framework to specify a job or a complex workflow of dependent jobs.
  • Support for different types of jobs such as Hadoop Map-Reduce, Pipe, Streaming, Pig, Hive and custom Java applications.
  • Workflow scheduling based on frequency and/or data availability.
  • Monitoring capability, automatic retry and failure handling of jobs.
  • Extensible and pluggable architecture to allow arbitrary grid programming paradigms.
  • Authentication, authorization, and capacity-aware load throttling to allow multi-tenant software as a service.

Oozie Overview

Oozie is a server-based workflow engine specialized in running workflow jobs with actions that run Hadoop Map/Reduce and Pig jobs.

Oozie is a Java Web-Application that runs in a Java servlet-container.

For the purposes of Oozie, a workflow is a collection of actions (e.g. Hadoop Map/Reduce jobs, Pig jobs) arranged in a control dependency DAG (Directed Acyclic Graph). A "control dependency" from one action to another means that the second action can't run until the first action has completed.
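The control dependency idea can be illustrated with a small sketch. This is not Oozie code: the action names are hypothetical, and the standard-library `graphlib` module (Python 3.9+) merely stands in for the ordering a workflow engine has to respect.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each action maps to the set of actions
# that must complete before it is allowed to start.
control_dependencies = {
    "ingest": set(),
    "clean": {"ingest"},
    "aggregate": {"clean"},
    "report": {"aggregate"},
}


def execution_order(dependencies):
    """Return one valid run order; raises graphlib.CycleError if not a DAG."""
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(control_dependencies)
print(order)
```

Because the graph must be acyclic, a valid order always exists. A dependency cycle would make `static_order()` raise `CycleError`, which mirrors why a workflow has to be a DAG.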

Oozie workflow definitions are written in hPDL (an XML Process Definition Language similar to JBoss jBPM jPDL).

Oozie workflow actions start jobs in remote systems (e.g. Hadoop, Pig). Upon action completion, the remote system calls back Oozie to notify it that the action has completed; at this point Oozie proceeds to the next action in the workflow.

Oozie workflows contain control flow nodes and action nodes.

Control flow nodes define the beginning and the end of a workflow (start, end and fail nodes) and provide a mechanism to control the workflow execution path (decision, fork and join nodes).

Action nodes are the mechanism by which a workflow triggers the execution of a computation/processing task. Oozie provides support for different types of actions: Hadoop map-reduce, Hadoop file system, Pig, SSH, HTTP, email and Oozie sub-workflow. Oozie can be extended to support additional types of actions.
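To make the distinction between control flow nodes and action nodes concrete, the sketch below embeds a minimal workflow definition and classifies its top-level nodes. The workflow name, the action name `wordcount` and the kill message are made up for illustration. Note that in the workflow XML schema the failure node is written as a `kill` element. The parsing helper is only an illustration, not part of Oozie.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal workflow; ${jobTracker} and ${nameNode} are
# parameters that would be supplied when the job is submitted.
WORKFLOW_XML = """
<workflow-app xmlns="uri:oozie:workflow:0.5" name="example-wf">
    <start to="wordcount"/>
    <action name="wordcount">
        <map-reduce>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
        </map-reduce>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Word count failed</message>
    </kill>
    <end name="end"/>
</workflow-app>
"""

CONTROL_FLOW_TAGS = {"start", "end", "kill", "decision", "fork", "join"}


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.split("}", 1)[-1]


def classify(xml_text):
    """Split top-level workflow nodes into control flow tags and actions."""
    root = ET.fromstring(xml_text.strip())
    control, actions = [], []
    for node in root:
        tag = local_name(node.tag)
        if tag in CONTROL_FLOW_TAGS:
            control.append(tag)
        elif tag == "action":
            # The first child of <action> names the action type.
            actions.append((node.get("name"), local_name(node[0].tag)))
    return control, actions


control, actions = classify(WORKFLOW_XML)
print("control flow nodes:", control)
print("action nodes:", actions)
```

The `ok` and `error` transitions inside the action are what encode the control dependency: the workflow moves to `end` only after `wordcount` has completed successfully.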

Oozie workflows can be parameterized (using variables like ${inputDir} within the workflow definition). When submitting a workflow job, values for the parameters must be provided. If properly parameterized (e.g. using different output directories), several identical workflow jobs can run concurrently.
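The effect of parameterization can be sketched as follows. This is a simplified stand-in for illustration only, not Oozie's actual expression-language handling. The `resolve` helper, the `outputDir` parameter and the paths are all hypothetical.

```python
import re

# Matches simple ${name} placeholders such as ${inputDir}.
PARAM = re.compile(r"\$\{(\w+)\}")


def resolve(text, params):
    """Replace each ${name} with its value; fail if a value is missing."""
    def lookup(match):
        name = match.group(1)
        if name not in params:
            raise KeyError(f"no value provided for parameter {name!r}")
        return params[name]
    return PARAM.sub(lookup, text)


definition = "read ${inputDir}, write ${outputDir}"

# Two submissions of the same definition with different output directories,
# so the jobs do not overwrite each other's results.
job_a = resolve(definition, {"inputDir": "/data/in", "outputDir": "/data/out/a"})
job_b = resolve(definition, {"inputDir": "/data/in", "outputDir": "/data/out/b"})
print(job_a)
print(job_b)
```

Raising an error for a missing value reflects the requirement that every parameter be provided at submission time.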

Documentation:

The Oozie web service is bundled with built-in detailed documentation.

More information can be found at: http://oozie.apache.org/

Oozie Quick Start: http://oozie.apache.org/docs/5.0.0/DG_QuickStart.html

Supported Hadoop Versions:

This version of Oozie was primarily tested against Hadoop 2.4.x and 2.6.x.

If you have any questions/issues, please send an email to:

user@oozie.apache.org

Subscribe using the link:

http://oozie.apache.org/mail-lists.html