HappyRay edited this page Sep 15, 2017 · 13 revisions

Welcome to the Azkaban Wiki!

Azkaban is a batch workflow job scheduler. There is often a need to run a set of jobs and processes in a particular order within a workflow. Azkaban resolves these job dependencies and provides an easy-to-use web user interface to maintain and track your workflows.
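To illustrate what resolving job dependencies means, here is a minimal sketch in Python. It is not Azkaban's implementation; the job names and the `execution_order` helper are hypothetical, and the standard-library `graphlib` module (Python 3.9+) stands in for the scheduler's own logic. It shows how a set of jobs with declared dependencies can be turned into a valid run order.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each job maps to the set of jobs it depends on.
workflow = {
    "extract": set(),
    "clean": {"extract"},
    "aggregate": {"extract"},
    "report": {"clean", "aggregate"},
}


def execution_order(jobs):
    """Return one valid run order in which every job follows its dependencies."""
    # static_order() raises graphlib.CycleError if the dependencies form a cycle.
    return list(TopologicalSorter(jobs).static_order())


order = execution_order(workflow)
print(order)
```

Here `clean` and `aggregate` depend only on `extract`, so either may run first (or both in parallel); `report` always comes last.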

Here are a few of its features:

  • Web UI
  • Easy workflow uploads
  • Easy to set up job dependencies
  • Schedule workflows
  • Authentication/Authorization (permissions on jobs)
  • Ability to kill and restart workflows
  • Modular and pluggable
  • Project workspaces
  • Logging and auditing of workflows and jobs

Wiki Navigation

[Get Latest Release]

How do we use it?

Azkaban has been running at LinkedIn since 2009. The project is actively maintained, and changes are typically deployed to multiple production clusters on a two-week cycle.

At LinkedIn, we use it primarily to run our various scheduled Hadoop workflows, ETL flows, Spark flows, machine learning flows, and even some real-time data flows.

Today at LinkedIn, we run over 25,000 flows a day across our Azkaban clusters.
