Welcome to the Azkaban Wiki!
Azkaban is a batch workflow job scheduler. A workflow often needs to run a set of jobs and processes in a particular order. Azkaban resolves these job dependencies and provides an easy-to-use web user interface for maintaining and tracking your workflows.
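To illustrate what resolving job dependencies means, here is a minimal sketch in Python. It is not Azkaban's implementation; the function `execution_order` and the job names are hypothetical, chosen only for this example. It orders jobs with a standard topological sort (Kahn's algorithm) so that every job runs after the jobs it depends on.

```python
from collections import deque


def execution_order(dependencies):
    """Return job names ordered so each job follows its dependencies.

    `dependencies` maps a job name to the list of jobs it depends on.
    Illustrative only; this is not how Azkaban is implemented.
    """
    # Collect every job, including ones that appear only as a dependency.
    jobs = set(dependencies)
    for deps in dependencies.values():
        jobs.update(deps)

    # Count unmet dependencies per job and record who is waiting on whom.
    remaining = {job: len(set(dependencies.get(job, ()))) for job in jobs}
    dependents = {job: [] for job in jobs}
    for job, deps in dependencies.items():
        for dep in set(deps):
            dependents[dep].append(job)

    # Start with jobs that have no dependencies (sorted for a stable result).
    ready = deque(sorted(job for job, count in remaining.items() if count == 0))
    order = []
    while ready:
        job = ready.popleft()
        order.append(job)
        for child in sorted(dependents[job]):
            remaining[child] -= 1
            if remaining[child] == 0:
                ready.append(child)

    # Any job left unscheduled is part of a dependency cycle.
    if len(order) != len(jobs):
        raise ValueError("cyclic dependency detected")
    return order


# Hypothetical flow: each job lists the jobs it depends on.
flow = {
    "extract": [],
    "transform": ["extract"],
    "load": ["transform"],
    "report": ["load", "transform"],
}
print(execution_order(flow))
```

A scheduler built this way can also reject a workflow up front when its dependencies form a cycle, which is why the sketch raises an error instead of returning a partial order.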
Here are a few of its features:
- Web UI
- Easy workflow uploads
- Easy to set up job dependencies
- Schedule workflows
- Authentication/Authorization (permissions on jobs)
- Ability to kill and restart workflows
- Modular and pluggable
- Project workspaces
- Logging and auditing of workflows and jobs
How do we use it?
Azkaban has been running at LinkedIn since 2009. The project is actively maintained, and changes are typically deployed to multiple production clusters on a two-week cycle.
LinkedIn uses it primarily to run scheduled Hadoop workflows, ETL flows, Spark flows, machine learning flows, and even some real-time data flows.
Today at LinkedIn we run over 25,000 flows a day across Azkaban clusters.