Azkaban Execution

This doc is for older versions (v0.2.1 and before) of WhereHows. Please refer to this for the latest version.

Collect Azkaban execution information, including Azkaban flows/jobs definitions, DAGs, executions, owners, and schedules.


List of properties required for the ETL process:

configuration key    description
az.db.driver         Azkaban database driver, e.g., com.mysql.jdbc.Driver
az.db.jdbc.url       Azkaban database JDBC URL (not including username and password), e.g., jdbc:mysql://localhost:3306/azkaban
az.db.password       Azkaban database password
az.db.username       Azkaban database username
                     lookback period in minutes for executions
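
As a rough illustration of checking these properties before a run, the sketch below parses `key=value` text and reports which of the four database keys are missing. The `key=value` storage format, the helper names `parse_properties` and `missing_keys`, and the sample username are assumptions made for this example; they are not taken from WhereHows itself.

```python
# Keys listed in the table above; all four are needed to open the connection.
REQUIRED_KEYS = ("az.db.driver", "az.db.jdbc.url", "az.db.username", "az.db.password")


def parse_properties(text):
    """Parse simple `key=value` lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that contain no '='
            props[key.strip()] = value.strip()
    return props


def missing_keys(props):
    """Return the required keys that are absent or empty, in table order."""
    return [key for key in REQUIRED_KEYS if not props.get(key)]


# Hypothetical sample: the password has been left out on purpose.
SAMPLE = """\
# Azkaban database connection
az.db.driver=com.mysql.jdbc.Driver
az.db.jdbc.url=jdbc:mysql://localhost:3306/azkaban
az.db.username=azkaban_reader
"""

props = parse_properties(SAMPLE)
print(missing_keys(props))  # prints ['az.db.password']
```

Failing early on a missing key is usually easier to diagnose than a connection error raised halfway through the extract step.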


Extract

Major related file:

Connect to the Azkaban MySQL database, collect metadata, and store it in a local file.

Major source tables from Azkaban database: project_flows, execution_flows, triggers, project_permissions
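
The general shape of this step, reading rows from a source table and writing them to a local file, might look like the sketch below. It is not the WhereHows extract script: an in-memory `sqlite3` database stands in for the Azkaban MySQL database so the example runs anywhere, the function `dump_table` is hypothetical, and the three columns shown for `execution_flows` are illustrative rather than the real Azkaban schema.

```python
import json
import os
import sqlite3
import tempfile


def dump_table(conn, table, out_path):
    """Write every row of `table` to `out_path`, one JSON object per line."""
    cur = conn.cursor()
    # The table name comes from a fixed list of source tables, not user input.
    cur.execute("SELECT * FROM " + table)
    columns = [desc[0] for desc in cur.description]
    count = 0
    with open(out_path, "w") as out:
        for row in cur:
            out.write(json.dumps(dict(zip(columns, row))) + "\n")
            count += 1
    return count


# Stand-in database with illustrative columns (assumption, not the real schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE execution_flows (exec_id INTEGER, flow_id TEXT, status INTEGER)")
conn.executemany(
    "INSERT INTO execution_flows VALUES (?, ?, ?)",
    [(1, "daily_etl", 50), (2, "daily_etl", 70)],
)

out_path = os.path.join(tempfile.mkdtemp(), "execution_flows.json")
rows_written = dump_table(conn, "execution_flows", out_path)
print(rows_written)  # prints 2
```

Because the function only relies on the standard Python DB-API cursor interface, the same loop should work with a MySQL driver connection in place of the `sqlite3` one.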


Transform

Major related file:

Transform the JSON output of the extract step into CSV format.
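
A minimal sketch of a JSON-to-CSV conversion follows, assuming the extracted data is one JSON object per line. The function name `json_lines_to_csv` and the field names are hypothetical and chosen only for illustration; the real transform script may shape its records differently.

```python
import csv
import io
import json


def json_lines_to_csv(json_lines, fieldnames):
    """Convert an iterable of JSON-object strings into CSV text with a header row."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=fieldnames,
        extrasaction="ignore",  # drop keys that are not listed in fieldnames
        lineterminator="\n",
    )
    writer.writeheader()
    for line in json_lines:
        if line.strip():  # skip blank lines
            writer.writerow(json.loads(line))
    return out.getvalue()


# Illustrative records (assumed field names).
records = [
    '{"exec_id": 1, "flow_id": "daily_etl", "status": 50}',
    '{"exec_id": 2, "flow_id": "daily_etl", "status": 70}',
]
csv_text = json_lines_to_csv(records, ["exec_id", "flow_id", "status"])
print(csv_text)
```

Fixing the column order through `fieldnames` keeps the CSV layout stable even if the key order in the JSON records changes.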


Load

Major related file:

Load the CSV output into the MySQL database. Major related tables: flow, flow_job, flow_dag, flow_schedule, flow_owner_permission, flow_execution, job_execution
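
The load step could be sketched as below: read CSV text with a header row and insert each record into a target table. As before, `sqlite3` stands in for MySQL so the example is runnable, `load_csv` is a hypothetical helper, and the columns given to `flow_execution` are illustrative, not the actual table definition.

```python
import csv
import io
import sqlite3


def load_csv(conn, table, csv_text):
    """Insert the rows of `csv_text` into `table`; the header row names the columns."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    placeholders = ", ".join("?" for _ in header)
    # Table and column names come from trusted configuration, not user input.
    sql = "INSERT INTO {} ({}) VALUES ({})".format(table, ", ".join(header), placeholders)
    rows = list(reader)
    conn.executemany(sql, rows)
    conn.commit()
    return len(rows)


# Stand-in target table with illustrative columns (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flow_execution (exec_id INTEGER, flow_id TEXT, status INTEGER)")

csv_text = "exec_id,flow_id,status\n1,daily_etl,50\n2,daily_etl,70\n"
loaded = load_csv(conn, "flow_execution", csv_text)
print(loaded)  # prints 2
```

Note that `sqlite3` uses `?` as its parameter placeholder, while common Python MySQL drivers use `%s`, so the SQL template would need that adjustment against a real MySQL target.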

