Intermix.io Airflow Plugin
The intermix.io Airflow plugin surfaces important information about the performance of Airflow DAGs and Tasks.
The plugin automatically annotates SQL queries that Airflow executes via PostgresOperator. The annotations follow the intermix.io Tagging Specification. Tagging your SQL lets you use the intermix.io App Tracing suite of tools.
The plugin works by prepending each query with a SQL comment containing metadata about the query itself (Airflow DAG, task, user, etc.). The comment does not slow down query execution or change the query's logic. The metadata is used to provide data inside our analytics service.
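To illustrate the idea of prepending a metadata comment, here is a minimal sketch. It is not the plugin's implementation, and the comment layout (a JSON object inside a block comment) is an assumption made for this example rather than the format defined by the intermix.io Tagging Specification. The function name annotate_sql and its parameters are hypothetical.

```python
import json


def annotate_sql(sql, dag_id, task_id, user):
    """Return sql with a metadata block comment prepended.

    Hypothetical helper: the JSON layout is an assumption for illustration,
    not the real intermix.io tag format.
    """
    metadata = {"dag": dag_id, "task": task_id, "user": user}
    # sort_keys keeps the comment text stable across runs.
    tag = json.dumps(metadata, sort_keys=True)
    # Make sure a value cannot close the comment early.
    tag = tag.replace("*/", "* /")
    # The comment sits on its own line; the original statement is untouched.
    return "/* {} */\n{}".format(tag, sql)


tagged = annotate_sql(
    "SELECT 1;", dag_id="example_dag", task_id="example_task", user="airflow"
)
print(tagged)
```

Because the metadata lives entirely inside a comment, the database parses and runs the same statement as before, which is why the result of the query does not change.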
Set the AIRFLOW_HOME environment variable to point to your Airflow data directory.
Copy this folder to all machines and environments that will be executing Airflow.
Run python setup.py install inside this folder in each of those environments.
Support for PostgresHook
Queries run through PostgresOperator are annotated automatically, as described above. In addition, there is lightweight support for the PostgresHook methods get_first, get_records, and run. If you use these methods, the plugin retrieves the file name, line of code, and class name, and attempts to retrieve the DAG and Task name as well.
However, if you are using PostgresHook, we recommend using the intermix.io Python Plugin to explicitly pass in the DAG and Task name.
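As a rough illustration of how a file name, line number, and class name can be recovered at call time, the sketch below uses Python's standard inspect module. This is an assumption about one possible approach, not the plugin's actual code; caller_info and ExampleLoader are hypothetical names. It also suggests why the DAG and Task name are harder to obtain this way: they are not part of the call stack.

```python
import inspect


def caller_info():
    """Describe the code that called this function.

    Hypothetical helper: returns the caller's file name, line number, and,
    when the caller is an instance method, its class name.
    """
    # f_back is the frame of whoever called caller_info().
    frame = inspect.currentframe().f_back
    try:
        instance = frame.f_locals.get("self")
        class_name = type(instance).__name__ if instance is not None else None
        return {
            "file": frame.f_code.co_filename,
            "line": frame.f_lineno,
            "class": class_name,
        }
    finally:
        # Drop the frame reference to avoid a reference cycle.
        del frame


class ExampleLoader(object):
    """Stand-in for user code that would call a hook method."""

    def load(self):
        return caller_info()


info = ExampleLoader().load()
print(info)
```

The class name here is taken from a local variable named self, so it is only found when the caller is an ordinary instance method; calls from module-level code or plain functions yield None.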
This plugin has been tested on:
Airflow version 1.1.0
Airflow version 1.9.0
Python version 2.7.x
Python version 3.6.x
Questions & Support
For questions and support please contact firstname.lastname@example.org.
If you're looking to contribute, please contact us at email@example.com.
intermix.io is a single monitoring dashboard for data engineers to keep an eye on their mission-critical data flows.
This software is published under the MIT license. For the full license text, see the LICENSE file.