The ODD Airflow adapter extracts Data Transformer and Data Transformer run metadata from Apache Airflow (versions up to 1.10.15). The adapter is an implementation of the push model (see https://github.com/opendatadiscovery/opendatadiscovery-specification/blob/main/specification/specification.md#discovery-models). After installation, your Airflow will push a new Data Transformer entity on DAG creation and a Data Transformer run entity on every DAG run.
| Entity type | Entity source |
| --- | --- |
| Data Transformer | DAG |
| Data Transformer run | DAG runs |
For more information about data entities, see https://github.com/opendatadiscovery/opendatadiscovery-specification/blob/main/specification/specification.md#data-model-specification
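For illustration, here is a rough sketch of the kind of entity the adapter pushes to the ingestion API. The field names and values below are assumptions based on the linked specification; the exact payload emitted by odd-airflow may differ:

```python
# Hypothetical illustration only: the shape follows the ODD data model
# specification, not the adapter's actual serialization code.
data_transformer_entity = {
    "oddrn": "//airflow/airflow_unit_id/dags/your_example_dag",  # assumed ODDRN layout
    "name": "your_example_dag",
    "type": "JOB",  # DataEntity type per the ODD specification
    "metadata": {"tags": ["example"]},
}
```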
```bash
pip3 install odd-airflow
```
```python
from odd_airflow import DAG

default_args = {
    "data_catalog_base_url": "https://yourcatalog.url",  # data catalog ingestion API URL
    "unit_id": "airflow_unit_id",  # host of the Airflow source, or any name used for ODDRN generation (to uniquely identify data entities)
}

dag = DAG(
    dag_id='your_example_dag',
    default_args=default_args,
    schedule_interval=None,
    tags=['example'],
)

# Your tasks
```
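As a minimal sketch of where your tasks would go, the example below attaches a standard Airflow 1.10.x `PythonOperator` to the `dag` defined above; the task body is a placeholder:

```python
from airflow.operators.python_operator import PythonOperator  # Airflow 1.10.x import path

def print_hello():
    # Placeholder task body
    print("hello from your_example_dag")

hello_task = PythonOperator(
    task_id="print_hello",
    python_callable=print_hello,
    dag=dag,  # the adapter's DAG is assumed to accept tasks like Airflow's stock DAG
)
```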
Alternatively, you can configure the adapter through environment variables:
```bash
DATA_CATALOG_BASE_URL=https://yourcatalog.url
AIRFLOW_UNIT_ID=airflow_unit_id
```
- Python 3.8
- Airflow <= 1.10.15
```bash
docker-compose -f docker/docker-compose.yml up
```
The Airflow UI will be available at http://localhost:8081