## DOP v0.3.0 — 2021-08-11

### Features

  • Support for "generic" airflow operators: you can now use regular python operators as part of your config files.

- Support for the "dbt docs" command to generate documentation for all dbt tasks: users can now add "docs generate" as a target in their DOP configuration and additionally specify, via the `--bucket` and `--bucket-path` options, a GCS bucket to which the generated docs are copied (a sketch follows after this list).

- Serve dbt docs: documentation generated by dbt can be served as a web page by deploying the provided app on GAE. Note that deploying is an additional step that needs to be carried out after the docs have been generated; see infrastructure/dbt-docs/README.md for details.

- dbt `run_results` artifacts saved to BigQuery: this JSON file contains information on completed dbt invocations and is saved to the BigQuery table `run_results` for analysis and debugging.

- Add support for Airflow v1.10.14 and v1.10.15 local environments: users can specify which version to use by setting the `AIRFLOW_VERSION` environment variable (e.g. `AIRFLOW_VERSION=1.10.15`).

- Pre-commit linters: added pre-commit hooks to enforce consistent formatting and style for Python and YAML files (with some support for plain-text files) across the DOP codebase (example config after this list).
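
As an illustration of the generic-operator feature above, a task entry using a stock Airflow operator might look like the sketch below. The surrounding keys (`identifier`, `kind`, `params`) are assumptions for illustration, not DOP's documented schema; the operator path is Airflow 1.10's real BashOperator.

```yaml
# Hypothetical DOP task entry -- the key names here are illustrative,
# not DOP's documented schema.
- identifier: say_hello
  kind: airflow_operator  # assumed marker for a "generic" operator task
  operator: airflow.operators.bash_operator.BashOperator  # real Airflow 1.10 path
  params:
    bash_command: "echo 'hello from a generic operator'"
```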
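
The "docs generate" target could then be wired up along the same lines. Only `docs generate`, `--bucket` and `--bucket-path` come from this changelog entry; every other key below is a guess.

```yaml
# Hypothetical sketch -- bucket name and path are placeholders.
- identifier: dbt_docs
  kind: dbt
  target: docs generate
  options: "--bucket my-dbt-docs-bucket --bucket-path dop/docs"
```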
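
Pre-commit hooks live in a `.pre-commit-config.yaml` at the repository root. This changelog does not list the exact hooks DOP pins, so the snippet below is a generic example of the mechanism using well-known hooks:

```yaml
# .pre-commit-config.yaml -- generic example; DOP's actual hook list may differ.
repos:
  - repo: https://github.com/psf/black
    rev: 21.7b0  # Python formatter
    hooks:
      - id: black
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.0.1
    hooks:
      - id: check-yaml          # YAML validity
      - id: end-of-file-fixer   # plain-text consistency
```

Running `pre-commit install` once sets up the git hook; `pre-commit run --all-files` lints the whole tree on demand.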

### Changes

- Ensure DAGs using the same dbt project do not run concurrently: a safety feature that allows selective execution of workflows by calling specific commands or tags (e.g. `dbt run -m`) within a single dbt project. Workflows sharing a dbt project also share the same target location (within the dbt container), so running them concurrently could cause them to override each other's artifacts.

- Test time-partitioning: time-partitioning of datetime type is now properly validated as part of schema validation (see the note after this list).

- Use Python 3.7 and dbt 0.19.1 in the Composer K8s operator.

- Add Dataflow example task: with the introduction of "regular" Airflow operators in the YAML config, it is now possible to run compute-intensive Dataflow jobs. Check example_dataflow_template for an example of how to implement a Dataflow pipeline (a sketch follows below).
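
On the time-partitioning change: this changelog does not spell out DOP's own partitioning keys, so as context the snippet below shows how a datetime partition is declared for BigQuery in dbt 0.19 (real dbt syntax, in dbt_project.yml); DOP now validates the equivalent settings in its schema.

```yaml
# dbt 0.19 BigQuery partitioning in dbt_project.yml (real dbt syntax).
models:
  my_project:
    events:
      materialized: table
      partition_by:
        field: created_at
        data_type: datetime  # the datetime case is what the validation fix covers
```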
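
And building on the generic-operator feature, a Dataflow task could be declared roughly as follows. The operator path is Airflow 1.10's real DataflowTemplateOperator; the surrounding keys and GCS paths are illustrative placeholders, not the contents of example_dataflow_template.

```yaml
# Hypothetical sketch -- see example_dataflow_template in the repo for the real version.
- identifier: run_dataflow_job
  kind: airflow_operator  # assumed marker, as in the generic-operator sketch above
  operator: airflow.contrib.operators.dataflow_operator.DataflowTemplateOperator
  params:
    template: gs://my-bucket/templates/my_template  # placeholder template path
    job_name: dop-dataflow-example
    parameters:
      inputFile: gs://my-bucket/input.csv  # placeholder pipeline parameter
```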