feat: Support Airflow connections for dbt targets #53
The objective of airflow-dbt-python has always been to make dbt a first-class citizen of Airflow. As part of this goal, we want to integrate dbt with all of Airflow's features. In particular, Airflow connections allow us to safely store connection information for operators to use, so all dbt operators should be able to leverage them too.

The way we achieve this is by manually instantiating a dbt Project and Profile. When reading the latter, we also inject any Airflow connection that matches the given target argument (if any). Moreover, if no profile is defined, we simply create our own from the Airflow connection that was passed as a target. Of course, missing both a profiles file and an Airflow connection raises an error.

Future work may extend this to support custom connection types. At the moment we make a best effort to include all possible arguments, but it's not perfect.

This feature was suggested in the dbt Slack channel as a way to avoid having to manage a profiles.yml file, which may contain sensitive information and is therefore a bad candidate for version control.
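To make the resolution logic described above easier to follow, here is a minimal sketch in plain Python. It is not the code in this PR: `ConnectionInfo`, `connection_to_target`, and `resolve_profile` are hypothetical names, the field mapping (for example `login` to `user`) is an assumption, and so is the choice to let a matching connection override a target already present in the profile.

```python
import copy
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class ConnectionInfo:
    """Hypothetical stand-in for the fields stored in an Airflow connection."""

    conn_id: str
    conn_type: str
    host: Optional[str] = None
    login: Optional[str] = None
    password: Optional[str] = None
    schema: Optional[str] = None
    port: Optional[int] = None
    extra: Dict[str, Any] = field(default_factory=dict)


def connection_to_target(conn: ConnectionInfo) -> Dict[str, Any]:
    """Best-effort mapping of connection fields to a dbt target (assumed names)."""
    candidates = {
        "type": conn.conn_type,
        "host": conn.host,
        "user": conn.login,
        "password": conn.password,
        "schema": conn.schema,
        "port": conn.port,
    }
    # Drop unset fields, then let extras supply adapter-specific arguments.
    target = {key: value for key, value in candidates.items() if value is not None}
    target.update(conn.extra)
    return target


def resolve_profile(
    target: str,
    profile: Optional[Dict[str, Any]],
    connections: Dict[str, ConnectionInfo],
) -> Dict[str, Any]:
    """Return a profile dict, injecting the connection that matches ``target``.

    ``profile`` is the parsed content of one profile, or None if no
    profiles file was found.
    """
    conn = connections.get(target)
    if profile is None and conn is None:
        raise ValueError(
            f"No profile found and no Airflow connection named '{target}'"
        )

    if profile is None:
        # No profiles file: build a profile from the connection alone.
        resolved: Dict[str, Any] = {"target": target, "outputs": {}}
    else:
        resolved = copy.deepcopy(profile)
        resolved.setdefault("outputs", {})

    if conn is not None:
        # Assumption: a matching connection overrides an existing target.
        resolved["outputs"][target] = connection_to_target(conn)
    return resolved
```

With a connection whose id is `my_db`, calling `resolve_profile("my_db", None, {"my_db": conn})` yields a profile whose only output is built from that connection, so no profiles.yml is needed. Calling it with neither a profile nor a matching connection raises `ValueError`.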
Dependency resolution was taking forever with Poetry. To solve this, we added Airflow's list of constraints. A few had to be relaxed to allow for dbt and Airflow to play well together.
Tests will probably fail for Airflow 1.10.12. Fixing that will take a bit of work, as this is quite a big change. I'm seriously considering just dropping Airflow v1 support.
It just occurred to me that, before merging this, we should also update the documentation to mention the new feature and add some examples.
Codecov Report
@@ Coverage Diff @@
## master #53 +/- ##
==========================================
+ Coverage 97.83% 98.58% +0.75%
==========================================
Files 8 8
Lines 830 917 +87
==========================================
+ Hits 812 904 +92
+ Misses 18 13 -5
Closes #52