feat: Support Airflow connections for dbt targets #53

Merged
merged 11 commits into from Apr 29, 2022

Conversation

tomasfarias
Owner

Closes #52

The objective of airflow-dbt-python has always been to make dbt a
first-class citizen of Airflow. As part of this goal, we want to
integrate dbt with all of Airflow's features. In particular, Airflow
connections allow us to safely store connection information for
operators to use, so all dbt operators should be able to leverage them
too.

The way we achieve this is by manually instantiating a dbt Project and
Profile. When reading the latter, we also inject any Airflow
connections that match the given target argument (if any). Moreover,
if the profile is not defined, we simply create our own with any
Airflow connection that was passed as a target (of course, missing
both a profiles file and an Airflow connection raises an error).
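The resolution logic described above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the code merged in this PR: the `Connection` dataclass is a hypothetical stand-in for an Airflow connection, `connection_to_target` and `build_profile` are made-up names, and the field mapping is only a guess at what a best-effort mapping might look like. It also assumes that a matching connection takes precedence over a target of the same name in the profiles file, which the description does not state.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Connection:
    """Hypothetical stand-in for an Airflow connection (not airflow.models.Connection)."""

    conn_id: str
    conn_type: str
    host: Optional[str] = None
    login: Optional[str] = None
    password: Optional[str] = None
    schema: Optional[str] = None
    port: Optional[int] = None
    extra: Dict[str, Any] = field(default_factory=dict)


def connection_to_target(conn: Connection) -> Dict[str, Any]:
    """Best-effort mapping from connection fields to dbt target keys (illustrative only)."""
    target = {
        "type": conn.conn_type,
        "host": conn.host,
        "user": conn.login,
        "password": conn.password,
        "schema": conn.schema,
        "port": conn.port,
    }
    # Drop unset fields, then let any extra arguments fill in the rest.
    target = {key: value for key, value in target.items() if value is not None}
    target.update(conn.extra)
    return target


def build_profile(
    profile_name: str,
    target_name: str,
    profiles: Optional[Dict[str, Any]],
    connections: Dict[str, Connection],
) -> Dict[str, Any]:
    """Return a profile dict whose outputs include the requested target.

    `profiles` is the parsed content of a profiles file, or None if there is none.
    `connections` maps connection ids to connections.
    """
    # Start from the profile in the file, or from an empty one if it is missing.
    profile = dict((profiles or {}).get(profile_name, {}))
    outputs = dict(profile.get("outputs", {}))

    # Inject the connection whose id matches the target argument, if any.
    # Assumption: the connection overrides a same-named target from the file.
    conn = connections.get(target_name)
    if conn is not None:
        outputs[target_name] = connection_to_target(conn)

    # Neither the profiles file nor a connection defines the target: error out.
    if target_name not in outputs:
        raise ValueError(
            f"Target '{target_name}' not found in profiles or connections"
        )

    profile["outputs"] = outputs
    profile["target"] = target_name
    return profile
```

With no profiles file at all, calling `build_profile("default", "my_db", None, {"my_db": conn})` yields a profile whose only output is built from the connection, while asking for a target that exists in neither place raises a `ValueError`.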

Future work may extend this to support custom connection types. At the
moment, we make a best effort to include all possible arguments, but
it's not perfect.

This feature was suggested in the dbt Slack channel as a way to avoid
having to manage a profiles.yml file, which may contain sensitive
information and is therefore a poor candidate for version control.

Dependency resolution was taking forever with Poetry. To solve this,
we added Airflow's list of constraints. A few had to be relaxed to
allow for dbt and Airflow to play well together.
@tomasfarias
Owner Author

Tests will probably fail for Airflow 1.10.12. Fixing that will take a bit of work, as this is quite a big change. I'm seriously considering just dropping Airflow v1 support.

@tomasfarias tomasfarias self-assigned this Apr 26, 2022
@tomasfarias tomasfarias added the enhancement New feature or request label Apr 26, 2022
@tomasfarias
Owner Author

Before merging this, we should also update the documentation to mention the new feature and add some examples.

@codecov
codecov bot commented Apr 28, 2022

Codecov Report

Merging #53 (fa9d991) into master (a1e535c) will increase coverage by 0.75%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master      #53      +/-   ##
==========================================
+ Coverage   97.83%   98.58%   +0.75%     
==========================================
  Files           8        8              
  Lines         830      917      +87     
==========================================
+ Hits          812      904      +92     
+ Misses         18       13       -5     
| Flag | Coverage Δ |
| --- | --- |
| unittests | 98.58% <100.00%> (+0.75%) ⬆️ |

| Impacted Files | Coverage Δ |
| --- | --- |
| airflow_dbt_python/hooks/backends/s3.py | 96.62% <100.00%> (+2.31%) ⬆️ |
| airflow_dbt_python/hooks/dbt.py | 98.54% <100.00%> (+0.99%) ⬆️ |
| airflow_dbt_python/operators/dbt.py | 99.66% <100.00%> (+0.33%) ⬆️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@tomasfarias tomasfarias force-pushed the targets-from-connections branch 8 times, most recently from 0f86475 to 76b3436 Compare April 29, 2022 10:15
@tomasfarias tomasfarias force-pushed the targets-from-connections branch 4 times, most recently from eb7cbcc to d0b09d0 Compare April 29, 2022 11:57
@tomasfarias tomasfarias force-pushed the targets-from-connections branch 2 times, most recently from 2ce42ca to bced4b0 Compare April 29, 2022 14:37
@tomasfarias tomasfarias force-pushed the targets-from-connections branch 4 times, most recently from 6f3e009 to 8ccb715 Compare April 29, 2022 16:08
@tomasfarias tomasfarias marked this pull request as ready for review April 29, 2022 17:44
@tomasfarias tomasfarias merged commit c4d13d3 into master Apr 29, 2022
@tomasfarias tomasfarias deleted the targets-from-connections branch April 29, 2022 17:52
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature] Use Airflow connections instead of a profiles.yml file
1 participant