-
-
Notifications
You must be signed in to change notification settings - Fork 39
Version 3.0.0 #145
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Version 3.0.0 #145
Conversation
0f03285
to
dd0fddd
Compare
dd0fddd
to
8d6cc58
Compare
8d6cc58
to
c666d7b
Compare
Hi, @tomasfarias |
Hello @millin. Thanks for the work! I was to busy the previous one, so I'll be taking a look this week. |
Hello, @tomasfarias! |
Apologies @millin, I've been out sick for a few days, and I missed my original deadline. I'll review this now. The PR is quite large, so I'll go commit by commit. |
) | ||
|
||
|
||
def py37_copytree(source: URL, destination: URL, replace: bool = True): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
praise: Happy to see this workaround removed. Fully behind dropping support for Python 3.7.
default_factory=list, repr=False | ||
) | ||
|
||
# legacy behaviors - https://github.com/dbt-labs/dbt-core/blob/main/docs/guides/behavior-change-flags.md |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
praise: neat 👍
airflow_dbt_python/hooks/dbt.py
Outdated
conn = copy(conn) | ||
extra_dejson = conn.extra_dejson | ||
options = extra_dejson.pop("options") | ||
for k, v in re.findall(r"-c (\w+)=(.*)$", options): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue(blocking): Could we modify the docstring (or add a comment) explaining what are we trying to find with this regular expression?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now in a different module, but the same regular expression is present.
airflow_dbt_python/hooks/dbt.py
Outdated
"password", | ||
depends_on=lambda x: not any( | ||
( | ||
*(k in x.extra_dejson for k in ("private_key_file", "private_key_content")), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue(blocking): I don't think we care whether there are private key file and contents in extra_dejson
as long as we have our password and .get("authenticator", "") == "oauth"
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should keep the mental model small for users with only one key to consider to determine what everything else means (authenticator).
airflow_dbt_python/hooks/dbt.py
Outdated
DbtConnectionParam( | ||
"login", | ||
"user", | ||
depends_on=lambda x: x.extra_dejson.get("authenticator", "") != "oauth", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: I feel like we could just have a utility function that we can re-use here instead of using lambda:
def make_extra_dejson_compare_callable(key, default, comparison_operator, expected):
def compare_extra_dejson_value(conn):
return comparison_operator(conn.extra_dejson.get(key, default), expected))
return compare_extra_dejson_value
Then here:
import operator
...
DbtConnectionParam(
"login",
"user",
depends_on=make_extra_dejson_compare_callable("authenticator", "", operator.ne, "oauth"),
),
This one is kind of a nitpick. I do prefer an approach like this to using lambda, but I think the other comments are more important.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I liked this idea and decided to combine recurring parameters together
|
||
return {self.conn.conn_id: details} | ||
|
||
def get_dbt_details_from_connection(self, conn: Connection) -> dict[str, Any]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
praise: Okay, this is no longer a classmethod, safe to ignore my comment on this topic. I think this is now fine.
raise ValueError(f"Unsupported scheme: {url.scheme}") | ||
|
||
return client, path | ||
path, *remain = path.split("@", maxsplit=1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
praise: neat 👍
from airflow_dbt_python.utils.url import URL, URLLike | ||
|
||
|
||
class DbtGCSRemoteHook(GCSHook, DbtRemoteHook): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
praise: I'm not very familiar with GCS unfortunately, so I'll default on saying this is fine 👍, although I do have one comment about some (apparently) copy-pasted comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, pretty lengthy PR, so naturally a pretty lengthy review is in order.
I've tagged all my review comments with conventional comments to give you an idea of what needs more focus to address.
Ultimately, I agree with the direction of the changes, so if we get these comments addressed I have no problem with shipping this.
pyproject.toml
Outdated
types-freezegun = ">=1.1.6" | ||
types-PyYAML = ">=6.0.7" | ||
pytest-mock = "^3.14.0" | ||
mock-gcp = {git = "https://github.com/millin/mock-gcp.git", rev = "0d972df9b6cce164b49f09ec4417a4eb77beb960"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: I skimmed through this, and it seems like upstream is not active. Could you publish your own fork in PyPI? I feel a bit uneasy about including a git source, so if publishing to PyPI is not possible I would like a comment briefly describing what are we getting from the fork that is not in upstream.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The difference with upstream is that it mostly doesn't work 😄. I sent a PR to the original author but haven't gotten a response. But I liked the concept and used it as a basis.
I may try publishing this in PyPI (never done it, to be honest).
d8d6a79
to
5ff0c76
Compare
5ff0c76
to
e136027
Compare
@tomasfarias Please check out the new changes |
pyproject.toml
Outdated
types-PyYAML = ">=6.0.7" | ||
pytest-mock = "^3.14.0" | ||
mock-gcp = {git = "https://github.com/millin/mock-gcp.git", rev = "0d972df9b6cce164b49f09ec4417a4eb77beb960"} | ||
mock-gcp = ">=0.2.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
praise: Thank you for this.
@@ -1 +1 @@ | |||
"""Hooks module provides DbtHooks and DbtRemoteHooks.""" | |||
"""Hooks module provides DbtHooks and DbtFSHooks.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
praise: Fantastic work in this commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've got no more comments. Let's ship this!
target
parameter is no longer supported.Use
dbt_conn_id
parameter instead.✨ Features
[ Bug ] Wrong class name for BigQuery #142
[Feature] Snowflake Key Pair authentication #147
@branch_name
to the end of the URL)[Feature] Support branch checkout for Git remote hook #122
[Feature] Add support for GCS bucket #139
🐛 Bug Fixes
project_conn_id
instead ofprofiles_conn_id
in the download_dbt_profiles method.setup_dbt_logging
.profiles_dir
parameter when specified.The operator wouldn't recognize profiles_dir if it finds profile.yml in the project_dir #133
🔥 Refactoring
py37_copytree
as Python 3.7 is no longer supported.functools.cache
to cache get_remote method calls.🧪 Tests
🔍 Other