Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(snowflake): add config option to specify deny patterns for upstreams #7962

Conversation

mayurinehate
Copy link
Collaborator

@mayurinehate mayurinehate commented May 4, 2023

Followup on #7894

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same.
  • For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub

@github-actions github-actions bot added the ingestion PR or Issue related to the ingestion of metadata label May 4, 2023
@mayurinehate mayurinehate requested a review from hsheth2 May 8, 2023 07:48
Copy link
Collaborator

@asikowitz asikowitz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple minor comments but LGTM. I'd really like to get some focused connector tests around this at some point (if priorities allow), because regexes can be hard to debug

):
raise ValueError(
"Can not perform Deletion Detection, Lineage Extraction, Profiling without extracting snowflake technical schema. Set `include_technical_schema` to True or disable Deletion Detection, Lineage Extraction, Profiling."
"Can not perform Deletion Detection, Profiling without extracting snowflake technical schema. Set `include_technical_schema` to True or disable Deletion Detection, Lineage Extraction, Profiling."
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to remove the reference to lineage extraction at the end of the comment here?

)


def create_deny_regex_sql_filter(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although will be annoying, definitely something we should test in connector tests in some way

@@ -426,20 +447,30 @@ def view_lineage_history(
) = 1
"""

# Note on use of `upstreams_deny_pattern`` to ignore temporary tables:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra ` here

Comment on lines -715 to -718
AND upstream_column_table_name NOT LIKE '%.FIVETRAN\\_%\\_STAGING.%'
AND upstream_column_table_name NOT LIKE '%\\_\\_DBT\\_TMP'
AND upstream_table_name NOT LIKE '%.FIVETRAN\\_%\\_STAGING.%'
AND upstream_table_name NOT LIKE '%\\_\\_DBT\\_TMP'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We escape underscores here but not in DEFAULT_UPSTREAMS_DENY_LIST, just want to make sure this is ok.

@asikowitz asikowitz merged commit c845c75 into datahub-project:master May 8, 2023
44 of 45 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ingestion PR or Issue related to the ingestion of metadata
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants