feat(ingest/tableau): extract lineage from csql queries #7561

@maaaikoool maaaikoool commented Mar 13, 2023

... that contain unsupported queries.

Context

The Tableau ingestion source relies on the lineage provided by the Tableau Metadata API.

However, Tableau has limited support for lineage through custom SQL tables, so the lineage ingested for those tables is often incomplete. The issue has been reported many times: 1, 2, 3

Proposed solution

This PR adds a new configuration option, disabled by default, that enables parsing the queries of custom SQL tables so that full lineage can be extracted. The option also contains a mapping from each Tableau database name to the desired name.
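
To illustrate the idea, here is a minimal sketch of pulling upstream tables out of a custom SQL query and qualifying them through a database-name mapping. It is not the code from this PR: the function names, the `database_mapping` parameter, and the regex-based table extraction are all assumptions made for illustration, and a real implementation would use a proper SQL parser.

```python
import re
from typing import Dict, List

# Hypothetical, simplified matcher: captures the identifier that follows
# FROM or JOIN. It ignores subqueries, CTEs, and quoted identifiers.
TABLE_REF = re.compile(
    r"\b(?:from|join)\s+([A-Za-z_][\w$]*(?:\.[A-Za-z_][\w$]*)*)",
    re.IGNORECASE,
)


def extract_upstream_tables(query: str) -> List[str]:
    """Return the distinct table names referenced in the query, in order."""
    tables: List[str] = []
    for match in TABLE_REF.finditer(query):
        name = match.group(1)
        if name not in tables:
            tables.append(name)
    return tables


def lineage_for_custom_sql(
    query: str, tableau_database: str, database_mapping: Dict[str, str]
) -> List[str]:
    """Qualify each upstream table with the mapped database name.

    If the Tableau database name has no entry in the (hypothetical)
    mapping, it is used unchanged.
    """
    database = database_mapping.get(tableau_database, tableau_database)
    return [f"{database}.{table}" for table in extract_upstream_tables(query)]


if __name__ == "__main__":
    sql = (
        "SELECT o.id, c.name FROM sales.orders o "
        "JOIN sales.customers c ON o.cid = c.id"
    )
    print(lineage_for_custom_sql(sql, "tableau_db", {"tableau_db": "prod_db"}))
    # ['prod_db.sales.orders', 'prod_db.sales.customers']
```

Keeping the mapping lookup separate from the extraction step means an unmapped database name simply falls through unchanged, which matches the opt-in nature of the feature described above.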

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same.
  • For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub

Closes #5854

@github-actions github-actions bot added the ingestion PR or Issue related to the ingestion of metadata label Mar 13, 2023
@maaaikoool maaaikoool force-pushed the ingest-lineage-from-unsupported-csql-tables branch from d098180 to 53badba Compare March 13, 2023 16:33

@mayurinehate mayurinehate left a comment

Minor comment about removing the new large (42k lines) golden file.
Otherwise looks good.

@codecov-commenter codecov-commenter commented Mar 24, 2023

Codecov Report

Patch coverage: 90.90% and project coverage change: -7.81 ⚠️

Comparison is base (1324231) 74.87% compared to head (92d48ff) 67.07%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #7561      +/-   ##
==========================================
- Coverage   74.87%   67.07%   -7.81%     
==========================================
  Files         353      353              
  Lines       35386    35407      +21     
==========================================
- Hits        26496    23748    -2748     
- Misses       8890    11659    +2769     
Flag                           Coverage Δ
pytest-testIntegration         ?
pytest-testIntegrationBatch1   36.45% <22.72%> (-0.02%) ⬇️
pytest-testQuick               63.56% <90.90%> (+0.01%) ⬆️
pytest-testSlowIntegration     32.95% <22.72%> (-0.01%) ⬇️

Impacted Files                                           Coverage Δ
...-ingestion/src/datahub/ingestion/source/tableau.py    89.82% <88.88%> (-2.60%) ⬇️
...ion/src/datahub/ingestion/source/tableau_common.py    93.00% <100.00%> (+0.21%) ⬆️
...n/src/datahub/ingestion/source/tableau_constant.py    100.00% <100.00%> (ø)

... and 76 files with indirect coverage changes

@mayurinehate mayurinehate left a comment

LGTM

@hsheth2 hsheth2 merged commit a50c712 into datahub-project:master Apr 11, 2023
yoonhyejin pushed a commit that referenced this pull request Apr 19, 2023
Co-authored-by: Mayuri Nehate <33225191+mayurinehate@users.noreply.github.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
Development

Successfully merging this pull request may close these issues.

[Tableau Ingestion] Lineages for CustomSQLTables are not always correct