feat(systemMetadata): Adding a lastRunId field system metadata #8672
Merged
iprentic
merged 12 commits into
datahub-project:master
from
jjoyce0510:jj--add-last-run-id-to-system-metadata--oss
Sep 6, 2023
Adding tests
github-actions
bot
added
product
PR or Issue related to the DataHub UI/UX
devops
PR or Issue related to DataHub backend & deployment
labels
Aug 18, 2023
chriscollins3456
approved these changes
Aug 21, 2023
neato
hsheth2
added a commit
to hsheth2/datahub
that referenced
this pull request
Aug 21, 2023
5 tasks
hsheth2
added
release-0.10.6
merge-pending-ci
A PR that has passed review and should be merged once CI is green.
labels
Aug 22, 2023
spadhi7
added a commit
to spadhi7/datahub
that referenced
this pull request
Oct 4, 2023
* tag 'v0.11.0': (188 commits) fix(spark-test): upgrade gradle and fix spark smoke test (datahub-project#8777) fix(gms): Fixed Recently Viewed section for users with '@' in the URN. (datahub-project#8754) feat: add feedback widget (datahub-project#8732) fix(custom-search): fix custom search to be able to use unquoted query (datahub-project#8805) docs(db-retention): update with default setting (datahub-project#8797) feat(openapi): entity endpoints & analytics raw (datahub-project#8537) feat(search): Also de-duplicate the field queries based on field names (datahub-project#8788) fix(ingest): drop `wrap_aspect_as_workunit` method (datahub-project#8766) feat(ingest): drop sql_metadata parser (datahub-project#8765) docs: minor fix on versioning navbar and dropdown (datahub-project#8790) chore(ingest): upgrade sqlglot fork (datahub-project#8775) docs: add datahub source to integrations page (datahub-project#8787) fix(ingest/bigquery): fix partition and median queries for profiling (datahub-project#8778) fix(ingest/tableau): fix tableau native CLL for snowflake, add type annotations (datahub-project#8779) refactor(ingest): Add support for group-owners in dataflow entities (datahub-project#8154) feat(systemMetadata): Adding a lastRunId field system metadata (datahub-project#8672) feat(airflow-plugin): add package type information (datahub-project#8795) fix(ingest/datahub): Support postgres; build(postgres): Modernize postgres docker setup (datahub-project#8762) docs(session): add documentation for session token duration and fix default (datahub-project#8791) chore(analytics): bump version (datahub-project#8786) ...
Summary
Currently, the system metadata object captures only the runId of the run in which the aspect was first observed. This is problematic for resolving the ingestion source responsible for ingesting a specific aspect: we use the runId to trace back to an ingestion source, but we pick that runId from the record with the max lastObserved timestamp, so the runId we get can belong to a much older run than the one that last observed the aspect.
If an entity does not change frequently, it's possible that we do not have the ingestion executionRequest object anymore for the runId in the system metadata.
To address this problem, we now also save a lastRunId field in system metadata. It always tracks the most recent run ID that touched a given aspect, even if the aspect did not change. This is a much more reliable way to find the ingestion source that was used to ingest a URN.
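
The difference between the two fields can be illustrated with a minimal Python sketch. This is not the code from this PR: `SystemMetadata`, `StoredAspect`, and `upsert_aspect` are hypothetical names, and the sketch assumes that a changed aspect gets fresh system metadata while an unchanged aspect keeps its original runId and only refreshes lastRunId and lastObserved.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class SystemMetadata:
    # Hypothetical stand-in for the real system metadata record.
    run_id: str         # run that first wrote the current aspect value
    last_run_id: str    # most recent run that touched the aspect
    last_observed: int  # timestamp (ms) of the most recent observation


@dataclass(frozen=True)
class StoredAspect:
    value: dict
    system_metadata: SystemMetadata


def upsert_aspect(
    existing: Optional[StoredAspect],
    value: dict,
    run_id: str,
    observed_at: int,
) -> StoredAspect:
    """Assumed semantics: an unchanged aspect keeps run_id but refreshes last_run_id."""
    if existing is not None and existing.value == value:
        # No change in the aspect itself: only record who saw it last, and when.
        refreshed = replace(
            existing.system_metadata,
            last_run_id=run_id,
            last_observed=observed_at,
        )
        return replace(existing, system_metadata=refreshed)
    # New or changed aspect: both run IDs point at the current run.
    return StoredAspect(
        value=value,
        system_metadata=SystemMetadata(run_id, run_id, observed_at),
    )


first = upsert_aspect(None, {"description": "orders table"}, "run-1", 1000)
second = upsert_aspect(first, {"description": "orders table"}, "run-2", 2000)
```

In this sketch, `second.system_metadata.run_id` is still `"run-1"` while `second.system_metadata.last_run_id` is `"run-2"`, so a lookup based on lastRunId lands on the most recent run even though the aspect never changed.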
Status
Ready for review
Checklist