fix(gms): Write back lineage search results to in-memory cache bound to feature flag #6006

RyanHolstien
Collaborator

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable). If a new feature has been added a Usage Guide has been added for the same.
  • For any breaking change/potential downtime/deprecation/big changes an entry has been made in Updating DataHub

github-actions bot added the "product" label (PR or Issue related to the DataHub UI/UX) on Sep 21, 2022
github-actions bot commented Sep 21, 2022

Unit Test Results (build & test)

562 tests  ±0   562 ✔️ ±0   14m 7s ⏱️ + 1m 8s
139 suites ±0       0 💤 ±0 
139 files   ±0       0 ±0 

Results for commit e0434ab. ± Comparison against base commit b638bcf.

♻️ This comment has been updated with latest results.

Contributor

@shirshanka shirshanka left a comment

Small suggestions to make it easier to debug if there is unexpected staleness.

if (lineageResult == null) {
  maxHops = maxHops != null ? maxHops : 1000;
  lineageResult = _graphService.getLineage(sourceUrn, direction, 0, MAX_RELATIONSHIPS, maxHops);
  if (cacheEnabled) {
    cache.put(Pair.of(sourceUrn, direction), lineageResult);

Could we store the current timestamp in the value as well? For example:

cache.put(Pair.of(sourceUrn, direction), Pair.of(lineageResult, currentTimestamp));

That way we always know when an entry was inserted into the cache, at least for debugging purposes.
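A minimal sketch of that suggestion, under stated assumptions: a plain ConcurrentHashMap stands in for the real cache, String stands in for both the (sourceUrn, direction) key and the EntityLineageResult value, and the CachedValue wrapper is a hypothetical name, not something from this PR.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch only: a plain map stands in for the real cache, and String stands in
// for both the (sourceUrn, direction) key and the EntityLineageResult value.
final class TimestampedCacheSketch {

  // Hypothetical wrapper pairing a cached value with its insert time.
  static final class CachedValue<V> {
    final V value;
    final long insertedAtMillis;

    CachedValue(V value, long insertedAtMillis) {
      this.value = value;
      this.insertedAtMillis = insertedAtMillis;
    }
  }

  private final Map<String, CachedValue<String>> cache = new ConcurrentHashMap<>();

  // The caller passes the clock reading so the behavior is easy to test.
  void put(String key, String lineageResult, long nowMillis) {
    cache.put(key, new CachedValue<>(lineageResult, nowMillis));
  }

  // Returns null on a cache miss, mirroring the null check in the snippet above.
  CachedValue<String> get(String key) {
    return cache.get(key);
  }

  public static void main(String[] args) {
    TimestampedCacheSketch sketch = new TimestampedCacheSketch();
    String key = "urn:li:dataset:example|DOWNSTREAM"; // made-up key for illustration
    sketch.put(key, "lineage-result", System.currentTimeMillis());
    CachedValue<String> hit = sketch.get(key);
    System.out.println(hit.value + " inserted at " + hit.insertedAtMillis);
  }
}
```

Keeping the timestamp inside the value means every read has the insert time at hand without a second lookup, at the cost of changing the type that readers of the cache must expect.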

@@ -71,10 +73,14 @@ public LineageSearchResult searchAcrossLineage(@Nonnull Urn sourceUrn, @Nonnull
       @Nonnull List<String> entities, @Nullable String input, @Nullable Integer maxHops, @Nullable Filter inputFilters,
       @Nullable SortCriterion sortCriterion, int from, int size) {
     // Cache multihop result for faster performance
-    EntityLineageResult lineageResult = cache.get(Pair.of(sourceUrn, direction), EntityLineageResult.class);
+    EntityLineageResult lineageResult = cacheEnabled ? cache.get(Pair.of(sourceUrn, direction), EntityLineageResult.class)

Add a log.warn here if we find an entry in the cache and its insert timestamp is older than some threshold (a hard-coded 1 hour is fine as the default for this PR).
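The staleness warning could look roughly like the sketch below. It assumes the insert timestamp is available as epoch milliseconds, uses java.util.logging in place of the project's own logger so the example is self-contained, and the class and method names are hypothetical. Only the 1-hour threshold comes from the comment above.

```java
import java.time.Duration;
import java.util.logging.Logger;

// Sketch only: java.util.logging stands in for the project's logger.
final class StaleCacheWarningSketch {
  private static final Logger LOG =
      Logger.getLogger(StaleCacheWarningSketch.class.getName());

  // Hard-coded threshold, as suggested for this PR.
  static final Duration MAX_AGE = Duration.ofHours(1);

  // An entry is stale when it is strictly older than MAX_AGE.
  static boolean isStale(long insertedAtMillis, long nowMillis) {
    return nowMillis - insertedAtMillis > MAX_AGE.toMillis();
  }

  // Logs a warning for stale entries and reports whether one was logged.
  static boolean warnIfStale(String key, long insertedAtMillis, long nowMillis) {
    boolean stale = isStale(insertedAtMillis, nowMillis);
    if (stale) {
      long ageMinutes = Duration.ofMillis(nowMillis - insertedAtMillis).toMinutes();
      LOG.warning(String.format(
          "Cache entry for %s is %d minutes old (threshold: %d minutes)",
          key, ageMinutes, MAX_AGE.toMinutes()));
    }
    return stale;
  }

  public static void main(String[] args) {
    long now = System.currentTimeMillis();
    long twoHoursAgo = now - Duration.ofHours(2).toMillis();
    warnIfStale("example-key", twoHoursAgo, now); // logs a warning
    warnIfStale("example-key", now, now);         // stays silent
  }
}
```

The warning does not evict or refresh the entry; it only makes unexpected staleness visible in the logs, which matches the debugging purpose stated in the review.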

@shirshanka shirshanka merged commit 2c65921 into datahub-project:master Sep 22, 2022
@RyanHolstien RyanHolstien deleted the piyushn/fix_lineage_search_caching branch September 22, 2022 22:02