test(InsightsTest): match retrySearchUntil threshold to asserted asset count #2500
Merged
…t count

InsightsTest.searchAssets creates 4 assets (1 AtlanCollection + 2 Folder + 1 AtlanQuery) and asserts both that there are 3 distinct typeName aggregation buckets and that entities.size() == 4 / approximateCount == 4L.

It called retrySearchUntil(index, 3L), which retries until hits >= 3. When the 4th asset (commonly the last-created AtlanQuery) hasn't been indexed yet, retry stops at 3 hits but only 2 distinct types exist, and the bucket-count assertion intermittently fails as "expected [3] but found [2]" on the new daily Test (leangraph-test) workflow's matrix run.

Match the threshold to the asset count actually under assertion. This is a latent test bug, independent of any ES refresh-semantics changes on the server side.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
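The race described in the commit message can be reproduced in isolation. The following is a minimal sketch, not the SDK's `retrySearchUntil`: the class `ThresholdRaceSketch`, the method names, and the "one more document visible per poll" model of the index are all hypothetical, chosen only to show why a threshold of 3 can return a snapshot with 2 distinct type names.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;

// Hypothetical stand-in for a search index that exposes documents gradually.
class ThresholdRaceSketch {

    // Type names of the 4 assets in creation order (1 collection, 2 folders, 1 query).
    static final List<String> CREATED =
            List.of("AtlanCollection", "Folder", "Folder", "AtlanQuery");

    // Assumed consistency model: poll number n sees the first n documents.
    static List<String> visibleAfterPolls(int polls) {
        return new ArrayList<>(CREATED.subList(0, Math.min(polls, CREATED.size())));
    }

    // Polls until at least `threshold` hits are visible, then returns that snapshot.
    static List<String> retryUntil(long threshold, int maxPolls) {
        for (int poll = 1; poll <= maxPolls; poll++) {
            List<String> hits = visibleAfterPolls(poll);
            if (hits.size() >= threshold) {
                return hits;
            }
        }
        throw new IllegalStateException("threshold not reached within " + maxPolls + " polls");
    }

    // Number of distinct type names, i.e. the bucket count a terms aggregation would report.
    static int bucketCount(List<String> hits) {
        return new HashSet<>(hits).size();
    }

    public static void main(String[] args) {
        // Threshold 3 stops before the AtlanQuery is visible: 2 buckets.
        System.out.println(bucketCount(retryUntil(3L, 10)));
        // Threshold 4 waits for every created asset: 3 buckets.
        System.out.println(bucketCount(retryUntil(4L, 10)));
    }
}
```

With the threshold equal to the number of created assets, the snapshot handed to the assertions cannot be missing a type, whichever asset happens to be indexed last.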
41ca63c to bf51b71
cmgrote approved these changes on May 18, 2026
Summary
Two surgical fixes to make the daily `Test (leangraph-test)` workflow deterministic. Both target eventual-consistency races against the ES index that surface only under the workflow's parallel matrix on a shared tenant.

- `InsightsTest.searchAssets`: `expected [3] but found [2]` on the typeName aggregation bucket count. Fix: `retrySearchUntil(index, 3L)` → `4L`, so the retry waits for all 4 assets to be indexed before evaluating the aggregation.
- `SuggestionsTest.findSuggestions*`: `expected [1] but found [0]` on the `ownerGroups` / `descriptions` / `assignedTerms` suggestion lists. Fix: extend `awaitConsistency()` to retry-search for the peer columns until their non-tag metadata is also visible in ES, not just tags.

Together these address MS-1269 and MS-1270.
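Both fixes rely on the same idea: keep re-running a search until a readiness condition holds, with a bounded number of attempts. A minimal sketch of that pattern follows, assuming a doubling delay between attempts. It is not the SDK's `retrySearchUntil`; the class `BackoffRetrySketch`, its method signatures, and the simulated visibility schedule are hypothetical and exist only for illustration.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical helper: bounded retry with exponential backoff around a search call.
class BackoffRetrySketch {

    // Repeats `search` until `ready` accepts the result, doubling the delay after each miss.
    static <T> T retryUntil(Supplier<T> search, Predicate<T> ready,
                            int maxAttempts, long initialDelayMs) {
        long delay = initialDelayMs;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            T result = search.get();
            if (ready.test(result)) {
                return result;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delay);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    throw new IllegalStateException("interrupted while waiting", e);
                }
                delay *= 2;
            }
        }
        throw new IllegalStateException("condition not met after " + maxAttempts + " attempts");
    }

    // Simulated index: peer columns with full metadata become visible one poll at a time.
    static List<String> visiblePeers(int poll) {
        if (poll >= 3) return List.of("t1c1", "v1c1");
        if (poll == 2) return List.of("t1c1");
        return List.of();
    }

    public static void main(String[] args) {
        AtomicInteger polls = new AtomicInteger();
        List<String> peers = BackoffRetrySketch.<List<String>>retryUntil(
                () -> visiblePeers(polls.incrementAndGet()),
                hits -> hits.size() >= 2, // wait for both metadata-bearing peers
                5, 1L);
        System.out.println(peers + " after " + polls.get() + " polls");
    }
}
```

Expressing readiness as a predicate keeps the wait tied to what the test later asserts, whether that is a hit count or the presence of specific fields, instead of to an unrelated signal such as tag sync.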
Root-cause walkthrough
InsightsTest.searchAssets
The test creates 4 assets and asserts that the `typeName` term-aggregation has 3 buckets (`AtlanCollection`, `Folder`, `AtlanQuery`). It also asserts `entities.size() == 4` and `approximateCount == 4L` further down.

But it called `retrySearchUntil(index, 3L)`, which retries until hits ≥ 3. When the 4th asset (typically the most-recently-created `AtlanQuery`) hasn't been indexed yet, the retry stops at 3 hits but only 2 distinct typeName values are present, so the bucket-count assertion intermittently fails. Aligning the retry threshold to the asset count closes the race.

SuggestionsTest.awaitConsistency
`awaitConsistency()` currently calls `waitForTagsToSync(taggedAssetGuids, log)`, which covers `__classificationNames` and `__traitNames` but not `ownerGroups`, `description`, `userDescription`, or `__meanings`. The Suggestions API aggregates these additional fields from peer columns. Under parallel matrix load on a shared tenant, the ES outbox processor (30-second idle poll) hasn't drained the peer's non-tag updates by the time `findSuggestionsDefault` runs, so all aggregations come back empty.

Diagnostic evidence:

- `SuggestionsTest` passes 24/24 in 1m 39s against `leangraph-test`
- `[ms-1268-trace]` server logs (separate diagnostic branch) confirm the peer column's `__meanings` and `ownerGroups` are correctly persisted in Cassandra and emitted into the ES bulk update body; they just hadn't arrived in ES yet when the test queried

The extension issues an explicit search that filters Columns named `COLUMN_NAME1` with all four metadata fields existing, retrying until both metadata-bearing peers (`t1c1` and `v1c1`) are visible. `retrySearchUntil` already encapsulates exponential backoff and bounded retries.

What this is not fixing
- … `switchable-graph-provider` as #6721
- … `certificateStatus`): that was a cascade of MS-1267; resolved once MS-1267 was deployed

After this PR plus the existing `PurposeTest` / `asset-import` token-permission gaps that are environment-specific, the `Test (leangraph-test)` workflow should hold steady.

Test plan
- `Test (leangraph-test)` workflow: confirm `Integration (InsightsTest)` and `Integration (SuggestionsTest)` are green for 3 consecutive runs
- `Test` workflow still passes both jobs (no regression on the other tenant)

🤖 Generated with Claude Code