Add external run Tests #28637
Pull request overview
Adds infrastructure to run the Java integration / Playwright / scale test suites against an external, already-running OpenMetadata cluster (where the search engine is not directly reachable on :9200). It introduces an admin-only test-support REST endpoint that proxies a read-only subset of search engine requests, switches the in-test SearchClient to route through that endpoint when external config is present, and adds a new GitHub Actions workflow that obtains a JWT and drives the suites.
Changes:
- New TestSupportSearchResource plus SearchClient.rawSearchRequest implementations on the ES and OS clients for an admin-gated, read-only search passthrough.
- Test harness (OssTestServer, ServerHandle, ExternalServer, SearchClient, IndexAliasInspector, Scale100kEntitiesIT) updated to detect external mode and route search introspection through the passthrough; affected tests are tagged (search-direct) or skipped (LiveIndexRetryIT, ReindexStatsIT#orphanedSchemaDoesNotFailReindex).
- New java-playwright-external.yml workflow runs ui-it, search-it, and matrixed scale-it profiles against the external cluster.
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| openmetadata-service/.../search/SearchClient.java | Adds RawSearchResponse record and default rawSearchRequest API |
| openmetadata-service/.../opensearch/OpenSearchClient.java | Implements rawSearchRequest via OS generic client |
| openmetadata-service/.../elasticsearch/ElasticSearchClient.java | Implements rawSearchRequest via ES low-level client, parses query string |
| openmetadata-service/.../resources/testsupport/TestSupportSearchResource.java | New admin-only read-only passthrough endpoints (/passthrough, /cluster-alias, /exists) |
| openmetadata-integration-tests/.../scenarios/search/reindex/*UIIT.java | Tag direct-search UI ITs with search-direct for exclusion in external runs |
| openmetadata-integration-tests/.../it/util/OssTestServer.java | Delegate to UiTestServer when OM_URL+OM_ADMIN_TOKEN are set |
| openmetadata-integration-tests/.../it/tests/search/Scale100kEntitiesIT.java | Parameterize seed/timeout, resolve table alias dynamically |
| openmetadata-integration-tests/.../it/tests/search/ReindexStatsIT.java | Skip orphan-schema test in external mode |
| openmetadata-integration-tests/.../it/tests/search/LiveIndexRetryIT.java | Skip whole class in external mode |
| openmetadata-integration-tests/.../it/server/ServerHandle.java | Add external flag + isExternal() |
| openmetadata-integration-tests/.../it/server/ExternalServer.java | Mark handles produced from env as external |
| openmetadata-integration-tests/.../it/search/SearchClient.java | Route through /v1/test-support/search when external |
| openmetadata-integration-tests/.../it/search/IndexAliasInspector.java | Fetch cluster alias from server in external mode |
| .github/workflows/java-playwright-external.yml | New workflow to drive UI/search/scale suites against an external cluster |
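For orientation, here is a minimal sketch of how a test-side client could build request URIs for the passthrough endpoints listed in the table. The class name PassthroughUris, the "/api" prefix, and the assumption that /cluster-alias takes no parameters are illustrative and not taken from the PR; the "path" query parameter is the one the exists handler reads.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not part of the PR.
final class PassthroughUris {
  // Resource path from the file table; the "/api" prefix is an assumption.
  private static final String BASE = "/api/v1/test-support/search";

  /** URI for the exists check; the engine path travels URL-encoded in "path". */
  static URI exists(String serverUrl, String enginePath) {
    String encoded = URLEncoder.encode(enginePath, StandardCharsets.UTF_8);
    return URI.create(root(serverUrl) + BASE + "/exists?path=" + encoded);
  }

  /** URI for the cluster-alias lookup (assumed to take no parameters). */
  static URI clusterAlias(String serverUrl) {
    return URI.create(root(serverUrl) + BASE + "/cluster-alias");
  }

  // Drop one trailing slash so the joined URL has no "//".
  private static String root(String serverUrl) {
    return serverUrl.endsWith("/")
        ? serverUrl.substring(0, serverUrl.length() - 1)
        : serverUrl;
  }
}
```

Encoding the whole engine path, including any "?" and "=" it carries, keeps the engine's own query string from being mistaken for parameters of the passthrough endpoint.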
| private static void validateReadOnly(String path) { | ||
| String value = normalize(path); | ||
| boolean allowed = ALLOWED_TOKENS.stream().anyMatch(value::contains); | ||
| if (!allowed) { | ||
| throw new BadRequestException( | ||
| "Only read-only search introspection paths are permitted: " + ALLOWED_TOKENS); | ||
| } | ||
| } |
| int queryStart = endpoint.indexOf('?'); | ||
| String path = queryStart >= 0 ? endpoint.substring(0, queryStart) : endpoint; | ||
| es.co.elastic.clients.transport.rest5_client.low_level.Request request = | ||
| new es.co.elastic.clients.transport.rest5_client.low_level.Request(method, path); | ||
| if (queryStart >= 0) { | ||
| for (String pair : endpoint.substring(queryStart + 1).split("&")) { | ||
| int eq = pair.indexOf('='); | ||
| if (eq > 0) { | ||
| request.addParameter(pair.substring(0, eq), pair.substring(eq + 1)); | ||
| } | ||
| } | ||
| } |
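Two details of the loop above may be worth a second look: the `eq > 0` check silently drops valueless flags such as `v` in `_cat/indices?v`, and values are forwarded without percent-decoding, which could double-encode them if the low-level client encodes parameters itself (an assumption to verify against the client). A minimal sketch of a more tolerant parser, with EndpointQuery as a hypothetical name:

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; not part of the PR.
final class EndpointQuery {
  /** Parses the query string of an endpoint into decoded name/value pairs. */
  static Map<String, String> parse(String endpoint) {
    Map<String, String> params = new LinkedHashMap<>();
    int queryStart = endpoint.indexOf('?');
    if (queryStart < 0) {
      return params;
    }
    for (String pair : endpoint.substring(queryStart + 1).split("&")) {
      if (pair.isEmpty()) {
        continue; // tolerate "a=1&&b=2" and a trailing "&"
      }
      int eq = pair.indexOf('=');
      // A pair without '=' is a flag; keep it with an empty value.
      String name = eq >= 0 ? pair.substring(0, eq) : pair;
      String value = eq >= 0 ? pair.substring(eq + 1) : "";
      if (name.isEmpty()) {
        continue; // "=value" has no usable name
      }
      params.put(decode(name), decode(value));
    }
    return params;
  }

  private static String decode(String s) {
    return URLDecoder.decode(s, StandardCharsets.UTF_8);
  }
}
```

Each entry of the returned map could then be passed to `request.addParameter(name, value)` in place of the raw substrings.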
| private static void validateReadOnly(String path) { | ||
| String value = normalize(path); | ||
| boolean allowed = ALLOWED_TOKENS.stream().anyMatch(value::contains); | ||
| if (!allowed) { | ||
| throw new BadRequestException( | ||
| "Only read-only search introspection paths are permitted: " + ALLOWED_TOKENS); | ||
| } |
⚠️ Security: Allowlist bypass via substring match in validateReadOnly
The validateReadOnly method at line 142 uses value::contains to check if the path contains any allowed token. This is a substring match, meaning a path like /_bulk?ignore=_count or /_delete_by_query?_search=1 would pass validation because it contains _count or _search as a substring, while actually targeting a mutating endpoint.
While admin-only auth is a mitigating factor, this weakens the defense-in-depth intent of the allowlist. The check should verify the path segment rather than performing a simple substring search.
Check allowed tokens only in the path portion (before '?') and require them to appear as path segments (preceded by '/'):
private static void validateReadOnly(String path) {
String value = normalize(path);
// Split on '?' to isolate the path portion from query params
String pathOnly = value.contains("?") ? value.substring(0, value.indexOf('?')) : value;
boolean allowed =
ALLOWED_TOKENS.stream()
.anyMatch(token -> pathOnly.contains("/" + token) || pathOnly.endsWith(token));
if (!allowed) {
throw new BadRequestException(
"Only read-only search introspection paths are permitted: " + ALLOWED_TOKENS);
}
}
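Note that `pathOnly.contains("/" + token)` is still a substring test, so a path such as `/idx/_searchable` or `/_bulk/_search` would pass. A stricter variant compares whole path segments. The sketch below is illustrative only: ReadOnlyPathCheck is a hypothetical name, and it assumes that API segments start with an underscore while index names do not.

```java
import java.util.List;

// Hypothetical helper; not part of the PR.
final class ReadOnlyPathCheck {
  private static final List<String> ALLOWED_TOKENS =
      List.of("_count", "_search", "_alias", "_cat/indices");

  /** True only if every "_"-prefixed segment belongs to an allowed token. */
  static boolean isReadOnly(String rawPath) {
    if (rawPath == null || rawPath.isBlank()) {
      return false;
    }
    // Ignore the query string: tokens there must never grant access.
    int queryStart = rawPath.indexOf('?');
    String pathOnly = queryStart >= 0 ? rawPath.substring(0, queryStart) : rawPath;
    String[] segments = pathOnly.split("/");
    boolean matched = false;
    for (int i = 0; i < segments.length; i++) {
      String segment = segments[i];
      if (segment.equals("..") || segment.equals(".")) {
        return false; // no path traversal
      }
      if (!segment.startsWith("_")) {
        continue; // index name, alias name, or pattern
      }
      if (ALLOWED_TOKENS.contains(segment)) {
        matched = true;
        continue;
      }
      // Two-segment tokens such as "_cat/indices".
      String pair = i + 1 < segments.length ? segment + "/" + segments[i + 1] : segment;
      if (ALLOWED_TOKENS.contains(pair)) {
        matched = true;
        i++; // the next segment was consumed by the pair
        continue;
      }
      return false; // any other API segment, e.g. _bulk, is rejected
    }
    return matched;
  }
}
```

This covers only the path. If the passthrough accepts HTTP methods other than GET, the method would need its own check, since a DELETE on an `_alias` path mutates state.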
| @Path("/v1/test-support/search") | ||
| @Collection(name = "testSupportSearch") |
⚠️ Security: Test-support endpoint ships in production with no disable gate
The TestSupportSearchResource is annotated with @Collection, which means it is auto-registered in every deployment including production. Although it requires admin auth, shipping a raw search-engine passthrough in production increases the attack surface — a compromised admin token could exfiltrate arbitrary search engine data or (combined with the allowlist bypass) perform mutations.
Consider gating registration behind a configuration flag (e.g., testSupportEnabled: true in the YAML config) or a build profile so it's excluded from production images.
Gate the resource behind an opt-in environment variable so it's not available in production by default:
// Option: Check a config flag in the constructor and short-circuit all methods,
// or conditionally register the resource in the Collection scanner.
// Minimal approach: add an env-var / config guard in the constructor:
public TestSupportSearchResource(Authorizer authorizer) {
this.authorizer = authorizer;
if (!Boolean.parseBoolean(System.getenv("OM_TEST_SUPPORT_ENABLED"))) {
throw new IllegalStateException(
"TestSupportSearchResource is disabled; set OM_TEST_SUPPORT_ENABLED=true to enable");
}
}
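Throwing from the constructor may abort resource registration or even server startup, depending on how the collection scanner handles failures (not visible in this excerpt). An alternative is to evaluate the flag once and have each handler fail closed. A minimal sketch, assuming the same OM_TEST_SUPPORT_ENABLED variable; TestSupportGate is a hypothetical name:

```java
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical helper; not part of the PR.
final class TestSupportGate {
  // Variable name taken from the suggestion above.
  static final String FLAG = "OM_TEST_SUPPORT_ENABLED";

  private final boolean enabled;

  TestSupportGate(Map<String, String> env) {
    // parseBoolean(null) is false, so an unset variable keeps the gate closed.
    this.enabled = Boolean.parseBoolean(env.get(FLAG));
  }

  boolean isEnabled() {
    return enabled;
  }

  /** Runs the handler only when the gate is open; otherwise fails closed. */
  <T> T guard(Supplier<T> handler) {
    if (!enabled) {
      throw new IllegalStateException(
          "Test-support endpoints are disabled; set " + FLAG + "=true to enable");
    }
    return handler.get();
  }
}
```

A resource could build the gate from `System.getenv()` and wrap each endpoint body in `guard(...)`. In a real resource, a 404 response is probably a better fit than IllegalStateException, so a disabled endpoint looks the same as an absent one.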
🟡 Playwright Results: all passed (13 flaky)
✅ 4259 passed · ❌ 0 failed · 🟡 13 flaky · ⏭️ 88 skipped
🟡 13 flaky test(s) (passed on retry)
How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip # view trace
| public Response exists(@Context SecurityContext securityContext, @QueryParam("path") String path) | ||
| throws IOException { | ||
| authorizer.authorizeAdmin(securityContext); | ||
| validateReadOnly(path); | ||
| RawSearchResponse response = | ||
| Entity.getSearchRepository() | ||
| .getSearchClient() | ||
| .rawSearchRequest("GET", normalize(path), null); | ||
| boolean exists = response.statusCode() >= 200 && response.statusCode() < 300; | ||
| return Response.ok(Map.of("exists", exists)).build(); | ||
| } |
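As written, the handler reports any non-2xx status as "exists": false, so an engine-side 5xx or 403 is indistinguishable from a missing index. A stricter mapping could look like the sketch below. It assumes rawSearchRequest returns error statuses rather than throwing, which this excerpt does not confirm, and ExistsMapping is a hypothetical name.

```java
// Hypothetical helper; not part of the PR.
final class ExistsMapping {
  /** Maps an engine status code to an existence answer, surfacing real errors. */
  static boolean exists(int statusCode) {
    if (statusCode >= 200 && statusCode < 300) {
      return true; // the engine found the resource
    }
    if (statusCode == 404) {
      return false; // a genuine "not found"
    }
    // Anything else (401, 403, 5xx, ...) is an error, not an answer.
    throw new IllegalStateException("Unexpected search engine status: " + statusCode);
  }
}
```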
| String body; | ||
| try (var is = response.getEntity().getContent()) { | ||
| body = new String(is.readAllBytes(), java.nio.charset.StandardCharsets.UTF_8); | ||
| } |
| Assumptions.assumeTrue( | ||
| System.getenv("OM_URL") == null, | ||
| "LiveIndexRetryIT pauses the ES container (EsOutageInjector) and reads the retry queue via " | ||
| + "the in-JVM DAO — both require the embedded stack, so it is skipped in external mode"); |
| Assumptions.assumeTrue( | ||
| System.getenv("OM_URL") == null, | ||
| "Creating an orphan (schema row hard-deleted without cascading to its table) needs the " | ||
| + "in-JVM DAO; the REST API would reject or cascade, so this case is embedded-only"); |
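The file summary says OssTestServer switches to external mode when both OM_URL and OM_ADMIN_TOKEN are set, while the two guards above check OM_URL alone. A shared predicate would keep the two in step. The sketch below is illustrative; ExternalMode is a hypothetical name, and treating blank values as unset is an assumption.

```java
import java.util.Map;

// Hypothetical helper; not part of the PR.
final class ExternalMode {
  /** True when both variables that select external mode are present and non-blank. */
  static boolean isExternal(Map<String, String> env) {
    return isSet(env.get("OM_URL")) && isSet(env.get("OM_ADMIN_TOKEN"));
  }

  private static boolean isSet(String value) {
    return value != null && !value.isBlank();
  }
}
```

A test guard could then read `Assumptions.assumeFalse(ExternalMode.isExternal(System.getenv()), "...")`, so the skip condition matches the condition under which the harness actually goes external.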
| private static final List<String> ALLOWED_TOKENS = | ||
| List.of("_count", "_search", "_alias", "_cat/indices"); |
| private static void validateReadOnly(String path) { | ||
| String value = normalize(path); | ||
| boolean allowed = ALLOWED_TOKENS.stream().anyMatch(value::contains); | ||
| if (!allowed) { | ||
| throw new BadRequestException( | ||
| "Only read-only search introspection paths are permitted: " + ALLOWED_TOKENS); | ||
| } |
| String body; | ||
| try (var is = response.getEntity().getContent()) { | ||
| body = new String(is.readAllBytes(), java.nio.charset.StandardCharsets.UTF_8); | ||
| } | ||
| return new RawSearchResponse(response.getStatusCode(), body); | ||
| } |
| private String login() { | ||
| final String body = "{\"email\":\"" + email + "\",\"password\":\"" + passwordB64 + "\"}"; | ||
| try { |
| @Path("/v1/test-support/search") | ||
| @Collection(name = "testSupportSearch") | ||
| @Tag(name = "TestSupport", description = "Admin-only read-only search passthrough for tests") | ||
| @Produces(MediaType.APPLICATION_JSON) | ||
| @Slf4j | ||
| public class TestSupportSearchResource { |
…st docs Keep only public PR/issue numbers and generic symptoms in the new search-indexing regression tests. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Regression guards for reported search-indexing issues, covering both the live indexing path (entity create/update -> SearchRepository) and the reindex path (SearchResource -> SearchIndexingApplication bulk sink):
- GlossaryRenameSearchUIIT / ClassificationRenameSearchUIIT: prefix-rename must not double-apply to the search doc (live + reindex)
- LongCompoundNameSearchUIIT: long compound names stay searchable
- PipelineOwnerIndexUIIT: owned entity carries its owner in the index
- TestCaseResultReindexUIIT / TestCaseTierRebuildUIIT: status/tier survive a recreate reindex
- OwnersNestedMappingIT: every index maps owners as nested (RBAC)
- CustomPropertyAggregationIT: custom-property aggregations don't fail shards
- UnsafeReindexConfigIT: server survives an aggressive bulk-sink config
- ReindexOutageRecoveryIT: reindex path reconciles after an engine outage
Docs reference only public PR/issue numbers and generic symptoms.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
| SdkClients.overrideAdminToken(token); | ||
| AuthSession.update( | ||
| new TokenSet(token, null, null, Instant.now().plus(Duration.ofDays(3650)))); | ||
| LOG.info("Refreshed external admin token (re-login)"); |
Regression guards for reported search-indexing issues, covering both the live indexing path (entity create/update -> SearchRepository) and the reindex path (SearchResource -> SearchIndexingApplication bulk sink):
- GlossaryRenameSearchUIIT / ClassificationRenameSearchUIIT: prefix-rename must not double-apply to the search doc (live + reindex)
- LongCompoundNameSearchUIIT: long compound names stay searchable
- PipelineOwnerIndexUIIT: owned entity carries its owner in the index
- TestCaseResultReindexUIIT / TestCaseTierRebuildUIIT: status/tier survive a recreate reindex
- OwnersNestedMappingIT: every index maps owners as nested (RBAC)
- CustomPropertyAggregationIT: custom-property aggregations don't fail shards
- ExtensionHighlightSearchIT: highlight on a flattened extension.* field does not fail the shard (no 'has no associated analyzer' 500)
- UnsafeReindexConfigIT: server survives an aggressive bulk-sink config
- ReindexOutageRecoveryIT: reindex path reconciles after an engine outage
Docs reference only public PR/issue numbers and generic symptoms.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
| /** Status code + raw JSON body of a low-level passthrough request to the search engine. */ | ||
| record RawSearchResponse(int statusCode, String body) {} |
| } catch (final ExecutionException | java.util.concurrent.TimeoutException e) { | ||
| failed++; | ||
| if (firstFailure == null) { | ||
| firstFailure = (e instanceof ExecutionException) ? e.getCause() : e; | ||
| } | ||
| } |



Search-indexing regression tests: issue coverage
This branch adds behaviour/regression tests for reported search-indexing issues. Tests target
both indexing paths where relevant: the live path (entity create/update →
SearchRepository) and the reindex path (SearchResource → SearchIndexingApplication bulk sink).
Legend: ✅ new test added (this branch) · 🟡 covered by existing/folded into existing test ·
⛔ not added (reason given) · Fixed? = whether the underlying product issue is resolved.
A. Query-time mapping / field-type 500s
- extension.* highlight → search 500 (no analyzer): ExtensionHighlightSearchIT (verified on embedded OpenSearch; highlight on a flattened extension.* field no longer fails the shard)
- extension.* / non-aggregatable field → 500: CustomPropertyAggregationIT
- value_count agg on string custom property → shard failure: CustomPropertyAggregationIT
- search_after breaks on comma-containing names
- owners mapped object not nested (13 mapping files, incl. locales): OwnersNestedMappingIT (sweeps every index)
- owners not nested (e2e upgrade): OwnersNestedMappingIT
- too_many_nested_clauses on long compound names: LongCompoundNameSearchUIIT
- extension → fileExtension aggregation 500 after upgrade
- dataAsset ReindexAliasSwapIT, NoDuplicatesDuringReindexIT)
B. Indexing-time parse failures (docs dropped at sink)
- CustomPropertyAggregationIT + immense-term IT
- SearchIndexImmenseTermIT
- SearchIndexImmenseTermIT
- SearchIndexImmenseTermIT / SearchIndexFieldLimitIT
C. Reindex job reliability (stuck / stale / lock / restart)
- redeploy_pipelines 409 (stale hybrid runner)
- ReindexStopUnderLoadIT (single-pod); multi-pod ⛔
- ReindexOutageRecoveryIT (reindex path) + LiveIndexRetryIT (live path)
- ReindexStopUnderLoadIT
- TestCaseResultReindexUIIT
D. Reindex resource exhaustion (OOM / disk / payload)
- UnsafeReindexConfigIT (server-survives guard)
- UnsafeReindexConfigIT (partial)
- ReindexOutageRecoveryIT (reindex-path fault recovery)
- ReindexOutageRecoveryIT
- SearchIndexImmenseTermIT
E. Stale / missing data after reindex (correctness)
- testCaseResult omitted): TestCaseResultReindexUIIT (reindex path)
- TestCaseTierRebuildUIIT (reindex path)
- GlossaryRenameSearchUIIT + ClassificationRenameSearchUIIT (live + reindex paths)
- isOwner denies: PipelineOwnerIndexUIIT
- NoDuplicatesDuringReindexIT / ReindexAliasSwapIT
- SimpleReindexTriggerUIIT
F. Performance / slowness
- Scale100kEntitiesIT: records reindex_ms over 100k/500k cohorts; no hard time threshold (env-dependent → flaky)
- Scale100kEntitiesIT (throughput/latency metrics; #27865 added fast-write settings)
- Scale100kEntitiesIT asserts db_count == es_count after recreate, which catches silent under-indexing
G. Pre-release / CI / internal
- table_search_index: ReindexAliasSwapIT
New tests in this branch (11):
GlossaryRenameSearchUIIT, ClassificationRenameSearchUIIT, LongCompoundNameSearchUIIT, PipelineOwnerIndexUIIT, TestCaseResultReindexUIIT, TestCaseTierRebuildUIIT (UIIT); OwnersNestedMappingIT, CustomPropertyAggregationIT, ExtensionHighlightSearchIT, UnsafeReindexConfigIT, ReindexOutageRecoveryIT (IT). The 5 ITs are validated green on the embedded stack; the 6 UIITs are compile-verified.
Describe your changes:
Fixes #
I worked on ... because ...
Type of change:
High-level design:
N/A — small change.
Tests:
Use cases covered
Unit tests
Backend integration tests
Ingestion integration tests
Playwright (UI) tests
Manual testing performed
UI screen recording / screenshots:
Not applicable.
Checklist:
Fixes <issue-number>: <short explanation>
Fixes #<issue-number> above.
Summary by Gitar
- TestSupportSearchResource as a restricted admin-only passthrough for remote search engine introspection.
- ExternalTokenRefresher to maintain valid session tokens during long-running integration suites against remote clusters.
- SearchClient to route requests through the API passthrough when running in external mode.
- rawSearchRequest support to ElasticSearchClient and OpenSearchClient to facilitate low-level index inspection.
- EntityLoader concurrency via jpw.loader.maxWorkers to prevent 504 timeouts on shared external infrastructure.
- NamespaceCleanup for recursive hard-deletion of root entities to maintain state hygiene.
- ExtensionHighlightSearchIT to guard against index-shard failures during highlighting on flattened fields.
- ReindexOutageRecoveryIT, UnsafeReindexConfigIT, CustomPropertyAggregationIT, and OwnersNestedMappingIT to expand test coverage.
- SimpleReindexTriggerUIIT to use exact index count assertions.