Containers: batch container data-model column tag retrieval to avoid subtree fan-out by harshach · Pull Request #27836 · open-metadata/OpenMetadata

harshach · 2026-04-30T05:05:49Z

Summary

Container retrieval was running a depth-unbounded LIKE 'hashPrefix.%' against tag_usage to fetch tags for a container's data-model columns. Because Containers can contain other
Containers indefinitely (S3/ADLS/GCS folder hierarchies), this prefix match returned the entire descendant subtree of every nested container, then filtered in memory by exact hash match.
For a instance with ~500K assets under one bucket, a single top-level container GET matched ~595K rows out of 7.4M.

This PR replaces the prefix LIKE with a batched exact-match IN over the columns' hashes — Container-only, since it's the only entity type with unbounded child nesting.

Background

Companion to #27745 / #27824, which restored the partial-index match by adding back the WHERE state = 1 predicate. Those PRs unblock the seq scan; this PR removes the unnecessary fan-out
so the indexed seek doesn't return 595K rows we don't need.

Why Container only

Surveyed all 8 callers of Entity.populateEntityFieldTags (Table, Container, File, Topic, DashboardDataModel, APIEndpoint, Worksheet, TestCase). Only Container's children can themselves
nest — the rest have schema-bounded child sets where the prefix LIKE is fine. No shared-helper change, no blast radius on the other 7 entity types.

Changes

CollectionDAO.TagUsageDAO

Added getTagsInternalByTargetHashes — same SQL as getTagsInternalByPrefix (with the tag / glossary_term_entity JSON join) but WHERE targetFQNHash IN () instead of LIKE.
Added default Map<String, List> getTagsByTargetFQNHashes(List) — chunks input at 1000 hashes per query, returns a hash → tags map.

ContainerRepository

Replaced both populateEntityFieldTags call sites (single-entity GET in setFields, PII path in getSampleData) with a private populateDataModelColumnTags(setTags, columns) that flattens
columns (including struct children), hashes their FQNs, and calls the new batched DAO.
Removed the now-unused populateEntityFieldTags import and the unused fqnPrefix parameter.

Untouched:

Entity.populateEntityFieldTags (still used by Table, Topic, Dashboard, APIEndpoint, Worksheet, File, TestCase — fine for them).
bulkPopulateEntityFieldTags (LIST path) — already uses chunked IN-list batching via getTagsInternalBatch.

Behavior

Same observable semantics (derived-tag wrapping, setTags=false skip, struct-child flattening). Difference is purely in how the data is fetched:

┌───────────────────────────────────────┬───────────────────────────────────────────────┬─────────────────────────────────────────────────┐
│ │ Before │ After │
├───────────────────────────────────────┼───────────────────────────────────────────────┼─────────────────────────────────────────────────┤
│ Query │ WHERE targetfqnhash_lower LIKE 'hash.%' │ WHERE targetFQNHash IN (...) │
├───────────────────────────────────────┼───────────────────────────────────────────────┼─────────────────────────────────────────────────┤
│ Match scope │ Entire descendant subtree │ Exact column FQNs only │
├───────────────────────────────────────┼───────────────────────────────────────────────┼─────────────────────────────────────────────────┤
│ Rows fetched on a top-level container │ All tags below it (595K in the reported case) │ Tag count of the data model's flattened columns │
└───────────────────────────────────────┴───────────────────────────────────────────────┴─────────────────────────────────────────────────┘

Tests

Added to ContainerResourceIT:

get_parentDataModelTags_doesNotLeakChildContainerTags — creates a parent container with tagged columns plus a tagged child container nested under it, GETs the parent, asserts (a)
parent's column tags are correct and (b) the child container's tag does not leak into any of the parent's columns. Regression test for the over-fetch.
get_dataModelStructColumnTags_areReturned — tags a column nested inside a STRUCT, verifies the batched fetch returns it through the flattened-children traversal.

Verification

✅ mvn -pl openmetadata-service install -DskipTests — clean
✅ Both new IT tests pass individually
✅ All 30 ContainerResourceIT#get_* tests pass (existing + new)
✅ test_sampleData* + patch_dataModelColumnTags* pass (covers the PII call site)

Summary by Gitar

SDK Refactoring:
- Updated ContainerService to use the base class method buildPathWithEncodedName instead of a manual URLEncoder implementation.
- Promoted buildPathWithEncodedName in EntityServiceBase to protected to enable access by subclasses.

_{This will update automatically on new commits.}

Copilot

Pull request overview

This PR improves how Container dataModel column tags are fetched to address performance problems and incorrect tag leakage in deeply nested container hierarchies. It replaces prefix-based tag retrieval with an exact-match (by target FQN hash) batched lookup, and adds integration tests to validate correct tag isolation and nested-struct tag handling.

Changes:

Update ContainerRepository to fetch dataModel column tags via flattened columns + exact target FQN hash lookups.
Add TagUsageDAO query/helper to retrieve tag usages by a list of target hashes with chunking.
Add integration tests to ensure parent containers don’t receive child container column tags and that nested struct child column tags are returned.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File	Description
openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/ContainerRepository.java	Switch column-tag population from prefix scanning to flattened-column exact hash batching.
openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/CollectionDAO.java	Add TagUsageDAO query + chunked helper for fetching tags by target FQN hashes.
openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/ContainerResourceIT.java	Add regression tests for tag leakage across container hierarchy and nested-struct column tag retrieval.

populateDataModelColumnTags previously called addDerivedTagsGracefully once per flattened column, which internally batches across that column's own tags but issues a separate derived-tag DB lookup for every column. On data models with many columns (or struct types with deep nesting) this becomes an N+1 pattern. Refactor: - Pre-compute Map<String, Column> hashToColumn once (LinkedHashMap to preserve column order) so we no longer hash each FQN twice — once for the target-hash list and again on lookup. - After fetching tags by target hash, flatten all returned TagLabels into a single list and call TagLabelUtil.batchFetchDerivedTags(...) once for the whole data model. - Per column, use addDerivedTagsWithPreFetched(columnTags, derivedMap) to avoid further DB lookups. - Fall back to the per-column addDerivedTagsGracefully path if the batch derived-tag fetch raises, preserving existing semantics. Net effect: total derived-tag DB queries drop from O(N) to 1 regardless of column count or nesting depth. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

fetchAndSetDataModelColumnTags is the bulk fieldFetcher used by setFieldsInBulk on every page of containers (search index rebuilds, list endpoints, PII masking, etc). It batched container-level tags in a single IN query, but then called addDerivedTagsGracefully once per container to enrich with derived tags. With pages of 50 and a service holding 100k containers, that's up to 100k extra derived-tag DB round-trips on container-level tags alone — separate from the per- column N+1 already addressed in populateDataModelColumnTags. Refactor: - Flatten all returned container TagLabels and call TagLabelUtil.batchFetchDerivedTags(allContainerTags) once per bulk batch. - Per container, use addDerivedTagsWithPreFetched(containerTags, derivedTagsMap) so no further DB lookup happens. - Fall back to per-container addDerivedTagsGracefully if the batch fetch raises, preserving existing semantics. Net effect: container-level derived-tag DB queries drop from O(N) per page to 1, regardless of page size. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated no new comments.

ContainerRepository.listChildren paged through the relationship table and then called Entity.getEntity once per child reference to hydrate the entity. With the UI's default page size of 10 that's already 10 sub-queries per page; the API allows up to 1,000,000 per page, so the loop scales linearly with page size. Replace the per-ref loop with a single dao.findEntitiesByIds(ids, Include.ALL) call, then look each ref up in the resulting map to preserve the relationship-offset ordering returned by findToWithOffset. findEntitiesByIds chunks at 30k IDs internally, so callers requesting the maximum page are still safe. Refs whose entity was deleted between the relationship lookup and the bulk fetch are skipped rather than failing the whole page (mirrors the per-ref Entity.getEntity behaviour where a missing entity would have thrown). Net effect on a default-paginated UI request: ~11 queries → 2 (one findToWithOffset, one bulk findEntitiesByIds). The countFindTo total query is unchanged. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Removing `children` from the Container `fields=` allow-list (the slim payload change for Tahoe-shaped buckets) broke two integration tests that fetched the parent with `fields=["*"]` and walked `bucket.children.root`. With the new contract that field is `None`, so those assertions raised AttributeError. Migrate both tests to the dedicated `/v1/containers/name/{fqn}/children` endpoint via a new `OpenMetadata.list_container_children(fqn, limit, offset)` helper. The helper is the public migration path SDK consumers should follow for the same reason; documenting it in code is cheaper than a release note. Affected: - tests/integration/auto_classification/containers/test_container_classification.py - tests/integration/s3/test_s3_storage.py Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…xity The 30s in-memory count cache had two problems that outweighed its benefit: - Per-pod (Guava, not Redis) so two pods serve different paging.totals; paging gets visibly inconsistent during the TTL window. - Applied to every entity, including small tables (tag, glossary_term, …) where SELECT count(*) is microseconds and caching adds pure overhead. Drop the cache, the cachedListCount wrapper, and the buildListCountCacheKey helper; switch the three callers in listAfter / listAfterWithOffset / listBefore to call dao.listCount directly. If a specific entity's count is ever measured to be slow, the right fix is targeted (Postgres pg_class.reltuples estimate, or just dropping paging.total entirely for that endpoint), not a per-pod global cache. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 24 out of 24 changed files in this pull request and generated 3 comments.

Five small Copilot/gitar findings on PR #27836: - OpenMetadataServerHealthCheckTest: comment said "intentionally low (1 ms)" while the assertion enforces 50 ms; updated comment to match. - ContainerRepository derived-tags fallback: pass the exception object to LOG.warn so the stack trace is preserved (was logging only ex.getMessage()). - HikariCPDataSourceFactory: drop the stale "default 0 (disabled)" inline comment that contradicts the 60s default we now apply. - DataAssetsHeader: include `entityType` in the breadcrumb fetch effect's dependency array so switching entity type while keeping FQN re-runs. - storageAPI / ContainerChildren: type the /children response as `PagingResponse<Container[]>` (matches the slim Container summaries the server now returns) and tighten the component's state + column generics to `Container[]` instead of `EntityReference[]`. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

basedpyright runs in --baselinemode=discard for the static-checks CI job, so new uses of self.get_suffix / self._use_raw_data / dict-typed responses on self.client.get(...) are flagged even though the same patterns are baselined in older mixin code. Sidestep the new errors by hardcoding the /containers path (the REST client prepends /api/v1) and narrowing the response via isinstance(resp, dict) before calling .get(...). Drop the _use_raw_data branch — the helper has a typed return so callers don't need a raw-response escape hatch. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 24 out of 24 changed files in this pull request and generated 2 comments.

github-actions · 2026-05-01T00:16:51Z

The Python checkstyle failed.

Please run make py_format and py_format_check in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Python code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

Two gitar review findings on the container tag-fetch refactor: - The setTags=false branch of populateDataModelColumnTags ran a c.setTags(c.getTags()) loop over every flattened column — a literal no-op carried over from Entity.populateEntityFieldTags. Replace it with an early return and a comment explaining the prior behavior. - The batchFetchDerivedTags try/catch + warn-and-fallback existed verbatim in both fetchAndSetDataModelColumnTags (bulk path) and populateDataModelColumnTags (single path). Extract a private tryBatchFetchDerivedTags helper so the fallback behavior stays in lockstep across both paths. No behavior change. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

- ContainerChildren: include `t` in the columns useMemo dep array so in-flight language switches re-render the column titles. The memo had [] but referenced t('label.name'), so the title stayed in the prior language until the component remounted. - ContainerPage: eagerly fetch the children paging.total on container navigation (limit=0, no row payload) so the Children tab badge shows the correct count immediately. Without this, the badge stayed at 0 for containers that default to a non-Children tab (e.g. ones with a dataModel that opens the Schema tab) until the user clicked Children and ContainerChildren reported the count via its context setter. - ContainerPage.test: add a default mockResolvedValue for the new getContainerChildrenByName call so existing tests don't blow up on an unmocked .then. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 25 out of 25 changed files in this pull request and generated 2 comments.

@slf4j

Drop the dedicated slow-request log entirely and let those lines flow into the main application log alongside everything else. Three pieces removed: - conf/openmetadata.yaml: the org.openmetadata.slowrequest logger block that pinned level=OFF (additive=false) and routed records to a separate ./logs/slow-requests.log file. Without the override, the topic now inherits root (INFO + server.log + stdout). - openmetadata-service / integration-tests logback.xml: matching per-logger entries removed. - OpenMetadataApplication.routeSlowRequestLogToServerLog: the runtime hack that reached into the logback context to flip additivity on. With the YAML override gone, the workaround it was working around is gone too. RequestLatencyContext still uses @slf4j(topic = "org.openmetadata.slowrequest") so operators can grep / filter by name, but the slow-request emission is now LOG.info instead of LOG.warn — the user's request was "log to main logger in INFO mode". The other LOG.warn calls in the class stay (they're real config-error warnings, not slow-request emissions). Addresses Copilot review 4209338255 (don't override operator logging config) by removing the override entirely. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Replace `(container?.children ?? []) as unknown as Container[]` with a structural row type (`ContainerChildRow`) that only declares the fields the table actually renders. Both the slim Container summaries returned by the /children endpoint and the EntityReference[] the customize-page widget preview feeds in are structurally assignable to it, so no cast is needed and the compiler will catch a real shape mismatch (e.g., if `description` ever moved off Container) instead of letting it slip through. Suggested in Copilot review on ContainerChildren.tsx:130. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 29 out of 29 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

openmetadata-ui/src/main/resources/ui/src/components/DataAssets/DataAssetsHeader/DataAssetsHeader.component.tsx:355

fetchContainerAncestors does not guard against stale in-flight requests. If the user navigates quickly between containers (or an earlier request resolves after a later navigation), the older response can call setParentContainers(...) and overwrite the newer container’s breadcrumbs.

Consider tracking the latest requested FQN in a useRef (or using an AbortController) and only applying ancestors/setIsBreadcrumbLoading(false) when the resolving request still matches the current dataAsset.fullyQualifiedName.

  const fetchContainerAncestors = async (containerFqn: string) => {
    // Always reset state at the top so a navigation to a container without an FQN
    // (or to one whose ancestor fetch fails) doesn't keep painting the previous
    // container's breadcrumbs. Without this reset, switching between containers
    // could leave stale ancestors visible after either an early return or an error.
    setParentContainers([]);
    if (isEmpty(containerFqn)) {
      return;
    }
    setIsBreadcrumbLoading(true);
    try {
      // Single batched server call replaces what used to be one
      // getContainerByName per ancestor level.
      const ancestors = await getContainerAncestors(containerFqn);
      setParentContainers(ancestors ?? []);
    } catch (error) {
      showErrorToast(error as AxiosError, t('server.unexpected-response'));
    } finally {
      setIsBreadcrumbLoading(false);

Copilot · 2026-05-01T00:47:09Z

 # Logging settings.
 # https://logback.qos.ch/manual/layouts.html#conversionWord
 # Set LOG_FORMAT=json for structured logs. The default text format preserves legacy output.
 logging:
  level: ${LOG_LEVEL:-INFO}
  loggers:
    org.openmetadata.service.util.jdbi.OMSqlLogger:
      level: ${SQL_LOG_LEVEL:-INFO}
      additive: false
      appenders:
        - type: file
          layout:
            type: om-event-layout
            format: ${LOG_FORMAT:-text}
            pattern: "%level [%d{ISO8601,UTC}] [%t] %logger{5} - %msg%n"
            appendLineSeparator: true
          threshold: DEBUG
          currentLogFilename: ./logs/queries.log
          archivedLogFilenamePattern: ./logs/queries-%d{yyyy-MM-dd}-%i.log.gz
          archivedFileCount: 7
          timeZone: UTC
          maxFileSize: 50MB


With the dedicated org.openmetadata.slowrequest logger removed from the sample config, the RequestLatencyContext slow-request logs (topic org.openmetadata.slowrequest) will now inherit the root logger settings and flow into the main appenders by default. Combined with the switch to LOG.info(...), this can significantly increase log volume and also removes the previous ability to toggle/route these logs separately.

If the intent is to keep slow-request logging optional or isolated, consider re-adding an explicit org.openmetadata.slowrequest logger entry here (potentially with its own appender/level) or documenting the new recommended override mechanism for operators.

The container detail page no longer inlines `children` into the parent payload (it was unbounded for object-store buckets), and added a server-side ancestor resolver to skip the per-level breadcrumb fetch loop. Mirror both in the Java and Python SDKs so external consumers have a clean migration path off `fields=children`. Java SDK - ContainerService: listChildren(fqn, ListParams) → ListResponse<Container> (slim projection, paginated via limit/offset) and listAncestors(fqn) → List<EntityReference> (root → immediate parent, single call). - Containers fluent: top-level `Containers.listChildren(fqn).limit(N).fetch()` and `Containers.listAncestors(fqn)`, plus `FluentContainer.children()` / `.ancestors()` for chaining off an already-fetched container. Python SDK - Containers.list_children(fqn, limit=, offset=) → EntityList[Container], delegates to the existing `OMetaContainerMixin.list_container_children` helper. - Containers.list_ancestors(fqn) → list[EntityReference], hits /v1/containers/name/{fqn}/ancestors directly. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The line speaks for itself; the why is in the commit history and the ContainerResource Swagger description. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 32 out of 32 changed files in this pull request and generated 2 comments.

A 50ms wall-clock bound in a unit test is flaky under JVM warmup, GC, and CI scheduler noise — and it doesn't actually verify the "no I/O" contract, just that whatever the test happened to do finished fast. The remaining test covers what we actually care about: the probe returns healthy. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

With fields=children removed, /v1/containers/name/{fqn}/children is the primary way to enumerate a container's children. The endpoint took a SecurityContext but never called authorizer.authorize on it, so a caller without VIEW_BASIC on the parent could still walk its subtree. Mirror the gate listAncestors got: build a VIEW_BASIC OperationContext + ResourceContextByName(fqn) and authorize before delegating to the repo. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

URLEncoder is form-encoding, not path-encoding: it emits + for spaces and leaves / unencoded, both of which the server treats differently in a path segment than a query parameter. EntityServiceBase already has the right helper (buildPathWithEncodedName) backed by okhttp3.HttpUrl.Builder for RFC-3986 path-segment encoding — but it was private. Promote it to protected and reuse for /children and /ancestors so a container FQN with a space (or any character that differs between form and path encoding) hits the right endpoint instead of 404'ing. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

gitar-bot · 2026-05-01T01:21:40Z

Code Review ✅ Approved 11 resolved / 11 findings

Optimizes container tag retrieval by replacing unbounded prefix lookups with a batched exact-match strategy for column FQN hashes. All identified issues, including redundant hash computation, authorization gaps, and duplicate tag processing logic, have been resolved.

✅ 11 resolved

✅ Performance: FQN hash computed twice per column in populateDataModelColumnTags

📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/ContainerRepository.java:259-268
In populateDataModelColumnTags, FullyQualifiedName.buildHash(c.getFullyQualifiedName()) is called once when building the targetHashes list (line 261) and again when looking up each column in the result map (line 267). For large data models this doubles the hashing work unnecessarily. Pre-computing a Map<Column, String> or Map<String, Column> would avoid the redundant computation.

✅ Bug: Bulk-fetched children skip setFields/setDefaultFields hydration

📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/ContainerRepository.java:565-579 📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/ContainerRepository.java:84-96 📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/ContainerRepository.java:326-330
The old code called Entity.getEntity(ref, EMPTY_FIELDS, ALL) for each child, which routes through EntityRepository.get() → setFields(). Even with empty fields, setFields unconditionally calls setDefaultFields(container) (line 87), which populates the service reference via a relationship lookup (line 326-329).

The new code at line 567 uses dao.findEntitiesByIds(ids, Include.ALL), which only deserializes the stored JSON blob and never calls setFields or setDefaultFields. As a result, every child container returned by listChildren will be missing its service EntityReference (and any other relationship-based fields that setDefaultFields populates).

This is an observable API regression — callers of the listChildren endpoint will receive child containers with a null service field.

✅ Security: listAncestors endpoint has no authorization check

📄 openmetadata-service/src/main/java/org/openmetadata/service/resources/storages/ContainerResource.java:780-786
The new listAncestors endpoint at line 780 bypasses the base-class getInternal/listInternal machinery and directly calls repository.getAncestors(fqn) with no authorization gate. While listChildren follows the same pattern (also no auth), the ancestors endpoint is new code being added now. An unauthenticated caller could enumerate the full container hierarchy for any FQN.

Consider adding an OperationContext + authorizer.authorize() check, or routing through the base-class helper that enforces VIEW_BASIC permissions, similar to how get() delegates to getInternal().

✅ Quality: getAncestors returns mutable lists instead of unmodifiable

📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/ContainerRepository.java:603 📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/ContainerRepository.java:628
Both the early-return at line 603 (new ArrayList<>()) and the normal return at line 628 return mutable ArrayList instances. Per project conventions, repository methods should return unmodifiable collections to prevent accidental mutation by callers.

✅ Quality: EntityReferenceRow should be a Java 21 record

📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityDAO.java:263-277
The new EntityReferenceRow inner class (lines 263-288) is a plain data carrier with public final fields and a constructor — the exact use-case for a Java record. Per the project's Java 21 guidelines, this should be record EntityReferenceRow(UUID id, String name, String displayName, String fqn, boolean deleted) with the toEntityReference method kept as an instance method on the record. This would eliminate ~15 lines of boilerplate and align with the existing EntityNameColumnHashJsonPair and EntityIdJsonPair records already declared in the same file (lines 836, 845).

...and 6 more resolved from earlier reviews

Options

Display: compact → Showing less information.

Comment with these commands to change:

`Compact`
`gitar display:verbose`

_{Was this helpful? React with 👍 / 👎 | Gitar}

Copilot

Pull request overview

Copilot reviewed 33 out of 33 changed files in this pull request and generated no new comments.

Comments suppressed due to low confidence (2)

openmetadata-service/src/main/resources/logback.xml:27

RequestLatencyContext logs to the org.openmetadata.slowrequest logger topic. With the dedicated logger removed here, slow-request diagnostics will inherit the root logger configuration (STDOUT + main file) and can no longer be routed to a separate file / disabled via a dedicated logger level. If this was not intentional, consider re-adding an explicit org.openmetadata.slowrequest logger configuration (level + optional file appender) so operators can control noise and retention for slow-request logs independently of the main application logs.

    <logger name="org.openmetadata.service.Entity" level="ERROR" />
    <!-- Add specific logger for SearchIndexApp and ReindexingJobLogger -->
    <logger name="org.openmetadata.service.apps.bundles.searchIndex" level="INFO" />
    <root level="${LOG_LEVEL:-INFO}">
        <appender-ref ref="STDOUT" />
        <appender-ref ref="file"/>
    </root>

openmetadata-sdk/src/main/java/org/openmetadata/sdk/services/storages/ContainerService.java:79

encodeFqn uses URLEncoder.encode(...) directly for a URL path segment. URLEncoder is intended for query-string encoding and will encode spaces as +, which is not a valid path encoding and is inconsistent with other SDK services (e.g. PipelineService replaces + with %20). This can break requests for FQNs containing spaces (or other characters that need path encoding).

Consider switching to a shared path-segment encoder (or at minimum .replace("+", "%20")) to keep SDK URL construction correct and consistent across services.

sonarqubecloud · 2026-05-01T01:48:42Z

Quality Gate passed for 'open-metadata-ui'

Issues
1059 New issues
0 Accepted issues

Measures
17 Security Hotspots
0.0% Coverage on New Code
3.8% Duplication on New Code

See analysis details on SonarQube Cloud

sonarqubecloud · 2026-05-01T02:23:45Z

Quality Gate passed for 'open-metadata-ingestion'

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
41.2% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

Containers with deep nesting causing performance issues due to tag fetch

92692a2

Copilot AI review requested due to automatic review settings April 30, 2026 05:05

github-actions Bot added backend safe to test Add this label to run secure Github workflows on PRs labels Apr 30, 2026

Copilot started reviewing on behalf of harshach April 30, 2026 05:06 View session

harshach changed the title ~~Containers with deep nesting causing performance issues due to tag fetch~~ Containers: batch container data-model column tag retrieval to avoid subtree fan-out Apr 30, 2026

gitar-bot Bot reviewed Apr 30, 2026

View reviewed changes

Comment thread openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/ContainerRepository.java Outdated

Copilot AI reviewed Apr 30, 2026

View reviewed changes

Comment thread openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/ContainerRepository.java Outdated

harshach had a problem deploying to test April 30, 2026 05:15 — with GitHub Actions Error

harshach and others added 2 commits April 29, 2026 22:18

Copilot AI review requested due to automatic review settings April 30, 2026 05:22

Copilot started reviewing on behalf of harshach April 30, 2026 05:23 View session

Copilot AI reviewed Apr 30, 2026

View reviewed changes

harshach had a problem deploying to test April 30, 2026 05:32 — with GitHub Actions Error

gitar-bot Bot reviewed Apr 30, 2026

View reviewed changes

Comment thread openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/ContainerRepository.java Outdated

harshach had a problem deploying to test April 30, 2026 05:58 — with GitHub Actions Failure

harshach temporarily deployed to test April 30, 2026 05:58 — with GitHub Actions Inactive

harshach and others added 2 commits April 30, 2026 16:17

Copilot AI reviewed Apr 30, 2026

View reviewed changes

gitar-bot Bot reviewed Apr 30, 2026

View reviewed changes

Comment thread ...metadata-service/src/main/java/org/openmetadata/service/jdbi3/HikariCPDataSourceFactory.java

gitar-bot Bot reviewed Apr 30, 2026

View reviewed changes

Comment thread openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/ContainerRepository.java

gitar-bot Bot reviewed Apr 30, 2026

View reviewed changes

Comment thread openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/ContainerRepository.java Outdated

harshach and others added 2 commits April 30, 2026 16:40