Optimize ai prompts with the 4-d method #5

Merged
Code-Eat-Rabbit merged 3 commits into issue-22392-poc-patch from cursor/optimize-ai-prompts-with-the-4-d-method-0516
Oct 11, 2025

@Code-Eat-Rabbit

Owner

Describe your changes:

Fixes #

I implemented owner type validation in ingestion/src/metadata/utils/owner_utils.py to enforce business rules: multiple users allowed, only one team allowed, and mutual exclusivity between users and teams. The JSON Schema (openmetadata-spec/src/main/resources/json/schema/type/ownerConfig.json) was updated with clearer descriptions and examples reflecting these rules. The inheritance logic in owner_utils.py was also verified and found to be correctly implemented.

Additionally, the owner configuration integration test suite was completely restructured and streamlined from 13 to 8 focused tests, moved to ingestion/tests/integration/owner_config_tests/. All test-related documentation and comments were translated to English, and the PostgreSQL test data (init-db.sql) was slightly expanded for better coverage.

Changes were tested using unit tests and the new 8-scenario integration test suite, specifically verifying Test 04 for owner type validation and Test 05 for inheritance behavior.
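
The owner rules described above (multiple users allowed, at most one team, no mixing of users and teams) could be checked along these lines. This is a minimal sketch, not the code in owner_utils.py: the function name `validate_owner_types` and the plain `"user"` / `"team"` type strings are assumptions made for illustration.

```python
from typing import List, Tuple


def validate_owner_types(owner_types: List[str]) -> Tuple[bool, str]:
    """Check a list of owner type strings against the three business rules.

    Hypothetical helper: returns (is_valid, reason). The reason is an empty
    string when the owner list is valid.
    """
    user_count = sum(1 for owner_type in owner_types if owner_type == "user")
    team_count = sum(1 for owner_type in owner_types if owner_type == "team")

    # Rule: users and teams are mutually exclusive.
    if user_count and team_count:
        return False, "users and teams cannot be mixed"
    # Rule: only one team is allowed.
    if team_count > 1:
        return False, "only one team is allowed"
    # Rule: any number of users is allowed.
    return True, ""
```

For example, `validate_owner_types(["user", "user"])` passes, while `["team", "team"]` and `["user", "team"]` are both rejected with a reason string.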

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: Implement owner type validation and streamline owner config tests
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: The changes are descriptive and example-based, not structural, so migration scripts are not needed.
  • I have added a test that covers the exact scenario we are fixing. For complex issues, comment the issue number in the test for future reference.
  • I have added tests around the new logic.
  • For connector/ingestion changes: I updated the documentation.

Implement owner type validation (multiple users, single team, no mixing) and clarify inheritance logic. Update JSON schema and tests.

Co-authored-by: yourton.ma <yourton.ma@gmail.com>
@cursor

cursor Bot commented Oct 11, 2025


Cursor Agent can help with this pull request. Just @cursor in comments and I'll start working on changes in this branch.
Learn more about Cursor Agents

cursoragent and others added 2 commits October 11, 2025 02:34
This commit restructures the owner configuration tests and documentation. It removes redundant files, consolidates information into README.md and QUICK-START.md, and ensures all test files are in the correct location with English comments.

Co-authored-by: yourton.ma <yourton.ma@gmail.com>
Removes outdated files, consolidates documentation, and adds a setup script for integration tests.

Co-authored-by: yourton.ma <yourton.ma@gmail.com>
@Code-Eat-Rabbit Code-Eat-Rabbit marked this pull request as ready for review October 11, 2025 02:59

@cursor cursor Bot left a comment

This PR is being reviewed by Cursor Bugbot

owner_entity = owner_ref.root[0]
all_owners.append(owner_entity)
owner_types.add(owner_entity.type)
logger.debug(f"Found owner by email: {owner_name} (type: {owner_entity.type})")

Bug: Owner Lookup Logic Limits to First Match

It looks like the owner lookup logic in _get_owner_refs now only processes the first entity found for a given owner name or email. Previously, it would include all entities returned by get_reference_by_name or get_reference_by_email. This change could lead to valid owners being dropped if the API returns multiple matches, resulting in incomplete owner assignments.
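
To make the reported difference concrete, the sketch below contrasts the two behaviours on plain data. It does not reproduce `_get_owner_refs`: `OwnerRef`, `collect_first_match`, and `collect_all_matches` are hypothetical names, and the lookup results are modelled as a dict of lists instead of real API responses.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class OwnerRef:
    """Hypothetical stand-in for one entity reference returned by a lookup."""

    name: str
    type: str


def collect_first_match(lookups: Dict[str, List[OwnerRef]]) -> List[OwnerRef]:
    """Behaviour flagged in the review: keep only the first match per owner."""
    owners: List[OwnerRef] = []
    for matches in lookups.values():
        if matches:
            owners.append(matches[0])
    return owners


def collect_all_matches(lookups: Dict[str, List[OwnerRef]]) -> List[OwnerRef]:
    """Behaviour described as the previous one: keep every returned match."""
    owners: List[OwnerRef] = []
    for matches in lookups.values():
        owners.extend(matches)
    return owners
```

With a lookup that returns two entities for one name, the first function yields one owner and the second yields two, which is the drop the comment warns about.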

@Code-Eat-Rabbit Code-Eat-Rabbit merged commit 4dab553 into issue-22392-poc-patch Oct 11, 2025
1 check passed
@Code-Eat-Rabbit Code-Eat-Rabbit deleted the cursor/optimize-ai-prompts-with-the-4-d-method-0516 branch October 12, 2025 02:52
Code-Eat-Rabbit pushed a commit that referenced this pull request Jun 30, 2026
…tency (open-metadata#28117)

* fix(rdf): converge Fuseki state on weekly rebuilds and isolate API latency

RdfIndexApp ran daily and never reconciled removed relationships, so triples
grew unboundedly across runs. When Fuseki crash-looped on the resulting disk
pressure, every entity-write hook blocked synchronously on the unreachable
server (no HTTP connect timeout, 3-retry loop on ConnectException), saturating
the bounded AsyncService pool and pushing login to ~45s.

Storage-side fixes (stop growth):
- Drop the extractRelationshipTriples "preserve forward" path in
  RdfRepository.createOrUpdate; the translator is the source of truth and the
  surrounding orchestration already rewrites the current relationship set.
  This also removes a wasted CONSTRUCT round-trip per entity write.
- bulkStoreRelationships now does per-source-entity DELETE WHERE with a
  predicate-exclusion FILTER for lineage edges, so relationships that no
  longer exist actually leave the store.
- Wire RdfRepository.clearAllGlossaryTermRelations() into RdfIndexApp's
  initializeJob (the method existed but had no callers).
- Flip recreateIndex default to true and move the cron to Saturday midnight
  ("0 0 * * 6"). Add reloadOntologies() so CLEAR ALL doesn't leave the
  ontology graph empty before indexing starts.
- Include a 2.0.1 post-data migration that updates existing installed_apps
  rows; the app loader is insert-only on upgrade.

Connectivity / concurrency fixes (isolate API latency from Fuseki health):
- Add 2s connectTimeout to every JenaFusekiStorage HttpClient and fast-fail
  on ConnectException / ClosedChannelException / HttpConnectTimeoutException
  instead of retrying. Introduce a 5-failure/30s circuit breaker.
- Route all RdfUpdater mutators through AsyncService.execute with a bounded
  pendingWrites gate (cap 1000, drop-on-overflow with logged warning) so a
  dead Fuseki can no longer block request threads or starve the AsyncService
  pool.
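
A 5-failure/30s circuit breaker of the kind mentioned above can be sketched as follows. This is a language-agnostic illustration written in Python, not the Java code in JenaFusekiStorage: the class name, the method names, and the choice to count consecutive failures are all assumptions.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Hypothetical breaker: opens after N consecutive failures, for a fixed period."""

    def __init__(
        self,
        failure_threshold: int = 5,
        open_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._threshold = failure_threshold
        self._open_seconds = open_seconds
        self._clock = clock  # injectable so the behaviour can be tested without sleeping
        self._failures = 0
        self._opened_at: Optional[float] = None

    def allow_request(self) -> bool:
        """Return False while the breaker is open, so callers can fail fast."""
        if self._opened_at is None:
            return True
        if self._clock() - self._opened_at >= self._open_seconds:
            # The open period has elapsed: close the breaker and start counting again.
            self._opened_at = None
            self._failures = 0
            return True
        return False

    def record_success(self) -> None:
        self._failures = 0

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self._threshold:
            self._opened_at = self._clock()
```

A caller would check `allow_request()` before touching the server and skip the call when it returns False, so request threads never wait on a server already known to be down.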

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(rdf): address PR review — preserve relationships, scope DELETEs, surface ontology failures

PR open-metadata#28117 review feedback. Addresses 13 findings across gitar-bot and Copilot:

Storage correctness:
- JenaFusekiStorage.storeEntity now keeps URI-valued triples (relationships)
  and only refreshes literal-valued triples. A metadata-only PATCH would
  otherwise wipe every inter-entity edge until the next weekly recreate-index,
  and async ordering between updateEntity and addRelationship could leave the
  graph missing edges (Copilot #1, #2).
- RdfRepository.removeRelationship wraps the DELETE in the knowledge named
  graph and uses getRelationshipPredicate so the predicate URI matches what
  addRelationship actually wrote (e.g. UPSTREAM → prov:wasDerivedFrom). The
  previous bare DELETE in the default graph was a silent no-op (Copilot #3).
- RdfBatchProcessor now calls a new RdfRepository.clearOutgoingEntityRelationships
  for every entity in the batch, not just those with current edges. An entity
  whose last outgoing relationship was removed in MySQL contributes zero
  RelationshipData entries, so bulkStoreRelationships' per-source DELETE
  never fired for it (Copilot #4).
- bulkStoreRelationships no longer swallows non-connect DELETE errors —
  DELETE WHERE on a source with no edges is a no-op, so exceptions there
  are real failures (malformed SPARQL, auth, server errors) and should
  surface (Copilot #5).

Visibility:
- reloadOntologies() now checks areOntologiesLoaded() after load and throws
  if still empty. OntologyLoader.loadOntologies catches internally, so the
  old reloadOntologies always appeared to succeed (Copilot #6).
- clearAllGlossaryTermRelations rethrows on failure instead of silently
  logging — the indexer's caller can now react to cleanup failures (Copilot #10).
- clearAllGlossaryTermRelations pulls custom predicate URIs from
  GlossaryTermRelationSettings and includes them in the DELETE FILTER. The
  hardcoded list missed any custom predicates an admin configured (Copilot #7).

Quality:
- Set / LinkedHashSet imported instead of using java.util.* fully qualified
  in JenaFusekiStorage and RdfBatchProcessor (gitar-bot #2).
- RdfIndexAppTest uses InOrder to assert clearAll → reloadOntologies
  ordering — a plain verify would have accepted a future change that
  reordered the calls (Copilot #9).
- Documented the residual gap that HttpClient.connectTimeout only bounds
  TCP connect, not request bodies; circuit breaker + bounded pendingWrites
  contain the blast radius (Copilot #8).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(rdf): expect per-source clear on batches whose relationships are all filtered

The two EventSubscription-skip tests used verifyNoInteractions on the RDF
repository mock, which was valid before because filtered batches never
touched RDF. The new per-source reconciliation clear in
RdfBatchProcessor.processBatchRelationships now runs for every batch entity
regardless of whether its relationships survive filtering — that's
deliberate, since stale RDF state for those source entities still needs
to be reconciled even when their current MySQL edges all point to excluded
entity types. Switch the assertions to verify clearOutgoingEntityRelationships
is the sole interaction (no bulkAdd, no addRelationship).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(rdf): address remaining PR review nits

Three findings from the second gitar-bot review pass:

- Replace the fully qualified `org.openmetadata.schema.configuration.GlossaryTermRelationSettings` / `SettingsType` / `SettingsCache` references in clearAllGlossaryTermRelations with imports, matching the project's existing convention. Other pre-existing FQN usages in the same file are left alone (not part of this PR's scope).
- Make expandPredicateCurie throw IllegalArgumentException on null/empty input instead of silently defaulting to `om:relatedTo`. The current caller already null-guards so the path is unreachable today, but a future caller could otherwise silently mis-clean a misconfigured predicate.
- Document why the lineage predicate URIs in the reconciliation DELETE filter (UPSTREAM / hasLineageDetails) are literal-hardcoded rather than baseUri-derived: they match what addLineageWithDetails actually writes (also hardcoded at RdfRepository.java:423,435). Switching the filter to be baseUri-derived would stop matching the stored lineage triples on non-default baseUri deployments and would incorrectly delete them. Comment added in both clearOutgoingEntityRelationships and bulkStoreRelationships so the next reader doesn't get nudged into "fixing" it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(rdf): surface cleanup failures, sync fallback predicates, time-bound reads

Addresses the three unresolved Copilot findings from review 4295208187:

- Drop the try/catch around clearAllGlossaryTermRelations in initializeJob.
  clearAllGlossaryTermRelations rethrows specifically so the indexer can fail
  loudly; wrapping it again let an unreconciled graph slip past as a
  "successful" run. The outer execute() handler will now mark the run FAILED.

- Sync DEFAULT_GLOSSARY_TERM_RELATION_PREDICATES with what SettingsCache
  actually bootstraps (SettingsCache.java:355-486): adds skos:exactMatch (the
  real default for `synonym`), om:antonym, om:partOf, om:hasPart, rdfs:seeAlso.
  Keeps legacy om:* URIs from the stale getGlossaryTermRelationPredicateUri
  switch so a cleanup run still scrubs pre-SettingsCache data.

- Apply READ_TIMEOUT_MS (10s) via QueryExecution.setTimeout on every read path
  (executeSparqlQuery for SELECT/CONSTRUCT/ASK/DESCRIBE, getEntity, getAllGraphs,
  getTripleCount, testConnection, the ontology presence check). A Fuseki that
  accepts the TCP connection but stalls mid-query no longer hangs reads
  indefinitely. UPDATE-side calls still rely on the connect timeout + circuit
  breaker + bounded pendingWrites since Jena's RDFConnection.update API
  doesn't expose a per-request timeout cleanly; comment near the constant
  notes the gap and a viable follow-up via UpdateExecHTTPBuilder.timeout.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(rdf): qualify EntityRelationship in test to fix compile

RdfIndexAppTest references EntityRelationship.class in two verify() calls
that I added in the previous commit, but the class was never imported into
the test file. CI's openmetadata-service test compile fails with "cannot
find symbol class EntityRelationship", which cascades into 11 dependent
checks (build x2, openmetadata-service-unit-tests, three Java integration
test workflows, two Python integration test shards that build OM as a
setup step, Test Report aggregate, maven-sonarcloud-ci, and the unit-test
status gate). Use the fully qualified
org.openmetadata.schema.type.EntityRelationship to match how every other
reference in this file already spells it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(rdf): drop QueryExecution.setTimeout — removed in Jena 5 used by IT classpath

GlossaryOntologyExportIT was failing on RdfUpdater.initialize with
NoSuchMethodError: 'void org.apache.jena.query.QueryExecution.setTimeout(long,
java.util.concurrent.TimeUnit)'. openmetadata-service builds against Jena 4.10
(apache-jena-libs), but openmetadata-integration-tests directly pulls in
jena-core/jena-arq 5.0.0, and Jena 5 removed the setTimeout overloads from
the QueryExecution interface. Compile passes, integration test JVM links the
5.x class and bombs at the first read path (loadOntology's ASK check).

Strip the nine setTimeout calls and the READ_TIMEOUT_MS constant. A clean
read-side timeout that works on both Jena 4 and 5 needs to be plumbed via
QueryExecutionHTTPBuilder.timeout / UpdateExecHTTPBuilder.timeout instead of
RDFConnection — bigger change than this PR should carry. The comment near
CONNECT_TIMEOUT now records that history so the next reader knows why we
don't simply re-add setTimeout. Protection against a stalled-but-accepting
Fuseki still relies on the 5-failure circuit breaker + bounded pendingWrites
gate in RdfUpdater.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(rdf): align ontology-loaded check, predicate URIs, and CURIE fallback

Three real bugs flagged by Copilot's later review passes:

- areOntologiesLoaded() looked for `"boolean" : true` (space before colon) but
  JenaFusekiStorage formats ASK results without that space, so the check never
  matched and reloadOntologies() always threw. recreateIndex=true (now the
  default) ran into this on the very first scheduled run. Normalise whitespace
  before checking.

- bulkAddRelationships wrote `om:<lowercase relationshipType>` directly, while
  removeRelationship uses getRelationshipPredicate which maps a handful of
  types to prov:* (UPSTREAM → prov:wasDerivedFrom, USES → prov:used, etc.).
  Triples written by the indexer were therefore unreachable by the live
  remove hook. Pre-compute predicateUri via getRelationshipPredicate in
  bulkAddRelationships and pass it through a new field on RelationshipData
  so JenaFusekiStorage uses the same URI both paths agree on. The legacy
  RelationshipData(5-arg) ctor still works for callers that don't have a
  predicate handy; bulkStoreRelationships falls back to the old shape there.

- expandPredicateCurie returned bare strings like `customRel` unchanged, but
  createPropertyFromUri's default branch writes `<baseUri>ontology/customRel`.
  Custom relation predicates expressed as local names would never match the
  cleanup FILTER. Mirror createPropertyFromUri: full URIs pass through, bare
  local names get the OM-ontology prefix.
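
The first and third fixes above can be illustrated with two small helpers. This is a sketch in Python of the ideas only, not the Java implementation: `ask_result_is_true` and `expand_local_predicate` are hypothetical names, the base URI in the usage note is an assumed value, and prefixed names such as `om:foo` are deliberately left out.

```python
import re


def ask_result_is_true(raw_json: str) -> bool:
    """Whitespace-insensitive check for a true ASK result.

    Stripping all whitespace first means `"boolean" : true` and
    `"boolean":true` are treated the same.
    """
    compact = re.sub(r"\s+", "", raw_json)
    return '"boolean":true' in compact


def expand_local_predicate(value: str, base_uri: str) -> str:
    """Full URIs pass through; bare local names get `<base_uri>ontology/`."""
    if not value:
        raise ValueError("predicate must be a non-empty string")
    if value.startswith(("http://", "https://")):
        return value
    if ":" in value:
        # Prefixed names need a prefix map, which this sketch does not model.
        raise ValueError("prefixed names are not handled in this sketch")
    return f"{base_uri}ontology/{value}"
```

Assuming a base URI of `https://open-metadata.org/`, `expand_local_predicate("customRel", ...)` produces `https://open-metadata.org/ontology/customRel`, the same shape the commit message describes for createPropertyFromUri's default branch.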

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(rdf): schema default + migration force entities=[all] for safe full reindex

- rdfIndexingAppConfig.json: flip recreateIndex.default from false to true so
  any UI form / config generation path that surfaces the schema default agrees
  with the install JSON files and the new full-rebuild semantics.

- 2.0.1 migration (MySQL + Postgres): in addition to flipping recreateIndex=true
  and the weekly Saturday cron, also rewrite appConfiguration.entities to
  ["all"]. Pre-upgrade an operator could have narrowed RDF indexing to a subset
  of entity types; the new recreateIndex=true semantics issues CLEAR ALL before
  indexing, which would otherwise wipe triples for excluded entity types and
  leave the graph permanently missing them. Forcing entities back to ["all"]
  ensures the post-CLEAR-ALL run repopulates the graph fully. Operators can
  re-narrow after the migration if they need partial indexing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(rdf): scope storeEntity DELETE to translator-managed predicates

Replace the literal-only FILTER(!isIRI(?o)) in JenaFusekiStorage.storeEntity
with a predicate-scoped DELETE so translator-emitted URI triples (tags,
glossary terms, owner, domain, tier, data products, structured sub-resources)
are refreshed from the new model on every entity write, while hook-managed
predicates (om:UPSTREAM, om:hasLineageDetails, om:owns / om:contains / ...)
stay intact.

Previously, with !isIRI(?o), every URI-valued triple survived storeEntity
forever — when a tag was removed or an owner changed, the old URI coexisted
with the new one because no hook ever cleans those up (tags live in
tag_usage, not entity_relationship; owners' translator-side predicate
om:hasOwner is not what the OWNS hook writes).

The DELETE set is the union of:
- RdfPropertyMapper.TRANSLATOR_MANAGED_DIRECT_PREDICATES, a static list of
  predicates that may shrink to empty between writes (so the current model
  walk wouldn't see them) — rdf:type, om:hasOwner, prov:wasAttributedTo,
  om:hasTag, om:hasGlossaryTerm, om:hasTier, om:belongsToDomain,
  om:hasDataProduct, dct:source, om:sourceUrl, plus the structured-resource
  attachment predicates (om:hasLifeCycle / hasCertification / hasExtension /
  hasCustomProperty).
- the predicates the current model actually emits for the entity subject,
  covering JSON-LD context-driven predicates that aren't in the static list.

Added two coverage tests on RdfPropertyMapperTest: the static set contains
the documented core predicates, and never contains lineage-hook predicates
(om:UPSTREAM, prov:wasDerivedFrom, om:hasLineageDetails) — that overlap
would let storeEntity wipe lineage edges on every entity update.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(rdf): scope reconciliation DELETE to relationship-hook predicates only

Both clearOutgoingEntityRelationships (in RdfRepository) and the per-source
DELETE inside JenaFusekiStorage.bulkStoreRelationships used to clear ANY
outgoing edge whose object was a baseUri/entity/ URI (with only the three
lineage predicates excluded). That swept up translator-managed URI triples
(om:hasTag, om:hasGlossaryTerm, om:hasOwner, om:belongsToDomain, …) which
bulkAddRelationships does not re-emit, so reconciliation runs were
permanently destroying tag/owner/domain links.

Switch the filter to opt-in: only delete triples whose predicate is in
RELATIONSHIP_HOOK_PREDICATES, derived from the Relationship enum via the
existing getRelationshipPredicate mapping. The set excludes the lineage
predicates by skipping the UPSTREAM enum value (managed by
addLineageWithDetails). Translator-managed predicates aren't relationship
types so they're naturally not in the set; the new
RdfPredicatePartitionTest enforces the partition.

Refactored getRelationshipPredicate into a static
getRelationshipPredicateUri so it can be reused at class-init time to build
the predicate set without an instance. Added a small buildPredicateInList
helper exposed at package level for JenaFusekiStorage to reuse.
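
The opt-in filter described above amounts to rendering a fixed predicate set into a SPARQL `IN` list and deleting only triples whose predicate is in it. The sketch below shows one way to build such a statement. It is an assumption-laden illustration in Python: the real buildPredicateInList output format is not shown in this message, `build_scoped_delete` is a hypothetical name, named-graph scoping is omitted, and URIs are not validated or escaped.

```python
from typing import Iterable


def build_predicate_in_list(predicates: Iterable[str]) -> str:
    """Render predicate URIs as `<a>, <b>` for use inside `?p IN (...)`."""
    # Sorting gives a deterministic query string, which helps logging and tests.
    return ", ".join(f"<{uri}>" for uri in sorted(set(predicates)))


def build_scoped_delete(source_uri: str, predicates: Iterable[str]) -> str:
    """Build a DELETE that only touches the listed predicates of one source."""
    in_list = build_predicate_in_list(predicates)
    if not in_list:
        # An empty IN list would match nothing; fail loudly instead.
        raise ValueError("at least one predicate is required")
    return (
        f"DELETE {{ <{source_uri}> ?p ?o }} "
        f"WHERE {{ <{source_uri}> ?p ?o . FILTER(?p IN ({in_list})) }}"
    )
```

Because the filter is an allow-list, any predicate that is not named in the set is left untouched by the statement, which is the property the partition test is described as enforcing.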

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(rdf): scope bulk reconciliation to batch entities, not all relationship sources

bulkStoreRelationships used to compute its per-source DELETE set from the
relationships list, so any source URI mentioned by any row in the batch was
reconciled. RdfBatchProcessor passes BOTH outgoing relationships (sources
inside the batch) and incoming UPSTREAM lineage (sources outside the batch
where this batch's entity is the target). The outside-batch sources had
their OTHER outgoing edges wiped, even though the indexer never planned to
re-index them.

Add a 2-arg overload to RdfStorageInterface.bulkStoreRelationships that
takes an explicit Set<String> sourcesToReconcile. The default 1-arg method
keeps the legacy "derive from relationships" behavior for any plugin caller
that hasn't migrated. RdfRepository.bulkAddRelationships gains a matching
overload taking Set<EntitySourceRef>; RdfBatchProcessor passes its
batchSources (the entities it is actually indexing in this pass).

JenaFusekiStorage.bulkStoreRelationships now iterates sourcesToReconcile for
the per-source DELETE instead of computing distinctSources from
relationships. The new buildEntityUri helper on the interface lets callers
(or the default delegate) build consistent source URIs.

QLeverStorage stubs the new overload (still UnsupportedOperationException).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(rdf): time-bound HTTP request bodies via CompletableFuture wrapper

Wrap every blocking RDFConnection call in the hot read/write paths
(storeEntity DELETE+LOAD, storeRelationship, bulkStoreRelationships,
getEntity, deleteEntity, executeSparqlQuery, executeSparqlUpdate) with a
CompletableFuture-based 10s request timeout. When Fuseki accepts the TCP
connection and then stalls on the response, the caller thread now frees
after 10s instead of waiting until the OS gives up on the socket (~60s).

We chose CompletableFuture over Jena's QueryExecution.setTimeout because
that overload was removed in Jena 5 (broke integration tests already once
in this PR), and over Jena's QueryExecutionHTTPBuilder / UpdateExecHTTPBuilder
because their API surface differs between Jena 4 and Jena 5 and our two
classpaths use different versions. The CompletableFuture wrapper is Jena-
API-agnostic.

On timeout the underlying HTTP request still leaks its (virtual) thread
until OS-level TCP give-up; that's bounded by the existing circuit breaker
(after 5 timeouts the breaker opens for 30s, short-circuiting subsequent
traffic).
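
The same pattern, a blocking call submitted to a dedicated executor and awaited with a deadline, looks roughly like this in Python. It is an analogue of the idea, not the Java wrapper: `run_with_timeout`, `slow_call`, the pool size, and the thread name prefix are hypothetical, and Python threads stand in for the virtual threads used on the Java side.

```python
import concurrent.futures
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Dedicated pool so stalled calls cannot starve unrelated work elsewhere.
_EXECUTOR = concurrent.futures.ThreadPoolExecutor(
    max_workers=4, thread_name_prefix="rdf-timeout"
)


def run_with_timeout(call: Callable[[], T], timeout_seconds: float = 10.0) -> T:
    """Run `call` on the pool and wait at most `timeout_seconds` for its result.

    Raises concurrent.futures.TimeoutError when the deadline passes. The worker
    keeps running until the call returns on its own, so the caller is freed but
    the underlying work still occupies a thread.
    """
    future = _EXECUTOR.submit(call)
    return future.result(timeout=timeout_seconds)


def slow_call() -> str:
    """Demo stand-in for a server that accepts the connection and then stalls."""
    time.sleep(0.3)
    return "late"
```

Calling `run_with_timeout(slow_call, 0.05)` raises after about 50 ms while the worker thread stays busy until `slow_call` returns. That mirrors the leak the message describes, and it is why a breaker is still needed to bound the pile-up.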

Lower-traffic paths (loadTurtleFile, clearGraph, getAllGraphs, getTripleCount,
loadOntology, testConnection) keep using the direct connection.update /
connection.query / connection.load calls — they're protected by the
circuit breaker and the connect timeout, and adding wrappers there is
churn without proportional benefit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(rdf): document RdfUpdater async-ordering trade-off in submitAsync

Add a comment block in RdfUpdater.submitAsync explaining why we accept the
loss of per-entity ordering when submitting through AsyncService:
- EntityUpdater diff-applies changes per request, so add-then-remove of the
  same edge within one API call nets to no-op (no hooks fire).
- Cross-request races reconcile at the next weekly recreate-index, which
  rebuilds from MySQL.
- The alternative (per-entity striped lock) costs memory and adds latency
  for the no-contention common case.
- Pointers for the future maintainer if an observed-in-production race
  emerges: gate via ConcurrentHashMap<UUID, Semaphore>.

No behavior change. The two open Copilot threads on this trade-off
(M6CQYup, M6CYbM2) stay open so a future PR can pick them up if needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(rdf): atomic clear+insert, broader fallback predicate set, close temp models

Three follow-up findings from the latest Copilot pass:

- Atomicity (3249716506): clearOutgoingEntityRelationships + bulkAddRelationships
  ran as two separate SPARQL updates. If bulkAddRelationships failed after the
  clear succeeded, the batch entities had their relationships wiped without
  the replacement edges in place — they stayed gone until the next weekly
  recreate-index. Combine the per-source DELETE and the INSERT DATA into a
  single SPARQL update inside JenaFusekiStorage.bulkStoreRelationships and
  drop the now-redundant separate clear call from RdfBatchProcessor. Either
  the whole reconciliation commits or none of it does. Also let
  bulkStoreRelationships handle the zero-edge case (relationships empty,
  sourcesToReconcile non-empty) so RdfBatchProcessor doesn't need a separate
  clear for entities whose relationships were all removed in MySQL.

- Fallback predicate set (3249716532): when SettingsCache returns null,
  getGlossaryTermRelationPredicate falls back to literal
  `https://open-metadata.org/ontology/<relationType>` — so `broader` /
  `narrower` / `exactMatch` get written as om:broader/om:narrower/om:exactMatch,
  not skos:* equivalents. Without those URIs in DEFAULT_GLOSSARY_TERM_RELATION_
  PREDICATES, a cleanup run during a transient settings-cache outage would
  miss them. Added the three om:* fallback variants alongside the existing
  skos:*/rdfs:* bootstrap defaults.

- Temp Model leaks (3249319886): bulkAddRelationships and removeRelationship
  each create an ephemeral Jena Model just to mint property URIs. Wrapped
  both in try/finally close() so the in-memory graphs are released right after
  use. Jena 4's Model has a close() method but doesn't implement
  java.lang.AutoCloseable so try-with-resources isn't possible there.

Copilot's "still deleting only non-IRI" finding (3249716480) is a stale-
snapshot false positive — JenaFusekiStorage.storeEntity has used predicate-
scoped DELETE via TRANSLATOR_MANAGED_DIRECT_PREDICATES since 22d5825.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(rdf): make buildPredicateInList public so JenaFusekiStorage can use it

JenaFusekiStorage (org.openmetadata.service.rdf.storage) lives in a different
package than RdfRepository (org.openmetadata.service.rdf), so the
package-private buildPredicateInList helper introduced in 857c09 couldn't be
called from JenaFusekiStorage.bulkStoreRelationships — CI was failing with:

  [ERROR] JenaFusekiStorage.java:[606,51] buildPredicateInList(Set<String>)
  is not public in RdfRepository; cannot be accessed from outside package

Promote it to public alongside RELATIONSHIP_HOOK_PREDICATES (which is the
only data this helper renders) so the cross-package call resolves. Local
javac across the touched RDF files now reports zero new errors; the only
remaining build failures are the pre-existing es.co.elastic.clients shading
issues unrelated to this PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(rdf): normalise sourcesToReconcile to empty-set to prevent NPE in iteration

bulkStoreRelationships' early-return guard accepts sourcesToReconcile == null
as a valid input, but the subsequent per-source DELETE loop iterates
sourcesToReconcile directly — so a caller passing null with a non-empty
relationships list would skip the guard and crash at the for-loop.

Today no caller hits this path (RdfRepository.bulkAddRelationships always
passes non-null, and the 1-arg default interface method derives a set), but
the null-check in the guard explicitly encodes null as supported, so the
contract should match the iteration. Normalise once after the guard:

    Set<String> effectiveSources =
        sourcesToReconcile != null ? sourcesToReconcile : Set.of();

and use effectiveSources for both the loop and the success-log size.

Local filtered compile passes cleanly (zero NEW errors from RDF files;
remaining errors are the pre-existing es.co.elastic.clients shading mess).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(rdf): update RdfIndexAppTest verifications for the new bulkAddRelationships 2-arg signature

Three test failures after the Fix-I / atomic-clear-insert changes:

- testProcessBatchRelationshipsStoresResults verified
  `bulkAddRelationships(captor.capture())` (1-arg) but RdfBatchProcessor now
  calls the 2-arg `bulkAddRelationships(relationships, batchSources)` — Mockito
  surfaced this as "different arguments" because the actual call had a
  Set<EntitySourceRef> tail. Updated the verify to
  `bulkAddRelationships(captor.capture(), anySet())`.

- The two event-subscription skip tests previously verified
  `clearOutgoingEntityRelationships(anySet())` as the only interaction; that
  method is no longer called from RdfBatchProcessor (the clear was folded
  into bulkAddRelationships' atomic SPARQL transaction for safety). Replace
  with `verify(mockRdfRepository).bulkAddRelationships(eq(List.of()), anySet())`
  — bulkAdd is still invoked with an empty list to drive the per-source
  reconciliation for the batch entity, even when the only fetched
  relationship pointed at an excluded entity type.

Filtered local compile + test-compile passes cleanly (no NEW errors from RDF
files; only pre-existing es.co.elastic.clients shading errors remain).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(rdf): four follow-up findings from Copilot review 4299978111

- collectTranslatorPredicates over-broad (3249798300): RdfRepository.addRelationship
  passes storeEntity a model loaded from Fuseki PLUS the new relationship, so the
  dynamic walk was pulling hook-managed predicates (om:owns, etc.) into the DELETE
  scope. With async writes, two concurrent additions for the same source could
  each read the old model and each storeEntity wipe the other's relationship.
  Exclude RELATIONSHIP_HOOK_PREDICATES from the walk result (and defensively from
  the static-set union too).

- ForkJoinPool.commonPool starvation (3249798327): runWithTimeout used
  CompletableFuture.supplyAsync's default executor, so a Fuseki that stalls would
  leak workers on the JVM-wide commonPool and starve unrelated CompletableFuture
  / parallel-stream work. Introduce a dedicated virtual-thread executor
  (Thread.ofVirtual().name("rdf-storage-timeout-", 0)) and route all timeout
  wrappers through it — virtual threads are cheap to leak and the circuit breaker
  bounds the pile-up.

- Shrink-to-empty for literal predicates (3249798383): the predicate-scoped DELETE
  no longer caught stale literals when a literal-valued field (description /
  displayName / …) was cleared and the new model simply omitted the triple. Chain
  a "DELETE … FILTER(!isIRI(?o))" pass with the URI-scoped pass so hook-managed
  URI triples stay intact while stale literals get swept on every write.

- UI schema default (3249798439): the UI form schema at
  utils/ApplicationSchemas/RdfIndexApp.json still declared recreateIndex.default
  = false. Flipped to true to match the backend openmetadata-spec schema and the
  install JSON files. (The sibling jsons/applicationSchemas/ is gitignored
  generated output, no source change needed there.)

Local verification before push: spotless:apply, filtered compile + test-compile
(zero new errors), and `mvn test -Dtest='RdfIndexAppTest,RdfPropertyMapperTest,
RdfPredicatePartitionTest,RdfStorageIdempotencyTest'` — 64 tests, 0 failures.

The "buildPredicateInList package-private" finding from the same review
(3249798351) is already addressed in 03c5d4f and surfaces here only because
Copilot reviewed an earlier commit.

The "lineage incremental cleanup" finding (3249798415) is a known architectural
trade-off: addLineageWithDetails handles current lineage rows but removed edges
have no row to trigger a per-edge delete, and adding UPSTREAM/wasDerivedFrom to
RELATIONSHIP_HOOK_PREDICATES would conflict with the inline addLineageWithDetails
call that runs BEFORE bulkAddRelationships in RdfBatchProcessor. The weekly
recreateIndex=true run (the new default) wipes and rebuilds from MySQL, which
reconciles stale lineage; left this thread open as a documented gap rather
than reordering processBatchRelationships in this PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Code-Eat-Rabbit pushed a commit that referenced this pull request Jun 30, 2026
…ta#28224)

* feat(spec): add ContextMemory + CreateContextMemory JSON schemas

* feat(jdbi3): add ContextMemoryDAO

* feat: register contextMemory entity type constant

* feat(service): add ContextMemory repository, resource, mapper

* feat(bootstrap): add context_memory table DDL

* test(service): ContextMemory resource CRUD test

* fix(context-memory): address review (relationship types, stable FQN, status msg, test name)

- storeRelationships: rootMemory -> Relationship.CONTAINS, parentMemory -> Relationship.HAS
  so the root-ancestor and direct-parent hierarchies are distinguishable.
- setFullyQualifiedName: derive from the immutable name only (drop mutable
  primaryEntity/owner derivation that destabilized nameHash on update).
- validateStatusTransition: separate "no transitions defined" from "disallowed transition".
- Rename ContextMemoryResourceTest -> ContextMemoryStatusTransitionTest (pure unit test).
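
A rough sketch of the validateStatusTransition change described above, separating the two failure messages. The status values, the transition table, and the exception type are assumptions for illustration only; the real ContextMemory statuses are not shown in this log.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: statuses and transition table are invented for illustration.
final class StatusTransitionSketch {
  enum Status { DRAFT, ACTIVE, ARCHIVED }

  private static final Map<Status, Set<Status>> ALLOWED = new EnumMap<>(Status.class);

  static {
    ALLOWED.put(Status.DRAFT, EnumSet.of(Status.ACTIVE, Status.ARCHIVED));
    ALLOWED.put(Status.ACTIVE, EnumSet.of(Status.ARCHIVED));
    // ARCHIVED deliberately has no entry: no transitions are defined from it.
  }

  static void validateStatusTransition(Status from, Status to) {
    if (from == to) {
      return; // unchanged status is always fine
    }
    Set<Status> targets = ALLOWED.get(from);
    if (targets == null) {
      // Case 1: the source status has no transition table at all.
      throw new IllegalArgumentException("No transitions are defined from status " + from);
    }
    if (!targets.contains(to)) {
      // Case 2: a table exists, but this particular target is not in it.
      throw new IllegalArgumentException("Transition " + from + " -> " + to + " is not allowed");
    }
  }
}
```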

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(context-memory): add ContextMemoryIT + SDK ContextMemoryService

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(spec): register contextMemory in EntityLink.g4 ENTITY_TYPE grammar

EntityLinkGrammarTest.testAllEntityTypesHaveGrammarOrExclusion enumerates every
Entity.java constant and requires each to be in the EntityLink grammar or the
test's exclusion list. ContextMemory is a normal EntityRepository-backed
top-level entity (like learningResource / contextFile), so it belongs in the
ENTITY_TYPE rule.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(context-memory): override owner ITs for creator-as-owner default

ContextMemoryMapper.defaultOwners() intentionally assigns the creating
user as owner when the create request omits owners. BaseEntityIT's
patch_entityUpdateOwner_200 and patch_entityUpdateOwnerFromNull_200
assert "no owner initially" for any supportsOwners entity, so both
failed for ContextMemory.

Override both in ContextMemoryIT: keep the PATCH-replace-owner contract,
change only the precondition to expect the creator as the sole initial
owner (asserted by count, not a hardcoded principal). Mapper unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Update generated TypeScript types

Add the generated ContextMemory TS types (entity/context/contextMemory.ts,
api/context/createContextMemory.ts). The schemas were on the branch but their
generated types were missing, failing the TypeScript Type Generation check on
this fork PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(context-memory): address review (relationship cleanup, owner scope, validations)

Copilot review on the ContextMemory entity:
- #1 record primaryEntity/relatedEntities/root/parent/source*/machineRepresentation
  in version history; usageCount/lastUsedAt documented as untracked telemetry
- #2 clear stale HAS/RELATED_TO/CONTAINS edges before re-adding in storeRelationships
- #4 default creator as owner only on create; PUT without owners no longer
  silently replaces previously set owners
- #5 schema documents that any status is allowed at creation; transitions
  enforced only on update
- #6 setFullyQualifiedName via FullyQualifiedName.build with skip-if-set guard
- #7 validate shared principal type is user/team/domain
- #8 reject self-reference for parentMemory/rootMemory
- #10 inline Entity.CONTEXT_MEMORY, drop redundant constant

Regenerate ContextMemory TS types for the schema doc change; add IT coverage
for the self-reference and invalid-shared-principal validations.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(context-memory): don't blanket-delete relationships (domain data loss)

The #2 cleanup via deleteTo(memory, CONTEXT_MEMORY, HAS, null) also matched the
framework's domain --HAS--> memory edge (storeDomains runs before
storeRelationships in storeRelationshipsInternal, on every create and update),
silently dropping domain assignments.

storeRelationships is now add-only (addRelationship upserts, so re-running on
update is idempotent). Stale-edge cleanup moved to ContextMemoryUpdater using
the framework's updateFromRelationship(s) helpers, which delete only the
specific changed refs and record the version change. parentMemory now uses
Relationship.PARENT_OF (distinct from primaryEntity's HAS and the framework's
domain HAS) so the parent edge can be maintained without collision.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(bootstrap): move context_memory DDL from 2.0.1 to 2.0.0

The context_memory table belongs in the 2.0.0 migration. Relocated the
MySQL and Postgres DDL verbatim; the 2.0.1 schemaChanges.sql files are
restored to their original task_migration_mapping-only content.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(bootstrap): add ENGINE=InnoDB to context_memory MySQL DDL

Explicit engine clause, consistent with the task/search-index tables in the
same migration and robust to any server default change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(context-memory): preserve sanitized/validated fields; validate relatedEntities

Review follow-ups:
- ContextMemoryMapper no longer re-sets description/owners/domains/tags/displayName
  after copy(). copy() sanitizes description (stored-XSS) and validates owners and
  domains; re-setting the raw request values bypassed both. Only ContextMemory-
  specific fields are set now.
- prepare() now assigns the result of EntityUtil.populateEntityReferences back onto
  relatedEntities so orphaned/invalid refs are filtered instead of persisted.
- ContextMemoryIT Javadoc now references ContextMemoryRepository#setCreatorAsDefaultOwner
  (the defaultOwners mapper method no longer exists).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Code-Eat-Rabbit pushed a commit that referenced this pull request Jun 30, 2026
…cs (open-metadata#29201)

* feat(ai): add DERIVED_FROM, Metric.provider, ContextMemory.ontologyStats, AISettings schema

- Append DERIVED_FROM as last entry in entityRelationship.json enum (ordinal-safe)
- Add provider (ProviderType) field to metric.json and createMetric.json
- Add ontologyStats definition and property to contextMemory.json (OntologyStats javaType)
- Create configuration/aiSettings.json (AISettings, MemoryExtractionSettings, OntologyAgentSettings, PromptConfig, AIPrompts, AIDeletionPolicy)
- Register aiSettings in settings.json enum and config_value oneOf
- Add default seed openmetadata-service/src/main/resources/json/data/settings/aiSettings.json

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(ai): AISettings handler, cache seed/merge, and system settings REST branch

Implements Task 2: AISettingsHandler (validate + merge), SettingsCache
seed/merge block for aiSettings.json mirroring searchSettings, and
SystemResource PUT branch + reset extension for AI_SETTINGS.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(ai): propagate AISettings default-load IO errors and single-return reset

- loadDefaultAiSettings now declares throws IOException instead of
  swallowing it with an empty AISettings fallback; call sites in
  createOrUpdateSetting and resetSettingToDefault wrap with try/catch
  and re-throw as SystemSettingsException (matching the searchSettings
  error-handling pattern)
- resetSettingToDefault refactored to if/else-if/else with a single
  trailing return, eliminating the two early returns
- AISettingsHandlerTest: add incomingNullReturnsDefaults and
  nullNestedDefaultInheritsIncoming tests covering null-guard branches

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(ai): gate file/page extraction on AISettings and externalize the extraction prompt

- Add AISettingsUtil: cache-backed AISettings accessor (never null, fails open)
  with isFileExtractionEnabled, isPageExtractionEnabled, isOntologyAgentEnabled,
  memoryExtractionPrompt, ontologyAgentPrompt helpers
- ContextMemoryExtractor.callLlm resolves the system prompt from AISettings at
  runtime, falling back to the SYSTEM_PROMPT constant
- ContextFileProcessingService.process and fileStatusAfterText gate on
  AISettingsUtil.isFileExtractionEnabled in addition to LLMClientHolder.isEnabled
- KnowledgePageRepository.schedulePillExtraction gates on
  AISettingsUtil.isPageExtractionEnabled in addition to LLMClientHolder.isEnabled
- TDD: AISettingsUtilTest written first (RED), then implementation (GREEN)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(ai): ontology verdict DTO boundary

Add OntologyDerivation, OntologyVerdict (Jackson record DTOs with @JsonProperty on every component and @JsonIgnoreProperties), and OntologyAction constants (REUSE/CREATE/SKIP) as the anti-corruption boundary between untrusted LLM JSON and the domain model. Mirrors the KnowledgePill pattern. Covered by OntologyDerivationTest (lenient parse, unknown-field tolerance).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(ai): ontology grounding + extractor (pure derive)

Add OntologyContext/OntologyCandidate records, OntologyExtractor (calls the LLM via completeStructured, returns SKIP/SKIP on empty result), OntologyPromptBuilder (renders memory + candidate lists), and OntologyGrounding (keyword-searches glossary-term/metric/glossary indexes via Entity.getSearchRepository, caps at 20, fails safe per axis). Covered by OntologyExtractorTest (mocked LLM, two cases: verdict passthrough and empty result → SKIP/SKIP).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ai): use existing search/entity constants, dedupe helper, shorten prompt builder

Replace raw "_score"/"desc" with SearchConstants.DEFAULT_SORT_FIELD/DEFAULT_SORT_ORDER,
remove local FIELD_NAME/FIELD_DESCRIPTION in favour of Entity constants, dedupe
nullToEmpty via StringUtils.defaultString, extract renderMemory() so build() fits 15
lines, use CommonUtil.nullOrEmpty for the candidates guard, and replace raw "CREATE"/"SKIP"
string literals in tests with OntologyAction constants.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(ai): ontology reconciler with ownership lifecycle (create/reuse/retire/cascade)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(ai): correct RELATED_TO edge direction, honor deletionPolicy on re-derive, no-op on all-SKIP

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(ai): ontology processing engine with throttle and hash-gate

Adds OntologyProcessingEngine: a trailing-throttle debounce (mirroring
PageContextProcessingEngine) that collapses rapid memory edits into one
derivation run, protected by a SHA-256 content hash-gate so unchanged
memories are never re-derived. stampOntologyStats persists via
recordChange(updateVersion=false) in ContextMemoryUpdater, exactly
mirroring KnowledgePageRepository.recordExtractionStats, so no version
churn occurs and no postUpdate event fires for the stats field.
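
The debounce plus hash-gate idea above could look roughly like this. It is a sketch under stated assumptions: the class and method names are hypothetical, the last hash is kept in an in-memory map rather than stamped on the entity, and the content is captured at schedule time rather than re-read when the run fires.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: not the OntologyProcessingEngine implementation.
final class ThrottledHashGateSketch {
  private final Map<String, String> lastHash = new ConcurrentHashMap<>();
  private final Map<String, ScheduledFuture<?>> pending = new ConcurrentHashMap<>();
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor(
          runnable -> {
            Thread thread = new Thread(runnable, "hash-gate-sketch");
            thread.setDaemon(true);
            return thread;
          });
  private final long delayMillis;

  ThrottledHashGateSketch(long delayMillis) {
    this.delayMillis = delayMillis;
  }

  static String sha256Hex(String content) {
    try {
      byte[] digest =
          MessageDigest.getInstance("SHA-256").digest(content.getBytes(StandardCharsets.UTF_8));
      StringBuilder hex = new StringBuilder(digest.length * 2);
      for (byte b : digest) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("SHA-256 must be available on every Java platform", e);
    }
  }

  /** Trailing debounce: each call cancels the pending run for the id and schedules a new one. */
  void schedule(String memoryId, String content, Runnable derivation) {
    pending.compute(
        memoryId,
        (id, previous) -> {
          if (previous != null) {
            previous.cancel(false); // no effect if the previous run already finished
          }
          return scheduler.schedule(
              () -> runIfChanged(memoryId, content, derivation),
              delayMillis,
              TimeUnit.MILLISECONDS);
        });
  }

  /** Hash gate: skip the derivation when the content hash matches the last stamped hash. */
  private void runIfChanged(String memoryId, String content, Runnable derivation) {
    String hash = sha256Hex(content);
    if (hash.equals(lastHash.get(memoryId))) {
      return;
    }
    derivation.run();
    lastHash.put(memoryId, hash);
  }
}
```

Three rapid `schedule` calls for the same id collapse into one run for the last content, and a later call with identical content is dropped by the gate.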

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(ai): accurate recursion-contract docs, split term/metric stat counts, dedup scheduler

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(ai): ContextMemory lifecycle hooks trigger the ontology agent + cascade

postCreate/postUpdate schedule the OntologyProcessingEngine (gated on
AISettingsUtil.isOntologyAgentEnabled). softDelete/hardDelete/restore
AdditionalChildren fire in-edge-window so DERIVED_FROM edges exist when
OntologyReconciler.onMemoryDeleted/onMemoryRestored run. Extracted
AISettingsUtil.deletionPolicy() to remove duplication between the engine
and the repository. Added OntologyReconciler.onMemoryRestored which
restores CASCADE-soft-deleted automation-owned entities (Include.ALL
query; ORPHAN/DEPRECATE-released entities correctly excluded because
their DERIVED_FROM edges were dropped at delete time). Three new unit
tests cover: owned restore, human-adopted skip, orphan-released skip.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(ai): ontology-bot seed and adopt-on-touch provider release guards

Adds the ontology-bot principal seed files (bot + botUser JSON, auto-loaded
by BotResource.initialize at startup), the OntologyOwnership utility class
(centralized ONTOLOGY_BOT_NAME constant + releaseIfHumanEdited guard), and
wires the guard into the entitySpecificUpdate of GlossaryTermUpdater,
MetricUpdater and GlossaryUpdater so a human PATCH that changes an
agent-managed field flips provider AUTOMATION → USER permanently.
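
The adopt-on-touch rule above reduces to a small pure function. The sketch below is an assumption-laden illustration: the signature of `releaseIfHumanEdited`, the literal bot name, and the two-value `Provider` enum are hypothetical stand-ins, not the OntologyOwnership API.

```java
// Hypothetical sketch of the adopt-on-touch guard; signature and values are assumed.
final class AdoptOnTouchSketch {
  enum Provider { AUTOMATION, USER }

  // Assumed value for the centralized bot-name constant mentioned in the commit.
  static final String ONTOLOGY_BOT_NAME = "ontology-bot";

  /**
   * Returns the provider the entity should carry after an update. An automation-owned
   * entity is released to USER when someone other than the bot changes an agent-managed
   * field; in every other case the provider is left untouched.
   */
  static Provider releaseIfHumanEdited(
      Provider current, String updatedBy, boolean agentManagedFieldChanged) {
    boolean editedByHuman = !ONTOLOGY_BOT_NAME.equals(updatedBy);
    if (current == Provider.AUTOMATION && editedByHuman && agentManagedFieldChanged) {
      return Provider.USER;
    }
    return current;
  }
}
```

Since the function never maps USER back to AUTOMATION, the flip is permanent, which matches the behavior described in the commit message.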

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(ai): AISettings admin page with toggles and prompt editors

Adds the AISettingsPage under Settings > Preferences with master enable
toggle, memory-extraction and ontology-agent toggles, a deletion-policy
Select, and two system-prompt Textareas. Registers the route, menu item,
and all i18n keys (synced to all 17 locales).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ai): i18n the AISettings deletion-policy option labels

Replace raw string literals ('cascade'/'orphan'/'deprecate') in
DELETION_POLICY_OPTIONS with i18n labelKey fields; add new keys
label.cascade / label.deprecate / label.orphan to en-us.json (alphabetical)
and sync all 17 other locales via yarn i18n.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(ai): derived-entity provenance projections on memory/term/metric

Add three read-only derived fields (non-default, resolved at read, never
persisted) to expose File→Memory→Term/Metric provenance:

- ContextMemory.derivedEntities (entityReferenceList): terms+metrics
  created by the Ontology Agent via DERIVED_FROM edges
  (findFrom(memory, CONTEXT_MEMORY, DERIVED_FROM, GLOSSARY_TERM/METRIC))
- ContextMemory.reusedEntities (entityReferenceList): terms+metrics
  reused via RELATED_TO edges
  (findTo(memory, CONTEXT_MEMORY, RELATED_TO, GLOSSARY_TERM/METRIC))
- GlossaryTerm.derivedFrom (entityReference): memory that created the term
  (findTo(term, GLOSSARY_TERM, DERIVED_FROM, CONTEXT_MEMORY))
- Metric.derivedFrom (entityReference): memory that created the metric
  (findTo(metric, METRIC, DERIVED_FROM, CONTEXT_MEMORY))

Edge directions verified against OntologyReconciler Task-6 code:
addDerivedFromEdge stores from=entity→to=memory; reuse() stores
from=memory→to=entity for RELATED_TO.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(ai): integration tests for AISettings + ontology agent lifecycle

AISettingsResourceIT: deterministic GET/PUT/reset tests for /system/settings/aiSettings
covering default values (enabled=true, deletionPolicy=cascade), PUT persistence, and reset.

OntologyAgentIT: deterministic lifecycle tests seeding DERIVED_FROM edges via in-process
repository to replicate the reconciler's CREATE path, then driving cascade delete,
adopt-on-touch (provider flip), and derivedFrom/derivedEntities projection fields through
the public REST API without any LLM dependency.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(ai): File→Memory→Term/Metric provenance panels in the UI

- Add derivedEntities/reusedEntities to ContextMemory generated type
- Add derivedFrom to GlossaryTerm and Metric generated types
- Add DERIVED_FROM/DERIVED_ENTITIES/REUSED_ENTITIES to TabSpecificField enum
- Add getContextMemoryById to contextMemoryAPI
- Create DerivedOntologyCard component (+ interface + test): fetches
  derivedEntities/reusedEntities from a memory and renders linked lists
- Embed DerivedOntologyCard into CreateMemoryModal view-only mode so
  every memory's derived/reused ontology is visible when viewing a memory
- Add derivedFrom field to GLOSSARY_TERM_DEFAULT_FIELDS and METRIC_DEFAULT_FIELDS
- Add "Derived from memory" link in GlossaryTermsV1 and MetricDetails when
  derivedFrom is present on the entity
- Add i18n keys: label.derived-from-memory, label.derived-ontology,
  label.reused, message.no-derived-ontology (synced to 19 locales)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(ai): seed/serve aiSettings correctly so GET /system/settings/aiSettings returns defaults

CollectionDAO$SettingsRowMapper.getSettings lacked an AI_SETTINGS case in its
switch, causing every getConfigWithKey("aiSettings") call to throw
IllegalArgumentException (swallowed, returning null) → HTTP 204 on every GET.
Added the missing case so aiSettings rows deserialise to AISettings.

Also fixed resetSettingToDefault for AI_SETTINGS: it was returning defaults
without persisting them to the DB (unlike the equivalent searchSettings reset
path which calls systemRepository.createOrUpdate). Now it persists first.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(ai): seed AUTOMATION provider via createInternal and set id/updatedAt so cascade, adopt-on-touch, and projection ITs all pass

The test helper createAutomationTerm used create-via-REST + termRepo.update to
set provider=AUTOMATION, but the EntityRepository update framework only writes to
DB when entityChanged=true; provider is not tracked by recordChange so the write
was silently no-op'd. Term stayed provider=null in the DB, causing:
- Scenario A timeout: isAutomationOwned read null, cascade skipped the term
- Scenario B: assertEquals(USER, null) failed at assertion after human PATCH
- Scenario C: cascade skipped term, glossary.delete failed with "glossary is not empty"

Fix: replace the two-step seed with termRepo.createInternal() directly, mirroring
OntologyReconciler.createTerm exactly. Also fixed the reconciler itself: createTerm,
createMetric, and resolveOrMintGlossary all called createInternal without setting id
or updatedAt; PostgreSQL GENERATED columns extract both from the stored JSON with
NOT NULL constraints, so omitting them would cause constraint failures in production.

Result: OntologyAgentIT 3/3 GREEN; OntologyReconcilerTest 20/20 + OntologyOwnershipTest
7/7 no regression.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(ai): wire AISettings reset-to-default, strengthen util test, import @transaction

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(ai): document AISettings + Ontology Agent (§19) in the company-context spec

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Update generated TypeScript types

* fix(ai): use design tokens and carry source-memory identity in provenance links

Replace palette classes (text-gray-500/400/900, text-brand-600) with semantic
tokens (text-tertiary, text-brand-secondary). Link both provenance anchors to
ROUTES.CONTEXT_CENTER_MEMORIES?memory=<name> so the memories-list auto-opens
the correct memory's view modal on arrival.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(ai): gate per-axis derive toggles, validate LLM entity names, narrow catches

- OntologyReconciler: add 5-arg reconcile() with explicit deriveTerms/deriveMetrics
  flags; reconcileAxis() gates BOTH apply and retire per flag, preventing the
  mass-retire bug in the reviewer's suggested null-implied approach
- OntologyReconciler: isValidName() guards CREATE paths for term, metric, glossary
  mint: null/blank names or names with FQN-reserved chars (. " /) become a logged
  SKIP instead of a thrown exception (Fix #2+#6)
- OntologyProcessingEngine: read AISettings once in derive(), compute axis flags via
  deriveTermsEnabled/deriveMetricsEnabled helpers, pass to 5-arg reconcile() — no
  more settings coupling inside reconciler
- Narrow Exception catches: AISettingsUtil → RuntimeException; OntologyGrounding →
  IOException|RuntimeException; OntologyProcessingEngine#runScheduled → RuntimeException
  with explanatory comment (Fix #5)
- OntologyReconcilerTest: 4 new tests covering null/invalid-name SKIP behavior
  (27 total, 0 failures)
- OntologyAgentIT Scenario E: fully deterministic axis-toggle coverage — seeds
  AUTOMATION-owned metric, calls 5-arg reconcile with deriveMetrics=false, asserts
  0 metrics created + 0 retired + owned metric survives (4 tests, 0 failures)
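
The name guard from the list above might be sketched as follows. Only the rule itself (null/blank or containing `.`, `"` or `/` means SKIP) comes from the commit message; the class name and the `resolveAction` helper are hypothetical.

```java
// Hypothetical sketch of the LLM-name guard; helper names are illustrative.
final class NameGuardSketch {
  // Characters the commit message lists as FQN-reserved.
  private static final String RESERVED_CHARS = ".\"/";

  static boolean isValidName(String name) {
    if (name == null || name.trim().isEmpty()) {
      return false;
    }
    for (int i = 0; i < name.length(); i++) {
      if (RESERVED_CHARS.indexOf(name.charAt(i)) >= 0) {
        return false;
      }
    }
    return true;
  }

  /** Downgrades a CREATE to SKIP instead of letting an invalid name throw later. */
  static String resolveAction(String proposedName) {
    return isValidName(proposedName) ? "CREATE" : "SKIP";
  }
}
```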

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ai): stamp content hash on reconcile failure to prevent re-derive poison-pill loop

Wrap reconciler.reconcile() in reconcileSafely() which catches RuntimeException,
logs the error with memory id, and returns null so the derive() method always
reaches stampOntologyStats(). buildStats() is made null-safe (zero counts when
result is null). LLM/network stages (fetchCandidates, extractor.derive) remain
outside the guard so transient failures still propagate and retry. Adds
stampsHashEvenWhenReconcileThrows test to prove the loop is broken.
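
The control flow described above, where reconcile failures are swallowed so the hash is still stamped while extractor failures propagate, can be sketched like this. All types here are simplified stand-ins: the real derive() signature, result type, and stats persistence are not shown in this log.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical sketch of the poison-pill guard; types and signatures are assumed.
final class PoisonPillGuardSketch {
  static final class Result {
    final int created;

    Result(int created) {
      this.created = created;
    }
  }

  // Stand-in for stampOntologyStats: records which memory ids were stamped.
  final List<String> stamped = new ArrayList<>();

  int derive(String memoryId, Supplier<String> extractor, Function<String, Result> reconciler) {
    // LLM/network stage stays outside the guard: a failure here propagates,
    // nothing is stamped, and the memory is retried on the next trigger.
    String verdict = extractor.get();

    Result result = reconcileSafely(memoryId, verdict, reconciler);

    // Null-safe stats: a failed reconcile counts as zero created entities.
    int created = result == null ? 0 : result.created;
    stamped.add(memoryId); // always reached once the verdict exists
    return created;
  }

  private Result reconcileSafely(
      String memoryId, String verdict, Function<String, Result> reconciler) {
    try {
      return reconciler.apply(verdict);
    } catch (RuntimeException e) {
      System.err.println("Reconcile failed for memory " + memoryId + ": " + e.getMessage());
      return null;
    }
  }
}
```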

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(ai): fetch memories list with generic sourceEntity, not deprecated sourceFile

Page-sourced memories resolve their source via the generic sourceEntity ref;
the deprecated sourceFile alias only covers file-sourced ones.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(ai): reuse existing term/metric on FQN collision and cancel pending run on memory delete

P1: OntologyReconciler.createTerm/createMetric now precheck the target FQN via
findByFqn(repo, fqn) (NON_DELETED) before calling createInternal; on collision
they call reuseExisting() (same RELATED_TO edge + counts.reused++) instead of
throwing a unique-constraint violation. resolveOrMintGlossary also checks by
newGlossaryName before minting a duplicate glossary. Three new unit tests cover
the term-FQN-collision, metric-FQN-collision, and glossary-reuse-by-name paths.

P2: ContextMemoryRepository.softDeleteAdditionalChildren/hardDeleteAdditionalChildren
now call OntologyProcessingEngine.instance().cancel(memoryId) via a shared
cancelAndCascadeOntology helper before cascadeOntology, so any pending scheduled
derivation is cancelled when a memory is deleted, preventing spurious
EntityNotFoundException in runScheduled.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(ai): null-guard the LLM verdict list in OntologyExtractor.derive

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(ui): align AISettingsPage to codegen DeletionPolicy enum name

The TS-codegen bot regenerated aiSettings.ts exporting the enum as
DeletionPolicy (from the deletionPolicy schema key), not AIDeletionPolicy.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(ai): centralize MCP Server and MCP Chat into AI Settings

MCP Server and MCP Chat are no longer internal Applications; they are managed
as platform settings.

- MCP Chat: new aiSettings.mcpChat (enabled + systemPrompt) drives a runtime
  McpChatServiceHolder, re-initialized on AI settings save/reset so chat
  toggles without a restart; McpClientResource reads the holder.
- MCP Server: registerMCPServer gates on mcpConfiguration.enabled (seeded by
  default) instead of an installed app; configurable via /system/mcp/config.
- McpApplicationBot seeded as a system bot with impersonation; McpApplication
  and McpChatApplication entities, marketplace defs and the mcpChatAppConfig
  schema removed.
- MCP usage telemetry re-anchored to a constant identity (read history kept).
- AI Settings page adds MCP Chat and MCP Server sections; chat sidebar gated
  on the setting.
- 2.0.0 migration carries prior app config/enablement into settings, then
  retires the apps (keeping the bots).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Update generated TypeScript types

* refactor(ai): rename Ontology Agent to Memory Agent

Rename the agent's identity from "Ontology Agent" to "Memory Agent"
across all layers: the drive/ontology Java package and its Ontology*
pipeline/DTO classes, OntologyOwnership, the ontology-bot principal,
the aiSettings ontologyAgent config key, the ContextMemory
ontologyStats/OntologyProcessingStatus fields, the OntologyStatusBadge
UI component, and the ontology-agent i18n labels (synced across all
locales). Java models and generated TS types are regenerated from the
renamed schemas.

The unrelated Ontology Explorer (RDF glossary graph) feature and the
output-concept "Derived Ontology" provenance panels are intentionally
left unchanged, since they name the derived term/metric graph rather
than the agent.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Update generated TypeScript types

* fix(ai): show MCP Chat in sidebar via setting + restrict MCP Server config

- MCP Chat sidebar: add authenticated GET /mcp-client/enabled and gate the
  McpChatPlugin sidebar entry on aiSettings.mcpChat.enabled. It was tied to the
  removed app, so it never appeared after enabling chat in AI Settings.
- MCP Server settings: expose only the enable toggle and Origin Header URI; the
  endpoint path is fixed at /api/v1/mcp and no longer editable; drop the origin
  validation and allowed origins fields.
- EnumBackwardCompatibilityTest: account for the appended DERIVED_FROM
  relationship (count 26 -> 27, new last ordinal).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(ai): drop MCP Chat config migration; derive MCP usage appId from name

- The MCP Chat app was never shipped to customers, so the 2.0.0 migration no
  longer carries its config into aiSettings.mcpChat — the seeded default shape
  is kept. Server enable-alignment and dead-app cleanup remain.
- McpToolCallUsage.appId: no MCP-usage query reads it (recorder writes, resource
  reads by appName), but apps_extension_time_series.appId is a NOT NULL generated
  column, so a value is required. Derive it deterministically from MCP_APP_NAME
  instead of a hardcoded UUID.
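
One plausible way to derive a stable id from a name, as the last bullet describes, is a name-based UUID. This is only a sketch: the commit does not say which derivation is used, and the `MCP_APP_NAME` value below is a placeholder.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Hypothetical sketch: the actual derivation and constant value are not shown in the log.
final class McpAppIdSketch {
  // Placeholder value, assumed for illustration.
  static final String MCP_APP_NAME = "McpApplication";

  /** Same name in, same UUID out: a type-3 (MD5, name-based) UUID from the JDK. */
  static UUID deriveAppId(String appName) {
    return UUID.nameUUIDFromBytes(appName.getBytes(StandardCharsets.UTF_8));
  }
}
```

A name-based UUID satisfies the NOT NULL column without a magic constant, and every node computes the same value without coordination.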

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* format

* fix(ai): commit missing MemoryRelation DTO blocked by .gitignore

The bare `memory` pattern in .gitignore (a claude-flow tooling entry)
also matched the drive/memory Java package, so the newly added
MemoryRelation.java was silently ignored and never committed — breaking
CI with "cannot find symbol MemoryRelation" in MemoryVerdict and
MemoryReconciler. Scope the ignore to root-level /memory/ and add the file.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(ai): make file-extraction LLM gate fully injectable (fix unit tests)

ContextFileProcessingService gated knowledge-pill extraction on BOTH the
injected llmEnabledSupplier AND a direct static
AISettingsUtil.isFileExtractionEnabled(AISettingsUtil.get()). The static
read needs a live SettingsCache, so ContextFileProcessingServiceTest
(which injects the gate as () -> true) could not satisfy it: extraction
was skipped and 3 tests failed (wrong repository.update counts,
runExtraction never invoked).

Fold the AISettings check into the production default supplier and route
both call sites through one shouldExtractContext(...) helper, so the
status machine is unit-testable and the repeated compound condition lives
in one place. Production gate (LLM enabled AND file-extraction toggle) is
unchanged.
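
The refactor above, folding the static settings check into the injected supplier, might look roughly like this. The constructor, the `production` factory, and the `shouldExtractContext` parameter are assumptions; only the idea of a single injectable gate comes from the commit message.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch of a fully injectable extraction gate.
final class ExtractionGateSketch {
  private final BooleanSupplier extractionEnabled;

  /** Tests inject the gate directly, for example {@code () -> true}. */
  ExtractionGateSketch(BooleanSupplier extractionEnabled) {
    this.extractionEnabled = extractionEnabled;
  }

  /**
   * Production default: the compound condition (LLM enabled AND file-extraction toggle)
   * lives inside the supplier, so no call site reads static settings directly.
   */
  static ExtractionGateSketch production(
      BooleanSupplier llmClientEnabled, BooleanSupplier fileExtractionToggle) {
    return new ExtractionGateSketch(
        () -> llmClientEnabled.getAsBoolean() && fileExtractionToggle.getAsBoolean());
  }

  /** Single helper that every call site routes through. */
  boolean shouldExtractContext(boolean hasExtractedText) {
    return hasExtractedText && extractionEnabled.getAsBoolean();
  }
}
```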

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(ui): add missing label.general i18n key for AI Settings

AISettingsPage referenced t('label.general') but the key was absent
from en-us.json. Added it and synced all locale files via yarn i18n.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(ui): repair AISettingsPage and CreateMemoryModal jest mocks

AISettingsPage imports Input from ui-core-components but the test mock
omitted it, so the component rendered an undefined element ("Element type
is invalid"). Add the Input mock plus the missing settingConfigAPI mocks
(getMcpConfiguration/restoreSettingsConfig/updateMcpConfiguration) the
component calls.

CreateMemoryModal's partial DateTimeUtils mock dropped
getEpochMillisForPastDays, which profiler.constant.ts invokes at module
load via a deep import chain, failing the suite at import. Spread
requireActual to preserve the real exports, and stub DerivedOntologyCard
to cut the heavy transitive chain (EntityUtilClassBase -> DataProductsPage
-> ConnectionStepCard) that also pulled in unmocked antd internals.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* build(deps): bump undici from 6.25.0 to 6.27.0 and form-data to 4.0.5 in /openmetadata-ui/src/main/resources/ui (open-metadata#29241)

* build(deps): bump undici in /openmetadata-ui/src/main/resources/ui

Bumps [undici](https://github.com/nodejs/undici) from 6.25.0 to 6.27.0.
- [Release notes](https://github.com/nodejs/undici/releases)
- [Commits](nodejs/undici@v6.25.0...v6.27.0)

---
updated-dependencies:
- dependency-name: undici
  dependency-version: 6.27.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

* update yarn

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Harsh Vador <harsh.vador@somaiya.edu>

* change index from all to dataAsset for my data (open-metadata#29209)

* change index from all to dataAsset for my data

* fix jest tests

* fix playwrights

* fix playwrights

---------

Co-authored-by: Shrabanti Paul <shrabantipaul@Shrabantis-MacBook-Pro.local>

* fix(fqn): support double-quotes in fully qualified names + guard/repair corrupt FQNs (open-metadata#28697)

* fix(fqn): support double-quotes in fully qualified names + guard/repair corrupt FQNs

Names containing a double-quote could not be represented in an FQN: the Fqn
grammar had no escape mechanism, yet quoteName() backslash-escaped the quote and
stored an unparseable segment. Building the FQN is a pure string op, so such
values were written successfully (insert hashes only the entity's own FQN); they
then detonated later with a 500 (ParseCancellationException) the first time a
nested FQN was hashed (e.g. a tags read), and were painful to migrate.

Three layered fixes:

- Grammar + quoteName: NAME_WITH_RESERVED now allows any character with '"'
  escaped by doubling it (""). quoteName/unquoteName encode/decode accordingly
  and are idempotent. Names without a quote encode identically to before, so
  existing FQNs and their hashes are unchanged (no reindex/migration needed).

- Ingest guard: FullyQualifiedName.validateFqnName() asserts a name round-trips
  through encode->parse->decode, wired into every nested-FQN setter (columns,
  pipeline tasks, topic/searchIndex/apiEndpoint fields, mlFeatures). A name that
  cannot be hashed is now rejected at ingest with a clear 400 instead of being
  stored to fail later.

- Heal-on-read: FullyQualifiedName.isValid() detects legacy-corrupt FQNs;
  PipelineRepository repairs unparseable task FQNs on the fly by re-deriving them
  from the task name, so existing poisoned data reads cleanly (200) without a
  migration. The repair is in-memory and persists on the next update.
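
The quote-doubling scheme in the first fix can be illustrated with a minimal sketch. `FqnQuoteSketch`, `quote`, and `unquote` are hypothetical names, and the rule for when a segment needs quoting (it contains `.` or `"`) is an assumption; this is not the project's quoteName/unquoteName.

```java
final class FqnQuoteSketch {
  private FqnQuoteSketch() {}

  // Assumption: a segment is quoted only when it contains the separator or a quote.
  public static String quote(String name) {
    if (name.indexOf('.') < 0 && name.indexOf('"') < 0) {
      return name; // names without reserved characters encode unchanged
    }
    // Wrap in quotes and escape each inner quote by doubling it.
    return "\"" + name.replace("\"", "\"\"") + "\"";
  }

  public static String unquote(String segment) {
    boolean quoted =
        segment.length() >= 2 && segment.startsWith("\"") && segment.endsWith("\"");
    if (!quoted) {
      return segment;
    }
    String inner = segment.substring(1, segment.length() - 1);
    return inner.replace("\"\"", "\""); // collapse doubled quotes back to one
  }
}
```

A write-time guard in the spirit of validateFqnName would then check that `unquote(quote(name))` equals `name` and that `name` is non-empty before accepting it.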

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(fqn): repair corrupt pipeline task FQNs via migration, not heal-on-read

Heal-on-read (PipelineRepository.repairTaskFqns) ran a full ANTLR parse for
every task on every pipeline read to subsidize a finite set of already-corrupt
rows, was incomplete (the bulk/LIST/search path still 500'd), and could NPE on
a null task FQN. Replace it with a one-time migration so the corruption leaves
the stored data and reads pay no per-request cost.

- Remove repairTaskFqns and its setFields() call; keep the validateFqnName
  write-path guard that rejects un-representable names at ingest (400).
- Add migration v11211 (mysql + postgres): re-derive task FQNs where !isValid,
  persist only when changed.
- Harden FullyQualifiedName.isValid to treat null/empty as invalid (no NPE).
- Require >=1 char inside a quoted FQN segment (the grammar uses + rather than *),
  rejecting empty quoted segments ("").

FullyQualifiedNameTest: 17/17.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(fqn): reject empty nested-object names at write time

validateFqnName returned early when quoteName(name) was unchanged, letting
empty names through (quoteName("") == ""). An empty pipeline task name (the
schema sets no minLength on task.name) then produced an unhashable empty FQN
segment ("parent.") that 500'd on the next FQN hash -- the same failure class
as unrepresentable names. Treat null/empty as invalid so every nested-FQN
setter (columns, tasks, fields, mlFeatures) rejects them up front with a 400.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(fqn): make pipeline task-FQN migration batched and observable

Address review feedback on the one-time repair migration:

- Performance: scan pipelines in pages of 1000 via listAfterWithOffset instead
  of selecting every id and calling findEntityById per pipeline, dropping the
  N+1 round-trips and the full id list held in memory. Only changed rows are
  written.
- Observability: track scanned/repaired/failed counts and log a prominent WARN
  with up to 100 pipeline ids that could not be repaired, instead of swallowing
  each failure as a lone WARN, so operators get a concrete remediation list.
- Search: document (completion log + schemaChanges) that repaired task FQNs are
  reflected in the search index after the standard post-upgrade reindex, matching
  existing FQN-fix migration behavior.
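
The paging and counting pattern described above could look roughly like the sketch below. `PageSource`, `Summary`, and `run` are hypothetical stand-ins rather than the MigrationUtil API; the sketch only assumes a limit/offset listing call and a per-row repair step that returns true when a row was changed and persisted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class PagedRepairSketch {
  private PagedRepairSketch() {}

  /** Hypothetical stand-in for a DAO call that lists rows by limit and offset. */
  interface PageSource {
    List<String> list(int limit, int offset);
  }

  static final class Summary {
    public int scanned;
    public int repaired;
    public final List<String> failed = new ArrayList<>();
  }

  // repairRow returns true only when the row changed and was persisted; it may throw.
  public static Summary run(PageSource source, int pageSize, Predicate<String> repairRow) {
    Summary summary = new Summary();
    int offset = 0;
    while (true) {
      List<String> page = source.list(pageSize, offset);
      if (page.isEmpty()) {
        break; // an empty page marks the end of the scan
      }
      for (String row : page) {
        summary.scanned++;
        try {
          if (repairRow.test(row)) {
            summary.repaired++;
          }
        } catch (RuntimeException e) {
          summary.failed.add(row); // a single bad row must not abort the run
        }
      }
      offset += pageSize;
    }
    return summary;
  }
}
```

Only one page is held in memory at a time, and a row counts as repaired only if the repair step returned normally, which keeps failed rows out of the success count.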

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(fqn): cover pipeline task-FQN repair migration

Add MigrationUtilTest for the v11211 repairPipelineTaskFqns migration:
repair correctness (re-derive unparseable/null task FQNs, leave valid ones
untouched, skip task-less pipelines) and migration-path resilience -- a single
unreadable row or a failing update must not abort the upgrade.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(fqn): cover repair-migration pagination across pages

The existing repair-migration tests stubbed listAfterWithOffset for any
offset and returned data only at offset 0, so every case exercised a
single page. Add a test that stubs distinct pages by offset
(0 -> page 1, 1000 -> page 2, 2000 -> empty) and asserts the second page
is scanned and repaired, locking in correct limit/offset ordering and
offset advancement in repairPipelineTaskFqns.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* chore(migration): retarget pipeline task-FQN repair to 1.13.1

Move the pipeline task-FQN repair migration from 1.12.11 (package v11211)
to 1.13.1 (package v1131): the native SQL placeholder dir, the mysql and
postgres Migration handlers, the MigrationUtil, and its test. The
framework derives the handler package from the version dir via
MigrationFile.getVersionPackageName(), so 1.13.1 -> v1131; no logic
changes. Pure data migration, no DDL.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(fqn): correct repair-migration summary counts; unscope 1.13.1 DDL

Addresses code review on the 1.13.1 pipeline task-FQN repair migration.

1. Failed-to-persist pipelines were double-counted in the summary log.
   repairPipeline kept taskCount > 0 when pipelineDAO.update threw, so a row
   that never persisted was reported as both "re-derived N task FQNs" and
   "could not fix N pipeline(s)", overstating success. Reset taskCount to 0
   on the persistence-failure path so only rows that actually persisted count
   as repaired. repairPipelineTaskFqns now returns a RepairSummary so the
   counts are asserted directly (doesNotCountFailedPersistAsRepaired fails
   without the fix). The Migration handlers ignore the return value.

2. Revert bootstrap/sql/migrations/native/1.13.1/{mysql,postgres}/
   schemaChanges.sql to match main exactly. The intake_form_entity DDL there
   belongs to main (consumed by IntakeFormDAO) and arrived via the main
   merge, not this PR; only a local comment was added on top. Dropping that
   comment makes this PR's net change to those files zero and removes the
   "data migration only" text that contradicted the DDL. The FQN repair runs
   via the Java v1131 Migration handler and needs no SQL.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Add migration for various child entities

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Sriharsha Chintalapani <harsha@getcollate.io>
Co-authored-by: Sriharsha Chintalapani <harshach@users.noreply.github.com>

* fix(mcp): OAuth login fails with 400 when SSO returns id_token in URL fragment (open-metadata#29228)

* fix(mcp): handle active-session shortcut and implicit-flow fragment at /mcp/callback

Root cause: when a user has an active Google/Azure SSO session,
AuthenticationCodeFlowHandler.handleLogin() bypasses pac4j entirely and
calls sendRedirectWithToken() directly, committing the response as
302 /mcp/callback#id_token=... (token in URL fragment, no pac4j state).

This caused three failures:
1. UserSSOOAuthProvider.handleSSOAuthorization() found no pac4j state in
   session (expected — pac4j was never invoked) and threw AuthorizeException.
2. AuthorizationHandler.exceptionally() turned the exception into an error
   redirect URL; handleAuthorizeRequest() then called sendRedirect() on an
   already-committed response → IllegalStateException: Committed at line 503.
3. McpCallbackServlet received the callback with #id_token=... in the URL
   fragment (browser-only, server never sees it), so both pac4jState and
   idTokenParam were null → 400 'missing state'.

Fixes:
- UserSSOOAuthProvider: check response.isCommitted() before throwing;
  return SSO_REDIRECT_INITIATED for the active-session path.
- OAuthHttpStatelessServerTransportProvider: guard sendRedirect() with
  response.isCommitted() check to prevent Committed exception.
- McpCallbackServlet: serve a JS fragment-extraction page instead of
  returning 400 — JS reads window.location.hash, extracts id_token,
  retries /mcp/callback?id_token=... so handleDirectIdTokenFlow() runs.

Also adds debug logging throughout the MCP auth flow for easier
diagnosis of future SSO/OAuth callback issues.

* fix(mcp): address review — POST token from fragment, fix log field, import Collections

- serveFragmentExtractionPage: switch from GET redirect to form POST so
  the id_token never appears in a URL, browser history, or access logs
  (RFC 6819 §5.3.5). Add doPost() that reads id_token from the body and
  delegates to handleDirectIdTokenFlow().
- McpCallbackServlet debug log: rename hasFragment→refererPresent with a
  boolean so the field is meaningful (server cannot observe the fragment).
- UserSSOOAuthProvider: replace java.util.Collections FQN with import +
  simple name per project Java standards.

* Update openmetadata-mcp/src/main/java/org/openmetadata/mcp/server/auth/handlers/McpCallbackServlet.java

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

* fix(mcp): address review — if/else in doPost, textContent XSS fix, unit tests

- doPost: replace early-return guard clauses with if/else per project standard
  (one return statement per method)
- serveFragmentExtractionPage: replace innerHTML with textContent in both
  error branches to eliminate the JS XSS antipattern (greptile finding)
- McpCallbackServlet: package-private constructor for test injection;
  resolveSsoHandler() promoted to protected for subclass override in tests
- Add McpCallbackServletTest: 7 tests covering serveFragmentExtractionPage
  (content-type, form POST shape, textContent usage) and doPost edge cases
  (null handler→503, null/empty id_token→400)
- Add OAuthHttpStatelessServerTransportProviderTest: 4 tests covering
  sanitizeRedirectUrlForLogging (with/without query, null) and the
  committed-response guard

* fix(mcp): CSRF protection on doPost; remove vacuous guard test; real CSRF tests

Security (P1): doPost accepted cross-origin form submissions without any
CSRF check. A malicious site holding a valid id_token for a different user
could craft a form targeting /mcp/callback and hijack a victim's pending
MCP auth session (victim's Claude Desktop authenticates as the attacker).

Fix: add isOriginAllowed() — rejects any POST whose Origin header does not
match the server's own base URL (resolved from MCP config or system
settings). Absent Origin (same-origin browsers may omit for non-CORS
requests) is treated as allowed. Package-private for testability.
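
A minimal sketch of such an Origin check follows, assuming the server's base URL is already available as a string. `OriginCheckSketch` and `normalize` are hypothetical names, and the default-port handling is an illustration only; the real isOriginAllowed() may differ.

```java
import java.net.URI;
import java.util.Locale;

final class OriginCheckSketch {
  private OriginCheckSketch() {}

  // Reduces a URL to scheme://host:port, filling in the default port for http/https.
  // Returns null when the value cannot be interpreted, so callers can fail closed.
  static String normalize(String url) {
    if (url == null) {
      return null;
    }
    try {
      URI uri = URI.create(url.trim());
      if (uri.getScheme() == null || uri.getHost() == null) {
        return null;
      }
      String scheme = uri.getScheme().toLowerCase(Locale.ROOT);
      int port = uri.getPort();
      if (port == -1) {
        if (scheme.equals("https")) {
          port = 443;
        } else if (scheme.equals("http")) {
          port = 80;
        } else {
          return null;
        }
      }
      return scheme + "://" + uri.getHost().toLowerCase(Locale.ROOT) + ":" + port;
    } catch (IllegalArgumentException e) {
      return null;
    }
  }

  static boolean isOriginAllowed(String originHeader, String serverBaseUrl) {
    if (originHeader == null) {
      return true; // absent Origin is treated as allowed, as described above
    }
    String expected = normalize(serverBaseUrl);
    String actual = normalize(originHeader);
    return expected != null && expected.equals(actual);
  }
}
```

In this sketch an unresolvable server URL or an unparseable Origin value (including the literal string `null` sent for opaque origins) results in rejection.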

Test quality: remove the vacuous handleAuthorizeRequest guard test that
never called the method under test and trivially passed, leaving a comment
noting the guard is covered at integration level. Add 4 real CSRF tests:
Origin absent, matching, mismatched, and full doPost 403 path verification.

* fix(mcp): CSRF default-port normalization and reject-on-unknown-origin

* refactor(mcp): string constants + lazy-cache server origin

- Extract all sendError message strings to package-visible static final
  constants (ERR_SSO_UNAVAILABLE, ERR_CSRF_ORIGIN_MISMATCH,
  ERR_MISSING_ID_TOKEN, ERR_CALLBACK_FAILED, ERR_MISSING_STATE,
  ERR_STATE_NOT_FOUND). Eliminates magic strings, gives callers a stable
  contract, catches typos at compile time.
- Add cachedServerOrigin volatile field + getServerOrigin() lazy-init
  helper: resolveServerOrigin() is now called at most once per server
  lifetime instead of on every POST. Non-null results are cached;
  null results (transient DB miss at startup) are not cached so the
  next request retries — fail-secure without permanently breaking CSRF.
- Update tests to reference constants instead of duplicating literals.
- Add two cache-behaviour tests: verifies resolveServerOrigin() is
  called exactly once after a successful resolution, and called again
  on each request when it returns null (no caching of null).
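
The cache-non-null-only behaviour can be sketched as below. `LazyOriginCacheSketch` is a hypothetical class; the resolver is passed in as a `Supplier` purely for illustration and stands in for resolveServerOrigin().

```java
import java.util.function.Supplier;

final class LazyOriginCacheSketch {
  private final Supplier<String> resolver;
  private volatile String cached; // null means "not resolved yet"

  LazyOriginCacheSketch(Supplier<String> resolver) {
    this.resolver = resolver;
  }

  public String get() {
    String value = cached;
    if (value == null) {
      value = resolver.get();
      if (value != null) {
        cached = value; // only successful resolutions are remembered
      }
    }
    return value; // may be null: the next call will try the resolver again
  }
}
```

Under concurrent first access the resolver may run more than once in this sketch; that is harmless as long as every successful resolution yields the same value.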

---------

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

* Load learning drawer chunk without a page-loader flash (open-metadata#29280)

LearningIcon always mounts the lazy LearningDrawer, so its Suspense
fallback rendered the centered page Loader inline next to the icon on
mount, before any user interaction.

Make the fallback configurable on withSuspenseFallback (still defaults to
the existing Loader, so all current callers are unchanged) and have
LearningIcon opt out with null. The chunk now loads silently while the
drawer stays mounted, preserving the close animation and the in-drawer
resource player that relies on the drawer not unmounting.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* Fixes open-metadata#27945: surface exact/prefix matches first in QuickFilter aggregation (open-metadata#29231)

* fix(search): surface exact/prefix matches first in QuickFilter aggregation

Fixes open-metadata#27945

The /search/aggregate endpoint used a single terms aggregation with
include: ".*term.*", ordered alphabetically with a fixed size=10. When
more than 10 values matched the pattern, exact matches (e.g. "name")
were silently dropped in favour of alphabetically-earlier contains
matches (e.g. "first_name", "display_name").

Replace the single agg with three targeted sub-aggregations sent in one
ES/OS round-trip:
  • __exact   – include: "term"     (size 1, O(1) dict lookup)
  • __prefix  – include: "term.*"   (size N, B-tree prefix scan)
  • __contains – include: ".*term.*" (size N, full wildcard, unchanged)

The backend merges the three bucket lists in priority order
(exact → prefix → contains), deduplicates, trims to the requested size,
and rewrites the response under the original sterms#field key so the
frontend requires no changes.
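
The priority merge can be illustrated on bucket keys alone. This is a simplified sketch with a hypothetical `BestMatchMergeSketch.merge` helper; the actual mergeBestMatchAggregations works on aggregation JSON and also carries doc counts.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class BestMatchMergeSketch {
  private BestMatchMergeSketch() {}

  // Merges bucket keys in priority order (exact, prefix, contains), drops duplicates
  // while keeping the first occurrence, and trims to the requested size.
  public static List<String> merge(
      List<String> exact, List<String> prefix, List<String> contains, int size) {
    Set<String> ordered = new LinkedHashSet<>();
    ordered.addAll(exact);
    ordered.addAll(prefix);
    ordered.addAll(contains);
    List<String> merged = new ArrayList<>(ordered);
    return new ArrayList<>(merged.subList(0, Math.min(size, merged.size())));
  }
}
```

With exact `["name"]`, prefix `["name", "name_full"]`, and contains `["display_name", "first_name", "name", "name_full"]`, a size of 3 keeps `name` first even though it sorts after `display_name` alphabetically.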

Add SearchUtils helpers: isBestMatchSearchPattern, extractRawSearchValue,
exactAggKey/prefixAggKey/containsAggKey, mergeBestMatchAggregations.
Cover all helpers and merge edge cases in SearchUtilsTest (12 new tests).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(search): address PR review — safe fallback, dot-escaping, null guard

- mergeBestMatchAggregations: on merge failure, degrade to renaming
  the __contains sub-agg to sterms#<field> so the UI always receives
  the key it expects; double-catches so even fallback parse failures
  are silent
- buildBestMatchAggregations (ES + OS): escape '.' in rawValue before
  using as Lucene regexp include for exact and prefix sub-aggs, preventing
  field names like 'user.id' from being treated as wildcards and preventing
  unbalanced-regexp 500s
- isBestMatchSearchPattern: guard against null input
- SearchUtilsTest: fix two broken parametrized cases (.* → expected '',
  remove null-passing CSV row), add dedicated null test, add fallback-path
  and dot-escaping tests
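
As an illustration of the dot-escaping point, the include patterns for the exact and prefix sub-aggregations might be built as follows. `RegexIncludeSketch` and its methods are hypothetical, and only `.` is escaped here because that is the case the commit names; other regexp metacharacters are out of scope for this sketch.

```java
final class RegexIncludeSketch {
  private RegexIncludeSketch() {}

  // Escapes literal dots so "user.id" does not match "userXid".
  static String escapeDots(String raw) {
    return raw.replace(".", "\\.");
  }

  static String exactInclude(String raw) {
    return escapeDots(raw);
  }

  // The trailing ".*" is the wildcard itself and is appended after escaping.
  static String prefixInclude(String raw) {
    return escapeDots(raw) + ".*";
  }
}
```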

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(search): add missing ObjectNode import in SearchUtilsTest

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(search): fix dot-escaping test expectations — trailing .* wildcard must not be escaped

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(search): cap oversized dataModel column tree at index time (open-metadata#29212)

* fix(search): cap oversized dataModel column tree at index time

A container/table with a pathologically large dataModel (a wide/nested schema with hundreds of thousands of columns) produces a multi-hundred-MB search document. The existing oversized-doc guard strips lineage but then ships the doc anyway, so the server stores ~196MB, parses it whole on every read and reindex, and can OOM.

Extend stripDocMapIfOversized (live index) and stripLineageForSize (bulk reindex) to also strip the nested column children and the derived columnNames/columnNamesFuzzy when the doc is still over the cap after lineage stripping. Top-level columns are kept, so column search and the column grid still work; the full schema stays available via the entity API. Gated by size, so normal entities are untouched. The nested children are mapped enabled:false (stored, not indexed), so nothing searchable is lost.

This bounds the indexed document at the source, so the server never holds the giant doc on read or reindex — complementing the read-side response streaming.

* perf(search): compute oversized-doc size once per mutation in strip path

Thread the serialized byte size through a local in stripDocMapIfOversized and stripLineageForSize instead of re-serializing the full document for each size gate and log line. On the oversized path this avoids redundant ~hundreds-of-MB pojoToJson/getBytes allocations exactly when memory and CPU are most constrained. Addresses review feedback on open-metadata#29212.

* refactor(search): guard upstreamLineage strip and reuse post-strip size in logs

Guard the upstreamLineage removal in stripLineageForSize with a null check to match stripDocMapIfOversized, avoiding a wasted full-doc serialization and a misleading WARN when the field is absent on an oversized doc. Store the post-strip serialized size in a local for the column-strip log lines instead of recomputing inline. Addresses review feedback on open-metadata#29212.

* fix(ui): exclude dataModel from Explore/suggestion search payloads

Complements the index-time column-tree strip (open-metadata#29212): exclude dataModel from Explore and search-suggestion payloads, lazy-fetch it in the container summary panel via getContainerByFQN when absent, and make the service-insights asset-count query aggregation-only (pageSize 0, fetchSource false). Ported from open-metadata#29200 so the index-side and UI-side fixes ship together. Relates to open-metadata#29210.

* fix(search): address PR review — strip docs/logs + summary-panel error handling

Backend: document the column-tree strip in stripLineageForSize JavaDoc and include columnNamesFuzzy in both oversized-doc WARN logs. UI: in the container summary panel's on-demand dataModel fetch, reset previously-fetched columns when the container changes (so the prior container's schema isn't shown) and surface fetch failures via showErrorToast instead of silently rendering 'No data'. Addresses review comments on open-metadata#29212.

* fix(ui): clear stuck loader on container switch + test on-demand dataModel fetch

Address follow-up review on the summary-panel lazy-fetch: clear isColumnsLoading in the effect's early-return so a now-cancelled in-flight getContainerByFQN can't leave the loader stuck on. Add tests covering the on-demand fetch (fires only when columns are absent from the search hit, not fetched when inline, and surfaces a toast on failure). Addresses review comments on open-metadata#29212.

* fix(ui): prevent No-data flash on summary-panel mount + fix import order

Lazily initialize isColumnsLoading (loading when columns must be fetched on demand) so the container summary panel shows the loader on first render instead of briefly flashing 'No data available' (greptile P1). Also reorder imports (TablePureUtils before ToastUtils, drop stray blank) to fix UI checkstyle. Addresses review on open-metadata#29212.

* fix(ui): align glossary term Related Terms section inside the left panel (open-metadata#29284)

* fix(ui): align glossary term Related Terms section inside the left panel

* added unit test

* fix lint checks

* fix(ui): prevent ontology relations graph from crashing on large glossaries (open-metadata#29270)

* fix(ui): prevent ontology relations graph from crashing on large glossaries

* nit

* fix(playwright): stop SSORenewal nightly flake from too-short token TTL (open-metadata#29268)

The SSO Session Renewal suite swaps the server to a short SAML JWT TTL before
logging in. At 10s, on a loaded CI runner the initial app bootstrap
(loggedInUser, config, permissions) outran that window, so the first
/permissions fetch 401'd mid-load; the silent refresh succeeded but the
bootstrap request was not retried, leaving the app wedged on the global loading
spinner. dropdown-profile never rendered and the renewal tests timed out,
exhausting all retries on the 2026-06-22 nightly (run 27929661599).

- Raise SHORT_ACCESS_TTL_SECONDS 10 -> 30 so the token outlives bootstrap. 30s
  stays under EXPIRY_THRESHOLD_MILLES (60s), so the proactive-renewal timer
  still fires immediately and the refresh-on-expiry behavior under test is
  unchanged.
- Wait for dropdown-profile at the end of loginViaSaml so login is only "done"
  once the app shell has rendered, making any future bootstrap hang fail in
  beforeAll with a clear cause instead of mid-test.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(exasol): flush statistics before polling Exasol audit tables in integration tests (open-metadata#29278)

* test(ui): update MCP Playwright specs for settings-driven MCP

- AISettings.spec: MCP Server section now exposes only Enable + Origin Header
  URI (path/originValidation removed), so assert/fill the origin header instead
  of the removed mcp-server-path field.
- McpChat.spec: enable MCP Chat via aiSettings.mcpChat.enabled (PUT
  /system/settings) and reset it afterwards, instead of installing the retired
  McpChatApplication app (which now 404s).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Harsh Vador <harsh.vador@somaiya.edu>
Co-authored-by: shrabantipaul-collate <shrabanti.paul@getcollate.io>
Co-authored-by: Shrabanti Paul <shrabantipaul@Shrabantis-MacBook-Pro.local>
Co-authored-by: Mohit Yadav <105265192+mohityadav766@users.noreply.github.com>
Co-authored-by: Sriharsha Chintalapani <harsha@getcollate.io>
Co-authored-by: Sriharsha Chintalapani <harshach@users.noreply.github.com>
Co-authored-by: Vishnu Jain <121681876+Vishnuujain@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: Sid <30566406+siddhant1@users.noreply.github.com>
Co-authored-by: Chirag Madlani <12962843+chirag-madlani@users.noreply.github.com>
Co-authored-by: sonika-shah <58761340+sonika-shah@users.noreply.github.com>
Co-authored-by: Anujkumar Yadav <anujf0510@gmail.com>
Co-authored-by: harshsoni2024 <64592571+harshsoni2024@users.noreply.github.com>