Fixes #26729: Add DB fallback and TTL refresh to TypeRegistry for multi-pod cache consistency #26730
Hi there 👋 Thanks for your contribution! The OpenMetadata team will review the PR shortly! Once it has been labeled as Let us know if you need any help!
…istry for multi-pod cache consistency Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
b69ecfc to 6bd506f
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Updated the PR description to reflect the 15-minute interval. Custom Property creation via the API probably isn't frequent enough to warrant refreshing the cache every 2 minutes in a multi-pod setup. With the DB fallback and the debounce logic, the interval could probably be bumped even higher than 15 minutes.
@hkhan925 thanks for the PR. Will review. cc @sonika-shah
Describe your changes:
Fixes #26729
In multi-replica Kubernetes deployments, `TypeRegistry` is a JVM-local singleton that loads custom property definitions once at startup and never refreshes. When a custom property is created on one pod, the other pods remain stale, causing `Unknown custom field` validation errors on ~(N-1)/N of requests. This has been reported since version 1.6.0 (#25532, #21865) and remains unfixed.

Changes to `TypeRegistry.java`:

- DB fallback on cache miss: `getSchema()`, `getCustomPropertyType()`, and `getCustomPropertyConfig()` now fall back to the database when a property is not found in the local cache. The full entity type is loaded via `TypeRepository.getByName()` and all three `ConcurrentHashMap` caches are warmed atomically through the existing `addType()` method.
- TTL-based staleness detection: A per-entity-type timestamp tracks when data was last refreshed. After 15 minutes, the next access triggers a DB reload. This lets property deletions and config modifications propagate across pods.
- Cache-miss debounce: To prevent unbounded DB queries when a non-existent property is queried repeatedly, cache-miss-triggered refreshes are debounced to at most once per 30 seconds per entity type. The TTL-based staleness check (15 min) operates independently.
- Stale entry cleanup: On refresh, existing custom property entries for the entity type are cleared via `removeIf()` before re-adding fresh data from the DB, ensuring deleted properties don't persist in the cache.
- NPE fix: `addType()` now uses `listOrEmpty()` for the custom properties iteration, fixing a latent NPE for Field-category types with null `customProperties`. This matches the existing guard in `validateCustomProperties()`.

This approach mirrors the established DB-fallback pattern used by other caches in the codebase (`SettingsCache`, `SubjectCache`, `BotTokenCache`, etc.) and introduces no new infrastructure or dependencies.
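The behaviour described above (DB fallback on a miss, 15-minute TTL reload, 30-second miss debounce, and `removeIf()` cleanup) can be illustrated with a minimal, self-contained sketch. This is not the code from the PR: `RefreshingPropertyCache`, its `loader` function (a stand-in for the `TypeRepository.getByName()` lookup), and the injected `clock` are hypothetical names chosen for illustration, and a single string-to-string map replaces the three real caches.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical stand-in for a registry cache; not the OpenMetadata TypeRegistry. */
class RefreshingPropertyCache {
  static final long TTL_MS = 15 * 60 * 1000L;      // staleness window (15 min, per the PR)
  static final long MISS_DEBOUNCE_MS = 30 * 1000L; // miss-triggered refresh limit (30 s)

  // Key is "<entityType>.<propertyName>", value is the property's schema.
  private final Map<String, String> schemas = new ConcurrentHashMap<>();
  private final Map<String, Long> lastRefresh = new ConcurrentHashMap<>();
  private final Map<String, Long> lastMissRefresh = new ConcurrentHashMap<>();
  private final Function<String, Map<String, String>> loader; // assumed DB lookup
  private final LongSupplier clock;                            // injectable for tests

  RefreshingPropertyCache(Function<String, Map<String, String>> loader, LongSupplier clock) {
    this.loader = loader;
    this.clock = clock;
  }

  /** Returns the schema, or null if the property is unknown even after a fallback. */
  String getSchema(String entityType, String property) {
    long now = clock.getAsLong();
    boolean reloaded = false;

    // TTL check: reload if this entity type was never loaded or is older than TTL_MS.
    Long refreshed = lastRefresh.get(entityType);
    if (refreshed == null || now - refreshed >= TTL_MS) {
      reload(entityType, now);
      reloaded = true;
    }

    String key = entityType + "." + property;
    String schema = schemas.get(key);

    // Cache miss: fall back to the DB, but at most once per MISS_DEBOUNCE_MS per type.
    if (schema == null && !reloaded) {
      Long lastMiss = lastMissRefresh.get(entityType);
      if (lastMiss == null || now - lastMiss >= MISS_DEBOUNCE_MS) {
        lastMissRefresh.put(entityType, now);
        reload(entityType, now);
        schema = schemas.get(key);
      }
    }
    return schema;
  }

  // Clears this entity type's entries, then re-adds whatever the loader returns.
  // Note: the clear and re-add are not atomic here; a reader can briefly see a gap.
  private void reload(String entityType, long now) {
    Map<String, String> fresh = loader.apply(entityType);
    String prefix = entityType + ".";
    schemas.keySet().removeIf(k -> k.startsWith(prefix)); // drop deleted properties
    if (fresh != null) { // null-safe, in the spirit of listOrEmpty()
      fresh.forEach((name, schema) -> schemas.put(prefix + name, schema));
    }
    lastRefresh.put(entityType, now);
  }
}
```

In this sketch a lookup that has just triggered a TTL reload skips the miss-triggered reload, so one request never hits the loader twice. Whether the PR makes the same choice is not stated in the description.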