fix(schema): Add critical performance indexes to resolve create_namespace latency >30s #3939
machov wants to merge 1 commit into apache:main
…pace latency

Add database schema v5 with performance-critical indexes that resolve create_namespace operation timeouts from 30+ seconds to under 2 seconds.

Changes:
- idx_grants_realm_grantee on grant_records(realm_id, grantee_id)
- idx_grants_realm_securable on grant_records(realm_id, securable_id)
- idx_entities_catalog_id_id on entities(catalog_id, id)

These indexes eliminate sequential scans on grant_records table during permission checks and optimize bulk entity lookups with large IN clauses.

Fixes apache#3685
dimas-b
left a comment
Thanks for your contribution, @machov ! The change LGTM 👍
schema diff against v4 for reference:
19,21c19,21
< -- Changes from v2:
< -- * Added `events` table
< -- * Added `idempotency_records` table for REST idempotency
---
> -- Changes from v4:
> -- * Added performance-critical indexes for grant_records table to fix create_namespace latency (Issue #3685)
> -- * Added optimized index for entities bulk lookups
31c31
< VALUES ('version', 4)
---
> VALUES ('version', 5)
59a60,61
> -- Additional index for bulk entity lookups (Issue #3685)
> CREATE INDEX IF NOT EXISTS idx_entities_catalog_id_id ON entities (catalog_id, id);
98a101,107
>
> -- Performance-critical indexes for grant_records (Issue #3685)
> -- These indexes resolve create_namespace latency from 30+ seconds to under 2 seconds
> CREATE INDEX IF NOT EXISTS idx_grants_realm_grantee
> ON grant_records (realm_id, grantee_id);
> CREATE INDEX IF NOT EXISTS idx_grants_realm_securable
> ON grant_records (realm_id, securable_id);
Given that quite a few people are involved in JDBC persistence, let's give this PR a few extra days in review.
@machov: Do you rely on having this fix in a released version soon? Note: 1.4.0 is in the works ATM.
Can you elaborate on what else needs to be done? I see all tests passed.
@machov: The PR is good to merge from my POV. I merely wanted it to have some more time in review in case other interested people have opinions on the new indexes. The question about 1.4.0 was basically to check whether you need this fix in the 1.4.0 release or whether you're OK with merging it after 1.4.0.
A couple of pieces of feedback:
I'm personally fine with adding the new indexes to the v4 DDL files. However, from a more rigorous perspective, it makes sense to version the schema every time there is a material change. This way, it is easier to track how the Polaris database is expected to behave... For example, we could (hypothetically) deny the expensive operations with a v4 schema. I do not mean to do that in the current PR, just exposing options to consider 🙂
flyrain
left a comment
Thanks for the change. Echoing @singhpk234: we can probably reuse v4, as 1.4.0 isn't released yet.
Good point - I missed that the v4 schema was added after 1.3.0. Let's update v4 in this PR and merge it before 1.4.0 then. Adding to milestone.
Problem
Issue #3685 reports critical performance degradation where create_namespace API operations consistently time out (>30s), causing cascading failures with 504 Gateway Timeouts.
Root Cause Analysis
Database analysis revealed:
grant_records
Solution
Added schema v5 with three performance-critical indexes:
- idx_grants_realm_grantee on grant_records(realm_id, grantee_id)
- idx_grants_realm_securable on grant_records(realm_id, securable_id)
- idx_entities_catalog_id_id on entities(catalog_id, id)
Performance Impact
Based on issue reporter's testing:
Database Compatibility
IF NOT EXISTS for safe deployment
Fixes #3685
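To illustrate the Solution and Database Compatibility points, here is a minimal sketch of how the three indexes change a query plan from a full scan to an index search, and why `IF NOT EXISTS` makes the DDL safe to re-run. It makes several assumptions: SQLite (via Python's standard `sqlite3` module) stands in for the real JDBC-backed database, the table layouts are reduced to the columns named in the PR plus a hypothetical `name` column, and the three query shapes are guesses at what a permission check or bulk entity lookup might look like, not the statements Polaris actually issues. Only the index definitions are taken from the PR.

```python
import sqlite3

# SQLite is only a stand-in for the real database; table layouts are simplified.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE grant_records (realm_id TEXT, securable_id INTEGER, grantee_id INTEGER);
    CREATE TABLE entities (realm_id TEXT, catalog_id INTEGER, id INTEGER, name TEXT);
""")

# Hypothetical query shapes; the statements Polaris actually issues may differ.
QUERIES = {
    "by_grantee": (
        "SELECT securable_id FROM grant_records WHERE realm_id = ? AND grantee_id = ?",
        ("r1", 42),
    ),
    "by_securable": (
        "SELECT grantee_id FROM grant_records WHERE realm_id = ? AND securable_id = ?",
        ("r1", 7),
    ),
    "bulk_entities": (
        "SELECT name FROM entities WHERE catalog_id = ? AND id IN (?, ?, ?)",
        (1, 3, 7, 11),
    ),
}

# Index definitions as they appear in the PR.
INDEX_DDL = """
    CREATE INDEX IF NOT EXISTS idx_grants_realm_grantee ON grant_records (realm_id, grantee_id);
    CREATE INDEX IF NOT EXISTS idx_grants_realm_securable ON grant_records (realm_id, securable_id);
    CREATE INDEX IF NOT EXISTS idx_entities_catalog_id_id ON entities (catalog_id, id);
"""


def query_plans():
    """Return the planner's description of each query (last column of EXPLAIN QUERY PLAN)."""
    plans = {}
    for name, (sql, params) in QUERIES.items():
        rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
        plans[name] = " | ".join(row[-1] for row in rows)
    return plans


before = query_plans()
conn.executescript(INDEX_DDL)
conn.executescript(INDEX_DDL)  # IF NOT EXISTS makes the second run a no-op
after = query_plans()

# Count the indexes that exist after running the DDL twice.
index_count = conn.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE type = 'index'"
).fetchone()[0]

for name in QUERIES:
    print(f"{name}: {before[name]}  ->  {after[name]}")
print("indexes:", index_count)
```

On the real database, the equivalent check would be to run the database's own `EXPLAIN` on the slow statements before and after applying the schema change. Planner behavior and plan text differ between engines, so this sketch says nothing about the latency numbers reported in the issue.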