fix(fe): clean DynamicPartitionScheduler.runtimeInfos on DROP TABLE #62884
horus-leonardo wants to merge 1 commit into apache:master
DynamicPartitionScheduler.runtimeInfos accumulates entries indefinitely when tables are dropped or lose their dynamic_partition properties.

removeRuntimeInfo(tableId) is called from ShowDynamicPartitionCommand, but only opportunistically: it requires a user to issue SHOW DYNAMIC PARTITION, and it only catches tables still present in the catalog that have lost their dynamic_partition property. No catalog mutation path calls it.

Fix:
- Call removeRuntimeInfo() in InternalCatalog.unprotectDropTable() so the entry is cleared when a table is dropped.
- Call removeRuntimeInfo() in executeDynamicPartition() at the two cleanup points where the iterator removes a table from the scheduling set (db gone, olapTable null/MTMV/no-dynamic-partition).

In a high-DDL-churn workload (CREATE/DROP loops on tables with dynamic_partition.enable=true or partitionRetentionCount > 0) this map can grow without bound and cause FE OOM after extended uptime.

Closes apache#62883

Signed-off-by: Leonardo Constanski <leonardo@horusbi.com.br>
/review
zclllyybb left a comment
Thanks for the meaningful fix! Please add a regression test to lock in this behaviour.
Pull request overview
This PR addresses an FE memory leak where DynamicPartitionScheduler.runtimeInfos retains per-table runtime entries long after tables are no longer eligible for dynamic partition scheduling (notably after DROP TABLE / missing DB / invalid table cases), which can lead to unbounded heap growth in high-DDL-churn workloads.
Changes:
- Remove `runtimeInfos` entries during `DROP TABLE` in `InternalCatalog.unprotectDropTable()`.
- Remove `runtimeInfos` entries when `executeDynamicPartition()` prunes tables due to missing DB or invalid/non-eligible tables.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| fe/fe-core/src/main/java/org/apache/doris/datasource/InternalCatalog.java | Removes dynamic-partition runtime info when a table is dropped from the catalog. |
| fe/fe-core/src/main/java/org/apache/doris/clone/DynamicPartitionScheduler.java | Removes runtime info when the scheduler evicts a table from its working set due to missing DB or invalid/non-eligible table state. |
    db.unregisterTable(table.getId());
    // Fix DynamicPartitionScheduler.runtimeInfos leak on DROP TABLE.
    Env.getCurrentEnv().getDynamicPartitionScheduler().removeRuntimeInfo(table.getId());
There are existing FE unit tests around dynamic partition scheduling/runtime info (e.g. DynamicPartitionTableTest#testRuntimeInfo). Since this change is intended to prevent a memory leak, it would be good to add a unit test that creates runtime info for a table, drops the table via the catalog path, and asserts the relevant runtime info keys no longer return the previously stored values (i.e. the entry was actually removed).
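The shape of the requested test can be sketched against a small stand-in model rather than the real FE classes. Everything below is hypothetical: `RuntimeInfoModel`, `CatalogModel`, the `"N/A"` placeholder, and the `lastUpdateTime` key are illustrative names, not Doris APIs. A real test would presumably live next to `DynamicPartitionTableTest#testRuntimeInfo` and drive the actual catalog drop path.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the scheduler's per-table runtime-info map.
final class RuntimeInfoModel {
    // Assumed placeholder returned when no runtime info is stored.
    static final String DEFAULT_VALUE = "N/A";

    private final Map<Long, Map<String, String>> runtimeInfos = new ConcurrentHashMap<>();

    void createOrUpdateRuntimeInfo(long tableId, String key, String value) {
        runtimeInfos.computeIfAbsent(tableId, id -> new ConcurrentHashMap<>()).put(key, value);
    }

    String getRuntimeInfo(long tableId, String key) {
        Map<String, String> info = runtimeInfos.get(tableId);
        return info == null ? DEFAULT_VALUE : info.getOrDefault(key, DEFAULT_VALUE);
    }

    void removeRuntimeInfo(long tableId) {
        runtimeInfos.remove(tableId);
    }

    int trackedTables() {
        return runtimeInfos.size();
    }
}

// Hypothetical stand-in for the catalog drop path that the patch modifies.
final class CatalogModel {
    private final Map<Long, String> tables = new ConcurrentHashMap<>();
    private final RuntimeInfoModel scheduler;

    CatalogModel(RuntimeInfoModel scheduler) {
        this.scheduler = scheduler;
    }

    void createTable(long tableId, String name) {
        tables.put(tableId, name);
    }

    // Mirrors the idea of the patch: unregister the table, then clear its runtime info.
    boolean dropTable(long tableId) {
        if (tables.remove(tableId) == null) {
            return false;
        }
        scheduler.removeRuntimeInfo(tableId);
        return true;
    }
}
```

A test in this style would store a value, drop the table through the catalog, and then assert both that the lookup falls back to the placeholder and that the map no longer tracks the table.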
    || !olapTable.getTableProperty().getDynamicPartitionProperty().getEnable())
    && olapTable.getPartitionRetentionCount() <= 0) {
        iterator.remove();
        removeRuntimeInfo(tableId);
        continue;
The new cleanup in executeDynamicPartition() only runs when the scheduler itself iterates a (dbId, tableId) entry and decides to iterator.remove(). However, the normal DDL path for disabling dynamic partition / setting partition.retention_count back to 0 calls DynamicPartitionUtil.registerOrRemoveDynamicPartitionTable(...), which removes the table from dynamicPartitionTableInfo via removeDynamicPartitionTable(...) without clearing runtimeInfos. Once removed from the set, executeDynamicPartition() will never visit it again, so its runtimeInfos entry can still become permanent. Consider clearing runtimeInfos as part of the removal path as well (e.g., in removeDynamicPartitionTable(...) or in DynamicPartitionUtil.registerOrRemoveDynamicPartitionTable() when unregistering).
This issue should be considered. It may be better to handle this issue at the end of the relevant DML statements, rather than within the scheduler.
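One way to read the suggestion in this thread is to make removal from the scheduling set and removal from `runtimeInfos` a single operation, so that no caller can do one without the other. The sketch below is a simplified model under that assumption. `SchedulingSetModel` and its methods are hypothetical and only borrow the field names mentioned above; the real `removeDynamicPartitionTable(...)` may have a different signature and locking.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model: the scheduling set and the runtime-info map are always updated together.
final class SchedulingSetModel {
    private final Set<Map.Entry<Long, Long>> dynamicPartitionTableInfo = ConcurrentHashMap.newKeySet();
    private final Map<Long, Map<String, String>> runtimeInfos = new ConcurrentHashMap<>();

    private static Map.Entry<Long, Long> key(long dbId, long tableId) {
        return new SimpleImmutableEntry<>(dbId, tableId);
    }

    void registerDynamicPartitionTable(long dbId, long tableId) {
        dynamicPartitionTableInfo.add(key(dbId, tableId));
    }

    // Simulates the scheduler writing runtime info while processing a table.
    void recordRun(long tableId, String infoKey, String value) {
        runtimeInfos.computeIfAbsent(tableId, id -> new ConcurrentHashMap<>()).put(infoKey, value);
    }

    // Unregistering a table also drops its runtime info, so the entry cannot
    // outlive the table's membership in the scheduling set.
    void removeDynamicPartitionTable(long dbId, long tableId) {
        dynamicPartitionTableInfo.remove(key(dbId, tableId));
        runtimeInfos.remove(tableId);
    }

    boolean isScheduled(long dbId, long tableId) {
        return dynamicPartitionTableInfo.contains(key(dbId, tableId));
    }

    boolean hasRuntimeInfo(long tableId) {
        return runtimeInfos.containsKey(tableId);
    }
}
```

With this coupling, a DDL path that disables dynamic partitioning and unregisters the table would clear the entry immediately, without waiting for the scheduler to revisit a table it no longer iterates over.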
What problem does this PR solve?
Issue Number: close #62883
Related PR: none
Problem Summary:
`DynamicPartitionScheduler.runtimeInfos` accumulates entries indefinitely. The map is keyed by `tableId` and gets a new entry every time the scheduler runs against a table with `dynamic_partition.enable=true` or `partitionRetentionCount > 0`.

`removeRuntimeInfo(long tableId)` is called in exactly one place: `ShowDynamicPartitionCommand.doRun()`, which only fires when a user issues `SHOW DYNAMIC PARTITION` and only for tables still present in the catalog that have lost their `dynamic_partition` property. No catalog mutation path calls it: DROP TABLE, DROP DATABASE, and tables that turn off dynamic_partition or zero out `partitionRetentionCount` all leave permanent entries. In automated ETL workloads where nobody runs `SHOW`, the map grows without bound.

This patch wires `removeRuntimeInfo()` into the three canonical cleanup points:
- `InternalCatalog.unprotectDropTable()`: alongside `db.unregisterTable()`.
- `executeDynamicPartition()`, `db == null` branch: after `iterator.remove()`.
- `executeDynamicPartition()`, `olapTable` invalid/lost-properties branch: after `iterator.remove()`.

Found via heap dump analysis after an FE OOM on 4.0.5-rc01 today (2026-04-27) in a high-DDL-churn ETL workload. The map had reached ~1.5M entries / 554 MB retained heap. We are rolling out a patched build to production now and will follow up on the issue thread with steady-state retention numbers after a week of uptime.
Full bug report and heap dump details in #62883.
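As a rough sanity check, the entry count can be bracketed from the 2,097,152-bucket table size reported in the manual-test note of this description. The sketch assumes OpenJDK's `ConcurrentHashMap` behaviour: the table doubles once the element count reaches 0.75 of the table length and never shrinks. It also assumes the map grew by insertion rather than being presized. `ChmSizeBounds` is a hypothetical helper, not part of the patch.

```java
// Hypothetical helper: bounds on the element count implied by a ConcurrentHashMap table length.
final class ChmSizeBounds {
    private ChmSizeBounds() {
    }

    // Assumed resize threshold: 0.75 * tableLength, computed as n - n/4.
    static long resizeThreshold(long tableLength) {
        return tableLength - (tableLength >>> 2);
    }

    // Count that must have been reached to grow from half this length to this length.
    static long minEntries(long tableLength) {
        return resizeThreshold(tableLength / 2);
    }

    // Count at which the table would have doubled again.
    static long maxEntries(long tableLength) {
        return resizeThreshold(tableLength);
    }
}
```

Under those assumptions, `minEntries(2_097_152)` is 786,432 and `maxEntries(2_097_152)` is 1,572,864, which is consistent with the ~1M–1.5M entries estimated from the dump.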
Release note
Fix FE memory leak in `DynamicPartitionScheduler.runtimeInfos` for tables that are dropped, lose their `dynamic_partition.enable` property, or have `partitionRetentionCount` reset to 0.
Check List (For Author)
Manual test: heap dump analysis on a 4.0.5-rc01 FE that OOMed under an ETL workload doing ~24K DDL/hour against `dynamic_partition` tables. The dump showed `runtimeInfos` holding ~1M–1.5M stale entries (2,097,152-bucket `ConcurrentHashMap$Node[]`, 554 MB retained on `DynamicPartitionScheduler`, 17% of live heap post-GC walk). The patched build is being deployed today; I will report steady-state heap numbers in the issue thread after a week of production uptime.
A unit test reproducing the leak would need to drive the dynamic-partition scheduler against a synthetic catalog and assert `runtimeInfos.size()` after DROP. Happy to add one if maintainers prefer that over the production validation.
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)