[fix](editlog) Fix BDBEnvironment.removeDatabase() wrong list removal index #62064

Open
dataroaring wants to merge 2 commits into apache:master from dataroaring:fix/bdbenv-remove-index

@dataroaring
Contributor

Summary

  • removeDatabase() uses a manual index counter during iteration, then calls openedDatabases.remove(index) after the loop
  • If getDatabaseName() throws (e.g. DatabasePreemptedException for a preempted database), the index counter gets out of sync with actual position, causing removal of the wrong database handle
  • Fix: replace manual index tracking with iterator.remove(), which is both correct and concise
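
The removal style named in the fix can be sketched outside Doris. This is a minimal illustration, not the actual `BDBEnvironment` code: `Handle`, `removeByIterator`, and `names` are hypothetical names, and `Handle` only stands in for a BDB database handle.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class IteratorRemovalSketch {
    /** Hypothetical stand-in for a database handle; not the BDB JE Database class. */
    static final class Handle {
        private final String name;
        Handle(String name) { this.name = name; }
        String getName() { return name; }
    }

    /** Removes the first handle whose name matches; returns true if one was removed. */
    static boolean removeByIterator(List<Handle> opened, String target) {
        for (Iterator<Handle> iter = opened.iterator(); iter.hasNext();) {
            Handle h = iter.next();
            if (target.equals(h.getName())) {
                // Removes exactly the element just returned by next(),
                // so no separate index has to be kept in sync.
                iter.remove();
                return true;
            }
        }
        return false;
    }

    /** Collects the remaining handle names, in list order. */
    static List<String> names(List<Handle> opened) {
        List<String> out = new ArrayList<>();
        for (Handle h : opened) {
            out.add(h.getName());
        }
        return out;
    }

    public static void main(String[] args) {
        List<Handle> opened = new ArrayList<>();
        opened.add(new Handle("100"));
        opened.add(new Handle("200"));
        opened.add(new Handle("300"));
        removeByIterator(opened, "200");
        System.out.println(names(opened)); // prints [100, 300]
    }
}
```

Because `Iterator.remove()` always acts on the element last returned by `next()`, the element that was inspected and the element that is removed cannot diverge.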

Test plan

  • Verify old journal databases are correctly removed during cleanup
  • Verify no wrong database handles are removed from openedDatabases

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings April 2, 2026 14:29
@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

Contributor

Copilot AI left a comment

Pull request overview

Fixes BDBEnvironment.removeDatabase() so it removes the correct database handle from openedDatabases even if getDatabaseName() throws during iteration.

Changes:

  • Replaced manual index tracking + post-loop openedDatabases.remove(index) with Iterator.remove() during iteration.
  • Simplified removal flow by eliminating the targetDbName/index bookkeeping.

- String targetDbName = null;
- int index = 0;
- for (Database db : openedDatabases) {
+ for (java.util.Iterator<Database> iter = openedDatabases.iterator(); iter.hasNext();) {

Copilot AI Apr 2, 2026

Using a fully-qualified type (java.util.Iterator) inside method bodies hurts readability and is inconsistent with typical Java style. Prefer importing java.util.Iterator (or using an existing import) and writing for (Iterator<Database> iter = ...) instead.

Contributor Author

Done. Added import and changed to Iterator.

@dataroaring
Contributor Author

/review

…t removal

removeDatabase() uses a manual index counter to track position in
openedDatabases, then calls openedDatabases.remove(index) after the loop.
The counter is incremented only for non-matching entries, so when a match
is found and the loop breaks, index does point to the matched element.
Even so, the pattern is fragile and error-prone compared to
iterator.remove().

More critically, if getDatabaseName() throws (e.g. DatabasePreemptedException
for a preempted database), the index counter gets out of sync with the
actual position, causing removal of the wrong database handle from the list.

Fix: replace the manual index tracking with iterator.remove(), which is
both correct and concise.

@dataroaring force-pushed the fix/bdbenv-remove-index branch from 138ab56 to ba6c240 on April 3, 2026 01:39
Contributor

@github-actions github-actions bot left a comment

Found 1 issue that should be addressed before this change lands.

Critical checkpoint conclusions:

  • Goal of the task: Partially met. The patch removes the fragile manual index bookkeeping, but it does not handle the documented DatabasePreemptedException/stale-handle path that motivated the bug report, so removeDatabase() can still fail before removing the target DB. No test proves this path.
  • Change size/focus: Yes. The patch is small and focused.
  • Concurrency: openedDatabases is still protected by the existing write lock, and this patch does not introduce a new lock-order issue.
  • Lifecycle/static init: No special lifecycle or static initialization changes.
  • Configuration: No config changes.
  • Compatibility: No FE/BE protocol or storage compatibility impact.
  • Parallel paths: Not fully handled. openDatabase() already treats preempted databases as expected stale state, but removeDatabase() still does not mirror that behavior.
  • Special conditions: The special preempted-database condition is documented in the same class, but this method still lacks the required exception handling.
  • Test coverage: Insufficient for the changed bug path. Existing tests cover normal removal and removing twice, not a preempted/stale handle in openedDatabases.
  • Observability: Existing logging is adequate for this path.
  • Transaction/persistence: The change touches journal database cleanup; if removeDatabase() aborts early, old journal DBs can remain undeleted.
  • Data write/modification correctness: Cleanup correctness is still at risk because a stale handle can stop removal of the intended DB.
  • FE/BE variable passing: Not applicable.
  • Performance: No material performance concern in this patch.
  • Other issues: None beyond the inline comment.

Overall opinion: needs follow-up before merge because the reported stale-handle scenario is still not fully handled.

- for (Database db : openedDatabases) {
+ for (java.util.Iterator<Database> iter = openedDatabases.iterator(); iter.hasNext();) {
+ Database db = iter.next();
  String name = db.getDatabaseName();
Contributor

iter.remove() fixes the index bookkeeping bug, but this line can still throw on a stale/preempted handle and abort the whole method. The same class already documents this exact DatabasePreemptedException case in openDatabase().

Concrete failure path: if openedDatabases contains a handle that was preempted by a replicated remove, db.getDatabaseName() throws here, removeDatabase() exits before calling replicatedEnvironment.removeDatabase(null, dbName), and the old journal DB is not removed.

Please mirror the stale-handle cleanup pattern from openDatabase() here (catch the exception, remove the stale entry, and continue scanning), and add a unit test that covers a preempted database handle in openedDatabases.
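
The failure path described in this comment can be reproduced in isolation. The sketch below is an assumption-laden stand-in, not Doris code: `Handle` is a hypothetical interface, `IllegalStateException` plays the role of `DatabasePreemptedException`, and `removeWithoutCatch` models a scan that uses `iter.remove()` but has no exception handling.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class AbortedScanSketch {
    /** Hypothetical stand-in for a database handle. */
    interface Handle {
        String getName(); // throws when the handle is stale in this sketch
    }

    /** Scan with iterator removal but without any handling for stale handles. */
    static boolean removeWithoutCatch(List<Handle> opened, String target) {
        for (Iterator<Handle> iter = opened.iterator(); iter.hasNext();) {
            Handle h = iter.next();
            String name = h.getName(); // an exception here aborts the whole scan
            if (target.equals(name)) {
                iter.remove();
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Handle> opened = new ArrayList<>();
        // The stale handle sits in front of the target.
        opened.add(() -> { throw new IllegalStateException("preempted"); });
        opened.add(() -> "200");
        try {
            removeWithoutCatch(opened, "200");
        } catch (IllegalStateException e) {
            // The target was never reached, so nothing was removed.
            System.out.println("aborted, list size still " + opened.size());
        }
    }
}
```

With the stale entry ahead of the target, the scan never reaches "200", which corresponds to the case where the old journal DB is left in place.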

@dataroaring
Contributor Author

run buildall

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 100.00% (4/4) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 100.00% (4/4) 🎉
Increment coverage report
Complete coverage report

The removeDatabase() method calls db.getDatabaseName() which can throw
DatabasePreemptedException on a stale/preempted handle, aborting the
method before replicatedEnvironment.removeDatabase() is reached. This
leaves old journal databases unremoved.

Mirror the stale-handle cleanup pattern from openDatabase(): catch the
exception, close the stale handle, remove it from openedDatabases, and
continue scanning. Also add a unit test covering this scenario.

Generated by ThinkOps
@dataroaring
Contributor Author

Addressed the review feedback about DatabasePreemptedException handling in removeDatabase():

Fix (commit 70c3d71):

  • Added try-catch around db.getDatabaseName() in removeDatabase() to handle DatabasePreemptedException on stale/preempted database handles
  • Mirrors the existing cleanup pattern from openDatabase(): catch the exception, close the stale handle, remove it from openedDatabases, and continue scanning
  • The db.close() inside the catch is also wrapped defensively, since a preempted handle may throw during close as well
  • Without this fix, a preempted handle would abort removeDatabase() before reaching replicatedEnvironment.removeDatabase(null, dbName), leaving old journal DBs unremoved
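
The cleanup pattern in these bullets can be sketched roughly as follows. This is not the code from commit 70c3d71: `Handle`, `PreemptedException`, `healthy`, `preempted`, and `removeTarget` are hypothetical names, and the sketch assumes the preemption exception is unchecked.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class StaleHandleCleanupSketch {
    /** Hypothetical stand-in for a database handle; not the BDB JE API. */
    interface Handle {
        String getName();
        void close();
    }

    /** Stand-in for DatabasePreemptedException (assumed unchecked here). */
    static final class PreemptedException extends RuntimeException {
        PreemptedException() { super("handle was preempted"); }
    }

    /** A handle that behaves normally. */
    static Handle healthy(String name) {
        return new Handle() {
            public String getName() { return name; }
            public void close() { }
        };
    }

    /** A handle that throws from both getName() and close(). */
    static Handle preempted() {
        return new Handle() {
            public String getName() { throw new PreemptedException(); }
            public void close() { throw new PreemptedException(); }
        };
    }

    /** Drops stale handles while scanning; returns true once the target is removed. */
    static boolean removeTarget(List<Handle> opened, String target) {
        for (Iterator<Handle> iter = opened.iterator(); iter.hasNext();) {
            Handle h = iter.next();
            String name;
            try {
                name = h.getName();
            } catch (PreemptedException e) {
                // Stale handle: close it defensively, drop it, keep scanning.
                try {
                    h.close();
                } catch (RuntimeException ignored) {
                    // A preempted handle may also fail to close.
                }
                iter.remove();
                continue;
            }
            if (target.equals(name)) {
                h.close();
                iter.remove();
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Handle> opened = new ArrayList<>();
        opened.add(healthy("100"));
        opened.add(preempted());
        opened.add(healthy("200"));
        boolean removed = removeTarget(opened, "200");
        System.out.println(removed + " " + opened.size()); // prints true 1
    }
}
```

In this sketch the stale entry and the target are both gone after one pass, and only the unrelated healthy handle remains.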

Test:

  • Added testRemoveDatabaseWithPreemptedHandle() unit test that injects a mock preempted database handle (throws on getDatabaseName()), verifies removeDatabase() still completes successfully, and confirms both the stale handle and the target database are properly cleaned up

— ThinkOps 🤖

@dataroaring
Contributor Author

run buildall

2 similar comments

3 participants