Fix cluster index and process cleanup by nickva · Pull Request #5986 · apache/couchdb

nickva · 2026-04-28T03:04:41Z

Previously, as described in #5980, we didn't perform a thorough index cleanup when ddocs changed. We only cleaned up on nodes where the design docs were located. That was sufficient for an n=3 db on an n=3 cluster, where every node holds a copy of every shard, but it does not hold for clusters in general.

To fix the issue, run a small gen_server responsible for performing cluster index cleanup. To avoid spawning Q*N jobs, deduplicate the requests by delaying for up to 30 seconds per clustered db. For the cleanup itself, reuse the existing fabric index file cleanup machinery. That accomplishes two things:

- It starts index file cleanup sooner. Previously we only did this during smoosh compaction runs, so stale view files could linger until smoosh triggered a compaction.
- Cleaning search index files also stops the indexes on their (Java) side, so index file cleanup does "double duty", so to speak, and also handles index shutdown for search indexes.
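To illustrate the deduplication idea described above, here is a minimal sketch in Python (not the Erlang gen_server from this PR). The class name `CleanupScheduler`, its methods, and the injectable `clock` are hypothetical names chosen for the example. It assumes the first request for a db fixes the deadline and later requests for the same db are folded into it, so no request waits longer than the configured delay.

```python
import time
from typing import Callable, Dict, List


class CleanupScheduler:
    """Coalesces repeated cleanup requests so each db gets one job per window."""

    def __init__(self, delay_sec: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.delay_sec = delay_sec
        self.clock = clock
        # Maps clustered db name -> time at which its cleanup becomes due.
        self.pending: Dict[str, float] = {}

    def request(self, dbname: str) -> bool:
        """Record a cleanup request; return True only if a new job was scheduled."""
        if dbname in self.pending:
            # A job is already waiting for this db, so this request is a duplicate.
            return False
        self.pending[dbname] = self.clock() + self.delay_sec
        return True

    def run_due(self, cleanup: Callable[[str], None]) -> List[str]:
        """Run cleanup once for every db whose delay has elapsed."""
        now = self.clock()
        due = sorted(db for db, due_at in self.pending.items() if due_at <= now)
        for db in due:
            del self.pending[db]
            cleanup(db)
        return due
```

With this shape, a burst of design doc updates arriving from many shard copies of the same db results in a single cleanup call once the delay expires, rather than one call per shard copy.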

Fix #5980

willholley · 2026-04-28T03:51:11Z

Does this change mean that users who inadvertently trigger a full index rebuild by modifying a design document will have a much smaller window in which to "undo" the change (revert to the previous ddoc content and pick up those index files on disk again)? We've definitely used the delay in index cleanup to recover from that kind of situation, particularly when an index takes hours or days to rebuild.

nickva · 2026-04-28T05:32:36Z

@willholley yeah, this would shorten that time. However, when compaction runs is unpredictable in general, so we shouldn't rely on that behavior. Previously, a design doc update could "kick" smoosh, which could start an immediate cleanup and delete the files. Also, the configured file deletion options still apply (before and with this change), so if that's set up, the view files may still be recoverable with some file moves and copies.

willholley · 2026-04-28T06:58:03Z

@nickva yep - not a blocker from me, just an operational change we'll need to be aware of.

janl · 2026-04-28T14:30:22Z

> revert to the previous ddoc content and pick up those index files in disk again

It feels like we should make this an explicit feature rather than relying on an accident of the implementation. It might be too big a scope for this PR, but I think we should consider it going forward.

Something along the lines of renaming the file to include the deletion timestamp in the filename so we can purge it later (or relying on last-modified times, if they are reliable), and scooping these files up before starting a fresh index build.

nickva · 2026-04-28T15:12:14Z

Oh, that could be interesting. When we open an index, we could look through the deleted view files to see if there is something recent (by time and/or update seq) that we can re-create from.
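The rename-and-recover idea discussed in this thread could be sketched roughly as follows. This is a Python illustration only, not CouchDB code: the `<name>.<unix-timestamp>.deleted` naming scheme and the functions `mark_deleted`, `find_recoverable`, and `purge_expired` are assumptions made for the example, and it only checks age, not update seq.

```python
import os
import re
import time
from typing import List, Optional

# Hypothetical naming scheme: "<original name>.<unix timestamp>.deleted"
DELETED_RE = re.compile(r"^(?P<base>.+)\.(?P<ts>\d+)\.deleted$")


def mark_deleted(path: str, now: Optional[float] = None) -> str:
    """Soft-delete a file by renaming it with its deletion timestamp."""
    ts = int(time.time() if now is None else now)
    target = f"{path}.{ts}.deleted"
    os.rename(path, target)
    return target


def find_recoverable(path: str, max_age_sec: float,
                     now: Optional[float] = None) -> Optional[str]:
    """Return the newest soft-deleted copy of `path` within max_age_sec, if any."""
    now = time.time() if now is None else now
    directory, name = os.path.split(path)
    best = None
    for entry in os.listdir(directory or "."):
        m = DELETED_RE.match(entry)
        if m and m.group("base") == name:
            ts = int(m.group("ts"))
            if now - ts <= max_age_sec and (best is None or ts > best[0]):
                best = (ts, os.path.join(directory, entry))
    return best[1] if best else None


def purge_expired(directory: str, max_age_sec: float,
                  now: Optional[float] = None) -> List[str]:
    """Permanently remove soft-deleted files older than max_age_sec."""
    now = time.time() if now is None else now
    removed = []
    for entry in sorted(os.listdir(directory)):
        m = DELETED_RE.match(entry)
        if m and now - int(m.group("ts")) > max_age_sec:
            full = os.path.join(directory, entry)
            os.remove(full)
            removed.append(full)
    return removed
```

Before starting a fresh build, an indexer following this scheme would call `find_recoverable` for the expected index file path and rename any match back into place. A periodic `purge_expired` pass bounds how long the soft-deleted files occupy disk.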

nickva force-pushed the fix-clustered-index-cleanup branch from 4720fb7 to c670a11 Compare April 28, 2026 03:29

nickva mentioned this pull request Apr 28, 2026

[Bug]: Index processes are not shutdown on every node when ddoc changes #5980

Open

janl reviewed Apr 28, 2026

Comment thread src/couch_index/src/couch_index_cleanup.erl

nickva force-pushed the fix-clustered-index-cleanup branch from c670a11 to 5768a2b Compare April 29, 2026 21:23

nickva force-pushed the fix-clustered-index-cleanup branch from 5768a2b to 578d1b5 Compare April 30, 2026 19:11

nickva requested a review from rnewson May 1, 2026 03:03

nickva wants to merge 1 commit into main from fix-clustered-index-cleanup