Skip to content

Conversation

@nickva
Copy link
Contributor

@nickva nickva commented Nov 28, 2024

Previously, we included attachment sizes in the active size totals even if the attachments were deleted from the doc revision leafs.

The fix is, unsurprisingly, to do what the compactor does when it copies and updates sizes: only update sizes for leaf revision during the rev-tree traversal. We did this for doc bodies already in #4264, but forgot to consider attachments then.

In general we're aiming for the active size to approximate the size of the file after compaction. That is what drives smoosh (autocompaction) priority. If we don't remove non-leaf attachments from the calculation, the active/file ratio in smoosh may never trigger properly as ratio will stay low for shards with large attachments.

Until a release goes out with this fix, a workaround could be to compact shards with large attachments by hand, or use a lower threshold on slack_dbs channel (and concurrency=1 to keep it in the background) to slowly cycle through all the shards and try to compact them.

While the fix itself is small, to convince ourselves we did the right thing update the tests to check a few corner cases.

Interestingly enough we had a disabled Elixir compaction test which we disabled because it was "mysteriously" not reporting a lower "active" size after compacting. So re-enable it and also improve it a bit to have both a attachment that gets deleted and one that stays live.

Fix #5346

Previously, we included attachment sizes in the active size totals even if they
attachments were deleted from the doc revision leafs.

The fix is, unsurprisingly, to do what the compactor does when it copies and
updates sizes: only update sizes for leaf revision during the rev-tree
traversal. We did this for doc bodies already in #4264, but forgot to consider
attachments then.

In general we're aiming for the active size to approximate the size of the file
after compaction. That is what drives smoosh (autocompaction) priority. If
we don't remove non-leaf attachments from the calculation, the active/file
ratio in smoosh may never trigger properly as ratio will stay low for shards
with large attachments.

Until a release goes out with this fix, a workaround could be to compact shards
with large attachments by hand, or use a lower threshold on `slack_dbs`
channel (and concurrency=1 to keep it in the background) to slowly cycle
through all the shards and try to compact them.

While the fix itself is small, to convince ourselves we did the right thing
update the tests to check a few corner cases.

Interestingly enough we had a disabled Elixir compaction test which we disabled
because it was "mysteriously" not reporting a lower "active" size after
compacting. So re-enable it and also improve it a bit to have both a attachment
that gets deleted and one that stays live.

Fix #5346
Copy link
Member

@rnewson rnewson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice fix!

@nickva nickva merged commit f2f6cd7 into main Nov 28, 2024
@nickva nickva deleted the fix-attachments-size-calculation branch November 28, 2024 14:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Deleted attachments do not reduce the active size

2 participants