Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

archival: clamp uploads to committed offset #18285

Conversation

nvartolomei
Copy link
Contributor

The archival/tiered storage correctness assumption builds on the (wrong) assumption that LSO is monotonic. Tiered storage doesn't have a concept of suffix truncation so if that would happen it would lead violations of correctness properties and diverging logs/undefined behavior.

However, we have discovered that property does not hold if there are no in-progress transaction and acks=0/1 or write caching is in use because LSO falls back to "last visible index"1 which can get truncated.

Ref #18244

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.1.x
  • v23.3.x
  • v23.2.x

Release Notes

  • none

Footnotes

  1. https://github.com/redpanda-data/redpanda/blob/88ac775f9f7954330732024abfa6e9ed5c9c11fd/src/v/cluster/rm_stm.cc#L1322

@nvartolomei nvartolomei requested a review from Lazin May 7, 2024 16:06
@nvartolomei nvartolomei force-pushed the nv/clamp-upload-to-committed-offset branch from 007eea8 to 63aff06 Compare May 7, 2024 16:58
@nvartolomei nvartolomei added the area/cloud-storage Shadow indexing subsystem label May 7, 2024
@vbotbuildovich

This comment was marked as outdated.

@nvartolomei nvartolomei requested review from andrwng and removed request for andijcr May 7, 2024 18:58
src/v/archival/ntp_archiver_service.cc Outdated Show resolved Hide resolved
src/v/archival/ntp_archiver_service.cc Outdated Show resolved Hide resolved
andrwng
andrwng previously approved these changes May 7, 2024
src/v/archival/ntp_archiver_service.cc Outdated Show resolved Hide resolved
abhijat
abhijat previously approved these changes May 8, 2024
Lazin
Lazin previously approved these changes May 8, 2024
nvartolomei and others added 3 commits May 8, 2024 21:41
The archival/tiered storage correctness assumption builds on the
(wrong) assumption that LSO is monotonic. Tiered storage doesn't have a
concept of suffix truncation so if that would happen it would lead
violations of correctness properties and diverging logs/undefined
behavior.

However, we have discovered that property does not hold if there are no
in-progress transaction and acks=0/1 or write caching is in use because
LSO falls back to "last visible index"[^1] which can get truncated.

Ref redpanda-data#18244

[^1]: https://github.com/redpanda-data/redpanda/blob/88ac775f9f7954330732024abfa6e9ed5c9c11fd/src/v/cluster/rm_stm.cc#L1322
This communicates better the type of offset. No functional
changes/cosmetic only.
Outrageously verbose but for a good reason.
@nvartolomei nvartolomei dismissed stale reviews from Lazin, abhijat, and andrwng via 15ac281 May 8, 2024 21:30
@nvartolomei nvartolomei force-pushed the nv/clamp-upload-to-committed-offset branch from 63aff06 to 15ac281 Compare May 8, 2024 21:30
@nvartolomei
Copy link
Contributor Author

At the risk of being too verbose I have updated the code based on the suggestions from the code review. It is less convoluted now albeit at the expense on some longer words.

Copy link
Contributor

@Lazin Lazin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perfect

@piyushredpanda piyushredpanda merged commit 67a6bff into redpanda-data:dev May 10, 2024
14 of 17 checks passed
@vbotbuildovich
Copy link
Collaborator

/backport v24.1.x

@vbotbuildovich
Copy link
Collaborator

/backport v23.3.x

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/cloud-storage Shadow indexing subsystem area/redpanda
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants