archival: handle deletion of topics with spilled over manifests #11003

Merged: 18 commits merged into redpanda-data:dev from delete-with-spillover on Jun 20, 2023

Conversation

@VladLazar (Contributor) commented May 24, 2023

This PR updates the topic deletion logic to make it aware of spillover manifests.
Deletion of spillover manifests and their segments is done as part of the scrub,
which means it can be retried as many times as required.

The general approach (a code sketch follows the list) is to:

  1. Issue ListObjects request to find all manifests
  2. Starting with the current manifests, iterate over all manifests found
    in the bucket and delete all their associated segments and, finally,
    the manifest itself.
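
A hedged sketch of this flow, written against a hypothetical minimal client interface rather than Redpanda's actual `cloud_storage::remote` API (all names below are illustrative):

```cpp
#include <string>
#include <vector>

// Hypothetical minimal interface over the object store; not the real API.
struct cloud_client {
    virtual ~cloud_client() = default;
    // Keys of every manifest under the partition prefix
    // (the current manifest plus any spillover manifests).
    virtual std::vector<std::string> list_manifests(const std::string& prefix) = 0;
    // Segment keys referenced by a given manifest.
    virtual std::vector<std::string> segments_of(const std::string& manifest_key) = 0;
    virtual void delete_object(const std::string& key) = 0;
};

void purge_partition(cloud_client& client, const std::string& prefix) {
    // 1. Issue a ListObjects request to find all manifests.
    auto manifests = client.list_manifests(prefix);

    // 2. For each manifest (starting with the current one), delete its
    //    segments first and the manifest itself last.
    for (const auto& manifest : manifests) {
        for (const auto& segment : client.segments_of(manifest)) {
            client.delete_object(segment);
        }
        client.delete_object(manifest);
    }
}
```

Deleting each manifest only after its segments means an interrupted scrub can be retried: the next listing rediscovers any partially purged manifest, which is what makes the "retried as many times as required" property above hold.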

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.1.x
  • v22.3.x
  • v22.2.x

Release Notes

  • none

@VladLazar force-pushed the delete-with-spillover branch 5 times, most recently from fbc00d8 to f75c22e on May 30, 2023 14:22
@VladLazar (Contributor, Author) commented:

/ci-repeat
release
skip-unit
dt-repeat=5
dt-log-level=trace
tests/rptest/tests/topic_delete_test.py::TopicDeleteCloudStorageTest

@VladLazar added the area/cloud-storage (Shadow indexing subsystem) label on May 30, 2023
@VladLazar (Contributor, Author) commented:

/ci-repeat
release
skip-unit
dt-repeat=5
dt-log-level=trace
tests/rptest/tests/topic_delete_test.py::TopicDeleteCloudStorageTest

@VladLazar (Contributor, Author) commented:

/ci-repeat
release
skip-unit
dt-repeat=3
dt-log-level=trace
tests/rptest/tests/topic_delete_test.py::TopicDeleteCloudStorageTest

@VladLazar (Contributor, Author) commented:

/ci-repeat
release
skip-unit
dt-repeat=2
dt-log-level=trace
tests/rptest/tests/topic_delete_test.py::TopicDeleteCloudStorageTest

@VladLazar marked this pull request as ready for review on May 31, 2023 12:19
@VladLazar requested review from jcsp and andijcr on May 31, 2023 12:19
@andijcr (Contributor) previously approved these changes on May 31, 2023 and left a comment:


lgtm

Review threads on src/v/cloud_storage/remote_partition.cc and tests/rptest/services/redpanda.py (outdated, resolved).
@VladLazar (Contributor, Author) commented:

Some topic deletion tests are still failing on ABS (see here). Weirdly enough, they don't fail in my env. Will keep digging.

@VladLazar (Contributor, Author) commented:

/ci-repeat
release
skip-unit
dt-repeat=2
dt-log-level=trace
tests/rptest/tests/topic_delete_test.py::TopicDeleteCloudStorageTest

@VladLazar requested a review from a team as a code owner on June 1, 2023 15:50
@VladLazar requested review from andrewhsu and removed the request for a team on June 1, 2023 15:50
@VladLazar (Contributor, Author) commented:

/ci-repeat
release
skip-unit
dt-repeat=2
dt-log-level=trace
tests/rptest/tests/topic_delete_test.py::TopicDeleteCloudStorageTest

@VladLazar (Contributor, Author) commented:

/ci-repeat
release
skip-unit
dt-repeat=2
dt-log-level=trace
tests/rptest/tests/topic_delete_test.py::TopicDeleteCloudStorageTest

@VladLazar force-pushed the delete-with-spillover branch 2 times, most recently from 8b365d9 to 84a38c4 on June 16, 2023 13:15
@VladLazar (Contributor, Author) commented:

/ci-repeat
release
skip-unit
dt-repeat=10
dt-log-level=trace
tests/rptest/tests/topic_delete_test.py::TopicDeleteCloudStorageTest

@VladLazar (Contributor, Author) commented:

/ci-repeat

Vlad Lazar added 18 commits June 19, 2023 15:23
Previously, each vector was copied (despite `std::move` being called) due
to constness. This patch removes the extra copies.
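
For context, a minimal standalone illustration of the pitfall (not the PR's actual types): `std::move` applied to a const object yields a `const T&&`, which cannot bind to the move constructor, so the copy constructor runs silently instead.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical example type; the real code deals with vectors of segment data.
struct batch {
    const std::vector<std::string> keys; // const member blocks moving
};

std::vector<std::string> take_keys(batch b) {
    // std::move(b.keys) is a const rvalue, so the vector is copied, not moved.
    return std::move(b.keys);
}
```

Dropping the constness (or accepting the data by non-const value or rvalue reference) lets the move constructor take over and removes the extra copy.
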
This commit updates `remote_partition::erase` to take the manifest path
as an argument. This is used by a future commit that deals with the
removal of spillover manifests.

The final call to remove the partition manifest is removed from the
scrubber as `remote_partition::erase` already does this.
This commit updates the scrubber to delete spillover manifests and their
associated segments. The general strategy is:
1. Issue ListObjects request to find all manifests
2. Starting with the current manifests, iterate over all manifests found
   in the bucket and delete all their associated segments and, finally,
   the manifest itself.
Previously, `remote_partition::erase` created its own retry chain node
as the usual one can't be used (stop was called). While that's true,
`erase` is now always wrapped by another caller which has a valid retry
chain node. This commit updates `erase` to accept an rtc from said
callers.
This commit introduces a new retry strategy for `retry_chain_node` which
disallows any retries.
This commit binds the retry strategy to a given instance of a retry
chain node. The default behaviour is for non-root nodes to inherit the
retry strategy from their parent. New constructors were added to permit
overriding of the retry strategy. Nothing should change for pre-existing
code that uses retry chain node.
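
A simplified, hypothetical sketch of the shape this gives the API; the constructor signatures and the `backoff` name are assumptions, and only `retry_strategy::disallow` is named by the commits:

```cpp
#include <optional>

// Illustrative only; not Redpanda's actual retry_chain_node implementation.
enum class retry_strategy { backoff, disallow };

class retry_chain_node {
public:
    // Root node: the strategy is chosen explicitly (defaulting to retries on).
    explicit retry_chain_node(retry_strategy strategy = retry_strategy::backoff)
      : _parent(nullptr)
      , _strategy(strategy) {}

    // Child node: inherits the parent's strategy unless one is supplied,
    // which is how the new constructors permit overriding it.
    explicit retry_chain_node(
      retry_chain_node& parent,
      std::optional<retry_strategy> strategy = std::nullopt)
      : _parent(&parent)
      , _strategy(strategy.value_or(parent._strategy)) {}

    bool retries_allowed() const { return _strategy != retry_strategy::disallow; }

private:
    retry_chain_node* _parent;
    retry_strategy _strategy;
};
```

Under a scheme like this, the scrubber (next commit) can construct its child node with `retry_strategy::disallow` while the rest of the chain keeps its existing behaviour.
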
This commit tidies up the retry chain node usage on the partition scrub
code path and uses `retry_strategy::disallow` to skip any retries for
the scrub's S3 operations. Retries are counterproductive here as they
eat up time from the total partition purge timeout. Scrubbing has a
higher-level replay mechanism via the lifecycle markers in the topic
table.
Previously, when an ntp was deleted, one of the replicas would attempt
to delete the segments pointed to by the current manifest and the
manifest itself.

With spillover manifests this makes less sense, as this best-effort
pass does not attempt to delete all the data. Changing it to perform the
full purge could destabilise the cluster if deletion occurs during a
busy period.

Instead, this commit removes that code path; deletion of cloud data is
performed only via the scrubbing mechanism, as that will only(1) kick in
when there are no outstanding requests to cloud storage.

(1) - if cloud_storage_housekeeping_interval_ms elapses without a scrub,
then it will run regardless of the system's state.
Previously, updates to `cloud_storage_housekeeping_interval_ms` only
came into effect after the currently scheduled housekeeping ran. This
made it difficult to use in tests.
Previously, the idle timer would not fire if the shard had no
`cloud_storage::remote` activity. This was problematic because it
prevented topics from being purged when cloud storage was idling.

This commit fixes the issue by kicking off the timer in
`upload_housekeeping_service::start` and by re-arming it in the idle
timer callback. Delaying the timer on cloud storage activity is still
done from the background loop, as before.
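
A hedged sketch of the re-arming pattern described here, using a plain Seastar timer; the class and method names are illustrative and this is not the actual `upload_housekeeping_service` code:

```cpp
#include <seastar/core/lowres_clock.hh>
#include <seastar/core/timer.hh>

#include <chrono>

class idle_monitor {
public:
    explicit idle_monitor(std::chrono::milliseconds idle_timeout)
      : _idle_timeout(idle_timeout) {
        _idle_timer.set_callback([this] {
            on_idle();
            // Re-arm from the callback so the timer keeps firing even when
            // there is no cloud storage activity to reset it.
            _idle_timer.arm(_idle_timeout);
        });
    }

    // Called from start(): kick the timer off so an idle shard is still
    // detected even if cloud_storage::remote never sees a request.
    void start() { _idle_timer.arm(_idle_timeout); }

    // Called from the background loop on cloud storage activity, as before.
    void note_activity() {
        _idle_timer.rearm(seastar::lowres_clock::now() + _idle_timeout);
    }

private:
    void on_idle() { /* resume housekeeping, allow purges to run, ... */ }

    std::chrono::milliseconds _idle_timeout;
    seastar::timer<seastar::lowres_clock> _idle_timer;
};
```
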
Previously, `upload_housekeeping_service` would not transition out of
the paused state even if the cloud storage API was idle (it would only
do so when the epoch timer fired).
This patch introduces a grace period between the insertion of the
lifecycle marker to the topic table and the start of the bucket scrub.
The intent is to avoid the race between the finalise stage of partitions
(in which the manifest may be reuploaded) and the scrubber. The grace
period is controlled via the `cloud_storage_topic_purge_grace_period_ms`
cluster config.
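
A minimal sketch of the grace-period check, assuming the scrubber knows when the lifecycle marker was created; the function and parameter names are illustrative, only the `cloud_storage_topic_purge_grace_period_ms` config name comes from the commit:

```cpp
#include <chrono>

// Returns true once the purge may proceed: enough time has passed since the
// lifecycle marker was added to the topic table that a concurrent partition
// finalise (which may re-upload the manifest) can no longer race with the scrub.
bool grace_period_elapsed(
  std::chrono::steady_clock::time_point marker_created_at,
  std::chrono::milliseconds grace_period // cloud_storage_topic_purge_grace_period_ms
) {
    return std::chrono::steady_clock::now() - marker_created_at >= grace_period;
}
```
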
When an ABS container is deleted, the blobs left in it are marked for
GC, so there's no need to empty the container out first. This behaviour
differs from S3.

This commit updates the end-of-test bucket deletion to take advantage of
the above.
This commit extends the `BucketView` utility to make it aware of
spillover manifests. The following changes were made:
* `_do_listing` now downloads spillover manifests and caches them in the
  BucketViewState instance
* Utility functions `get_spillover_metadata` and
  `get_spillover_manifests` were added.
This commit extends the cloud storage topic deletion tests to exercise
partitions with spillover manifests.

`SISettings` has also been updated to accommodate the spillover cluster
config.
@VladLazar (Contributor, Author) commented:

/ci-repeat
release
skip-unit
dt-repeat=10
dt-log-level=trace
tests/rptest/tests/topic_delete_test.py::TopicDeleteCloudStorageTest

@VladLazar (Contributor, Author) commented:

/ci-repeat

```cpp
@@ -71,33 +74,196 @@ ss::future<scrubber::purge_result> scrubber::purge_partition(
  archival_log.error,
  "Remote delete disabled in tombstone on {}, refusing to purge",
  ntp);
co_return result;
co_return purge_result{.status = purge_status::success, .ops = 0};
```
A contributor commented:

Do we also need to check the cluster metadata to avoid situation when the marker is deleted but the partition is not deleted yet?

@VladLazar (Contributor, Author) replied:

The lifecycle markers in the topic table are deleted after the purge is complete, so that shouldn't be possible. I'm not too sure what you mean though.


```cpp
std::optional<ss::sstring> current_serde;
std::optional<ss::sstring> current_json;
std::vector<ss::sstring> spillover;
```
A contributor commented:

nit: deque/fragmented_vector

@VladLazar (Contributor, Author) replied:

Will do in a follow-up

@jcsp merged commit 4036aed into redpanda-data:dev on Jun 20, 2023
17 checks passed
Labels: area/cloud-storage (Shadow indexing subsystem), area/redpanda