transform: gc committed offsets #16526

rockwotj · 2024-02-07T18:14:08Z

Prior to this patch set, it was not possible to clean up committed offsets. This meant that if many transforms were created and then deleted, one could run into the limit with no escape hatch. This patch set implements an escape hatch to clean up offsets.

This is implemented as a manual GC mechanism because GC can race with deploys and delete offsets that are still in use. One way around that is to expose a max transform ID (or something similar) from the controller log, so that we never delete any IDs that the target shard has not yet seen. Since IDs are assigned on first deploy from the deploy's offset within the controller log, we could do this using the committed offset of the controller log. However, instead of implementing that and running it periodically, we just make it a manual endpoint to hit; automatic GC could be added at a later date.
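To illustrate the race and the watermark idea described above, here is a minimal sketch using standard containers. It is not the code in this patch set: `transform_id`, `offset_store`, `gc_committed_offsets` and `max_seen_id` are all hypothetical names, and the real implementation works through RPCs and a replicated state machine rather than a local map.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <optional>
#include <set>
#include <utility>

// Hypothetical stand-ins for illustration; not Redpanda's real types.
using transform_id = std::int64_t;
using partition_id = std::int32_t;
using offset_key = std::pair<transform_id, partition_id>;
using offset_store = std::map<offset_key, std::int64_t>;

// Removes committed offsets whose transform ID is not in `live`.
// If `max_seen_id` is set, entries with an ID above it are never removed,
// because they may belong to a deploy the caller has not observed yet.
// Returns the number of entries removed.
inline std::size_t gc_committed_offsets(
  offset_store& store,
  const std::set<transform_id>& live,
  std::optional<transform_id> max_seen_id = std::nullopt) {
    std::size_t removed = 0;
    for (auto it = store.begin(); it != store.end();) {
        const transform_id id = it->first.first;
        const bool maybe_unseen = max_seen_id.has_value()
                                  && id > *max_seen_id;
        if (!maybe_unseen && live.count(id) == 0) {
            it = store.erase(it); // erase returns the next valid iterator
            ++removed;
        } else {
            ++it;
        }
    }
    return removed;
}
```

With `live = {1}` and a watermark of 3, offsets for transform 2 would be removed while offsets for transform 5 would be kept, since 5 is above the watermark and could belong to a deploy that has not been seen yet. Calling the function without a watermark corresponds to the manual mode, where anything not in `live` is removed and the caller accepts the race.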

This is a followup to: #16185

Backports Required

Release Notes

none

vbotbuildovich · 2024-02-08T20:13:48Z

new failures in https://buildkite.com/redpanda/redpanda/builds/44903#018d8a18-b4e3-4667-bc3f-e7586c03af85:

"rptest.tests.consumer_group_recovery_test.ConsumerOffsetsRecoveryTest.test_consumer_offsets_partition_recovery"

bharathv

just a high level thought on holding write lock, lgtm otherwise.

bharathv · 2024-02-20T22:40:53Z

src/v/cluster/distributed_kv_stm.h

+
+    ss::future<errc> remove_all(ss::noncopyable_function<bool(Key)> pred) {
+        auto holder = _gate.hold();
+        auto units = co_await _snapshot_lock.hold_read_lock();


I feel like this could hold a write lock since remove_all seems like a rare operation and the side effects of it are easier to explain if the entire operation sees a consistent snapshot of the state.

rockwotj · 2024-03-04

Yeah, it is a rare case, so it is probably cleaner to make it see a consistent state. I've pushed an update.
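For readers unfamiliar with the tradeoff discussed in this thread, the following is a simplified sketch of a predicate-based batch removal that takes a lock exclusively. It is an assumption-laden illustration, not the real `distributed_kv_stm`: the class `kv_store` and its methods are hypothetical, it uses `std::shared_mutex` and blocking calls instead of Seastar futures, and the real state machine's locking scheme may assign different roles to its read and write locks.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <mutex>
#include <optional>
#include <shared_mutex>
#include <string>
#include <utility>

// Hypothetical, simplified key-value store used only for illustration.
template<typename Key, typename Value>
class kv_store {
public:
    void put(const Key& key, Value value) {
        std::unique_lock lock(_mu);
        _kvs[key] = std::move(value);
    }

    std::optional<Value> get(const Key& key) const {
        std::shared_lock lock(_mu); // readers may run concurrently
        auto it = _kvs.find(key);
        if (it == _kvs.end()) {
            return std::nullopt;
        }
        return it->second;
    }

    // Removes every entry whose key matches `pred` and returns the count.
    // The lock is held exclusively for the whole scan, so the predicate is
    // evaluated against one consistent snapshot of the map. This is cheap to
    // justify because batch removal is expected to be rare.
    std::size_t remove_all(const std::function<bool(const Key&)>& pred) {
        std::unique_lock lock(_mu);
        std::size_t removed = 0;
        for (auto it = _kvs.begin(); it != _kvs.end();) {
            if (pred(it->first)) {
                it = _kvs.erase(it);
                ++removed;
            } else {
                ++it;
            }
        }
        return removed;
    }

private:
    mutable std::shared_mutex _mu;
    std::map<Key, Value> _kvs;
};
```

The design choice mirrors the review comment: holding the lock exclusively blocks other operations for the duration of the scan, but it makes the side effects of the removal easier to reason about.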

bharathv

lgtm.

vbotbuildovich · 2024-03-04T18:44:06Z

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/45625#018e0a7c-58db-4ee6-ab18-aceb9539c21d

github-actions bot added area/redpanda area/wasm WASM Data Transforms labels Feb 7, 2024

rockwotj self-assigned this Feb 7, 2024

rockwotj force-pushed the gc-committed-offsets branch from ea93e12 to 31343d0 Compare February 7, 2024 19:55

rockwotj marked this pull request as ready for review February 8, 2024 02:12

rockwotj force-pushed the gc-committed-offsets branch from 31343d0 to 297864d Compare February 8, 2024 17:46

rockwotj requested review from dotnwat and bharathv February 8, 2024 17:46

bharathv previously approved these changes Feb 20, 2024

rockwotj dismissed bharathv’s stale review via 8776bf6 March 4, 2024 15:05

rockwotj force-pushed the gc-committed-offsets branch from 297864d to 8776bf6 Compare March 4, 2024 15:05

rockwotj requested a review from bharathv March 4, 2024 15:05

bharathv reviewed Mar 4, 2024

rockwotj added 8 commits March 4, 2024 10:30

c/kv_stm: support batch removals via a predicate

7f5bfcf

Signed-off-by: Tyler Rockwood <rockwood@redpanda.com>

transform/rpc: support deleting offsets

1d53295

Based on a set of transform IDs, this will be used to GC deleted transforms.

Signed-off-by: Tyler Rockwood <rockwood@redpanda.com>

transform/rpc/client: support deleting committed offsets

2907c4e

This will be used to GC offsets when transforms are deleted.

Signed-off-by: Tyler Rockwood <rockwood@redpanda.com>

transform: support gc of committed offsets

55aab38

Note the comments in `transform::service::garbage_collect_committed_offsets` about the consistency of these operations.

Signed-off-by: Tyler Rockwood <rockwood@redpanda.com>

admin/transform: add endpoint to gc committed offsets

5f7b1f0

With all the caveats mentioned in the `transform::service` documentation.

Signed-off-by: Tyler Rockwood <rockwood@redpanda.com>

rptest/transform: test deleting committed offsets

8a14eaf

This makes sure our debug endpoint to clean up offsets works.

Signed-off-by: Tyler Rockwood <rockwood@redpanda.com>

transform: improve service comments

c7f8957

Signed-off-by: Tyler Rockwood <rockwood@redpanda.com>

c/distributed_kv_stm: document some methods

cfef68c

In accordance with new documentation standards, document some more methods in this stm.

Signed-off-by: Tyler Rockwood <rockwood@redpanda.com>

rockwotj force-pushed the gc-committed-offsets branch from 8776bf6 to cfef68c Compare March 4, 2024 16:30

rockwotj requested a review from bharathv March 4, 2024 16:30

bharathv approved these changes Mar 4, 2024

rockwotj merged commit f8dd25f into redpanda-data:dev Mar 4, 2024
17 checks passed

rockwotj deleted the gc-committed-offsets branch March 4, 2024 20:16
