Wire transform::logging::manager into transform API, rpc client
#16485
Conversation
Force-pushed 2bf0e5f to 262b54d
Force-pushed 93d623b to f489326
@rockwotj - Fair number of open-ended-ish TODOs/questions on here still, but I think the overall structure is worth reviewing if/when you have time.
Force-pushed f489326 to 80c47fc
new failures in https://buildkite.com/redpanda/redpanda/builds/44790#018d81be-4e06-40e8-97e9-070a4d4c8b9c:
new failures in https://buildkite.com/redpanda/redpanda/builds/44790#018d81be-4dfc-4ad3-b32d-94260114f5e0:
new failures in https://buildkite.com/redpanda/redpanda/builds/44790#018d81be-4e00-4f31-aa2e-970722e3ff59:
new failures in https://buildkite.com/redpanda/redpanda/builds/44790#018d81be-4e03-49af-8164-bc0b64067eb4:
new failures in https://buildkite.com/redpanda/redpanda/builds/44790#018d81d0-2abd-4d44-924f-b11d98f8ac73:
new failures in https://buildkite.com/redpanda/redpanda/builds/44790#018d81d0-2ab7-4dfa-afa4-26c43a5b42a5:
new failures in https://buildkite.com/redpanda/redpanda/builds/44790#018d81d0-2aba-4468-92b6-c0346f96cf21:
new failures in https://buildkite.com/redpanda/redpanda/builds/44833#018d850e-573d-4a7a-ad56-883698ba81d7:
new failures in https://buildkite.com/redpanda/redpanda/builds/44833#018d850a-4b76-4213-960b-f2f0d8e55d18:
new failures in https://buildkite.com/redpanda/redpanda/builds/44921#018d8b3e-c49d-4f33-aeff-e2d4273735e3:
new failures in https://buildkite.com/redpanda/redpanda/builds/44921#018d8b3e-c497-4199-963c-c72f2a2d127b:
new failures in https://buildkite.com/redpanda/redpanda/builds/44953#018d8f72-6cc6-4997-ba6e-74ac3c443dbd:
new failures in https://buildkite.com/redpanda/redpanda/builds/44953#018d8f72-6cc5-4e5b-b4c5-440523fa56a4:
new failures in https://buildkite.com/redpanda/redpanda/builds/44953#018d8f72-6cc5-4d99-b675-087e8654b011:
new failures in https://buildkite.com/redpanda/redpanda/builds/44953#018d8f72-6cc6-4e41-9f25-241c2a274785:
new failures in https://buildkite.com/redpanda/redpanda/builds/44953#018d8f72-6cc5-4c24-bb12-0a8179fd358f:
Force-pushed 80c47fc to 5a807ac
force push to fix
First pass. Just a couple of small things, otherwise looks good!
inline const model::topic_namespace transform_log_internal_nt(
  model::kafka_namespace, model::transform_log_internal_topic);

inline bool is_user_topic(topic_namespace_view tp_ns) {
TODO: understand the ramifications of this...
yeah seriously...I'll write something up today or tomorrow
Sorry, this was for myself, but yeah I like @nvartolomei's suggestion to instead have topic properties for the stuff we want to enforce, so things like redpanda.canproduce, redpanda.candelete, etc.
I think the effects of this particular thing are even a bit more subtle, but yeah, I like that suggestion too.
Are we talking about the proliferation of special topic names, and collections of names with a growing list of special behaviors and traits?
Yes, I believe we are. This topic, the schemas topic, and the audit log topic all require special treatment. Is this a leading question?
Are we talking about the proliferation of special topic names, and collections of names with a growing list of special behaviors and traits?

Yes, and it's not clear what happens when one adds something to this list. Anyway, not for this PR, but something we should clean up at some point.
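For illustration, a minimal sketch of the pattern being discussed; aside from transform_log_internal_nt, the names below (schemas_internal_nt, audit_log_internal_nt) are hypothetical stand-ins, not identifiers from this PR:

```
// "User topic" is currently defined by exclusion from a growing list of
// special internal topics (illustrative only).
inline bool is_user_topic(model::topic_namespace_view tp_ns) {
    return tp_ns != transform_log_internal_nt // this PR's logging topic
           && tp_ns != schemas_internal_nt    // hypothetical
           && tp_ns != audit_log_internal_nt; // hypothetical
}
// The suggestion above flips this around: attach per-topic properties such as
// redpanda.canproduce / redpanda.candelete and have enforcement sites read
// those, so adding a new internal topic doesn't mean updating every list.
```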
size_t estimate_record_size(size_t ksize, size_t vsize) {
    // NOTE(oren): a size estimate for vint fields on an individual record:
    //   - key size (known)
    //   - value size (known)
    //   - offset delta (pessimize)
    //   - timestamp delta (always nil)
    //   - headers size (always nil)
    //   - record size in bytes (calculate based on sum of above)
    // NOTE(oren): this seems to overestimate the size of the final batch
    // by ~1%. That's probably acceptably conservative to avoid exceeding
    // configured limits without losing too much in terms of maximizing
    // batch size
    constexpr size_t base_overhead_est
      = sizeof(model::record_attributes::type) // attrs
        + vint::max_length    // offset delta, ultra-conservative
        + vint::vint_size(0)  // timestamp_delta
        + vint::vint_size(0); // headers size
    auto sz = vint::vint_size(static_cast<int64_t>(ksize)) // key size
              + ksize
              + vint::vint_size(static_cast<int64_t>(vsize)) // val size
              + vsize
              + base_overhead_est;
    return vint::vint_size(static_cast<int64_t>(sz)) + sz;
}
Not your fault, but we really need to centralize these sorts of calculations.
Can we either:
- Create record objects (even if only temporary) to compute the size, then release the key and value into the batch builder (or just hold a list of records until we roll)? I don't know if we want to over-index on performance here.
- Create a ticket to move all our batch building stuff into the model namespace so it can be shared. Feels weird to have to depend on storage anyways.

IIRC the kafka client just creates record objects and at the last second turns them into batches; I think that would be fine here.
Yeah, it's pretty awful... would be happy to remove, but I was kind of surprised not to find anything more reusable already existing anywhere.
kafka client just creates record objects and at the last second turns them into batches
I must have missed that, will check it out
Wound up with something like:
- in append, create a record_batch with a single record
- estimate the size of the record therein
- perform all the same checks and move the record's guts into the "real" batch_builder

I think this is more or less in the spirit of what the internal kafka client is doing, but I might be misreading.
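If it helps, a rough sketch of that flow, assuming a builder along the lines of storage::record_batch_builder; the helper name measure_one and the exact calls are illustrative, not the PR's code:

```
// Illustrative: measure a record's serialized size by building a throwaway
// single-record batch, then subtract the fixed batch header so only the
// record's own bytes remain. The caller can then run its size checks and
// move the key/value into the "real" builder.
size_t measure_one(const iobuf& key, const iobuf& value) {
    storage::record_batch_builder probe(
      model::record_batch_type::raft_data, model::offset{0});
    probe.add_raw_kv(key.copy(), value.copy());
    return std::move(probe).build().size_bytes()
           - model::packed_record_batch_header_size;
}
```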
for (size_t i = 0; i < static_cast<size_t>(config->partition_count);
     ++i) {
    model::partition_id candidate(
      (sstring_hash{}(name()) + i)
Oh, something I forgot to catch here: absl hashing is not a persistent hash. Another way to say that is that it is seeded differently for every process. Let's use murmur2 or something.
Consider:
redpanda/src/v/cluster/distributed_kv_stm.h, lines 204 to 205 in 510f459:
auto result = model::partition_id{
  murmur2(bytes.c_str(), bytes.length()) % num_partitions.value()};
nice one, thanks
Oh, something I forgot to catch here: absl hashing is not a persistent hash. Another way to say that is that it is seeded differently for every process. Let's use murmur2 or something.

Yikes. That's an easy and devastating mistake to make.
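For reference, a sketch of the swap being suggested: murmur2 (as used in distributed_kv_stm.h above) in place of the process-seeded absl hash. The surrounding loop mirrors the quoted snippet; the byte conversion is illustrative, not the PR's code.

```
// Illustrative: murmur2 gives the same value for the same bytes in every
// process, so each broker maps a given transform name to the same starting
// partition; absl's hash is seeded per process and does not.
auto bytes = ssx::sformat("{}", name());
for (size_t i = 0; i < static_cast<size_t>(config->partition_count); ++i) {
    model::partition_id candidate(
      (murmur2(bytes.c_str(), bytes.length()) + i)
      % static_cast<size_t>(config->partition_count));
    // ... probe `candidate` as before
}
```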
To create the transform logs topic on demand. Also adds code in log_manager.cc to perform this operation lazily at flush time. Additionally, update logging::client to return transform::logging::errc (or a result<> thereof). This becomes increasingly useful for error reporting as we start hooking things up to the rest of the cluster. Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
A comment on model::packed_record_batch_header_size reflected a presumably outdated value for the constant. Was 57, should be 61. Replaces that comment with a static_assert to avoid similar errors in future. Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
This is slightly easier to test and reason about, as well as being nominally less costly in terms of buffer capacity. Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
```
record(
  record_attributes attributes,
  int64_t timestamp_delta,
  int32_t offset_delta,
  iobuf key,
  iobuf value,
  std::vector<record_header> hdrs)
```
Force-pushed acc4eff to d40b5a7
force push contents:
Encapsulates a storage::record_batch_builder and aims to build up batches of log records. Batches are kept under a specified maximum size to avoid data loss. If appending a given kvp would cause a batch to exceed the specified max, we roll to a new builder, appending the previous batch to a running collection of completed, size-limited record_batches. Includes a simple unit test for constructed record_batch sizes.
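Roughly, the rolling policy described there looks like this (member and helper names are illustrative, not the class's actual interface):

```
// Illustrative: if appending this key/value would push the in-flight batch
// past the configured maximum, close ("roll") the current builder into the
// list of completed batches and start a fresh one before appending.
void record_batcher::append(iobuf key, iobuf value) {
    size_t est = estimate_record_size(key.size_bytes(), value.size_bytes());
    if (_curr_size > 0 && _curr_size + est > _max_batch_size) {
        roll();
    }
    _builder.add_raw_kv(std::move(key), std::move(value));
    _curr_size += est;
}

void record_batcher::roll() {
    _batches.push_back(std::move(_builder).build());
    _builder = make_builder(); // fresh storage::record_batch_builder
    _curr_size = 0;
}
```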
Force-pushed d40b5a7 to 6277edb
force push to remove dead code
Looks really good. Just a couple of small things at this point.
One other question I had: are we going to have a followup to clean up some of the code here?
def consume_one_log_record(self, offset=0, timeout=10) -> LogRecord:
    return self.LogRecord(
        self._rpk.consume(self.logs_topic.name,
Question: do all these rpk commands need to be wrapped in retries?
I don't really think so. The timeout gets applied to the rpk shell-out under the hood of RpkTool, and consume should just sit there until we either receive a record back OR we kill the subprocess. In the latter case we get a nice RpkException. Did you have a particular failure mode in mind?
No, just wanted to make sure these tests are as robust as possible. For Admin requests that usually means wrapping stuff in retries.
Yeah, it's a fair point. If rpk fails to establish a connection that'll immediately fail the test.
It's probably fine, we can see if any issues crop up first
https://github.com/redpanda-data/core-internal/issues/1009 Doesn't seem urgent (unless I've missed something), but the ticket is queued up for sure. I'd like to at least draft a metrics probe first.
Adapts transform::rpc::client for transform logging purposes. Implements transform::logging::client. Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
Also adds a sharded<metadata_cache> field to transform::service in service of transform::logging::rpc_client. Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
Previously a static method on a SCRAM test. Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
Includes a slightly modified identity transform that also logs the incoming record to stderr. Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
Force-pushed 6277edb to a7f6ba0
force push: few nits
CI Failures:
/ci-repeat 5
/cdt
/ci-repeat 1
/backport v23.3.x
Failed to create a backport PR to v23.3.x branch. I tried:
if (fut.failed()) {
    vlog(
      tlog.error,
      "Failed to start transform::logging::manager: {}",
      fut.get_exception());
}
co_return co_await std::move(fut);
I think this will probably crash or hang. Once you eat the exception out of the future, it's put into an invalid state. When you co_await on it, at least from my reading of coroutine.hh, seastar will wait for the future to become available, but that will never happen because it was available (before get_exception) but now it is marked invalid.
Ack
Can we just remove the as_future thing here? I don't think it adds value.
should be something like this, right?
if (fut.failed()) {
    vlog(
      tlog.error,
      "Failed to start transform::logging::manager: {}",
      fut.get_exception());
    co_return;
}
co_return co_await std::move(fut);
@rockwotj - oh yeah, that's fine. will do later
What about just co_await _log_manager->start()? I don't think we want to silently fail to start the manager.
Yeah, that's what I'm committing right now.
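For completeness, the shape of the fix settled on above; the enclosing function is illustrative, while _log_manager matches the snippets quoted in this thread:

```
// Don't consume the exception with get_exception() and then re-await the
// now-invalid future; co_await the start() future directly so any failure
// propagates to the caller instead of being silently swallowed.
ss::future<> service::start_log_manager() {
    co_await _log_manager->start();
}
```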
This commit implements the end-to-end plumbing for data transform logging. This includes:
- a wasm::logger that encapsulates a manager instance
- wiring into transform::service
- a logging::client implementation to perform record batching, forward writes to rpc::client, create the transform logging topic, and compute output partition IDs for particular transforms

Closes https://github.com/redpanda-data/core-internal/issues/1057
Closes https://github.com/redpanda-data/core-internal/issues/1058
Backports Required
Release Notes
Features
Data transform logs are now written to an internal topic (_redpanda.transform_logs). Data transform logs will no longer appear in broker logs.