Tracking issue: #18479
Depends on: #38465 (skeleton, merged via #38534)
Summary
Second sub-issue under the Kafka Streams Runner GSoC 2026 project. The
skeleton from #38465 currently throws UnsupportedOperationException
for every transform URN. This issue adds:
- A URN-dispatch framework in KafkaStreamsPipelineTranslator: a
Map<String, TransformTranslator> populated at construction time, with
the pipeline walked via QueryablePipeline in topological order (same
shape as FlinkStreamingPortablePipelineTranslator).
- The first concrete translator: Impulse (beam:transform:impulse:v1).
Per the design doc §4.1, this uses a dedicated bootstrap topic (e.g.
__beam_impulse) so Kafka Streams has a real source to consume from,
emits one empty byte[] element, records in a state store that it has
already fired, and advances the watermark to
BoundedWindow.TIMESTAMP_MAX_VALUE.
- The translation context starts holding the Kafka Streams Topology
being built and the Map<String, String> from PCollection ID to
processor node name.
After this issue, an Impulse-only pipeline translates and starts a
Kafka Streams topology. Pipelines containing any other URN still fail
fast with "No translator registered for URN ..."; the message format is
unchanged from #38465.
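The dispatch and fail-fast behaviour can be illustrated without Beam or Kafka Streams on the classpath. The sketch below is library-free and only shows the shape: UrnDispatchSketch, TransformNode, Context and the "impulse-" node naming are hypothetical stand-ins, the input list is assumed to be already in topological order (the real translator would get that order from QueryablePipeline), and the exception type and the exact text after "URN" are assumptions, since the issue only gives the message prefix.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Library-free sketch of URN dispatch; no type here is a Beam or Kafka Streams class. */
final class UrnDispatchSketch {

  static final String IMPULSE_URN = "beam:transform:impulse:v1";

  /** Stand-in for a pipeline node: transform ID, URN, and one output PCollection ID. */
  static final class TransformNode {
    final String id;
    final String urn;
    final String outputPCollectionId;

    TransformNode(String id, String urn, String outputPCollectionId) {
      this.id = id;
      this.urn = urn;
      this.outputPCollectionId = outputPCollectionId;
    }
  }

  /** Stand-in for the translation context (the real one would also hold the Topology). */
  static final class Context {
    final Map<String, String> pcollectionIdToProcessorName = new LinkedHashMap<>();
  }

  /** Single-method translator interface, one implementation per URN. */
  interface TransformTranslator {
    void translate(TransformNode node, Context context);
  }

  private final Map<String, TransformTranslator> translators;

  UrnDispatchSketch() {
    // The registry is filled once, at construction time.
    Map<String, TransformTranslator> registry = new HashMap<>();
    registry.put(
        IMPULSE_URN,
        (node, context) ->
            context.pcollectionIdToProcessorName.put(
                node.outputPCollectionId, "impulse-" + node.id)); // hypothetical node name
    translators = Collections.unmodifiableMap(registry);
  }

  /** Translates nodes that the caller has already sorted topologically. */
  Context translate(List<TransformNode> topologicallyOrderedNodes) {
    Context context = new Context();
    for (TransformNode node : topologicallyOrderedNodes) {
      TransformTranslator translator = translators.get(node.urn);
      if (translator == null) {
        // Fail fast on the first URN that has no registered translator.
        throw new UnsupportedOperationException(
            "No translator registered for URN " + node.urn);
      }
      translator.translate(node, context);
    }
    return context;
  }
}
```

Keeping the registry immutable after construction means adding a translator in a later sub-issue is a one-line change in the constructor, and an unknown URN is rejected before any later node is translated.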
Design doc reference
Portable Kafka Streams Runner for Apache Beam — design doc §4.1, §11.5.
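Scope
- KafkaStreamsTranslationContext: hold Topology, plus
Map<String, String> pcollectionIdToProcessorName and accessors.
- KafkaStreamsPipelineTranslator: Map<String, TransformTranslator>
populated at construction; walk via QueryablePipeline / topological
order.
- TransformTranslator interface (single method
translate(PTransformNode, RunnerApi.Pipeline, KafkaStreamsTranslationContext)).
- translation/ImpulseTranslator implementing the dedicated-topic
pattern from design doc §4.1.
- Bootstrap topic: create via AdminClient or require it to be
pre-created (open question 12.1; pick one with a note).
- KafkaStreamsPipelineRunner.run: actually call KafkaStreams.start()
on the built topology (instead of throwing). Returns a
PortablePipelineResult that tracks the KafkaStreams instance state.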
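For the pipeline result that tracks the KafkaStreams instance state, one possible state mapping is sketched below. It is library-free: the two enums only mirror constant names from KafkaStreams.State and Beam's PipelineResult.State, ResultStateSketch and toResultState are hypothetical names, and every mapping choice (for example NOT_RUNNING to DONE) is an assumption of this sketch rather than something the design doc prescribes.

```java
/** Library-free sketch; the enums only mirror names from Kafka Streams and Beam. */
final class ResultStateSketch {

  /** Mirrors the constant names of org.apache.kafka.streams.KafkaStreams.State. */
  enum StreamsState {
    CREATED, REBALANCING, RUNNING, PENDING_SHUTDOWN, NOT_RUNNING, PENDING_ERROR, ERROR
  }

  /** Mirrors a subset of org.apache.beam.sdk.PipelineResult.State. */
  enum ResultState { STOPPED, RUNNING, DONE, FAILED }

  /** One possible mapping; each case is an assumption of this sketch. */
  static ResultState toResultState(StreamsState state) {
    switch (state) {
      case CREATED:
        return ResultState.STOPPED; // not started yet
      case REBALANCING:
      case RUNNING:
      case PENDING_SHUTDOWN:
        return ResultState.RUNNING; // still holding resources
      case NOT_RUNNING:
        // A real result might distinguish a user-initiated close (CANCELLED).
        return ResultState.DONE;
      case PENDING_ERROR:
      case ERROR:
        return ResultState.FAILED;
      default:
        throw new IllegalArgumentException("Unhandled state: " + state);
    }
  }
}
```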
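Acceptance criteria
- ./gradlew :runners:kafka-streams:check green.
- TopologyTestDriver: build an Impulse-only pipeline proto, translate
it, run the resulting topology, and assert that exactly one empty
byte[] element appears at the output node and is not re-emitted on
restart.
- Pipelines containing any other URN still fail with the
"No translator registered for URN ..." message from #38465.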
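The fire-once behaviour Impulse needs (one empty byte[] element, a state-store flag, no re-emission after a restart) can be modelled without Kafka Streams. The sketch below is library-free: a plain Map stands in for the persistent state store, and ImpulseFireOnceSketch, FIRED_KEY and process() are hypothetical names, not the planned ImpulseTranslator. Watermark advancement is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Library-free sketch of fire-once Impulse semantics. */
final class ImpulseFireOnceSketch {

  /** Hypothetical key under which the "already fired" flag is kept. */
  static final String FIRED_KEY = "impulse-fired";

  // Stand-in for a durable key-value state store that survives restarts.
  private final Map<String, Boolean> stateStore;
  private final List<byte[]> output = new ArrayList<>();

  ImpulseFireOnceSketch(Map<String, Boolean> stateStore) {
    this.stateStore = stateStore;
  }

  /** Called once per record read from the bootstrap topic. */
  void process() {
    if (Boolean.TRUE.equals(stateStore.get(FIRED_KEY))) {
      return; // already fired, possibly before a restart
    }
    output.add(new byte[0]); // the single empty impulse element
    stateStore.put(FIRED_KEY, true);
    // A real translator would also advance the watermark to
    // BoundedWindow.TIMESTAMP_MAX_VALUE here; omitted in this sketch.
  }

  List<byte[]> output() {
    return output;
  }
}
```

Passing the same map to a second instance simulates a restart against the same store: the second instance sees the flag and emits nothing.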
Out of scope (deferred to follow-up sub-issues)
- ExecutableStage / stateless ParDo (next sub-issue).
- GroupByKey, Combine, Window assignment, Flatten.
- Watermark manager (per Jan: "watermark manager comes last when GBK
forces it").
- Splittable DoFn.
Reference implementation
runners/flink/2.0/src/main/java/.../FlinkStreamingPortablePipelineTranslator.java
— URN dispatch map pattern.
runners/flink/2.0/src/main/java/.../translation/.../ImpulseSourceFunction.java
(and surrounding wiring) for the Flink-side analog. Kafka Streams
needs a different approach because KS requires a real input topic.
cc @je-ik