KAFKA-8410: Migrating stateful operators to new Processor API #10507

jeqo · 2021-04-08T22:54:58Z

Continuation of #10381. Migration of Kafka Streams stateful operators (KTable, KStream aggregations, joins).

Committer Checklist (excluded from commit message)

Verify design and implementation
Verify test coverage and CI build status
Verify documentation (including upgrade notes)

vvcephei

Sorry it took so long for me to pick this up. Just a quick first batch of review comments.

vvcephei · 2021-05-21T15:07:47Z

streams/src/main/java/org/apache/kafka/streams/errors/DeserializationExceptionHandler.java

     * calling {@code forward()} (and some other methods) would result in a runtime exception.
     *
     * @param context processor context
     * @param record record that failed deserialization
     * @param exception the actual exception
     */
-    DeserializationHandlerResponse handle(final ProcessorContext context,
+    DeserializationHandlerResponse handle(final ProcessorContext<?, ?> context,


Oh, man. I overlooked this in the KIP, and we can't just change this in-place, as it will break any subclasses.

What we need to do is deprecate this method and introduce a new one with a default implementation that calls back here. We can update the KIP with this change, since it's a simple oversight and follows established patterns for migrating interfaces.
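A minimal sketch of that deprecate-and-delegate pattern, for illustration only. `OldContext`, `NewContext`, `Handler`, `Response`, and `LegacyHandler` are hypothetical stand-ins, not the Kafka types, and the adapter inside the default method assumes the new context can be viewed as an old one.

```java
class DeprecationSketch {
    // Hypothetical stand-ins for the old and new context types (not the Kafka classes).
    interface OldContext {
        String taskId();
    }

    interface NewContext<K, V> {
        String taskId();
    }

    enum Response { CONTINUE, FAIL }

    interface Handler {
        /** Old signature: kept abstract so existing implementations still compile. */
        @Deprecated
        Response handle(OldContext context, byte[] record, Exception exception);

        /** New signature: the default body adapts the context and calls the old method. */
        default Response handle(NewContext<?, ?> context, byte[] record, Exception exception) {
            // Assumption: the new context exposes everything the old one needs.
            OldContext adapted = context::taskId;
            return handle(adapted, record, exception);
        }
    }

    /** An implementation written against the old interface; it needs no changes. */
    static final class LegacyHandler implements Handler {
        String lastTaskId;

        @Override
        public Response handle(OldContext context, byte[] record, Exception exception) {
            lastTaskId = context.taskId();
            return Response.CONTINUE;
        }
    }

    public static void main(String[] args) {
        LegacyHandler handler = new LegacyHandler();
        NewContext<String, String> context = () -> "0_1";
        // Callers use the new overload; the legacy override still does the work.
        Response response = handler.handle(context, new byte[0], new IllegalStateException("bad bytes"));
        System.out.println(response + " for task " + handler.lastTaskId);
    }
}
```

Because the old method stays abstract and the new one is a default method, existing implementations keep compiling while new callers can already pass the new context type.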

vvcephei · 2021-05-21T15:57:08Z

streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java

-import org.apache.kafka.streams.processor.ProcessorContext;
-import org.apache.kafka.streams.processor.ProcessorSupplier;
 import org.apache.kafka.streams.processor.StreamPartitioner;
 import org.apache.kafka.streams.processor.TopicNameExtractor;
+import org.apache.kafka.streams.processor.api.ProcessorContext;
+import org.apache.kafka.streams.processor.api.ProcessorSupplier;
 import org.apache.kafka.streams.state.KeyValueStore;


Hey @jeqo, based on my understanding of the KIP, nothing in this interface should have changed. Was this changeset intentional?

vvcephei · 2021-05-21T15:57:56Z

streams/src/main/java/org/apache/kafka/streams/kstream/KTable.java

@@ -16,6 +16,7 @@
 */
 package org.apache.kafka.streams.kstream;

+import java.util.function.Function;
 import org.apache.kafka.common.serialization.Serde;


Also, the same question here: do we need any changes to this interface?

vvcephei · 2021-05-21T15:58:51Z

streams/src/main/java/org/apache/kafka/streams/kstream/ValueJoiner.java

- * @param <V1> first value type
- * @param <V2> second value type
- * @param <VR> joined value type


Also here: it doesn't seem strictly necessary to rename the generic parameters as part of this PR.

Specifically, funny story: these params used to be called V and V1, and we renamed them to V1 and V2 because we thought it made more sense :)

* Lay the groundwork for migrating KTable Processors to the new PAPI.
* Migrate the KTableFilter processor to prove that the groundwork works.

This is an effort to help break up #10507 into multiple PRs.

Reviewers: Boyang Chen <boyang@apache.org>

vvcephei · 2021-05-28T20:01:47Z

Hey @jeqo, now that #10744 is merged, what do you think about just closing this PR and breaking it up into a series of PRs that migrate one or two processors at a time?

cadonna

@jeqo I started to review the PR but haven't finished yet.

Could you please rebase the PR because it has some conflicts?

I think you should undo the changes to KStream. AFAIS they are not required for this PR and pollute the PR a lot.

Please look carefully at my comments in CogroupedStreamAggregateBuilder. I am not sure if I am missing something there or if there is a bug.

cadonna · 2021-05-31T10:58:16Z

streams/src/main/java/org/apache/kafka/streams/errors/LogAndContinueExceptionHandler.java

@@ -32,7 +32,7 @@
    private static final Logger log = LoggerFactory.getLogger(LogAndContinueExceptionHandler.class);

    @Override
-    public DeserializationHandlerResponse handle(final ProcessorContext context,
+    public DeserializationHandlerResponse handle(final ProcessorContext<?, ?> context,


Do we also need to deprecate this method and add a new one? Technically, it is a class of the public API that can be extended.

cadonna · 2021-05-31T10:58:53Z

streams/src/main/java/org/apache/kafka/streams/errors/LogAndFailExceptionHandler.java

@@ -32,7 +32,7 @@
    private static final Logger log = LoggerFactory.getLogger(LogAndFailExceptionHandler.class);

    @Override
-    public DeserializationHandlerResponse handle(final ProcessorContext context,
+    public DeserializationHandlerResponse handle(final ProcessorContext<?, ?> context,


Do we also need to deprecate this method and add a new one? Technically, it is a class of the public API that can be extended.

cadonna · 2021-05-31T11:09:46Z

streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java

-import org.apache.kafka.streams.processor.ProcessorContext;
-import org.apache.kafka.streams.processor.ProcessorSupplier;
 import org.apache.kafka.streams.processor.StreamPartitioner;
 import org.apache.kafka.streams.processor.TopicNameExtractor;
+import org.apache.kafka.streams.processor.api.ProcessorContext;
+import org.apache.kafka.streams.processor.api.ProcessorSupplier;
 import org.apache.kafka.streams.state.KeyValueStore;


I skimmed over this interface, and the changes appear to be limited to line breaks in comments and renamed type parameters. In the interest of good reviews, I would not make those changes in this PR but rather open a separate PR for this interface. However, I might have missed an important part. @jeqo Could you clarify?

Regarding the comments, we usually add a break after each sentence.

cadonna · 2021-05-31T11:17:04Z

streams/src/main/java/org/apache/kafka/streams/kstream/internals/AbstractStream.java

+import java.util.Collection;
+import java.util.HashSet;
+import java.util.Objects;
+import java.util.Set;


In KAFKA-10787 we agreed on the import order kafka, org.apache.kafka, com, net, org, java, javax, followed by static imports. Additionally, there should be an empty line between import blocks.

Note that PR #10428 introduces a check and a formatter for this.
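To make the agreed order concrete, here is a rough sketch of a sorter that groups import lines this way. It is not the formatter from #10428. The class and method names are made up for illustration, the input is assumed to contain only well-formed import statements, and placing unlisted packages after javax is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ImportOrderSketch {
    // Group prefixes in the agreed order. "org.apache.kafka." is listed before "org."
    // so that the first matching prefix wins.
    private static final String[] GROUPS = {
        "kafka.", "org.apache.kafka.", "com.", "net.", "org.", "java.", "javax."
    };

    /** Returns the rank of the group an import line belongs to. */
    static int groupOf(String importLine) {
        String line = importLine.trim();
        if (line.startsWith("import static ")) {
            return GROUPS.length + 1; // static imports come last
        }
        String name = line.substring("import ".length());
        for (int i = 0; i < GROUPS.length; i++) {
            if (name.startsWith(GROUPS[i])) {
                return i;
            }
        }
        return GROUPS.length; // assumption: unlisted packages go after javax, before static imports
    }

    /** Sorts by group, then alphabetically, and separates groups with an empty line. */
    static List<String> format(List<String> imports) {
        List<String> sorted = new ArrayList<>(imports);
        sorted.sort((a, b) -> {
            int byGroup = Integer.compare(groupOf(a), groupOf(b));
            return byGroup != 0 ? byGroup : a.compareTo(b);
        });
        List<String> result = new ArrayList<>();
        for (int i = 0; i < sorted.size(); i++) {
            if (i > 0 && groupOf(sorted.get(i)) != groupOf(sorted.get(i - 1))) {
                result.add(""); // empty line between import blocks
            }
            result.add(sorted.get(i));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> imports = Arrays.asList(
            "import java.util.Set;",
            "import org.slf4j.Logger;",
            "import org.apache.kafka.streams.kstream.Aggregator;",
            "import java.util.Objects;");
        format(imports).forEach(System.out::println);
    }
}
```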

dongjinleekr · 2021-06-03

@jeqo Yes, but don't feel too burdened. I am ready to extend the formatter to the streams module as soon as the PR (which currently formats the core module only) is merged. 😉 @cadonna

dongjinleekr · 2021-06-03

@cadonna The sooner you merge the PR, the sooner I can start applying the formatter to the streams module. 😃

cadonna · 2021-05-31T11:57:09Z

...rc/main/java/org/apache/kafka/streams/kstream/internals/CogroupedStreamAggregateBuilder.java

        boolean stateCreated = false;
        int counter = 0;
        for (final Entry<KGroupedStreamImpl<K, ?>, Aggregator<? super K, Object, VOut>> kGroupedStream : groupPatterns.entrySet()) {
-            final KStreamAggProcessorSupplier<K, K, ?, ?> parentProcessor =
+            final KStreamAggregateProcessorSupplier<K, K, ?, ?> parentProcessor =


Shouldn't this be KStreamAggregateProcessorSupplier<K, ?, K, ?>? The positions of the parameters KOut and VIn on KStreamAggregateProcessorSupplier changed with respect to KStreamAggProcessorSupplier.
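The concern can be illustrated with a small sketch. `OldSupplier` and `NewSupplier` below are hypothetical interfaces that differ only in the order of their type parameters, mirroring the reordering described in this comment. They are not the actual Kafka classes.

```java
class TypeOrderSketch {
    // Hypothetical interfaces that differ only in the order of their type parameters.
    interface OldSupplier<KIn, KOut, VIn, VOut> {
        KOut mapKey(KIn key);
    }

    interface NewSupplier<KIn, VIn, KOut, VOut> {
        KOut mapKey(KIn key);
    }

    /** Declared with the key type in positions 1 and 3, so mapKey returns a String. */
    static String outputKey(NewSupplier<String, ?, String, ?> supplier, String key) {
        return supplier.mapKey(key);
    }

    public static void main(String[] args) {
        // Old order <KIn, KOut, VIn, VOut>: <K, K, ?, ?> pins both key types.
        OldSupplier<String, String, Long, Long> legacy = key -> key;
        OldSupplier<String, String, ?, ?> oldView = legacy;

        // New order <KIn, VIn, KOut, VOut>: the same intent is spelled <K, ?, K, ?>.
        NewSupplier<String, Long, String, Long> counter = key -> key;
        NewSupplier<String, ?, String, ?> newView = counter;

        // Carrying over <K, K, ?, ?> unchanged would bind the second K to VIn:
        // NewSupplier<String, String, ?, ?> stale = counter; // does not compile: VIn is Long

        System.out.println(oldView.mapKey("k") + " " + outputKey(newView, "k"));
    }
}
```

With an unchecked cast in place, a stale `<K, K, ?, ?>` declaration would still compile, which is why a reordering like this is easy to miss in review.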

cadonna · 2021-05-31T12:31:13Z

...rc/main/java/org/apache/kafka/streams/kstream/internals/CogroupedStreamAggregateBuilder.java

        boolean stateCreated = false;
        int counter = 0;
        for (final Entry<KGroupedStreamImpl<K, ?>, Aggregator<? super K, Object, VOut>> kGroupedStream : groupPatterns.entrySet()) {
-            final KStreamAggProcessorSupplier<K, K, ?, ?>  parentProcessor =
-                (KStreamAggProcessorSupplier<K, K, ?, ?>) new KStreamWindowAggregate<K, K, VOut, W>(
+            final KStreamWindowAggregate<K, K, VOut, W> parentProcessor =


Shouldn't this be KStreamWindowAggregate<K, VOut, K, W>? Here I am not sure whether I am missing something, since the type parameter positions did not change. Why is the type argument for V in KStreamWindowAggregate K and not the wildcard (?)?

cadonna · 2021-05-31T12:57:25Z

streams/src/main/java/org/apache/kafka/streams/kstream/internals/KGroupedStreamImpl.java

+import java.util.Objects;
+import java.util.Set;


See my comment above about import order.

cadonna · 2021-05-31T12:58:07Z

streams/src/main/java/org/apache/kafka/streams/kstream/internals/KGroupedStreamImpl.java

+            new KStreamAggregate<>(materializedInternal.storeName(),
+                aggregateBuilder.countInitializer,
+                aggregateBuilder.countAggregator),


Suggested change

-            new KStreamAggregate<>(materializedInternal.storeName(),
-                aggregateBuilder.countInitializer,
-                aggregateBuilder.countAggregator),
+            new KStreamAggregate<>(
+                materializedInternal.storeName(),
+                aggregateBuilder.countInitializer,
+                aggregateBuilder.countAggregator
+            ),

or

Suggested change

-            new KStreamAggregate<>(materializedInternal.storeName(),
-                aggregateBuilder.countInitializer,
-                aggregateBuilder.countAggregator),
+            new KStreamAggregate<>(
+                materializedInternal.storeName(),
+                aggregateBuilder.countInitializer,
+                aggregateBuilder.countAggregator),

jeqo added 25 commits April 12, 2021 11:41

add new abstract processor

f7aaa63

migrate kstream flat map values to new processor

bed7c96

apply suggestions

9657601

rollback change in test class

9983c9c

add new processor operator to kstream

a619816

draft ktable mapvalues, filter and others

19c8a8f

ktablesource to new processor api

990ba66

migrate ktable supress, reduce, and aggregate

fdaf408

moar migration

ae995d2

moar migration

a94eb30

migrate table joins

ea67f68

migrate global ktable

c7be1c2

migrate foreign key

0c3cbc9

migrate ktable repartition map

8baf291

compile source

662a10c

migrate test, first draft

75a8f60

migrate recordqueue

8bd1018

migrate testing

31f51b0

tests compile!

c41ecc7

adjust kstream process

d1f29ad

clean

c5fb7eb

renames

fa52447

fix types order

7e83419

add todos;

b25fe00

adjust transformers with adapters

71fa2fd

jeqo force-pushed the new-processor-kstream-2 branch from a2c2e64 to 71fa2fd Compare April 12, 2021 10:53

jeqo added 2 commits April 12, 2021 12:27

checkstail main

32f05a1

checkstyle test

7c18abd

jeqo changed the title ~~KAFKA-8410: Migrating KStream/KTable aggregations and joins to new Processor API~~ KAFKA-8410: Migrating stateful operators to new Processor API Apr 12, 2021

passing tests

34336fe

fix duplicated store bug

b557fe0

jeqo marked this pull request as ready for review April 12, 2021 16:26

jeqo added 8 commits April 12, 2021 17:37

fix missing context

dda022c

add mock for old processor api

1aaadf2

align types ordering

2c9285e

rollback change

846f756

rm todo comment

5057f88

align type names

0f9f492

align type names

5850705

set types

46ca917

vvcephei self-requested a review April 27, 2021 15:32

vvcephei added the streams label Apr 27, 2021

jeqo added 5 commits May 10, 2021 10:48

merged with trunk

ec89277

Add support for KTable transforms with new ProcessorContext

d168526

compose forwarding disabled processor context based on internal

442f351

Merge remote-tracking branch 'upstream/trunk' into new-processor-kstr…

786c339

…eam-2

fix internal processor contexts on tests

486657a

vvcephei reviewed May 21, 2021

vvcephei mentioned this pull request May 21, 2021

KAFKA-8410: KTableProcessor migration groundwork #10744

Merged

3 tasks

cadonna reviewed May 31, 2021

vvcephei mentioned this pull request Jun 11, 2021

KAFKA-10546: Deprecate old PAPI #10869

Merged

3 tasks

jeqo closed this Oct 19, 2021

jeqo deleted the new-processor-kstream-2 branch October 19, 2021 14:20

dongjinleekr Jun 3, 2021
