-
Notifications
You must be signed in to change notification settings - Fork 13.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KAFKA-12508: Disable KIP-557 #10397
KAFKA-12508: Disable KIP-557 #10397
Conversation
A major issue has been raised that this implementation of emit-on-change is vulnerable to a number of data-loss bugs in the presence of recovery with dirty state under at-least-once semantics. This should be fixed in the future when we implement a way to avoid or clean up the dirty state under at-least-once, at which point it will be safe to re-introduce KIP-557 and complete it.
Hey @ableegoldman @cadonna, can you spare a quick review here? All I did was revert the KTableSourceProcessor to its state before the KIP-557 implementation. I just ignored the unit test, and reverted the FK Join test (it had become possible to simplify it after KIP-557). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, although it may be good to consider replacing the ignored testKTableSourceEmitOnChange
with a test that verifies emit-on-change is not enabled.
Also, I believe KIP-557 applied emit-on-change semantics to a few other operators. I don't remember them all of the top of my head but can you confirm that none of the others are impacted by the reported bugs?
streams/src/test/java/org/apache/kafka/streams/kstream/internals/KTableSourceTest.java
Show resolved
Hide resolved
Thanks, @ableegoldman ! As far as I know this was the only operator we updated to KIP-557. This is the only commit I find referencing the KIP: f54cece |
Ah ok, I saw the KIP had included "any aggregation or repartition operation" in addition to the source KTable but I guess this just hasn't been implemented yet. Looks like Richard never got around to making sub-tasks for the other operators so I filed tickets and linked them all in the KIP so it's easier to understand what is/isn't in which versions. Thanks for doing this PR! |
Woah, thanks for that, @ableegoldman ! |
While running tests for apache#10397, I got a test timeout under Java 8. I ran it locally via `./gradlew clean -PscalaVersion=2.12 :clients:unitTest --profile --no-daemon --continue -PtestLoggingEvents=started,passed,skipped,failed -PignoreFailures=true -PmaxTestRetries=1 -PmaxTestRetryFailures=5` (copied from the Jenkins log) and was able to determine that the hanging test is: org.apache.kafka.clients.admin.KafkaAdminClientTest#testClientSideTimeoutAfterFailureToReceiveResponse It's odd, but it hangs most times on my branch, and I haven't seen it hang on trunk, despite the fact that my PR doesn't touch the client or core code at all. Some debugging reveals that when the client is hanging, it's because the listTopics request is still sitting in its pendingRequests queue, and if I understand the test setup correctly, it would never be completed, since we will never advance time or queue up a metadata response for it. I figure a reasonable blanket response to this is just to make sure that the test harness will close the admin client eagerly instead of lazily.
I looked into the Java 8 timeout. It should not be related, as it's caused by the KafkaAdminClientTest hanging. I submitted #10404 to fix that test. |
Nice test debugging 😄 Just to confirm, you're going to backport this to 2.7 & 2.6 right? |
Thanks, @ableegoldman ! Yes, I plan to do it right now. |
A major issue has been raised that this implementation of emit-on-change is vulnerable to a number of data-loss bugs in the presence of recovery with dirty state under at-least-once semantics. This should be fixed in the future when we implement a way to avoid or clean up the dirty state under at-least-once, at which point it will be safe to re-introduce KIP-557 and complete it. Reviewers: A. Sophie Blee-Goldman <ableegoldman@apache.org>
A major issue has been raised that this implementation of emit-on-change is vulnerable to a number of data-loss bugs in the presence of recovery with dirty state under at-least-once semantics. This should be fixed in the future when we implement a way to avoid or clean up the dirty state under at-least-once, at which point it will be safe to re-introduce KIP-557 and complete it. Reviewers: A. Sophie Blee-Goldman <ableegoldman@apache.org>
A major issue has been raised that this implementation of emit-on-change is vulnerable to a number of data-loss bugs in the presence of recovery with dirty state under at-least-once semantics. This should be fixed in the future when we implement a way to avoid or clean up the dirty state under at-least-once, at which point it will be safe to re-introduce KIP-557 and complete it. Reviewers: A. Sophie Blee-Goldman <ableegoldman@apache.org>
A major issue has been raised that this implementation of emit-on-change is vulnerable to a number of data-loss bugs in the presence of recovery with dirty state under at-least-once semantics. This should be fixed in the future when we implement a way to avoid or clean up the dirty state under at-least-once, at which point it will be safe to re-introduce KIP-557 and complete it. Reviewers: A. Sophie Blee-Goldman <ableegoldman@apache.org>
A major issue has been raised that this implementation of emit-on-change is vulnerable to a number of data-loss bugs in the presence of recovery with dirty state under at-least-once semantics. This should be fixed in the future when we implement a way to avoid or clean up the dirty state under at-least-once, at which point it will be safe to re-introduce KIP-557 and complete it. Reviewers: A. Sophie Blee-Goldman <ableegoldman@apache.org>
A major issue has been raised that this implementation of
emit-on-change is vulnerable to a number of data-loss bugs
in the presence of recovery with dirty state under at-least-once
semantics. This should be fixed in the future when we implement
a way to avoid or clean up the dirty state under at-least-once,
at which point it will be safe to re-introduce KIP-557 and
complete it.
Committer Checklist (excluded from commit message)