KAFKA-5361: Add more integration tests for Streams EOS #3276
Conversation
mjsax commented Jun 8, 2017 (edited)
- multi-subtopology tests
- fencing test
- producer fenced bug fix: Streams did not recover correctly from ProducerFencedException
Call for review @enothereska @dguy @bbejeck @guozhangwang
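For context on the third bullet above, here is a hedged, minimal sketch of the general recovery pattern for a fenced transactional producer -- not the actual Streams fix in this PR. Once a newer instance with the same transactional.id takes over, ProducerFencedException is not retriable: the fenced producer has to be closed and its work handed back (in Streams, the task is treated as a zombie and re-assigned) rather than treated as a fatal error for the whole application. The broker address, topic name, and transactional id below are made up.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.errors.ProducerFencedException;
import org.apache.kafka.common.serialization.StringSerializer;

public class FencedProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "demo-txn-id"); // hypothetical id
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        producer.initTransactions();
        try {
            producer.beginTransaction();
            producer.send(new ProducerRecord<>("demo-topic", "key", "value"));
            producer.commitTransaction();
        } catch (ProducerFencedException fenced) {
            // Another instance with the same transactional.id took over; this producer is a zombie.
            // Do not retry with this producer and do not kill the whole application:
            // close it and let the owning unit of work be re-created / re-assigned.
            producer.close();
        }
    }
}
```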
Force-pushed from 383937b to cf2c6da
- multi-subtopology tests - fencing test
cleanup fencing test after rebase
Force-pushed from cf2c6da to 5e7fd50
Call for review @enothereska @dguy @bbejeck @guozhangwang
Test failure unrelated.
Retest this please.
Left some minor comments.
Could you trigger, say, 5 consecutive builds on Jenkins and check their results?
@@ -57,7 +56,7 @@ public static void registerAppInfo(String prefix, String id) {
             AppInfo mBean = new AppInfo();
             ManagementFactory.getPlatformMBeanServer().registerMBean(mBean, name);
         } catch (JMException e) {
-            log.warn("Error registering AppInfo mbean", e);
+            log.warn("Error registering AppInfo mbean: {}", e.getMessage());
Why this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought we don't want to print stack traces for WARN level logs?
There is no such policy at the moment. Was there a specific issue with this line?
The issue we observed in Streams is that people see a stack trace and get worried -- they don't pay attention to the fact that the log level is only WARN. Thus it seems better to never include stack traces on WARN, to not confuse users. I can revert, too.
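For reference, a minimal SLF4J sketch of the two logging styles being debated (the class name is made up): passing the exception object as the last argument makes SLF4J append the full stack trace, while interpolating only e.getMessage() keeps the WARN entry to a single line.

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class WarnLoggingSketch {
    private static final Logger log = LoggerFactory.getLogger(WarnLoggingSketch.class);

    static void handle(Exception e) {
        // Variant 1: exception passed as the last argument -> SLF4J prints the full stack trace.
        log.warn("Error registering AppInfo mbean", e);

        // Variant 2: only the message is interpolated -> one WARN line, no stack trace.
        log.warn("Error registering AppInfo mbean: {}", e.getMessage());
    }
}
```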
WARN is something to be worried about, right? If it's just informational, it should be INFO.
I agree. But WARN should still not contain a stack trace -- that's for ERROR, IMHO.
That seems pretty arbitrary though, since you just agreed that users should be worried either way. The problem may be that people are using WARN when they should use INFO.
@@ -259,7 +260,7 @@ void retryWithBackoff(final Map<TaskId, Set<TopicPartition>> tasksToBeCreated, f
                 } catch (final LockException e) {
                     // ignore and retry
                     if (!retryingTasks.contains(taskId)) {
-                        log.warn("{} Could not create task {}. Will retry. {}", logPrefix, taskId, e);
+                        log.warn("{} Could not create task {}. Will retry. {}", logPrefix, taskId, e.getMessage());
nit: could not create task {} due to {}. Will retry.
@@ -272,6 +273,7 @@ void retryWithBackoff(final Map<TaskId, Set<TopicPartition>> tasksToBeCreated, f
                 try {
                     Thread.sleep(backoffTimeMs);
                     backoffTimeMs <<= 1;
+                    backoffTimeMs = Math.min(backoffTimeMs, maxBackoffTimeMs);
Why this change again? Or could you add this to the description of the PR (it seems not included)?
Oh, my bad. That is unrelated, and I included it so I don't forget about it -- currently, with the unbounded exponential increase of the waiting time, users did report unsatisfying behavior. I think we need to fix it... This should also be back-ported to 0.10.2... But maybe it should not be part of this PR. Not sure how else we can get it into 0.11, though ;)
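To illustrate what the added line changes, here is a minimal, self-contained sketch of the capped exponential backoff (the initial value, cap, and loop around it are made up for the example, not the actual Streams code):

```java
public class BackoffSketch {
    public static void main(String[] args) throws InterruptedException {
        long backoffTimeMs = 50L;            // hypothetical initial wait
        final long maxBackoffTimeMs = 1000L; // hypothetical cap, so retries stop growing

        for (int attempt = 0; attempt < 10; attempt++) {
            // ... attempt the operation here; on failure fall through and retry ...
            Thread.sleep(backoffTimeMs);
            backoffTimeMs <<= 1;                                        // double the wait
            backoffTimeMs = Math.min(backoffTimeMs, maxBackoffTimeMs);  // but never exceed the cap
            System.out.println("attempt " + attempt + ", next backoff " + backoffTimeMs + " ms");
        }
    }
}
```

Without the Math.min cap, the shift doubles the sleep on every retry, which is the unbounded growth users complained about.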
Opened #3312 for 0.10.2
Hmm, I was not aware of this -- could you point me to the unsatisfying behavior that users complain about? I would like to make sure this is really an issue to fix before merging.
For this PR, I'd suggest we do not include it so that it can be merged.
Reverted this and created #3327
Updated.
Force-pushed from 22e4792 to b8075a5
- multi-subtopology tests
- fencing test
- producer fenced bug fix: Streams did not recover correctly from ProducerFencedException

Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes #3276 from mjsax/kafka-5361-add-eos-integration-tests-for-streams-api
(cherry picked from commit 160933b)
Signed-off-by: Guozhang Wang <wangguoz@gmail.com>
LGTM and merged to trunk / 0.11.0.