-
Notifications
You must be signed in to change notification settings - Fork 13.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MINOR: Add thread dumps if broker node cannot be stopped #5373
MINOR: Add thread dumps if broker node cannot be stopped #5373
Conversation
Signed-off-by: Arjun Satish <arjun@confluent.io>
@ijuma could we merge this to v1.1? thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, just a couple comments
|
||
try: | ||
wait_until(lambda: len(self.pids(node)) == 0, timeout_sec=60, err_msg="Kafka node failed to stop") | ||
except Exception: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if we should log this exception in case thread_dump
raises an unexpected error. We don't want to lose the original error.
|
||
def thread_dump(self, node): | ||
for pid in self.pids(node): | ||
node.account.signal(pid, signal.SIGQUIT, allow_fail=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it's fine to allow failure since this is just for debugging?
Signed-off-by: Arjun Satish <arjun@confluent.io>
3482913
to
9665796
Compare
@hachikuji fixed those issues. thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update, LGTM
In system tests, it is useful to have the thread dumps if a broker cannot be stopped using SIGTERM. Reviewers: Xavier Léauté <xl+github@xvrl.net>, Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
In system tests, it is useful to have the thread dumps if a broker cannot be stopped using SIGTERM. Reviewers: Xavier Léauté <xl+github@xvrl.net>, Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
…:1.1.1-sync to 1.1-nflx * commit '9611672e287c1a7933a78590e3f381da2ae7d136': (57 commits) MINOR: increase dev version from 1.1.1-SNAPSHOT to 1.1.2-SNAPSHOT (apache#5409) MINOR: Add thread dumps if broker node cannot be stopped (apache#5373) MINOR: update release.py MINOR: fix upgrade docs for Streams (apache#5392) MINOR: improve docs version numbers (apache#5372) Update version on the branch to 1.1.2-SNAPSHOT KAFKA-6292; Improve FileLogInputStream batch position checks to avoid type overflow (apache#4928) HOTFIX: Fix checkstyle errors in MetricsTest (apache#5345) KAFKA-7136: Avoid deadlocks in synchronized metrics reporters (apache#5341) MINOR: Close timing window in SimpleAclAuthorizer startup (apache#5318) MINOR: Use kill_java_processes when killing ConsoleConsumer in system tests (apache#5297) KAFKA-7104: More consistent leader's state in fetch response (apache#5305) Revert "MINOR: Avoid coarse lock in Pool#getAndMaybePut (apache#5258)" MINOR: Avoid coarse lock in Pool#getAndMaybePut (apache#5258) MINOR: bugfix streams total metrics (apache#5277) KAFKA-7082: Concurrent create topics may throw NodeExistsException (apache#5259) MINOR: Upgrade to Gradle 4.8.1 KAFKA-7012: Don't process SSL channels without data to process (apache#5237) KAFKA-7058: Comparing schema default values using Objects#deepEquals() KAFKA-7047: Added SimpleHeaderConverter to plugin isolation whitelist ...
In system tests, it is useful to have the thread dumps if a broker cannot be stopped using SIGTERM.
Signed-off-by: Arjun Satish arjun@confluent.io