Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MINOR: Add thread dumps if broker node cannot be stopped #5373

Merged
merged 2 commits into from
Jul 20, 2018

Conversation

wicknicks
Copy link
Contributor

In system tests, it is useful to have the thread dumps if a broker cannot be stopped using SIGTERM.

Signed-off-by: Arjun Satish arjun@confluent.io

Signed-off-by: Arjun Satish <arjun@confluent.io>
@wicknicks
Copy link
Contributor Author

@ijuma, @xvrl could you please look at this change? thanks!

@wicknicks
Copy link
Contributor Author

@ijuma could we merge this to v1.1? thanks!

Copy link

@hachikuji hachikuji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, just a couple comments


try:
wait_until(lambda: len(self.pids(node)) == 0, timeout_sec=60, err_msg="Kafka node failed to stop")
except Exception:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if we should log this exception in case thread_dump raises an unexpected error. We don't want to lose the original error.


def thread_dump(self, node):
for pid in self.pids(node):
node.account.signal(pid, signal.SIGQUIT, allow_fail=False)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's fine to allow failure since this is just for debugging?

Signed-off-by: Arjun Satish <arjun@confluent.io>
@wicknicks wicknicks force-pushed the thread-dump-broker-stop-fail-tests branch from 3482913 to 9665796 Compare July 18, 2018 18:52
@wicknicks
Copy link
Contributor Author

@hachikuji fixed those issues. thanks!

Copy link
Contributor

@ijuma ijuma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link

@hachikuji hachikuji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update, LGTM

@hachikuji hachikuji merged commit 0b5fd99 into apache:trunk Jul 20, 2018
hachikuji pushed a commit that referenced this pull request Jul 20, 2018
In system tests, it is useful to have the thread dumps if a broker cannot be stopped using SIGTERM.

Reviewers: Xavier Léauté <xl+github@xvrl.net>, Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
hachikuji pushed a commit that referenced this pull request Jul 25, 2018
In system tests, it is useful to have the thread dumps if a broker cannot be stopped using SIGTERM.

Reviewers: Xavier Léauté <xl+github@xvrl.net>, Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
allenxwang pushed a commit to allenxwang/kafka that referenced this pull request Aug 24, 2018
…:1.1.1-sync to 1.1-nflx

* commit '9611672e287c1a7933a78590e3f381da2ae7d136': (57 commits)
  MINOR: increase dev version from 1.1.1-SNAPSHOT to 1.1.2-SNAPSHOT (apache#5409)
  MINOR: Add thread dumps if broker node cannot be stopped (apache#5373)
  MINOR: update release.py
  MINOR: fix upgrade docs for Streams (apache#5392)
  MINOR: improve docs version numbers (apache#5372)
  Update version on the branch to 1.1.2-SNAPSHOT
  KAFKA-6292; Improve FileLogInputStream batch position checks to avoid type overflow (apache#4928)
  HOTFIX: Fix checkstyle errors in MetricsTest (apache#5345)
  KAFKA-7136: Avoid deadlocks in synchronized metrics reporters (apache#5341)
  MINOR: Close timing window in SimpleAclAuthorizer startup (apache#5318)
  MINOR: Use kill_java_processes when killing ConsoleConsumer in system tests (apache#5297)
  KAFKA-7104: More consistent leader's state in fetch response (apache#5305)
  Revert "MINOR: Avoid coarse lock in Pool#getAndMaybePut (apache#5258)"
  MINOR: Avoid coarse lock in Pool#getAndMaybePut (apache#5258)
  MINOR: bugfix streams total metrics (apache#5277)
  KAFKA-7082: Concurrent create topics may throw NodeExistsException (apache#5259)
  MINOR: Upgrade to Gradle 4.8.1
  KAFKA-7012: Don't process SSL channels without data to process (apache#5237)
  KAFKA-7058: Comparing schema default values using Objects#deepEquals()
  KAFKA-7047: Added SimpleHeaderConverter to plugin isolation whitelist
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants