-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
STORM-2052: Kafka Spout - New Client API - Performance Improvements #1649
Conversation
final boolean poll = !waitingToEmit() && numUncommittedOffsets < maxUncommittedOffsets; | ||
|
||
if (!poll) { | ||
final boolean isLoggable = logTimer.isExpiredResetOnTrue(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need this check since these are debug logs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check is here in order to throttle the number of messages that get printed. Although logs are with debug level, a lot of messages will get printed (one each time nextTuple is called), which will spam the logs, and limit the ability to debug. What this code does is to print one message per second, rather than one per call to nextTuple. By doing this, no info is lost, and things are much cleaner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that this is debug it won't print in normal op. If I turn on the debug logging my expectations is to see the logs immediately. If I don't see the messages than I will think that the debug log is not taking affect.
Either way I am ok with it.
overall looks good. Can you squash the commits. |
@harshach I can squash the commits, but I left these three commits on purpose. Each commit is a logical groping. One is docs, one is the default values, and one is the logging changes. They all have the same JIRA ticket, but the commit log messages suit the changes. The reason behind this choice is that it is easier to cherry pick only part of the changes to a given branch - if needed. I also believe that it makes it easier for an external person to follow what each patch addresses. I am OK with either approach. This squash is pretty simple. Let me know if you prefer that this is indeed squashed, and I will do so. Thanks! |
@hmcl all the commit title as the same text "Kafka Spout - New Client API - Performance Improvements" |
can you also fix the JIRA title . Given that these are default config changes and not really performance fixes in the code. |
@hmcl we cherry-pick the entire JIRA into other branches. It doesn't make sense to cherry-pick partial commits of a JIRA for releases. So squash the commits into one as they are all related to this JIRA. |
…eter Tuning for Better Performance - Set default config values for good performance - Improve Logging - Log heavy objects with TRACE level - Make log messages more meaningful - Delete a couple of logs - Update README
a81fa6e
to
9dd62da
Compare
@harshach Done! |
+1 |
merged into master and 1.x-branch |
This patch should be back-ported to 1.0.x branch and 1.x-branch. Thank you.