YARN-11398 DelegationTokenRenewer timeout feature may cause high utilization of CPU and memory leak #5233
…CPU in idle state and memory leak
💔 -1 overall
This message was automatically generated.
// If the cluster is idle for some time, futures map is empty or no event handler found which may still cause high CPU utilization
// Therefore a short nap should be added here.
try {
  Thread.sleep(1000);
What is the effect of setting the sleep to 1000 ms?
> What is the effect of setting the sleep to 1000 ms?

It gives up the CPU.
Can we get data, such as the CPU utilization before and after the modification? Why did we choose 1000 ms? There seems to be no data to support this value.
@slfan1989
Thanks for the review.
The CPU utilization statistics are as follows:
Before optimization:

Handling this with a thread sleep does not seem to be a good approach.
I'm not convinced by the sleep either, as @slfan1989 mentioned. It looks very arbitrary to me: it worked for you but might cause issues for others.
@ayushtkn Could you please help review this PR?
Hi, any update on this PR? We're also suffering from high CPU usage with YARN in low-spec clusters such as our alpha and beta environments. WDYT about waiting for some signal at Line 230 in 55e8301?
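The suggestion above, waiting for a signal instead of sleeping for a fixed interval, could be sketched roughly as follows. This is only an illustration under assumptions, not the Hadoop code: `BlockingTracker`, `track`, and `trackedCount` are hypothetical names, and a `LinkedBlockingQueue` stands in for whatever signal the tracker thread would actually wait on.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical tracker for illustration; not the actual DelegationTokenRenewerPoolTracker. */
class BlockingTracker implements Runnable {
  // Submitted futures are handed over through a blocking queue (assumed design).
  private final BlockingQueue<Future<?>> pending = new LinkedBlockingQueue<>();
  private final AtomicInteger tracked = new AtomicInteger();
  private final long timeoutMs;

  BlockingTracker(long timeoutMs) {
    this.timeoutMs = timeoutMs;
  }

  /** Called by the submitting side; this is the "signal" that wakes the tracker. */
  void track(Future<?> future) {
    pending.add(future);
  }

  int trackedCount() {
    return tracked.get();
  }

  @Override
  public void run() {
    try {
      while (!Thread.currentThread().isInterrupted()) {
        // take() parks the thread while the queue is empty, so an idle
        // cluster costs no CPU and no arbitrary sleep interval is needed.
        Future<?> future = pending.take();
        try {
          future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
          // The task exceeded the timeout: cancel it.
          future.cancel(true);
        } catch (ExecutionException | CancellationException e) {
          // The task failed or was cancelled; nothing more to wait for.
        }
        tracked.incrementAndGet();
      }
    } catch (InterruptedException e) {
      // Restore the interrupt flag and exit the loop.
      Thread.currentThread().interrupt();
    }
  }
}
```

Because each future leaves the queue once it has been handled, this shape would also avoid keeping references to finished work, though whether it fits the retry logic of the real renewer is not something this sketch addresses.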
We're closing this stale PR because it has been open for 100 days with no activity. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.

The ResourceManager DelegationTokenRenewer timeout feature may cause high CPU utilization and an object leak.
1-If the YARN cluster is idle, that is, almost no token renewer events are triggered, the DelegationTokenRenewerPoolTracker thread does nothing but spin in a busy loop, which causes high CPU utilization.
2-Renewer events are held in a map named futures that has no removal logic, so the map keeps growing over time.
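
The second point, a map of futures that is never cleaned up, can be illustrated with a small sketch of the missing removal step. The names here (`FutureRegistry`, `register`, `purgeCompleted`) are hypothetical and chosen for illustration; this is not the patch itself, only one way the removal logic might look, assuming the map is a `ConcurrentHashMap` keyed by event.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Future;

/** Hypothetical holder for event futures; illustrates removing finished entries. */
class FutureRegistry<K> {
  private final Map<K, Future<?>> futures = new ConcurrentHashMap<>();

  void register(K event, Future<?> future) {
    futures.put(event, future);
  }

  /**
   * Removes every entry whose future has completed (normally, exceptionally,
   * or by cancellation) and returns how many entries were removed.
   */
  int purgeCompleted() {
    int removed = 0;
    Iterator<Map.Entry<K, Future<?>>> it = futures.entrySet().iterator();
    while (it.hasNext()) {
      if (it.next().getValue().isDone()) {
        // ConcurrentHashMap iterators support remove() during iteration.
        it.remove();
        removed++;
      }
    }
    return removed;
  }

  int size() {
    return futures.size();
  }
}
```

Calling something like `purgeCompleted()` on each pass of the tracker loop would keep the map bounded by the number of in-flight events rather than by the total number of events ever submitted.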