-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove buggy BarrierStat #2740
Remove buggy BarrierStat #2740
Conversation
The implementation of BarrierStat is buggy, and it is not necessary for Paddle to diagnose which node in cluster is slow.
"pushRecv", timeToMicroSecond(*handleRequestBegin_), -1, *statSet_); | ||
} | ||
|
||
#ifndef PADDLE_DISABLE_TIMER |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should also remove the cmake flag WITH_TIMER
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WITH_TIMER is not totally removed. It is used to calculate execution time for each layer and whole neural network in trainer side.
This PR only removes the BarrierStats
, that used in PServer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see.
isSparseServer_ ? "_sparseUpdater" : "_denseUpdater"); | ||
} | ||
|
||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should not remove this {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a diff
display error. That {
is in line 384/368
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM++ will test on cluster later.
The implementation of BarrierStat is buggy, and it is not necessary
for Paddle to diagnose which node in cluster is slow.