-
Notifications
You must be signed in to change notification settings - Fork 5.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[GCS]Fix flaky testcase: ServiceBasedGcsClientTest #14248
Conversation
actually, the test still failed:
|
Thanks, i have fix it. |
@@ -204,7 +204,7 @@ void ServiceBasedGcsClient::ReconnectGcsServer() { | |||
if (last_reconnect_address_ == address && | |||
(current_sys_time_ms() - last_reconnect_timestamp_ms_) < | |||
RayConfig::instance().minimum_gcs_reconnect_interval_milliseconds()) { | |||
RAY_LOG(INFO) | |||
RAY_LOG(DEBUG) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we keep this log level as INFO or even upgrade it to WARNING? It's useful when we try to figure out what happened while client reconnecting. (although it prints much logs)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please avoid using WARNING logs unless the error message clearly indicates what to do for end users (so this sort of implementation details shouldn't be used for WARNING). All of the warning logs from GCS server and raylet will be streamed to drivers. (If we'd like to use WARNING more aggressively, we should change this behavior first).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. How about keeping it as INFO? @ffbin
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It prints much logs, so DEBUG is better. If client reconnect, we have print log Reconnected to GCS server:
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
This is another problem. I'm still locating the cause. I'll ping you after fix it, thanks. |
Awesome! Feel free to merge whenever you think its ready. |
Why are these changes needed?
In the process of actor registration, if the callback function of
WaitForActorOutOfScope
is executed first, and then the callback function ofActorTable().Put
is executed, the actor registration fails; otherwise, the actor registration succeeds. So we can't assert whether the actor is registered successfully.Related issue number
#14245
Checks
scripts/format.sh
to lint the changes in this PR.