Send raylet error logs through the log monitor #5351

Merged
merged 6 commits into from Aug 6, 2019


ericl
Contributor

@ericl ericl commented Aug 2, 2019

What do these changes do?

Now if your raylet crashes, you get the error sent via the log monitor process:

[image: screenshot not preserved]
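
To illustrate the idea, here is a minimal sketch of how a log monitor could tail a raylet error file and forward new lines with a `(pid=raylet)` prefix, as in the output quoted later in this thread. This is not Ray's actual log monitor: `LogTailer`, `poll`, the `raylet.err` file name, and the sample error line are all hypothetical, chosen only for illustration.

```python
import os
import tempfile


class LogTailer:
    """Hypothetical helper: remembers how far into each file it has read."""

    def __init__(self):
        self._offsets = {}  # path -> number of bytes already forwarded

    def poll(self, path, label):
        """Return complete lines appended to `path` since the last poll,
        each prefixed with "(label) "."""
        offset = self._offsets.get(path, 0)
        with open(path, "rb") as f:
            f.seek(offset)
            data = f.read()
        # Forward only complete lines; a partial trailing line is left
        # for the next poll.
        end = data.rfind(b"\n") + 1
        self._offsets[path] = offset + end
        text = data[:end].decode("utf-8", errors="replace")
        return ["({}) {}".format(label, line) for line in text.splitlines()]


# Demo with a made-up error line written to a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    err_path = os.path.join(tmp, "raylet.err")
    with open(err_path, "w") as f:
        f.write("hypothetical fatal error from the raylet\n")
    tailer = LogTailer()
    for line in tailer.poll(err_path, "pid=raylet"):
        print(line)
```

Tracking a byte offset per file means each poll forwards only what was appended since the previous poll, so a crash message written just before the process dies is still picked up on the next pass.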

Related issue number

Closes #5322

Linter

  • I've run scripts/format.sh to lint the changes in this PR.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/15921/

@ericl
Contributor Author

ericl commented Aug 5, 2019

(pid=raylet) WARNING: Logging before InitGoogleLogging() is written to STDERR
(pid=raylet) I0804 21:42:03.236798 32216 redis_gcs_client.cc:145] RedisGcsClient::Connect finished with status OK
(pid=raylet) WARNING: Logging before InitGoogleLogging() is written to STDERR
(pid=raylet) I0804 21:42:03.247087 32218 stats.h:48] Succeeded to initialize stats: exporter address is 127.0.0.1:8888
(pid=raylet) I0804 21:42:03.247763 32218 redis_gcs_client.cc:145] RedisGcsClient::Connect finished with status OK
(pid=raylet) I0804 21:42:03.248323 32218 grpc_server.cc:36] ObjectManager server started, listening on 0.0.0.0:44845
(pid=raylet) I0804 21:42:03.249888 32218 grpc_server.cc:36] NodeManager server started, listening on 0.0.0.0:37479
(pid=raylet) I0804 21:42:03.250210 32218 grpc_server.cc:36] Raylet server started, listening on unix:///tmp/ray/session_2019-08-04_21-42-03_009401_32194/sockets/raylet

cc @jovany-wang do you know why these log messages get written to STDERR? From what I can tell, logging is initialized first thing in main. However, stats::Init(), for example, still seems to print the above to STDERR, complaining that InitGoogleLogging() hasn't been called yet.

@jovany-wang
Contributor

I looked into it and found that log_dir_ is empty here:
https://github.com/ray-project/ray/blob/master/src/ray/util/logging.cc#L126
And I can't see that we pass the log dir anywhere.

I'll try to fix it later.

@jovany-wang
Contributor

btw, which file do you expect these logs to be written to?

@ericl
Contributor Author

ericl commented Aug 5, 2019 via email

@jovany-wang
Contributor

@ericl I fixed it in #5374.

@ericl
Contributor Author

ericl commented Aug 5, 2019

@jovany-wang Thanks!

@robertnishihara
Collaborator

Looks good to me (I tried it out) assuming that the messages below go away (these are currently printed at startup). I agree putting them in raylet.out makes sense. I don't think we should suppress them entirely.

(pid=raylet) I0805 13:56:15.351514 314066368 redis_gcs_client.cc:145] RedisGcsClient::Connect finished with status OK
(pid=raylet) I0805 13:56:15.362835 248083904 stats.h:48] Succeeded to initialize stats: exporter address is 127.0.0.1:8888
(pid=raylet) I0805 13:56:15.365021 248083904 redis_gcs_client.cc:145] RedisGcsClient::Connect finished with status OK
(pid=raylet) I0805 13:56:15.366199 248083904 grpc_server.cc:36] ObjectManager server started, listening on 0.0.0.0:50380
(pid=raylet) I0805 13:56:15.372203 248083904 grpc_server.cc:36] NodeManager server started, listening on 0.0.0.0:50381
(pid=raylet) I0805 13:56:15.372795 248083904 grpc_server.cc:36] Raylet server started, listening on unix:///tmp/ray/session_2019-08-05_13-56-15_102483_32599/sockets/raylet

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/15985/

@ericl
Contributor Author

ericl commented Aug 5, 2019

Added a StdoutLogger, which is the new default logger for glog when no glog log dir is configured (as is the case by default). This achieves the desired behaviour of INFO messages going to raylet.out.
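
For readers unfamiliar with the pattern, the fallback can be sketched with Python's standard `logging` module. This is only an analogue, not the glog-based C++ code in this PR: `make_logger`, its `log_dir` parameter, and the `.log` file suffix are hypothetical names used for illustration.

```python
import logging
import os
import sys


def make_logger(name, log_dir=""):
    """Hypothetical analogue of the fallback: write to a file when a log
    dir is configured, otherwise write to stdout."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid stacking handlers on repeated calls
    if log_dir:
        handler = logging.FileHandler(os.path.join(log_dir, name + ".log"))
    else:
        # No log dir configured: INFO and above go to stdout, so whatever
        # captures the process's stdout also receives these messages.
        handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False
    return logger


log = make_logger("raylet_demo")
log.info("server started")
```

The point of the design is that the process's stdout is already redirected to a file by whatever launches it, so a stdout fallback places INFO messages there without the logger needing to know the path.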

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/15988/

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/15991/

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/16002/

@ericl ericl merged commit 0a3ff48 into ray-project:master Aug 6, 2019
@ericl
Contributor Author

ericl commented Aug 6, 2019

Tests look unrelated.

edoakes pushed a commit to edoakes/ray that referenced this pull request Aug 9, 2019
Successfully merging this pull request may close these issues.

raylet output should be piped to normal stdout in non-cluster mode
4 participants