[Core] Buffered logs lost sometimes #45262

Closed
robertnishihara opened this issue May 11, 2024 · 3 comments · Fixed by #45485
Labels
bug Something that is supposed to be working; but isn't core Issues that should be addressed in Ray Core P0 Issues that should be fixed in short order

Comments

@robertnishihara
Collaborator

What happened + What you expected to happen

When running in IPython, the output of some print statements is sometimes lost. We buffer the lines under the hood and deduplicate them (ignoring numerical values), but the buffered lines never seem to get printed.

In [1]: import time

In [2]: import ray

In [3]: @ray.remote
   ...: def f(i):
   ...:     print(i)
   ...:

In [4]: [f.remote(i) for i in range(10)]
2024-05-10 18:01:35,772	INFO worker.py:1740 -- Started a local Ray instance. View the dashboard at http://127.0.0.1:8266
Out[4]:
[ObjectRef(c8ef45ccd0112571ffffffffffffffffffffffff0100000001000000),
 ObjectRef(16310a0f0a45af5cffffffffffffffffffffffff0100000001000000),
 ObjectRef(c2668a65bda616c1ffffffffffffffffffffffff0100000001000000),
 ObjectRef(32d950ec0ccf9d2affffffffffffffffffffffff0100000001000000),
 ObjectRef(e0dc174c83599034ffffffffffffffffffffffff0100000001000000),
 ObjectRef(f4402ec78d3a2607ffffffffffffffffffffffff0100000001000000),
 ObjectRef(f91b78d7db9a6593ffffffffffffffffffffffff0100000001000000),
 ObjectRef(82891771158d68c1ffffffffffffffffffffffff0100000001000000),
 ObjectRef(8849b62d89cb30f9ffffffffffffffffffffffff0100000001000000),
 ObjectRef(80e22aed7718a125ffffffffffffffffffffffff0100000001000000)]

(f pid=31071) 7
In [5]: time.sleep(1)

In [6]: time.sleep(5)

In [7]: ray.put(1)
Out[7]: ObjectRef(00ffffffffffffffffffffffffffffffffffffff0100000001e1f505)

In [8]:

In [8]: f.remote(0)
Out[8]: ObjectRef(359ec6ce30d3ca2dffffffffffffffffffffffff0100000001000000)

(f pid=31068) 0
In [9]:

In [9]:
Do you really want to exit ([y]/n)?

Versions / Dependencies

Python 3.11.4
ray, version 2.21.0
macOS 13.5

Reproduction script

See above

Issue Severity

None

@robertnishihara robertnishihara added bug Something that is supposed to be working; but isn't triage Needs triage (eg: priority, bug/not-bug, and owning component) labels May 11, 2024
@anyscalesam anyscalesam added the core Issues that should be addressed in Ray Core label May 13, 2024
@rynewang
Contributor

@hongchaodeng to repro

@rynewang rynewang added P1 Issue that should be fixed within a few weeks and removed triage Needs triage (eg: priority, bug/not-bug, and owning component) labels May 20, 2024
@hongchaodeng hongchaodeng added @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. P0 Issues that should be fixed in short order and removed P1 Issue that should be fixed within a few weeks @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. labels May 20, 2024
@hongchaodeng
Member

Confirmed that I can reproduce it.

This won't work:

[f.remote(i) for i in range(10)]
ray.get([f.remote(i) for i in range(10)])

This will work, but it is a really bad user experience:

[ray.get(f.remote(i)) for i in range(10)]

@hongchaodeng
Member

Root cause

Here's a breakdown of the root cause:

The Ray runtime in the driver deduplicates similar logs from all workers by default. In this case, all numbers are canonicalized, i.e. treated as equal, so the ten printed values look like duplicates of a single line.
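
To make the effect concrete, here is a minimal sketch of number-canonicalizing deduplication. It is a simplified illustration, not Ray's actual implementation: the names `canonicalize` and `dedup` and the `<num>` placeholder are hypothetical, chosen only for this example.

```python
import re

# Simplified illustration only; NOT Ray's actual dedup code.
# The function names and the "<num>" placeholder are hypothetical.
_NUMBER = re.compile(r"\d+")


def canonicalize(line: str) -> str:
    """Replace every run of digits with a placeholder so numbers compare equal."""
    return _NUMBER.sub("<num>", line)


def dedup(lines):
    """Keep the first line seen per canonical form and count the suppressed rest."""
    first_seen = {}
    suppressed = {}
    for line in lines:
        key = canonicalize(line)
        if key in first_seen:
            suppressed[key] = suppressed.get(key, 0) + 1
        else:
            first_seen[key] = line
    return list(first_seen.values()), suppressed


# Ten tasks each print a different number, as in the report above.
worker_output = [str(i) for i in range(10)]
shown, suppressed = dedup(worker_output)
print(shown)       # ['0']
print(suppressed)  # {'<num>': 9}
```

Under this model only the first line to arrive is shown and the other nine are held back as duplicates, which matches the single `7` visible in the transcript above.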

Deduplication can be disabled by setting an environment variable:

export RAY_DEDUP_LOGS=0
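
In an interactive session such as IPython, the same variable can be set from Python instead of the shell. The sketch below assumes that Ray reads `RAY_DEDUP_LOGS` when it is imported and initialized in the driver process, so the variable has to be set first. The helper name `disable_ray_log_dedup` is hypothetical.

```python
import os


def disable_ray_log_dedup(env=os.environ):
    """Set RAY_DEDUP_LOGS=0 in the given environment mapping (hypothetical helper)."""
    env["RAY_DEDUP_LOGS"] = "0"
    return env


# Call this before `import ray` (assumption: the variable is read at import/init time).
disable_ray_log_dedup()
# import ray
# ray.init()
```

If Ray has already been imported in the session, restarting the interpreter before setting the variable is the safer option.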

Solution

If the printed lines consist only of numbers, deduplicating them is a really bad user experience, because the numbers are all the information that users want to see. If they can't see them, they will be confused.

Submitted #45485 to address this.
