[Serve] Add logger with backend and replica tags #14251
I wonder if this can be a bit more magical and easier to use:
That would be really cool @simon-mo, do you have an idea of how to implement that? I guess we would need to wrap the standard Python logger and implement each method.
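One way to wrap the standard logger without reimplementing each method is `logging.LoggerAdapter` from the standard library, which routes `info`, `warning`, and the rest through a single `process` hook. The following is a minimal sketch of that idea, not what this PR implements; `TaggedLoggerAdapter` and `make_tagged_logger` are hypothetical names, and the tag values are passed in by hand rather than read from Serve.

```python
import io
import logging


class TaggedLoggerAdapter(logging.LoggerAdapter):
    """Appends backend and replica tags to every message (hypothetical helper)."""

    def process(self, msg, kwargs):
        tags = (f"backend={self.extra['backend_tag']} "
                f"replica={self.extra['replica_tag']}")
        return f"{msg} {tags}", kwargs


def make_tagged_logger(name, backend_tag, replica_tag):
    # The tags are passed in explicitly here; inside Serve they would come
    # from whatever context the replica exposes (an assumption).
    return TaggedLoggerAdapter(
        logging.getLogger(name),
        {"backend_tag": backend_tag, "replica_tag": replica_tag})


# Usage: capture the output in memory to inspect the resulting line.
stream = io.StringIO()
base = logging.getLogger("serve_adapter_demo")
base.addHandler(logging.StreamHandler(stream))
base.setLevel(logging.INFO)
base.propagate = False

logger = make_tagged_logger(
    "serve_adapter_demo", "my_backend", "my_backend#krcwoa")
logger.info("Some info!")
demo_output = stream.getvalue()
```

Because the adapter only rewrites the message, existing handlers and formatters on the underlying logger are left untouched.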
doc/source/serve/deployment.rst
Outdated
For a general overview of logging in Ray, see `Ray Logging <../ray-logging.html>`__.

When looking through log files of your Ray Serve application, it is useful to know which backend and replica each log line originated from.
To automatically tag your logs with the current backend tag and replica tag (of the form ``backend_tag#<random letters>``), use the following function:

.. autofunction:: ray.serve.get_backend_logger

To write your own custom logger using Python's ``logging`` package, you may find the following two methods useful:

.. autofunction:: ray.serve.get_current_backend_tag
.. autofunction:: ray.serve.get_current_replica_tag
I'd add a subsection here saying "configuring logger" or something, then another below that says "Loki Tutorial"
doc/source/serve/deployment.rst
Outdated

Ray Serve logs can be ingested by your favorite external logging agent. Ray logs from the current session are exported to the directory ``/tmp/ray/session_latest/logs`` and remain there until the next session starts.

Here is a quick walkthrough of how to explore and filter your logs using `Loki <https://grafana.com/oss/loki/>`__.
Note here that we'll set it up manually, but it's much easier to configure on Kubernetes. Actually, it might be worth having another tutorial on k8s. We can do that in the future.
python/ray/serve/tests/test_api.py
Outdated
@@ -1012,6 +1012,29 @@ def f(starlette_request):
}


def test_backend_logger(serve_instance):
    # Tests that the backend logger can be created and used without errors.
    # Does not test the correctness of the log output.
It's probably worth testing this to avoid regressions. Can we redirect the logger to a tempfile and read that file or something similar?
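The tempfile idea could look roughly like the sketch below, which uses only the standard library. It attaches a `logging.FileHandler` pointing at a temporary file, emits a line, and reads the file back. The helper name `capture_log_lines`, the logger name, and the format string are assumptions for illustration. In a real Serve test the log call would run inside a replica process, so the file path would have to be reachable from both that process and the test.

```python
import logging
import os
import tempfile


def capture_log_lines(logger, emit):
    """Routes `logger` to a temporary file while `emit()` runs; returns lines."""
    fd, path = tempfile.mkstemp(suffix=".log")
    os.close(fd)
    handler = logging.FileHandler(path)
    handler.setFormatter(logging.Formatter("%(levelname)s -- %(message)s"))
    logger.addHandler(handler)
    try:
        emit()
        handler.flush()
        with open(path) as f:
            return f.read().splitlines()
    finally:
        # Always detach the handler and delete the file, even on failure.
        logger.removeHandler(handler)
        handler.close()
        os.remove(path)


def test_tags_appear_in_output():
    logger = logging.getLogger("serve_tempfile_demo")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    lines = capture_log_lines(
        logger,
        lambda: logger.info(
            "Some info! backend=my_backend replica=my_backend#krcwoa"))
    assert len(lines) == 1
    assert "backend=my_backend" in lines[0]
    assert "replica=my_backend#krcwoa" in lines[0]
    return lines
```

Asserting on substrings rather than the whole line keeps the test stable if the surrounding log format changes.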
@architkulkarni the heavy lifting is already done by ray_logging.py, written by @rkooo567. You can just add this
and ask users to use the Ray logger:
The context can be automatically filled in.
This eliminates the need for the explicit … Additionally, I think we should put …
Let's make sure to make the following changes.
- Use ray logger
- move get_current_x -> some sort of context API
python/ray/serve/api.py
Outdated

def get_backend_context():
@edoakes thoughts on this API?
Just for context, I based it off the API for get_runtime_context() in Ray Core: https://docs.ray.io/en/master/package-ref.html#runtime-context-apis.
ray_logger = logging.getLogger("ray")
for handler in ray_logger.handlers:
    handler.setFormatter(
        logging.Formatter(
            handler.formatter._fmt +
            f" component=serve backend={self.backend_tag} "
            f"replica={self.replica_tag}"))
@rkooo567 I want to make sure this is OK for the Ray logging stack. Thoughts?
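To see the effect of the formatter change in isolation, here is a small standalone sketch that applies the same technique to a demo logger instead of the `"ray"` logger. The helper name `append_tags_to_handlers` and the demo logger are hypothetical. Two caveats apply to the technique itself: `handler.formatter` is `None` for a handler that never had a formatter set, and `_fmt` is a private attribute of `logging.Formatter`, so the sketch guards the first case and flags the second in a comment. Building a new `Formatter` this way also drops any custom `datefmt` the old one had.

```python
import io
import logging


def append_tags_to_handlers(logger, suffix):
    """Appends `suffix` to the format string of each handler on `logger`."""
    for handler in logger.handlers:
        # handler.formatter is None when no formatter was set explicitly.
        # _fmt is private to logging.Formatter and may change between
        # Python versions.
        if handler.formatter is not None:
            base_fmt = handler.formatter._fmt
        else:
            base_fmt = "%(message)s"
        handler.setFormatter(logging.Formatter(base_fmt + suffix))


# Usage: one handler without a formatter, one with a custom format.
stream = io.StringIO()
demo = logging.getLogger("serve_formatter_demo")
demo.propagate = False
demo.setLevel(logging.INFO)
plain = logging.StreamHandler(stream)
styled = logging.StreamHandler(stream)
styled.setFormatter(logging.Formatter("%(levelname)s -- %(message)s"))
demo.addHandler(plain)
demo.addHandler(styled)

append_tags_to_handlers(
    demo, " component=serve backend=my_backend replica=my_backend#krcwoa")
demo.info("Some info!")
formatted_lines = stream.getvalue().splitlines()
```

Since the suffix becomes part of a `%`-style format string, a tag containing a literal `%` would need to be escaped as `%%`.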
python/ray/serve/api.py
Outdated
    backend_tag = _INTERNAL_REPLICA_CONTEXT.backend_tag
    replica_tag = _INTERNAL_REPLICA_CONTEXT.replica_tag

    return BackendContext(backend_tag, replica_tag)
Can we just merge InternalReplicaContext and BackendContext? You can make the controller_name a private field, _internal_controller_name or something, so we don't have an extra data structure that's just a subset of InternalReplicaContext.
Additionally, naming-wise, this is ReplicaContext, not BackendContext, because it contains the replica_tag.
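A merged structure along those lines might look like the sketch below. This is an illustration of the suggestion, not the code that was merged: the class and field names follow the comment above, while `set_internal_replica_context`, `get_replica_context`, and the error message are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicaContext:
    """Single context object holding both public and internal fields."""
    backend_tag: str
    replica_tag: str
    # Leading underscore marks this as internal, not user-facing API.
    _internal_controller_name: str


# Module-level slot, set once when a replica starts (hypothetical wiring).
_INTERNAL_REPLICA_CONTEXT = None


def set_internal_replica_context(backend_tag, replica_tag, controller_name):
    global _INTERNAL_REPLICA_CONTEXT
    _INTERNAL_REPLICA_CONTEXT = ReplicaContext(
        backend_tag, replica_tag, controller_name)


def get_replica_context():
    # Return the stored object directly; no second, subset-only type needed.
    if _INTERNAL_REPLICA_CONTEXT is None:
        raise RuntimeError(
            "get_replica_context() may only be called from within a replica.")
    return _INTERNAL_REPLICA_CONTEXT
```

Returning the one stored object means user code reads `backend_tag` and `replica_tag` from the same instance the internals use, so the two can never drift apart.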
Yup, that makes a lot of sense
Let me also move the doc image from this repo to the external images repo.
doc/source/serve/deployment.rst
Outdated

.. code-block:: bash

    INFO -- Some info! component=serve backend=my_backend replica=my_backend#krcwoa
can you update this? it should print ray logging style now
Thanks, good catch. Updated
Why are these changes needed?
This PR adds a new API method, serve.get_backend_logger(). When called from a backend, it returns a logger which prepends the backend tag and replica tag to each log line. The PR also includes an end-to-end tutorial for filtering Ray logs by backend tag using Loki and Grafana.
Related issue number
Closes #13917
Closes #13916
Checks
I've run scripts/format.sh to lint the changes in this PR.