[Dashboard] Improve handling of logs and errors in dashboard backend #5857
Test FAILed.
Can one of the admins verify this patch?
Few minor comments. Otherwise looks good! I really like the massive refactoring of hostname -> ip
jenkins add to whitelist
Was anything done to remove the token, or is anyone doing anything interesting with, say, nginx to make access to the web server easier than reconfiguring nginx each time? I could of course set up a reverse tunnel each time, but it would be better if the URI anchors worked long-term behind a reverse proxy (like nginx).
Hi @virtualluke, we'll be removing the token and addressing some of the other concerns brought up on the Ray Slack group in a separate PR right after this one, thanks.
…ay-project#5857) * Improve handling of logs and errors in dashboard backend * Update nested dict comprehension for clarity
Why are these changes needed?
This addresses an efficiency issue in the initial version of the dashboard backend, in which the full set of logs and errors was sent on every update. This PR sends only the log and error counts by default and adds two new API routes for retrieving the logs and errors themselves, either for a specific (IP address, PID) pair or for everything on a single IP address.
This PR also unifies the backend handling of logs and errors by consistently using IP addresses as identifiers, instead of IP addresses for logs and hostnames for errors.
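As a rough illustration of the counts-by-default idea, here is a minimal sketch. The `LogStore` class and its method names are hypothetical and are not taken from this PR's code; the sketch only assumes that entries are keyed by IP address and then by PID, as described above.

```python
from collections import defaultdict


class LogStore:
    """Hypothetical in-memory store keyed by IP address, then by PID."""

    def __init__(self):
        # ip -> pid -> list of log (or error) lines
        self._entries = defaultdict(lambda: defaultdict(list))

    def add(self, ip, pid, line):
        self._entries[ip][pid].append(line)

    def counts(self):
        """Small payload for the regular update: counts only, no log text."""
        return {
            ip: {pid: len(lines) for pid, lines in by_pid.items()}
            for ip, by_pid in self._entries.items()
        }

    def get(self, ip, pid=None):
        """Full entries for one (ip, pid) pair, or for every PID on an IP."""
        by_pid = self._entries.get(ip, {})
        if pid is None:
            return {p: list(lines) for p, lines in by_pid.items()}
        return {pid: list(by_pid.get(pid, []))}
```

With this shape, the periodic update would carry only `counts()`, and a client would call something like `get(ip, pid)` or `get(ip)` on demand when the user opens the logs or errors for a worker or a node.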
In addition, a minor discrepancy in the reporter on some systems is fixed by replacing the custom determine_ip_address() function with a call to ray.services.get_node_ip_address().
CC @simon-mo @robertnishihara
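For context on why hand-rolled IP detection can disagree across systems, here is a sketch of one common technique for finding a node's outward-facing address. This is not claimed to be how ray.services.get_node_ip_address() is implemented; the function name `guess_node_ip_address` and the probe address are assumptions made for illustration.

```python
import socket


def guess_node_ip_address(probe_address="8.8.8.8:53"):
    """Hypothetical helper: guess the IP of the interface used for outbound traffic."""
    host, port = probe_address.split(":")
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Connecting a UDP socket sends no packets; it only selects a route,
        # which lets us read back the local address chosen for that route.
        sock.connect((host, int(port)))
        ip = sock.getsockname()[0]
    except OSError:
        # No route available: fall back to resolving the hostname, which may
        # return a loopback address on some systems.
        try:
            ip = socket.gethostbyname(socket.gethostname())
        except OSError:
            ip = "127.0.0.1"
    finally:
        sock.close()
    return ip
```

Using one shared helper everywhere means logs, errors, and reporter data are all labeled with the same address for a given node.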
Related issue number
N/A
Checks
I've run scripts/format.sh to lint the changes in this PR.