Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent potentially misleading error message about "no heartbeat" #2180

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

okurz
Copy link
Contributor

@okurz okurz commented May 16, 2024

In the context of
github.com/os-autoinst/openQA
we observed error messages coming from Mojo::Server::Prefork
when otherwise there is no harmful effect observed. As the underlying
prefork module should only inform about a potentially erroneous
situation but can not know about the effect on any component that uses
the prefork module the log message instead should not claim an error but
rather only inform. Hence this commit reduces the log level from error
to info accordingly.

Related progress issue: https://progress.opensuse.org/issues/138536

In the context of
github.com/os-autoinst/openQA
we observed error messages coming from Mojo::Server::Prefork
when otherwise there is no harmful effect observed. As the underlying
prefork module should only inform about a potentially erroneous
situation but can not know about the effect on any component that uses
the prefork module the log message instead should not claim an error but
rather only inform. Hence this commit reduces the log level from error
to info accordingly.

Related progress issue: https://progress.opensuse.org/issues/138536
@guest20
Copy link

guest20 commented May 16, 2024

Is there a reason it's not possible for y'all to extended the timeout it with the attribute "heartbeat_timeout" in Mojo::Server::Prefork since your application requires it?

@okurz
Copy link
Contributor Author

okurz commented May 16, 2024

Is there a reason it's not possible for y'all to extended the timeout it with the attribute "heartbeat_timeout" in Mojo::Server::Prefork since your application requires it?

yes, we already use 900s. As far as I understood the evaluation in https://progress.opensuse.org/issues/106759 and https://progress.opensuse.org/issues/138536 regarding how this is used within the context of github.com/os-autoinst/openQA/ we will run into the timeout regardless. Probably something in the implementation is sub-optimal but anyway I still think the message with a log level of "error" is too strict

@kraih
Copy link
Member

kraih commented May 16, 2024

Why info over warn?

@okurz
Copy link
Contributor Author

okurz commented May 16, 2024

Why info over warn?

I would say same reasoning because I think this component can not know if the condition is harmful to the caller but if you guys prefer "warn" that would also help us.

@guest20
Copy link

guest20 commented May 27, 2024

Am I misunderstanding what the log message means?

Something blocked the worker for more than the heartbeat timeout and the supervisor has decided to kill that worker, causing anybody with a pending request to that worker to never receive any more bytes in response... Is that the situation this message warns about?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants