Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increase log levels of definitely-broken network disconnects #66473

Closed
DaveCTurner opened this issue Dec 16, 2020 · 3 comments · Fixed by #81768
Closed

Increase log levels of definitely-broken network disconnects #66473

DaveCTurner opened this issue Dec 16, 2020 · 3 comments · Fixed by #81768
Assignees
Labels
:Distributed/Network Http and internode communication implementations >enhancement Team:Distributed Meta label for distributed team

Comments

@DaveCTurner
Copy link
Contributor

In #51612 we promoted log messages regarding unexpected network disconnects from TRACE to DEBUG. Some users have indicated that they'd like even more visibility on these by default, at least for the ones that are definite indications of something going wrong at the network level, which IMO is at least Connection reset and Connection timed out on Linux. Normally we shouldn't be seeing either of these so I'm opening this issue to propose moving them all the way up to WARN.

I'm not decided on whether to push for this change in 7.x or to wait for 8.0. The trouble with doing it in 7.x is that transport clients are still in use in these versions and clients may have less reliable connectivity to Elasticsearch than we expect for node-to-node connections, so some of the logs we emit might not be actionable for the server administrator.

We might need a sync discussion on this topic eventually but for now I'd prefer to gather async thoughts here first.

@DaveCTurner DaveCTurner added >enhancement feedback_needed :Distributed/Network Http and internode communication implementations labels Dec 16, 2020
@elasticmachine elasticmachine added the Team:Distributed Meta label for distributed team label Dec 16, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@original-brownbear
Copy link
Member

We discussed this in our sync today and everyone was in agreement that the benefits of doing this in 7.x outweigh this issue:

 The trouble with doing it in 7.x is that transport clients are still in use in these versions and clients may have less reliable connectivity to Elasticsearch than we expect for node-to-node connections, so some of the logs we emit might not be actionable for the server administrator.

which should be rare and while not always actionable you could argue that knowing that client connections are unstable will often be valuable as well -> Removing team-discuss here and unless you have any concerns @DaveCTurner this one should be good to go.

@DaveCTurner DaveCTurner self-assigned this Aug 26, 2021
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this issue Dec 15, 2021
In elastic#51612 we promoted certain log messages regarding unexpected network
exceptions from `TRACE` to `DEBUG`. In fact it's often useful to see
these exceptions by default, so with this commit we show the message
(but not the stack trace) at `INFO` level. This commit also adds some
commentary about what each of the exceptions means.

Closes elastic#66473
@DaveCTurner
Copy link
Contributor Author

I looked at this some more and concluded that even Connection reset and Connection timed out can have benign causes sometimes so it wouldn't be appropriate to have them at WARN level, but I do see value in reporting them at INFO level so they appear in the logs by default. See #81768.

DaveCTurner added a commit that referenced this issue Dec 16, 2021
In #51612 we promoted certain log messages regarding unexpected network
exceptions from `TRACE` to `DEBUG`. In fact it's often useful to see
these exceptions by default, so with this commit we show the message
(but not the stack trace) at `INFO` level. This commit also adds some
commentary about what each of the exceptions means.

Closes #66473
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed/Network Http and internode communication implementations >enhancement Team:Distributed Meta label for distributed team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants