Improved logging in NATS-to-NATS connections #622
Comments
I think being able to dynamically turn on tracing and debug modes without a restart might be a better direction here.
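For illustration only, runtime toggling of debug output could be done roughly as in the sketch below: a process-wide flag flipped by a Unix signal handler. This is a hypothetical standalone example, not gnatsd's actual mechanism; the `debugf` helper and the choice of SIGUSR2 are assumptions.

```go
// Hypothetical sketch: toggle debug logging at runtime with SIGUSR2 (Unix-only).
// Not gnatsd code; just demonstrates the idea of enabling debug without a restart.
package main

import (
	"log"
	"os"
	"os/signal"
	"sync/atomic"
	"syscall"
)

var debugEnabled int32 // 0 = off, 1 = on

// debugf only emits output while the flag is set.
func debugf(format string, v ...interface{}) {
	if atomic.LoadInt32(&debugEnabled) == 1 {
		log.Printf("[DBG] "+format, v...)
	}
}

func main() {
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGUSR2)
	go func() {
		// Each SIGUSR2 flips debug logging on or off.
		for range sigs {
			if atomic.CompareAndSwapInt32(&debugEnabled, 0, 1) {
				log.Printf("debug logging enabled")
			} else {
				atomic.StoreInt32(&debugEnabled, 0)
				log.Printf("debug logging disabled")
			}
		}
	}()

	debugf("only visible after SIGUSR2 has been sent") // suppressed by default
	select {}                                          // keep running so the signal can be observed
}
```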
Debugging would instantly overwhelm my /var partition :) It might be useful, but basic error logging and info for major events will go very far.
Just to expand on that: I generally never want to see debug output in a production setting. With very large numbers of nodes connected, sending many things, and clients coming and going, it would create a storm of noise and you'll miss what's going on. Further, it's not retrospective; you cannot run in debug all the time. The lines I highlighted, though, I always want to see; they are critical for the operability of the software imo. I want to be able to go back and review logs after an incident and know this happened, and it should be safe to always expose this data. Logging at the appropriate level is the correct action here; this is not debug information. Not to say that being able to enable debug/trace dynamically would not be good, but it would not solve this problem.
In general, I kind of agree that these could be elevated, but I have a concern about the one trying to establish the route. If you have a static route but the remote server is not running, this notice/error would then be printed every 2 seconds. With config reload you should be able to remove the route, though. I agree that the dynamic nature of enabling logging may not help once the event you are interested in has already happened.
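To illustrate the flooding concern: a reconnect loop retrying every ~2 seconds would emit the same notice constantly if the remote stays down. One possible way to keep the message at a higher level without the flood is to rate-limit it, as in the sketch below. The `routeRetryLogger` type and its behaviour are purely illustrative assumptions, not part of gnatsd.

```go
// Hypothetical sketch: rate-limit a repeated "can't reach route" notice so a
// down remote retried every ~2s doesn't dominate the log.
package main

import (
	"log"
	"time"
)

type routeRetryLogger struct {
	last     time.Time
	interval time.Duration
	skipped  int
}

// Noticef logs at most once per interval and summarizes suppressed attempts.
func (r *routeRetryLogger) Noticef(format string, v ...interface{}) {
	if time.Since(r.last) < r.interval {
		r.skipped++
		return
	}
	if r.skipped > 0 {
		log.Printf("(%d similar messages suppressed)", r.skipped)
		r.skipped = 0
	}
	r.last = time.Now()
	log.Printf(format, v...)
}

func main() {
	rl := &routeRetryLogger{interval: 5 * time.Second}
	// Simulate a reconnect loop against an unreachable static route.
	for i := 0; i < 5; i++ {
		rl.Noticef("error trying to connect to route: connection refused")
		time.Sleep(2 * time.Second)
	}
}
```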
@kozlovic good point about the remote server messages, and this will be made worse by the business of announced cluster members never expiring if the node is down, which is in itself probably something worth knowing. In general I think it's normal and expected that those messages would appear, though; admins are used to that kind of thing.
Also: you're CLOSING the connection after logging a debug message, that's so unfriendly to operators.
I believe we have addressed this in #692.
Feature Requests
Use Case:
improved operability of clusters
Proposed Change:
We need better logging in clusters. For example, I found slow consumer messages in my logs; I believe these come from the cluster node <-> cluster node connections, but it's hard to say.
Elevating these lines to info would help:
https://github.com/nats-io/gnatsd/blob/ee7b97e6ee3068900d39f1fe4ae7b75f358416ab/server/route.go#L142
https://github.com/nats-io/gnatsd/blob/ee7b97e6ee3068900d39f1fe4ae7b75f358416ab/server/route.go#L737
Elevating this to error would help:
https://github.com/nats-io/gnatsd/blob/ee7b97e6ee3068900d39f1fe4ae7b75f358416ab/server/route.go#L168
An additional change would be to better log when cluster connections drop, on the side that initiated the connection, so we know this happens.
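For illustration, the requested change amounts to moving these route lifecycle events from debug level up to info/error level so they survive in normal production logs. The sketch below is a minimal, self-contained mock-up of that idea; the message text, logger type, and method set are placeholders and do not reproduce the actual content of the linked route.go lines.

```go
// Minimal sketch of the proposal: route lifecycle events logged at info/error
// instead of debug, so they are retained without running the server in debug mode.
// Names and messages are illustrative only, not gnatsd's.
package main

import "log"

type leveledLogger struct{ debug bool }

func (l *leveledLogger) Debugf(f string, v ...interface{}) {
	if l.debug {
		log.Printf("[DBG] "+f, v...)
	}
}
func (l *leveledLogger) Noticef(f string, v ...interface{}) { log.Printf("[INF] "+f, v...) }
func (l *leveledLogger) Errorf(f string, v ...interface{})  { log.Printf("[ERR] "+f, v...) }

func main() {
	l := &leveledLogger{debug: false} // typical production setting

	// As-is: lost unless debug logging is enabled.
	l.Debugf("route connection established: %s", "otherserver:6222")

	// Proposed: always recorded, so incidents can be reviewed after the fact.
	l.Noticef("route connection established: %s", "otherserver:6222")
	l.Errorf("route connection closed unexpectedly: %s", "otherserver:6222")
}
```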
Who Benefits From The Change(s)?
cluster operators
Alternative Approaches
n/a