Monitoring and logging HTTP 500 errors & HTTP 200s for Vault? #13072
Comments
Hi @vallamost, on a busy Vault, audit logs are often too large to scan in real time for monitoring purposes. While I'm not opposed to adding some details like this to the audit log, I would prefer to prioritize adding some metrics to provide this information - something like
These HTTP codes don't have to be in the audit logs... If there's a better logging location for them, then that would be preferred. Wouldn't a proposed If status codes are only available as a metric and they're excluded from logs, then it seems like advanced debugging is harder than it should be. If an admin saw a large uptick in 400 status codes, I'm sure they would like to filter and search through their logs for 400 status codes and find the culprit. Just having a metric would tell you there's an issue, but without Vault web server logs that include HTTP status codes you would be stuck going through all of your client logs, if you even have those or have access to them... that's no bueno.
What other error codes were you thinking of? The rest of your comments make me think that we have different visions of what the audit log is for and how it's meant to be used. Some differences between the Vault audit log and an HTTP server's request log:
There's some overlap between the use cases for audit logging and request logging, but they're not the same thing. In the course of this discussion I realized that Vault should have an (opt-in) request logging feature like you're envisioning, whereby some key fields like (code, path, method) are logged to the regular server log. It's something I've wanted in the past but always assumed we had a reason for not doing... turns out I was wrong! We're already working on a related feature, so we'll throw that in there.
The related feature in question will be in 1.10, changelog entry:
Docs for the logging part of it:
Based on the comments, it seems that the requested capability in Vault is already addressed. I am going to close this ticket for now. Please reopen this issue or open a new one for further discussions.
Is your feature request related to a problem? Please describe.
It appears that Vault does not include the HTTP status code of a response to a client in its logs, going by what is documented here at the time of writing: https://support.hashicorp.com/hc/en-us/articles/360000995548-Audit-and-Operational-Log-Details
I am trying to diagnose and root cause HTTP 500 errors being received by our clients that are interacting with our Vault service's API. At this time it is impossible to know the number of HTTP 500 errors being thrown by our Vault service. If I could set up a logging filter to know when and where HTTP 500s are being thrown, as well as get their request IDs to troubleshoot the request in the stack, that would be super valuable.
Describe the solution you'd like
A clear and concise description of what you want to happen.
I'd like to see Vault add a new attribute in the JSON log output for the http_response_code sent in a response to a client calling Vault's REST API.
Example log entry
Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.
Log all error responses and categorize them to some type of HTTP status?
Explain any additional use-cases
If there are any use-cases that would help us understand the use/need/value please share them as they can help us decide on acceptance and prioritization.
Debugging HTTP errors, monitoring for HTTP errors, looking at success rates and availability.
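To make these use cases concrete, here is a rough sketch of the kind of log filter this would enable. It assumes JSON-lines log output in a flat, hypothetical layout with a `request_id` key and the `http_response_code` attribute proposed in this issue; neither the field names nor the layout are taken from Vault's actual log format, and the function name is made up for illustration.

```python
import json
from collections import Counter


def summarize_status_codes(lines, field="http_response_code"):
    """Summarize JSON-lines log entries by HTTP status code.

    Assumes each entry is a flat JSON object with a hypothetical
    `request_id` key and the status field proposed in this issue.
    Returns (counts per code, request IDs of 5xx responses, 2xx rate).
    """
    counts = Counter()
    server_error_ids = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON
        if not isinstance(entry, dict):
            continue
        code = entry.get(field)
        # Ignore entries without an integer status code
        if isinstance(code, bool) or not isinstance(code, int):
            continue
        counts[code] += 1
        if 500 <= code <= 599:
            server_error_ids.append(entry.get("request_id"))
    total = sum(counts.values())
    successes = sum(n for c, n in counts.items() if 200 <= c <= 299)
    success_rate = successes / total if total else None
    return counts, server_error_ids, success_rate


# Made-up sample entries, not real Vault output
sample_lines = [
    '{"request_id": "a1", "http_response_code": 200}',
    '{"request_id": "b2", "http_response_code": 500}',
    "not a json line",
    '{"request_id": "c3", "http_response_code": 200}',
    '{"request_id": "d4", "http_response_code": 400}',
]

counts, server_error_ids, success_rate = summarize_status_codes(sample_lines)
print(counts, server_error_ids, success_rate)
```

With something like this, an uptick in 500s could be traced back to individual request IDs on the server side instead of having to collect client logs.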
Additional context
Add any other context or screenshots about the feature request here.