Add app-id and drain url in the error message #212
- When one application has two or more syslog drains bound and one of the drains drops logs, the app owner can now see the syslog URL of the drain that is dropping them.
- When one syslog drain is bound to multiple applications and the drain drops logs, the app owner will see the app ID in the logs.

Co-authored-by: Felix Hambrecht <Felix.Hambrecht@sap.com>
Hmm, this seems slightly off to me. When using an app drain, I'd expect to see:
However, aggregate drains exist and use some of the same data structures. For aggregate drains, the "appid" is blank. So if you have an aggregate drain, I think you would see
Otherwise, love it.
@Benjamintf1 good catch. I guess we'll have to adapt the text of the message depending on whether the app ID is blank or not. I would say that, since aggregate drains are configured by a CF operator and are generally used for central logging platforms, the CF operators would check the state of the drains by adding monitoring and alerting based on the ingress, egress and dropped metrics. This error message would be essential for "external users" who do:
Yeah, you're right. I think maybe ignore my objection lol.
If you want to be fancy, maybe switch the code so that it emits the code under the syslog agent itself for aggregate drains and make the app message make sense?
I'm going to wait for the PR pipeline to run and merge tomorrow morning if it succeeds.
I've checked what it would look like to move the code that formats the log messages, and I don't think it's worth moving anything:
It would have been nice if we could have added an alerting function to an existing instance of a DiodeWriter/OneToOneEnvelopeV2 diode, but it seems to me that, to be able to do that, we'd need to refactor code.cloudfoundry.org/go-diodes first. If we want to emit the error message for the aggregate drains as well, we could emit different messages based on the
func (w *SyslogConnector) Connect(ctx context.Context, b Binding) (egress.Writer, error) {
	...
	dw := egress.NewDiodeWriter(ctx, writer, diodes.AlertFunc(func(missed int) {
		w.droppedMetric.Add(float64(missed))
		drainDroppedMetric.Add(float64(missed))
		var logMsg string
		if b.AppId == "" {
			// aggregate drain
			logMsg = fmt.Sprintf(
				"Dropped %d %s logs for aggregate drain with url %s",
				missed, urlBinding.Scheme(), anonymousUrl.String(),
			)
		} else {
			// application drain
			logMsg = fmt.Sprintf(
				"%d messages lost for application %s in user provided syslog drain with url %s",
				missed, b.AppId, anonymousUrl.String(),
			)
		}
		w.emitLoggregatorErrorLog(b.AppId, logMsg)
		w.emitStandardOutErrorLog(b.AppId, urlBinding.Scheme(), anonymousUrl.String(), missed)
	}), w.wg)
	...
and the
func (w *SyslogConnector) emitLoggregatorErrorLog(appID, message string) {
	if appID == "" {
		// aggregate drain: no app info to attach
		w.logClient.EmitLog(message)
		return
	}
	option := loggregator.WithAppInfo(appID, "LGR", "")
	w.logClient.EmitLog(message, option)
	option = loggregator.WithAppInfo(
		appID,
		"SYS",
		w.sourceIndex,
	)
	w.logClient.EmitLog(message, option)
}
TBH, I'm not sure if @Benjamintf1 Do you maybe have some other, better idea?
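To make the two message variants easier to compare, the branching from the snippet above can be pulled into a small standalone function. This is only a sketch: `dropMessage` and the sample app ID and URL are hypothetical names introduced for illustration, and the format strings are copied from the proposal above rather than from the merged code.

```go
package main

import "fmt"

// dropMessage is a hypothetical helper (not part of the syslog-agent) that
// mirrors the proposed branching: an empty appID is treated as an aggregate
// drain, anything else as an application drain.
func dropMessage(appID, scheme, drainURL string, missed int) string {
	if appID == "" {
		// aggregate drain
		return fmt.Sprintf(
			"Dropped %d %s logs for aggregate drain with url %s",
			missed, scheme, drainURL,
		)
	}
	// application drain
	return fmt.Sprintf(
		"%d messages lost for application %s in user provided syslog drain with url %s",
		missed, appID, drainURL,
	)
}

func main() {
	// Sample values are made up for the example.
	fmt.Println(dropMessage("", "https", "https://example.com", 5))
	fmt.Println(dropMessage("app-guid-1", "https", "https://example.com", 5))
}
```

Keeping the formatting in a pure function like this would also make it straightforward to unit test each variant without a diode or a log client.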
For aggregate drains, you'd just want to emit it with the syslog agent's source ID instead of the application ID. That said, given we don't emit it at all right now, we can handle that with a different issue/PR. That is, IF we want to do that at all, which is maybe unnecessary.
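A rough sketch of that idea, for whoever picks it up later: choose the source ID for the error log based on whether the binding has an app ID. The function name `errorLogSourceID` and the `agentSourceID` value are assumptions made for this example; the agent's real source ID is not stated in this thread.

```go
package main

import "fmt"

// agentSourceID is a placeholder value assumed for this sketch only.
const agentSourceID = "syslog_agent"

// errorLogSourceID is a hypothetical helper: aggregate drains (blank app ID)
// would be attributed to the agent itself, app drains to the application.
func errorLogSourceID(appID string) string {
	if appID == "" {
		return agentSourceID
	}
	return appID
}

func main() {
	fmt.Println(errorLogSourceID(""))
	fmt.Println(errorLogSourceID("app-guid-1"))
}
```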
Here's your release! https://github.com/cloudfoundry/loggregator-agent-release/releases/tag/v7.1.0
As discussed in #203, we've seen that when an app has multiple syslog drains, or one syslog drain is bound to multiple apps, it was hard to tell which syslog drain for which app was dropping logs. We've adjusted the error message that is injected into the application's logs so that this becomes clear.
The error messages now look as follows:
We've adjusted the unit tests and ran an integration test: we deployed the new version of the syslog-agent and checked the app logs for an application that produces many logs per second. We've also tested with a syslog URL like https://user:pass@example.com?p1=v1, and in the logs we only saw the "base url", https://example.com, from the mentioned example. The anonymization that @ctlong implemented works properly as well 🎉
Fixes #203
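For readers who want to reproduce the URL check locally, the effect described above (credentials and query parameters removed, base URL kept) can be approximated with Go's standard `net/url` package. This is a sketch under that assumption only; `anonymizeURL` is a hypothetical name and is not the implementation used in the syslog-agent.

```go
package main

import (
	"fmt"
	"net/url"
)

// anonymizeURL is a hypothetical helper, not the syslog-agent's code.
// It removes the userinfo, query string and fragment from a URL so that
// only the scheme, host and path remain.
func anonymizeURL(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	u.User = nil         // drop user:pass
	u.RawQuery = ""      // drop ?p1=v1
	u.ForceQuery = false // avoid a trailing "?"
	u.Fragment = ""      // drop any #fragment
	return u.String(), nil
}

func main() {
	out, err := anonymizeURL("https://user:pass@example.com?p1=v1")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out) // prints "https://example.com"
}
```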
If you have any questions, or want to get attention for a PR or issue, please reach out on the #logging-and-metrics channel in the Cloud Foundry Slack.