Fix bugs in aggregator and filters, add initial tests #2
Close() was not synced through the main dispatcher loop, so it could close channels that were currently being written to by methods called from said dispatcher loop. This leads to a crash. Instead, Close() now writes a closeRequest, which is handled in the dispatcher.
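The close-request approach described above could look roughly like the following. This is a minimal sketch, not the PR's code: the `aggregator` type, its `events` channel, and the `handled` slice are stand-ins invented for illustration; only the names `closeRequests` and `done` come from the diff.

```go
package main

import "fmt"

// closeRequest asks the dispatcher to shut down; done is signalled once the
// dispatcher has finished doing so.
type closeRequest struct {
	done chan bool
}

// aggregator is a hypothetical stand-in for the real Aggregator, reduced to
// one work channel and the close-request channel.
type aggregator struct {
	events        chan string
	closeRequests chan *closeRequest
	handled       []string
}

func newAggregator() *aggregator {
	return &aggregator{
		events:        make(chan string),
		closeRequests: make(chan *closeRequest),
	}
}

// dispatch is the only goroutine that touches the aggregator's state, so a
// shutdown can never overlap with work the loop itself is doing.
func (a *aggregator) dispatch() {
	for {
		select {
		case e := <-a.events:
			a.handled = append(a.handled, e)
		case req := <-a.closeRequests:
			// Any cleanup would run here, inside the loop, before the
			// caller of Close() is released.
			req.done <- true
			return
		}
	}
}

// Close sends a request into the dispatcher and blocks until it is handled.
func (a *aggregator) Close() {
	req := &closeRequest{done: make(chan bool)}
	a.closeRequests <- req
	<-req.done
}

func main() {
	a := newAggregator()
	go a.dispatch()
	a.events <- "alert-1"
	a.events <- "alert-2"
	a.Close()
	fmt.Println(a.handled) // [alert-1 alert-2]
}
```

Because `Close()` only returns after the dispatcher has answered on `done`, reading `handled` afterwards is ordered after every write the loop made.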
Fix regex filters to match complete string. If someone specifies service = "foo-service", they probably don't want it to match service = "foo-servicebar".
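One common way to get full-string matching with Go's `regexp` package is to wrap the user's pattern in anchors. The sketch below assumes that approach; the helper name `compileFullMatch` is hypothetical and the PR may have implemented the fix differently.

```go
package main

import (
	"fmt"
	"regexp"
)

// compileFullMatch wraps the user-supplied pattern so that it has to match
// the entire value. The non-capturing group keeps alternations such as
// "a|b" from escaping the anchors.
func compileFullMatch(pattern string) (*regexp.Regexp, error) {
	return regexp.Compile("^(?:" + pattern + ")$")
}

func main() {
	re, err := compileFullMatch("foo-service")
	if err != nil {
		panic(err)
	}
	fmt.Println(re.MatchString("foo-service"))    // true
	fmt.Println(re.MatchString("foo-servicebar")) // false
}
```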
case req := <-a.closeRequests:
	a.closeInternal()
	req.done <- true
	return
If you return, it won't drain pending requests.
Hm, how important is this? If we want to stay with the old structure, I could add the foo, closed := <- ... to the remaining cases and check the value of closed before doing anything to fix the immediate crash. However, we also have timers that eventually put requests on these channels. So I guess Close() would need to stop these timers first and then close the channels they write into. I hope I'm not overlooking more, as this is getting a bit confusing.
There's another problem: when a channel is closed, the select case for it will not only be triggered once, but repeatedly in the for loop. So the closed counter is incremented multiple times for the same channel, until this has happened often enough that the for loop will exit (even though other channels might still be open). So we must either track closedness separately per channel or we ignore proper draining for now and fix it up later.
Ignore proper draining for now with a giant BUG. I am surprised that it re-emits closed when the damn thing is closed.
Added a BUG comment.
How was the Aggregator writing to itself by sending requests back to itself? This sounds problematic. Because we neither close the channels nor wait for them to actually report closed in the second value when receiving from them, we have no draining. :-(
}

func TestAggregator(t *testing.T) {
	scenarios := []struct {
1. Make a testAggregatorScenario struct{} with these inputs.
2. Make a function for it, (s *testAggregatorScenario) test(i int, t *testing.T) {}, that runs the actual tests.
3. Have the TestAggregator body just test via looping through each scenario.
   3.1. Maybe have a scenario setup and scenario close method to ensure cleanup.
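The structure suggested above can be sketched as follows. The names `testAggregatorScenario`, `test`, and `TestAggregator` come from the review comment; `countDistinct`, the struct fields, and the `check` helper are hypothetical stand-ins so that the example is self-contained. In a real package this would live in a `_test.go` file and run under `go test`.

```go
package main

import (
	"fmt"
	"testing"
)

// countDistinct is a hypothetical stand-in for the code under test.
func countDistinct(events []string) int {
	seen := map[string]bool{}
	for _, e := range events {
		seen[e] = true
	}
	return len(seen)
}

// testAggregatorScenario bundles the inputs and the expected result of one case.
type testAggregatorScenario struct {
	events   []string
	expected int
}

// check runs one scenario and reports a mismatch as an error.
func (s *testAggregatorScenario) check(i int) error {
	if got := countDistinct(s.events); got != s.expected {
		return fmt.Errorf("%d. expected %d, got %d", i, s.expected, got)
	}
	return nil
}

// test adapts check to the testing package.
func (s *testAggregatorScenario) test(i int, t *testing.T) {
	if err := s.check(i); err != nil {
		t.Error(err)
	}
}

var scenarios = []testAggregatorScenario{
	{events: nil, expected: 0},
	{events: []string{"a", "a"}, expected: 1},
	{events: []string{"a", "b"}, expected: 2},
}

// TestAggregator only loops over the scenarios.
func TestAggregator(t *testing.T) {
	for i := range scenarios {
		scenarios[i].test(i, t)
	}
}

func main() {
	// Outside of `go test`, run the same checks directly.
	failures := 0
	for i := range scenarios {
		if err := scenarios[i].check(i); err != nil {
			fmt.Println(err)
			failures++
		}
	}
	fmt.Println("failures:", failures)
}
```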
Did 1, 2, and 3; didn't do 3.1 because it would just add unneeded code (storing aggregator in struct, then closing from close()).
That's a good question why it crashes, since actually all the external calls are synced. When I revert the closing commit, the tests fail with:
Looking at it.
@matttproud Ah, I think that merely by closing the channels, the select{} got another event for each of them and the logic didn't check whether the channel was closed before executing whatever action was tied to that channel.
@matttproud Is this ok to merge for now? I have more stuff in the pipeline :)
Please do. 👍
Fix bugs in aggregator and filters, add initial tests
…eus#2) * Use the translator pattern for child to parent communication * Correctly navigate to alerts
so third parties, Grafana in particular, can over ride the validation. Grafana wants to do this because other data sources will have label keys with things like spaces, periods, or other characters - and looking for a better integration with alert manager. goes with grafana/grafana#38629 replaces #2694 Signed-off-by: Kyle Brandt <kyle@grafana.com>
Addresses:
Scanning your code and 410 packages across 83 dependent modules for known vulnerabilities...
=== Symbol Results ===
Vulnerability #1: GO-2024-2687
HTTP/2 CONTINUATION flood in net/http
More info: https://pkg.go.dev/vuln/GO-2024-2687
Module: golang.org/x/net
Found in: golang.org/x/net@v0.20.0
Fixed in: golang.org/x/net@v0.23.0
Example traces found:
#1: cli/root.go:122:52: cli.NewAlertmanagerClient calls config.NewClientFromConfig, which eventually calls http2.ConfigureTransports
#2: types/types.go:290:28: types.MultiError.Error calls http2.ConnectionError.Error
#3: notify/notify.go:998:21: notify.TimeActiveStage.Exec calls log.jsonLogger.Log, which eventually calls http2.ErrCode.String
#4: notify/notify.go:998:21: notify.TimeActiveStage.Exec calls log.jsonLogger.Log, which eventually calls http2.FrameHeader.String
#5: notify/notify.go:998:21: notify.TimeActiveStage.Exec calls log.jsonLogger.Log, which eventually calls http2.FrameType.String
#6: types/types.go:290:28: types.MultiError.Error calls http2.GoAwayError.Error
#7: notify/notify.go:998:21: notify.TimeActiveStage.Exec calls log.jsonLogger.Log, which eventually calls http2.Setting.String
#8: notify/notify.go:998:21: notify.TimeActiveStage.Exec calls log.jsonLogger.Log, which eventually calls http2.SettingID.String
#9: types/types.go:290:28: types.MultiError.Error calls http2.StreamError.Error
#10: api/v2/client/silence/silence_client.go:196:35: silence.Client.PostSilences calls client.Runtime.Submit, which eventually calls http2.Transport.NewClientConn
#11: api/v2/client/silence/silence_client.go:196:35: silence.Client.PostSilences calls client.Runtime.Submit, which eventually calls http2.Transport.RoundTrip
#12: notify/email/email.go:253:14: email.Email.Notify calls fmt.Fprintf, which eventually calls http2.chunkWriter.Write
#13: types/types.go:290:28: types.MultiError.Error calls http2.connError.Error
#14: types/types.go:290:28: types.MultiError.Error calls http2.duplicatePseudoHeaderError.Error
#15: test/cli/acceptance.go:362:3: cli.Alertmanager.Start calls http2.gzipReader.Close
#16: test/cli/acceptance.go:366:22: cli.Alertmanager.Start calls io.ReadAll, which calls http2.gzipReader.Read
#17: types/types.go:290:28: types.MultiError.Error calls http2.headerFieldNameError.Error
#18: types/types.go:290:28: types.MultiError.Error calls http2.headerFieldValueError.Error
#19: api/v2/client/silence/silence_client.go:196:35: silence.Client.PostSilences calls client.Runtime.Submit, which eventually calls http2.noDialH2RoundTripper.RoundTrip
#20: types/types.go:290:28: types.MultiError.Error calls http2.pseudoHeaderError.Error
#21: notify/email/email.go:253:14: email.Email.Notify calls fmt.Fprintf, which eventually calls http2.stickyErrWriter.Write
#22: test/cli/acceptance.go:362:3: cli.Alertmanager.Start calls http2.transportResponseBody.Close
#23: test/cli/acceptance.go:366:22: cli.Alertmanager.Start calls io.ReadAll, which calls http2.transportResponseBody.Read
#24: notify/notify.go:998:21: notify.TimeActiveStage.Exec calls log.jsonLogger.Log, which eventually calls http2.writeData.String
Your code is affected by 1 vulnerability from 1 module.
This scan also found 0 vulnerabilities in packages you import and 2 vulnerabilities in modules you require, but your code doesn't appear to call these vulnerabilities.
Use '-show verbose' for more details.
Signed-off-by: Holger Hans Peter Freyther <holger@freyther.de>
From the individual commits:
Close() was not synced through the main dispatcher loop, so it could close
channels that were currently being written to by methods called from said
dispatcher loop. This leads to a crash. Instead, Close() now writes a
closeRequest, which is handled in the dispatcher.
Fix regex filters to match complete string.
If someone specifies
service = "foo-service"
...they probably don't want it to match:
service = "foo-servicebar"