Notification spamming on constantly shifting data sources #10

Open
LouwMarais opened this issue Nov 1, 2017 · 3 comments

@LouwMarais

Constantly shifting data sources such as New Relic have no uniquely identifiable attributes.

Example NRQL:

select average(runtime) from myMetric since 1 day ago TIMESERIES 5 minute

This returns result data like:

{
  "results": [
    { "average": 0.2327708216880936 }
  ],
  "beginTimeSeconds": 1509491911,
  "endTimeSeconds": 1509492151,
  "inspectedCount": 3578
}

Metric config in Fox:

"metrics": {
  "MYTAB:MYTHING": {
    "description": "MyMetrics average Response Time (All) - 5min/24hrs",
    "tab": "MYTAB",
    "dataSource": "NewRelicInsights",
    "dataSourceQueries": [
      "select average(runtime) from myMetric since 1 day ago TIMESERIES 5 minute"
    ],
    "detection": "EmaBasedDetection.isPointOutsideEnvelopeStrategy",
    "detectionParams": {
      "emaInterval": 5,
      "envelopKoeff": 0.33,
      "ignoreHours": ["23:50", "00:10"]
    },
    "updateInterval": 5000,
    "style": {
      "width": "50%"
    }
  }
}

The update interval of 5 seconds configured here (updateInterval: 5000) resulted in a notification every 5 seconds.
I assume that, because the New Relic response contains no uniquely identifiable attributes, Fox cannot determine that it has already sent the message, so it triggers again on every update.

beginTimeSeconds/endTimeSeconds shift constantly, depending on exactly when the NRQL is executed on the New Relic servers, so they are never the same between two runs.
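
One possible way around the missing identifier, sketched here in Python purely as an illustration (this is not Fox's code, and the names alert_key, SentAlerts and BUCKET_SECONDS are hypothetical): snap the shifting beginTimeSeconds down to a fixed 5-minute boundary and treat the pair (metric name, boundary) as the identity of an alert. The 300-second bucket is an assumption taken from the TIMESERIES 5 minute clause in the query.

```python
from typing import Any, Dict, Set, Tuple

# Assumption: bucket width mirrors "TIMESERIES 5 minute" from the NRQL above.
BUCKET_SECONDS = 300


def alert_key(metric_name: str, point: Dict[str, Any]) -> Tuple[str, int]:
    """Build a stable key by rounding beginTimeSeconds down to a bucket boundary."""
    bucket_start = point["beginTimeSeconds"] // BUCKET_SECONDS * BUCKET_SECONDS
    return (metric_name, bucket_start)


class SentAlerts:
    """Remembers which (metric, bucket) pairs have already produced a notification."""

    def __init__(self) -> None:
        self._seen: Set[Tuple[str, int]] = set()

    def should_notify(self, metric_name: str, point: Dict[str, Any]) -> bool:
        key = alert_key(metric_name, point)
        if key in self._seen:
            return False  # same bucket as an earlier alert: suppress the repeat
        self._seen.add(key)
        return True


# The sample point from this issue, then the "same" point seen 5 seconds later.
point = {"results": [{"average": 0.2327708216880936}],
         "beginTimeSeconds": 1509491911, "endTimeSeconds": 1509492151,
         "inspectedCount": 3578}
shifted = dict(point, beginTimeSeconds=point["beginTimeSeconds"] + 5)

sent = SentAlerts()
first_decision = sent.should_notify("MYTAB:MYTHING", point)     # True
repeat_decision = sent.should_notify("MYTAB:MYTHING", shifted)  # False
```

With this approach an anomaly that persists would still notify once per 5-minute bucket rather than once per update, and the set of seen keys would need pruning in a long-running process.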

@maraisr
Contributor

maraisr commented Nov 1, 2017

@idooo so basically the point is being alerted on with every new wave of data, because the point's average changes on every wave too.

For example, 0.54 is the average for that point at one time; in the next 5 min window the timestamp changes and the average changes to, say, 0.55.

So as far as Fox is concerned the point is a new point, and it sends an alert. We have had over 300 emails from Fox overnight.

Any ideas?

@idooo
Collaborator

idooo commented Nov 3, 2017

But isn't that how things work all the time? Your moving average is going to change on every new point of data because it's calculated from the last N points.

At time A1 you have points 22 23 24 (average 23), then after five seconds at A2 you have 23 24 25 (average 24), and so on. That's OK because you don't compare against the raw average but create an envelope instead. If your latest point changes slightly, like from 24 to 25 in our example, it should be inside the acceptable envelope. If you have an anomaly like 24 -> 35, then it should be outside the envelope and be alerted on.
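
To make the envelope idea concrete, here is a minimal Python sketch. It is an assumption about how such a check could look, not Fox's actual EmaBasedDetection code: the smoothing factor 2 / (interval + 1) is the common EMA convention, the envelope is taken as the EMA plus or minus koeff times the EMA, and the function names are hypothetical. The defaults mirror emaInterval 5 and envelopKoeff 0.33 from the config in this issue.

```python
from typing import Sequence


def ema(values: Sequence[float], interval: int) -> float:
    """Exponential moving average, seeded with the first value."""
    if not values:
        raise ValueError("need at least one value")
    alpha = 2.0 / (interval + 1)  # assumption: the conventional EMA smoothing factor
    average = float(values[0])
    for value in values[1:]:
        average += alpha * (value - average)
    return average


def is_point_outside_envelope(history: Sequence[float], latest: float,
                              interval: int = 5, koeff: float = 0.33) -> bool:
    """True when `latest` is further from the EMA of `history` than koeff * EMA."""
    center = ema(history, interval)
    half_width = abs(center) * koeff
    return abs(latest - center) > half_width


history = [22, 23, 24]
small_move = is_point_outside_envelope(history, 25)  # False: inside the envelope
big_jump = is_point_outside_envelope(history, 35)    # True: outside, would alert
```

Under these assumptions the EMA of 22 23 24 is about 22.9, so the envelope spans roughly 15.3 to 30.4: the move to 25 stays inside and the jump to 35 falls outside.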

@LouwMarais
Author

The issue is not that the average changes; I understand that. The issue is that the same alert is sent out over and over, because the software has no way to determine what it has already alerted on.
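
If no stable identity can be derived from the data at all, a blunter option is to remember when each metric last notified and stay quiet for a cooldown period. The Python sketch below is only an illustration of that idea; NotificationThrottle and its parameters are hypothetical and nothing here is taken from Fox.

```python
import time
from typing import Callable, Dict


class NotificationThrottle:
    """Allows at most one notification per metric within `cooldown_seconds`."""

    def __init__(self, cooldown_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._cooldown = cooldown_seconds
        self._clock = clock  # injectable so the behaviour can be tested without sleeping
        self._last_sent: Dict[str, float] = {}

    def allow(self, metric_name: str) -> bool:
        now = self._clock()
        last = self._last_sent.get(metric_name)
        if last is not None and now - last < self._cooldown:
            return False  # still inside the cooldown window: drop the repeat
        self._last_sent[metric_name] = now
        return True


# Example: one email per metric per 30 minutes, even if detection fires every 5 seconds.
throttle = NotificationThrottle(cooldown_seconds=30 * 60)
if throttle.allow("MYTAB:MYTHING"):
    pass  # send the notification here
```

The trade-off is that a second, genuinely different anomaly inside the cooldown window would also be suppressed.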

3 participants