Add an alerting engine #71
Comments
random idea: if we mark a metric as 'has alert', then when we aggregate we would know this and could broadcast to an alert engine to evaluate the rule. although this might be a bit aggressive and create a lot of extra rule evaluations at the beginning of a new metric... not sure how we schedule triggers for alert evaluations either. |
@sileht has exactly the same idea. Since |
Ceph has an interesting API to monitor objects: http://docs.ceph.com/docs/master/rados/api/librados/#rados_watch3 But I have no idea how it scales. |
watch seems interesting. i'm not sure what the notification it's sending is... if we read the object, will it notify as well? it does create a watch object so it will bump up the storage size (not sure how much) |
Hi @jd, at Hawkular Alerting we are working on a 2.x version made of lightweight components that can be used separately and independently. The aim is to fill a gap where today an integration between systems is mandatory. I would be interested to hear your feedback and whether this could be useful for the Gnocchi ecosystem. Thanks! |
@lucasponce is hawkular polling prometheus/elasticsearch with the query defined in 'expression'? |
@chungg Yes. Hawkular defines Triggers, which contain Conditions. For these objects, Hawkular can read data from multiple sources, such as Prometheus, Apache Kafka or Elasticsearch (or your own source of data using an API). |
i see. i imagine it'd behave similar to how Aodh interacts with Gnocchi currently (ie. create an alarm definition with query, run query periodically, trigger action, repeat). i don't see having Hawkular support Gnocchi as a bad thing; a good thing actually, since it removes barriers to adoption. that said, i think the hope is that we don't need to define another polling interval and instead leverage the fact that when Gnocchi stores data, it already has time series loaded, and therefore it can check and trigger alarms as data is entered into Gnocchi rather than be dependent on a polling cycle. if we can do that, i still think we can connect to Hawkular by possibly 'streaming' the triggered alarm to Hawkular rather than have Hawkular poll Gnocchi. |
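To make the "check and trigger alarms as data is entered" idea concrete, here is a minimal sketch. It is not Gnocchi code: `AlertingStore`, `add_rule`, `add_measures` and the `on_alarm` callback are hypothetical names, and the in-memory list stands in for whatever part of the series is already loaded at write time. Rules are only evaluated for metrics that have one registered, which mirrors the 'has alert' marker suggested earlier in the thread.

```python
from collections import defaultdict


class AlertingStore:
    """Hypothetical store that evaluates alarm rules at write time, not on a polling cycle."""

    def __init__(self, on_alarm):
        self.series = defaultdict(list)  # metric_id -> [(timestamp, value), ...]
        self.rules = defaultdict(list)   # metric_id -> [(rule_name, predicate), ...]
        self.on_alarm = on_alarm         # callback invoked as on_alarm(metric_id, rule_name)

    def add_rule(self, metric_id, name, predicate):
        # Registering a rule is what marks the metric as "has alert".
        self.rules[metric_id].append((name, predicate))

    def add_measures(self, metric_id, measures):
        self.series[metric_id].extend(measures)
        # Only metrics with registered rules pay the evaluation cost.
        for name, predicate in self.rules.get(metric_id, ()):
            if predicate(self.series[metric_id]):
                self.on_alarm(metric_id, name)


fired = []
store = AlertingStore(on_alarm=lambda metric_id, name: fired.append((metric_id, name)))
store.add_rule("cpu", "high", lambda series: series[-1][1] > 90)
store.add_measures("cpu", [(0, 50.0)])   # below threshold, nothing fires
store.add_measures("cpu", [(60, 95.0)])  # above threshold, alarm fires
store.add_measures("mem", [(0, 99.0)])   # no rule registered for "mem"
```

In a real deployment the callback could stream the triggered alarm to an external engine instead of appending to a list.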
Yes, the streaming approach was the primary integration for Hawkular Metrics as well. Perhaps we can explore the streaming approach in Gnocchi. A plugin would probably be necessary, so that Gnocchi could find out which metrics have alert definitions and send their data to the alerting engine. |
are there existing 'streaming' plugins? i'm not sure what plans the other contributors have for implementing this functionality. i don't have it queued up so not sure when we can move forward with this. |
I think we can try two ways. |
seems like a sane approach to me 👍 |
influxdb has the same approach: you can create "subscribers" on time series, and each time a new value arrives for a series, it is streamed to the subscriber (for example kapacitor for alerting). |
it streams the entire series? i was thinking maybe after it processes a metric, it just sends a msg to a udp/MQ target and whatever alerting engine is consuming the message will know that it should trigger a request to gnocchi for data and an evaluation on their end. this is different from my original thought of leveraging the fact that gnocchi already loads the series when adding new measures. i don't believe that works because gnocchi doesn't actually load the entire series when adding new measures (just the back_window, which i guess still works if the back_window matches the timespan of your alarm) |
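A rough sketch of the "just send a msg to a udp target" variant, for illustration only. The message fields (`metric_id`, `processed_at`), the function names and the default host/port are assumptions made up for this example; nothing here is an existing Gnocchi interface. The datagram carries no measures, only a hint that the consumer should query Gnocchi and evaluate on its side.

```python
import json
import socket
import time


def build_notification(metric_id, processed_at=None):
    """Build a small JSON datagram saying 'this metric was just processed'."""
    payload = {
        "metric_id": metric_id,
        "processed_at": time.time() if processed_at is None else processed_at,
    }
    return json.dumps(payload, sort_keys=True).encode("utf-8")


def notify_metric_processed(metric_id, host="127.0.0.1", port=9999):
    """Fire-and-forget UDP notification; host and port are placeholder values."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(build_notification(metric_id), (host, port))
```

UDP keeps the write path cheap but gives no delivery guarantee, so an alerting engine that must not miss updates would probably want an MQ target instead.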
Hawkular Alerting doesn't need the entire series. |
oh. so hawkular keeps a running history of the datapoints corresponding to its evaluation period? that works as well. is there a specific format/schema and protocol used to stream data? |
Yes, it depends on the type of condition and evaluation performed, but the alerting engine is stateful in that sense: the state of an evaluation is stored. This helps to perform operations like "dampening" (I want to be alerted when x < 10 happens 5 times out of 10 evaluations, or during a period of time), "rates"/"stateful evaluation" (when I need to evaluate a stream of data points), or even "missing" (alert me when I haven't received something, or something has not happened in the system, in the last 5 minutes). Also, when an alert is fired, there is a second part to explain why, and for that purpose we store this info to provide details (this is also optional).
Yeah, a REST API with JSON objects to send data/events to evaluate: http://www.hawkular.org/docs/rest/rest-alerts-v2.html#POST__data In version 1.x we could also use a native Java API to access these services, but for the 2.x version we have made the whole architecture lightweight and we are focused on the REST API only. If interested, Hawkular Alerting provides a nice set of examples to jump into and take a look at. But also, it is good to collect requirements about what could be the ideal way to integrate. |
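The dampening example from this comment ("x < 10 happens 5 times out of 10 evaluations") can be sketched as a small stateful evaluator. This is only an illustration of the idea, not how Hawkular Alerting implements it; the `Dampener` class and its parameters are hypothetical.

```python
from collections import deque


class Dampener:
    """Fire only when the condition held in at least `required` of the last `window` evaluations."""

    def __init__(self, condition, required=5, window=10):
        if not 0 < required <= window:
            raise ValueError("required must be between 1 and window")
        self.condition = condition
        self.required = required
        # The deque is the stored evaluation state; old results fall off automatically.
        self.results = deque(maxlen=window)

    def evaluate(self, value):
        self.results.append(bool(self.condition(value)))
        return sum(self.results) >= self.required


dampener = Dampener(lambda x: x < 10, required=5, window=10)
outcomes = [dampener.evaluate(v) for v in [3, 4, 5, 6, 7]]
```

Because only the last `window` boolean results are kept, the evaluator needs a bounded amount of state per rule rather than the entire series.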
There's no alerting engine in Gnocchi. It should offer an easy solution to trigger actions on e.g. threshold.
The Aodh project from OpenStack supports Gnocchi, but it does that mainly by polling the API regularly. It's pretty slow at the end of the day and OpenStack specific.
We might need to add some other features first, but this issue should be a placeholder to discuss a design.