API definitions for health check & outlier detection event services #10407
htuch merged 1 commit into envoyproxy:master
e96377a to 10ef01f
junr03 left a comment
These two look very aligned with the Access Log Service and the Metrics Service, so this generally looks good to me. Left a few comments.
I do wonder if there is anything we could be doing to reduce the boilerplate in these "reporting sink" services we have.
@htuch or @lizan do you want to take a look on the API front?
@baranov1ch also please fix DCO, thanks!
e9a48d3 to 702467f
junr03 left a comment
LGTM, thanks for updating.
I still wonder about my earlier question (whether there is anything we could do to reduce the boilerplate in these "reporting sink" services), given the growing number of services of this type. Let's see what @htuch has to say.
How does this compare/contrast with HDS? Would be ideal to have a single way to tackle this.
I'll put in my 2 cents. In my mind this is different. This is an event reporting stream, whereas HDS asks for information in order to change data plane behavior. So this is closer in nature to the other 3 event reporting services (metrics, logs, outlier) than it is to HDS. And it fits more nicely into the roll-up we are describing in #10407 (comment)
Should we have a single Envoy event streaming service? It seems like less boilerplate and implementation work.
+1, the events service could take a repeated oneof of events, or could even take typed Any objects?
As long as we allow each event source to specify an independent service, it would remain flexible in how the deployment implements it.
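To make the "typed Any objects" idea concrete, here is a minimal sketch in plain Python. It is not Envoy's API: `AnyEvent`, `EventSink`, `register`, `stream_events`, and the type URLs in the usage note are hypothetical names chosen for illustration. It only shows how one service definition could accept events from several sources and dispatch on a type URL.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass(frozen=True)
class AnyEvent:
    """Stand-in for a typed Any: a type URL plus an opaque payload."""
    type_url: str
    payload: bytes


@dataclass
class EventSink:
    """A single sink shared by every event source (hypothetical)."""
    handlers: Dict[str, Callable[[bytes], str]] = field(default_factory=dict)

    def register(self, type_url: str, handler: Callable[[bytes], str]) -> None:
        """Associate a handler with one event type."""
        self.handlers[type_url] = handler

    def stream_events(self, events: List[AnyEvent]) -> List[str]:
        """Dispatch each event by its type URL, in arrival order."""
        results = []
        for event in events:
            handler = self.handlers.get(event.type_url)
            if handler is None:
                # Unknown types are skipped rather than rejected, so a new
                # event source does not require a new service definition.
                results.append("ignored " + event.type_url)
            else:
                results.append(handler(event.payload))
        return results
```

A sink would register one handler per event type it understands, for example `sink.register("example/HealthCheckEvent", handler)`, and then pass mixed batches to `stream_events`.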
Yeah, I wonder if we want a model like ADS where it's possible to have multiple event sources share a service. I'm thinking of a use case like Envoy anomaly detection, where everything gets fed in-sequence into a giant ML soup.
An aggregated service seems good, but I wonder what would be the best way to configure such an endpoint. Should it be some sort of global bootstrap-level endpoint, or per cluster?
I would still make it per source like you have now and just use a common service definition. IMO in the future we could add aggregation support?
Yep, this is what I was thinking it would look like. My other thought is that in V4 we could fold metrics service sink and the grpc log sink into the "event sink" service definition as well.
I've changed this to a single AggregatedEventService, PTAL
@junr03 yeah, I'm thinking that if we're going to add an event-triggered streaming service to Envoy, we should have a generic protocol for it, so that adding a new event message/implementation is low overhead and doesn't require implementing a new streaming service in Envoy or the API. My request here would be for either more discussion in-PR or a short design doc. I think this could complement metrics/logging as a new service, but we shouldn't have N of these. @envoyproxy/api-shepherds please weigh in on this one.
7dab1f5 to e7282d1
For future-proofing, I would add an EventServiceConfig message to the API and use it everywhere that GrpcService appears right now. Then put a GrpcService in there. In the future, this will give you the ability to add aggregated streaming without breaking any of the config sites.
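The wrapper suggestion above can be sketched in plain Python. This is an illustration under stated assumptions, not the proto that was merged: `GrpcService` here is a simplified stand-in with a made-up `target` field, and `HealthCheckConfig` and `event_target` are hypothetical names for a config site and a helper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GrpcService:
    """Simplified stand-in for a gRPC service config (field is made up)."""
    target: str


@dataclass(frozen=True)
class EventServiceConfig:
    """Wrapper that config sites reference instead of GrpcService."""
    grpc_service: Optional[GrpcService] = None
    # A later field (for example an aggregated-stream option) could be
    # added here without touching HealthCheckConfig or any other site.


@dataclass(frozen=True)
class HealthCheckConfig:
    """Hypothetical config site that reports events."""
    event_service: Optional[EventServiceConfig] = None


def event_target(site: HealthCheckConfig) -> Optional[str]:
    """Return the configured sink target, or None if reporting is off."""
    if site.event_service is None or site.event_service.grpc_service is None:
        return None
    return site.event_service.grpc_service.target
```

Because every site holds an `EventServiceConfig` rather than a `GrpcService`, a new transport or aggregation mode only changes the wrapper and the code that resolves it.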
My $0.02 is for the naming EventReportingService (analogous to Load Reporting Service (LRS)). The reason AggregatedEventService might not be great is that it might be used in non-aggregated (today) and aggregated (future) ways on the same stream. If that sounds awful, ProxyEventService? IDK.
EventReportingService sgtm +1
Have you considered, now or in the future, adopting a pattern similar to LRS/HDS, where the event sink might want to negotiate the events of interest after the client (Envoy) advertises them in a request? If this makes sense, I'd make this fully bidi for future-proofing, even if we don't implement it right now.
+1 for this. I think it would be useful to have the event sink decide which events it is interested in.
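A rough sketch of the negotiation idea discussed above, in plain Python. Nothing here is taken from the LRS or HDS protocols themselves: `negotiate` and `filter_events` are hypothetical helpers, and events are modeled as simple `(type, body)` tuples purely for illustration.

```python
from typing import Iterable, List, Set, Tuple


def negotiate(advertised: Iterable[str], requested: Iterable[str]) -> Set[str]:
    """The sink picks a subset of the event types the client advertised."""
    return set(advertised) & set(requested)


def filter_events(
    events: List[Tuple[str, str]], accepted: Set[str]
) -> List[Tuple[str, str]]:
    """The client then sends only events whose type the sink asked for."""
    return [(kind, body) for kind, body in events if kind in accepted]
```

With a bidirectional stream, the first function would correspond to the sink's response to the client's initial request, and the second to what the client does for the rest of the stream.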
mattklein123 left a comment
In general this LGTM modulo remaining comments, thank you!
EventReportingService sgtm +1
e7282d1 to d56b4bd
htuch left a comment
Thanks, this looks really generic and extensible now. A few more naming/package comments.
/wait
Arguably this could be in its own event_service_config.proto, but I don't feel that strongly (it's easy enough to change later without breaking things).
I think this should have its own tree in the envoy.service hierarchy, e.g. envoy.service.event_reporting.v2alpha.
d56b4bd to 4c28baf
4c28baf to 402e6df
htuch left a comment
Needs to pass CI check_format, but otherwise the API LGTM, thanks for iterating on this.
98a3fe1 to 6393fc6
@baranov1ch this also needs to merge master, to pick up the change of location of
Signed-off-by: Alexey Baranov <me@kotiki.cc>
6393fc6 to 15e607d
Description: this PR introduces gRPC service interfaces for outlier detection and health check event services.
Risk Level: low
Testing: n.a.
Docs Changes: not yet
Release Notes: not yet
Relates to #8970