Skip to content

API definitions for health check & outlier detection event services#10407

Merged
htuch merged 1 commit intoenvoyproxy:masterfrom
baranov1ch:remote-hc-outlier-events-api
Mar 30, 2020
Merged

API definitions for health check & outlier detection event services#10407
htuch merged 1 commit intoenvoyproxy:masterfrom
baranov1ch:remote-hc-outlier-events-api

Conversation

@baranov1ch
Copy link
Copy Markdown
Contributor

Description: this PR introduces gRPC services interfaces for outlier detection and healthcheck events services.
Risk Level: low
Testing: n.a.
Docs Changes: not yet
Release Notes: not yet
Relates to #8970

@repokitteh-read-only
Copy link
Copy Markdown

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to api/.

🐱

Caused by: #10407 was opened by baranov1ch.

see: more, trace.

@baranov1ch baranov1ch force-pushed the remote-hc-outlier-events-api branch from e96377a to 10ef01f Compare March 16, 2020 20:42
Copy link
Copy Markdown
Member

@junr03 junr03 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two look very aligned with Access Log Service, and Metrics Service, so it generally looks good to me. Left a few comments.

I do wonder if there is anything we could be doing to reduce the boiler plate in this "reporting sink" services we have.

@htuch or @lizan do you want to take a look on the API front?

Comment thread api/envoy/api/v2/core/health_check.proto Outdated
Comment thread api/envoy/service/cluster/v2alpha/health_check_event_service.proto Outdated
Comment thread api/envoy/service/cluster/v2alpha/health_check_event_service.proto Outdated
Comment thread api/envoy/service/cluster/v2alpha/outlier_detection_event_service.proto Outdated
Comment thread api/envoy/api/v2/core/health_check.proto Outdated
@junr03
Copy link
Copy Markdown
Member

junr03 commented Mar 16, 2020

@baranov1ch also please fix DCO, thanks!

@baranov1ch baranov1ch force-pushed the remote-hc-outlier-events-api branch from e9a48d3 to 702467f Compare March 17, 2020 06:48
junr03
junr03 previously approved these changes Mar 17, 2020
Copy link
Copy Markdown
Member

@junr03 junr03 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thanks for updating.

I still wonder about

I do wonder if there is anything we could be doing to reduce the boiler plate in this "reporting sink" services we have.

Given the increase in number of this type of services. Lets see what @htuch has to say.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this compare/contrast with HDS? Would be ideal to have a single way to tackle this.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll put in my 2 cents. In my mind this is different. This is an event reporting stream vs HDS is asking for information to change data plane behavior. So this is closer in nature to the other 3 event reporting services (metrics, logs, outlier) than it does with HDS. And it fits more nicely into the roll up we are describing in #10407 (comment)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we have a single Envoy event streaming service? It seems less boiler plate and implementation work.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, the events service could take a repeated oneof of events, or even could take typed Any objects?

As long as we allow each event source to specify an independent service it would remain flexible on how the deployment implements it.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I wonder if we want a model like ADS where it's possible to have multiple event sources share a service. I'm thinking of a use case like Envoy anomaly detection, where everything gets fed in-sequence into a giant ML soup.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes exactly.

Copy link
Copy Markdown
Contributor Author

@baranov1ch baranov1ch Mar 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aggregated service seems good, but I wonder what would be the best way to configure such an endpoint. Should it be some sort of a global bootstrap-level endpoint, or per cluster?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would still make it per source like you have now and just use a common service definition. IMO in the future we could add aggregation support?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, this is what I was thinking it would look like. My other thought is that in V4 we could fold metrics service sink and the grpc log sink into the "event sink" service definition as well.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed this to a single AggregatedEventService, PTAL

@htuch htuch self-assigned this Mar 17, 2020
@htuch
Copy link
Copy Markdown
Member

htuch commented Mar 17, 2020

@junr03 yeah, I'm thinking that if we're going to add some event triggered streaming service in Envoy, that we have a generic protocol for this and adding a new event message/implementation is low overhead, without having to implement a new streaming service in Envoy or the API.

My request here would be for either more discussion in-PR or a short design doc. I think this could complement metrics/logging as a new service, but we shouldn't have N of these.

@envoyproxy/api-shepherds please weigh in on this one.

@baranov1ch baranov1ch force-pushed the remote-hc-outlier-events-api branch 2 times, most recently from 7dab1f5 to e7282d1 Compare March 18, 2020 19:43
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For future proofing, I would add a EventServiceConfig message to the API and use it everywhere that GrpcService appears right now. Then put a GrpcService in there. In the future, this will give you the ability to add aggregated streaming without breaking any of the config sites.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 good idea.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My $0.02 is for the naming EventReportingService (analogos to Load Reporting Service (LRS)). The reason AggregatedEventService might not be great is that it might be used in non-aggregated (today) and aggregated (future) ways on the same stream. If that sounds awful, ProxyEventService? IDK.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EventReportingService sgtm +1

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you considered now or in the future adopting a pattern similar to LRS/HDS, where the event sink might want to negotiate the events of interest after the client (Envoy) advertises them in a request? If this makes sense, I'd make this fully bidi as a future proofing, even if we don't implement right now.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for this. I think it would be useful to have the event sink decide that events it is interested in

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general this LGTM modulo remaining comments, thank you!

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 good idea.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EventReportingService sgtm +1

@baranov1ch baranov1ch force-pushed the remote-hc-outlier-events-api branch from e7282d1 to d56b4bd Compare March 22, 2020 07:43
Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, looks really generic and extensible now. A few more naming/package comments..
/wait

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: s/reporing/reporting/

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arguably could be in its own event_service_config.proto but I don't feel that strongly (it's easy enough to change later with breaking things).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, why not:)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should have its own tree in the envoy.service hierarchy, e.g. envoy.service.event_reporting.v2alpha.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, good idea

Comment thread api/envoy/service/cluster/v2alpha/event_reporting_service.proto Outdated
@baranov1ch baranov1ch force-pushed the remote-hc-outlier-events-api branch from d56b4bd to 4c28baf Compare March 23, 2020 23:17
@baranov1ch baranov1ch force-pushed the remote-hc-outlier-events-api branch from 4c28baf to 402e6df Compare March 24, 2020 07:09
Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs to pass CI check_format, but otherwise the API LGTM, thanks for iterating on this.

Comment thread api/envoy/config/bootstrap/v3/bootstrap.proto Outdated
@baranov1ch baranov1ch force-pushed the remote-hc-outlier-events-api branch 3 times, most recently from 98a3fe1 to 6393fc6 Compare March 25, 2020 10:17
@baranov1ch baranov1ch requested a review from mattklein123 March 25, 2020 15:02
htuch
htuch previously approved these changes Mar 26, 2020
Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Comment thread api/envoy/service/event_reporting/v2alpha/event_reporting_service.proto Outdated
@htuch
Copy link
Copy Markdown
Member

htuch commented Mar 26, 2020

@baranov1ch also needs to merge master, to pickup change of location of api/docs/BUILD

@htuch htuch added the waiting label Mar 27, 2020
Signed-off-by: Alexey Baranov <me@kotiki.cc>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants