
dst: Update GetProfile's stream when pod associated to HostPort lookup changes #11334

Merged
merged 9 commits into main from alpeb/hostport-podwatcher on Sep 28, 2023

Conversation

alpeb
Member

@alpeb alpeb commented Sep 4, 2023

Followup to #11328

Implements a new pod watcher, instantiated alongside the other watchers in the Destination server. It also watches Servers and carries over all the logic from ServerWatcher, which has now been decommissioned.

The CreateAddress() function has been moved into the PodWatcher, because we now call it on every update, given that the pod associated with an ip:port might change and we need to regenerate the Address object. That function also takes care of capturing opaque-protocol info from associated Servers, which is not new and had some logic that was duplicated in the now-defunct ServerWatcher. getAnnotatedOpaquePorts() was also moved for similar reasons.

Other things to note about PodWatcher:

  • It publishes a new pair of metrics, ip_port_subscribers and ip_port_updates, leveraging the framework in prometheus.go.
  • The complexity in updatePod() comes from only sending stream updates when the pod's readiness changes, to avoid sending duplicate messages on every pod lifecycle event (a minimal sketch of that check follows below).
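For readers unfamiliar with the watcher code, here is a minimal sketch of the readiness-based dedup described in the last bullet. The podPublisher fields and helper names below are illustrative stand-ins, not the actual code in pod_watcher.go:

```go
package watcher

import corev1 "k8s.io/api/core/v1"

// podPublisher is a stripped-down stand-in for the real type in
// pod_watcher.go; listeners here are plain callbacks for brevity.
type podPublisher struct {
	pod       *corev1.Pod
	listeners []func()
}

// isRunningAndReady reports whether the pod is Running and has the Ready
// condition set to True.
func isRunningAndReady(pod *corev1.Pod) bool {
	if pod == nil || pod.Status.Phase != corev1.PodRunning {
		return false
	}
	for _, cond := range pod.Status.Conditions {
		if cond.Type == corev1.PodReady {
			return cond.Status == corev1.ConditionTrue
		}
	}
	return false
}

// updatePod only notifies listeners when readiness flips, so resyncs and
// other lifecycle events don't produce duplicate stream messages.
func (pp *podPublisher) updatePod(pod *corev1.Pod) {
	wasReady := isRunningAndReady(pp.pod)
	nowReady := isRunningAndReady(pod)
	pp.pod = pod
	if wasReady != nowReady {
		for _, notify := range pp.listeners {
			notify()
		}
	}
}
```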

Finally, since endpointProfileTranslator's endpoint (*pb.WeightedAddr) is no longer a static object, the Update() function now receives an Address that allows it to rebuild the endpoint on the fly (and so createEndpoint() was converted into a method of endpointProfileTranslator).


Original PR description

(What follows is the original description of this PR's first approach, which evolved as described above after taking all the feedback into account.)

Implements a new pod watcher, instantiated alongside the other watchers in the Destination server. It's generic enough to catch all pod events in the cluster, so it's up to the subscribers to filter out the ones they're interested in and to set up any metrics.

The endpointProfileTranslator was refactored to subscribe to the pod watcher's events. Server subscriptions are now also handled by it, and are recycled whenever the pod changes.

A new gauge metric host_port_subscribers has been created, tracking the number of subscribers for a given HostIP+port combination.
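For context, a gauge like this is typically built on a Prometheus GaugeVec keyed by host IP and port. The snippet below is only a generic sketch of that pattern; it does not reproduce the helpers in prometheus.go, and the label names are assumptions:

```go
package watcher

import (
	"strconv"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
)

// hostPortSubscribers counts the active subscribers per hostIP:port pair.
var hostPortSubscribers = promauto.NewGaugeVec(
	prometheus.GaugeOpts{
		Name: "host_port_subscribers",
		Help: "Number of subscribers for a given hostIP:port combination.",
	},
	[]string{"host", "port"},
)

// trackSubscription adjusts the gauge: +1 on subscribe, -1 on unsubscribe.
func trackSubscription(hostIP string, port uint32, delta float64) {
	hostPortSubscribers.With(prometheus.Labels{
		"host": hostIP,
		"port": strconv.FormatUint(uint64(port), 10),
	}).Add(delta)
}
```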

Other Changes

  • Moved the server.createAddress method into a static function in endpoints_watcher.go, for better reusability.
  • The "Return profile for host port pods" test introduced in dst: Stop overriding Host IP with Pod IP on HostPort lookup #11328 was extended to track the ensuing events after a pod is deleted and then recreated (:taco: to @adleong for the test).
  • Given that this test consumes multiple events, we had to change the profileStream test helper to run GetProfile in a goroutine so that it keeps running. This implies the stream now needs to be manually canceled, and access to the stream updates has to be guarded with a mutex (see the sketch below).
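A rough sketch of the goroutine-plus-mutex pattern referred to in the last bullet; this is not the actual profileStream helper, just the shape of it, with a placeholder type standing in for the destination API's stream messages:

```go
package destination

import (
	"context"
	"sync"
)

// profileUpdate stands in for the real stream message type (e.g. the
// protobuf DestinationProfile); kept abstract here on purpose.
type profileUpdate = interface{}

// testStream collects updates produced by a GetProfile-style call running in
// its own goroutine; the test cancels the stream explicitly when done.
type testStream struct {
	mu      sync.Mutex
	updates []profileUpdate
	Cancel  context.CancelFunc
}

func (s *testStream) push(u profileUpdate) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.updates = append(s.updates, u)
}

// Updates returns a copy of everything received so far.
func (s *testStream) Updates() []profileUpdate {
	s.mu.Lock()
	defer s.mu.Unlock()
	return append([]profileUpdate(nil), s.updates...)
}

// watchProfile starts fn (a GetProfile-style call that invokes push for each
// message it receives) in a goroutine and returns the stream handle.
func watchProfile(fn func(ctx context.Context, push func(profileUpdate))) *testStream {
	ctx, cancel := context.WithCancel(context.Background())
	s := &testStream{Cancel: cancel}
	go fn(ctx, s.push)
	return s
}
```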

@alpeb alpeb requested a review from a team as a code owner September 4, 2023 22:17
@alpeb alpeb changed the title Complete fix for hostport staleness - pod watcher dst: Update GetProfile's stream when pod associated to HostPort lookup changes Sep 5, 2023
@mateiidavid mateiidavid left a comment

Looks great! Think you broke the problem down really well too, it was easy for me to jump in with no context. I left a few comments, I need to understand the adaptor a bit better. Using the pattern here is difficult because it would otherwise be hard to slot the pod watcher into the endpoint translator? (e.g. running a watcher that can propagate updates to endpoint translator listeners). The idea being that the adaptor is the listener that understands pod events?

As far as testing goes, I gave this a try and it works well!

INFO[0000] retry_budget:{retry_ratio:0.2 min_retries_per_second:10 ttl:{seconds:10}} endpoint:{addr:{ip:{ipv4:2886860802} port:9999} weight:10000 metric_labels:{key:"control_plane_ns" value:"linkerd"} metric_labels:{key:"deployment" value:"nginx-deployment"} metric_labels:{key:"namespace" value:"default"} metric_labels:{key:"pod" value:"nginx-deployment-674b6bd49b-5tj4r"} metric_labels:{key:"pod_template_hash" value:"674b6bd49b"} metric_labels:{key:"serviceaccount" value:"default"} tls_identity:{dns_like_identity:{name:"default.default.serviceaccount.identity.linkerd.cluster.local"}} protocol_hint:{h2:{}}}
//
// RESTART POD
//
INFO[0000] 
INFO[0008] retry_budget:{retry_ratio:0.2 min_retries_per_second:10 ttl:{seconds:10}} endpoint:{addr:{ip:{ipv4:2886860802} port:9999} weight:10000}
INFO[0008]
INFO[0012] retry_budget:{retry_ratio:0.2 min_retries_per_second:10 ttl:{seconds:10}} endpoint:{addr:{ip:{ipv4:2886860802} port:9999} weight:10000 metric_labels:{key:"control_plane_ns" value:"linkerd"} metric_labels:{key:"deployment" value:"nginx-deployment"} metric_labels:{key:"namespace" value:"default"} metric_labels:{key:"pod" value:"nginx-deployment-674b6bd49b-6ngm2"} metric_labels:{key:"pod_template_hash" value:"674b6bd49b"} metric_labels:{key:"serviceaccount" value:"default"} tls_identity:{dns_like_identity:{name:"default.default.serviceaccount.identity.linkerd.cluster.local"}} protocol_hint:{h2:{}}}
//
// RESTART POD
//
INFO[0036] retry_budget:{retry_ratio:0.2 min_retries_per_second:10 ttl:{seconds:10}} endpoint:{addr:{ip:{ipv4:2886860802} port:9999} weight:10000}
INFO[0036]
INFO[0041] retry_budget:{retry_ratio:0.2 min_retries_per_second:10 ttl:{seconds:10}} endpoint:{addr:{ip:{ipv4:2886860802} port:9999} weight:10000 metric_labels:{key:"control_plane_ns" value:"linkerd"} metric_labels:{key:"deployment" value:"nginx-deployment"} metric_labels:{key:"namespace" value:"default"} metric_labels:{key:"pod" value:"nginx-deployment-674b6bd49b-gnd6w"} metric_labels:{key:"pod_template_hash" value:"674b6bd49b"} metric_labels:{key:"serviceaccount" value:"default"} tls_identity:{dns_like_identity:{name:"default.default.serviceaccount.identity.linkerd.cluster.local"}} protocol_hint:{h2:{}}}
INFO[0041]

The only issue is that we first send the endpoint and then the endpoint information in a subsequent update (i.e. pod metrics). idk if this is a big problem, it's not fresh in my brain how the proxy would handle the connections here, since the IP is the same I assume it just re-uses the connection, it won't tear it down.

controller/api/destination/watcher/pod_watcher.go (5 review threads, resolved)
controller/api/destination/hostport_adaptor.go (1 review thread, resolved)
@alpeb
Member Author

alpeb commented Sep 5, 2023

Thanks for the detailed review @mateiidavid.

Using the pattern here is difficult because it would otherwise be hard to slot the pod watcher into the endpoint translator? (e.g. running a watcher that can propagate updates to endpoint translator listeners). The idea being that the adaptor is the listener that understands pod events?

You're right to inquire about the logic behind that separation. I think it made sense at some point during my implementation process, but I see no reason for it anymore, so I'll be coalescing all the logic into a refactored endpointProfileTranslator.

The only issue is that we first send the endpoint and then the endpoint information in a subsequent update (i.e. pod metrics). idk if this is a big problem, it's not fresh in my brain how the proxy would handle the connections here, since the IP is the same I assume it just re-uses the connection, it won't tear it down.

I think the reason behind those separate messages is that after deleting the pod, the hostip:port no longer has a backing pod, so the following message only sends the IP. Then, when the new pod becomes ready, a new message is sent with all the pod's info (metric labels, identity and protocol hint).
This seemed to behave well in my cursory testing, but I'd love to get more confirmation that this is the right behavior.

Base automatically changed from alpeb/hostport-fixup-stopgap to main September 6, 2023 15:35
@olix0r
Member

olix0r commented Sep 6, 2023

This seems big enough that I think we ought to hold merging this until we've branched release/stable-2.14.

@adleong
Member

adleong commented Sep 8, 2023

There are a few things here that deviate from the patterns established in other watchers/translators:

  • subscribing to watches in a translator
  • having a watcher broadcast ALL events to all listeners without filtering

These are surprising and make this code difficult for me to follow. In particular, the logic in the endpointProfileTransformer has become very complex. I had something different in mind:

Rather than having the PodWatcher broadcast all pod events to all listeners, instead have it contain a map of "<ip>:<port>" to PodPublisher, following a similar pattern to the other watchers. PodWatcher would need to register event handlers for both Pods and Servers. When it gets a pod event, it should update all the registered PodPublishers which match the pod IP or which match the hostport if the pod has one. When it gets a server event, it should update all the registered PodPublishers which have a pod which matches the server selector. This moves most of the complexity into the watcher (where it belongs, imo).

This makes PodWatcher follow the same pattern as EndpointsWatcher, but for Pods instead of for Services.

The result of this is that the endpointProfileTranslator just needs to be updated to implement PodListener instead of ServerListener. All endpointProfile queries (i.e. calls through subscribeToEndpointProfile) can use PodWatcher regardless of whether the query is for a pod network IP or a host network IP. ServerWatcher becomes unused and can be deleted.
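To make the proposed shape concrete, here is a rough sketch of a PodWatcher keyed by ip:port. The listener interface, publisher fields and matching logic are illustrative assumptions, not the code that eventually landed, and the Server event handler is omitted:

```go
package watcher

import (
	"sync"

	corev1 "k8s.io/api/core/v1"
)

// PodListener is a stand-in for the listener interface a translator would implement.
type PodListener interface {
	Update(pod *corev1.Pod)
}

type ipPort struct {
	ip   string
	port uint32
}

type PodPublisher struct {
	pod       *corev1.Pod
	listeners []PodListener
}

type PodWatcher struct {
	mu         sync.RWMutex
	publishers map[ipPort]*PodPublisher
}

func NewPodWatcher() *PodWatcher {
	return &PodWatcher{publishers: make(map[ipPort]*PodPublisher)}
}

// Subscribe registers a listener for an ip:port, creating the publisher on
// first use (mirroring how EndpointsWatcher keys publishers per target).
func (pw *PodWatcher) Subscribe(ip string, port uint32, l PodListener) {
	pw.mu.Lock()
	defer pw.mu.Unlock()
	key := ipPort{ip, port}
	pp, ok := pw.publishers[key]
	if !ok {
		pp = &PodPublisher{}
		pw.publishers[key] = pp
	}
	pp.listeners = append(pp.listeners, l)
}

// updatePod is the Pod event handler: it notifies publishers whose key
// matches the pod IP, or the host IP when the pod exposes a HostPort.
func (pw *PodWatcher) updatePod(pod *corev1.Pod) {
	pw.mu.RLock()
	defer pw.mu.RUnlock()
	for key, pp := range pw.publishers {
		if key.ip != pod.Status.PodIP && key.ip != pod.Status.HostIP {
			continue
		}
		pp.pod = pod
		for _, l := range pp.listeners {
			l.Update(pod)
		}
	}
}
```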

Let me know if this makes sense. Happy to discuss further.

@alpeb
Member Author

alpeb commented Sep 11, 2023

Thanks @adleong for the comprehensive guidelines, it all makes sense to me.

instead have it contain a map of ":" to PodPublisher

You mean a map of ip:port?

@adleong
Member

adleong commented Sep 11, 2023

Yes, I got tripped up by markdown formatting. Fixed.

@alpeb
Member Author

alpeb commented Sep 14, 2023

This latest change moves most of the logic into the PodWatcher, which now watches both pods and Servers. As suggested, ServerWatcher has been decommissioned.

The CreateAddress() function has been moved into the PodWatcher, because we now call it on every update, given that the pod associated with an ip:port might change and we need to regenerate the Address object. That function also takes care of capturing opaque-protocol info from associated Servers, which is not new and had some logic that was duplicated in the now-defunct ServerWatcher. getAnnotatedOpaquePorts() was also moved for similar reasons.

Other things to note about PodWatcher:

  • It publishes a new pair of metrics, ip_port_subscribers and ip_port_updates, leveraging the framework in prometheus.go.
  • The complexity in updatePod() comes from only sending stream updates when the pod's readiness changes, to avoid sending duplicate messages on every pod lifecycle event. Let me know if that logic isn't clear enough.

Finally, since endpointProfileTranslator's endpoint (*pb.WeightedAddr) is no longer a static object, the Update() function now receives an Address that allows it to rebuild the endpoint on the fly (and so createEndpoint() was converted into a method of endpointProfileTranslator).

go func(pp *podPublisher) {
	updated := false
	for _, listener := range pp.listeners {
		addr, err := CreateAddress(pw.k8sAPI, pw.metadataAPI, pw.defaultOpaquePorts, pp.pod, pp.ip, pp.port)
Member

I'm a bit confused about how opaque ports flow from servers to pods. It seems like, whenever we process a server update, we find all publishers with pods which match that server and then call into CreateAddress, which then goes and finds all Servers which select that pod and builds the opaque ports that way.

Is the reason we do this because it's not possible to determine the new opaque ports for a pod just by looking at the updated server because it may have opaque ports which overlap with other servers which match that pod? That sounds right, but it's a subtle point that probably warrants a comment.

Member

Actually, taking this thinking a step further, we probably need to update ALL podPublishers, even ones which don't match the server's selector, because the server's selector may have changed so that it no longer selects certain pods. Those pods which are no longer selected may therefore need their opaque ports updated.

Taking this all together, I think that any time any server is updated, we need to recalculate all the opaque ports for all the pod publishers. Assuming we're using client-go caches, this shouldn't be too expensive since it won't result in API calls.

Member Author

Is the reason we do this because it's not possible to determine the new opaque ports for a pod just by looking at the updated server because it may have opaque ports which overlap with other servers which match that pod? That sounds right, but it's a subtle point that probably warrants a comment.

Right, the handler carries information limited to the changed Server, and so it only acts to signal that the final state for the relevant pods needs to be recalculated. I'll add a comment to that effect.

Actually, taking this thinking a step further, we probably need to potentially update ALL podPublishers, even ones which don't match the server's selector because the server's selector may have changed so that it no longer selects certain pods. Those pods which are no longer selected may therefore need their opaque ports updated.

Taking this all together, I think that any time any server is updated, we need to recalculate all the opaque ports for all the pod publishers. Assuming we're using client-go caches, this shouldn't be too expensive since it won't result in API calls.

Good point. The Server watcher and the lister used in CreateAddress() rely on the same informer, so indeed this shouldn't result in extra network calls. I'm a little worried though about CPU usage. I think we can be a little smarter in the Server watcher handlers. Currently all events are handled via this single updateServer() method. I think it's fine to keep the same logic for the add and delete events, in that only the pods matching the selectors are concerned. For the update event, we can check whether the selector changed, and consider the union of pods that match both the old and new selectors.
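For reference, a sketch of the selector-diff idea floated here, using the standard metav1/labels helpers. Note that this is not what was ultimately merged; the final code recalculates state for all publishers on any Server change:

```go
package watcher

import (
	"reflect"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/labels"
)

// podsToRefresh returns the pods whose opaque-ports state may need to be
// recalculated after a Server update; oldSel/newSel are the Server's pod
// selector before and after the update.
func podsToRefresh(oldSel, newSel *metav1.LabelSelector, pods []*corev1.Pod) ([]*corev1.Pod, error) {
	old, err := metav1.LabelSelectorAsSelector(oldSel)
	if err != nil {
		return nil, err
	}
	updated, err := metav1.LabelSelectorAsSelector(newSel)
	if err != nil {
		return nil, err
	}

	var out []*corev1.Pod

	// If the selector didn't change, only pods matching it are affected.
	if reflect.DeepEqual(oldSel, newSel) {
		for _, p := range pods {
			if updated.Matches(labels.Set(p.Labels)) {
				out = append(out, p)
			}
		}
		return out, nil
	}

	// Otherwise consider the union: pods matching either the old or the new
	// selector may need their opaque ports recalculated.
	for _, p := range pods {
		ls := labels.Set(p.Labels)
		if old.Matches(ls) || updated.Matches(ls) {
			out = append(out, p)
		}
	}
	return out, nil
}
```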

Member

Can we estimate whether this is going to cause problematic CPU usage? Server updates probably wouldn't happen more than once a second. On each Server update we would have to evaluate a selector against every pod in the cluster. Assuming a 10000 pod cluster, doing 10000 selector evaluations every time a server updates doesn't seem like it should be too expensive. Are there other computational costs we should consider?

Member Author

Ok I wasn't taking into account the frequency of Server updates. I'll leave the logic as is for now and see if I can come up with a way to actually test resource consumption when an update is triggered. Another thing we'd like to avoid is sending dupe messages to every stream on every Server update. I'll add something into the endpointProfileTranslator to avoid that.

controller/api/destination/watcher/pod_watcher.go (2 review threads, resolved)
@alpeb
Member Author

alpeb commented Sep 19, 2023

I moved the Address/pod object creation logic into the podWatcher, in the getOrNewPodPublisher() method, called from Subscribe(). The latter receives service, hostname and ip, some of which can be empty depending on the case.

With this change, the auxiliary functions getIndexedPod() and podReceivingTraffic() were also moved into the watcher package.

Also, the Server updates handler now triggers updates for all podPublishers regardless of the Server selectors. The endpointProfileTranslator now tracks the last message sent, to avoid triggering unneeded duplicate messages (sketched below).
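A minimal sketch of that last-message tracking, assuming protobuf messages compared with proto.Equal; the real endpointProfileTranslator differs in the details:

```go
package destination

import (
	"sync"

	"google.golang.org/protobuf/proto"
)

// dedupSender drops an update if it would repeat the previous one, so every
// Server or pod event doesn't translate into a duplicate stream message.
type dedupSender struct {
	mu   sync.Mutex
	last proto.Message
	send func(proto.Message) error
}

func (d *dedupSender) maybeSend(msg proto.Message) error {
	d.mu.Lock()
	defer d.mu.Unlock()
	if d.last != nil && proto.Equal(d.last, msg) {
		return nil // identical to the previous update; skip it
	}
	if err := d.send(msg); err != nil {
		return err
	}
	d.last = msg
	return nil
}
```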

Other changes:

  • Added fields defaultOpaquePorts, k8sAPI and metadataAPI into the podPublisher struct.
  • Removed the fields ip and port from the endpointProfileTranslator, which are not used.

@adleong adleong left a comment

Very nice!

@alpeb
Member Author

alpeb commented Sep 26, 2023

@mateiidavid and @adleong , while testing this again I think I came across an oversight that I wanted to check with you before fixing it.
SetToServerProtocol will update the Address object's OpaqueProtocol field whenever a Server resource gets updated. But we're only checking ContainerPort in there, ignoring whether there's a match against a HostPort, so I'll have to add that as well. Does that sound right?

@adleong
Member

adleong commented Sep 26, 2023

After thinking about this a bit, yes, I think you're right: the container is available at both the container port (on the pod network) AND the host port (on the host network).

@mateiidavid mateiidavid left a comment

wow! great change, lots of moving parts but it all makes sense and the design is imo really clean.

This is ready to ship from my side. Left two concurrency related questions though.

controller/api/destination/server.go (review thread, resolved)
controller/api/destination/watcher/pod_watcher.go (review thread, resolved)
@@ -1321,7 +1321,8 @@
 	case intstr.String:
 		for _, c := range address.Pod.Spec.Containers {
 			for _, p := range c.Ports {
-				if p.ContainerPort == int32(port) && p.Name == server.Spec.Port.StrVal {
+				if (p.ContainerPort == int32(port) || p.HostPort == int32(port)) &&

Check failure: Code scanning / CodeQL

Incorrect conversion between integer types (High): incorrect conversion of an integer with architecture-dependent bit size from strconv.Atoi to a lower bit size type int32 without an upper bound check.
Member Author

This alert is bogus because right after the strconv.Atoi call mentioned there, we already check the boundaries.
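For illustration, the pattern being referred to looks roughly like this: the Atoi result is range-checked before it is ever narrowed, which is what the scanner fails to see across function boundaries. This is a generic sketch, not the code flagged by the alert:

```go
package destination

import (
	"fmt"
	"math"
	"strconv"
)

// parsePort converts a port string, rejecting anything outside 1..65535
// before the value is ever narrowed to a smaller integer type.
func parsePort(s string) (uint32, error) {
	p, err := strconv.Atoi(s)
	if err != nil {
		return 0, fmt.Errorf("invalid port %q: %w", s, err)
	}
	if p < 1 || p > math.MaxUint16 {
		return 0, fmt.Errorf("port %d out of range", p)
	}
	return uint32(p), nil
}
```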

controller/api/destination/watcher/endpoints_watcher.go (CodeQL alert dismissed)
alpeb and others added 9 commits September 27, 2023 17:48
Followup to #11328, based off of `alpeb/hostport-fixup-stopgap`.

Implements a new pod watcher, instantiated along the other ones in the
Destination server. It's generic enough to catch all pod events in the
cluster, so it's up to the subscribers to filter out the ones they're
interested in, and to set up any metrics.

In the Destination server's `subscribeToEndpointProfile` method, we
create a new `HostPortAdaptor` that is subscribed to the pod watcher,
and forwards the pod and protocol updates to the
`endpointProfileTranslator`. Server subscriptions are now handled by this
adaptor and are recycled whenever the pod changes.

A new gauge metric `host_port_subscribers` has been created, tracking
the number of subscribers for a given HostIP+port combination.

## Other Changes

- Moved the `server.createAddress` method into a static function in
  `endpoints_watcher.go`, for better reusability.
- The "Return profile for host port pods" test introduced in #11328 was
  extended to track the ensuing events after a pod is deleted and then
  recreated (:taco: to @adleong for the test).
- Given that test consumes multiple events, we had to change the
  `profileStream` test helper to allow for the `GetProfile` call to
  block. Callers to `profileStream` now need to manually cancel the
  returned stream.
This change moves most of the logic into the PodWatcher, which now watches over pods and Servers. As suggested, ServerWatcher has been decommissioned.

The `CreateAddress()` function has been moved into a function of the PodWatcher, because now we're calling it on every update given the pod associated to an ip:port might change and we need to regenerate the Address object. That function also takes care of capturing opaque protocol info from associated Servers, which is not new and had some logic that was duped in the now defunct ServerWatcher. `getAnnotatedOpaquePorts()` shared the same fate for similar reasons.

Other things to note about PodWatcher:
- It publishes a new pair of metrics `ip_port_subscribers` and `ip_port_updates` leveraging the framework in `prometheus.go`.
- The complexity in `updatePod()` comes from only sending stream updates when the pod's readiness changes, to avoid sending duplicate messages on every pod lifecycle event. Let me know if that logic isn't clear enough.

Finally, since endpointProfileTranslator's `endpoint` (*pb.WeightedAddr) is no longer a static object, its `Update()` function now receives an Address that allows it to rebuild the endpoint on the fly (and so `createEndpoint()` was converted into a method of endpointProfileTranslator).
Moved the Address/pod object creation logic into the podWatcher, on the `getOrNewPodPublisher()` method, called from `Subscribe()`. The latter receives `service`, `hostname` and `ip`, which can be empty depending on the case.

With this change also the auxiliary functions `getIndexedPod()` and `podReceivingTraffic()` were moved into the `watcher` package.

Also the Server updates handler now triggers updates for all podPublishers regardless of the Server selectors. The endpointProfileTranslator is now tracking the last message sent, to avoid sending dupe messages.

Other changes:
- Added fields `defaultOpaquePorts`, `k8sAPI` and `metadataAPI` into the podPublisher struct.
- Removed the fields `ip` and `port` from the endpointProfileTranslator, which are not used.
Co-authored-by: Matei David <matei@buoyant.io>
@mateiidavid mateiidavid left a comment

You have my 🦭 of ✅ !

@alpeb
Member Author

alpeb commented Sep 28, 2023

Thanks folks for the great feedback 😄
I'm merging this and will shortly follow up with an extension of the unit test to cover the bug I fixed (described in the #11334 (comment) above).

@alpeb alpeb closed this Sep 28, 2023
@alpeb alpeb reopened this Sep 28, 2023
@alpeb alpeb merged commit 65ddba4 into main Sep 28, 2023
66 checks passed
@alpeb alpeb deleted the alpeb/hostport-podwatcher branch September 28, 2023 13:57
alpeb added a commit that referenced this pull request Sep 28, 2023
Followup to #11334 (comment)

This extends the test introduced in #11334 to exercise updating a
Server associated with a pod's HostPort and observing how the stream
updates the OpaqueProtocol field.

Helper functions were refactored a bit to allow retrieving the
l5dCRDClientSet used when building the fake API.
olix0r added a commit that referenced this pull request Sep 28, 2023
This edge release makes Linkerd even better.

* Added a controlPlaneVersion override to the `linkerd-control-plane` Helm chart
  to support including SHA256 image digests in Linkerd manifests (thanks
  @cromulentbanana!) ([#11406])
* Improved `linkerd viz check` to attempt to validate that the Prometheus scrape
  interval will work well with the CLI and Web query parameters ([#11376])
* Fixed an issue where the destination controller would not update pod metadata
  for profile resolutions for a pod accessed via the host network (e.g.
  HostPort endpoints) ([#11334]).
* Added a validating webhook config for httproutes.gateway.networking.k8s.io
  resources (thanks @mikutas!) ([#11150])

[#11150]: #11150
[#11334]: #11334
[#11376]: #11376
[#11406]: #11406
@olix0r olix0r mentioned this pull request Sep 28, 2023
olix0r added a commit that referenced this pull request Sep 28, 2023
This edge release makes Linkerd even better.

* Added a controlPlaneVersion override to the `linkerd-control-plane` Helm chart
  to support including SHA256 image digests in Linkerd manifests (thanks
  @cromulentbanana!) ([#11406])
* Improved `linkerd viz check` to attempt to validate that the Prometheus scrape
  interval will work well with the CLI and Web query parameters ([#11376])
* Fixed an issue where the destination controller would not update pod metadata
  for profile resolutions for a pod accessed via the host network (e.g.
  HostPort endpoints) ([#11334]).
* Added a validating webhook config for httproutes.gateway.networking.k8s.io
  resources (thanks @mikutas!) ([#11150])

[#11150]: #11150
[#11334]: #11334
[#11376]: #11376
[#11406]: #11406
olix0r added a commit that referenced this pull request Sep 29, 2023
This edge release makes Linkerd even better.

* Added a controlPlaneVersion override to the `linkerd-control-plane` Helm chart
  to support including SHA256 image digests in Linkerd manifests (thanks
  @cromulentbanana!) ([#11406])
* Improved `linkerd viz check` to attempt to validate that the Prometheus scrape
  interval will work well with the CLI and Web query parameters ([#11376])
* Improved CLI error handling to print differentiated error information when
  versioncheck.linkerd.io cannot be resolved (thanks @dtaskai) ([#11377])
* Fixed an issue where the destination controller would not update pod metadata
  for profile resolutions for a pod accessed via the host network (e.g.
  HostPort endpoints) ([#11334]).
* Added a validating webhook config for httproutes.gateway.networking.k8s.io
  resources (thanks @mikutas!) ([#11150])
* Introduced a new `multicluster check --timeout` flag to limit the time
  allowed for Kubernetes API calls (thanks @moki1202) ([#11420])

[#11150]: #11150
[#11334]: #11334
[#11376]: #11376
[#11377]: #11377
[#11406]: #11406
[#11420]: #11420
alpeb added a commit that referenced this pull request Oct 2, 2023
Followup to #11334 (comment)

This extends the test introduced in #11334 to exercise updating a
Server associated with a pod's HostPort and observing how the stream
updates the OpaqueProtocol field.

Helper functions were refactored a bit to allow retrieving the
l5dCRDClientSet used when building the fake API.
mateiidavid pushed a commit that referenced this pull request Oct 26, 2023
…kup changes (#11334)

Followup to #11328

Implements a new pod watcher, instantiated along the other ones in the Destination server. It also watches on Servers and carries all the logic from ServerWatcher, which has now been decommissioned.

The `CreateAddress()` function has been moved into a function of the PodWatcher, because now we're calling it on every update given the pod associated to an ip:port might change and we need to regenerate the Address object. That function also takes care of capturing opaque protocol info from associated Servers, which is not new and had some logic that was duped in the now defunct ServerWatcher. `getAnnotatedOpaquePorts()` got also moved for similar reasons.

Other things to note about PodWatcher:

- It publishes a new pair of metrics `ip_port_subscribers` and `ip_port_updates` leveraging the framework in `prometheus.go`.
- The complexity in `updatePod()` comes from only sending stream updates when the pod's readiness changes, to avoid sending duplicate messages on every pod lifecycle event.

Finally, since endpointProfileTranslator's `endpoint` (*pb.WeightedAddr) is no longer a static object, the `Update()` function now receives an Address that allows it to rebuild the endpoint on the fly (and so `createEndpoint()` was converted into a method of endpointProfileTranslator).
mateiidavid pushed a commit that referenced this pull request Oct 26, 2023
Followup to #11334 (comment)

This extends the test introduced in #11334 to exercise updating a
Server associated with a pod's HostPort and observing how the stream
updates the OpaqueProtocol field.

Helper functions were refactored a bit to allow retrieving the
l5dCRDClientSet used when building the fake API.
mateiidavid added a commit that referenced this pull request Oct 26, 2023
This stable release fixes issues in the proxy and Destination controller which
can result in Linkerd proxies sending traffic to stale endpoints. In addition,
it contains a bug fix for profile resolutions for pods bound on host ports and
includes patches for security advisory [CVE-2023-44487]/GHSA-qppj-fm5r-hxr3

* Control Plane
  * Fixed an issue where the Destination controller could stop processing
    changes in the endpoints of a destination, if a proxy subscribed to that
    destination stops reading service discovery updates. This issue results in
    proxies attempting to send traffic for that destination to stale endpoints
    ([#11483], fixes [#11480], [#11279], [#10590])
  * Fixed an issue where the Destination controller would not update pod
    metadata for profile resolutions for a pod accessed via the host network
    (e.g. HostPort endpoints) ([#11334])
  * Addressed [CVE-2023-44487]/GHSA-qppj-fm5r-hxr3 by upgrading several
    dependencies (including Go's gRPC and net libraries)

* Proxy
  * Fixed a regression where the proxy rendered `grpc_status` metric labels as
    a string rather than as the numeric status code ([linkerd2-proxy#2480];
    fixes [#11449])
  * Fixed a regression introduced in stable-2.13.0 where proxies would not
terminate unused service discovery watches, exerting backpressure on the
    Destination controller which could cause it to become stuck
    ([linkerd2-proxy#2484])

[#10590]: #10590
[#11279]: #11279
[#11483]: #11483
[#11480]: #11480
[#11334]: #11334
[#11449]: #11449
[CVE-2023-44487]: GHSA-qppj-fm5r-hxr3
[linkerd2-proxy#2480]: linkerd/linkerd2-proxy#2480
[linkerd2-proxy#2484]: linkerd/linkerd2-proxy#2484

Signed-off-by: Matei David <matei@buoyant.io>
@mateiidavid mateiidavid mentioned this pull request Oct 26, 2023
mateiidavid added a commit that referenced this pull request Oct 26, 2023
This stable release fixes issues in the proxy and Destination controller which
can result in Linkerd proxies sending traffic to stale endpoints. In addition,
it contains a bug fix for profile resolutions for pods bound on host ports and
includes patches for security advisory [CVE-2023-44487]/GHSA-qppj-fm5r-hxr3

* Control Plane
  * Fixed an issue where the Destination controller could stop processing
    changes in the endpoints of a destination, if a proxy subscribed to that
    destination stops reading service discovery updates. This issue results in
    proxies attempting to send traffic for that destination to stale endpoints
    ([#11483], fixes [#11480], [#11279], [#10590])
  * Fixed an issue where the Destination controller would not update pod
    metadata for profile resolutions for a pod accessed via the host network
    (e.g. HostPort endpoints) ([#11334])
  * Addressed [CVE-2023-44487]/GHSA-qppj-fm5r-hxr3 by upgrading several
    dependencies (including Go's gRPC and net libraries)

* Proxy
  * Fixed a regression where the proxy rendered `grpc_status` metric labels as
    a string rather than as the numeric status code ([linkerd2-proxy#2480];
    fixes [#11449])
  * Fixed a regression introduced in stable-2.13.0 where proxies would not
terminate unused service discovery watches, exerting backpressure on the
    Destination controller which could cause it to become stuck
    ([linkerd2-proxy#2484])

[#10590]: #10590
[#11279]: #11279
[#11483]: #11483
[#11480]: #11480
[#11334]: #11334
[#11449]: #11449
[CVE-2023-44487]: GHSA-qppj-fm5r-hxr3
[linkerd2-proxy#2480]: linkerd/linkerd2-proxy#2480
[linkerd2-proxy#2484]: linkerd/linkerd2-proxy#2484

---------

Signed-off-by: Matei David <matei@buoyant.io>
Co-authored-by: Alejandro Pedraza <alejandro@buoyant.io>
Co-authored-by: Oliver Gould <ver@buoyant.io>
alpeb added a commit that referenced this pull request Nov 7, 2023
This removes `endpointProfileTranslator`'s dependency on the k8sAPI and
metadataAPI, which are not used. This was introduced in one of #11334's
refactorings, but ended up being not required. No functional changes
here.
alpeb added a commit that referenced this pull request Nov 14, 2023
…11587)

This removes `endpointProfileTranslator`'s dependency on the k8sAPI and
metadataAPI, which are not used. This was introduced in one of #11334's
refactorings, but ended up being not required. No functional changes
here.