ambient: service entry initial impl #45621
Adds support for a subset of the service entry API to ambient mesh. The API supported is as follows:
- addresses (the VIPs)
- auto VIP address allocation (addresses is optional)
- endpoints (static only)
- location (prefer mesh external for now for DIRECT passthrough in ztunnel)
- workloadSelector (selects pods and workload entries)

Notable exclusions:
- hosts (coming per istio/ztunnel#536)
- exportTo
- resolution (everything is static)
- subjectAltNames (pending ztunnel support)

Signed-off-by: Kevin Dorosh <kevin.dorosh@solo.io>
	c.ambientIndex.byPod[networkAddr] = newWl
}
c.ambientIndex.byUID[c.generatePodUID(pod)] = newWl
updates[model.ConfigKey{Kind: kind.Address, Name: newWl.ResourceName()}] = struct{}{}
`Insert` should be used here
and at L480 and L492
addressed in 946171f
if m.serviceEntryController != nil {
	m.serviceEntryController.AppendServiceHandler(kubeRegistry.ServiceEntryHandler)
}
I think we need to append this service entry handler regardless of `features.EnableK8SServiceSelectWorkloadEntries` (see line 247)
addressed in 7055f90
framework.NewTest(t).
	Features("traffic.ambient").
	Run(func(t framework.TestContext) {
		t.Skip("this will work if we set DNS_AUTO_ALLOCATE=true and once we have https://github.com/istio/ztunnel/pull/536")
now that this PR has merged we should be able to get the hostname resolved trivially (maybe a small CNI change to ensure DNS udp+tcp traffic is captured by ztunnel?)
Tested this locally; it worked with some minor changes to ztunnel, but it is sort of blocked by istio/ztunnel#582 or other ztunnel work for now
// RequestHeader=Accept:*/*
// RequestHeader=User-Agent:curl/7.81.0
// Hostname=uncaptured-v1-868c9b59b5-rxvfq
Check: check.BodyContains(`Hostname=uncaptured-v`),
nit: a stronger assertion would check for both v1 and v2 serially to ensure we are load balancing
addressed
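A minimal sketch of the stronger assertion suggested above, assuming the response bodies have already been collected as plain strings. The helper name `sawAllVersions` and the second sample hostname are hypothetical; this is not the Istio `check` package API, only an illustration of requiring every backend version to appear at least once.

```go
package main

import (
	"fmt"
	"strings"
)

// sawAllVersions reports whether every wanted substring appears in at least
// one of the response bodies. Hypothetical helper, for illustration only.
func sawAllVersions(bodies []string, wanted ...string) bool {
	for _, w := range wanted {
		found := false
		for _, b := range bodies {
			if strings.Contains(b, w) {
				found = true
				break
			}
		}
		if !found {
			return false
		}
	}
	return true
}

func main() {
	bodies := []string{
		"Hostname=uncaptured-v1-868c9b59b5-rxvfq",
		"Hostname=uncaptured-v2-example", // made-up pod name
	}
	fmt.Println(sawAllVersions(bodies, "Hostname=uncaptured-v1", "Hostname=uncaptured-v2"))
}
```

Matching only the shared prefix `Hostname=uncaptured-v` passes even if every request lands on one version, which is why the per-version check is the stronger test.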
pilot/pkg/model/service.go
Outdated
// The service entry spec (if any) this service was derived from.
ServiceEntry *networking.ServiceEntry
this part in particular is the most awkward part of this PR; open to other ideas on how to integrate with existing VIP auto-allocation and also allow access to the service entry fields we need in the ambient index. @howardjohn for any ideas
Another idea I'm having: create a new handler interface and pass the original SE along
istio/pilot/pkg/serviceregistry/serviceentry/controller.go
Lines 356 to 357 in 8ba3e35
currentServiceEntry := curr.Spec.(*networking.ServiceEntry)
cs := convertServices(curr)
Ah, unfortunately this only works in one of the two places we need it. The other one is during VIP re-auto allocation, where we don't have the parent SEs. This is why I added this to the `Service` itself.
I will hopefully come back to this later with a more productive comment but my initial reaction is pretty negative to this
After looking through things I don't have any obvious suggestions, but I feel highly uncomfortable with this.
We were already maxing out our tech debt in the ambient controller v1. Then we added WE support and added a lot more (basically 2x). Now SE doubles that again. All of that is mostly fine since it's all isolated to the mess of ambientindex.go, but I feel very uncomfortable having this leak out into our fundamental core types, where it risks permeating through the rest of Istio.
Yeah, I was also not a fan; I was hoping you'd have a concrete idea or suggestion. I suppose I could just switch this to use a k8s controller and mirror the way we do it for WEs, and we can punt for now on auto allocation in ambient. My biggest concern there is that if and when we do want to support it, we may need to refactor a fair bit. In fact, I already feel that way; it feels more natural to reuse `model.Service` in the ambient index if we can, rather than treat SEs and k8s services differently.
How do you feel about the mergeability of this PR without VIP auto allocation? Do you think we will definitely want to support it at some point in the future? (Personally I don't use it often, but I wanted to plan to support it for backwards compat.)
I do think the whole thing needs a rewrite, which I have something in mind for (#45426 for ambient only, https://docs.google.com/document/d/1-ywpCnOfubqg7WAXSPf4YgbaFDBEU9HIqMWcxZLhzwE/edit for all of Istio), but I don't want to block on that. I think it's plausible we could get something reasonable in the next 3 months or so. I don't want to block anything on it, but I don't mind a niche feature like auto allocation waiting.
It would probably be good long term to support auto_allocation, but I think it is not GA, so it's not strictly required. I personally think it's a bad idea and we should kill it (#43952 (comment)).
I also wouldn't mind if it was just inefficient for now.
I guess in order of preference:
- Remove the need for this and have no auto alloc support for now
- Remove the need for this and have inefficient auto alloc support for now
- Just do it anyways and feel bad about it
I can live with any of them
One thing I was thinking is to make the handler pass Service+ServiceEntry, but it's kind of hairy to pipe that through the whole serviceentry controller
decided to just remove auto vip allocation per discussion above in 37e605d
idx.mu.Lock()
defer idx.mu.Unlock()
nit; move this lock below the nil check on line 420
addressed in 2473dc2
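A minimal sketch of the nit above, assuming a handler that can return early on a nil input. The `index` type and `handle` method are hypothetical stand-ins, not the actual ambient index code; the point is only that the cheap nil check runs before the mutex is taken.

```go
package main

import (
	"fmt"
	"sync"
)

// index is a hypothetical stand-in for a mutex-guarded index.
type index struct {
	mu       sync.Mutex
	services map[string]string
}

// handle validates its input first and only then acquires the lock, so the
// early-return path never contends on the mutex.
func (idx *index) handle(name string, svc *string) bool {
	if svc == nil {
		return false // nothing to do; no lock needed
	}
	idx.mu.Lock()
	defer idx.mu.Unlock()
	idx.services[name] = *svc
	return true
}

func main() {
	idx := &index{services: map[string]string{}}
	fmt.Println(idx.handle("a", nil)) // nil input, lock never taken
	v := "10.0.0.1"
	fmt.Println(idx.handle("a", &v))
}
```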
for _, svc := range c.ambientIndex.servicesMap {

	if svc.Attributes.ServiceEntry == nil {
		// if we are here then this is dev error
		log.Warn("dev error: service entry spec is nil; it should have been populated by the service entry handler")
		continue
	}

	if svc.Attributes.ServiceEntry.WorkloadSelector == nil {
		// nothing to do. we construct the ztunnel config if `endpoints` are provided in the service entry handler
		continue
	}

	if svc.Attributes.ServiceEntry.Endpoints != nil {
		// it is an error to provide both `endpoints` and `workloadSelector` in a service entry
		continue
	}

	sel := svc.Attributes.ServiceEntry.WorkloadSelector.Labels
	if !labels.Instance(sel).SubsetOf(pod.Labels) {
		continue
	}

	vipsToPorts := getVIPsFromServiceEntry(svc, nil)
	for vip, ports := range vipsToPorts {
		vips[vip] = ports
	}
}
this should be DRY'ed with the other implementation (which is more correct, since it has the necessary `vip := c.network.String() + "/" + vip // services must be on our network, don't inherit from workload` lines)
addressed in 5a0ee02
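A rough sketch of what a shared helper for the network-prefixed VIP keys could look like, assuming VIPs are plain strings mapped to port lists. The function name `networkVIPs` and its signature are hypothetical; only the `network + "/" + vip` key format comes from the review comment above.

```go
package main

import "fmt"

// networkVIPs returns a copy of vipsToPorts whose keys are prefixed with the
// controller's network, so services are always keyed on that network rather
// than inheriting one from the workload. Hypothetical helper.
func networkVIPs(network string, vipsToPorts map[string][]uint32) map[string][]uint32 {
	out := make(map[string][]uint32, len(vipsToPorts))
	for vip, ports := range vipsToPorts {
		out[network+"/"+vip] = ports
	}
	return out
}

func main() {
	got := networkVIPs("network-1", map[string][]uint32{"10.0.0.1": {80, 443}})
	fmt.Println(got)
}
```

Having both call sites go through one helper like this would keep the prefixing from being dropped in one of them, which is the inconsistency the comment points out.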
vipsToPorts := getVIPsFromServiceEntry(oldServiceEntry, nil)
for vip := range vipsToPorts {
	// we also need to remove the VIP from any pod that has it (e.g. from service entry `workloadSelector`)
	for nwAddr := range a.byPod {
we also need to add this logic for standalone WEs that are just selected by SEs. we should update the unit tests to cover this case
the unit tests actually already covered this scenario, and this code can be removed since pod cleanup is handled during re-construction of the pod workloads on SE update. removed dead code in 0eb683e
Force-pushed from a3d6589 to 7858ff5
Force-pushed from 7858ff5 to 2473dc2
…rkloadEntries setting Signed-off-by: Kevin Dorosh <kevin.dorosh@solo.io>
@@ -617,7 +627,9 @@ func (s *Controller) Cluster() cluster.ID {
 }

 // AppendServiceHandler adds service resource event handler. Service Entries does not use these handlers.
-func (s *Controller) AppendServiceHandler(_ model.ServiceHandler) {}
+func (s *Controller) AppendServiceHandler(h model.ServiceHandler) {
istio/pilot/pkg/model/controller.go
Line 67 in c8fdbc7
type ControllerHandlers struct {
We have a util class to dedupe most of the handler stuff.
this was necessary so the service entry controller could fire events on new auto vip allocation
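For readers unfamiliar with the pattern being discussed, here is a minimal sketch of a reusable handler list: callers append handlers and the controller notifies all of them when a service changes. The `handlerList` type, its methods, and the `serviceHandler` signature are hypothetical and deliberately simplified; this is not the actual `ControllerHandlers` definition from Istio.

```go
package main

import (
	"fmt"
	"sync"
)

// serviceHandler is a simplified, hypothetical handler signature.
type serviceHandler func(name, event string)

// handlerList collects handlers so each controller does not have to
// reimplement the append/notify bookkeeping itself.
type handlerList struct {
	mu       sync.RWMutex
	handlers []serviceHandler
}

// Append registers a handler.
func (l *handlerList) Append(h serviceHandler) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.handlers = append(l.handlers, h)
}

// Notify calls every registered handler. The slice is copied under the read
// lock so handlers run without holding it.
func (l *handlerList) Notify(name, event string) {
	l.mu.RLock()
	hs := append([]serviceHandler(nil), l.handlers...)
	l.mu.RUnlock()
	for _, h := range hs {
		h(name, event)
	}
}

func main() {
	var l handlerList
	l.Append(func(name, event string) { fmt.Println(event, name) })
	l.Notify("example.com", "update")
}
```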
	}
}

workloadEntries := c.getAllControllerWorkloadEntries()
I don't really understand this logic. Shouldn't it be more like `getWorkloadEntriesInService`, possibly making that func reusable with labels.Selector?
Same with pods.
addressed in eb05c10
Force-pushed from 3aa0e6e to ce14466
Force-pushed from 3f7840c to 001c2c6
Force-pushed from 001c2c6 to bef3d72
@howardjohn PTAL
@@ -340,7 +354,7 @@ func (a *AmbientIndexImpl) extractWorkload(p *v1.Pod, c *Controller) *model.Work

 	policies := c.selectorAuthorizationPolicies(p.Namespace, p.Labels)
 	policies = append(policies, c.convertedSelectorPeerAuthentications(p.Namespace, p.Labels)...)
-	wl := c.constructWorkload(p, waypoint, policies)
+	wl := c.constructWorkload(p, waypoint, policies, a.servicesMap)
nit: we could just move `constructWorkload` to `ambientIndex` and we don't need to pass `a.servicesMap` since it can access it directly?
addressed in 99b26b2
	}
}

func getWorkloadServices(serviceEntries map[types.NamespacedName]*model.Service, workloadEntry *v1alpha3.WorkloadEntry,
nit: `getWorkloadServiceEntries` may be more clear?
addressed in 04b7435
	a.servicesMap[serviceEntryNamespacedName] = svc
}

sel := klabels.Set(svc.Attributes.ServiceEntry.WorkloadSelector.GetLabels()).AsSelectorPreValidated()
nit: use ValidatedSetSelector instead, it's (much) faster
addressed in b632775
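As background for the selector discussion, the matching semantics of a set-based `workloadSelector` can be sketched in plain Go: a workload is selected when every selector label is present on it with the same value. The function `selectorMatches` is a hypothetical illustration of those semantics only; it is not the Kubernetes or Istio selector implementation and says nothing about their relative performance.

```go
package main

import "fmt"

// selectorMatches reports whether every key/value pair in selector is also
// present in workloadLabels. Note that an empty selector matches everything
// here, so callers should guard against a missing selector first.
func selectorMatches(selector, workloadLabels map[string]string) bool {
	for k, v := range selector {
		if got, ok := workloadLabels[k]; !ok || got != v {
			return false
		}
	}
	return true
}

func main() {
	sel := map[string]string{"app": "uncaptured"}
	pod := map[string]string{"app": "uncaptured", "version": "v1"}
	fmt.Println(selectorMatches(sel, pod))
}
```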
Force-pushed from 12744d3 to 37e605d
Addressed all feedback again; most importantly, I removed the auto VIP allocation support since it was the root cause of tech debt escaping into core Istio types. PTAL @howardjohn
/test unit-tests