ambient upstream: upstream XDS codegen and some misc #43422

Merged

Conversation

howardjohn
Member

Closes #40879

This is the final ambient upstream PR. With this PR, ambient mesh will be fully runnable from the master branch and part of our regular CI/CD, release process, etc. The experimental-ambient branch will be deprecated and no longer used.

This PR imports the final xDS changes, as well as some minor ports that slipped through the cracks in previous PRs.

Most of this PR is new XDS for waypoints. This code, of course, only applies to waypoints, so it is off by default. There are some changes to sidecar HBONE as well, which is already off by default. There should be no changes for sidecar users; everything is off by default.

Quoting from the issue:

Reviewer guide

It is suggested that reviewers treat these PRs differently than normal, as the code has already been reviewed. PRs should be thoroughly reviewed for risk to non-ambient users, ensuring any new/risky/expensive code is disabled by default. However, code quality comments, nit picks, or even bugs in ambient code should not block a merge; modifying these PRs will be high cost and lead to a 3-way drift from master, experimental-ambient, and the PR. Instead, these comments should be turned into issues to complete following the merge.

Release notes will not be added for any PR. Instead, they will be written holistically for the ambient feature in 1.18 at a later point.

@howardjohn howardjohn requested review from a team as code owners February 17, 2023 01:12
@howardjohn
Member Author

/test ?

@istio-testing
Collaborator

@howardjohn: The following commands are available to trigger required jobs:

  • /test analyze-tests
  • /test gencheck
  • /test integ-ambient
  • /test integ-basic-arm64
  • /test integ-cni
  • /test integ-distroless
  • /test integ-ds
  • /test integ-helm
  • /test integ-ipv6
  • /test integ-operator-controller
  • /test integ-pilot
  • /test integ-pilot-istiodremote
  • /test integ-pilot-istiodremote-mc
  • /test integ-pilot-multicluster
  • /test integ-security
  • /test integ-security-istiodremote
  • /test integ-security-multicluster
  • /test integ-telemetry
  • /test integ-telemetry-istiodremote
  • /test integ-telemetry-mc
  • /test lint
  • /test release-notes
  • /test release-test
  • /test unit-tests
  • /test unit-tests-arm64

The following commands are available to trigger optional jobs:

  • /test benchmark
  • /test integ-assertion

Use /test all to run the following jobs that were automatically triggered:

  • analyze-tests_istio
  • gencheck_istio
  • integ-basic-arm64_istio
  • integ-cni_istio
  • integ-distroless_istio
  • integ-ds_istio
  • integ-helm_istio
  • integ-ipv6_istio
  • integ-operator-controller_istio
  • integ-pilot-istiodremote-mc_istio
  • integ-pilot-istiodremote_istio
  • integ-pilot-multicluster_istio
  • integ-pilot_istio
  • integ-security-istiodremote_istio
  • integ-security-multicluster_istio
  • integ-security_istio
  • integ-telemetry-istiodremote_istio
  • integ-telemetry-mc_istio
  • integ-telemetry_istio
  • lint_istio
  • release-notes_istio
  • release-test_istio
  • unit-tests-arm64_istio
  • unit-tests_istio

In response to this:

/test ?

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@istio-testing istio-testing added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Feb 17, 2023
@howardjohn
Member Author

/test integ-ambient

// Setup inbound clusters
inboundPatcher := clusterPatcher{efw: envoyFilterPatches, pctx: networking.EnvoyFilter_SIDECAR_INBOUND}
clusters = append(clusters, configgen.buildInboundClusters(cb, proxy, instances, inboundPatcher)...)
clusters = append(clusters, configgen.buildInboundHBONEClusters(cb, proxy, instances)...)
Member Author

note: this looks suspicious but we check EnableHBONE in buildInboundHBONEClusters so it does NOT impact non-HBONE users
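
For readers following along, a minimal sketch of why the unconditional append is safe. This is not the actual Istio code: the `proxy` type, the string "clusters", and the cluster names are hypothetical stand-ins. The only point is that the builder itself returns nothing when HBONE is off.

```go
package main

import "fmt"

// proxy is a hypothetical stand-in for the proxy model; only the HBONE toggle matters here.
type proxy struct{ hbone bool }

// EnableHBONE mirrors the check named in the comment above.
func (p proxy) EnableHBONE() bool { return p.hbone }

// buildInboundHBONEClusters returns nil when HBONE is off, so an unconditional
// append at the call site adds nothing for non-HBONE proxies.
func buildInboundHBONEClusters(p proxy) []string {
	if !p.EnableHBONE() {
		return nil
	}
	return []string{"inbound-hbone"} // hypothetical cluster name
}

// buildClusters appends the HBONE clusters unconditionally, like the call site under review.
func buildClusters(p proxy) []string {
	clusters := []string{"inbound|8080"} // hypothetical regular inbound cluster
	clusters = append(clusters, buildInboundHBONEClusters(p)...)
	return clusters
}

func main() {
	fmt.Println(len(buildClusters(proxy{hbone: false})), len(buildClusters(proxy{hbone: true})))
}
```

Keeping the guard inside the builder means the call site stays a single line, at the cost of looking unconditional to a reviewer.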

@howardjohn howardjohn added the release-notes-none Indicates a PR that does not require release notes. label Feb 17, 2023
@howardjohn
Member Author

/retest

@howardjohn
Member Author

/test integ-ambient

@hzxuzhonghu
Member

@hzxuzhonghu hzxuzhonghu left a comment

Generally LG; I did not find any risky points.

@@ -82,6 +82,7 @@ func (s *Server) initKubeRegistry(args *PilotArgs) (err error) {
args.RegistryOptions.ClusterRegistriesNamespace,
args.RegistryOptions.KubeOptions,
s.serviceEntryController,
s.configController,
Member

mark to check its usage

@@ -40,6 +40,8 @@ import (

func TestConfigureIstioGateway(t *testing.T) {
test.SetForTest(t, &features.EnableAmbientControllers, true)
// Recompute with ambient enabled
classInfos = getClassInfos()
Member

remove?

Member Author

This is needed, per the comment, to recompute classInfos, since SetForTest runs after it is first computed.
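
A small sketch of the ordering issue, under assumptions: `enableAmbient`, `computeClasses`, and the class names are hypothetical stand-ins for `features.EnableAmbientControllers`, `getClassInfos`, and the real class list. A package-level value derived from a flag is fixed at package initialization, so flipping the flag later in a test has no effect until the value is recomputed.

```go
package main

import "fmt"

// enableAmbient is a hypothetical feature flag (stand-in for features.EnableAmbientControllers).
var enableAmbient = false

// computeClasses derives the class list from the current flag value.
func computeClasses() []string {
	classes := []string{"default"} // hypothetical class name
	if enableAmbient {
		classes = append(classes, "mesh") // hypothetical ambient-only class
	}
	return classes
}

// classes is computed once at package initialization, before any test can flip the flag.
var classes = computeClasses()

func main() {
	enableAmbient = true         // what a test override does, after init has already run
	fmt.Println(len(classes))    // still reflects the flag value seen at init
	classes = computeClasses()   // explicit recompute picks up the new flag value
	fmt.Println(len(classes))
}
```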

pilot/pkg/networking/core/v1alpha3/cluster.go (resolved)
pilot/pkg/networking/core/v1alpha3/cluster.go (outdated, resolved)
ts := c.TransportSocket
c.TransportSocket = nil

c.TransportSocketMatches = []*cluster.Cluster_TransportSocketMatch{
Member

abstract like HboneOrPlaintextSocket

l := builder.getListeners()
if builder.node.EnableHBONE() {
if builder.node.IsAmbient() {
l = append(l, outboundTunnelListener(builder.push, builder.node))
Member

move this after L107?

Contributor

I think we need to finish the listener patching before we do this. Maybe we should add buildWaypointListeners, which calls buildSidecarListeners and does this stuff.

Comment on lines +188 to +219
if lb.node.Type == model.Router {
	return lb.gatewayListeners
}
nInbound, nOutbound := len(lb.inboundListeners), len(lb.outboundListeners)
nHTTPProxy, nVirtual := 0, 0
if lb.httpProxyListener != nil {
	nHTTPProxy = 1
}
if lb.virtualOutboundListener != nil {
	nVirtual = 1
}
nListener := nInbound + nOutbound + nHTTPProxy + nVirtual

listeners := make([]*listener.Listener, 0, nListener)
listeners = append(listeners, lb.outboundListeners...)
if lb.httpProxyListener != nil {
	listeners = append(listeners, lb.httpProxyListener)
}
if lb.virtualOutboundListener != nil {
	listeners = append(listeners, lb.virtualOutboundListener)
}
listeners = append(listeners, lb.inboundListeners...)

log.Debugf("Build %d listeners for node %s including %d outbound, %d http proxy, "+
	"%d virtual outbound",
	nListener,
	lb.node.ID,
	nOutbound,
	nHTTPProxy,
	nVirtual,
)
return listeners
Member

where is the func updated? Just a refactoring router first?

Member Author

Yes

Comment on lines 447 to 450
workloads := findWaypoints(b.push, e)
if len(workloads) > 0 {
// TODO: load balance
tunnelAddress = workloads[0].String()
Member

Will this translate multiple endpoints with the same waypoint address?

Member Author

Same waypoint yes, but the endpoints are not identical, so we cannot merge them and add weight. Each one has two addresses associated: the waypoint, and the destination (set in :authority) we want.
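
To illustrate the point about not merging, here is a rough sketch. The `tunneledEndpoint` type, its field names, and the addresses are all made up for illustration and are not the real endpoint model. Each endpoint pairs the shared waypoint address with its own destination, so two endpoints behind one waypoint are still distinct values.

```go
package main

import "fmt"

// tunneledEndpoint is a hypothetical model of an endpoint reached through a waypoint.
type tunneledEndpoint struct {
	TunnelAddress string // where the connection is sent: the waypoint
	Authority     string // the real destination, carried in :authority
}

// viaWaypoint builds one endpoint per destination; all of them share the waypoint address.
func viaWaypoint(waypoint string, destinations []string) []tunneledEndpoint {
	out := make([]tunneledEndpoint, 0, len(destinations))
	for _, d := range destinations {
		out = append(out, tunneledEndpoint{TunnelAddress: waypoint, Authority: d})
	}
	return out
}

func main() {
	// Addresses below are invented for the example.
	eps := viaWaypoint("10.0.0.9:15008", []string{"10.0.1.1:8080", "10.0.1.2:8080"})
	for _, e := range eps {
		fmt.Printf("connect to %s, :authority=%s\n", e.TunnelAddress, e.Authority)
	}
}
```

Collapsing these into one weighted endpoint would lose the per-destination :authority, which is why they stay separate.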

@@ -425,10 +460,54 @@ func buildEnvoyLbEndpoint(proxyless bool, e *model.IstioEndpoint) *endpoint.LbEn
},
}
}
if dir == model.TrafficDirectionInboundVIP {
Member

Is it possible to move these to a waypoint endpoint builder like listener and cluster?

Member Author

It may be a bit hard, since buildEnvoyLbEndpoint is a common function used by all types; we just have a small bit of custom logic here.

@hzxuzhonghu
Member

cc @kyessenov to take a look at the waypoint builder

Contributor

@ramaraochavali ramaraochavali left a comment

Generally Looks great. Nicely done. Did not fully review the waypoint* files. We can do once this is merged. Left some comments. None of them are blockers.

pilot/pkg/networking/core/v1alpha3/cluster.go (outdated, resolved)
pilot/pkg/networking/core/v1alpha3/cluster.go (outdated, resolved)
@@ -445,6 +449,7 @@ func (cb *ClusterBuilder) buildInboundClusterForPortOrUDS(clusterPort int, bind
}
localCluster := cb.buildDefaultCluster(clusterName, clusterType, localityLbEndpoints,
model.TrafficDirectionInbound, instance.ServicePort, instance.Service, allInstance)

Contributor

nit: remove

pilot/pkg/networking/util/util.go (outdated, resolved)
model.TrafficDirectionInbound, &port, nil, nil)

// Ensure VIP cluster has services metadata for stats filter usage
im := getOrCreateIstioMetadata(localCluster.cluster)
Contributor

Should this be enabled only when telemetry is on?

Member Author

TBH, not sure it's going to matter. The waypoint XDS size is going to be orders of magnitude smaller than for sidecars, so a bit of metadata here and there may be fine. We can follow up as needed, though; telemetry for waypoint is still under construction.

lb.inboundListeners = lb.buildInboundListeners()
if lb.node.EnableHBONE() {
lb.inboundListeners = append(lb.inboundListeners, lb.buildInboundHBONEListeners()...)
if lb.node.IsWaypointProxy() {
Contributor

Don't we need HBone stuff for waypoint?

Member Author

We do, but it's in buildWaypointInbound. We recently improved the HBONE implementation for waypoint, so it's a bit different from the sidecar one. There is work in #42789 to bring these into alignment, which may clean this up.

supportsTunnel = false
}

// For outbound case, we selectively add tunnel info if the other side supports the tunnel
-if supportsTunnel {
+if dir != model.TrafficDirectionInboundVIP && supportsTunnel {
Contributor

parse dir only if supportsTunnel is true and move this inside if supportsTunnel?

Member Author

We actually need part of it even with !supportsTunnel. Let me add dir to endpointBuilder so we don't re-parse it and clean up this logic.
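
Roughly what that cleanup could look like, as a sketch only. The trimmed-down `endpointBuilder`, the `newEndpointBuilder` constructor, the `wantsTunnel` helper, the string directions, and the "direction|port|subset|host" name layout are all assumptions for illustration, not the real types. The direction is parsed once at construction and later steps read the stored field.

```go
package main

import (
	"fmt"
	"strings"
)

// endpointBuilder is a hypothetical, trimmed-down builder that stores the
// traffic direction parsed from the cluster name at construction time.
type endpointBuilder struct {
	clusterName string
	dir         string
}

// newEndpointBuilder parses the direction once; later steps read b.dir.
// The "direction|port|subset|host" layout is an assumption for this sketch.
func newEndpointBuilder(clusterName string) endpointBuilder {
	parts := strings.SplitN(clusterName, "|", 2)
	return endpointBuilder{clusterName: clusterName, dir: parts[0]}
}

// wantsTunnel reuses the stored direction instead of re-parsing the cluster name.
func (b endpointBuilder) wantsTunnel(supportsTunnel bool) bool {
	return b.dir != "inbound-vip" && supportsTunnel
}

func main() {
	b := newEndpointBuilder("outbound|8080||a.example")
	fmt.Println(b.dir, b.wantsTunnel(true))
}
```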

-if supportsTunnel {
+if dir != model.TrafficDirectionInboundVIP && supportsTunnel {
// Support connecting to server side waypoint proxy, if the destination has one. This is for sidecars and ingress.
if dir == model.TrafficDirectionOutbound && !b.proxy.IsWaypointProxy() && !b.proxy.IsAmbient() {
Contributor

Can this logic be enabled only when ambient mesh is enabled? Do we have such a global flag? findWaypoints etc. are not needed if ambient mesh is not enabled?

Member Author

This is already the case:

	if !b.proxy.EnableHBONE() {
		supportsTunnel = false
	}

which is controlled by both a global flag AND a per-proxy flag.
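
A sketch of the two-level gate, assuming hypothetical names throughout: `globalHBONE`, the `proxy` struct, and `hboneOptIn` are illustrative, not the real flag or proxy metadata. The waypoint lookup path is only reached when both switches are on.

```go
package main

import "fmt"

// globalHBONE is a hypothetical stand-in for the mesh-wide feature flag.
var globalHBONE = false

// proxy models only the per-proxy opt-in.
type proxy struct{ hboneOptIn bool }

// EnableHBONE is true only when both the global flag and the proxy opt-in are set.
func (p proxy) EnableHBONE() bool { return globalHBONE && p.hboneOptIn }

// supportsTunnel applies the guard quoted above to an endpoint's advertised capability.
func supportsTunnel(p proxy, endpointSupports bool) bool {
	if !p.EnableHBONE() {
		return false
	}
	return endpointSupports
}

func main() {
	globalHBONE = true
	fmt.Println(supportsTunnel(proxy{hboneOptIn: false}, true), supportsTunnel(proxy{hboneOptIn: true}, true))
}
```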

Contributor

But can't HBONE be enabled independently?

metadata:
labels:
gateway.istio.io/managed: istio.io-mesh-controller
name: namespace-istio-waypoint
Contributor

Can we add a TODO list or bug to track reviewing/changing all user-visible names?

I don't want to delay this PR, since it is a merge, but the review criteria in an experimental branch are slightly different from the review on main; that's why it's experimental. I expect the code has been tested and reviewed for functionality, and it is off by default, but we got into a lot of problems by reviewing APIs and names as 'experimental/alpha' and then having to move them to beta/stable due to backwards compat.

Member Author

Yes definitely agree. Opened #43437 to track

Contributor

Also, to confirm: now or in the future, will we allow the user to create a ServiceAccount with the same name and additional annotations? And we won't add or merge this on top?

Can you at least add a TODO that this needs to be documented and discussed? It has to be documented, since users may need to add annotations (for example, to allow the waypoint to get 3p service account tokens) and it is used in RBAC rules.

Member Author

The plan is that if the resource already exists (without gateway.istio.io/managed), we will not touch it, so users can own it themselves. Note that this isn't implemented yet but is tracked in #43439.

Member Author

Note since we use Apply they can add annotations and we won't revert them btw. But it is useful to have them be able to fully own it I think.
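
For discussion, a minimal sketch of the planned ownership rule described in this thread. As noted, it is not implemented; `shouldReconcile` and its signature are hypothetical, and only the label key comes from the manifest under review.

```go
package main

import "fmt"

// managedLabel is the label key shown in the manifest under review.
const managedLabel = "gateway.istio.io/managed"

// shouldReconcile reports whether the controller may write the resource:
// it creates missing resources and updates ones carrying the managed label,
// but leaves pre-existing unlabeled resources to their owner.
func shouldReconcile(exists bool, labels map[string]string) bool {
	if !exists {
		return true
	}
	_, managed := labels[managedLabel]
	return managed
}

func main() {
	userOwned := map[string]string{"app": "custom"} // hypothetical user-created ServiceAccount labels
	ours := map[string]string{managedLabel: "istio.io-mesh-controller"}
	fmt.Println(shouldReconcile(true, userOwned), shouldReconcile(true, ours), shouldReconcile(false, nil))
}
```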

@kyessenov
Contributor

@hzxuzhonghu I recently changed waypoint xDS and it's fine for now. We can iterate on the VirtualService translation for it later.

Contributor

@stevenctl stevenctl left a comment

Looks safe to me.

@howardjohn
Member Author

/retest

@howardjohn
Member Author

/test integ-ambient

@istio-testing istio-testing merged commit 73e9b41 into istio:master Feb 17, 2023
@costinm
Contributor

costinm commented Feb 18, 2023 via email

@ericvn
Contributor

ericvn commented Feb 20, 2023

It does seem this PR has a big negative effect on the post-submit tests, noting that it did re-enable the ambient tests.

Comment on lines +156 to +159
ManagedGatewayLabel = "gateway.istio.io/managed"
ManagedGatewayController = "istio.io-gateway-controller"
ManagedGatewayMeshControllerLabel = "istio.io-mesh-controller"
ManagedGatewayMeshController = "istio.io/mesh-controller"
Contributor

In general, these constants should have comments; otherwise it is hard to tell them apart.

Member

Makes sense; I think these may even change.

@howardjohn
Member Author

It does seem this PR has a big negative effect on the post-submit tests, noting that it did re-enable the ambient tests.

It's a tiny change; we just need to build ztunnel in more cases. I'll fix it when I'm back on Tuesday.

Labels
release-notes-none Indicates a PR that does not require release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants