
Support serverless environment #440

Open
imroc opened this issue Mar 8, 2023 · 11 comments
Labels
area/configurability Area: Configurability

Comments

@imroc
Member

imroc commented Mar 8, 2023

Background

Thanks to the great idea of separating L4 and L7, and to the current Rust implementation of ztunnel, resource usage and performance overhead are minimized. This also makes it possible to integrate mesh capability (ztunnel) into a serverless environment, where it can be dynamically enabled whenever needed, which is very conducive to large-scale adoption of service mesh.

But currently ztunnel cannot be directly integrated into a serverless environment (one VM per pod). Below I describe the existing problems and how to solve them.

Unable to prefetch cert

When ztunnel runs in a serverless environment, the following error will be reported:

[screenshot: ztunnel error log when fetching certificates]

That's because when ztunnel calls CreateCertificate to fetch a cert, istiod uses the NodeAuthorizer to verify that the caller's service account is the trusted one (istio-system/ztunnel). In our serverless scenario, ztunnel is not a daemonset but a hidden container built into the pod, so the service account used is the pod's own, the verification fails, and the cert cannot be fetched:

[screenshot: istiod NodeAuthorizer rejecting the request]

Solution: let ztunnel support customizing this behavior, so that in a serverless environment it does not carry the metadata and thus avoids NodeAuthorizer verification.
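
To illustrate the idea (a minimal sketch; the function name, flag, and metadata key below are assumptions for illustration, not ztunnel's actual code), the CA client would attach the node-authorization metadata only when running as a node-shared daemonset:

```rust
// Minimal sketch of the idea; names are illustrative, not ztunnel's code.
// `RequestMetadata` stands in for the CreateCertificate request metadata.
use std::collections::HashMap;

type RequestMetadata = HashMap<&'static str, String>;

fn ca_request_metadata(identity: &str, shared_node_proxy: bool) -> RequestMetadata {
    let mut md = RequestMetadata::new();
    if shared_node_proxy {
        // Daemonset mode: carry the impersonated workload identity so that
        // istiod's NodeAuthorizer can verify the caller (istio-system/ztunnel)
        // is allowed to request certs for pods on its node.
        md.insert("ImpersonatedIdentity", identity.to_string());
    }
    // Serverless mode: leave the metadata empty; istiod then authorizes the
    // pod's own service account like an ordinary workload cert request.
    md
}
```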

Failed to transparently intercept traffic

In our serverless environment, ztunnel shares the network namespace with the pod, so it is more like a sidecar, but only for L4 traffic. Just like Envoy intercepting traffic transparently as a sidecar, it needs to set SO_MARK on the packets sent to the local upstream (like this), so that the local upstream's return packets are re-routed to the loopback device by policy routing and then transparently proxied by ztunnel (avoiding the local upstream's return packets going out through eth0).

Solution: let ztunnel support setting a custom SO_MARK on connections to the local upstream.
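
As a rough sketch of what this could look like inside ztunnel, assuming the socket2 crate with its `all` feature (the mark value would have to match the fwmark used in the policy routing rules):

```rust
// Minimal sketch, assuming socket2 = { version = "0.5", features = ["all"] }.
// Setting SO_MARK requires CAP_NET_ADMIN.
use socket2::{Domain, Protocol, Socket, Type};

fn upstream_socket(mark: u32) -> std::io::Result<Socket> {
    let sock = Socket::new(Domain::IPV4, Type::STREAM, Some(Protocol::TCP))?;
    // Packets sent to the local upstream carry this mark; iptables can save
    // it to the connection (CONNMARK) and restore it onto the app's reply
    // packets, so a rule like `ip rule add fwmark 0x539 lookup 133` routes
    // the replies to loopback instead of out through eth0.
    sock.set_mark(mark)?;
    Ok(sock)
}
```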

imroc added the area/configurability label Mar 8, 2023
@imroc
Member Author

imroc commented Mar 8, 2023

I have integrated ztunnel into our serverless environment by modifying the ztunnel code, and successfully ran ambient mode there. Next, I will generalize my modification and submit PRs to fix this.

imroc added a commit to imroc/ztunnel that referenced this issue Mar 8, 2023
Let ztunnel be able to fetch certificates in serverless environment, part of istio#440
@howardjohn
Member

Similar deployment models include "Istio as a sidecar" - useful when we cannot deploy a daemonset, or on a VM (not always serverless).

cc @stevenctl @adiprerepa and @costinm who are interested in these deployments

We should make sure the solution is generalized to these; I don't think it will be too hard to do that.

@bleggett
Contributor

bleggett commented Mar 8, 2023

Solution: let ztunnel support setting a custom SO_MARK on connections to the local upstream.

If we need to set optional packet marks for netns routing purposes, CNI/eBPF/iptables might be a better spot for this than ztunnel.

I understand that this won't work for serverless - but that doesn't necessarily mean we should be doing this in zt proper.

@costinm
Contributor

costinm commented Mar 8, 2023

For VMs and serverless (at least Google serverless), iptables mode works fine, and is probably good enough for now.

I have a small change for DNS capture.

Ztunnel needs some convincing that the VM is a pod on the same node.

imroc added a commit to imroc/ztunnel that referenced this issue Mar 9, 2023
Let ztunnel be able to fetch certificates in serverless environment, part of istio#440
istio-testing pushed a commit that referenced this issue Mar 9, 2023
Let ztunnel be able to fetch certificates in serverless environment, part of #440
@imroc
Member Author

imroc commented Mar 10, 2023

If we need to set optional packet marks for netns routing purposes, CNI/eBPF/iptables might be a better spot for this than ztunnel.

@bleggett I optimized the iptables rules, and it worked - there is no need for zt to set SO_MARK. But I don't understand why istiod sets the 1337 mark for envoy.filters.listener.original_src in sidecar mode when interceptionMode is set to TPROXY. Is it for performance (to reduce iptables matching and marking)?

Check the istiod code here.

Check the Envoy reference here.

Let me share my iptables and policy-based routing implementation:

# Policy-based routing, to achieve transparent proxying, has two functions:
# 1) When external non-mTLS traffic comes in, it is tproxied to ztunnel.
# 2) Avoid the app's return packets going out through `eth0`; ensure they go
#    to `lo` so they are transparently proxied by ztunnel.
ip route add local 0.0.0.0/0 dev lo table 133
ip rule add priority 101 fwmark 0x539 lookup 133

iptables -t mangle -N ztunnel-PREROUTING
iptables -t mangle -N ztunnel-DIVERT
iptables -t nat -N ztunnel-OUTPUT

# outbound

# Restore the connection mark onto the app's return packets so policy routing sends them to `lo`, where ztunnel transparently proxies them
iptables -t mangle -A OUTPUT -p tcp -m connmark --mark 1337 -j CONNMARK --restore-mark

iptables -t nat -A OUTPUT -p tcp -j ztunnel-OUTPUT
# When ztunnel forwards a connection to the local upstream, mark it with 1337
iptables -t nat -A ztunnel-OUTPUT -o lo -m owner --gid-owner 1337 -m conntrack --ctstate NEW -j CONNMARK --set-mark 1337
iptables -t nat -A ztunnel-OUTPUT -o lo -j RETURN
# Mark outgoing connections initiated by ztunnel itself (e.g. to a peer ztunnel:15008 or to istiod:15012) so their subsequent return packets are not intercepted
iptables -t nat -A ztunnel-OUTPUT -m owner --gid-owner 1337 -m conntrack --ctstate NEW -j CONNMARK --set-mark 1338
# The traffic of ztunnel itself is not intercepted
iptables -t nat -A ztunnel-OUTPUT -m owner --gid-owner 1337 -j RETURN
# Other outbound traffic is uniformly intercepted to ztunnel:15001
iptables -t nat -A ztunnel-OUTPUT -p tcp -j REDIRECT --to-ports 15001

# inbound

iptables -t mangle -A ztunnel-DIVERT -j MARK --set-xmark 1337
iptables -t mangle -A ztunnel-DIVERT -j ACCEPT

iptables -t mangle -A PREROUTING -p tcp -j ztunnel-PREROUTING

iptables -t mangle -A ztunnel-PREROUTING -i lo -j RETURN
# Ignore return packets from outside destined for ztunnel itself (from istiod or a peer ztunnel)
iptables -t mangle -A ztunnel-PREROUTING -m connmark --mark 1338 -j RETURN
# Ignore 15008 and 15020
iptables -t mangle -A ztunnel-PREROUTING -p tcp -m tcp --dport 15008 -j RETURN
iptables -t mangle -A ztunnel-PREROUTING -p tcp -m tcp --dport 15020 -j RETURN
# For packets belonging to an already-established connection, set the 1337 mark so they go directly to `lo` and are transparently proxied by ztunnel
iptables -t mangle -A ztunnel-PREROUTING -p tcp -m conntrack --ctstate RELATED,ESTABLISHED -j ztunnel-DIVERT
# All other inbound traffic is tproxied to 15006
iptables -t mangle -A ztunnel-PREROUTING ! -d 127.0.0.1/32 -p tcp -j TPROXY --on-port 15006 --on-ip 0.0.0.0 --tproxy-mark 1337

@imroc
Member Author

imroc commented Mar 10, 2023

Another problem in the serverless environment:

Proxy outbound with DirectLocal if destination is on the same virtual node

In serverless, the node is virtual and pods on the same "node" do not share a ztunnel, so we can't let outbound traffic go through the DirectLocal fast path (see the sketch below).

Solution: see PR #448.
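
As a sketch of the idea (hypothetical names, not the actual change in PR #448): in serverless, "same node" no longer implies "same ztunnel", so the fast path must also check whether the proxy is node-shared:

```rust
// Illustrative sketch only; `OutboundPath` and `shared_node_proxy` are
// hypothetical names, not ztunnel's real types.
enum OutboundPath {
    // Hand off in-process to the node-shared ztunnel (same-node fast path).
    DirectLocal,
    // Tunnel to the destination's own ztunnel over HBONE (port 15008).
    Hbone,
}

fn choose_path(dest_on_same_node: bool, shared_node_proxy: bool) -> OutboundPath {
    // On a virtual serverless "node", pods do not share a ztunnel, so
    // DirectLocal must be skipped even for same-node destinations.
    if dest_on_same_node && shared_node_proxy {
        OutboundPath::DirectLocal
    } else {
        OutboundPath::Hbone
    }
}
```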

@bleggett
Contributor

bleggett commented Mar 10, 2023

@bleggett I optimized the iptables rules, and it worked - there is no need for zt to set SO_MARK. But I don't understand why istiod sets the 1337 mark for envoy.filters.listener.original_src in sidecar mode when interceptionMode is set to TPROXY. Is it for performance (to reduce iptables matching and marking)?

Unclear to me; we don't use that mark in the ambient CNI, and I don't have a good mental model of the non-ambient CNI yet.

From what I can gather from the CNI code, the 1337 mark is used to mark traffic coming from the sidecar proxy itself, so that it can be skipped for redirection. I am not aware of another use for it.

@costinm
Contributor

costinm commented Mar 10, 2023 via email

@costinm
Contributor

costinm commented Mar 10, 2023 via email

@bleggett
Contributor

Does marking in ztunnel require NET_ADMIN? We still want to make sure ztunnel as a sidecar doesn't need NET_ADMIN caps.

Agree - I think @imroc has found we don't need ztunnel to do the marking, which is what I would expect.

@imroc
Member Author

imroc commented Mar 15, 2023

I'm in favor of this design, which is applicable to both serverless and VM scenarios: https://docs.google.com/document/d/135gWueRb5KW2vc0RerE2vGKBoCFxbqoLVBuTGJBcxO4/edit#

Related PR: #452
