Description
I was playing with iptables yesterday, and I protoyped (well, copied from Google hits and mutated) a set of iptables rules that essentially do all the proxying for us without help from userspace. It's not urgent, but I want to file my notes before I lose them.
This has the additional nice side-effect (as far as I can tell) of preserving the source IP and being a large net simplification. Now kube-proxy would just need to sync Services -> iptables. This has the downside of not being compatible with older iptables and kernels. We had a problem with this before - at some point we need to decide just how far back in time we care about.
This can probably be optimized further, but in basic testing, I see sticky sessions working and if I comment that part out I see ~equal probability of hitting each backend. I was not able to get deterministic round-robin working properly (with --nth instead of --probability) but we could come back to that if we want.
This sets up a service portal with the backends listed below
iptables -t nat -N TESTSVC
iptables -t nat -F TESTSVC
iptables -t nat -N TESTSVC_A
iptables -t nat -F TESTSVC_A
iptables -t nat -N TESTSVC_B
iptables -t nat -F TESTSVC_B
iptables -t nat -N TESTSVC_C
iptables -t nat -F TESTSVC_C
iptables -t nat -A TESTSVC -m recent --name hostA --rcheck --seconds 1 --reap -j TESTSVC_A
iptables -t nat -A TESTSVC -m recent --name hostB --rcheck --seconds 1 --reap -j TESTSVC_B
iptables -t nat -A TESTSVC -m recent --name hostC --rcheck --seconds 1 --reap -j TESTSVC_C
iptables -t nat -A TESTSVC -m statistic --mode random --probability 0.333 -j TESTSVC_A
iptables -t nat -A TESTSVC -m statistic --mode random --probability 0.500 -j TESTSVC_B
iptables -t nat -A TESTSVC -m statistic --mode random --probability 1.000 -j TESTSVC_C
iptables -t nat -A TESTSVC_A -m recent --name hostA --set -j DNAT -p tcp --to-destination 10.244.4.6:9376
iptables -t nat -A TESTSVC_B -m recent --name hostB --set -j DNAT -p tcp --to-destination 10.244.1.15:9376
iptables -t nat -A TESTSVC_C -m recent --name hostC --set -j DNAT -p tcp --to-destination 10.244.4.7:9376
iptables -t nat -F KUBE-PORTALS-HOST
iptables -t nat -A KUBE-PORTALS-HOST -d 10.0.0.93/32 -m state --state NEW -p tcp -m tcp --dport 80 -j TESTSVC
iptables -t nat -F KUBE-PORTALS-CONTAINER
iptables -t nat -A KUBE-PORTALS-CONTAINER -d 10.0.0.93/32 -m state --state NEW -p tcp -m tcp --dport 80 -j TESTSVC