Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unable to enforce policy on running endpoints with debug=false and kernel==4.10 + some stable #1822

Closed
aanm opened this issue Oct 23, 2017 · 13 comments
Assignees
Labels
kind/bug This is a bug in the Cilium logic. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages.

Comments

@aanm
Copy link
Member

aanm commented Oct 23, 2017

Steps to reproduce:

vagrant init eloycoto/cilium
vagrant up --provider=virtualbox
# wait for VM to boot
vagrant ssh
# inside the VM
mkdir -p /home/vagrant/go/src
go get github.com/cilium/cilium
go get github.com/jteeuwen/go-bindata
export GOPATH=/home/vagrant/go
cd $GOPATH/src/github.com/jteeuwen/go-bindata
# install go-bindata
go install ./...
export PATH=$GOPATH/bin:$PATH
cd $GOPATH/src/github.com/cilium/cilium
# make and install cilium
make && sudo make install
sudo mkdir -p /etc/sysconfig
# Install systemd scripts
sudo cp /home/vagrant/go/src/github.com/cilium/cilium/contrib/systemd/cilium-consul.service /lib/systemd/system
sudo cp /home/vagrant/go/src/github.com/cilium/cilium/contrib/systemd/cilium-docker.service /lib/systemd/system
sudo cp /home/vagrant/go/src/github.com/cilium/cilium/contrib/systemd/cilium-etcd.service /lib/systemd/system
sudo cp /home/vagrant/go/src/github.com/cilium/cilium/contrib/systemd/cilium.service /lib/systemd/system
sudo cp /home/vagrant/go/src/github.com/cilium/cilium/contrib/systemd/cilium /etc/sysconfig

# Disable debug in cilium options
sudo sh -c 'echo "CILIUM_OPTS=--kvstore consul --kvstore-opt consul.address=127.0.0.1:8500" >> /etc/sysconfig/cilium'

# pre-download images
sudo docker pull cilium/demo-httpd
sudo docker pull consul:0.8.3
# start cilium
sudo service cilium restart
# ensure cilium is running
sudo service cilium status
# With debug disabled
ps aux | grep cilium-agent
# /usr/bin/cilium-agent --debug --kvstore consul --kvstore-opt consul.address=127.0.0.1:8500
#
# Create docker network and containers
sudo docker network create --ipv6 --subnet ::1/112 --driver cilium --ipam-driver cilium cilium-net
sudo docker run -d --name app1 --net cilium-net -l "id=app1" cilium/demo-httpd
sudo docker run -d --name app2 --net cilium-net -l "id=app2" cilium/demo-httpd
sudo docker run -d --name app3 --net cilium-net -l "id=app3" cilium/demo-httpd

# Check the endpoints are ready
sudo cilium endpoint list
# Enable policy PolicyEnforcement
sudo cilium config PolicyEnforcement=always

sudo journalctl -u cilium -f
#... level=warning msg="Command execution failed: exit status 1"
#... level=warning msg="Join EP id=/var/run/cilium/state/38939_next ifname=lxcfbdd1"
#... level=warning msg="'probe' is not a recognized processor for this target (ignoring processor)"
#... level=warning msg="'probe' is not a recognized processor for this target (ignoring processor)"
#... level=warning msg="Log buffer too small to dump verifier log 2097152 bytes (7 tries)!"
#... level=warning msg="Error fetching program/map!"
#... level=warning msg="Failed to retrieve (e)BPF data!

This only happens when debug mode is off
"clang (4.0.0) and kernel (4.10.0) versions: OK!"

kernels 4.9, 4.12 and 4.13 are not affected

Reported-by @eloycoto

@aanm aanm added sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. kind/bug This is a bug in the Cilium logic. priority/high This is considered vital to an upcoming release. labels Oct 23, 2017
@eloycoto
Copy link
Member

Btw: If you run the cilium-agent in debug mode the issue does not happen.

@borkmann
Copy link
Member

I tried to reproduce this on a 4.10 kernel with the object files @aanm provided me offline.

I used the git tagged 4.10 kernel, and the related 4.10 tagged iproute2 version to test it. The error I get below is a different one than in the log above, which suggests in the log above an older iproute2 version may have been used, which doesn't expand log further.

Anyway, the error I get is due to complexity:

# ./iproute2/tc/tc -V
tc utility, iproute2-ss170905
# uname -a
Linux linux-2.home 4.10.0 #1 SMP Mon Oct 23 16:46:40 CEST 2017 x86_64 x86_64 x86_64 GNU/Linux
# ./iproute2/tc/tc filter add dev lo ingress bpf da obj ./bpf_lxc.o sec from-container
Note: 8 bytes struct bpf_elf_map fixup performed due to size mismatch!

Prog section 'from-container' rejected: Argument list too long (7)!
 - Type:         3
 - Instructions: 2557 (0 over limit)
 - License:      GPL

Verifier analysis:

Skipped 3422381 bytes, use 'verb' option for the full verbose log.
[...]
 R0=inv56,min_value=128,max_value=128 R1=inv32 R2=imm1,min_value=1,max_value=1 R3=imm2,min_value=2,max_value=2 R4=imm2,min_value=2,max_value=2 R6=ctx R7=imm2,min_value=2,max_value=2 R8=imm0,min_value=0,max_value=0 R9=inv56,min_value=61,max_value=58 R10=fp
539: (b7) r2 = 0
540: (7b) *(u64 *)(r10 -256) = r4
541: (79) r3 = *(u64 *)(r10 -232)
542: (07) r3 += 14
543: (7b) *(u64 *)(r10 -240) = r3
544: (77) r1 >>= 31
545: (4f) r2 |= r1
546: (15) if r2 == 0x0 goto pc+9
 R0=inv56,min_value=128,max_value=128 R1=inv63 R2=inv R3=inv R4=imm2,min_value=2,max_value=2 R6=ctx R7=imm2,min_value=2,max_value=2 R8=imm0,min_value=0,max_value=0 R9=inv56,min_value=61,max_value=58 R10=fp
547: (b7) r1 = 0
548: (7b) *(u64 *)(r10 -264) = r1
549: (bf) r1 = r8
550: (67) r1 <<= 32
551: (77) r1 >>= 32
552: (18) r2 = 0xffffff72
554: (1d) if r1 == r2 goto pc+185
 R0=inv56,min_value=128,max_value=128 R1=inv32 R2=inv R3=inv R4=imm2,min_value=2,max_value=2 R6=ctx R7=imm2,min_value=2,max_value=2 R8=imm0,min_value=0,max_value=0 R9=inv56,min_value=61,max_value=58 R10=fp
555: (05) goto pc-502
BPF program is too large. Proccessed 65537 insn

Error fetching program/map!

The issue is that 4.10 was declared EOL [1] last May. So it still has the old complexity limit of 68k, whereas 4.9, 4.11., 4.12, 4.13 have a complexity limit og 98k, and 4.14 of 128k. If we want to support 4.10, we need to split from-container into smaller chunks.

[1] http://lkml.iu.edu/hypermail/linux/kernel/1705.2/04097.html

@ianvernon
Copy link
Member

level=warning msg="Log buffer too small to dump verifier log 2097152 bytes (7 tries)! - is this something that can be fixed on the Cilium side? Would be helpful to have more to go off of than just this error message.

@borkmann borkmann self-assigned this Oct 23, 2017
@borkmann
Copy link
Member

What does tc -V say?

@aanm
Copy link
Member Author

aanm commented Oct 23, 2017

$ tc -V
tc utility, iproute2-ss161212

@borkmann
Copy link
Member

iproute2-ss161212 is the version for 4.9 kernels. The one I used iproute2-ss170905 for 4.10 kernels. It's missing this commit:

commit 0f74d0f3a9da9de5fa69bfc9eafcea9b90bbe211
Author: Thomas Graf <tgraf@suug.ch>
Date:   Wed Dec 7 10:47:59 2016 +0100

    bpf: Fix number of retries when growing log buffer
    
    The log buffer is automatically grown when the verifier output does not
    fit into the default buffer size. The number of growing attempts was
    not sufficient to reach the maximum buffer size so far.
    
    Perform 9 iterations to reach max and let the 10th one fail.
    
    j:0     i:65536         max:16777215
    j:1     i:131072        max:16777215
    j:2     i:262144        max:16777215
    j:3     i:524288        max:16777215
    j:4     i:1048576       max:16777215
    j:5     i:2097152       max:16777215
    j:6     i:4194304       max:16777215
    j:7     i:8388608       max:16777215
    j:8     i:16777216      max:16777215
    
    Signed-off-by: Thomas Graf <tgraf@suug.ch>
    Acked-by: Daniel Borkmann <daniel@iogearbox.net>

Switching to iproute2 version == kernel version will then result in the same message I got above.

@eloycoto
Copy link
Member

Hey @borkmann

I thought that I did something wrong but looks like that the Ubuntu Zesty iproute package is not correct:

root@runtime:/home/vagrant# apt-cache madison linux-image-generic
linux-image-generic | 4.10.0.37.37 | http://archive.ubuntu.com/ubuntu zesty-updates/main amd64 Packages
linux-image-generic | 4.10.0.37.37 | http://security.ubuntu.com/ubuntu zesty-security/main amd64 Packages
linux-image-generic | 4.10.0.19.21 | http://archive.ubuntu.com/ubuntu zesty/main amd64 Packages
root@runtime:/home/vagrant# apt-cache madison iproute2
  iproute2 | 4.9.0-1ubuntu1 | http://archive.ubuntu.com/ubuntu zesty/main amd64 Packages
root@runtime:/home/vagrant# 

And here you have the links:
https://packages.ubuntu.com/zesty/iproute2
https://packages.ubuntu.com/zesty/linux-image-generic

Maybe we should have a warning or something on cilium start? Do you want me to report to Ubuntu?

Regards

@eloycoto
Copy link
Member

And looks like that the same version is running on the current Ubuntu stable release.
https://packages.ubuntu.com/artful/iproute2

@borkmann
Copy link
Member

@eloycoto If you have a chance to report that to Ubuntu upstream in case there's not a ticket open on that yet, that would be great.

It doesn't solve the issue on verifier complexity (working on a fix for cilium for that), but it definitely seems strange on Ubuntu side that they are not aligned. It also means that users will miss all other new features in iproute2 that went into corresponding kernels. Thanks!

@borkmann
Copy link
Member

Last commit that I found which worked in this combination (4.10 kernel + recent clang) was back in May (!) commit fe5555e ("cilium: add proxy6map support"), first one that breaks due to hitting complexity limit is commit 6c0c690 ("cilium: ct_state was not being returned in ipv6 ct lookup case"). CI with kernel/llvm/iproute2 test matrix (#824) would help identifying such issues upfront.

@tgraf tgraf added priority/low This is considered nice to have. and removed priority/high This is considered vital to an upcoming release. labels Nov 20, 2017
@borkmann
Copy link
Member

The issue also happens on -stable kernels (e.g. 4.9.y and 4.13.y ) with a different LLVM version than 3.8.1 which runs on our Jenkins currently. LLVM's code generation there seems to be different in a way that we're hitting upper complexity limit of 98k (I've seen 233k from bpf_lxc).

Therefore bumping prio to high again. Adding tail call hack into the fast-path to reduce complexity.

@borkmann borkmann added priority/high This is considered vital to an upcoming release. and removed priority/low This is considered nice to have. labels Nov 29, 2017
@borkmann borkmann changed the title Unable to enforce policy on running endpoints with debug=false and kernel==4.10 Unable to enforce policy on running endpoints with debug=false and kernel==4.10 + some stable Nov 29, 2017
@tgraf tgraf removed priority/high This is considered vital to an upcoming release. project/1.0-gap labels Nov 29, 2017
@tgraf
Copy link
Member

tgraf commented Nov 29, 2017

This is due to missing upstreamed backports in non LTS kernel versions. It is unclear how we could avoid this behaviour given that these are kernel side changes that are missing.

@borkmann
Copy link
Member

Already fixed via #2784, therefore this can be closed. Ubuntu side via https://bugs.launchpad.net/ubuntu/+source/iproute2/+bug/1735032 .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug This is a bug in the Cilium logic. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
Projects
None yet
Development

No branches or pull requests

5 participants