New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpf: disable -Waddress-of-packed-member in scripts as well #318
Conversation
Do the same as in the recent commit ca0038e ("bpf: disable -Waddress-of-packed-member on newer clang versions") also for various scripts where we have BPF compilation targets. Signed-off-by: Daniel Borkmann <daniel@cilium.io>
Note that for GCE kernel, we likely need ... https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a08dd0da5307ba01295c8383923e51e7997c3576 ... as new clang fetches spilled map_value_or_null register from stack (r0), then compares r0 against NULL, marks it to map_value, and later on fetches the same spilled map_value_or_null register from stack again into r2 and performs some work on r2. On GCE verifier seems to complain that r2 is still map_value_or_null. Explanation is that in kernel in mark_map_regs(), the reg id was not cached. So the buggy kernel would mark the register first (due to being traversed first), and then override the regs[regno].id, which will have the side effect that no marking will be performed on spilled regs (due to id mismatches). |
What does this mean for actual kernel versions what is the behaviour a user would observe? Rejection of produced bytecode? |
For the 4.8 kernel (the one running on gce):
This only happens if the clang version used is the |
The interesting piece of the log is this one:
I wrote a similar test case yesterday to reproduce and it passed on my -net kernel, realizing that the one on GCE was older, not having these fixes against clang code generation we once did already. Issue is that 4.8 kernel is EOL since some months, the current stable kernels are 4.9 and 4.10 maintained by Greg. I went back to the list I once did with commits for cilium, and saw that the original map value marking commit is not even part of 4.8 and 4.9. So we would need the original fix from 57a09bf plus all follow-ups:
I could ask Dave today to queue these to stable, and once they're in we could open a ticket either at GCE support or Canonical to get this into their 4.8 kernels? Given that these are part of stable kernel would make the argument simple hopefully. Meanwhile going back to clang The cilium commit from here can go in independent of it, though, so that for clang |
Do the same as in the recent commit ca0038e ("bpf: disable
-Waddress-of-packed-member on newer clang versions") also for
various scripts where we have BPF compilation targets.
Signed-off-by: Daniel Borkmann daniel@cilium.io
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a08dd0da5307ba01295c8383923e51e7997c3576
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6760bf2ddde8ad64f8205a651223a93de3a35494
... as new clang fetches spilled map_value_or_null register from stack (r0), then compares r0 against NULL, marks it to map_value, and later on fetches the same spilled map_value_or_null register from stack again into r2 and performs some work on r2. On GCE verifier seems to complain that r2 is still map_value_or_null. Explanation is that in kernel in mark_map_regs(), the reg id was not cached. So the buggy kernel would mark the register first (due to being traversed first), and then override the regs[regno].id, which will have the side effect that no marking will be performed on spilled regs (due to id mismatches).
What does this mean for actual kernel versions what is the behaviour a user would observe? Rejection of produced bytecode?
For the 4.8 kernel (the one running on gce):
This only happens if the clang version used is the
4.0.0
. Should we keep the Dockerfile version with3.7.1
or increment it to4.0.0
at the cost of bumping the kernel minimal required version?I wrote a similar test case yesterday to reproduce and it passed on my -net kernel, realizing that the one on GCE was older, not having these fixes against clang code generation we once did already.
Issue is that 4.8 kernel is EOL since some months, the current stable kernels are 4.9 and 4.10 maintained by Greg. I went back to the list I once did with commits for cilium, and saw that the original map value marking commit is not even part of 4.8 and 4.9.
So we would need the original fix from 57a09bf plus all follow-ups:
I could ask Dave today to queue these to stable, and once they're in we could open a ticket either at GCE support or Canonical to get this into their 4.8 kernels? Given that these are part of stable kernel would make the argument simple hopefully.
Meanwhile going back to clang
3.7.1
seems probably reasonable as a quick fix.The cilium commit from here can go in independent of it, though, so that for clang
4.0
and latest kernels we don't throw this useless (for eBPF context) warning anymore.