
[bootenv/rockchip] Force cgroup v1 usage #3830

Merged
1 commit merged into armbian:master on Jun 6, 2022

Conversation

@theblackhole (Contributor) commented May 27, 2022

Description

This commit forces cgroup v1 usage as a workaround for Docker / runc BPF issues until a proper fix is submitted.

Context

Docker 20.10+ cannot run on our Asus Tinkerboard; it throws BPF-related errors (like opencontainers/runc#2959, especially this comment). I tried many Armbian flavors (focal, jammy, bullseye with the current, edge, and legacy kernels), and even built an image with CONFIG_BPF_SYSCALL=y as suggested by this comment, but it didn't fix the issue.
The only thing that fixed Docker was adding extraargs=systemd.unified_cgroup_hierarchy=0 to /boot/armbianEnv.txt.

Until someone lands a proper fix that makes cgroup v2 work, I suggest downgrading to cgroup v1 with this extraargs entry.
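
For reference, a minimal sketch of applying the same workaround by hand on a running board (assuming /boot/armbianEnv.txt has no extraargs line yet; if one already exists, append the parameter to that line instead):

```bash
# Append the kernel argument that tells systemd to stay on the cgroup v1 hierarchy.
# Assumes no extraargs line exists yet; otherwise edit the existing line instead.
echo 'extraargs=systemd.unified_cgroup_hierarchy=0' | sudo tee -a /boot/armbianEnv.txt

# Reboot so the boot script picks up the new kernel command line.
sudo reboot
```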

How Has This Been Tested?

  • Tested on Asus Tinkerboard with the latest 5.15 stable kernel from the apt repository.
  • ❌ NOT tested with edge and legacy kernels
    (Sorry, I had to quickly put the machine into production and didn't have time to test other kernels.)

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
    => Does it need a documentation change?
  • My changes generate no new warnings
  • Any dependent changes have been merged and published in downstream modules

Force cgroup v1 usage, since some runc-based apps like Docker 20.10+ cannot run on rockchip with cgroup v2 enabled (it generates BPF-related issues; see opencontainers/runc#2959).

This is a workaround until a proper fix is submitted.

Tested on Asus Tinkerboard with current stable kernel.
@igorpecovnik (Member) left a comment


I think this is O.K. to merge.

@paolosabatino (Contributor)

I would guess there are no particular issues with using cgroups v1; I recently enabled the BPF and CGROUPS kernel config options for kernel 5.15 on rk322x and folks have not reported any side effects so far.

@igorpecovnik igorpecovnik merged commit 00cd224 into armbian:master Jun 6, 2022
@rpardini (Member)

Wasted a disproportionate amount of time due to this.
This is just plain wrong: rockchip.txt is used across families and branches.
Forcing the v1 hierarchy at the systemd level pushes things like Kubernetes, containerd, Cilium, etc. down to v1 "compatibility mode" too (they might "still work", although with little or no eBPF support).
v1 has been declared inherently insecure.
v2 has been available since kernel 3.16!
Kubernetes plans to deprecate v1 eventually.
This should (at best) be done only for the specific legacy case in question, or not at all and left to userpatches.
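
For anyone checking which hierarchy their boards actually ended up on, a quick sanity check (generic commands; nothing Armbian-specific is assumed):

```bash
# cgroup2fs means the unified v2 hierarchy; tmpfs means the legacy v1 layout.
stat -fc %T /sys/fs/cgroup/

# The kernel command line shows whether the workaround forced v1.
grep -o 'systemd.unified_cgroup_hierarchy=[01]' /proc/cmdline
```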

@paolosabatino (Contributor)

@rpardini hmmm, I'm very sorry this made you lose a lot of time. I'm not an expert in container and related technologies, and the errors pointed out by @theblackhole would take a lot of effort and time to replicate that I can't spend right now.
Maybe @theblackhole can provide an alternative kernel config to fully enable working cgroups v2?

@igorpecovnik (Member) commented Jul 12, 2022

OMfG :( I wasn't aware of this either.

[AR-1259](https://armbian.atlassian.net/browse/AR-1259)

@rpardini (Member)

@paolosabatino and @igorpecovnik thanks -- there's absolutely no need to be sorry.
This is just the way it is: it was done with the best of intentions, I also had no idea of the possible impacts before I detected them deep down the stack, and I would have approved this PR too.
What I find more interesting is that rockchip.txt is also used in rockchip64_common (!), so this spreads to way too many boards. There's more cgroups stuff in the bootscript itself, for both armhf and arm64, under docker_optimizations, so maybe that's a better place for this...

@theblackhole (Contributor, Author) commented Jul 12, 2022

Oh damn. Well, since it is also used on newer boards, just roll back this PR until a proper fix is found (...or not, since more and more apps are beginning to drop armhf support).

Fortunately, this workaround can be applied per board without any custom kernel build (by modifying /boot/armbianEnv.txt). I don't know if you have a "known issues" section or somewhere else where we can document this issue and its workaround.
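
If that helps, a hedged sketch of how a user could confirm the workaround took effect after a reboot (assumes Docker 20.10+, whose docker info output includes a Cgroup Version field; exact output may vary):

```bash
# Confirm the parameter reached the kernel command line.
grep -o 'systemd.unified_cgroup_hierarchy=0' /proc/cmdline

# Docker 20.10+ reports which cgroup version it ended up using.
docker info 2>/dev/null | grep -i 'cgroup version'
# expected once the workaround is active: "Cgroup Version: 1"
```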

> @rpardini Maybe @theblackhole can provide an alternative kernel config to fully enable working cgroups V2?

Sorry, the kernel config options I tried, which are supposed to fix the issue, didn't work. And since I opened this PR, I haven't had the opportunity (or time) to look into it further.
I wish I could help find a proper kernel config to fix the Tinkerboard, but it would require a lot of time because I'm not familiar with kernel configs and features.
Also, a Tinkerboard 1 + Docker + Armbian (instead of the official TinkerOS) stack is rare, I think, so unless I'm wrong it might not be worth spending time on it ;)

@igorpecovnik (Member)

We could perhaps solve this by adding a separate env file for 64-bit: https://github.com/armbian/build/blob/master/config/sources/families/include/rockchip64_common.inc#L6 ?
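
A hedged sketch of what that could look like; the variable name and the new file name below are assumptions about the build config conventions, not the actual diff:

```bash
# config/sources/families/include/rockchip64_common.inc (sketch, names assumed)
# Give the 64-bit rockchip families their own bootenv template, so the
# cgroup v1 workaround added to rockchip.txt only reaches the 32-bit boards.
BOOTENV_FILE='rockchip64.txt'
```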

@rpardini (Member)

✅ Thanks!
