k3s requires /proc/config.gz but unavailable in Debian11 kernel 5.10 #3500

Closed
jinserk opened this issue Jun 23, 2021 · 12 comments

@jinserk

jinserk commented Jun 23, 2021

Environmental Info:
K3s Version:
v1.20.2+k3s1

Node(s) CPU architecture, OS, and Version:
x86_64, Debian11 Kernel 5.10.0.7

Cluster Configuration:
Substructure of Flyte

Describe the bug:
k3s doesn't run properly

$ docker exec -it flyte-sandbox k3s check-config

Verifying binaries in /var/lib/rancher/k3s/data/8f4b194129852507eab4a55117fc942e0688ec9a70ffdaa5911ccc6652220f76/bin:
- sha256sum: good
- links: good

System:
- /sbin iptables v1.8.6 (legacy): ok
- swap: should be disabled
- routes: ok

Limits:
- /proc/sys/kernel/keys/root_maxkeys: 1000000

Device "configs" does not exist.
modprobe: can't change directory to '/lib/modules': No such file or directory
error: cannot find kernel config 
  try running this script again, specifying the kernel config:
  set CONFIG=/path/to/kernel/.config or add argument /path/to/kernel/.config

This appears to stem from /proc/config.gz being unavailable, as described here:

Does Debian have /proc/config.gz?
/proc/config.gz isn't available in Debian, because the config is provided in /boot/config-*, no need for the in-memory variant (Kernel configuration CONFIG_IKCONFIG and CONFIG_IKCONFIG_PROC). See 541489

The following bash command lists the configuration file that matches the running kernel (on a standard kernel):

ls /boot/config-$(uname -r)
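
For readers hitting the same check-config error, here is a minimal sketch of locating a kernel config to hand to the script, based on the two locations mentioned above (/proc/config.gz and /boot/config-*). The function name find_kernel_config and its two optional arguments are hypothetical, made up for this illustration.

```shell
#!/bin/sh
# Sketch only: print the first readable kernel config among the usual locations.
# find_kernel_config, its ROOT prefix and RELEASE argument are illustrative names,
# not part of k3s or Debian.
find_kernel_config() {
    root="${1:-}"                  # optional path prefix, e.g. a directory where the host files are mounted
    release="${2:-$(uname -r)}"    # kernel release, defaults to the running kernel
    for candidate in \
        "$root/proc/config.gz" \
        "$root/boot/config-$release"
    do
        if [ -r "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1                       # nothing found in either location
}
```

The printed path could then be supplied the way the error message suggests, by setting CONFIG or passing the path as an argument. Inside a container the host's /boot is usually not visible, so the file would first have to be mounted or copied in; that step is an assumption and is not confirmed anywhere in this thread.
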
@brandond
Member

brandond commented Jun 23, 2021

check-config is just a standalone utility that can be run to verify system readiness; it doesn't need to succeed in order for k3s to run properly. Are you seeing any issues other than the check-config failure?

@jinserk
Author

jinserk commented Jun 23, 2021

@brandond Yes. Actually the k3s in the Flyte-sandbox container doesn't work. Here is the discussion about the issue with Flyte team: flyteorg/flyte#1141

@brandond
Member

I'm not familiar with flyte, but that thread appears to be about using minikube, not k3s. Do you have any information on how k3s fails in this configuration?

@jinserk
Author

jinserk commented Jun 23, 2021

k3s.log

Could you take a look at the attached /var/log/k3s.log? I extracted it from the Flyte container using: docker exec -it flyte-sandbox cat /var/log/k3s.log > k3s.log

@brandond
Member

brandond commented Jun 23, 2021

I'm not really sure what the full stack is here - it sounds like you're running flyte in docker, and flyte in turn is running k3s? Either way, it looks like the terminal failure is due to missing cgroups:

time="2021-06-23T17:30:29.519697067Z" level=warning msg="Disabling CPU quotas due to missing cpu.cfs_period_us"
time="2021-06-23T17:30:29.519751137Z" level=warning msg="Disabling pod PIDs limit feature due to missing cgroup pids support"
F0623 17:30:29.522130     184 server.go:181] cannot set feature gate SupportPodPidsLimit to false, feature is locked to true

You might try adding systemd.unified_cgroup_hierarchy=1 to your kernel cmdline and restarting, to see if that works.
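
To narrow down the "missing cgroup" warnings in the log above, one could check which cgroup filesystem is mounted and whether the cpu and pids controllers are enabled. The following is a rough sketch, not a k3s tool: the function names are hypothetical, and it assumes a stat that supports -f -c (GNU coreutils does) plus the standard four-column /proc/cgroups layout.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative, not part of k3s.

# Print the filesystem type mounted at /sys/fs/cgroup.
# "cgroup2fs" indicates the unified (v2) hierarchy; "tmpfs" is typical for v1.
cgroup_fs_type() {
    stat -fc %T "${1:-/sys/fs/cgroup}"
}

# controller_enabled NAME [FILE]
# Succeeds if NAME appears in FILE (default /proc/cgroups) with enabled == 1.
# Columns in /proc/cgroups: subsys_name hierarchy num_cgroups enabled
controller_enabled() {
    awk -v name="$1" '
        $1 == name && $4 == 1 { found = 1 }
        END { exit !found }
    ' "${2:-/proc/cgroups}"
}
```

For example, running `controller_enabled pids || echo "pids controller not enabled"` inside the container would show whether the kernel reports the controller at all. Note that /proc/cgroups describes what the kernel has enabled, which is not necessarily what has been made available to a given container.
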

@jinserk
Author

jinserk commented Jun 23, 2021

@brandond Thanks but it doesn't resolve my issue. The situation looks the same.
k3s.log

@brandond
Member

Yes, same error. I would work towards figuring out why the required cgroups are unavailable. It may have something to do with how flyte is setting up the environment, as k3s definitely works within Docker - either via k3d, or just running the rancher/docker image.

@yindia

yindia commented Jun 23, 2021

@brandond In Flyte we are not doing anything special: we just run k3s and then deploy the manifest.
Flyte sandbox dockerfile : https://github.com/flyteorg/flyte/tree/master/docker/sandbox

@brandond
Member

We have our own Docker image at rancher/k3s. Have you considered using FROM rancher/k3s instead of downloading the release artifact?

Are you creating the K3s containers as privileged?

Does this work on other distributions? I am assuming so, since this issue appears to be specific to Debian 11.
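
One way to answer the privileged question for a running container is to read the Privileged field from docker inspect. This is a small sketch; the helper name container_is_privileged is hypothetical, and flyte-sandbox is simply the container name used earlier in this thread.

```shell
#!/bin/sh
# Sketch only: container_is_privileged is an illustrative helper, not a Docker command.
# Succeeds if the named container was started with --privileged.
container_is_privileged() {
    [ "$(docker inspect --format '{{.HostConfig.Privileged}}' "$1")" = "true" ]
}
```

Usage would look like `container_is_privileged flyte-sandbox && echo privileged || echo "not privileged"`.
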

@jinserk
Author

jinserk commented Jun 23, 2021

It works on Ubuntu 18.04 at least. I don't know about 20.04.
I will check rancher/k3s instead. Thank you!

@brandond
Member

I am guessing that this has something to do with unified (v2) vs v1 cgroups, but I'm not sure how to propagate that all the way through Docker on Debian. systemd.unified_cgroup_hierarchy=0 or 1 is usually how I toggle it back and forth, but they may have changed it.
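
To confirm whether the toggle actually reached the running kernel after a reboot, the kernel command line can be inspected. The sketch below uses a hypothetical helper name, unified_cgroup_setting, and only reports what is on the command line; it does not verify how systemd interprets it.

```shell
#!/bin/sh
# Sketch only: unified_cgroup_setting is an illustrative helper.
# Prints the value of systemd.unified_cgroup_hierarchy from the kernel command line
# (default /proc/cmdline), or nothing if the parameter is absent.
# If the parameter is repeated, the last occurrence is printed.
unified_cgroup_setting() {
    tr ' ' '\n' < "${1:-/proc/cmdline}" |
        sed -n 's/^systemd\.unified_cgroup_hierarchy=//p' |
        tail -n 1
}
```

An empty result means the parameter was not passed, in which case the distribution's default hierarchy applies.
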

@stale

stale bot commented Dec 21, 2021

This repository uses a bot to automatically label issues which have not had any activity (commit/comment/label) for 180 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the bot can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the bot will automatically close the issue in 14 days. Thank you for your contributions.

@stale stale bot added the status/stale label Dec 21, 2021
@stale stale bot closed this as completed Jan 4, 2022