knative HelloWorld Serving Code Example #784
OK, I have done some further testing: this works with a clean 18.04.3 LTS (bionic) install, however the issue exists on 19.10 (eoan).
I'm getting a similar error when attempting to exec -ti into a pod, something I did regularly during development under minikube (which I'm attempting to switch from). The pod has a status of "Running" and a single container.
@rbt can you please attach the tarball produced by microk8s.inspect?
1.16/stable: inspection-report-20191111_173728.tar.gz

After fiddling with the above, I realized some pods worked. I've narrowed it down to the setting below in the failing deployments. It's possible minikube was incomplete/buggy by allowing access and microk8s is working as expected, albeit with a terrible error message.
Same issue on a fresh microk8s installation.

$ snap list microk8s
Name      Version  Rev   Tracking  Publisher   Notes
microk8s  v1.16.3  1079  stable    canonical✓  classic

$ cat /etc/os-release
NAME="Ubuntu"
VERSION="19.10 (Eoan Ermine)"

# ------------------------ >8 ------------------------
readinessProbe:
exec:
command:
- /ko-app/queue
- -probe-period
- "0"
# ------------------------ >8 ------------------------

$ kubectl describe pod
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Started 70s kubelet, manny Started container queue-proxy
Warning Unhealthy 69s kubelet, manny Readiness probe errored: rpc error: code = Unknown desc = failed to exec in container: failed to start exec "66a30c11deb99fd1a8e6929c83e7d9e59512a506a8693766c7e3e873c6808c08": OCI runtime exec failed: exec failed: container_linux.go:345: starting container
process caused "apparmor failed to apply profile: write /proc/self/attr/exec: operation not permitted": unknown
Warning Unhealthy 68s kubelet, manny Readiness probe errored: rpc error: code = Unknown desc = failed to exec in container: failed to start exec "95a2a1c687a12900fa1f9bd0725971ff51418b99d9638ea38673f36240a46b86": OCI runtime exec failed: exec failed: container_linux.go:345: starting container
process caused "apparmor failed to apply profile: write /proc/self/attr/exec: operation not permitted": unknown
Warning Unhealthy 67s kubelet, manny Readiness probe errored: rpc error: code = Unknown desc = failed to exec in container: failed to start exec "8d7e2c2ae8a0e9161b70061a1c7bc2231c47dca2fe3e88a148a481e6379c6a20": OCI runtime exec failed: exec failed: container_linux.go:345: starting container
process caused "apparmor failed to apply profile: write /proc/self/attr/exec: operation not permitted": unknown
Warning Unhealthy 66s kubelet, manny Readiness probe errored: rpc error: code = Unknown desc = failed to exec in container: failed to start exec "9c4658f5c2352f41d335bddadb84341acc328ddfde5adc107160bb497f3fdc7e": OCI runtime exec failed: exec failed: container_linux.go:345: starting container
process caused "apparmor failed to apply profile: write /proc/self/attr/exec: operation not permitted": unknown
Warning Unhealthy 65s kubelet, manny Readiness probe errored: rpc error: code = Unknown desc = failed to exec in container: failed to start exec "a62f586b56f11a04828bb9b97652ce4da36a750cdb07a02d5b1d720288212018": OCI runtime exec failed: exec failed: container_linux.go:345: starting container
process caused "apparmor failed to apply profile: write /proc/self/attr/exec: operation not permitted": unknown
Warning Unhealthy 64s kubelet, manny Readiness probe errored: rpc error: code = Unknown desc = failed to exec in container: failed to start exec "de440bd965a761b7faf1dc17d668be728015cc885ef63c92f4e6bc9b4db6f925": OCI runtime exec failed: exec failed: container_linux.go:345: starting container
process caused "apparmor failed to apply profile: write /proc/self/attr/exec: operation not permitted": unknown
Warning Unhealthy 63s kubelet, manny Readiness probe errored: rpc error: code = Unknown desc = failed to exec in container: failed to start exec "ef93e3317c6cc9c28a133f9d6e3c89709314e740ecdff62bde8acf903f765bde": OCI runtime exec failed: exec failed: container_linux.go:345: starting container
process caused "apparmor failed to apply profile: write /proc/self/attr/exec: operation not permitted": unknown
Warning Unhealthy 62s kubelet, manny Readiness probe errored: rpc error: code = Unknown desc = failed to exec in container: failed to start exec "ed74b405332d2909a9acd5d6f6bd6e292959d23dd6bf3cc3be398e74775e416b": OCI runtime exec failed: exec failed: container_linux.go:345: starting container
process caused "apparmor failed to apply profile: write /proc/self/attr/exec: operation not permitted": unknown
Warning Unhealthy 61s kubelet, manny Readiness probe errored: rpc error: code = Unknown desc = failed to exec in container: failed to start exec "7f86e575600d371b92b14696c0e38b27164ef2730ae54d848e1ef9acd95b0f42": OCI runtime exec failed: exec failed: container_linux.go:345: starting container
process caused "apparmor failed to apply profile: write /proc/self/attr/exec: operation not permitted": unknown

@ktsakalozos here is the report you asked for.
I am also having this issue on a fresh microk8s install.
I'm facing exactly the same issue regarding AppArmor when starting the hello demo from Knative (https://knative.dev/docs/serving/getting-started-knative-app/):

Readiness probe errored: rpc error: code = Unknown desc = failed to exec in container: failed to start exec "7612ded2f1b669ec2610e2f1fddd53fe39cf26c8f05c35c7626676fea1daf021": OCI runtime exec failed: exec failed: container_linux.go:345: starting container process caused "apparmor failed to apply profile: write /proc/self/attr/exec: operation not permitted": unknown

Are there already solutions for this issue, or is a downgrade to 18.04.3 LTS (bionic) the only option at the moment?
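For context, the hello demo from that guide boils down to a single Knative Service plus an injected queue-proxy sidecar, which is the container whose readiness probe fails here. The following is only a rough sketch, assuming the v1 API and the image/values shown on the getting-started page (older Knative releases used v1alpha1); it is not the exact manifest used by the reporter.

$ kubectl apply -f - <<'EOF'
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: helloworld-go
  namespace: default
spec:
  template:
    spec:
      containers:
        - image: gcr.io/knative-samples/helloworld-go
          env:
            - name: TARGET
              value: "Go Sample v1"
EOF

# The probe error reported above shows up in the events of the resulting pod:
$ kubectl get pods
$ kubectl describe pod <helloworld-go-pod-name>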
I am seeing the following in the dmesg logs (inspection-report attached above in the
@ktsakalozos both the
A new finding: after disabling all profiles, aa-status shows the following:

$ sudo aa-status
apparmor module is loaded.
1 profiles are loaded.
1 profiles are in enforce mode.
cri-containerd.apparmor.d
0 profiles are in complain mode.
2 processes have profiles defined.
2 processes are in enforce mode.
/go/bin/helloworld (14862) cri-containerd.apparmor.d
/ko-app/queue (14917) cri-containerd.apparmor.d
0 processes are in complain mode.
0 processes are unconfined but have a profile defined.

Now comes the interesting part: if I reload all profiles, aa-status shows the following:

$ sudo aa-status
apparmor module is loaded.
51 profiles are loaded.
13 profiles are in enforce mode.
/sbin/dhclient
/usr/bin/man
/usr/lib/NetworkManager/nm-dhcp-client.action
/usr/lib/NetworkManager/nm-dhcp-helper
/usr/lib/connman/scripts/dhclient-script
/usr/sbin/tcpdump
cri-containerd.apparmor.d
lsb_release
man_filter
man_groff
nvidia_modprobe
nvidia_modprobe//kmod
snap.core.hook.configure
38 profiles are in complain mode.
/snap/core/8268/usr/lib/snapd/snap-confine
/snap/core/8268/usr/lib/snapd/snap-confine//mount-namespace-capture-helper
/usr/lib/snapd/snap-confine
/usr/lib/snapd/snap-confine//mount-namespace-capture-helper
snap-update-ns.core
snap-update-ns.microk8s
snap.microk8s.add-node
snap.microk8s.cilium
snap.microk8s.config
snap.microk8s.ctr
snap.microk8s.daemon-apiserver
snap.microk8s.daemon-apiserver-kicker
snap.microk8s.daemon-cluster-agent
snap.microk8s.daemon-containerd
snap.microk8s.daemon-controller-manager
snap.microk8s.daemon-etcd
snap.microk8s.daemon-flanneld
snap.microk8s.daemon-kubelet
snap.microk8s.daemon-proxy
snap.microk8s.daemon-scheduler
snap.microk8s.disable
snap.microk8s.enable
snap.microk8s.helm
snap.microk8s.hook.configure
snap.microk8s.hook.install
snap.microk8s.hook.remove
snap.microk8s.inspect
snap.microk8s.istioctl
snap.microk8s.join
snap.microk8s.juju
snap.microk8s.kubectl
snap.microk8s.leave
snap.microk8s.linkerd
snap.microk8s.remove-node
snap.microk8s.reset
snap.microk8s.start
snap.microk8s.status
snap.microk8s.stop
2 processes have profiles defined.
2 processes are in enforce mode.
/go/bin/helloworld (2654) cri-containerd.apparmor.d
/ko-app/queue (2704) cri-containerd.apparmor.d
0 processes are in complain mode.
0 processes are unconfined but have a profile defined.

If I do the same procedure after executing the steps above, I see the same thing. I believe the order in which AppArmor profiles are loaded matters here somehow.
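The exact commands used above to disable and reload the profiles are not shown; a plausible way to do it on Ubuntu (an assumption on my part, not necessarily what was run here) would be:

$ sudo aa-teardown                  # unload every AppArmor profile from the kernel
$ sudo aa-status                    # cri-containerd.apparmor.d may reappear, loaded by containerd itself
$ sudo systemctl reload apparmor    # load the system profiles from /etc/apparmor.d again
$ sudo aa-status                    # compare with the two listings above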
@ktsakalozos @joedborg @jdstrand The culprit is the snap.microk8s.daemon-containerd AppArmor profile: unloading it with apparmor_parser -R makes the error go away. The same profile can then be re-loaded (same command with -a).
The comment about tearing down the apparmor policy from underneath microk8s could explain things. Put another way, when microk8s is started, it is supposed to be started under an apparmor profile (lenient with classic snap). microk8s starts up and eventually loads cri-containerd.apparmor.d, and containerd is allowed to start containers (via runc) under this profile. If some of these pieces are not in place (eg, something isn't being set up right on start), then weird things can happen. Eg, if containerd isn't setting up runc right to use it, then you might see the 'no new privs' issue. The 'failed to apply profile' issue might be because cri-containerd.apparmor.d isn't loaded yet. That said, this comment from @rbt is important:
https://kubernetes.io/docs/concepts/policy/pod-security-policy/#privilege-escalation tells us: "These options control the allowPrivilegeEscalation container option. This bool directly controls whether the no_new_privs flag gets set on the container process." Once nnp is invoked it can't be revoked, and it can complicate profile transitions to the point of blocking them. OTOH, I don't know the order of operations, but it sounds like nnp is being applied before the profile transition to cri-containerd.apparmor.d, and the kernel is applying its heuristics to determine whether that is ok and deciding it is not.
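To make the setting in question concrete, here is a hypothetical pod (name and image are placeholders, not taken from this issue) with the field that causes no_new_privs to be set on the container process:

$ kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: nnp-demo
spec:
  containers:
    - name: app
      image: busybox
      command: ["sleep", "3600"]
      securityContext:
        # false => the runtime sets no_new_privs, which is what can then
        # block the transition into the cri-containerd.apparmor.d profile
        allowPrivilegeEscalation: false
EOF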
Sorry for not seeing this earlier, but I think @jdstrand's analysis is spot-on: nnp is indeed applied by runC before the apparmor profile transition to cri-containerd. I would recommend not setting that pod security setting.
@jdstrand @anonymouse64 you both explained the case for the "no new privs" error, which is great and I thank you for the lengthy explanations.
Is it possible there are multiple cri-containerd.apparmor.d profiles? The microk8s snap seems to copy over the cri-containerd.apparmor.d profile and load it when containerd starts. When you say it works above, perhaps that is because when containerd is already running and is asked to start a new container with an AppArmor profile, it generates a default one, which is different from the one that the snap is copying in. All that being said, it would be great to see a diff between the profile that is in the snap and the one that containerd generates for itself.

@antoineco also note that the exec probe issue you are having could be due to the no new privs issue, because you are seeing a failure to transition to the profile, which can happen if containerd drops privileges because of that config and the profile it then tries to transition to has more privileges. Are you certain that your containers don't have that setting?
@anonymouse64 please ignore the first case where I unload everything, this was only the first step of my troubleshooting. The profile that I unload (and reload) is the one matching the systemd service, not the CRI profile loaded by containerd (the one that gets copied, as you mentioned). At that point containerd is using the profile included with the snap, not its own defaults.
The diff between the two profiles:

33c33
< deny /sys/firmware/efi/efivars/** rwklx,
---
> deny /sys/firmware/** rwklx,
36,37d35
<
< # suppress ptrace denials when using 'docker ps' or using 'ps' inside a container
39,41d36
<
< signal (receive) peer=snap.microk8s.daemon-kubelet,
< signal (receive) peer=snap.microk8s.daemon-containerd,

Besides, I'm still puzzled about the fact that simply reloading the snap.microk8s.daemon-containerd profile works:

$ sudo apparmor_parser -R /var/lib/snapd/apparmor/profiles/snap.microk8s.daemon-containerd
$ sudo apparmor_parser -a /var/lib/snapd/apparmor/profiles/snap.microk8s.daemon-containerd

# create the Knative Service ...

$ kubectl get pod
$ kubectl get pod
NAME READY STATUS RESTARTS AGE
helloworld-go-w88zp-deployment-5977bc5fff-7547v 2/2 Running 0 9s

Does it mean the profile is not re-applied until the service is restarted?
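One way to test that hypothesis (a sketch, using the snap service names visible in the aa-status listing above):

$ sudo systemctl restart snap.microk8s.daemon-containerd.service   # containerd re-attaches to the snap profile
$ systemctl is-active snap.microk8s.daemon-containerd.service
# redeploy the Knative Service and watch whether the probe error returns
$ kubectl get pods -w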
As an aside, the way I helped @joedborg handle this in the kubernetes-worker snap temporarily is to patch runC to never drop privileges. Perhaps microk8s needs to do that as well in order to work with knative.
@anonymouse64 where can I find the runC patches you suggested? By not allowing runC to drop privileges, are there any restrictions/limitations to the workloads we can serve?
Indeed, this immediately solved the problem for me as well.

Please note that

I was terribly sloppy in my quoting of Antoine above (which I've fixed). I did execute
I'm facing the same problem.

$ sudo apparmor_parser -r /var/lib/snapd/apparmor/profiles/snap.microk8s.daemon-containerd

does not work for me, but like mentioned before, this worked:

$ sudo apparmor_parser -R /var/lib/snapd/apparmor/profiles/snap.microk8s.daemon-containerd
@ktsakalozos sorry I forgot to respond to your ping, were you able to sync with @joedborg about the runC patches?
Yes, thank you.

The same issue is reproducible on Ubuntu 20.04 (focal) and prevents us from upgrading to the latest LTS version of the OS.

Apparently something changed between Linux kernel 4.15.0 (works) and 4.18.0 (doesn't work).

I am experiencing the same issue after upgrading to 20.04.
This 'works' for you because when you remove the profile (-R), the containerd process is put under the 'unconfined' profile and adding (-a) the profile back does not reattach the existing process to the added profile (ie, it is still unconfined). Whenever you stop/start containerd, I would expect the issue to come back up.
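A quick way to verify that explanation (a sketch, assuming microk8s' containerd is the only containerd process on the host):

$ pidof containerd
# the AppArmor label of the running process: a profile name while confined,
# "unconfined" after apparmor_parser -R, and still "unconfined" after -a,
# since re-adding the profile does not reattach an already-running process
$ cat /proc/$(pidof containerd)/attr/current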
When people see this issue, are there any security violations in the logs? Eg: 'journalctl --since yesterday | grep audit'.
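Slightly expanded, the kind of filtering that surfaces AppArmor denials specifically (standard Ubuntu logging commands, not output captured from this issue):

$ sudo journalctl -k --since yesterday | grep 'apparmor="DENIED"'
$ sudo dmesg | grep -i apparmor | grep -i denied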
@jdstrand unfortunately not: #784 (comment)
Ok, I tried to reproduce this with the instructions in the initial report. I see many denials related to no new privs when enabling the various services, and I can confirm that aa-status for the containers that are started is wrong when containerd is running confined and correct when it isn't. While I didn't see the same errors others did when creating the helloworld-go pod (perhaps I didn't create it properly?), I can say for sure that unless containerd is handling nnp properly, the kernel nnp feature will sometimes prevent containerd from transitioning a container into the cri-containerd.apparmor.d profile (regardless of whether apparmor claims it is ALLOWed). The fact that it works when removing the snap.microk8s.daemon-containerd profile very strongly supports this.

I believe this issue will be resolved when microk8s (classic) is updated to include https://github.com/ubuntu/microk8s/blob/feature/jdb/strict/patches/runc-patches/snap-runc-no-prctl.patch. @ktsakalozos and I discussed this last week, and I know that it is planned to incorporate this patch, but I don't know about the timelines.
Hi all, thank you for your time, effort and patience. Could anyone provide feedback on the fix available through:
Many thanks.
@ktsakalozos, |
@giner any chance you gave it a run over the weekend? Thank you.

@ktsakalozos, I've done some tests. The original issue is gone; however, some of my workloads are failing when running on the mentioned version of microk8s, and I haven't been able to find relevant logs or other helpful details so far.

@giner can you give me an example workload I could look at?
I'm trying to run this https://github.com/cloudfoundry-community/eirini-on-microk8s
The app specific error:
k8s events (errors and warnings only):
Thank you @giner. Any chance you could attach the inspection tarball?
@giner it is not so easy for me to verify the issue you report. A search on the error pointed to a fix in containerd (containerd/containerd#2392). Is it possible to narrow down the steps needed to reproduce what you see?
Install the latest VirtualBox and Vagrant, then:

Note that the above requires at least 16G of RAM and a good internet connection. Also, the Vagrantfile is configured to use the bionic64 base image; it can be changed to focal64, the result will be the same.
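The individual commands are omitted above; a plausible sequence, assuming the repository's Vagrantfile lives at its root (my assumption, not steps confirmed by the reporter):

$ git clone https://github.com/cloudfoundry-community/eirini-on-microk8s
$ cd eirini-on-microk8s
$ vagrant up        # provisions the VirtualBox VM from the bionic64 (or focal64) base box
$ vagrant ssh       # inspect the cluster from inside the VM once provisioning finishes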
Hi @giner, everyone, I just pushed a new snap under
Thank you.

@ktsakalozos I can confirm it works for me now too.
It seems that revision 1710 doesn't have the fix yet and 1722 has it. Is there a generic way to know which revision has a fix without installing it?
Not without downloading and extracting the snap. You should expect the fix to land on the stable channel with v1.19.3.
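For the record, a specific revision can be pulled and unpacked without installing it (a sketch using standard snap tooling; where exactly the runc fix lives inside the snap is not something I have verified):

$ snap download microk8s --revision=1722    # writes microk8s_1722.snap (plus .assert) to the current directory
$ unsquashfs -d microk8s-1722 microk8s_1722.snap
$ ls microk8s-1722/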
Using the command

Should I open a new issue to track this specific error?
Deploy a clean microk8s snap deployment:
snap install microk8s --classic
Enable DNS, Istio and Knative:
sudo microk8s.enable dns istio knative
Deploy the HelloWorld Go service example from:

The pods will be created and will fail due to the error below:
Readiness probe errored: rpc error: code = Unknown desc = failed to exec in container: failed to start exec "fc02dba843b84f907e3054501f078791474de71dce1d68e37734af3ef30fcf22": OCI runtime exec failed: exec failed: container_linux.go:345: starting container process caused "apparmor failed to apply profile: write /proc/self/attr/exec: operation not permitted": unknown
The result from kubectl get ksvc is "RevisionMissing".

I'm digging further, but has anyone else experienced this?
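A few commands that can help narrow down the "RevisionMissing" state (a sketch assuming the helloworld-go service name from the example; the label selector is Knative's standard serving.knative.dev/service label):

$ kubectl get ksvc helloworld-go
$ kubectl get revisions
$ kubectl describe revision -l serving.knative.dev/service=helloworld-go
# the readiness-probe failure itself shows up in the pod events
$ kubectl get pods
$ kubectl describe pod -l serving.knative.dev/service=helloworld-go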