crio fails to start on CentOS 7 fresh installation #3631

Closed
geragio opened this issue Apr 22, 2020 · 16 comments
@geragio

geragio commented Apr 22, 2020

Description

Hi,

I think there is a problem with installing cri-o on a fresh CentOS 7 installation.
I'm following the instructions in README.md, but the crio service doesn't start.

The problem seems to be in crio-wipe.service (a dependency of crio.service): it can't find the file /var/run/crio/version when it starts.

#3560
@lsm5

Steps to reproduce the issue:

  1. curl -L -o /etc/yum.repos.d/devel:kubic:libcontainers:stable.repo https://download.opensuse.org/repositories/devel:kubic:libcontainers:stable/CentOS_7/devel:kubic:libcontainers:stable.repo
  2. curl -L -o /etc/yum.repos.d/devel:kubic:libcontainers:stable:cri-o:1.17.repo https://download.opensuse.org/repositories/devel:kubic:libcontainers:stable:cri-o:1.17/CentOS_7/devel:kubic:libcontainers:stable:cri-o:1.17.repo
  3. yum install cri-o
  4. systemctl daemon-reload
  5. systemctl start crio

Describe the results you received:
A dependency job for crio.service failed. See 'journalctl -xe' for details.

Describe the results you expected:
crio.service started

Additional information you deem important (e.g. issue happens only occasionally):

journalctl -xe output:

Unit crio-wipe.service has begun starting up.
apr 22 19:17:30 localhost.localdomain crio[1409]: version file /var/run/crio/version not found: open /var/run/crio/version: no such file or directory. Triggering wipetime="2020-0
apr 22 19:17:30 localhost.localdomain crio[1409]: time="2020-04-22 19:17:30.105869303+02:00" level=fatal msg="failed to mount overlay for metacopy check: invalid argument"
apr 22 19:17:30 localhost.localdomain kernel: overlayfs: unrecognized mount option "metacopy=on" or missing value
apr 22 19:17:30 localhost.localdomain systemd[1]: crio-wipe.service: main process exited, code=exited, status=1/FAILURE
apr 22 19:17:30 localhost.localdomain systemd[1]: Failed to start CRI-O Auto Update Script.
Subject: Unit crio-wipe.service has failed
Defined-By: systemd
Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel

Unit crio-wipe.service has failed.

The result is failed.
apr 22 19:17:30 localhost.localdomain systemd[1]: Dependency failed for Container Runtime Interface for OCI (CRI-O).
Subject: Unit crio.service has failed
Defined-By: systemd
Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel

Unit crio.service has failed.

The result is dependency.
apr 22 19:17:30 localhost.localdomain systemd[1]: Job crio.service/start failed with result 'dependency'.
apr 22 19:17:30 localhost.localdomain systemd[1]: Unit crio-wipe.service entered failed state.
apr 22 19:17:30 localhost.localdomain systemd[1]: crio-wipe.service failed.
apr 22 19:17:30 localhost.localdomain polkitd[719]: Unregistered Authentication Agent for unix-process:1403:86605 (system bus name :1.23, object path /org/freedesktop/PolicyKit1/

Output of crio --version:

 crio version 1.17.2

Additional environment details (AWS, VirtualBox, physical, etc.):
CentOS 7.7 1908 Minimal Installation

@lsm5
Member

lsm5 commented Apr 22, 2020

@haircommander do you know what needs to be done about:
apr 22 19:17:30 localhost.localdomain kernel: overlayfs: unrecognized mount option "metacopy=on" or missing value

@lsm5
Member

lsm5 commented Apr 22, 2020

Not sure about /var/run/crio/version either. @dougsland PTAL ^

@haircommander
Member

haircommander commented Apr 22, 2020

@geragio can you try to change the line
mountopt = "nodev,metacopy=on"

to
mountopt = "nodev"
in /etc/containers/storage.conf
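
For anyone who wants to script the edit suggested above, here is a minimal sketch. It assumes GNU sed (as shipped on CentOS 7) and that the option appears literally as metacopy=on on an uncommented mountopt line. The function name strip_metacopy is hypothetical and not part of CRI-O or containers-common.

```bash
#!/usr/bin/env bash
# Hypothetical helper: remove the metacopy=on entry from the mountopt line
# of a containers storage.conf file, leaving the other options in place.
strip_metacopy() {
    local conf="$1"

    # Keep a backup next to the original before editing in place.
    cp -- "$conf" "$conf.bak"

    # Only touch uncommented mountopt lines. The first substitution handles
    # "nodev,metacopy=on"; the second handles metacopy=on listed first or alone.
    sed -i -E '/^[[:space:]]*mountopt[[:space:]]*=/ {
        s/,metacopy=on//
        s/metacopy=on,?//
    }' "$conf"
}
```

Usage would look like `strip_metacopy /etc/containers/storage.conf` as root, followed by `systemctl start crio`. The original file stays available as storage.conf.bak if the edit needs to be reverted.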

@haircommander
Member

@rhatdan or @nalind wdyt

@haircommander
Member

Note: crio itself would hit this error even if crio-wipe weren't called.

@geragio
Author

geragio commented Apr 22, 2020

> @geragio can you try to change the line
> mountopt = "nodev,metacopy=on"
>
> to
> mountopt = "nodev"
> in /etc/containers/storage.conf

Hi @haircommander

I changed mountopt as suggested and now crio.service starts without any error:

● crio.service - Container Runtime Interface for OCI (CRI-O)
   Loaded: loaded (/usr/lib/systemd/system/crio.service; disabled; vendor preset: disabled)
   Active: active (running) since mer 2020-04-22 20:44:10 CEST; 1min 7s ago
     Docs: https://github.com/cri-o/cri-o
 Main PID: 1380 (crio)
   CGroup: /system.slice/crio.service
           └─1380 /usr/bin/crio

apr 22 20:44:10 localhost.localdomain systemd[1]: Starting Container Runtime Interface for OCI (CRI-O)...
apr 22 20:44:10 localhost.localdomain crio[1380]: time="2020-04-22 20:44:10.089552281+02:00" level=info msg="using conmon executable "/usr/libexec/crio/conmon""
apr 22 20:44:10 localhost.localdomain crio[1380]: time="2020-04-22 20:44:10.105019011+02:00" level=info msg="Found CNI network crio-bridge (type=bridge) at /etc/cni/net.d/100-crio-bridge.conf"
apr 22 20:44:10 localhost.localdomain crio[1380]: time="2020-04-22 20:44:10.118034405+02:00" level=info msg="Found CNI network 200-loopback.conf (type=loopback) at /etc/cni/net.d/200-loopback.conf"
apr 22 20:44:10 localhost.localdomain crio[1380]: time="2020-04-22 20:44:10.118065705+02:00" level=info msg="Update default CNI network name to crio-bridge"
apr 22 20:44:10 localhost.localdomain crio[1380]: time="2020-04-22 20:44:10.124103499+02:00" level=info msg="no seccomp profile specified, using the internal default"
apr 22 20:44:10 localhost.localdomain systemd[1]: Started Container Runtime Interface for OCI (CRI-O).

@haircommander
Member

@dougsland the kernel on CentOS 7 seems to be too old for the metacopy=on option. Can you remove it?
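
One way to check whether a running kernel knows about metacopy is to look for the overlay module parameter in sysfs. This is a rough sketch, assuming the overlay module is already loaded (for example via `modprobe overlay`) and that kernels with metacopy support expose it at /sys/module/overlay/parameters/metacopy. The function name overlay_supports_metacopy is made up for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed if the overlay module exposes a metacopy
# parameter. The path can be overridden, mainly to make testing easier.
overlay_supports_metacopy() {
    local param="${1:-/sys/module/overlay/parameters/metacopy}"

    # The parameter file only exists when the loaded overlay module was
    # built with metacopy support; a missing file means "not supported"
    # (or that the module is not loaded yet).
    [ -e "$param" ]
}

# Example use:
#   if overlay_supports_metacopy; then echo "metacopy available"; fi
```

If the check fails on a given host, dropping metacopy=on from mountopt is the safe choice there.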

@geragio
Author

geragio commented Apr 22, 2020

In case it helps, the kernel version I used for the test is 3.10.0-1062.18.1.el7.x86_64.
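
As a rough heuristic, a kernel release string like the one above can be compared against a minimum version. The sketch below assumes that upstream Linux 4.19 is the first release with overlayfs metacopy support, which is my understanding rather than something stated in this thread. Vendor kernels may backport the feature, so a version comparison can give false negatives. The function name kernel_at_least is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed if a kernel release string such as
# "3.10.0-1062.18.1.el7.x86_64" is at least MAJOR.MINOR.
kernel_at_least() {
    local release="$1" want_major="$2" want_minor="$3"
    local major minor _

    # Split on dots: "3.10.0-1062..." gives major=3, minor=10.
    IFS=. read -r major minor _ <<< "$release"

    # Drop any non-numeric suffix on the minor part (e.g. "4-generic" -> "4").
    minor="${minor%%[!0-9]*}"

    (( major > want_major || (major == want_major && minor >= want_minor) ))
}

# Example use against the running kernel (4.19 is an assumed threshold):
#   kernel_at_least "$(uname -r)" 4 19 || echo "kernel likely too old for metacopy=on"
```

For the kernel reported here the check fails, which is consistent with the "unrecognized mount option" message in the journal.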

@haircommander
Member

@geragio what's the output of
rpm -q --whatprovides /etc/containers/storage.conf

@dougsland
Contributor

I can't see metacopy=on in the default install of containers-common. Have you added metacopy=on by hand? @geragio

@geragio
Author

geragio commented Apr 23, 2020

> @geragio what's the output of
> rpm -q --whatprovides /etc/containers/storage.conf

The output is: containers-common-0.2.0-2.1.el7.x86_64

@geragio
Author

geragio commented Apr 23, 2020

> I can't see metacopy=on in the default install of containers-common. Have you added metacopy=on by hand? @geragio

No, I didn't add anything by hand. I did the installation on a VM installed from scratch.

@haircommander
Member

haircommander commented Apr 23, 2020

containers-common may be coming from somewhere other than OBS.
What's the output of yum info containers-common?

@dougsland
Contributor

dougsland commented Apr 23, 2020

@geragio I reproduced the report. My setup was not getting the package from OBS. I'm going to prepare a build with a patch for CentOS 7.

@dougsland
Contributor

@geragio could you please test the new build that is now available? If all is good, please consider closing this report.

Thanks!

@geragio
Author

geragio commented Apr 24, 2020

Hi @dougsland, it seems to be working correctly now.

I'm now running cri-o 2:1.17.2-5.1.el7, and the metacopy=on parameter is gone from the storage.conf file.

Thank you so much for your valuable support. I'm closing this issue.

geragio closed this as completed Apr 24, 2020
simonswine pushed a commit to simonswine/fedora-rpm-crio that referenced this issue May 30, 2020
metacopy is not recognized by old kernel versions.

Signed-off-by: Douglas Schilling Landgraf <dougsland@redhat.com>