kata-deploy causing cri-o to fail after node is up #64

Closed
amshinde opened this issue Mar 5, 2019 · 9 comments

Comments

@amshinde

amshinde commented Mar 5, 2019

kata-deploy adds multiple entries for the kata-runtime to /etc/crio/crio.conf after the node is up.
This causes cri-o to fail, and the node state then shows as NotReady.
To run any further pods, I had to manually remove the duplicate entries from the crio conf file, stop the kata-deploy DaemonSet, and restart cri-o.

@amshinde
Author

amshinde commented Mar 5, 2019

See kata-containers/packaging#374

@amshinde
Author

amshinde commented Mar 5, 2019

cc @krsna1729 @egernst

@ganeshmaharaj
Contributor

@amshinde didn't kata-containers/packaging#334 fix that issue? That patch was intended for this particular issue.

@krsna1729
Contributor

@ganeshmaharaj you fixed it for the network namespace lifecycle. The same needs to be done for everything we modify in kata-deploy :P

@ganeshmaharaj
Contributor

@krsna1729 yeah, just realized that. Will try to get that fixed.

@grahamwhaley

just fyi, whilst I can see it, @ganeshmaharaj - I just ran up your Ubuntu vagrant, and used create_stack.sh minimal. I was just checking if crio was up (related to #73 ), and did:

$ systemctl status crio
 * crio.service - Open Container Initiative Daemon
   Loaded: loaded (/lib/systemd/system/crio.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Wed 2019-03-20 06:46:55 PDT; 3min 55s ago
     Docs: https://github.com/kubernetes-sigs/cri-o
  Process: 11651 ExecStart=/usr/bin/crio $CRIO_STORAGE_OPTIONS $CRIO_NETWORK_OPTIONS $CRIO_METRICS_OPTION
 Main PID: 11651 (code=exited, status=1/FAILURE)

... last key parsed 'crio.runtime.runtimes'): Key 'crio.runtime.runtimes.kata-qemu' has already been defined.
...

Having a look in /etc/crio/crio.conf:

# Path to directory where CNI plugin binaries are located.
plugin_dir = "/opt/cni/bin"
[crio.runtime.runtimes.kata-qemu]
  runtime_path = "/opt/kata/bin/kata-qemu"

[crio.runtime.runtimes.kata-fc]
  runtime_path = "/opt/kata/bin/kata-fc"
[crio.runtime.runtimes.kata-qemu]
  runtime_path = "/opt/kata/bin/kata-qemu"

[crio.runtime.runtimes.kata-fc]
  runtime_path = "/opt/kata/bin/kata-fc"

which looks like this issue, doesn't it? Deleting those last duplicate lines and running systemctl restart crio gets the crio service back up.
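The manual cleanup described here (find the tables that are defined twice, keep the first copy, drop the rest) can be sketched in a few lines. This is only an illustrative, line-based sketch and not part of kata-deploy: the function names `duplicate_tables` and `drop_repeated_tables` are hypothetical, and it assumes table headers sit alone on their own line, as in the crio.conf excerpt above. It is not a full TOML parser.

```python
import re
from collections import Counter

# Matches a plain TOML table header alone on a line,
# e.g. [crio.runtime.runtimes.kata-qemu]. Arrays of tables ([[...]]) do not match.
TABLE_HEADER = re.compile(r"^\s*\[([^\[\]]+)\]\s*$")


def duplicate_tables(conf_text):
    """Return table names defined more than once, in first-seen order."""
    names = []
    for line in conf_text.splitlines():
        match = TABLE_HEADER.match(line)
        if match:
            names.append(match.group(1))
    counts = Counter(names)
    return [name for name in dict.fromkeys(names) if counts[name] > 1]


def drop_repeated_tables(conf_text):
    """Keep the first definition of each table; drop later repeats and their bodies."""
    seen = set()
    kept = []
    skipping = False
    for line in conf_text.splitlines():
        match = TABLE_HEADER.match(line)
        if match:
            # A repeated header starts a skipped region that lasts
            # until the next header that has not been seen before.
            skipping = match.group(1) in seen
            seen.add(match.group(1))
        if not skipping:
            kept.append(line)
    return "\n".join(kept) + "\n"
```

Run against the contents of /etc/crio/crio.conf, `duplicate_tables` would report the `kata-qemu` and `kata-fc` runtime tables from the excerpt above, and `drop_repeated_tables` would return the text with only the first copy of each. The crio service still needs a restart afterwards.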

@krsna1729
Contributor

@grahamwhaley there is a PR that got merged in kata-deploy to make it idempotent. Waiting for next release/docker image update.
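For readers wondering what "idempotent" means in this context: the config change should only be applied when it is not already present, so running the installer twice leaves the file the same as running it once. The following is a minimal sketch of that idea and not the code from the merged PR; the helper name `ensure_runtime_table` is hypothetical, and Python is used purely for illustration.

```python
def ensure_runtime_table(conf_text, name, runtime_path):
    """Append a [crio.runtime.runtimes.<name>] table only if it is missing."""
    header = f"[crio.runtime.runtimes.{name}]"
    # If the header is already there, return the text unchanged so that
    # repeated runs never add a second definition.
    if any(line.strip() == header for line in conf_text.splitlines()):
        return conf_text
    if conf_text and not conf_text.endswith("\n"):
        conf_text += "\n"
    return conf_text + f'{header}\n  runtime_path = "{runtime_path}"\n'
```

Calling it twice with the same arguments, for example `ensure_runtime_table(text, "kata-qemu", "/opt/kata/bin/kata-qemu")`, yields the same result as calling it once, which is the property that avoids the "has already been defined" error shown earlier in the thread.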

@krsna1729
Contributor

@grahamwhaley @amshinde can you verify you don't see this anymore?

@grahamwhaley

@krsna1729 - I've not witnessed the issue in the last few days, so I think it is fixed!
