Process instability observed when running inside a container launched with a DaemonSet #30106

Closed
sbezverk opened this issue Aug 4, 2016 · 6 comments
Labels
area/kubelet, lifecycle/rotten, sig/node

Comments

@sbezverk (Contributor) commented Aug 4, 2016

I noticed instability when launching a container using a DaemonSet. The same container launched with a ReplicationController works 100% of the time. The problems are mostly process restarts, assorted errors, and an inability to create a named socket. It is not a config issue: as soon as I add strace to the process's command line, it starts working fine, which suggests some sort of race condition with the DaemonSet. The issue is 100% reproducible, and I am ready to offer access to the impacted test bed for further troubleshooting. Here is the error generated when the container starts under the DaemonSet:

[19202.800318] traps: ovsdb-server[24147] general protection ip:7fb18600be37 sp:7ffef3da0d00 error:0 in libc-2.17.so[7fb185fd5000+1b7000]

I am on CentOS 7.2, Kubernetes 1.3.4-dirty (I needed this version for another issue that is fixed in this release). Please let me know if somebody would be interested to check it out.
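For reference, the strace workaround looks roughly like this (the trace flags and output path below are illustrative, not the exact command line I used):

# Illustrative only: prefixing the entrypoint with strace is enough to make
# ovsdb-server start cleanly, which points at a startup timing/race issue
# rather than a configuration problem.
strace -f -o /tmp/ovsdb-server.trace \
    /usr/sbin/ovsdb-server /etc/openvswitch/conf.db \
    -vconsole:emer -vsyslog:err \
    --remote=punix:/run/openvswitch/db.sock \
    -vfile:info --log-file=/var/log/kolla/openvswitch/ovsdb-server.log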

@k8s-github-robot added the area/kubelet and sig/node labels Aug 4, 2016
@dchen1107 (Member) commented:

@sbezverk If possible, could you please provide the following information:

  1. Run your container through the rc, then run kubectl get -o yaml pods
  2. Run your container as a daemonset, then run kubectl get -o yaml pods
  3. Run your container as a daemonset, then run kubectl logs --previous <pod-id> <container> to retrieve the previously terminated container's log
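For example (the pod and container names below are hypothetical placeholders):

# 1. With the ReplicationController in place:
kubectl get -o yaml pods

# 2. With the DaemonSet in place:
kubectl get -o yaml pods

# 3. Pull the log of the previously terminated container:
kubectl logs --previous ovsdb-server-abc12 ovsdb-server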

@sbezverk (Contributor, Author) commented Aug 5, 2016

@dchen1107 It turned out that the replication controller also shows a similar issue :(
Here is the link to the requested information:
Collected logs
Let me know if additional info is required.

@sbezverk (Contributor, Author) commented Aug 5, 2016

@dchen1107 Interesting observation: if I start the process inside the container with this script,
#!/bin/bash

sleep 1
/usr/sbin/ovsdb-server /etc/openvswitch/conf.db \
    -vconsole:emer -vsyslog:err \
    --remote=punix:/run/openvswitch/db.sock \
    -vfile:info --log-file=/var/log/kolla/openvswitch/ovsdb-server.log

it works perfectly on both compute nodes.
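If the problem really is a startup race around the socket path, a variant like the one below would make that dependency explicit instead of relying on a fixed one-second delay (untested sketch; it assumes the crash happens because /run/openvswitch is not ready when ovsdb-server tries to create db.sock):

#!/bin/bash
# Untested sketch: create the runtime directory for the named socket up
# front instead of waiting an arbitrary second, then exec the server so it
# remains the container's main process and the kubelet sees its exits.
mkdir -p /run/openvswitch

exec /usr/sbin/ovsdb-server /etc/openvswitch/conf.db \
    -vconsole:emer -vsyslog:err \
    --remote=punix:/run/openvswitch/db.sock \
    -vfile:info --log-file=/var/log/kolla/openvswitch/ovsdb-server.log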

@fejta-bot commented:

Issues go stale after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Dec 16, 2017
@fejta-bot commented:

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle rotten
/remove-lifecycle stale

@k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label Jan 15, 2018
@fejta-bot commented:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/close
