Salt-syndic process stuck on Debian #19957

Closed
claudiupopescu opened this issue Jan 22, 2015 · 18 comments
Labels
Bug broken, incorrect, or confusing behavior Core relates to code central or existential to Salt P2 Priority 2 Salt-Syndic severity-medium 3rd level, incorrect or bad functionality, confusing and lacks a work around stale
Comments

@claudiupopescu
Contributor

Setup:

  1. Master of masters controlling multiple syndics.
  2. Multiple salt masters with syndic installed and pointing to the master of masters.
  3. Each syndic has the minion connected to the master of masters.

How to reproduce:

  • Stop the salt-syndic service.
  • The service does not stop, and running strace on the process shows:
  poll([{fd=17, events=POLLIN}, {fd=22, events=POLLIN}], 2, 0) = 0 (Timeout)
  poll([{fd=17, events=POLLIN}], 1, 0)    = 0 (Timeout)
  poll([{fd=22, events=POLLIN}], 1, 0)    = 0 (Timeout)
  clock_gettime(CLOCK_MONOTONIC, {3952, 273992175}) = 0
  poll([{fd=17, events=POLLIN}, {fd=22, events=POLLIN}], 2, 1000) = 0 (Timeout)
  poll([{fd=17, events=POLLIN}], 1, 0)    = 0 (Timeout)
  poll([{fd=22, events=POLLIN}], 1, 0)    = 0 (Timeout)
  clock_gettime(CLOCK_MONOTONIC, {3953, 275512367}) = 0
  gettimeofday({1421939537, 10546}, NULL) = 0
  • Restart the syndic service and you will end up with two salt-syndic processes, which doubles the test.ping results. Restart the service again and you will have three instances.
@rallytime
Contributor

Thanks for the report @claudiupopescu and for the helpful steps to reproduce this bug. Can you also post the output of salt --versions-report?

@rallytime rallytime added Bug broken, incorrect, or confusing behavior severity-low 4th level, cosmetic problems, work around exists labels Jan 23, 2015
@rallytime rallytime added this to the Approved milestone Jan 23, 2015
@claudiupopescu
Contributor Author

Even with the new version (updated today) I see the same issue:

           Salt: 2014.7.1
         Python: 2.7.3 (default, Mar 13 2014, 11:03:55)
         Jinja2: 2.6
       M2Crypto: 0.21.1
 msgpack-python: 0.1.10
   msgpack-pure: Not Installed
       pycrypto: 2.6
        libnacl: Not Installed
         PyYAML: 3.10
          ioflo: Not Installed
          PyZMQ: 13.1.0
           RAET: Not Installed
            ZMQ: 3.2.3
           Mako: 0.7.0

@rallytime
Contributor

@claudiupopescu What distribution are you seeing this on? I am unable to reproduce this bug, but I installed the HEAD of the 2014.7 branch using salt-bootstrap on Ubuntu 14.04. I'd like to use the same packages you installed Salt from to try to reproduce it.

root@nt-s:~# service salt-syndic start
salt-syndic start/running, process 4207
root@nt-s:~# service salt-syndic start
start: Job is already running: salt-syndic
root@nt-s:~# ps aux | grep salt
root      4309  8.2  2.4 214328 24668 ?        Ssl  16:39   0:00 /usr/bin/python /usr/bin/salt-syndic
root      4338  0.0  0.0  11980   936 pts/0    S+   16:39   0:00 grep --color=auto salt

Same questions go for @ev0rtex.

@claudiupopescu
Contributor Author

                  Salt: 2014.7.2
                Python: 2.7.3 (default, Mar 13 2014, 11:03:55)
                Jinja2: 2.6
              M2Crypto: 0.21.1
        msgpack-python: 0.1.10
          msgpack-pure: Not Installed
              pycrypto: 2.6
               libnacl: Not Installed
                PyYAML: 3.10
                 ioflo: Not Installed
                 PyZMQ: 13.1.0
                  RAET: Not Installed
                   ZMQ: 3.2.3
                  Mako: 0.7.0
 Debian source package: 2014.7.2+ds-1~bpo70+1

On Debian 7.8.

I followed the official documentation for installing Salt from the repository.

@claudiupopescu
Contributor Author

# ps axf | grep salt-syndic
24140 pts/1    S+     0:00  |                   \_ grep --color=auto salt-syndic
11890 ?        Sl     0:33 /usr/bin/python /usr/bin/salt-syndic -d
# service salt-syndic stop
[ ok ] Stopping salt syndic control daemon: salt-syndic.
# ps axf | grep salt-syndic
24170 pts/1    R+     0:00  |                   \_ grep --color=auto salt-syndic
11890 ?        Sl     0:33 /usr/bin/python /usr/bin/salt-syndic -d

As you can see, the process (PID 11890) is still running after the init script reports that it was stopped.
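The same check can be scripted instead of read off `ps` output. The following is a minimal sketch, assuming a POSIX system; the helper name `pid_is_running` is hypothetical and not part of Salt. It uses signal 0, which delivers nothing and only tests whether the PID exists.

```python
import errno
import os


def pid_is_running(pid):
    """Return True if a process with this PID currently exists (POSIX only)."""
    try:
        # Signal 0 sends nothing; the kernel only performs the existence
        # and permission checks.
        os.kill(pid, 0)
    except OSError as exc:
        if exc.errno == errno.ESRCH:
            # No such process.
            return False
        if exc.errno == errno.EPERM:
            # The process exists but belongs to another user.
            return True
        raise
    return True


# The current interpreter is certainly alive, so this is a safe smoke test.
print(pid_is_running(os.getpid()))
```

Calling `pid_is_running(11890)` right after `service salt-syndic stop` would be one way to confirm whether the daemon really exited.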

@rallytime
Contributor

Ok thanks very much for that update. Looks like we have a bug in the init script for the Debian package: https://github.com/saltstack/salt/blob/develop/debian/salt-syndic.init (I'm not familiar with this file myself, but that's where one could start looking if they wanted to try and tackle this bug.)

I am curious to see if @ev0rtex is running on the same system or not, since he is seeing this bug too.

@claudiupopescu
Contributor Author

The init file looks ok to me, but who knows, maybe I missed something :)

Here are more details:

# lsof -c salt-syndic | grep log
salt-synd 14019 root    3w   REG              202,0     4098  19272 /syndic.log

# grep -i pid /syndic.log
2015-03-24 07:56:47,452 [salt.utils.process                          ][DEBUG   ] Created pidfile: /salt-syndic.pid

# ls -la / | grep syndic
-rw-r--r--  1 root root     5 Mar 24 07:56 salt-syndic.pid
-rw-r--r--  1 root root  4098 Mar 24 07:59 syndic.log

As you can see, salt-syndic writes both its log file and its pid file to / (the filesystem root).
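A small sketch of how this observation could be checked automatically: it extracts the pidfile path that the daemon logged and compares it with the path the init script hands to start-stop-daemon. The value of `EXPECTED_PIDFILE` is an assumption, since the thread does not quote the init script's `PIDFILE` setting, and the helper name `logged_pidfile` is hypothetical. Whether such a mismatch is what causes the stuck process is not confirmed in this thread.

```python
import re

# Assumption: the path the init script passes via --pidfile. The thread does
# not show this value, so adjust it to match /etc/init.d/salt-syndic.
EXPECTED_PIDFILE = "/var/run/salt-syndic.pid"

PIDFILE_RE = re.compile(r"Created pidfile: (\S+)")


def logged_pidfile(log_text):
    """Return the most recent pidfile path found in the log text, or None."""
    matches = PIDFILE_RE.findall(log_text)
    return matches[-1] if matches else None


# Log line copied from the grep output above.
log_line = (
    "2015-03-24 07:56:47,452 [salt.utils.process                          ]"
    "[DEBUG   ] Created pidfile: /salt-syndic.pid"
)

actual = logged_pidfile(log_line)
if actual != EXPECTED_PIDFILE:
    print("pidfile mismatch: daemon wrote %s, init script expects %s"
          % (actual, EXPECTED_PIDFILE))
```

If the two paths differ, a stop action that looks up the process through its pid file would have nothing to signal, which would be consistent with the behaviour reported here.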

Now the init file:

# diff /etc/init.d/salt-syndic /etc/init.d/salt-minion
3c3
< # Provides:          salt-syndic
---
> # Provides:          salt-minion
8c8
< # Short-Description: salt syndic control daemon
---
> # Short-Description: salt minion control daemon
15,17c15,17
< DESC="salt syndic control daemon"
< NAME=salt-syndic
< DAEMON=/usr/bin/salt-syndic
---
> DESC="salt minion control daemon"
> NAME=salt-minion
> DAEMON=/usr/bin/salt-minion
40c40
<     start-stop-daemon --start --quiet --pidfile $PIDFILE --exec $DAEMON -- \
---
>     start-stop-daemon --start --quiet --background --pidfile $PIDFILE --exec $DAEMON -- \

As you can see, the only significant difference is --background, which does not affect the pid file or log path.

I will gladly provide more details, but for now I don't have time to dive into the Salt code.

@ev0rtex

ev0rtex commented Mar 24, 2015

@rallytime yeah, all of our servers are running wheezy so it could most definitely be a problem specifically with the Debian init script.

@rallytime
Contributor

Good info @claudiupopescu and @ev0rtex. The only difference I see there is the --background as well.

@rallytime rallytime changed the title Salt-syndic process stuck Salt-syndic process stuck on Debian Mar 24, 2015
@rallytime
Contributor

@claudiupopescu Can you let us know if #23341 resolves this issue as well, once it gets merged in?

@rallytime rallytime added severity-medium 3rd level, incorrect or bad functionality, confusing and lacks a work around fixed-pls-verify fix is linked, bug author to confirm fix P2 Priority 2 Salt-Syndic and removed severity-low 4th level, cosmetic problems, work around exists labels May 5, 2015
@claudiupopescu
Contributor Author

Sure, I will be forced to test anyway once the package is available :)

@jfindlay jfindlay added the Core relates to code central or existential to Salt label May 26, 2015
@rallytime
Contributor

@claudiupopescu Just wondering if this one is still causing trouble for you.

@claudiupopescu
Contributor Author

I still have the same problem with this version:

salt --versions
                  Salt: 2015.5.0
                Python: 2.7.3 (default, Mar 13 2014, 11:03:55)
                Jinja2: 2.6
              M2Crypto: 0.21.1
        msgpack-python: 0.1.10
          msgpack-pure: Not Installed
              pycrypto: 2.6
               libnacl: Not Installed
                PyYAML: 3.10
                 ioflo: Not Installed
                 PyZMQ: 13.1.0
                  RAET: Not Installed
                   ZMQ: 3.2.3
                  Mako: 0.7.0
 Debian source package: 2015.5.0+ds-1~bpo70+1

@rallytime rallytime removed the fixed-pls-verify fix is linked, bug author to confirm fix label Jun 25, 2015
@rallytime
Contributor

Ok thanks for the update. I've removed the "Fixed Pending Verification" label so we can get eyes on this again.

@ev0rtex

ev0rtex commented Jun 29, 2015

FWIW, I am running 2015.5.0 now and haven't had any further issues with multiple returns. Sorry for the late reply here. I was away for a bit and didn't have a good chance to look at this and test it.

@claudiupopescu
Contributor Author

@ev0rtex can you check whether you see more than one syndic process running after service salt-syndic restart? Try it several times and check that only one process is running each time.

For me, on Debian wheezy, I still have the same issue.

@ev0rtex

ev0rtex commented Jul 1, 2015

@claudiupopescu I should have checked that before. You are correct that syndic processes will be left running upon a service restart:

~$ ps -eaf | grep syndic
root      2910     1  0 Jun26 ?        00:01:42 /usr/bin/python /usr/bin/salt-syndic -d
~$ service salt-syndic restart
Restarting salt syndic control daemon: salt-syndic.
~$ ps -eaf | grep syndic
root      2910     1  0 Jun26 ?        00:01:42 /usr/bin/python /usr/bin/salt-syndic -d
root     25285     1 45 08:36 ?        00:00:00 /usr/bin/python /usr/bin/salt-syndic -d

I must not have had any service restarts in a while.
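For anyone who wants to spot leftover processes without reading `ps` output by eye, here is a rough sketch that parses `ps -eaf` style output and lists every salt-syndic PID other than the one the service is believed to track. The function names are hypothetical, and treating 25285 as the tracked PID is an assumption made only for this example.

```python
def syndic_pids(ps_ef_output):
    """Extract the PIDs of salt-syndic daemons from `ps -eaf` style output."""
    pids = []
    for line in ps_ef_output.splitlines():
        fields = line.split()
        # Columns: UID PID PPID C STIME TTY TIME CMD...
        if len(fields) < 8:
            continue
        command = fields[7:]
        # Ignore the grep process that was used to filter the listing.
        if "grep" in command:
            continue
        if "/usr/bin/salt-syndic" in command:
            pids.append(int(fields[1]))
    return pids


def stale_pids(ps_ef_output, tracked_pid):
    """Return syndic PIDs other than the one the service currently tracks."""
    return [pid for pid in syndic_pids(ps_ef_output) if pid != tracked_pid]


# Process listing copied from the output above.
sample = """\
root      2910     1  0 Jun26 ?        00:01:42 /usr/bin/python /usr/bin/salt-syndic -d
root     25285     1 45 08:36 ?        00:00:00 /usr/bin/python /usr/bin/salt-syndic -d
"""

# Assumption: 25285 (the process started by the restart) is the tracked one.
print(stale_pids(sample, 25285))
```

Any PID this reports is a candidate for a manual `kill` until the underlying stop problem is fixed.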

@stale

stale bot commented Nov 22, 2017

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

If this issue is closed prematurely, please leave a comment and we will gladly reopen the issue.

@stale stale bot added the stale label Nov 22, 2017
@stale stale bot closed this as completed Nov 29, 2017
4 participants