Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

'salt-run jobs.active' doesn't list all minions running the job #9526

Closed
Whissi opened this issue Jan 2, 2014 · 19 comments · Fixed by #12042
Closed

'salt-run jobs.active' doesn't list all minions running the job #9526

Whissi opened this issue Jan 2, 2014 · 19 comments · Fixed by #12042
Labels
Bug broken, incorrect, or confusing behavior fixed-pls-verify fix is linked, bug author to confirm fix severity-low 4th level, cosemtic problems, work around exists

Comments

@Whissi
Copy link
Contributor

Whissi commented Jan 2, 2014

Hi,

master # date
Thu Jan  2 15:48:46 CET 2014
master # salt-run jobs.active
'20140102153844095714':
  Arguments:
  - emerge --verbose --update --deep --newuse world
  Function: cmd.run
  Returned:
  - jid
  Running:
  - srv7.example.org: 25385
  - srv2.example.org: 21399
  Target: srv[2,3,7].example.org
  Target-type: glob
  User: root

I am missing "srv3.example.org" in the list. And yes, srv3 is running the job:

srv3 # qlop -c
 * net-libs/zeromq-4.0.3
     started: Thu Jan  2 15:48:50 2014
     elapsed: 5 seconds
 * net-misc/curl-7.34.0-r1
     started: Thu Jan  2 15:48:21 2014
     elapsed: 34 seconds
 * sys-devel/gettext-0.18.3.1-r1
     started: Thu Jan  2 15:48:38 2014
     elapsed: 17 seconds
 * dev-libs/icu-52.1
     started: Thu Jan  2 15:48:39 2014
     elapsed: 16 seconds
# salt --versions-report
           Salt: 0.17.4
         Python: 2.7.6 (default, Dec  2 2013, 13:35:18)
         Jinja2: 2.7.1
       M2Crypto: 0.21.1
 msgpack-python: 0.4.0
   msgpack-pure: Not Installed
       pycrypto: 2.6.1
         PyYAML: 3.10
          PyZMQ: 14.0.1
            ZMQ: 4.0.3
@basepi
Copy link
Contributor

basepi commented Jan 6, 2014

Is this consistently reproducible? My bet in this instance is that the minion timed out on the master's query for the job. Additionally, what OS is the minion running?

@basepi
Copy link
Contributor

basepi commented Jan 6, 2014

Ah, just noticed the emerge, so I'm assuming gentoo.

@basepi
Copy link
Contributor

basepi commented Jan 6, 2014

I'm a gentoo newb -- what is that qlop command doing, and how are you using it to verify that the salt job is still running?

@Whissi
Copy link
Contributor Author

Whissi commented Jan 6, 2014

Yup, Gentoo.

Well, I ssh'ed into srv3 and verified with ps that emerge was still running. qlop -c will list you the current emerging (installing) package if there is one. Because emerge was the job I started and ps listed the process as still running (and I confirmed with qlop that emerge was still active) I assume that the salt job was still active.

I can tell you next Thursday if it happens again (it was reproducible last week).

@Whissi
Copy link
Contributor Author

Whissi commented Jan 6, 2014

PS: When the job was done, srv3 showed up in salt-run jobs.print_job 20140102153844095714

@basepi
Copy link
Contributor

basepi commented Jan 7, 2014

Yep, I'm almost certain it's an premature timeout issue. When I find time to finish making the secondary timeout configurable, I think that will resolve this issue.

@basepi
Copy link
Contributor

basepi commented Feb 4, 2014

Good news, that secondary timeout is configurable now! It's not in 2014.1.0rc3, but it will be in 2014.1.0, out later this week. Just put gather_job_timeout: 5 in your master config, for example. There were also some poller fixes in the same pull req that may help as well.

@AnneTheAgile
Copy link
Contributor

Now that 2014.1.0 is shipped, can we close this issue? https://github.com/saltstack/salt/releases/tag/v2014.1.0

@techhat
Copy link
Contributor

techhat commented Feb 22, 2014

Sounds good to me!

@techhat techhat closed this as completed Feb 22, 2014
@olliewalsh
Copy link
Contributor

@basepi I think gather_job_timeout default should be higher - I had to set it to 5 to prevent timeouts while running pkg.latest_version on debian minions. I assume the load generated by running apt-get upgrade etc... was causing a spike in I/O which prevented find_job returning fast enough.

@basepi
Copy link
Contributor

basepi commented Mar 7, 2014

I agree. I'm going to bump it to 5 by default, 2 is too short for a
default.

@basepi
Copy link
Contributor

basepi commented Mar 7, 2014

#11062 is in.

@Whissi
Copy link
Contributor Author

Whissi commented Mar 21, 2014

I have to re-open the issue.

srv1 (master):

# salt --versions-report
           Salt: 2014.1.1
         Python: 2.7.6 (default, Dec  2 2013, 13:35:18)
         Jinja2: 2.7.2
       M2Crypto: 0.22
 msgpack-python: 0.4.1
   msgpack-pure: Not Installed
       pycrypto: 2.6.1
         PyYAML: 3.10
          PyZMQ: 14.1.1
            ZMQ: 4.0.4

srv2 (minion):

# salt --versions-report
           Salt: 2014.1.1
         Python: 2.7.6 (default, Dec  2 2013, 16:14:07)
         Jinja2: 2.7.2
       M2Crypto: 0.22
 msgpack-python: 0.4.1
   msgpack-pure: Not Installed
       pycrypto: 2.6.1
         PyYAML: 3.10
          PyZMQ: 14.1.1
            ZMQ: 4.0.4

srv3 (minion):

 # salt --versions-report
           Salt: 2014.1.0
         Python: 2.7.6 (default, Dec  2 2013, 16:40:06)
         Jinja2: 2.7.1
       M2Crypto: 0.21.1
 msgpack-python: 0.4.1
   msgpack-pure: Not Installed
       pycrypto: 2.6.1
         PyYAML: 3.10
          PyZMQ: 14.0.1
            ZMQ: 4.0.3

I started a long running job from the master: salt 'srv[2,3].example.org' cmd.run 'emerge -v1O --usepkg=n dev-lang/perl. You don't have to understand what the command is doing I guess, but I'll tell you: It will build the perl package on the two selected minions running Gentoo.

After ~8-10sec I pressed CTRL+C on the master, which gave me

^CExiting on Ctrl-C
This job's jid is:
20140321213933889697
The minions may not have all finished running and any remaining minions will return upon completion. To look up the return data for this job later run:
salt-run jobs.lookup_jid 20140321213933889697

That's OK. Now I run salt-run jobs.active and that's what I am getting:

'20140321213933889697':
  Arguments:
  - emerge -v1O --usepkg=n dev-lang/perl
  Function: cmd.run
  Returned:
  - jid
  Running:
  - srv2.example.org: 31492
  Target: srv[2,3].example.org
  Target-type: glob
  User: root

That's not OK:

  • srv3 is missing. I verified that the emerge process is running on that minion...
  • I am also wondering if this output is normal
  Returned:
  - jid

Looks like an empty, not resolved value? Not?

After some minutes I got

# salt-run jobs.active
'20140321213933889697':
  Arguments:
  - emerge -v1O --usepkg=n dev-lang/perl
  Function: cmd.run
  Returned:
  - srv2.example.org
  - jid
  Running: []
  Target: srv[2,3].example.org
  Target-type: glob
  User: root

Still an active job without a running minion? Well, the emerge process was still active on srv3 at this time. When it was finished, salt-run jobs.active returned {} as expected.

I thought that this should be working in 2014.1.0 (and the master is running 2014.1.1 now). I also have set gather_job_timeout: 5 in my master configuration (do I have to set this for the minions as well?) like @basepi told me in #9526 (comment)... So it looks like this isn't working yet :/

Can I help you debugging this?

@Whissi
Copy link
Contributor Author

Whissi commented Mar 21, 2014

Some debug logs:

# salt 'srv[2,3].example.org' cmd.run 'emerge --verbose --update --deep --pretend --newuse world'
srv2.example.org:

    These are the packages that would be merged, in order:

    Calculating dependencies  . .... done!

    Total: 0 packages, Size of downloads: 0 kB
^CExiting on Ctrl-C
This job's jid is:
20140321221646145273
The minions may not have all finished running and any remaining minions will return upon completion. To look up the return data for this job later run:
salt-run jobs.lookup_jid 20140321221646145273
srv1 ~ # salt-run jobs.active
'20140321221646145273':
  Arguments:
  - emerge --verbose --update --deep --pretend --newuse world
  Function: cmd.run
  Returned:
  - srv2.example.org
  - jid
  Running: []
  Target: srv[2,3].example.org
  Target-type: glob
  User: root
srv1 ~ # date
Fri Mar 21 22:17:37 CET 2014
srv1 ~ # salt-run jobs.active
{}

srv1 (master) debug log:

2014-03-21 22:15:50,793 [salt.master                                 ][INFO    ] Authentication request from srv3.example.org
2014-03-21 22:15:50,793 [salt.master                                 ][INFO    ] Authentication accepted from srv3.example.org
2014-03-21 22:15:51,105 [salt.master                                 ][INFO    ] Clear payload received with command _auth
2014-03-21 22:15:51,105 [salt.master                                 ][INFO    ] Authentication request from srv3.example.org
2014-03-21 22:15:51,105 [salt.master                                 ][INFO    ] Authentication accepted from srv3.example.org
2014-03-21 22:15:51,358 [salt.master                                 ][INFO    ] AES payload received with command _mine
2014-03-21 22:16:01,172 [salt.fileserver                             ][DEBUG   ] Updating fileserver cache
2014-03-21 22:16:01,173 [salt.fileserver                             ][DEBUG   ] diff_mtime_map: the maps are the same
2014-03-21 22:16:01,173 [salt.utils.verify                           ][DEBUG   ] This salt-master instance has accepted 7 minion keys.
2014-03-21 22:16:46,144 [salt.master                                 ][INFO    ] Clear payload received with command publish
2014-03-21 22:16:46,146 [salt.master                                 ][INFO    ] User root Published command cmd.run with jid 20140321221646145273
2014-03-21 22:16:46,146 [salt.master                                 ][DEBUG   ] Published command details {'tgt_type': 'glob', 'jid': '20140321221646145273', 'tgt': 'srv[2,3].example.org', 'ret': '', 'user': 'root', 'arg': ['emerge --verbose --update --deep --pretend --newuse world'], 'fun': 'cmd.run'}
2014-03-21 22:16:51,153 [salt.master                                 ][INFO    ] Clear payload received with command publish
2014-03-21 22:16:51,154 [salt.master                                 ][INFO    ] User root Published command saltutil.find_job with jid 20140321221651153500
2014-03-21 22:16:51,154 [salt.master                                 ][DEBUG   ] Published command details {'tgt_type': 'glob', 'jid': '20140321221651153500', 'tgt': 'srv[2,3].example.org', 'ret': '', 'user': 'root', 'arg': ['20140321221646145273'], 'fun': 'saltutil.find_job'}
2014-03-21 22:16:51,162 [salt.master                                 ][INFO    ] AES payload received with command _return
2014-03-21 22:16:51,163 [salt.master                                 ][INFO    ] Got return from srv2.example.org for job 20140321221651153500
2014-03-21 22:16:51,171 [salt.master                                 ][INFO    ] AES payload received with command _return
2014-03-21 22:16:51,171 [salt.master                                 ][INFO    ] Got return from srv3.example.org for job 20140321221651153500
2014-03-21 22:16:56,177 [salt.master                                 ][INFO    ] Clear payload received with command publish
2014-03-21 22:16:56,180 [salt.master                                 ][INFO    ] User root Published command saltutil.find_job with jid 20140321221656178059
2014-03-21 22:16:56,180 [salt.master                                 ][DEBUG   ] Published command details {'tgt_type': 'glob', 'jid': '20140321221656178059', 'tgt': 'srv[2,3].example.org', 'ret': '', 'user': 'root', 'arg': ['20140321221646145273'], 'fun': 'saltutil.find_job'}
2014-03-21 22:16:56,188 [salt.master                                 ][INFO    ] AES payload received with command _return
2014-03-21 22:16:56,188 [salt.master                                 ][INFO    ] Got return from srv2.example.org for job 20140321221656178059
2014-03-21 22:16:56,196 [salt.master                                 ][INFO    ] AES payload received with command _return
2014-03-21 22:16:56,196 [salt.master                                 ][INFO    ] Got return from srv3.example.org for job 20140321221656178059
2014-03-21 22:16:56,780 [salt.master                                 ][INFO    ] AES payload received with command _return
2014-03-21 22:16:56,780 [salt.master                                 ][INFO    ] Got return from srv2.example.org for job 20140321221646145273
2014-03-21 22:17:01,253 [salt.fileserver                             ][DEBUG   ] Updating fileserver cache
2014-03-21 22:17:01,254 [salt.fileserver                             ][DEBUG   ] diff_mtime_map: the maps are the same
2014-03-21 22:17:01,254 [salt.utils.verify                           ][DEBUG   ] This salt-master instance has accepted 7 minion keys.
2014-03-21 22:17:08,737 [salt.master                                 ][INFO    ] Clear payload received with command publish
2014-03-21 22:17:08,738 [salt.master                                 ][INFO    ] User root Published command saltutil.running with jid 20140321221708737474
2014-03-21 22:17:08,738 [salt.master                                 ][DEBUG   ] Published command details {'tgt_type': 'glob', 'jid': '20140321221708737474', 'tgt': '*', 'ret': '', 'user': 'root', 'arg': [], 'fun': 'saltutil.running'}
2014-03-21 22:17:08,744 [salt.master                                 ][INFO    ] AES payload received with command _return
2014-03-21 22:17:08,744 [salt.master                                 ][INFO    ] AES payload received with command _return
2014-03-21 22:17:08,744 [salt.master                                 ][INFO    ] AES payload received with command _return
2014-03-21 22:17:08,744 [salt.master                                 ][INFO    ] Got return from srv4.example.org for job 20140321221708737474
2014-03-21 22:17:08,744 [salt.master                                 ][INFO    ] AES payload received with command _return
2014-03-21 22:17:08,744 [salt.master                                 ][INFO    ] Got return from srv5.example.org for job 20140321221708737474
2014-03-21 22:17:08,744 [salt.master                                 ][INFO    ] Got return from srv6.example.org for job 20140321221708737474
2014-03-21 22:17:08,745 [salt.master                                 ][INFO    ] Got return from srv2.example.org for job 20140321221708737474
2014-03-21 22:17:08,753 [salt.master                                 ][INFO    ] AES payload received with command _return
2014-03-21 22:17:08,753 [salt.master                                 ][INFO    ] Got return from srv3.example.org for job 20140321221708737474
2014-03-21 22:17:08,753 [salt.master                                 ][INFO    ] AES payload received with command _return
2014-03-21 22:17:08,754 [salt.master                                 ][INFO    ] Got return from srv8.example.org for job 20140321221708737474
2014-03-21 22:17:23,825 [salt.master                                 ][INFO    ] Clear payload received with command publish
2014-03-21 22:17:23,826 [salt.master                                 ][INFO    ] User root Published command saltutil.running with jid 20140321221723825761
2014-03-21 22:17:23,827 [salt.master                                 ][DEBUG   ] Published command details {'tgt_type': 'glob', 'jid': '20140321221723825761', 'tgt': '*', 'ret': '', 'user': 'root', 'arg': [], 'fun': 'saltutil.running'}
2014-03-21 22:17:23,832 [salt.master                                 ][INFO    ] AES payload received with command _return
2014-03-21 22:17:23,832 [salt.master                                 ][INFO    ] AES payload received with command _return
2014-03-21 22:17:23,832 [salt.master                                 ][INFO    ] AES payload received with command _return
2014-03-21 22:17:23,832 [salt.master                                 ][INFO    ] Got return from srv6.example.org for job 20140321221723825761
2014-03-21 22:17:23,832 [salt.master                                 ][INFO    ] Got return from srv4.example.org for job 20140321221723825761
2014-03-21 22:17:23,833 [salt.master                                 ][INFO    ] Got return from srv5.example.org for job 20140321221723825761
2014-03-21 22:17:23,833 [salt.master                                 ][INFO    ] AES payload received with command _return
2014-03-21 22:17:23,834 [salt.master                                 ][INFO    ] Got return from srv2.example.org for job 20140321221723825761
2014-03-21 22:17:23,841 [salt.master                                 ][INFO    ] AES payload received with command _return
2014-03-21 22:17:23,841 [salt.master                                 ][INFO    ] Got return from srv3.example.org for job 20140321221723825761
2014-03-21 22:17:23,842 [salt.master                                 ][INFO    ] AES payload received with command _return
2014-03-21 22:17:23,842 [salt.master                                 ][INFO    ] Got return from srv8.example.org for job 20140321221723825761
2014-03-21 22:17:30,164 [salt.master                                 ][INFO    ] AES payload received with command _return
2014-03-21 22:17:30,165 [salt.master                                 ][INFO    ] Got return from srv3.example.org for job 20140321221646145273
2014-03-21 22:17:39,239 [salt.master                                 ][INFO    ] Clear payload received with command publish
2014-03-21 22:17:39,240 [salt.master                                 ][INFO    ] User root Published command saltutil.running with jid 20140321221739239145
2014-03-21 22:17:39,240 [salt.master                                 ][DEBUG   ] Published command details {'tgt_type': 'glob', 'jid': '20140321221739239145', 'tgt': '*', 'ret': '', 'user': 'root', 'arg': [], 'fun': 'saltutil.running'}
2014-03-21 22:17:39,246 [salt.master                                 ][INFO    ] AES payload received with command _return
2014-03-21 22:17:39,246 [salt.master                                 ][INFO    ] AES payload received with command _return
2014-03-21 22:17:39,246 [salt.master                                 ][INFO    ] Got return from srv5.example.org for job 20140321221739239145
2014-03-21 22:17:39,246 [salt.master                                 ][INFO    ] AES payload received with command _return
2014-03-21 22:17:39,246 [salt.master                                 ][INFO    ] AES payload received with command _return
2014-03-21 22:17:39,246 [salt.master                                 ][INFO    ] Got return from srv6.example.org for job 20140321221739239145
2014-03-21 22:17:39,246 [salt.master                                 ][INFO    ] Got return from srv2.example.org for job 20140321221739239145
2014-03-21 22:17:39,246 [salt.master                                 ][INFO    ] Got return from srv4.example.org for job 20140321221739239145
2014-03-21 22:17:39,254 [salt.master                                 ][INFO    ] AES payload received with command _return
2014-03-21 22:17:39,254 [salt.master                                 ][INFO    ] Got return from srv3.example.org for job 20140321221739239145
2014-03-21 22:17:39,256 [salt.master                                 ][INFO    ] AES payload received with command _return
2014-03-21 22:17:39,256 [salt.master                                 ][INFO    ] Got return from srv8.example.org for job 20140321221739239145
2014-03-21 22:18:01,324 [salt.fileserver                             ][DEBUG   ] Updating fileserver cache
2014-03-21 22:18:01,325 [salt.fileserver                             ][DEBUG   ] diff_mtime_map: the maps are the same
2014-03-21 22:18:01,326 [salt.utils.verify                           ][DEBUG   ] This salt-master instance has accepted 7 minion keys.
2014-03-21 22:19:01,396 [salt.fileserver                             ][DEBUG   ] Updating fileserver cache
2014-03-21 22:19:01,397 [salt.fileserver                             ][DEBUG   ] diff_mtime_map: the maps are the same
2014-03-21 22:19:01,398 [salt.utils.verify                           ][DEBUG   ] This salt-master instance has accepted 7 minion keys.
2014-03-21 22:20:01,468 [salt.fileserver                             ][DEBUG   ] Updating fileserver cache
2014-03-21 22:20:01,469 [salt.fileserver                             ][DEBUG   ] diff_mtime_map: the maps are the same
2014-03-21 22:20:01,469 [salt.utils.verify                           ][DEBUG   ] This salt-master instance has accepted 7 minion keys.
2014-03-21 22:21:01,540 [salt.fileserver                             ][DEBUG   ] Updating fileserver cache
2014-03-21 22:21:01,541 [salt.fileserver                             ][DEBUG   ] diff_mtime_map: the maps are the same
2014-03-21 22:21:01,541 [salt.utils.verify                           ][DEBUG   ] This salt-master instance has accepted 7 minion keys.
2014-03-21 22:22:01,611 [salt.fileserver                             ][DEBUG   ] Updating fileserver cache
2014-03-21 22:22:01,612 [salt.fileserver                             ][DEBUG   ] diff_mtime_map: the maps are the same
2014-03-21 22:22:01,613 [salt.utils.verify                           ][DEBUG   ] This salt-master instance has accepted 7 minion keys.
2014-03-21 22:23:01,659 [salt.fileserver                             ][DEBUG   ] Updating fileserver cache
2014-03-21 22:23:01,660 [salt.fileserver                             ][DEBUG   ] diff_mtime_map: the maps are the same
2014-03-21 22:23:01,661 [salt.utils.verify                           ][DEBUG   ] This salt-master instance has accepted 7 minion keys.

srv3 (minion, now also running 2014.1.1) debug log:

2014-03-21 22:15:48,110 [salt             ][INFO    ] Setting up the Salt Minion "srv3.example.org"
2014-03-21 22:15:48,115 [salt.utils.process][DEBUG   ] Created pidfile: /run/salt-minion.pid
2014-03-21 22:15:48,116 [salt.config      ][DEBUG   ] Reading configuration from /etc/salt/minion
2014-03-21 22:15:48,191 [salt.config      ][DEBUG   ] Including configuration from '/etc/salt/minion.d/myMinion.conf'
2014-03-21 22:15:48,192 [salt.config      ][DEBUG   ] Reading configuration from /etc/salt/minion.d/myMinion.conf
2014-03-21 22:15:48,671 [salt.minion                              ][DEBUG   ] Attempting to authenticate with the Salt Master at 10.79.0.1
2014-03-21 22:15:48,672 [salt.crypt                               ][DEBUG   ] Loaded minion key: /etc/salt/pki/minion/minion.pem
2014-03-21 22:15:48,731 [salt.crypt                               ][DEBUG   ] Loaded minion key: /etc/salt/pki/minion/minion.pem
2014-03-21 22:15:48,869 [salt.crypt                               ][DEBUG   ] Decrypting the current master AES key
2014-03-21 22:15:48,870 [salt.crypt                               ][DEBUG   ] Loaded minion key: /etc/salt/pki/minion/minion.pem
2014-03-21 22:15:48,985 [salt.minion                              ][INFO    ] Authentication with master successful!
2014-03-21 22:15:48,985 [salt.crypt                               ][DEBUG   ] Loaded minion key: /etc/salt/pki/minion/minion.pem
2014-03-21 22:15:49,122 [salt.crypt                               ][DEBUG   ] Decrypting the current master AES key
2014-03-21 22:15:49,123 [salt.crypt                               ][DEBUG   ] Loaded minion key: /etc/salt/pki/minion/minion.pem
2014-03-21 22:15:49,280 [salt.crypt                               ][DEBUG   ] Loaded minion key: /etc/salt/pki/minion/minion.pem
2014-03-21 22:15:49,339 [salt.config                              ][DEBUG   ] Reading configuration from /etc/salt/minion
2014-03-21 22:15:49,415 [salt.config                              ][DEBUG   ] Including configuration from '/etc/salt/minion.d/myMinion.conf'
2014-03-21 22:15:49,415 [salt.config                              ][DEBUG   ] Reading configuration from /etc/salt/minion.d/myMinion.conf
2014-03-21 22:15:50,198 [salt.loader                              ][DEBUG   ] Loaded localemod as virtual locale
2014-03-21 22:15:50,199 [salt.loader                              ][DEBUG   ] Loaded groupadd as virtual group
2014-03-21 22:15:50,202 [salt.loader                              ][DEBUG   ] Loaded ebuild as virtual pkg
2014-03-21 22:15:50,206 [salt.loader                              ][DEBUG   ] Loaded parted as virtual partition
2014-03-21 22:15:50,207 [salt.loader                              ][DEBUG   ] Loaded linux_sysctl as virtual sysctl
2014-03-21 22:15:50,227 [salt.loader                              ][DEBUG   ] Loaded sysmod as virtual sys
2014-03-21 22:15:50,230 [salt.loader                              ][DEBUG   ] Loaded zcbuildout as virtual buildout
2014-03-21 22:15:50,232 [salt.loader                              ][DEBUG   ] Loaded djangomod as virtual django
2014-03-21 22:15:50,234 [salt.loader                              ][DEBUG   ] Loaded useradd as virtual user
2014-03-21 22:15:50,235 [salt.loader                              ][DEBUG   ] Loaded gentoo_service as virtual service
2014-03-21 22:15:50,248 [salt.loader                              ][DEBUG   ] Loaded virtualenv_mod as virtual virtualenv
2014-03-21 22:15:50,250 [salt.loader                              ][DEBUG   ] Loaded linux_acl as virtual acl
2014-03-21 22:15:50,251 [salt.loader                              ][DEBUG   ] Loaded cmdmod as virtual cmd
2014-03-21 22:15:50,253 [salt.loader                              ][DEBUG   ] Loaded gentoolkitmod as virtual gentoolkit
2014-03-21 22:15:50,266 [salt.loader                                 ][DEBUG   ] Loaded syslog_return as virtual syslog
2014-03-21 22:15:50,267 [salt.loader                                 ][DEBUG   ] Loaded couchdb_return as virtual couchdb
2014-03-21 22:15:50,267 [salt.loader                                 ][DEBUG   ] Loaded carbon_return as virtual carbon
2014-03-21 22:15:50,268 [salt.minion                                 ][DEBUG   ] I am srv3.example.org and I am not supposed to start any proxies.
2014-03-21 22:15:50,268 [salt.minion                                 ][INFO    ] Minion is starting as user 'root'
2014-03-21 22:15:50,268 [salt.minion                                 ][DEBUG   ] Minion 'srv3.example.org' trying to tune in
2014-03-21 22:15:50,269 [salt.minion                                 ][DEBUG   ] Minion PUB socket URI: ipc:///var/run/salt/minion/minion_event_a9a4e31ca4566be0615108ef7b1891be_pub.ipc
2014-03-21 22:15:50,269 [salt.minion                                 ][DEBUG   ] Minion PULL socket URI: ipc:///var/run/salt/minion/minion_event_a9a4e31ca4566be0615108ef7b1891be_pull.ipc
2014-03-21 22:15:50,269 [salt.minion                                 ][DEBUG   ] Setting zmq_reconnect_ivl to '100ms'
2014-03-21 22:15:50,269 [salt.minion                                 ][DEBUG   ] Setting zmq_reconnect_ivl_max to '5100ms'
2014-03-21 22:15:50,777 [salt.utils.schedule                         ][DEBUG   ] Running scheduled job: __mine_interval
2014-03-21 22:15:50,777 [salt.utils.schedule                         ][DEBUG   ] schedule: This job was scheduled with jid_include, adding to cache (jid_include defaults to True)
2014-03-21 22:15:50,778 [salt.utils.schedule                         ][DEBUG   ] schedule: This job was scheduled with a max number of 2
2014-03-21 22:15:50,788 [salt.utils.schedule                         ][DEBUG   ] schedule.handle_func: adding this job to the jobcache with data {'fun': 'mine.update', 'jid': '20140321221550781515', 'pid': 15493, 'id': 'srv3.example.org'}
2014-03-21 22:15:50,791 [salt.crypt                                  ][DEBUG   ] Loaded minion key: /etc/salt/pki/minion/minion.pem
2014-03-21 22:15:50,928 [salt.crypt                                  ][DEBUG   ] Decrypting the current master AES key
2014-03-21 22:15:50,929 [salt.crypt                                  ][DEBUG   ] Loaded minion key: /etc/salt/pki/minion/minion.pem
2014-03-21 22:15:51,045 [salt.crypt                                  ][DEBUG   ] Loaded minion key: /etc/salt/pki/minion/minion.pem
2014-03-21 22:15:51,104 [salt.crypt                                  ][DEBUG   ] Loaded minion key: /etc/salt/pki/minion/minion.pem
2014-03-21 22:15:51,241 [salt.crypt                                  ][DEBUG   ] Decrypting the current master AES key
2014-03-21 22:15:51,241 [salt.crypt                                  ][DEBUG   ] Loaded minion key: /etc/salt/pki/minion/minion.pem
2014-03-21 22:16:46,148 [salt.minion                                 ][INFO    ] User root Executing command cmd.run with jid 20140321221646145273
2014-03-21 22:16:46,149 [salt.minion                                 ][DEBUG   ] Command details {'tgt_type': 'glob', 'jid': '20140321221646145273', 'tgt': 'srv[2,3].example.org', 'ret': '', 'user': 'root', 'arg': ['emerge --verbose --update --deep --pretend --newuse world'], 'fun': 'cmd.run'}
2014-03-21 22:16:46,160 [salt.loaded.int.module.cmdmod               ][INFO    ] Executing command 'emerge --verbose --update --deep --pretend --newuse world' in directory '/root'
2014-03-21 22:16:51,156 [salt.minion                                 ][INFO    ] User root Executing command saltutil.find_job with jid 20140321221651153500
2014-03-21 22:16:51,156 [salt.minion                                 ][DEBUG   ] Command details {'tgt_type': 'glob', 'jid': '20140321221651153500', 'tgt': 'srv[2,3].example.org', 'ret': '', 'user': 'root', 'arg': ['20140321221646145273'], 'fun': 'saltutil.find_job'}
2014-03-21 22:16:51,169 [salt.minion                                 ][INFO    ] Returning information for job: 20140321221651153500
2014-03-21 22:16:56,181 [salt.minion                                 ][INFO    ] User root Executing command saltutil.find_job with jid 20140321221656178059
2014-03-21 22:16:56,181 [salt.minion                                 ][DEBUG   ] Command details {'tgt_type': 'glob', 'jid': '20140321221656178059', 'tgt': 'srv[2,3].example.org', 'ret': '', 'user': 'root', 'arg': ['20140321221646145273'], 'fun': 'saltutil.find_job'}
2014-03-21 22:16:56,194 [salt.minion                                 ][INFO    ] Returning information for job: 20140321221656178059
2014-03-21 22:17:08,740 [salt.minion                                 ][INFO    ] User root Executing command saltutil.running with jid 20140321221708737474
2014-03-21 22:17:08,740 [salt.minion                                 ][DEBUG   ] Command details {'tgt_type': 'glob', 'jid': '20140321221708737474', 'tgt': '*', 'ret': '', 'user': 'root', 'arg': [], 'fun': 'saltutil.running'}
2014-03-21 22:17:08,751 [salt.minion                                 ][INFO    ] Returning information for job: 20140321221708737474
2014-03-21 22:17:23,828 [salt.minion                                 ][INFO    ] User root Executing command saltutil.running with jid 20140321221723825761
2014-03-21 22:17:23,828 [salt.minion                                 ][DEBUG   ] Command details {'tgt_type': 'glob', 'jid': '20140321221723825761', 'tgt': '*', 'ret': '', 'user': 'root', 'arg': [], 'fun': 'saltutil.running'}
2014-03-21 22:17:23,839 [salt.minion                                 ][INFO    ] Returning information for job: 20140321221723825761
2014-03-21 22:17:30,161 [salt.loaded.int.module.cmdmod               ][INFO    ] output: 
// OUTPUT REMOVED
2014-03-21 22:17:30,162 [salt.minion                                 ][INFO    ] Returning information for job: 20140321221646145273
2014-03-21 22:17:39,241 [salt.minion                                 ][INFO    ] User root Executing command saltutil.running with jid 20140321221739239145
2014-03-21 22:17:39,242 [salt.minion                                 ][DEBUG   ] Command details {'tgt_type': 'glob', 'jid': '20140321221739239145', 'tgt': '*', 'ret': '', 'user': 'root', 'arg': [], 'fun': 'saltutil.running'}
2014-03-21 22:17:39,252 [salt.minion                                 ][INFO    ] Returning information for job: 20140321221739239145

@basepi
Copy link
Contributor

basepi commented Mar 21, 2014

Hrm, that is strange behavior. I'll reopen and we can do more investigation.

@basepi basepi reopened this Mar 21, 2014
@basepi
Copy link
Contributor

basepi commented Mar 21, 2014

For the record, I just tested this and my single minion in my test was also missing from the Running list.

± salt-run jobs.active                                                                                                                             [22:44:24]
'20140321224338025685':
  Arguments:
  - '60'
  Function: test.sleep
  Returned:
  - jid
  Running: []
  Target: '*'
  Target-type: glob
  User: root

So something does seem to be amiss there. Haven't had a chance to code dive to see what the "Returned" piece is supposed to be.

cachedout pushed a commit to cachedout/salt that referenced this issue Apr 16, 2014
* Fixed output of running minions (saltstack#9526)
* Correctly identify returned minions.
Whissi added a commit to Whissi/gentoo-overlay that referenced this issue Apr 16, 2014
iquaba pushed a commit to iquaba/salt that referenced this issue Apr 16, 2014
* Fixed output of running minions (saltstack#9526)
* Correctly identify returned minions.
@AnneTheAgile
Copy link
Contributor

Thank you @thatch45 ! I see that one of the fixes is to skip over the 'jid' header;

salt / salt / runners / jobs.py has def active(): changed to be;
if minion.startswith('.') or minion == 'jid':
continue

@Whissi , I wonder if you still see the 'empty' jid in the "Returned" section after the fix. I was also puzzled what it is, I presume it's a header.

Whissi added a commit to Whissi/gentoo-overlay that referenced this issue Apr 17, 2014
@Whissi
Copy link
Contributor Author

Whissi commented Jun 13, 2014

@basepi , I don't understand why this fix was not picked for 2014.1.3, 2014.1,4 and now it is also missing in 2014.1.5. Can you help or explain? Thanks.

@basepi
Copy link
Contributor

basepi commented Jun 16, 2014

@Whissi because no one ever pinged me that it needed to be cherry-picked. If there's a fix that goes in, and you want to make sure it gets into the next bugfix release, you need to make sure it gets a "Bugfix - Cherry-Pick" label (usually by pinging me).

Pull requests that have been cherry-picked, and so will be in the next bugfix release (or are already in a bugfix release) have the label "Bugfix - [Done] Cherry-Pick".

I have labeled the pull request for cherry-picking.

cachedout pushed a commit that referenced this issue Jun 25, 2014
* Fixed output of running minions (#9526)
* Correctly identify returned minions.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug broken, incorrect, or confusing behavior fixed-pls-verify fix is linked, bug author to confirm fix severity-low 4th level, cosemtic problems, work around exists
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants