same error message as on issue #18504 #22571

BoomerB · 2015-04-13T09:04:50Z

Apr 13 10:55:14 chviradmprd12 salt-minion[1382]: [ERROR   ] Exception '_seconds' occurred in scheduled job
Apr 13 10:55:15 chviradmprd12 salt-minion[1382]: [ERROR   ] Exception '_seconds' occurred in scheduled job
Apr 13 10:55:16 chviradmprd12 salt-minion[1382]: [ERROR   ] Exception '_seconds' occurred in scheduled job
Apr 13 10:55:17 chviradmprd12 salt-minion[1382]: [ERROR   ] Exception '_seconds' occurred in scheduled job

BoomerB · 2015-04-13T09:07:37Z

I have 2 scheduled jobs for this minion (which is also the master)

  push_sudoers:
    function: state.sls
    minutes: 60
    splay: 15
    args:
      - Config_TST.default.scb.sudo.sudoers

  make_sudoers:
    function: state.sls
    seconds: 3600
    splay: 900
    args:
      - Config_TST.default.scb.sudo.make_sudoers
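The two intervals above can be compared once they are normalized to seconds. A minimal Python sketch, assuming the interval keys are additive and that `splay` is always given in seconds; the helper name `interval_seconds` is hypothetical and not part of Salt:

```python
# Hypothetical helper, not Salt's API: normalize a schedule entry's
# interval keys to a single number of seconds.
UNIT_SECONDS = {"seconds": 1, "minutes": 60, "hours": 3600, "days": 86400}


def interval_seconds(job):
    """Sum every interval key present in the job, converted to seconds."""
    return sum(int(job[unit]) * factor
               for unit, factor in UNIT_SECONDS.items()
               if unit in job)


# The two jobs from the report, reduced to their timing keys.
push_sudoers = {"function": "state.sls", "minutes": 60, "splay": 15}
make_sudoers = {"function": "state.sls", "seconds": 3600, "splay": 900}

print(interval_seconds(push_sudoers))  # 3600
print(interval_seconds(make_sudoers))  # 3600
```

Under that assumption both jobs run on the same hourly interval; they differ only in the unit used to express it and in the size of the splay window (15 versus 900).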

BoomerB · 2015-04-13T09:10:03Z

dber@chviradmprd12:~> uname -a
Linux chviradmprd12 3.12.39-47-default #1 SMP Thu Mar 26 13:21:16 UTC 2015 (a901594) x86_64 x86_64 x86_64 GNU/Linux

dber@chviradmprd12:~> cat /etc/SuSE-release
SUSE Linux Enterprise Server 12 (x86_64)
VERSION = 12
PATCHLEVEL = 0

dber@chviradmprd12:~> cat /etc/os-release
NAME="SLES"
VERSION="12"
VERSION_ID="12"
PRETTY_NAME="SUSE Linux Enterprise Server 12"
ID="sles"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:suse:sles:12"

dber@chviradmprd12:~> /usr/bin/salt-minion --version
salt-minion 2014.7.2 (Helium)

dber@chviradmprd12:~> rpm -qa | grep salt
salt-minion-2014.7.2-182.11.noarch
salt-master-2014.7.2-182.11.noarch
salt-bash-completion-2014.7.2-182.11.noarch
salt-2014.7.2-182.11.noarch

dber@chviradmprd12:~> zypper repos -e - devel_languages_python
[devel_languages_python]
name=Python Modules (SLE_12)
enabled=1
autorefresh=1
baseurl=http://download.opensuse.org/repositories/devel:/languages:/python/SLE_12/
type=rpm-md
gpgcheck=1
gpgkey=http://download.opensuse.org/repositories/devel:/languages:/python/SLE_12/repodata/repomd.xml.key

jfindlay · 2015-04-14T18:35:30Z

@BoomerB, thanks for reporting.

#18504, @garethgreenaway.

BoomerB · 2015-04-17T08:51:12Z

Hi,
sorry for not getting back to you, I was distracted.
What kind of info do you need?

garethgreenaway · 2015-04-17T14:02:15Z

There was a bug I introduced in early versions of 2014.7 that was triggered when using splay without seconds; it's fixed in later versions. Is it possible for you to upgrade?

ekle · 2015-04-20T14:59:25Z

error still exists in 2014.7.4

garethgreenaway · 2015-04-20T15:01:57Z

@ekle Thanks. Will take a look.

garethgreenaway · 2015-04-20T15:15:18Z

@ekle Can you provide an example scheduled job where you're seeing the issue? I'm attempting to duplicate the error using a really simple job with test.ping and am unable to replicate the issue. Thanks!

ekle · 2015-04-20T15:35:00Z

nothing special, just this in the pillar:

schedule:
  highstate:
    function: state.highstate
    maxrunning: 1
    seconds: 180
    splay: 120

After removing the splay line, it works fine.
With the line, I get:
2015-04-20 14:54:23,260 [salt.minion ][ERROR ] Exception '_seconds' occurred in scheduled job
Master and minion are the 2014.7.4 Debian packages, installed with apt-get on Debian jessie from:
deb http://debian.saltstack.com/debian jessie-saltstack main

garethgreenaway · 2015-04-20T15:54:47Z

Thanks. Going to fire up a Docker instance with Debian jessie and test with the packages.

garethgreenaway · 2015-04-20T16:27:17Z

@ekle If you run the minion in debug mode, before the exception line, do you see a line that begins with 'schedule.handle_func: Adding splay of'?

ekle · 2015-04-21T07:08:52Z

yes:

2015-04-21 07:07:24,916 [salt.utils.schedule][DEBUG   ] schedule.handle_func: Adding splay of 9 seconds to next run.
2015-04-21 07:07:24,916 [salt.minion      ][ERROR   ] Exception '_seconds' occurred in scheduled job

I can also reproduce this with test.ping; it looks like the function does not matter.
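The debug line suggests that a random offset is drawn from the splay window and added to the next run time. A rough Python sketch of that idea, assuming the offset is a uniform integer between 0 and `splay`; `next_run_with_splay` is an illustrative name and not Salt's code:

```python
import random


def next_run_with_splay(last_run, interval, splay, rng=random):
    """Return (next_run, offset), with offset drawn from [0, splay]."""
    offset = rng.randint(0, splay)
    return last_run + interval + offset, offset


# Seeded only so the example is repeatable; interval and splay match
# the highstate job quoted earlier in the thread.
rng = random.Random(0)
next_run, offset = next_run_with_splay(last_run=0, interval=180, splay=120, rng=rng)
print("Adding splay of {0} seconds to next run.".format(offset))
```

Because the offset only shifts the next run, a job with `seconds: 180` and `splay: 120` would fire somewhere between 180 and 300 seconds after the previous run under this assumption.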

garethgreenaway · 2015-04-21T14:53:14Z

Perfect. Now we know where the exception is happening and can hopefully figure out why. It seems to be happening because the _seconds key doesn't exist in the data dictionary, but it should be there. Let me investigate a bit more. Thanks!
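The log text is consistent with a missing dictionary key: in Python, the string form of a KeyError is the quoted key name. A small illustration, assuming the minion formats the caught exception into the message; the `data` dictionary and the message template here are stand-ins, not Salt's code:

```python
# Stand-in for a job's data dictionary that lacks the derived '_seconds' key.
data = {"function": "test.ping", "seconds": 180, "splay": 120}

try:
    data["_seconds"] += 9  # reading the missing key raises KeyError
except KeyError as exc:
    # str(KeyError('_seconds')) is "'_seconds'", quotes included.
    message = "Exception {0} occurred in scheduled job".format(exc)

print(message)  # Exception '_seconds' occurred in scheduled job
```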

garethgreenaway · 2015-04-22T00:38:48Z

@ekle Do you have the configuration item loop_interval set in the minion configuration?

garethgreenaway · 2015-04-22T00:46:47Z

Never mind. Had a theory but disproved it.

garethgreenaway · 2015-04-22T02:30:27Z

@ekle One other question, do you see the exception on the first run? Or is it only on subsequent runs of the jobs?

ekle · 2015-04-22T10:23:46Z

loop_interval is set to 25

ekle · 2015-04-22T12:27:54Z

@garethgreenaway It looks like the job runs correctly a few times and then starts to fail.
I created a new schedule "ping2" to test it. Here are a few lines from the logs:

cat /var/log/salt/minion | egrep "(ping2|_sec)" 
2015-04-22 10:36:10,742 [salt.utils.schedule][INFO    ] Running scheduled job: ping2
2015-04-22 10:36:44,472 [salt.utils.schedule][INFO    ] Running scheduled job: ping2
2015-04-22 10:37:34,809 [salt.utils.schedule][INFO    ] Running scheduled job: ping2
2015-04-22 10:38:30,054 [salt.utils.schedule][INFO    ] Running scheduled job: ping2
2015-04-22 10:39:21,475 [salt.utils.schedule][INFO    ] Running scheduled job: ping2
2015-04-22 10:39:58,318 [salt.utils.schedule][INFO    ] Running scheduled job: ping2
2015-04-22 10:40:40,838 [salt.utils.schedule][INFO    ] Running scheduled job: ping2
2015-04-22 10:41:13,078 [salt.utils.schedule][INFO    ] Running scheduled job: ping2
2015-04-22 10:41:49,277 [salt.utils.schedule][INFO    ] Running scheduled job: ping2
2015-04-22 10:42:44,479 [salt.utils.schedule][INFO    ] Running scheduled job: ping2
2015-04-22 10:43:40,101 [salt.utils.schedule][INFO    ] Running scheduled job: ping2
2015-04-22 10:44:21,056 [salt.utils.schedule][INFO    ] Running scheduled job: ping2
2015-04-22 10:45:11,085 [salt.utils.schedule][INFO    ] Running scheduled job: ping2
2015-04-22 10:45:54,454 [salt.utils.schedule][INFO    ] Running scheduled job: ping2
2015-04-22 10:46:30,623 [salt.utils.schedule][INFO    ] Running scheduled job: ping2
2015-04-22 10:47:20,701 [salt.utils.schedule][INFO    ] Running scheduled job: ping2
2015-04-22 10:47:56,736 [salt.utils.schedule][INFO    ] Running scheduled job: ping2
2015-04-22 10:48:33,736 [salt.utils.schedule][INFO    ] Running scheduled job: ping2
2015-04-22 10:49:55,514 [salt.minion      ][ERROR   ] Exception '_seconds' occurred in scheduled job
2015-04-22 10:50:04,193 [salt.minion      ][ERROR   ] Exception '_seconds' occurred in scheduled job
2015-04-22 10:50:29,199 [salt.minion      ][ERROR   ] Exception '_seconds' occurred in scheduled job
...

I'm not quite sure, but I think it started to fail after a saltutil.sync_all or saltutil.refresh_pillar run.

garethgreenaway · 2015-04-22T13:18:47Z

Can you try disabling loop_interval? And I'll do some tests with refresh_pillar.

ekle · 2015-04-22T13:38:00Z

disabling loop_interval does not help

garethgreenaway · 2015-04-22T13:50:28Z

Finally able to duplicate it. After running the saltutil.sync_all I'm seeing the issue.

garethgreenaway · 2015-04-22T14:08:13Z

Actually refresh_pillar seems to be the culprit. Thanks for mentioning that! It looks like the job was remaining in the job queue but all the data was being refreshed by the refresh_pillar call, so my assumptions about what was in there were incorrect. Think I've got it figured out, few more tests and I'll push up a PR with the fix. Thanks again!
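A toy reproduction of that failure mode, assuming a scheduler that caches a derived `_seconds` key inside each job's dictionary and a refresh that swaps in fresh dictionaries without it. All names and the `ping2` timing values are hypothetical, and the guard shown is only one possible defensive approach, not necessarily what the fix does:

```python
def prepare(job):
    """Cache the derived interval on the job, as this toy scheduler expects."""
    job["_seconds"] = int(job.get("seconds", 0))
    return job


def add_splay(job, offset, guard=False):
    """Push the next run back by offset seconds."""
    if guard and "_seconds" not in job:
        prepare(job)  # recompute derived state lost by a refresh
    job["_seconds"] += offset
    return job["_seconds"]


# Illustrative values; the real ping2 definition is not shown in the thread.
pillar = {"ping2": {"function": "test.ping", "seconds": 60, "splay": 30}}
schedule = {name: prepare(dict(job)) for name, job in pillar.items()}
print(add_splay(schedule["ping2"], 9))  # 69

# Simulated refresh: job definitions are replaced, derived keys are gone.
schedule = {name: dict(job) for name, job in pillar.items()}
try:
    add_splay(schedule["ping2"], 9)
except KeyError as exc:
    print("Exception {0} occurred in scheduled job".format(exc))

# With the guard, the derived key is rebuilt before it is used.
print(add_splay(schedule["ping2"], 9, guard=True))  # 69
```

The job keeps running under its old name, but the dictionary behind it no longer carries the cached key, which matches the pattern in the logs: several successful runs, then the same exception on every loop.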

rallytime · 2015-05-04T01:06:12Z

@ekle Can you confirm if the pull requests linked above fixed this issue for you or not?

ekle · 2015-05-05T08:36:50Z

Installed 2015.2.0rc2-226-gb3c50c2; looks good.

rallytime · 2015-05-05T15:26:53Z

Great! I am going to close this then. If it pops up again, leave a comment and we can open this again and address any additional issues. Thanks!

jfindlay added the info-needed label Apr 14, 2015

jfindlay added this to the Blocked milestone Apr 14, 2015

jfindlay self-assigned this Apr 14, 2015

jfindlay added the fixed-pls-verify, Core, severity-medium, and P3 labels and removed the info-needed label Apr 17, 2015

jfindlay modified the milestones: Approved, Blocked Apr 17, 2015

jfindlay removed their assignment Apr 17, 2015

This was referenced Apr 22, 2015

Fixes to scheduler #22945

Merged

Fixes to scheduler #22947

Merged

rallytime added the Bug label May 5, 2015

rallytime closed this as completed May 5, 2015
