Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

same error message as on issue #18504 #22571

Closed
BoomerB opened this issue Apr 13, 2015 · 25 comments
Closed

same error message as on issue #18504 #22571

BoomerB opened this issue Apr 13, 2015 · 25 comments
Labels
Bug broken, incorrect, or confusing behavior Core relates to code central or existential to Salt fixed-pls-verify fix is linked, bug author to confirm fix P3 Priority 3 severity-medium 3rd level, incorrect or bad functionality, confusing and lacks a work around
Milestone

Comments

@BoomerB
Copy link

BoomerB commented Apr 13, 2015

Apr 13 10:55:14 chviradmprd12 salt-minion[1382]: [ERROR   ] Exception '_seconds' occurred in scheduled job
Apr 13 10:55:15 chviradmprd12 salt-minion[1382]: [ERROR   ] Exception '_seconds' occurred in scheduled job
Apr 13 10:55:16 chviradmprd12 salt-minion[1382]: [ERROR   ] Exception '_seconds' occurred in scheduled job
Apr 13 10:55:17 chviradmprd12 salt-minion[1382]: [ERROR   ] Exception '_seconds' occurred in scheduled job
@BoomerB
Copy link
Author

BoomerB commented Apr 13, 2015

I have 2 scheduled jobs for this minion (which is also the master)

  push_sudoers:
    function: state.sls
    minutes: 60
    splay: 15
    args:
      - Config_TST.default.scb.sudo.sudoers

  make_sudoers:
    function: state.sls
    seconds: 3600
    splay: 900
    args:
      - Config_TST.default.scb.sudo.make_sudoers

@BoomerB
Copy link
Author

BoomerB commented Apr 13, 2015

dber@chviradmprd12:~> uname -a
Linux chviradmprd12 3.12.39-47-default #1 SMP Thu Mar 26 13:21:16 UTC 2015 (a901594) x86_64 x86_64 x86_64 GNU/Linux
dber@chviradmprd12:~> cat /etc/SuSE-release
SUSE Linux Enterprise Server 12 (x86_64)
VERSION = 12
PATCHLEVEL = 0
dber@chviradmprd12:~> cat /etc/os-release
NAME="SLES"
VERSION="12"
VERSION_ID="12"
PRETTY_NAME="SUSE Linux Enterprise Server 12"
ID="sles"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:suse:sles:12"
dber@chviradmprd12:~> /usr/bin/salt-minion --version
salt-minion 2014.7.2 (Helium)
dber@chviradmprd12:~> rpm -qa | grep salt
salt-minion-2014.7.2-182.11.noarch
salt-master-2014.7.2-182.11.noarch
salt-bash-completion-2014.7.2-182.11.noarch
salt-2014.7.2-182.11.noarch
dber@chviradmprd12:~> zypper repos -e - devel_languages_python
[devel_languages_python]
name=Python Modules (SLE_12)
enabled=1
autorefresh=1
baseurl=http://download.opensuse.org/repositories/devel:/languages:/python/SLE_12/
type=rpm-md
gpgcheck=1
gpgkey=http://download.opensuse.org/repositories/devel:/languages:/python/SLE_12/repodata/repomd.xml.key

@jfindlay jfindlay added the info-needed waiting for more info label Apr 14, 2015
@jfindlay jfindlay added this to the Blocked milestone Apr 14, 2015
@jfindlay jfindlay self-assigned this Apr 14, 2015
@jfindlay
Copy link
Contributor

@BoomerB, thanks for reporting.

#18504, @garethgreenaway.

@BoomerB
Copy link
Author

BoomerB commented Apr 17, 2015

Hi,
sorry for not coming back to you, I was distracted
what kind of info do you need?

@garethgreenaway
Copy link
Contributor

There was a bug I introduced in early versions of 2014.7 if you were using splay without seconds, it's fixed in later versions. Possible for you to upgrade?

@jfindlay jfindlay added fixed-pls-verify fix is linked, bug author to confirm fix Core relates to code central or existential to Salt severity-medium 3rd level, incorrect or bad functionality, confusing and lacks a work around P3 Priority 3 and removed info-needed waiting for more info labels Apr 17, 2015
@jfindlay jfindlay modified the milestones: Approved, Blocked Apr 17, 2015
@jfindlay jfindlay removed their assignment Apr 17, 2015
@ekle
Copy link

ekle commented Apr 20, 2015

error still exists in 2014.7.4

@garethgreenaway
Copy link
Contributor

@ekle Thanks. Will take a look.

@garethgreenaway
Copy link
Contributor

@ekle Can you provide an example schedule job where you're seeing the issue? I'm attempting to duplicate the error using a really simply job with test.ping and unable to replicate the issue. Thanks!

@ekle
Copy link

ekle commented Apr 20, 2015

nothing special, just this in the pillar:

schedule:
  highstate:
    function: state.highstate
    maxrunning: 1
    seconds: 180
    splay: 120

after removing the splay line, it works fine.
with the line i get
2015-04-20 14:54:23,260 [salt.minion ][ERROR ] Exception '_seconds' occurred in scheduled job
master and minion are 2014.7.4 debian versions installed with apt-get on debian jessie from
deb http://debian.saltstack.com/debian jessie-saltstack main

@garethgreenaway
Copy link
Contributor

Thanks. Going to fire up a Docker instance with Debian jessie and test with the packages.

@garethgreenaway
Copy link
Contributor

@ekle If you run the minion in debug mode, before the exception line, do you see a line that begins with 'schedule.handle_func: Adding splay of'?

@ekle
Copy link

ekle commented Apr 21, 2015

yes:

2015-04-21 07:07:24,916 [salt.utils.schedule][DEBUG   ] schedule.handle_func: Adding splay of 9 seconds to next run.
2015-04-21 07:07:24,916 [salt.minion      ][ERROR   ] Exception '_seconds' occurred in scheduled job

i can also reproduce this with test.ping, looks like the function does not matter

@garethgreenaway
Copy link
Contributor

Perfect. Now we know where the exception is happening and can hopefully figure out why. It seems to be happening because the _seconds key doesn't exist in the data dictionary, but it should be there. Let me investigate a bit more. Thanks!

@garethgreenaway
Copy link
Contributor

@ekle Do you have the configuration item loop_interval set in the minion configuration?

@garethgreenaway
Copy link
Contributor

Nevermind. Had a theory but disapproved it.

@garethgreenaway
Copy link
Contributor

@ekle One other question, do you see the exception on the first run? Or is it only on subsequent runs of the jobs?

@ekle
Copy link

ekle commented Apr 22, 2015

loop_interval is set to 25

@ekle
Copy link

ekle commented Apr 22, 2015

@garethgreenaway it looks like the jobs runs a few times correct and that started to fail.
i created a new schedule "ping2" to test it. here are a few lines from the logs:

cat /var/log/salt/minion | egrep "(ping2|_sec)" 
2015-04-22 10:36:10,742 [salt.utils.schedule][INFO    ] Running scheduled job: ping2
2015-04-22 10:36:44,472 [salt.utils.schedule][INFO    ] Running scheduled job: ping2
2015-04-22 10:37:34,809 [salt.utils.schedule][INFO    ] Running scheduled job: ping2
2015-04-22 10:38:30,054 [salt.utils.schedule][INFO    ] Running scheduled job: ping2
2015-04-22 10:39:21,475 [salt.utils.schedule][INFO    ] Running scheduled job: ping2
2015-04-22 10:39:58,318 [salt.utils.schedule][INFO    ] Running scheduled job: ping2
2015-04-22 10:40:40,838 [salt.utils.schedule][INFO    ] Running scheduled job: ping2
2015-04-22 10:41:13,078 [salt.utils.schedule][INFO    ] Running scheduled job: ping2
2015-04-22 10:41:49,277 [salt.utils.schedule][INFO    ] Running scheduled job: ping2
2015-04-22 10:42:44,479 [salt.utils.schedule][INFO    ] Running scheduled job: ping2
2015-04-22 10:43:40,101 [salt.utils.schedule][INFO    ] Running scheduled job: ping2
2015-04-22 10:44:21,056 [salt.utils.schedule][INFO    ] Running scheduled job: ping2
2015-04-22 10:45:11,085 [salt.utils.schedule][INFO    ] Running scheduled job: ping2
2015-04-22 10:45:54,454 [salt.utils.schedule][INFO    ] Running scheduled job: ping2
2015-04-22 10:46:30,623 [salt.utils.schedule][INFO    ] Running scheduled job: ping2
2015-04-22 10:47:20,701 [salt.utils.schedule][INFO    ] Running scheduled job: ping2
2015-04-22 10:47:56,736 [salt.utils.schedule][INFO    ] Running scheduled job: ping2
2015-04-22 10:48:33,736 [salt.utils.schedule][INFO    ] Running scheduled job: ping2
2015-04-22 10:49:55,514 [salt.minion      ][ERROR   ] Exception '_seconds' occurred in scheduled job
2015-04-22 10:50:04,193 [salt.minion      ][ERROR   ] Exception '_seconds' occurred in scheduled job
2015-04-22 10:50:29,199 [salt.minion      ][ERROR   ] Exception '_seconds' occurred in scheduled job
...

i'm not quite sure, but i think it started to fail after a saltutil.sync_all or saltutil.refresh_pillar run.

@garethgreenaway
Copy link
Contributor

Can you try disabling loop_interval? And ill do some tests with refresh_pillar.

@ekle
Copy link

ekle commented Apr 22, 2015

disabling loop_interval does not help

@garethgreenaway
Copy link
Contributor

Finally able to duplicate it. After running the saltutil.sync_all I'm seeing the issue.

@garethgreenaway
Copy link
Contributor

Actually refresh_pillar seems to be the culprit. Thanks for mentioning that! It looks like the job was remaining in the job queue but all the data was being refreshed by the refresh_pillar call, so my assumptions about what was in there were incorrect. Think I've got it figured out, few more tests and I'll push up a PR with the fix. Thanks again!

This was referenced Apr 22, 2015
@rallytime
Copy link
Contributor

@ekle Can you confirm if the pull requests linked above fixed this issue for you or not?

@ekle
Copy link

ekle commented May 5, 2015

installed 2015.2.0rc2-226-gb3c50c2 looks good

@rallytime rallytime added the Bug broken, incorrect, or confusing behavior label May 5, 2015
@rallytime
Copy link
Contributor

Great! I am going to close this then. If it pops up again, leave a comment and we can open this again and address any additional issues. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug broken, incorrect, or confusing behavior Core relates to code central or existential to Salt fixed-pls-verify fix is linked, bug author to confirm fix P3 Priority 3 severity-medium 3rd level, incorrect or bad functionality, confusing and lacks a work around
Projects
None yet
Development

No branches or pull requests

5 participants