minion returns error to CLI on long jobs #26856

Closed
jfindlay opened this issue Sep 2, 2015 · 4 comments
Labels
Bug broken, incorrect, or confusing behavior Core relates to code central or existential to Salt P3 Priority 3 severity-medium 3rd level, incorrect or bad functionality, confusing and lacks a work around
Comments

@jfindlay
Contributor

jfindlay commented Sep 2, 2015

This seems to happen consistently on every command that exceeds the timeout. The VM got into this state after I reinstalled different versions of Salt from a local git checkout with pip install -e. I have tried everything I can think of to remove Python, Salt, and git artifacts: deleting eggs, rm -fr on portions of /usr/lib/python, git clean -dfx, and rm -fr /var/cache/salt.

I eventually gave up and created a new VM, but within a matter of days I hit this problem again, along with #25991. If reinstalling is the cause, I suspect it is due to switching between 2015.5 and 2015.8.

commands

# salt -v jmoney-main cmd.run 'sleep 1024'
Executing job with jid 20150902125814358137
-------------------------------------------

Failed to authenticate! This is most likely because this user is not permitted to execute commands, but there is a small possibility that a disk error occurred (check disk/inode usage).

logs

# salt-master -l debug
...
[DEBUG   ] Sending event - data = {'_stamp': '2015-09-02T18:58:14.358403', 'minions': ['jmoney-main']}
[DEBUG   ] Sending event - data = {'tgt_type': 'glob', 'jid': '20150902125814358137', 'tgt': 'jmoney-main', '_stamp': '2015-09-02T18:58:14.358743', 'user': 'root', 'arg': ['sleep 1024'], 'fun': 'cmd.run', 'minions': ['jmoney-main']}
[INFO    ] User root Published command cmd.run with jid 20150902125814358137
[DEBUG   ] Published command details {'tgt_type': 'glob', 'jid': '20150902125814358137', 'tgt': 'jmoney-main', 'ret': '', 'user': 'root', 'arg': ['sleep 1024'], 'fun': 'cmd.run'}
# salt-minion -l debug
...
[INFO    ] User root Executing command cmd.run with jid 20150902125814358137
[DEBUG   ] Command details {'tgt_type': 'glob', 'jid': '20150902125814358137', 'tgt': 'jmoney-main', 'ret': '', 'user': 'root', 'arg': ['sleep 1024'], 'fun': 'cmd.run'}
[INFO    ] Starting a new job with PID 27687
[INFO    ] Executing command 'sleep 1024' in directory '/root'
[DEBUG   ] output:
[INFO    ] Returning information for job: 20150902124307876576
[DEBUG   ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'jmoney-main', 'tcp://127.0.0.1:4506', 'aes')
[DEBUG   ] Initializing new SAuth for ('/etc/salt/pki/minion', 'jmoney-main', 'tcp://127.0.0.1:4506')
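
One detail in the minion log above may be worth checking: the jid in the "Returning information for job" line differs from the jid that was published. The sketch below decodes both, assuming jids use Salt's default layout of a local timestamp with microsecond precision (YYYYMMDDhhmmssffffff). The helper name jid_to_datetime is hypothetical and is not part of Salt.

```python
from datetime import datetime


def jid_to_datetime(jid):
    """Decode a jid, assuming the default YYYYMMDDhhmmssffffff layout."""
    # First 14 digits: date and time to the second; next 6 digits: microseconds.
    stamp = datetime.strptime(jid[:14], "%Y%m%d%H%M%S")
    return stamp.replace(microsecond=int(jid[14:20]))


# jid reported by the CLI and the master log
published = jid_to_datetime("20150902125814358137")
# jid in the minion's "Returning information for job" line
returned = jid_to_datetime("20150902124307876576")

print("published:", published)
print("returned: ", returned)
print("gap in seconds:", (published - returned).total_seconds())
```

Under that assumption, the return the minion logs belongs to a job started roughly 15 minutes before the one shown on the CLI, so it is probably the return of an earlier long-running command rather than of the sleep 1024 that was just published.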

versions

# salt --versions
Salt Version:
           Salt: 2015.8.0rc3-179-g3cc84ec

Dependency Versions:
         Jinja2: 2.7.3
       M2Crypto: 0.21.1
           Mako: 1.0.0
         PyYAML: 3.11
          PyZMQ: 14.4.0
         Python: 2.7.9 (default, Mar  1 2015, 12:57:24)
           RAET: Not Installed
        Tornado: 4.2.1
            ZMQ: 4.0.5
           cffi: 0.8.6
       cherrypy: Not Installed
       dateutil: Not Installed
          gitdb: 0.5.4
      gitpython: Not Installed
          ioflo: Not Installed
        libnacl: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.4.2
   mysql-python: Not Installed
      pycparser: 2.10
       pycrypto: 2.6.1
         pygit2: Not Installed
   python-gnupg: Not Installed
          smmap: 0.8.2
        timelib: Not Installed

System Versions:
           dist: debian 8.1
        machine: x86_64
        release: 3.16.0-4-amd64
         system: debian 8.1
@jfindlay jfindlay added Bug broken, incorrect, or confusing behavior severity-medium 3rd level, incorrect or bad functionality, confusing and lacks a work around Core relates to code central or existential to Salt P3 Priority 3 labels Sep 2, 2015
@jfindlay jfindlay added this to the Approved milestone Sep 2, 2015
@cachedout
Contributor

Can this problem be replicated just by creating a VM and immediately switching from one version of Salt to another? It sounds like this happened after a few days, so I think we need to narrow the possible causes down before we can do much work here.

@cachedout
Contributor

@jfindlay I'm not sure where to go with this one. Have you seen anything like this since your initial report? I'm wondering if we should keep this open or not.

@jfindlay
Contributor Author

@cachedout, no. I think it could be an issue with Python's caching of dependencies, because that is the only thing I didn't clean out while trying to narrow down or resolve this problem.
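
For anyone who wants to test that theory, here is a minimal sketch. It assumes "caching" means compiled bytecode, which is only one possible reading. On Python 2.7, a .pyc file left beside a deleted or renamed .py file is still importable, so switching between branches can leave old modules behind. The function name find_orphan_pyc is hypothetical, and the root directory is whatever checkout or site-packages path you want to inspect.

```python
import os
import sys


def find_orphan_pyc(root):
    """Return .pyc files under root that have no sibling .py source."""
    orphans = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip __pycache__: bytecode stored there is ignored when the
        # source file is missing, so it cannot shadow anything.
        dirnames[:] = [d for d in dirnames if d != "__pycache__"]
        sources = set(filenames)
        for name in filenames:
            # "foo.pyc"[:-1] == "foo.py"
            if name.endswith(".pyc") and name[:-1] not in sources:
                orphans.append(os.path.join(dirpath, name))
    return sorted(orphans)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python find_orphans.py <directory>
    for path in find_orphan_pyc(sys.argv[1]):
        print(path)
```

This only lists candidates and deletes nothing. An empty result would suggest that stale bytecode is not the cause.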

@cachedout
Contributor

@jfindlay Thanks.
