14.04 / salt '*' test.ping Failed to authenticate, is this user permitted to execute commands? #12248

Closed
kiorky opened this issue Apr 24, 2014 · 135 comments
Labels: Bug (broken, incorrect, or confusing behavior), severity-low (4th level, cosmetic problems, work around exists)

Comments

@kiorky
Contributor

kiorky commented Apr 24, 2014

Since my upgrade to 14.04 and the sync of the @makinacorpus fork to the latest develop, something is weird.

The salt master intermittently stops working entirely: for a period of 1 to 2 minutes I can't ping any minion, for the next 2 minutes ping works, and 2 minutes after that it fails again.

 mastersalt '*' test.ping
Failed to authenticate, is this user permitted to execute commands?

I rebuilt all the locally installed eggs to be sure of the linking, but this does not seem to be sufficient.

I'll check tomorrow to see whether one of our maintenance crons is responsible, but I suspect yet another regression.

@kiorky kiorky changed the title salt '*' test.ping Failed to authenticate, is this user permitted to execute commands? 14.04 / salt '*' test.ping Failed to authenticate, is this user permitted to execute commands? Apr 24, 2014
@kiorky
Contributor Author

kiorky commented Apr 24, 2014

This morning everything is operating correctly. I prefer to leave this bug open for the moment, but for now it looks more like a temporary installation problem, since during this upgrade:

  • DNS changed
  • IP changed
  • OS changed
  • Salt was upgraded

So I also suspect a DNS propagation / IP routing propagation problem...

So this may be just a false positive and more of a big PEBCAK :)

/cc @regilero

@phaf
Contributor

phaf commented Apr 24, 2014

Please tell us if you have new ideas about what might have caused your problems. I have also seen the behaviour you describe in my setup (Debian 7, packages from the Salt repo), but sometimes it works for a whole week without any problems.

@excavador

#12246

@basepi
Contributor

basepi commented Apr 24, 2014

We'll leave this open for now, but if it doesn't pop up, let's close it in a few days.

@basepi basepi added this to the Blocked milestone Apr 24, 2014
@basepi
Contributor

basepi commented Apr 24, 2014

Also, unless I'm missing something 14.04 is not a valid salt version. Or it's very very old if it's from the 0.14 release. Did you mean 2014.1.3?

@kiorky
Contributor Author

kiorky commented Apr 24, 2014

@basepi, Ubuntu 14.04 ;)

@kiorky
Contributor Author

kiorky commented Apr 24, 2014

I mentioned that I used the latest salt/develop tip yesterday... so f43dfd5

@basepi
Contributor

basepi commented Apr 25, 2014

@kiorky Wow, that makes wayyyyy more sense. Thanks.

@cachedout
Contributor

@kiorky We were going to wait and see if you saw this again. Is this safe to close now?

@ifnull

ifnull commented Jun 9, 2014

@cachedout I'm seeing this on Trusty 14.04 with Salt 2014.1.4 as well.

Everything works fine on the first box (using VirtualBox), but when I test the box I have packaged from this VM, I see the same error after running salt '*' state.clear_cache; salt-call -l debug state.highstate;. What is interesting is that if I try the highstate command a couple minutes later I do not see the error.

Let me know if there is any information that I can provide to help debug this.

@cachedout
Contributor

@ifnull Thanks for letting us know. We'll keep this open and try to figure out what's going on here.

@ifnull

ifnull commented Jun 9, 2014

@danielfrg
Contributor

I am having the same problem mentioned here.

@ifnull

ifnull commented Jun 16, 2014

Looks like it is failing here:
https://github.com/saltstack/salt/blob/develop/salt/client/__init__.py#L212

There was a recent commit that mentions that there are other reasons for failure here besides authentication:
4ba2094

I was able to reduce the problem to the following command.

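import salt.client
# Run on the master: LocalClient publishes the job to the targeted minions.
client = salt.client.LocalClient()
# Publish a highstate to every minion; this is the call that intermittently fails.
client.cmd(tgt='*', fun='state.highstate', arg=[], timeout=5, expr_form='glob', ret='')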

In this example, pub_data returns nothing.

root@vagrant:/var/www# salt --versions-report
           Salt: 2014.1.4
         Python: 2.7.6 (default, Mar 22 2014, 22:59:56)
         Jinja2: 2.7.2
       M2Crypto: 0.21.1
 msgpack-python: 0.3.0
   msgpack-pure: Not Installed
       pycrypto: 2.6.1
         PyYAML: 3.10
          PyZMQ: 14.0.1
            ZMQ: 4.0.4

@ifnull

ifnull commented Jun 17, 2014

@cachedout @basepi From what I can tell, the problem occurs in the following lines of code. payload appears to be set to an empty string which causes pub() to return an empty string and subsequently fail the _check_pub_data check.

https://github.com/saltstack/salt/blob/develop/salt/client/__init__.py#L1366-L1371

Payload is unpredictably being set to an empty string:

(Pdb++) payload = sreq.send(payload_kwargs)  
(Pdb++) print payload

(Pdb++) payload = sreq.send(payload_kwargs)  
(Pdb++) payload = sreq.send(payload_kwargs)
(Pdb++) print payload
{'load': {'jid': '20140617011356196301', 'minions': ['minion']}, 'enc': 'clear'}
(Pdb++) payload = sreq.send(payload_kwargs)
(Pdb++) print payload
{'load': {'jid': '20140617011421541371', 'minions': ['minion']}, 'enc': 'clear'}
(Pdb++) payload = sreq.send(payload_kwargs)
(Pdb++) print payload
{'load': {'jid': '20140617011424300494', 'minions': ['minion']}, 'enc': 'clear'}
(Pdb++) payload = sreq.send(payload_kwargs)
(Pdb++) print payload

(Pdb++) payload = sreq.send(payload_kwargs)
(Pdb++) print payload
{'load': {'jid': '20140617011433539494', 'minions': ['minion']}, 'enc': 'clear'}
(Pdb++) payload = sreq.send(payload_kwargs)
(Pdb++) print payload
{'load': {'jid': '20140617011437036333', 'minions': ['minion']}, 'enc': 'clear'}
(Pdb++) payload = sreq.send(payload_kwargs)
(Pdb++) print payload
{'load': {'jid': '20140617011441100207', 'minions': ['minion']}, 'enc': 'clear'}
(Pdb++) payload = sreq.send(payload_kwargs)
(Pdb++) print payload

(Pdb++) 
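
To put a number on how often the reply comes back empty, the repeated calls in the debugger session above could be scripted. This is only a minimal sketch: `send` is assumed to be any zero-argument callable standing in for `sreq.send(payload_kwargs)`, and both `count_empty_replies` and `fake_send` are hypothetical names that are not part of Salt.

```python
import itertools


def count_empty_replies(send, attempts=10):
    """Call send() repeatedly and return (empty, total).

    A reply counts as empty when it is falsy ('', None or {}), which is
    the case that later fails the publish check.
    """
    empty = 0
    for _ in range(attempts):
        reply = send()
        if not reply:
            empty += 1
    return empty, attempts


# Hypothetical stand-in for sreq.send(payload_kwargs): every third call
# returns an empty string, the others return a clear-text payload.
_replies = itertools.cycle(["", {"enc": "clear"}, {"enc": "clear"}])


def fake_send():
    return next(_replies)


if __name__ == "__main__":
    empty, total = count_empty_replies(fake_send, attempts=9)
    print("%d of %d replies were empty" % (empty, total))
```

On an affected master, the fake sender would be replaced with a small wrapper around the real request object.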

@basepi
Contributor

basepi commented Jun 17, 2014

Very strange, thanks for the updates!

@ifnull

ifnull commented Jun 18, 2014

I no longer believe this is related to ZeroMQ 4. I downgraded to ZeroMQ 3 and had the same result.

Uninstall ZMQ

cd /tmp
apt-get remove libzmq3 libzmq3-dev -y
pip uninstall pyzmq -y

Install ZMQ 3

wget http://download.zeromq.org/zeromq-3.2.4.tar.gz
tar xvzf zeromq-3.2.4.tar.gz
cd ./zeromq-3.2.4
./configure
make
make install
ldconfig
pip install pyzmq --install-option="--zmq=/usr/local/"
cd ../
rm -fR ./zeromq-3*

Confirm

salt --versions-report
           Salt: 2014.1.4
         Python: 2.7.6 (default, Mar 22 2014, 22:59:56)
         Jinja2: 2.7.2
       M2Crypto: 0.21.1
 msgpack-python: 0.3.0
   msgpack-pure: Not Installed
       pycrypto: 2.6.1
         PyYAML: 3.10
          PyZMQ: 14.3.1
            ZMQ: 3.2.4

Notes

  • libzmq3 installed via apt is actually ZMQ 4.0.4
  • If you do not build ZMQ from source, pyzmq will try to build ZMQ itself.
  • easy_install will use the egg which does not contain ZMQ 3
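
Because the package name does not match the library version it ships, it can help to ask pyzmq directly which libzmq it loads. A small sketch using `zmq.pyzmq_version()` and `zmq.zmq_version()`; the wrapper name `zmq_versions` is hypothetical, and it returns None when pyzmq cannot be imported.

```python
def zmq_versions():
    """Return the pyzmq and libzmq versions this interpreter loads.

    Returns None when pyzmq is not importable.
    """
    try:
        import zmq
    except ImportError:
        return None
    return {"pyzmq": zmq.pyzmq_version(), "libzmq": zmq.zmq_version()}


if __name__ == "__main__":
    print(zmq_versions())
```

This should report the same PyZMQ and ZMQ values that salt --versions-report prints, provided it is run with the interpreter Salt uses.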

References

@gravyboat
Contributor

I am not encountering this error on the following Ubuntu 14.04 configuration inside an LXC:

           Salt: 2014.1.3
         Python: 2.7.6 (default, Mar 22 2014, 22:59:56)
         Jinja2: 2.7.2
       M2Crypto: 0.21.1
 msgpack-python: 0.3.0
   msgpack-pure: Not Installed
       pycrypto: 2.6.1
         PyYAML: 3.10
          PyZMQ: 14.0.1
            ZMQ: 4.0.4

@ifnull

ifnull commented Jun 18, 2014

Just tried with 2014.1.3 and I'm still seeing the issue. It may be worth noting that I am using GitFS. I'm going to run a couple tests with GitFS excluded.

@ifnull

ifnull commented Jun 18, 2014

@thatch45 made a commit (4ba2094) that indicates this may be related to inode usage. Maybe he could shed some light here.
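
To rule out the disk/inode explanation quickly, block and inode usage can be read with `os.statvfs`. A minimal sketch for Unix systems; `usage_percent` is a hypothetical helper, and checking `/` is an assumption, so pass whichever directory the master actually writes to.

```python
import os


def usage_percent(path="/"):
    """Return (block_pct, inode_pct) used on the filesystem holding path.

    Either value is None when the filesystem does not report a total
    (some filesystems report zero inodes).
    """
    st = os.statvfs(path)
    block_pct = None
    if st.f_blocks:
        block_pct = 100.0 * (st.f_blocks - st.f_bfree) / st.f_blocks
    inode_pct = None
    if st.f_files:
        inode_pct = 100.0 * (st.f_files - st.f_ffree) / st.f_files
    return block_pct, inode_pct


if __name__ == "__main__":
    blocks, inodes = usage_percent("/")
    print("blocks used: %s%%, inodes used: %s%%" % (blocks, inodes))
```

Values close to 100 for either figure would make the disk error explanation more plausible than an authentication problem.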

@gravyboat
Contributor

@danielfrg can you tell us what your setup looks like? Are you running this inside of virtualbox, vagrant, etc.? What does your output of --versions-report show?

@ifnull

ifnull commented Jun 18, 2014

Solution / Workaround

The issue I was having was resolved by changing the number of VirtualBox CPUs from 2 to 1.

Notes

I'm thinking this is a VirtualBox issue. From what I have been reading, there currently isn't a performance gain from using more than 1 CPU with VirtualBox.

References

https://www.virtualbox.org/ticket/5957
https://ruin.io/2014/05/05/benchmarking-virtualbox-multiple-core-performance/

@danielfrg
Contributor

I saw the issue on EC2. I was updating a master instance from 12.04 to 14.04 and saw the Failed to authenticate, is this user permitted to execute commands? error when doing test.ping and state.highstate. Similar to some people it works fine and then "randomly" shows the error again, and after a couple of minutes it works again.

I tried on 13.10 and everything is working fine. I could try to reproduce the error if needed, but it was just a base Ubuntu 14.04 image on EC2.

@ifnull

ifnull commented Jun 19, 2014

@danielfrg what EC2 instance type were you using?

@danielfrg
Contributor

m3.large both master and minions

@basepi
Contributor

basepi commented Sep 23, 2014

@cackovic This is the behavior most people saw originally in this issue, and should be fixed in an upcoming release of salt (the fix was in the bootstrap script, and I don't think we got the latest bootstrap version into 2014.1.11, but it will be in the next release after that.)

@cackovic

@basepi thanks for the fast update. IMHO this bug is bad enough for a hotfix :)

@basepi
Contributor

basepi commented Sep 24, 2014

@cackovic Just to verify, did you install via bootstrap or manually?

@cackovic

@basepi It was with the bootstrap script. I reinstalled manually and it seems to be OK.

@basepi
Contributor

basepi commented Sep 25, 2014

Awesome, glad you got around the issue. The latest bootstrap script has the fix, and the next release of salt should have the latest bootstrap script.

@jessehu

jessehu commented Oct 9, 2014

Hi @cachedout, saltstack/salt-bootstrap#445 is marked as merged, but the file https://github.com/cachedout/salt-bootstrap/blob/a29dc8e519ada8acff9d15efd87290958321d60a/bootstrap-salt.sh is different from https://bootstrap.saltstack.com . I'm using "wget -q -O - https://bootstrap.saltstack.com | sh -s -- -M -X" to install salt-master on Debian 7.6 and get the same error "Failed to authenticate, is this user permitted to execute commands?" when running "sudo salt '*' state.highstate".

@basepi
Contributor

basepi commented Oct 9, 2014

@s0undt3ch any idea why bootstrap.saltstack.com doesn't have that commit? Did it get changed again?

@basepi
Contributor

basepi commented Oct 9, 2014

Oh, wait, maybe it was just never merged....

@rallytime
Contributor

@jessehu - so the script at https://bootstrap.saltstack.com runs off of the stable branch of https://github.com/saltstack/salt-bootstrap. saltstack/salt-bootstrap#445 has been merged into the develop branch of salt-bootstrap, but the stable branch has not yet been updated with this change.

@zesty

zesty commented Oct 16, 2014

+1 for fixing stable sooner rather than later

@basepi
Contributor

basepi commented Oct 24, 2014

I wouldn't be surprised if it is updated by now. @s0undt3ch or @rallytime may know.

@douglasryanadams

Similar issues from a host that's a master and a minion. When I try to run

$: sudo salt '*' state.highstate 

I get the Authentication error everyone else mentions. When I try to run

$: sudo salt-call state.highstate

I get this:

[INFO    ] Loading fresh modules for state activity
local:
    Data failed to compile:
----------
    Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/salt/state.py", line 2498, in call_highstate
    top = self.get_top()
  File "/usr/lib/python2.7/dist-packages/salt/state.py", line 2042, in get_top
    tops = self.get_tops()
  File "/usr/lib/python2.7/dist-packages/salt/state.py", line 1915, in get_tops
    saltenv
  File "/usr/lib/python2.7/dist-packages/salt/fileclient.py", line 143, in cache_file
    return self.get_url(path, '', True, saltenv)
  File "/usr/lib/python2.7/dist-packages/salt/fileclient.py", line 503, in get_url
    return self.get_file(url, dest, makedirs, saltenv)
  File "/usr/lib/python2.7/dist-packages/salt/fileclient.py", line 919, in get_file
    data = channel.send(load)
  File "/usr/lib/python2.7/dist-packages/salt/transport/__init__.py", line 91, in send
    return self._crypted_transfer(load, tries, timeout)
  File "/usr/lib/python2.7/dist-packages/salt/transport/__init__.py", line 83, in _crypted_transfer
    return _do_transfer()
  File "/usr/lib/python2.7/dist-packages/salt/transport/__init__.py", line 77, in _do_transfer
    timeout)
  File "/usr/lib/python2.7/dist-packages/salt/crypt.py", line 505, in loads
    data = self.decrypt(data)
  File "/usr/lib/python2.7/dist-packages/salt/crypt.py", line 482, in decrypt
    raise AuthenticationError('message authentication failed')
AuthenticationError: message authentication failed

And I'm sure my key is accepted already and that I'm running as sudo.


EDIT:
We discovered that this state was caused by our automated deployment code attempting to use 'service' to restart a master that had been started with '/etc/init.d/', which ended up starting two salt-masters. We fixed this by sticking with '/etc/init.d/' for everything.
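
A rough way to spot this situation is to look for more than one independent salt-master process tree. The sketch below parses text in the format of `ps -eo pid,ppid,args` and returns the matching processes whose parent is not itself a salt-master. The function name and the sample listing are made up for illustration, and matching on the string `salt-master` in the command line is an assumption.

```python
def find_master_roots(ps_output, needle="salt-master"):
    """Return PIDs of matching processes whose parent is not also a match.

    A single running master normally yields one root even though it has
    several child processes; two or more roots suggest that independent
    masters were started.
    """
    procs = {}
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        if len(parts) < 3 or not parts[0].isdigit() or not parts[1].isdigit():
            continue  # skip the header and malformed lines
        pid, ppid, args = int(parts[0]), int(parts[1]), parts[2]
        if needle in args:
            procs[pid] = ppid
    return sorted(pid for pid, ppid in procs.items() if ppid not in procs)


# Made-up listing with two separate masters (roots 812 and 2044).
SAMPLE = """\
  PID  PPID COMMAND
  812     1 /usr/bin/python /usr/bin/salt-master
  820   812 /usr/bin/python /usr/bin/salt-master
  821   812 /usr/bin/python /usr/bin/salt-master
 2044     1 /usr/bin/python /usr/bin/salt-master
 2051  2044 /usr/bin/python /usr/bin/salt-master
"""

if __name__ == "__main__":
    print(find_master_roots(SAMPLE))
```

On a real host, the sample would be replaced with the captured output of `ps -eo pid,ppid,args`. Counting roots rather than processes matters because one healthy master already runs several processes.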

@starchy

starchy commented Dec 6, 2014

I'm seeing this predictably with 2014.7 on Debian Wheezy when running an orchestration that kicks off a backup script with cmd.run. However, running cmd.run against the same minions succeeds without error.

# salt-run state.orchestrate orchestration.rdiff-backup
[ERROR ] An exception occurred in this state: Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/salt/state.py", line 1533, in call
    **cdata['kwargs'])
  File "/usr/lib/python2.7/dist-packages/salt/states/saltmod.py", line 354, in function
    cmd_ret = __salt__['saltutil.cmd'](tgt, fun, **cmd_kw)
  File "/usr/lib/python2.7/dist-packages/salt/modules/saltutil.py", line 675, in cmd
    client, tgt, fun, arg, timeout, expr_form, ret, kwarg, **kwargs)
  File "/usr/lib/python2.7/dist-packages/salt/modules/saltutil.py", line 643, in _exec
    tgt, fun, arg, timeout, expr_form, ret, kwarg, **kwargs):
  File "/usr/lib/python2.7/dist-packages/salt/client/__init__.py", line 643, in cmd_iter
    **kwargs):
  File "/usr/lib/python2.7/dist-packages/salt/client/__init__.py", line 915, in get_iter_returns
    jinfo = self.gather_job_info(jid, tgt, tgt_type)
  File "/usr/lib/python2.7/dist-packages/salt/client/__init__.py", line 209, in gather_job_info
    timeout=timeout,
  File "/usr/lib/python2.7/dist-packages/salt/client/__init__.py", line 286, in run_job
    return self._check_pub_data(pub_data)
  File "/usr/lib/python2.7/dist-packages/salt/client/__init__.py", line 221, in _check_pub_data
    'Failed to authenticate! This is most likely because this '
EauthAuthenticationError: Failed to authenticate! This is most likely because this user is not permitted to execute commands, but there is a small possibility that a disk error occurred (check disk/inode usage).
[...]

@basepi
Contributor

basepi commented Dec 8, 2014

@starchy That actually sounds like a different issue. Mind opening a new issue?

@kitplummer

I'm seeing this message:

[root@ip-172-31-6-75 master]# salt '*' test.ping
[ERROR ] Salt request timed out. If this error persists, worker_threads may need to be increased.
Failed to authenticate! This is most likely because this user is not permitted to execute commands, but there is a small possibility that a disk error occurred (check disk/inode usage).

This is with CentOS 7 for both the Master and salt-cloud provisioned instances in EC2.

@basepi
Contributor

basepi commented Jan 5, 2015

@kitplummer salt --versions-report please?

@cachedout
Contributor

Are there any cases currently where people can reliably reproduce the error Failed to authenticate, is this user permitted to execute commands? I would love to have something I can look at, if anybody can point me in that direction.

@saltuser

Hi!

We hit this message yesterday. The reason seems to be that restarting salt master somehow left one process hanging.
After killing the hung/leftover process manually (and cruelly) the message disappeared.

salt --versions-report
Salt: 2014.7.0
Python: 2.7.3 (default, Mar 13 2014, 11:03:55)
Jinja2: 2.6
M2Crypto: 0.21.1
msgpack-python: 0.1.10
msgpack-pure: Not Installed
pycrypto: 2.6
libnacl: Not Installed
PyYAML: 3.10
ioflo: Not Installed
PyZMQ: 13.1.0
RAET: Not Installed
ZMQ: 3.2.3
Mako: 0.7.0

It's not much, but i hope it helps (and keeps the conversation going) :)

@basepi
Contributor

basepi commented Jan 13, 2015

Interesting, thanks for the update @saltuser. I'm sure the extra process didn't suffer. ;)

@nacengineer

@cachedout I can confirm this behavior happens when you change the number of CPUs on a Virtual Machine.

Basically, I bumped the processors to 2 sockets with 4 cores per socket; when I bumped it back to the original 1/2, it worked again.

LSB Release

No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 14.04.1 LTS
Release:        14.04
Codename:       trusty

Uname Info

3.13.0-45-generic #74-Ubuntu SMP Tue Jan 13 19:36:28 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
           Salt: 2014.1.1
         Python: 2.7.6 (default, Mar 22 2014, 22:59:56)
         Jinja2: 2.7.2
       M2Crypto: 0.21.1
 msgpack-python: 0.1.10
   msgpack-pure: Not Installed
       pycrypto: 2.6.1
         PyYAML: 3.10
          PyZMQ: 14.0.1
            ZMQ: 4.0.4

running on VMware ESXi, 5.0.0, 623860

UPDATE:

I dug a little further. Most of my minions were installed from the PPA and so were on Salt 2014.7.1, while my master was not and was therefore on the version above.

I upgraded the master to the PPA version, and now that it's on 2014.7.1 too, I can confirm that this issue clears up. 😄
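
A mismatch like this is easy to miss when comparing reports by eye. As a minimal sketch, the text of two `--versions-report` outputs can be parsed and the Salt lines compared. Both helper names are hypothetical, and the parsing only assumes the `Name: value` layout seen in the reports in this thread.

```python
def parse_versions_report(text):
    """Parse versions-report text into a {name: version} dict."""
    versions = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip():
            versions[name.strip()] = value.strip()
    return versions


def salt_version_mismatch(master_report, minion_report):
    """Return (master, minion) Salt versions if they differ, else None."""
    master = parse_versions_report(master_report).get("Salt")
    minion = parse_versions_report(minion_report).get("Salt")
    if master != minion:
        return master, minion
    return None


if __name__ == "__main__":
    master = "           Salt: 2014.1.1\n          PyZMQ: 14.0.1\n"
    minion = "           Salt: 2014.7.1\n"
    print(salt_version_mismatch(master, minion))
```

With the versions reported above (master on 2014.1.1, minions on 2014.7.1), the helper returns the differing pair instead of None.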

@s0undt3ch
Member

Sorry for the slow response. As @rallytime pointed out, https://bootstrap.saltstack.com uses the stable branch of the repository. I try to merge the develop branch as often as possible with the new fixes. If you want to use the development version of the bootstrap script, please use https://bootstrap.saltstack.com/develop

@cachedout
Contributor

This is a loooong issue. :]

At this point, I think we're more or less at the point where the issue as it was originally reported has been traced back to lingering salt-master procs left behind by an incomplete restart. Since it looks like that issue has been addressed via bootstrap and (hopefully) packaging changes, I am going to finally go ahead and close this.

If people have lingering issues that are reproducible, please open them as separate issues and we will track them individually going forward instead of having this monolithic issue. Thanks.

@Pratik-Patil

Pratik-Patil commented Jun 9, 2016

I have restarted the salt-master and it worked !!
restart salt-master;
