salt-master not functional under 0.17.1 with order_masters: True #8134

Closed
rmrf opened this issue Oct 28, 2013 · 8 comments
Labels
Bug broken, incorrect, or confusing behavior fixed-pls-verify fix is linked, bug author to confirm fix

@rmrf

rmrf commented Oct 28, 2013

Hi guys,

Here is my salt-master with "order_masters: True". It seems the new version 0.17.1 has a bug in handling this mode. By the way, the syndic and minion are also running this version.

# salt-master  --versions-report
           Salt: 0.17.1
         Python: 2.7.3 (default, Aug  1 2012, 05:14:39)
         Jinja2: 2.6
       M2Crypto: 0.21.1
 msgpack-python: 0.1.10
   msgpack-pure: Not Installed
       pycrypto: 2.4.1
         PyYAML: 3.10
          PyZMQ: 13.0.0
            ZMQ: 3.2.2

Below is the output when starting with salt-master -l debug:

........

[DEBUG   ] This salt-master instance has accepted 0 minion keys.
Process MWorker-4:
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/pymodules/python2.7/salt/master.py", line 706, in run
    self.__bind()
  File "/usr/lib/pymodules/python2.7/salt/master.py", line 611, in __bind
    ret = self.serial.dumps(self._handle_payload(payload))
  File "/usr/lib/pymodules/python2.7/salt/master.py", line 635, in _handle_payload
    'clear': self._handle_clear}[key](load)
  File "/usr/lib/pymodules/python2.7/salt/master.py", line 644, in _handle_clear
    return getattr(self.clear_funcs, load['cmd'])(load)
  File "/usr/lib/pymodules/python2.7/salt/master.py", line 1715, in _auth
    if not salt.utils.verify.valid_id(self.opts, load['id']):
  File "/usr/lib/pymodules/python2.7/salt/utils/verify.py", line 453, in valid_id
    return bool(clean_path(opts['pki_dir'], id_))
  File "/usr/lib/pymodules/python2.7/salt/utils/verify.py", line 437, in clean_path
    if not os.path.isabs(path):
  File "/usr/lib/python2.7/posixpath.py", line 53, in isabs
    return s.startswith('/')
AttributeError: 'NoneType' object has no attribute 'startswith'

If I start salt-master with /etc/init.d/salt-master, it leaves these zombie processes behind:

# ps aux | grep salt
root      3081  0.0  0.3 557360 25308 ?        Ssl  02:05   0:00 /usr/bin/python /usr/bin/salt-master
root      3082  0.0  0.3 363240 31360 ?        S    02:05   0:00 /usr/bin/python /usr/bin/salt-master
root      3089  0.0  0.2 295216 19020 ?        Sl   02:05   0:00 /usr/bin/python /usr/bin/salt-master
root      3090  0.0  0.2 295216 18940 ?        Sl   02:05   0:00 /usr/bin/python /usr/bin/salt-master
root      3095  0.0  0.0      0     0 ?        Z    02:05   0:00 [salt-master] <defunct>
root      3096  0.0  0.0      0     0 ?        Z    02:05   0:00 [salt-master] <defunct>
root      3099  0.0  0.0      0     0 ?        Z    02:05   0:00 [salt-master] <defunct>
root      3102  0.0  0.0      0     0 ?        Z    02:05   0:00 [salt-master] <defunct>
root      3103  0.0  0.0      0     0 ?        Z    02:05   0:00 [salt-master] <defunct>

Please kindly help.

Thanks
-Ryan

@basepi
Contributor

basepi commented Oct 29, 2013

Can you test this on the latest develop branch? I think this was related to some changes we made with the ID-guessing in the config, and is fixed in the latest develop.

@ljpsfree

I did some testing.
File "/usr/lib/pymodules/python2.7/salt/master.py", line 611, in __bind
ret = self.serial.dumps(self._handle_payload(payload))

Here, the ID in the payload is None, which causes the crash further down.
The payload is decoded from packets arriving from somewhere; I have no idea where they come from.
I'll try the develop branch.
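
For anyone following along, here is a minimal sketch of the failure mode described above. It is not Salt's actual code: `clean_path_sketch` and `valid_id_sketch` are hypothetical stand-ins for the functions named in the traceback, and the explicit `None` check is only an assumption about how such a payload could be rejected before it reaches `os.path`.

```python
import os


def clean_path_sketch(root, path):
    """Hypothetical stand-in for the path check seen in the traceback.

    Returns the resolved path if it stays under root, else an empty string.
    """
    if not os.path.isabs(root):
        return ''
    if not os.path.isabs(path):
        path = os.path.join(root, path)
    path = os.path.normpath(path)
    root = os.path.normpath(root)
    if path == root or path.startswith(root.rstrip(os.sep) + os.sep):
        return path
    return ''


def valid_id_sketch(pki_dir, id_):
    """Reject a missing id up front instead of letting os.path fail on None."""
    if id_ is None:
        return False
    return bool(clean_path_sketch(pki_dir, id_))
```

With a guard like this, a payload whose id is None would be treated as an invalid id rather than killing the worker process with an unhandled exception.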

@basepi
Contributor

basepi commented Oct 30, 2013

Right, that's why I suspect it's related to the ID guessing bug -- ID shouldn't be None in the develop branch.

@rmrf
Author

rmrf commented Oct 31, 2013

Thanks guys. After setting the id in the configuration files, the issue no longer occurs.

@ljpsfree

@basepi Yes, I saw that code change in the develop branch.

In case other people run into the same problem, I've listed the code change below.

Here's the code in 0.17:
/usr/lib/pymodules/python2.7/salt/config.py line 814

if opts['id'] is None and minion_id:
    opts['id'], using_ip_for_id = get_id(opts['root_dir'])

minion_id is always False (the default value), so get_id is never called.
This has been changed in the develop branch:
/usr/lib/pymodules/python2.7/salt/config.py line 814

if opts['id'] is None:
    opts['id'], using_ip_for_id = get_id(opts['root_dir'],
                                         minion_id=minion_id)

Solution

  1. Update your code to the develop branch.
  2. If you need to stay on 0.17, here's a simple workaround:
    Assign a value to the "id" parameter in the minion config file, which is normally located at /etc/salt/minion.
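
To make the difference between the two conditions quoted above concrete, here is a small sketch. `get_id_stub` is a hypothetical stand-in for Salt's `get_id` (it just returns a fixed name), and the two `resolve_id_*` functions are illustrative wrappers around the quoted `if` statements, not functions that exist in Salt.

```python
def get_id_stub(root_dir, minion_id=False):
    """Hypothetical stand-in for get_id(); returns (id, using_ip_for_id)."""
    return 'guessed-hostname', False


def resolve_id_0_17(opts, minion_id=False):
    # Condition as quoted from 0.17: the guess is skipped when minion_id is False.
    if opts['id'] is None and minion_id:
        opts['id'], _ = get_id_stub(opts['root_dir'])
    return opts['id']


def resolve_id_develop(opts, minion_id=False):
    # Condition as quoted from develop: the guess runs whenever id is unset.
    if opts['id'] is None:
        opts['id'], _ = get_id_stub(opts['root_dir'], minion_id=minion_id)
    return opts['id']


master_opts = {'id': None, 'root_dir': '/'}
print(resolve_id_0_17(dict(master_opts)))     # None
print(resolve_id_develop(dict(master_opts)))  # guessed-hostname
```

Setting "id" explicitly in the config sidesteps both branches, which is presumably why the workaround helps on 0.17.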

@basepi
Contributor

basepi commented Oct 31, 2013

Awesome. I'm going to close this one, keep us posted if it crops up again.

@basepi basepi closed this as completed Oct 31, 2013
@jacksontj
Contributor

I am just in the process of releasing 0.17.1 myself and ran into this bug. I went ahead and cherry-picked (952975e) the "fix" but I'm still getting the stack traces. I then added an id to the minion config and I'm still getting stack traces. Are there other commits to fix this issue that I should pull down? Or are we close enough to the next release that I should just hold on?

Update: never mind, I'm a liar. I forgot to restart our module sync process, which was still running the old version.

@basepi
Contributor

basepi commented Nov 1, 2013

Cool. For the record, the next release is probably a week off. You can decide if that's "close enough". =)
