
salt-call falsely reports a master as down if it does not have PKI directories created #40948

Closed
ScoreUnder opened this issue Apr 28, 2017 · 8 comments
Labels
Bug (broken, incorrect, or confusing behavior) · Core (relates to code central or existential to Salt) · P4 (Priority 4) · severity-low (4th level, cosmetic problems, workaround exists) · stale

Comments

@ScoreUnder
Contributor

ScoreUnder commented Apr 28, 2017

Description of Issue/Question

After the setup described in the setup section, salt-call will give the following output:

[DEBUG   ] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'hawaii.<company tld>', 'tcp://<ip 1>:4506')
[INFO    ] Master salt.<company tld> could not be reached, trying next master (if any)
[WARNING ] Master ip address changed from <ip 1> to <ip 2>
[DEBUG   ] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'hawaii.<company tld>', 'tcp://<ip 2>:4506')

IPs/hosts scrubbed. Note that while I have not included the next few lines of the log, the next master it tries fails in exactly the same way.

In salt/minion.py, adding these lines:

                     except SaltClientError as exc:
                         last_exc = exc
+                        import traceback
+                        traceback.print_exc()
                         msg = ('Master {0} could not be reached, trying '
                                'next master (if any)'.format(opts['master']))
                         log.info(msg)
                         continue

Causes the real error to appear:

[DEBUG   ] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'hawaii.<company tld>', 'tcp://<ip 1>:4506')
Traceback (most recent call last):
  File "/usr/local/Cellar/saltstack/2016.11.3/libexec/lib/python2.7/site-packages/salt/minion.py", line 563, in eval_master
    pub_channel = salt.transport.client.AsyncPubChannel.factory(opts, **factory_kwargs)
  File "/usr/local/Cellar/saltstack/2016.11.3/libexec/lib/python2.7/site-packages/salt/transport/client.py", line 162, in factory
    return salt.transport.zeromq.AsyncZeroMQPubChannel(opts, **kwargs)
  File "/usr/local/Cellar/saltstack/2016.11.3/libexec/lib/python2.7/site-packages/salt/transport/zeromq.py", line 298, in __init__
    self.auth = salt.crypt.AsyncAuth(self.opts, io_loop=self.io_loop)
  File "/usr/local/Cellar/saltstack/2016.11.3/libexec/lib/python2.7/site-packages/salt/crypt.py", line 342, in __new__
    new_auth.__singleton_init__(opts, io_loop=io_loop)
  File "/usr/local/Cellar/saltstack/2016.11.3/libexec/lib/python2.7/site-packages/salt/crypt.py", line 381, in __singleton_init__
    self.get_keys()
  File "/usr/local/Cellar/saltstack/2016.11.3/libexec/lib/python2.7/site-packages/salt/crypt.py", line 634, in get_keys
    salt.utils.verify.check_path_traversal(self.opts['pki_dir'], user)
  File "/usr/local/Cellar/saltstack/2016.11.3/libexec/lib/python2.7/site-packages/salt/utils/verify.py", line 400, in check_path_traversal
    raise SaltClientError(msg)
SaltClientError: Could not access /etc/salt/pki. Path does not exist.
[INFO    ] Master salt.<company tld> could not be reached, trying next master (if any)
[WARNING ] Master ip address changed from <ip 1> to <ip 2>
[DEBUG   ] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'hawaii.<company tld>', 'tcp://<ip 2>:4506')

This is a bug in two parts:

  1. salt-call is not creating the requisite directory structure (though it will happily generate the keys and save them once the directories exist)
  2. The error logged is misleading (claiming that a master cannot be reached, but the actual error is a missing directory on the minion)

I would argue that the error message is the more important part of the bug, because with a correct error message, a sufficiently literate operator can solve the problem by themselves.
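Both parts of the bug could be addressed with small changes. The sketch below is hypothetical, not Salt's actual code: `ensure_pki_dir` covers part 1 (create the missing directory tree before key generation), and `log_master_failure` covers part 2 (include the underlying exception in the log line instead of only claiming the master is unreachable).

```python
import logging
import os

log = logging.getLogger(__name__)


def ensure_pki_dir(pki_dir):
    """Create the minion PKI directory tree if it is missing.

    Hypothetical helper: salt-call already generates and saves keys
    once the directories exist, so creating them up front would be
    enough to fix part 1.
    """
    if not os.path.isdir(pki_dir):
        os.makedirs(pki_dir, mode=0o700)  # keys are private material
    return pki_dir


def log_master_failure(master, exc):
    """Hypothetical fix for part 2: surface the real SaltClientError
    alongside the 'trying next master' message."""
    log.info('Master %s could not be reached, trying next master '
             '(if any): %s', master, exc)
```

With the exception appended to the message, the log would read "Could not access /etc/salt/pki. Path does not exist." instead of implying a network problem.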

Setup

  • Install salt 2016.11.3 via brew on OSX
  • Drop in the minion configs for communicating with the masters
  • Ensure that the /etc/salt/pki directory does not exist (it was not automatically created for me)

Steps to Reproduce Issue

  • salt-call test.ping -l trace

Versions Report

Salt Version:
           Salt: 2016.11.3
 
Dependency Versions:
           cffi: Not Installed
       cherrypy: Not Installed
       dateutil: Not Installed
          gitdb: Not Installed
      gitpython: Not Installed
          ioflo: Not Installed
         Jinja2: 2.9.5
        libgit2: Not Installed
        libnacl: Not Installed
       M2Crypto: Not Installed
           Mako: 1.0.6
   msgpack-pure: Not Installed
 msgpack-python: 0.4.8
   mysql-python: 1.2.5
      pycparser: Not Installed
       pycrypto: 2.6.1
         pygit2: Not Installed
         Python: 2.7.6 (default, Sep  9 2014, 15:04:36)
   python-gnupg: Not Installed
         PyYAML: 3.12
          PyZMQ: 16.0.2
           RAET: Not Installed
          smmap: Not Installed
        timelib: Not Installed
        Tornado: 4.4.2
            ZMQ: 4.2.2
 
System Versions:
           dist:   
        machine: x86_64
        release: 14.0.0
         system: Darwin
        version: 10.10.1 x86_64
@Ch3LL
Contributor

Ch3LL commented Apr 28, 2017

I'm not sure why the /etc/salt/pki directory was not created for you. Are the packages available via brew the same mac packages from repo.saltstack.com? Did you remove the directory after authenticating to the master for some reason?

As for the error, it looks like @terminalmage provided a PR in #40961 if you want to give that a try.

@Ch3LL Ch3LL added Bug broken, incorrect, or confusing behavior Core relates to code central or existential to Salt severity-low 4th level, cosmetic problems, work around exists P4 Priority 4 labels Apr 28, 2017
@Ch3LL Ch3LL added this to the Approved milestone Apr 28, 2017
@terminalmage
Contributor

I could never reproduce this, and I've had a PR open for almost 2 months now. @ScoreUnder can you try the fix in that PR?

@ScoreUnder
Contributor Author

When you say you can't reproduce this, do you mean the pki/minion directory always exists, or that it gets created successfully?

@terminalmage
Contributor

terminalmage commented Jul 10, 2017

Sorry for the delay. So, I was finally able to reproduce, but to do so I had to never have started the salt-minion daemon and accepted the key on the master. Either that, or I had to stop the salt-minion daemon, remove /etc/salt/pki, and delete the minion key from the master using salt-key -d minion-id.

When you say "Drop in the minion configs for communicating with the masters", what exactly do you mean? salt-call (unless run with --local) will attempt to send the return from the job to the master, and it can't do this if the key hasn't been accepted on the master yet. When a salt-minion daemon is involved, the daemon first tries to authenticate to the master. If the master has no accepted key for that minion, the daemon automatically performs the key exchange, then waits 10 seconds and reattempts authentication until the master accepts the key.

With salt-call, however, no such automatic key exchange is performed. salt-call can be used without a daemon, but it is assumed that key exchange has already taken place. If you would like to use salt-call with no daemon, there are two options:

  1. Start the salt-minion daemon initially and accept the key on the master using salt-key.
  2. Generate the keys beforehand using this walkthrough and copy the pre-accepted key to the minion.

Using the second option will let you use salt-call without ever having started a daemon.
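The daemon behavior described above (authenticate, and if the key is not yet accepted, wait 10 seconds and retry) can be sketched as a simple loop. Only the 10-second interval comes from the comment; the function and parameter names are hypothetical, not Salt's actual implementation.

```python
import time

ACCEPT_WAIT = 10  # seconds between attempts, per the behavior described above


def wait_for_key_acceptance(authenticate, max_attempts=None, sleep=time.sleep):
    """Retry authentication until the master accepts the minion key.

    `authenticate` is any callable returning True once the key has been
    accepted on the master. Returns the number of attempts made.
    """
    attempt = 0
    while True:
        attempt += 1
        if authenticate():
            return attempt
        if max_attempts is not None and attempt >= max_attempts:
            raise RuntimeError('master never accepted the minion key')
        sleep(ACCEPT_WAIT)  # back off before reattempting authentication
```

This is the loop salt-call skips: it makes a single attempt and assumes the key exchange already happened, which is why options 1 or 2 are needed first.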

@ScoreUnder
Contributor Author

When you say "Drop in the minion configs for communicating with the masters", what exactly do you mean?

The minion configs only contained the hostnames of the masters to communicate with.

It looks like I might have to go for option 1. Usually we would run test.ping to ensure that the key is registered with the master, and then accept it from there before starting the salt-minion daemon.

@rijnhard

rijnhard commented Jun 1, 2018

Running into the same issue in docker.
I had to get systemd working for the minions though (which was a whole other story) but anyway.

Out of the box it works for centos:7 and opensuse/leap:42, and generates the /etc/salt/pki/minion directories successfully.

It DOESN'T work for Debian 8 or 9, and it's reproducible: each time, the pki folder is not generated.

dockerfiles here:

Debian 9

FROM minimum2scp/systemd-stretch

ENV HOME /root
USER root

RUN apt-get update \
    && apt-get install -y wget vim software-properties-common gnupg

RUN wget -O - https://repo.saltstack.com/apt/debian/9/amd64/latest/SALTSTACK-GPG-KEY.pub | apt-key add - \
    && echo "deb http://repo.saltstack.com/apt/debian/9/amd64/latest stretch main" > /etc/apt/sources.list.d/saltstack.list \
    && apt-get update \
    && apt-get install -y salt-minion

# don't even try to override the entrypoint or cmd, it won't work.

Debian 8

FROM minimum2scp/systemd-jessie

ENV HOME /root
USER root

RUN echo 'debconf debconf/frontend select Noninteractive' | debconf-set-selections \
    && apt-get update \
    && apt-get install -y wget vim

RUN wget -O - https://repo.saltstack.com/apt/debian/8/amd64/latest/SALTSTACK-GPG-KEY.pub | apt-key add - \
    && echo "deb http://repo.saltstack.com/apt/debian/8/amd64/latest jessie main" > /etc/apt/sources.list.d/saltstack.list \
    && apt-get update \
    && apt-get install -y salt-minion

# don't even try to override the entrypoint or cmd, it won't work.
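A quick way to confirm which images are affected is to check for the expected PKI layout after installing salt-minion. This is a hypothetical helper; the paths are assumed from the default minion config (`pki_dir: /etc/salt/pki/minion`).

```python
import os

# Assumed default layout; adjust if pki_dir is overridden in the minion config.
EXPECTED = ['etc/salt/pki', 'etc/salt/pki/minion']


def missing_pki_dirs(root='/'):
    """Return the expected PKI directories that do not exist under root."""
    return [p for p in EXPECTED if not os.path.isdir(os.path.join(root, p))]
```

On the Debian 8/9 containers described above this would return both paths; on centos:7 or opensuse/leap:42 it would return an empty list.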

@rijnhard

rijnhard commented Jun 1, 2018

Just a silly hunch, but could it be a missing dependency?

@stale

stale bot commented Sep 14, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

If this issue is closed prematurely, please leave a comment and we will gladly reopen the issue.

stale bot added the stale label Sep 14, 2019
stale bot closed this as completed Sep 21, 2019