salt.states.service: detect that service failed to start/stop #36607

Merged
merged 1 commit into from
Sep 28, 2016
@vutny vutny commented Sep 27, 2016

What does this PR do?

It checks whether a service was actually started by the service.running state, or stopped (killed) by the service.dead state, and makes Salt report False if the desired result was not achieved.
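The idea can be illustrated with a minimal sketch. This is not the code from this PR: `ensure_running`, its `status` and `start` callables, and the comment strings are hypothetical stand-ins for the state function and the `service.status` / `service.start` execution functions. The point is only that the status is queried again after the start attempt instead of trusting the return value of the start call.

```python
def ensure_running(name, status, start):
    """Hypothetical sketch of a service.running-style check.

    `status` and `start` are injected callables standing in for
    service.status and service.start; they are assumptions made
    for illustration, not Salt's real API.
    """
    ret = {"name": name, "changes": {}, "result": True, "comment": ""}

    # Nothing to do if the service is already up.
    if status(name):
        ret["comment"] = "The service {0} is already running".format(name)
        return ret

    start_ok = start(name)

    # Do not trust the return value of start alone: query the status again.
    if not start_ok or not status(name):
        ret["result"] = False
        ret["comment"] = "Service {0} failed to start".format(name)
        return ret

    ret["changes"] = {name: True}
    ret["comment"] = "Started service {0}".format(name)
    return ret
```

With a `start` callable that returns True while `status` keeps returning False, as in the zookeeper example in this description, the sketch yields `result: False`.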

What issues does this PR fix or reference?

It fixes issues #16677 and #33540.

Previous Behavior

Example of the unexpected side effects
The service is down:

salt-call service.status zookeeper
local:
    False

An attempt to start the service reports success:

salt-call service.start zookeeper
local:
    True

But because of a bad config file or missing dependencies, the service actually failed to start:

salt-call service.status zookeeper
local:
    False

Unexpectedly, Salt thinks that the service is up and running:

# salt-call -l debug state.sls_id zookeeper-service zookeeper.server
...

[DEBUG   ] Rendered data from file: /var/cache/salt/minion/files/base/zookeeper/server/init.sls:

....

zookeeper-service:
  service.running:
    - name: zookeeper
    - enable: true

...

[INFO    ] Running state [zookeeper] at time 14:17:20.869420
[INFO    ] Executing state service.running for zookeeper
[DEBUG   ] LazyLoaded cmd.run
[INFO    ] Executing command ['systemctl', 'status', 'zookeeper.service', '-n', '0'] in directory '/root'
[DEBUG   ] output: * zookeeper.service - Apache Zookeeper
   Loaded: loaded (/etc/systemd/system/zookeeper.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Tue 2016-09-27 14:16:50 UTC; 30s ago
  Process: 20876 ExecStop=/usr/lib/zookeeper/bin/zkServer.sh stop (code=exited, status=0/SUCCESS)
  Process: 20866 ExecStart=/usr/lib/zookeeper/bin/zkServer.sh start-foreground (code=exited, status=127)
 Main PID: 20866 (code=exited, status=127)
[INFO    ] Executing command ['systemctl', 'is-active', 'zookeeper.service'] in directory '/root'
[DEBUG   ] output: failed
[INFO    ] Executing command ['systemctl', 'is-enabled', 'zookeeper.service'] in directory '/root'
[DEBUG   ] output: enabled
[INFO    ] Executing command ['systemctl', 'is-enabled', 'zookeeper.service'] in directory '/root'
[DEBUG   ] output: enabled
[DEBUG   ] Service 'zookeeper' is not masked
[INFO    ] Executing command ['systemd-run', '--scope', 'systemctl', 'start', 'zookeeper.service'] in directory '/root'
[DEBUG   ] output: Running scope as unit run-20966.scope.
[INFO    ] Executing command ['systemctl', 'is-enabled', 'zookeeper.service'] in directory '/root'
[DEBUG   ] output: enabled
[INFO    ] Executing command ['systemctl', 'is-active', 'zookeeper.service'] in directory '/root'
[DEBUG   ] output: failed
[INFO    ] Executing command ['systemctl', 'is-enabled', 'zookeeper.service'] in directory '/root'
[DEBUG   ] output: enabled
[INFO    ] Service zookeeper is already enabled, and is running
[INFO    ] Completed state [zookeeper] at time 14:17:21.537021 duration_in_ms=667.601
[DEBUG   ] Initializing new AsyncZeroMQReqChannel for ('/etc/salt/pki/minion', 'salt-centos7', 'tcp://172.17.0.2:4506', 'aes')
[DEBUG   ] Initializing new AsyncAuth for ('/etc/salt/pki/minion', 'salt-centos7', 'tcp://172.17.0.2:4506')
[DEBUG   ] LazyLoaded highstate.output
local:
...
----------
          ID: zookeeper-service
    Function: service.running
        Name: zookeeper
      Result: True
     Comment: Service zookeeper is already enabled, and is running
     Started: 14:17:20.869420
    Duration: 667.601 ms
     Changes:   
...
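In the log above, `systemctl is-active` prints `failed` both before and after the start attempt, yet the state still returns True. As a rough sketch of how that output could be interpreted, assuming only that `systemctl is-active` prints `active` for a running unit: the helper name `is_active` and the injectable `run` parameter are hypothetical, and Salt's own systemd module may do this differently.

```python
import subprocess


def is_active(unit, run=subprocess.run):
    """Return True only if `systemctl is-active` prints exactly 'active'.

    `run` is injectable so the helper can be exercised without systemd;
    this is an illustrative assumption, not Salt's implementation.
    """
    proc = run(
        ["systemctl", "is-active", unit],
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    # States such as 'failed', 'inactive' or 'activating' all count as not running.
    return proc.stdout.strip() == "active"
```

Calling `is_active("zookeeper.service")` after the start command, and treating a False result as a failed state run, is the kind of post-check the previous behavior lacked.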

New Behavior

Salt detects that starting or stopping the service has failed and reports the correct result.

----------
          ID: zookeeper-service
    Function: service.running
        Name: zookeeper
      Result: False
     Comment: Service zookeeper is already enabled, and is dead
     Started: 15:06:15.198733
    Duration: 667.819 ms
     Changes:   

Tests written?

Added a negative test case to the test_running test in the unit.states.service_test module.

Altered the test_dead test:

  • Modified the test case that mocks the service.status output: the second status call now returns False (after service.stop is called in the state module).
  • Converted a test case to a negative one, so that it detects when service.stop does not actually stop the service.
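The mocking approach described above can be sketched roughly as follows. This is not the actual test from unit.states.service_test: `ensure_dead` is a hypothetical stand-in for the state function, and the mocks replace service.status and service.stop. `MagicMock(side_effect=[...])` returns one list element per call, which makes it possible to script the status before and after the stop attempt.

```python
from unittest.mock import MagicMock


def ensure_dead(name, status, stop):
    """Hypothetical stand-in for a service.dead-style state function."""
    ret = {"name": name, "changes": {}, "result": True, "comment": ""}
    if not status(name):
        ret["comment"] = "The service {0} is already dead".format(name)
        return ret
    stop_ok = stop(name)
    # Re-check the status: a True return from stop is not proof of success.
    if not stop_ok or status(name):
        ret["result"] = False
        ret["comment"] = "Service {0} failed to die".format(name)
        return ret
    ret["changes"] = {name: True}
    ret["comment"] = "Service {0} was killed".format(name)
    return ret


def check_stop_failure_is_detected():
    """Negative case: stop claims success but the service is still up."""
    status = MagicMock(side_effect=[True, True])  # up before and after stop
    stop = MagicMock(return_value=True)
    ret = ensure_dead("zookeeper", status, stop)
    assert ret["result"] is False
    assert status.call_count == 2  # status was queried again after stop
    return ret
```

Changing the scripted statuses to `[True, False]` models the positive case, where the second status call confirms that the service is really down.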

@cachedout cachedout merged commit 4ab52ae into saltstack:2015.8 Sep 28, 2016
@vutny vutny deleted the detect-service-fail branch September 28, 2016 08:19