Skip to content
This repository has been archived by the owner on Dec 21, 2021. It is now read-only.

Move restart policy to systemd #263

Merged
merged 14 commits into from
Aug 30, 2021
Merged

Conversation

siegfriedweber
Copy link
Member

@siegfriedweber siegfriedweber commented Aug 17, 2021

Description

Fixes #207
Tested by stackabletech/agent-integration-tests#61

Changed

  • Handling of service restarts moved from the Stackable agent to systemd.

Removed

  • Check removed if a service starts up correctly within 10 seconds. systemd manages restarts now and the Stackable agent cannot detect if a service is in a restart loop.

Review Checklist

  • Code contains useful comments
  • (Integration-)Test cases added (or not applicable)
  • Documentation added (or not applicable)
  • Changelog updated (or not applicable)

Copy link
Member

@soenkeliebau soenkeliebau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall, I've left a few questions that I'd like to clarify, but most of those will probably simply be resolveable.

As far as I can tell this would not notice if a service enters into a permanent restart loop, right? It would update the restart count on the pod, but never set a failed condition or something similar?

src/provider/states/pod/starting.rs Show resolved Hide resolved
src/provider/states/pod/running.rs Show resolved Hide resolved
src/provider/states/pod/running.rs Show resolved Hide resolved
src/provider/systemdmanager/service.rs Show resolved Hide resolved
razvan
razvan previously approved these changes Aug 23, 2021
Copy link
Member

@razvan razvan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't look closely at the code but the corresponding integration tests are green.

@siegfriedweber
Copy link
Member Author

As far as I can tell this would not notice if a service enters into a permanent restart loop, right? It would update the restart count on the pod, but never set a failed condition or something similar?

Right. If the restart policy of the pod is set to Always or OnFailure then this is the desired behavior, I think. The Stackable Agent also does not have a notion what a "restart loop" is. We regard a restart every 10 seconds as a restart loop but a restart every 10 minutes probably not.

soenkeliebau
soenkeliebau previously approved these changes Aug 30, 2021
Copy link
Member

@soenkeliebau soenkeliebau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

The running state is no longer responsible for deciding if a service
should be restarted. This is done by systemd. Therefore the transition
from the running state to the starting state was removed.

The service_state function was added to the SystemdManager to check if a
service is running or terminated. This function is used for the
transition from the running to the terminated state and for patching the
container state.
When a systemd service is started then the agent does not wait anymore
ten seconds to check if it was successfully started because systemd
manages restarts now and the agent cannot detect if the service is in a
restart loop.
@siegfriedweber siegfriedweber merged commit 98bf547 into main Aug 30, 2021
@siegfriedweber siegfriedweber deleted the move_restart_policy_to_systemd branch August 30, 2021 11:24
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Move restart behavior of services from agent to systemd
4 participants