Heartbeats #4

Merged
merged 8 commits into from
May 30, 2014

ethanrowe
Owner

This addresses:

The unit tests are a tad coupled to the implementation. This would be better covered by an integration test, which I'm happy to work out a little later; we're still in an experimental phase, and I'm okay with using TDD at the unit level to make sure things are pieced together right, with manual runs to verify the full end-to-end behavior.

This is meant to give us better control of the consul agent, let us monitor
its state, and give more flexibility in what we do alongside it concurrently.

(It would be nicer to put something like this within a god or bluepill
app, but I think the various processes are too interdependent for
either to be a good choice.  Plus I want a single entry point that
stays in the foreground, ideal for running in a container).

The AgentProcess class is instantiated with the args you want
passed to "consul agent", and:
* It launches the thing and waits in a separate thread; we get
  immediate notification on exit.
* It attempts to verify that the agent is up by running "consul info"
  against it, which will give a non-zero exit code if the agent isn't
  up.
* It provides for "on up" callbacks and invokes them once verification
  succeeds.
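
The launch/verify/callback flow above can be sketched roughly like this. It is a minimal illustration, not the AgentProcess code from this PR: the `SketchProcess` class, its constructor arguments, and the retry counts are all made up, and short Ruby child processes stand in for "consul agent" and "consul info" so it runs without consul installed.

```ruby
require 'rbconfig'

# Hypothetical stand-in for AgentProcess; names and retry policy are assumptions.
class SketchProcess
  attr_reader :status, :pid, :exit_code

  def initialize(command, check)
    @command = command   # argv to spawn, e.g. %w[consul agent -server]
    @check = check       # argv for the probe, e.g. %w[consul info]
    @status = :new
    @on_up = []
  end

  # Register a block to run once the probe succeeds.
  def on_up(&block)
    @on_up << block
  end

  # Spawn the child and wait on it in a separate thread, so the exit is
  # noticed as soon as it happens.
  def launch!
    @pid = Process.spawn(*@command)
    @status = :starting
    @thread = Thread.new do
      _, result = Process.wait2(@pid)
      @exit_code = result.exitstatus
      @status = :down
    end
  end

  # Re-run the probe until it exits zero, then fire the on_up callbacks.
  def verify_up!(attempts = 20, pause = 0.1)
    attempts.times do
      if system(*@check)
        @status = :up
        @on_up.each { |callback| callback.call(self) }
        return true
      end
      sleep pause
    end
    false
  end

  # Join the waiting thread and hand back the child's exit code.
  def wait
    @thread.join
    @exit_code
  end
end

ruby = RbConfig.ruby
events = []
process = SketchProcess.new([ruby, '-e', 'sleep 1'], [ruby, '-e', 'exit 0'])
process.on_up { |_p| events << :up }
process.launch!
process.verify_up!
code = process.wait
```

The probe is just another command whose exit status is checked, which is why a non-zero exit from "consul info" is enough to tell that the agent is not up yet.
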

Next things:
* handle stopping
* handle signals
* revise launch_and_join to use the new object.

AutoConsul::Runner::AgentProcess#stop! will signal the running
agent with SIGINT to cause a clean shutdown.

It'll fire :on_stopping callbacks first.  These are registered
with AgentProcess#on_stopping.

It'll put itself into :stopping status before signaling.

When the process actually goes down, the AgentProcess#thread, which
is already waiting on that process, will see and flip the status
to :down.

Switching to :down will run callbacks registered with the
AgentProcess#on_down method.
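
The ordering described above (stopping callbacks, then :stopping status, then SIGINT, then :down from the waiting thread) could look something like the sketch below. The `StoppableProcess` class is hypothetical, and a small Ruby child that exits on SIGINT stands in for the agent; the pipe only exists so the example waits until the child has installed its handler.

```ruby
require 'rbconfig'

# Hypothetical stand-in; only the status names and callback order come from
# the description above.
class StoppableProcess
  attr_reader :status

  def initialize(command, spawn_options = {})
    @command = command
    @spawn_options = spawn_options
    @status = :new
    @callbacks = { stopping: [], down: [] }
  end

  def on_stopping(&block)
    @callbacks[:stopping] << block
  end

  def on_down(&block)
    @callbacks[:down] << block
  end

  # Spawn the child; the waiting thread flips the status when it exits.
  def launch!
    @pid = Process.spawn(*@command, @spawn_options)
    @status = :up
    @thread = Thread.new do
      Process.wait(@pid)
      @status = :down
      @callbacks[:down].each(&:call)
    end
  end

  # Callbacks first, then the status change, then the signal.
  def stop!
    @callbacks[:stopping].each(&:call)
    @status = :stopping
    Process.kill('INT', @pid)
  end

  def wait
    @thread.join
  end
end

events = []
reader, writer = IO.pipe
script = 'trap("INT") { exit 0 }; puts "ready"; $stdout.flush; sleep 30'
child = StoppableProcess.new([RbConfig.ruby, '-e', script], out: writer)
child.on_stopping { events << :stopping }
child.on_down { events << :down }
child.launch!
writer.close
reader.gets   # block until the child reports that its INT handler is installed
child.stop!
child.wait
```
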

Tweaks:
- fixed mistake in formatting of :spawn command
- AgentProcess#launch! puts the agent runner thread into
  :abort_on_exception mode, so a failed call to consul will
  blow everything up.
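
For reference, a small sketch of what abort_on_exception does to a thread, independent of this codebase: an exception raised inside such a thread is re-raised in the main thread rather than only ending the worker. The error message is made up for the example.

```ruby
caught = nil
begin
  Thread.new do
    # Opt this one thread in; other threads keep the default behaviour.
    Thread.current.abort_on_exception = true
    raise 'agent launch failed'   # hypothetical failure
  end
  sleep 2   # the main thread is interrupted here when the worker raises
rescue RuntimeError => e
  caught = e.message
end
```

Without the rescue, the exception would take the whole process down, which is the intent here.
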

AutoConsul::Runner::AgentProcess:
- #run! will just #launch! and #verify_up!, purely for convenience.
- #wait will wait for the agent to stop (by joining the waiting thread)
  and return the exit code.  Also for convenience.

So a basic usage pattern would be:

    runner = AutoConsul::Runner::AgentProcess.new(%w[
               -server -bootstrap -node my-name
             ])
    runner.run!
    # Exit with status code returned by the agent process.
    exit(runner.wait)

AutoConsul::Runner methods refactored to use the new AgentProcess
class for execution:
- Runner::run_agent!  -> Runner::agent_runner
- Runner::run_server! -> Runner::server_runner
- Same params as before
- Both return an AgentProcess with cluster join logic added in an
  :on_up callback.

Revised the main entry point to account for this, and to use exit
codes for the various commands.

To facilitate concurrent operations more easily, the
AutoConsul::Runner::AgentProcess#while_up helper lets you supply
a block that you want to run in parallel while the agent is running.

When the AgentProcess moves to an :up status, your block will start
in a new thread.

When the AgentProcess moves to either :stopping or :down states,
that thread will be killed.

If you want to poll things or write out periodic metrics or
whatever, this is a reasonable way to hook into the lifecycle of
the agent.
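
A rough sketch of how a while_up style helper can hang off status transitions. The `Lifecycle` class and its `on`/`transition` methods are invented for illustration and are not the API in this PR; only the :up, :stopping and :down status names come from the description above.

```ruby
require 'thread'

# Hypothetical lifecycle object with per-status hooks.
class Lifecycle
  def initialize
    @hooks = Hash.new { |hash, key| hash[key] = [] }
  end

  # Register a block to run when the given status is entered.
  def on(status, &block)
    @hooks[status] << block
  end

  # Enter a status and run the hooks registered for it.
  def transition(status)
    @hooks[status].each(&:call)
  end

  # Start the block in a new thread on :up; kill that thread on
  # :stopping or :down.
  def while_up(&block)
    worker = nil
    on(:up) { worker = Thread.new(&block) }
    stop = proc { worker.kill if worker }
    on(:stopping, &stop)
    on(:down, &stop)
  end
end

ticks = Queue.new
agent = Lifecycle.new
agent.while_up do
  loop do
    ticks << Time.now   # stand-in for polling or writing a metric
    sleep 0.05
  end
end

agent.transition(:up)
sleep 0.3
agent.transition(:stopping)
sleep 0.1
seen = ticks.size
sleep 0.2
agent.transition(:down)
```

Because the worker is killed rather than asked to finish, a block used this way should be safe to interrupt at any point.
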

Add a "-t" or "--ticks" option that, when used with the "run"
command, causes heartbeats to issue at the specified interval in
a background thread as long as the agent is in the "up" state.

This means we get a single entry point for running the full discoverable
agent.

Additionally, the "-d" option was broken and only used the default
of "/tmp/consul/state".  Now it's properly honored, which means it'll
blow up if you don't have permissions to write to the requested path.
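
A sketch of parsing these two options with Ruby's standard OptionParser. This is not the option handling from this PR: the `parse_options` helper is hypothetical, the interval is assumed to be in seconds, no long form is assumed for "-d", and the example path is made up. Only the flags themselves and the "/tmp/consul/state" default come from the description above.

```ruby
require 'optparse'

# Hypothetical helper: returns the parsed options and the leftover arguments
# (for example the "run" command).
def parse_options(argv)
  options = { dir: '/tmp/consul/state', ticks: nil }
  parser = OptionParser.new do |opts|
    opts.on('-d PATH', String, 'State directory') do |path|
      options[:dir] = path   # honor the requested path instead of the default
    end
    opts.on('-t', '--ticks SECONDS', Float, 'Heartbeat interval') do |seconds|
      options[:ticks] = seconds
    end
  end
  remaining = parser.parse(argv)
  [options, remaining]
end

options, remaining = parse_options(%w[-d /var/lib/consul-state -t 30 run])
defaults, = parse_options(%w[run])
```
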

AutoConsul::Runner::AgentProcess
- "consul agent" is spawned into its own process group, so we can
  control signals through the parent process.
- This means a SIGKILL to the parent won't go to the agent process,
  which is by design.
- SIGTERM and SIGINT are trapped and result in a stop! call to
  cause a clean shutdown.
- To avoid thread deadlocks, the signal handler puts the received
  signal onto a Queue; a separate thread pops from the queue and
  attempts the stop! call.
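
The two pieces above (own process group, and a trap handler that only enqueues) can be sketched as follows. It is an illustration under assumptions, not the code from this PR: a sleeping Ruby child stands in for the agent, the example sends itself SIGTERM to simulate an operator, and the stand-in child is simply killed where the real code would call stop!.

```ruby
require 'rbconfig'
require 'thread'

# Spawn a stand-in child into its own process group (pgroup: true), so a
# signal aimed at our process group does not reach it directly.
child = Process.spawn(RbConfig.ruby, '-e', 'sleep 30', pgroup: true)
child_leads_own_group = (Process.getpgid(child) == child)

# The trap handlers only push onto a Queue; they take no locks and run no
# callbacks themselves.
signals = Queue.new
%w[INT TERM].each do |name|
  Signal.trap(name) { signals << name }
end

# A separate thread pops the signal and does the real work outside of
# trap context.
handled = []
stopper = Thread.new do
  name = signals.pop
  handled << name
  Process.kill('KILL', child)   # sketch-only cleanup; stop! would go here
end

Process.kill('TERM', Process.pid)   # simulate an operator sending SIGTERM
stopper.join(5)
Process.wait(child)
```

Keeping the handler this small matters because work that needs a Mutex cannot be done safely from inside a trap block.
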

ethanrowe added a commit that referenced this pull request May 30, 2014
@ethanrowe ethanrowe merged commit 6c63e95 into master May 30, 2014