-
Notifications
You must be signed in to change notification settings - Fork 136
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
split publish/subscribe behavior to better-leverage context #476
Comments
Updated the worked demo example here: https://gist.github.com/tgross/b167a2b1d4a5b727643d6e25f9962553 This one handles signals (SIGTERM and SIGINT, plus SIGHUP for reloads) rather than having a fixed time interval. |
There's a lot of work sitting behind this and I definitely feel the urge to complete what I've put in so far. At the moment I'm stuck attempting to uncover the reason why CP is not exiting when all jobs have completed cleanly. The primary integration test where this was uncovered is running the following ( {
consul: "127.0.0.1:8500",
jobs: [
{
name: "testenvvar",
port: 8500,
exec: "env",
health: {
exec: "true",
interval: 10,
ttl: 25
}
}
]
} This will halt execution and exits cleanly on master so I know the call stack the bug is in but not the specific code that's causing it. When this fires on my branch, it's still blocking on the Still in flux. |
Ok, just as a note, it looks like the control server is not being shut down when a job exits cleanly/successfully. This makes sense because the existing implementation doesn't |
Another note, I'm on a good path now and down to a single failing integration test, |
…taCenter#476) This commit refactors away EventHandler to replace it with separate Publisher/Subscriber interfaces. The side effect is properly using the `context` package throughout ContainerPilot as the basis for it's shutdown process. This helps provide a more structured shutdown and no longer relies on the event bus and event processing which was bug prone. * Introduce more `context` handling throughout our shutdown processes * Introduce Subscriber and Publisher interfaces * Turn `Job` into a Publisher and Subscriber * Turn `Watch` into a Publisher * Turn `Metric` into a Subscriber * Refactor Control and Telemetry server shutdown/reload * Feed a context.CancelFunc up stack into the reload control endpoint * Telemetry server reload uses an appropriate context.CancelFunc * Fix all broken unit tests for pub/sub refactoring * Patch tests around pub/sub to use QuitByClose event for testing only * Reuse QuitByTest event for shutting down through internal control channel * Shut down CP after all Jobs have completed
I cleaned up where I left off a few weeks ago and all tests are passing. Going to start rigorous testing in order to make sure this thing is truly stable. |
…taCenter#476) This commit refactors away EventHandler to replace it with separate Publisher/Subscriber interfaces. The side effect is properly using the `context` package throughout ContainerPilot as the basis for it's shutdown process. This helps provide a more structured shutdown and no longer relies on the event bus and event processing which was bug prone. * Introduce more `context` handling throughout our shutdown processes * Introduce Subscriber and Publisher interfaces * Turn `Job` into a Publisher and Subscriber * Turn `Watch` into a Publisher * Turn `Metric` into a Subscriber * Refactor Control and Telemetry server shutdown/reload * Feed a context.CancelFunc up stack into the reload control endpoint * Telemetry server reload uses an appropriate context.CancelFunc * Fix all broken unit tests for pub/sub refactoring * Patch tests around pub/sub to use QuitByClose event for testing only * Reuse QuitByTest event for shutting down through internal control channel * Shut down CP after all Jobs have completed
This commit refactors away EventHandler to replace it with separate Publisher/Subscriber interfaces. The side effect is properly using the `context` package throughout ContainerPilot as the basis for it's shutdown process. This helps provide a more structured shutdown and no longer relies on the event bus and event processing which was bug prone. * Introduce more `context` handling throughout our shutdown processes * Introduce Subscriber and Publisher interfaces * Turn `Job` into a Publisher and Subscriber * Turn `Watch` into a Publisher * Turn `Metric` into a Subscriber * Refactor Control and Telemetry server shutdown/reload * Feed a context.CancelFunc up stack into the reload control endpoint * Telemetry server reload uses an appropriate context.CancelFunc * Fix all broken unit tests for pub/sub refactoring * Patch tests around pub/sub to use QuitByClose event for testing only * Reuse QuitByTest event for shutting down through internal control channel * Shut down CP after all Jobs have completed
This has just been merged. Thanks for the proposal @tgross! |
No idea why I closed this without getting it released first. |
Cancelling jobs, watches, servers, etc. has been a bit of a bug farm. We don't really leverage
context
as much as we probably could, but any use of this needs to make sure we're sending messages back like stopping/stopped and deregistering from the event bus.I've worked up a preliminary design that splits the
EventHandler
into separatePublisher
andSubscriber
types and usescontext.CancelFunc
as a means of signaling shutdown instead of publishing an event. This seems to work ok and I think will be subtly simpler in the long run.Here's the worked demo: https://gist.github.com/tgross/6b0e32a85157c045718ee2ef5d3a8da6
If you
go run main.go 1000
this will run for 1000ms and then exit cleanly. It has the pub/sub types, an implementation of watch (pub-only), metric (sub-only), and job (pub+sub).cc @cheapRoc
The text was updated successfully, but these errors were encountered: