Skip to content
This repository has been archived by the owner on Feb 3, 2023. It is now read-only.

Signal shutdown #1599

Merged
merged 8 commits into from
Jul 20, 2019
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG-UNRELEASED.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@ This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.htm
### Added
- **BREAKING:** Zomes must now include a `validate_agent` callback. If this rejects in any zome the DNA will not start. This can be used to enforce membrane requirements. [#1497](https://github.com/holochain/holochain-rust/pull/1497)
- Added a `get_links_count` method which allows the user to get number of links by base and tag [#1568](https://github.com/holochain/holochain-rust/pull/1568)### Changed
- The Conductor will shut down gracefully when receiving SIGINT (i.e. Ctrl+C) or SIGKILL, also causing a graceful shutdown of an attached n3h instance, if running [#1599](https://github.com/holochain/holochain-rust/pull/1599)

### Deprecated

Expand Down
1 change: 1 addition & 0 deletions conductor/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@ holochain_core_types = { path = "../core_types" }
holochain_conductor_api = { path = "../conductor_api" }
lib3h_sodium = "=0.0.7"
holochain_common = { path = "../common" }
signal-hook = "=0.1.9"
structopt = "=0.2.15"
tiny_http = "=0.6.2"
ws = "=0.8.0"
27 changes: 23 additions & 4 deletions conductor/src/main.rs
Original file line number Diff line number Diff line change
Expand Up @@ -17,14 +17,16 @@
extern crate holochain_conductor_api;
extern crate holochain_core_types;
extern crate lib3h_sodium;
extern crate signal_hook;
extern crate structopt;

use holochain_conductor_api::{
conductor::{mount_conductor_from_config, Conductor, CONDUCTOR},
config::{self, load_configuration, Configuration},
};
use holochain_core_types::error::HolochainError;
use std::{fs::File, io::prelude::*, path::PathBuf, sync::Arc, thread::sleep, time::Duration};
use signal_hook::{iterator::Signals, SIGINT, SIGTERM};
use std::{fs::File, io::prelude::*, path::PathBuf, sync::Arc};
use structopt::StructOpt;

#[derive(StructOpt, Debug)]
Expand All @@ -43,6 +45,8 @@ fn main() {
.config
.unwrap_or(config::default_persistence_dir().join("conductor-config.toml"));
let config_path_str = config_path.to_str().unwrap();
let termination_signals =
Signals::new(&[SIGINT, SIGTERM]).expect("Couldn't create signals list");
println!("Using config path: {}", config_path_str);
match bootstrap_from_config(config_path_str) {
Ok(()) => {
Expand All @@ -66,9 +70,24 @@ fn main() {
.expect("Could not start UI servers!");
}

// TODO wait for a SIGKILL or SIGINT instead here.
loop {
sleep(Duration::from_secs(1))
for _sig in termination_signals.forever() {
let mut conductor_guard = CONDUCTOR.lock().unwrap();
let conductor = std::mem::replace(&mut *conductor_guard, None);
let refs = Arc::strong_count(&CONDUCTOR);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we care about dropping the value in the conductor_guard?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AshantiMutinta it will get dropped at the end of the block. What's your concern?

if refs == 1 {
println!("Gracefully shutting down conductor...");
} else {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the difference between 'graceful' shutdown and 'explicit' shutdown?

(context, I've lost the ability to fully track/understand some of the more sophisticated Rust code that's being written by y'all now)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the idea is that if the reference count for the CONDUCTOR Arc is 1 there is nobody else (no other thread) alive and running for this conductor and we don't need to explicitly call conductor.shutdown() as that is called in the Drop implementation.

But I don't completely get the difference either since we take ownership of the conductor in line 75 with mem::replace, right? So that object will be dropped at the end of the for-body's scope either way.. Isn't that right, @maackle?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this bit is a little confusing, but I figured it's better to have something than nothing. I was thinking that other threads would still have access to a conductor in the explicit shutdown case, but @lucksus is right that the conductor gets dropped either way. However, if there are other threads referencing CONDUCTOR at this point, they would wind up with a None when they expected Some(Conductor), so this warning at least explains why. I could modify the warning message a little, but I think it still mostly makes sense like this.

println!(
"Explicitly shutting down conductor. {} other threads were referencing it, so if unwrap errors follow, that might be why.",
refs - 1
);
conductor
.expect("No conductor running")
.shutdown()
.expect("Error shutting down conductor");
}
break;
// NB: conductor is dropped here and should shut down itself
}
}
Err(error) => println!("Error while trying to boot from config: {:?}", error),
Expand Down
15 changes: 8 additions & 7 deletions conductor_api/src/conductor/base.rs
Original file line number Diff line number Diff line change
Expand Up @@ -117,7 +117,10 @@ impl Drop for Conductor {
kill();
}
}
self.shutdown();

self.shutdown()
.unwrap_or_else(|err| println!("Error during shutdown, continuing anyway: {:?}", err));

if let Some(network) = self.n3h_keepalive_network.take() {
if let Err(err) = network.stop() {
println!("ERROR stopping network thread: {:?}", err);
Expand Down Expand Up @@ -446,16 +449,14 @@ impl Conductor {
}

/// Stop and clear all instances
/// @QUESTION: why don't we care about errors on shutdown?
pub fn shutdown(&mut self) {
let _ = self
.stop_all_instances()
.map_err(|error| notify(format!("Error during shutdown: {}", error)));
pub fn shutdown(&mut self) -> Result<(), HolochainInstanceError> {
self.stop_all_instances()?;
self.stop_all_interfaces();
self.signal_multiplexer_kill_switch
.as_ref()
.map(|sender| sender.send(()));
self.instances = HashMap::new();
Ok(())
}

pub fn spawn_network(&mut self) -> Result<SpawnResult, HolochainError> {
Expand Down Expand Up @@ -564,7 +565,7 @@ impl Conductor {
}

let config = self.config.clone();
self.shutdown();
self.shutdown().map_err(|e| e.to_string())?;

self.start_signal_multiplexer();

Expand Down