Rewrite package install strategy.
This changeset contains several logical changes, all coupled together
because they were discovered while refactoring the installation logic.
The logical changes are broken out into sections below.

Pull-Through Local Caching Package Installation
-----------------------------------------------

The package installation implementation was rewritten to aggressively
use the artifact cache (by default `/hab/cache/artifacts` on disk). The
previous implementation was built up over time, with installation from
local artifacts added at a later date. Consequently, many installation
inconsistencies were present:

* When performing an installation from a local package artifact, all
  dependencies were blindly installed from an upstream Depot, whether or
  not they were already installed or even cached on disk.
* When installing from a package identifier string, the full dependency
  list was queried from an upstream Depot even though the fetched
  package artifact contained this metadata. Furthermore, the artifact's
  own metadata should be trusted over any other source, assuming
  the package is confirmed "verified".
* The local package artifact installation correctly read package
  dependencies from its metadata file, which uncovered a bug (fixed in
  #1013) that would have been found earlier had all installation
  strategies used the same code path.
* When performing an installation from a local package artifact, the
  artifact file was never copied into the artifact cache for future
  re-use. Neither were any locally available dependency package
  artifacts.

This change introduces a new internal struct called an `InstallTask`
that maintains knowledge of various directories as well as a common
Depot client that is constructed only once. There are two primary
installation strategies still present: install from a package identifier
string (ex: `core/redis`) or install from a local package artifact (ex:
`*.hart`). Effort was made to ensure that both approaches share the same
code path as much as possible to minimize future behavioral bugs.
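
A minimal sketch of the shape this struct might take is below. The field
names and lifetimes are illustrative assumptions only; the actual
definition lives in `components/common/src/command/package/install.rs`
(whose full diff is not rendered below). Its two entry points, called
`from_ident` and `from_artifact` in these sketches, are sketched after
each strategy summary that follows.

    // Illustrative sketch only: these are assumed names, not the exact
    // definitions from components/common/src/command/package/install.rs.
    use std::path::Path;

    use depot_client::Client;
    use hcore::package::PackageIdent;

    struct InstallTask<'a> {
        depot_client: Client,          // constructed once, reused for every fetch
        fs_root_path: &'a Path,        // installation root (FS_ROOT_PATH)
        artifact_cache_path: &'a Path, // /hab/cache/artifacts by default
        key_cache_path: &'a Path,      // origin keys used for artifact verification
    }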

The updated high-level installation strategies are summarized here:

From Package Identifier:

1. Is the given package identifier fully qualified? If not, query the
   upstream Depot for the latest package identifier that satisfies the
   given "fuzzy" package identifier. In this way a given `core/redis`
   may return a fully qualified `core/redis/3.0.7/20160614001713`
   identifier.
2. Is the specific release already installed for the determined fully
   qualified package identifier? If so, return early as there is no
   further work to do.
3. Is there a package artifact for the specific release in the artifact
   cache? If there is, we use this one. Otherwise, fetch the specific
   release from the upstream Depot and write the package artifact into
   the artifact cache.
4. Verify the package artifact to ensure that its fully qualified package
   identifier precisely matches the intended target package identifier.
   Next, verify that the package artifact is correctly signed and is
   self-consistent.
5. Determine the full list of runtime dependencies by extracting the
   `TDEPS` metadata file from the local package artifact. For each
   dependency, perform steps 2 through 4 above so that all dependencies
   are either confirmed pre-installed or each cached package artifact
   has been fully verified.
6. Extract/unpack each package artifact dependency that was not found to
   be pre-installed, followed by the target package artifact last. The
   order is important to ensure that the target package is not installed
   until all of its dependencies are correctly installed. A condensed
   sketch of this flow follows the list.
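
The sketch below continues the assumed `InstallTask` shape from earlier;
helpers such as `is_installed` and `verified_cached_artifact` are
hypothetical stand-ins for the logic described in steps 2 through 4, not
actual function names from this changeset.

    impl<'a> InstallTask<'a> {
        fn from_ident(&self, ident: &PackageIdent) -> Result<PackageIdent> {
            // 1. Resolve a "fuzzy" ident (e.g. `core/redis`) to the latest
            //    fully qualified ident known to the upstream Depot.
            let ident: PackageIdent = if ident.fully_qualified() {
                ident.clone()
            } else {
                try!(self.depot_client.show_package(ident.clone()))
                    .get_ident()
                    .clone()
                    .into()
            };
            // 2. Nothing left to do if this exact release is already installed.
            if try!(self.is_installed(&ident)) {
                return Ok(ident);
            }
            // 3 & 4. Use a cached artifact if present, otherwise fetch one into
            //        the cache; then check its ident and verify its signature.
            let mut target = try!(self.verified_cached_artifact(&ident));
            // 5. Repeat steps 2-4 for every entry in the artifact's TDEPS
            //    metadata, keeping artifacts of dependencies not yet installed.
            let mut pending = Vec::new();
            for dep in try!(target.tdeps()) {
                if !try!(self.is_installed(&dep)) {
                    pending.push(try!(self.verified_cached_artifact(&dep)));
                }
            }
            // 6. Unpack missing dependencies first, the target package last.
            for dep_archive in pending {
                try!(dep_archive.unpack(None));
            }
            try!(target.unpack(None));
            Ok(ident)
        }
    }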

From Local Package Artifact:

1. Extract the fully qualified package identifier from the `IDENT`
   metadata file in the package artifact.
2. Is the specific release already installed for the determined fully
   qualified package identifier? If so, return early as there is no
   further work to do.
3. Copy the artifact into the artifact cache and rewrite the file name
   based on the computed artifact name. This means that we can install
   from a Habitat package artifact with an arbitrary name like `file.1`
   and it will be rewritten to
   `core-redis-3.0.7-20160614001713-x86_64-linux.hart` in the artifact
   cache.
4. Verify the package artifact to ensure that its fully qualified package
   identifier precisely matches the intended target package identifier.
   Next, verify that the package artifact is correctly signed and is
   self-consistent.
5. Determine the full list of runtime dependencies by extracting the
   `TDEPS` metadata file from the local package artifact. For each
   dependency, perform steps 2 through 4 in the "From Package
   Identifier" section so that all dependencies are either confirmed
   pre-installed or each cached package artifact has been fully
   verified. There is one difference, however: for each dependency,
   first check whether a package artifact exists in the same directory
   that contained the original target package artifact. If that
   directory contains other package artifacts, there is a good chance
   we can use them rather than re-fetching the exact same ones from an
   upstream Depot. Unlike the "From Package Identifier" scenario, we
   have an explicit directory we can check, so there is no harm in
   trying.
6. Extract/unpack each package artifact dependency that was not found to
   be pre-installed, followed by the target package artifact last. The
   order is important to ensure that the target package is not installed
   until all of its dependencies are correctly installed. A condensed
   sketch of this flow follows the list.
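
A matching sketch of this flow under the same assumptions as the
previous ones; `artifact_name_for` and `install_verified` are
hypothetical helpers standing in for steps 3 through 6, and error
conversion for the file copy is elided.

    use std::fs;
    use std::path::Path;

    use hcore::package::{PackageArchive, PackageIdent};

    impl<'a> InstallTask<'a> {
        fn from_artifact(&self, src: &Path) -> Result<PackageIdent> {
            // 1. The artifact names itself via its IDENT metadata file.
            let mut archive = PackageArchive::new(src);
            let ident = try!(archive.ident());
            // 2. Nothing left to do if this exact release is already installed.
            if try!(self.is_installed(&ident)) {
                return Ok(ident);
            }
            // 3. Copy the file into the artifact cache under its canonical name,
            //    so `file.1` becomes
            //    core-redis-3.0.7-20160614001713-x86_64-linux.hart.
            let cached = self.artifact_cache_path
                .join(self.artifact_name_for(&ident));
            try!(fs::copy(src, &cached));
            // 4-6. Verify the cached artifact, resolve its TDEPS (checking the
            //      source artifact's own directory before the Depot for each
            //      dependency), and unpack dependencies before the target.
            self.install_verified(&ident, src.parent())
        }
    }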

New: Supervisor Start From Package Artifact
-------------------------------------------

In the course of the installation refactoring above, an easy new
improvement (or feature) presented itself: the ability to start
supervising a package that has not yet been installed, given a local
package artifact. In other words, this change now supports the
following:

    hab start ./results/core-redis-3.0.7-20160614001713-x86_64-linux.hart

or:

    hab sup start /tmp/redis.hart

assuming that `/tmp/redis.hart` is a legitimate Habitat package artifact
and was simply renamed.

As with the installation implementation rewrite above, if the package is
already installed, then there is no work to do; otherwise, the "From
Local Package Artifact" strategy above is used.
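
The dispatch that makes this possible is small. Condensed from the
`components/sup/src/main.rs` diff below, the single positional argument
is treated as a local artifact when it names a file on disk and parsed
as a package identifier otherwise:

    if let Some(ref ident_or_artifact) = sub_args.value_of("pkg_ident_or_artifact") {
        if Path::new(ident_or_artifact).is_file() {
            // A file on disk: read its ident and remember the artifact path.
            let ident = try!(PackageArchive::new(Path::new(ident_or_artifact)).ident());
            config.set_package(ident);
            config.set_local_artifact(ident_or_artifact.to_string());
        } else {
            // Otherwise treat the argument as a package identifier string.
            let ident = try!(PackageIdent::from_str(ident_or_artifact));
            config.set_package(ident);
        }
    }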

The intended primary use case for this pattern is for Plan authors
working on their Plans in a Studio instance. This will allow them to
build a package artifact and then immediately start it without having to
install it beforehand. Secondarily, this pattern may be useful in
environments with slow or limited internet access and eventually in
fully "air gapped" environments.

Fixed: at-once Update Strategy
------------------------------

Also in the course of the above refactoring, two bugs were found in the
"at-once" update strategy in the Supervisor:

1. Once a new version of a package is found in an upstream Depot, it is
   installed, but the new version is not loaded when `start_package()`
   is called (source:
https://github.com/habitat-sh/habitat/blob/9e25233fafe0433864ab47d0d8ede6344d30be8a/components/sup/src/command/start.rs#L120)
   with the consequence that the original release is re-loaded and the
   new version is never used until the Supervisor completely
   restarts.
2. The former custom install logic in the "at-once" strategy only
   fetched and extracted the new target package, meaning that if the
   new release contained updated dependency versions, these would
   not be installed and could result in the supervised process
   entering a crash loop. The fix was simply to invoke the same
   installation logic used everywhere else in the system, as explained
   in much greater detail above; a condensed sketch of the fix follows
   this list.
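
The fix for both issues is visible in the
`components/sup/src/command/start.rs` diff below and condenses to
installing through the shared install logic and reloading the package
before supervision starts:

    if &latest_ident > package.ident() {
        outputln!("Downloading latest version from Depot: {}", latest_ident);
        // Install the new release (and any updated dependencies) through the
        // common install code path.
        let new_pkg_data = try!(install::start(url,
                                               &latest_ident.to_string(),
                                               PRODUCT,
                                               VERSION,
                                               Path::new(FS_ROOT_PATH),
                                               &cache_artifact_path(None),
                                               &default_cache_key_path(None)));
        // Reload so that start_package() supervises the new release.
        package = try!(Package::load(&new_pkg_data, None));
    }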

Improved: Removed Optional Depot URL in Supervisor
--------------------------------------------------

While refactoring the Supervisor component, it became clear that earlier
versions of the codebase accepted an *optional* upstream Depot URL.
Currently, fallback default URL values are used, so we can guarantee
that a value is always present for the Supervisor. The config value of
`url` was updated to simply be a `String` (formerly an `Option<String>`)
and all extra checking for a possibly absent value was removed. Less
branching logic is almost always a good thing!
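
For reference, the relevant accessors now look like this (taken from the
`components/sup/src/config.rs` diff below):

    /// Set the url
    pub fn set_url(&mut self, url: String) -> &mut Config {
        self.url = url;
        self
    }

    /// Return the url
    pub fn url(&self) -> &str {
        &self.url
    }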

Bonus: Remove Unused Imports
----------------------------

In the course of regression testing, several "unused import" warnings
were found and the offending imports removed. A few were due to code
refactoring, but others were most likely due to upgrading the Rust
compiler toolchain in the last few weeks. Just keeping the shop floor
swept.

Signed-off-by: Fletcher Nichol <fnichol@nichol.ca>
fnichol committed Aug 4, 2016
1 parent ea7f763 commit 2e19a32
Showing 10 changed files with 297 additions and 261 deletions.
380 changes: 191 additions & 189 deletions components/common/src/command/package/install.rs

Large diffs are not rendered by default.

10 changes: 10 additions & 0 deletions components/common/src/error.rs
@@ -27,6 +27,7 @@ pub type Result<T> = result::Result<T, Error>;

#[derive(Debug)]
pub enum Error {
ArtifactIdentMismatch((String, String, String)),
CantUploadGossipToml,
CryptoKeyError(String),
GossipFileRelativePath(String),
@@ -46,6 +47,12 @@ pub enum Error {
impl fmt::Display for Error {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
let msg = match *self {
Error::ArtifactIdentMismatch((ref a, ref ai, ref i)) => {
format!("Artifact ident {} for `{}' does not match expected ident {}",
ai,
a,
i)
}
Error::CantUploadGossipToml => {
format!("Can't upload gossip.toml, it's a reserved file name")
}
@@ -72,6 +79,9 @@ impl fmt::Display for Error {
impl error::Error for Error {
fn description(&self) -> &str {
match *self {
Error::ArtifactIdentMismatch((_, _, _)) => {
"Artifact ident does not match expected ident"
}
Error::CantUploadGossipToml => "Can't upload gossip.toml, it's a reserved filename",
Error::CryptoKeyError(_) => "Missing or invalid key",
Error::GossipFileRelativePath(_) => {
14 changes: 7 additions & 7 deletions components/hab/src/command/pkg.rs
@@ -219,13 +219,13 @@ pub mod export {
println!("Searching for {} in remote {}",
&format_ident.to_string(),
&default_depot_url());
try!(install::from_url(&default_depot_url(),
format_ident,
PRODUCT,
VERSION,
Path::new(FS_ROOT_PATH),
&cache_artifact_path(None),
&default_cache_key_path(None)));
try!(install::start(&default_depot_url(),
&format_ident.to_string(),
PRODUCT,
VERSION,
Path::new(FS_ROOT_PATH),
&cache_artifact_path(None),
&default_cache_key_path(None)));
}
}
let pkg_arg = OsString::from(&ident.to_string());
14 changes: 7 additions & 7 deletions components/hab/src/exec.rs
@@ -98,13 +98,13 @@ pub fn command_from_pkg(command: &str,
println!("{}",
Cyan.bold()
.paint(format!("∵ Package for {} not found, installing", &ident)));
try!(common::command::package::install::from_url(&default_depot_url(),
ident,
PRODUCT,
VERSION,
fs_root_path,
&cache_artifact_path(None),
cache_key_path));
try!(common::command::package::install::start(&default_depot_url(),
&ident.to_string(),
PRODUCT,
VERSION,
fs_root_path,
&cache_artifact_path(None),
cache_key_path));
command_from_pkg(&command, &ident, &cache_key_path, retry + 1)
}
Err(e) => return Err(Error::from(e)),
2 changes: 1 addition & 1 deletion components/hab/src/gossip.rs
@@ -84,7 +84,7 @@ pub mod hab_gossip {
use common::gossip_file::GossipFile;
use common::wire_message::WireMessage;
use hcore::crypto::SymKey;
use rustc_serialize::{json, Encodable};
use rustc_serialize::json;
use utp::UtpSocket;
use uuid::Uuid;

93 changes: 51 additions & 42 deletions components/sup/src/command/start.rs
@@ -60,15 +60,14 @@ use std::env;
use std::path::Path;

use ansi_term::Colour::Yellow;
use common::command::ProgressBar;
use common::command::package::install;
use depot_client::Client;
use hcore::crypto::default_cache_key_path;
use hcore::fs::{cache_artifact_path, FS_ROOT_PATH};
use hcore::package::PackageIdent;

use {PRODUCT, VERSION};
use error::{Error, Result};
use error::Result;
use config::{Config, UpdateStrategy};
use package::Package;
use topology::{self, Topology};
@@ -85,61 +84,71 @@ static LOGKEY: &'static str = "CS";
/// * Fails if an unknown topology was specified on the command line
pub fn package(config: &Config) -> Result<()> {
match Package::load(config.package(), None) {
Ok(package) => {
Ok(mut package) => {
let update_strategy = config.update_strategy();
match update_strategy {
UpdateStrategy::None => {}
_ => {
if let &Some(ref url) = config.url() {
outputln!("Checking remote for newer versions...");
// It is important to pass `config.package()` to `show_package()` instead of the
// package identifier of the loaded package. This will ensure that if the operator
// starts a package while specifying a version number, they will only automaticaly
// receive release updates for the started package.
//
// If the operator does not specify a version number they will automatically receive
// updates for any releases, regardless of version number, for the started package.
let depot_client = try!(Client::new(url, PRODUCT, VERSION, None));
let latest_pkg_data =
try!(depot_client.show_package((*config.package()).clone()));
let latest_ident: PackageIdent = latest_pkg_data.get_ident().clone().into();
if &latest_ident > package.ident() {
outputln!("Downloading latest version from remote: {}", latest_ident);
let mut progress = ProgressBar::default();
let archive = try!(depot_client.fetch_package(latest_ident,
&cache_artifact_path(None),
Some(&mut progress)));
try!(archive.verify(&default_cache_key_path(None)));
try!(archive.unpack(None));
} else {
outputln!("Already running latest.");
};
}
let url = config.url();
outputln!("Checking Depot for newer versions...");
// It is important to pass `config.package()` to `show_package()` instead
// of the package identifier of the loaded package. This will ensure that
// if the operator starts a package while specifying a version number, they
// will only automaticaly receive release updates for the started package.
//
// If the operator does not specify a version number they will
// automatically receive updates for any releases, regardless of version
// number, for the started package.
let depot_client = try!(Client::new(url, PRODUCT, VERSION, None));
let latest_pkg_data =
try!(depot_client.show_package((*config.package()).clone()));
let latest_ident: PackageIdent = latest_pkg_data.get_ident().clone().into();
if &latest_ident > package.ident() {
outputln!("Downloading latest version from Depot: {}", latest_ident);
let new_pkg_data = try!(install::start(url,
&latest_ident.to_string(),
PRODUCT,
VERSION,
Path::new(FS_ROOT_PATH),
&cache_artifact_path(None),
&default_cache_key_path(None)));
package = try!(Package::load(&new_pkg_data, None));
} else {
outputln!("Already running latest.");
};
}
}
start_package(package, config)
}
Err(_) => {
outputln!("{} is not installed",
Yellow.bold().paint(config.package().to_string()));
match *config.url() {
Some(ref url) => {
let url = config.url();
let new_pkg_data = match config.local_artifact() {
Some(artifact) => {
try!(install::start(url,
&artifact,
PRODUCT,
VERSION,
Path::new(FS_ROOT_PATH),
&cache_artifact_path(None),
&default_cache_key_path(None)))
}
None => {
outputln!("Searching for {} in remote {}",
Yellow.bold().paint(config.package().to_string()),
url);
let new_pkg_data = try!(install::from_url(url,
config.package(),
PRODUCT,
VERSION,
Path::new(FS_ROOT_PATH),
&cache_artifact_path(None),
&default_cache_key_path(None)));
let package = try!(Package::load(&new_pkg_data.get_ident().clone().into(),
None));
start_package(package, config)
try!(install::start(url,
&config.package().to_string(),
PRODUCT,
VERSION,
Path::new(FS_ROOT_PATH),
&cache_artifact_path(None),
&default_cache_key_path(None)))
}
None => Err(sup_error!(Error::PackageNotFound(config.package().clone()))),
}
};
let package = try!(Package::load(&new_pkg_data, None));
start_package(package, config)
}
}
}
18 changes: 14 additions & 4 deletions components/sup/src/config.rs
@@ -85,7 +85,8 @@ impl Default for Command {
pub struct Config {
command: Command,
package: PackageIdent,
url: Option<String>,
local_artifact: Option<String>,
url: String,
topology: Topology,
group: String,
path: String,
@@ -261,12 +262,12 @@ impl Config {

/// Set the url
pub fn set_url(&mut self, url: String) -> &mut Config {
self.url = Some(url);
self.url = url;
self
}

/// Return the url
pub fn url(&self) -> &Option<String> {
pub fn url(&self) -> &str {
&self.url
}

@@ -382,6 +383,15 @@ impl Config {
&self.package
}

pub fn set_local_artifact(&mut self, artifact: String) -> &mut Config {
self.local_artifact = Some(artifact);
self
}

pub fn local_artifact(&self) -> Option<&str> {
self.local_artifact.as_ref().map(String::as_ref)
}

pub fn set_organization(&mut self, org: String) -> &mut Config {
self.organization = Some(org);
self
@@ -439,7 +449,7 @@
fn url() {
let mut c = Config::new();
c.set_url(String::from("http://foolio.com"));
assert_eq!(c.url().as_ref().unwrap(), "http://foolio.com");
assert_eq!(c.url(), "http://foolio.com");
}

#[test]
22 changes: 15 additions & 7 deletions components/sup/src/main.rs
@@ -24,6 +24,7 @@ extern crate libc;
#[macro_use]
extern crate clap;

use std::path::Path;
use std::process;
use std::result;
use std::str::FromStr;
@@ -34,7 +35,7 @@ use hcore::env as henv;
use hcore::fs;
use hcore::crypto::{default_cache_key_path, SymKey};
use hcore::crypto::init as crypto_init;
use hcore::package::PackageIdent;
use hcore::package::{PackageArchive, PackageIdent};
use hcore::url::{DEFAULT_DEPOT_URL, DEPOT_URL_ENVVAR};

use sup::config::{Command, Config, UpdateStrategy};
@@ -74,9 +75,15 @@ fn config_from_args(subcommand: &str, sub_args: &ArgMatches) -> Result<Config> {
if let Some(ref archive) = sub_args.value_of("archive") {
config.set_archive(archive.to_string());
}
if let Some(ref package) = sub_args.value_of("package") {
let ident = try!(PackageIdent::from_str(package));
config.set_package(ident);
if let Some(ref ident_or_artifact) = sub_args.value_of("pkg_ident_or_artifact") {
if Path::new(ident_or_artifact).is_file() {
let ident = try!(PackageArchive::new(Path::new(ident_or_artifact)).ident());
config.set_package(ident);
config.set_local_artifact(ident_or_artifact.to_string());
} else {
let ident = try!(PackageIdent::from_str(ident_or_artifact));
config.set_package(ident);
}
}
if let Some(key) = sub_args.value_of("key") {
config.set_key(key.to_string());
@@ -255,12 +262,13 @@ fn main() {
};

let sub_start = SubCommand::with_name("start")
.about("Start a Habitat-supervised service from a package")
.about("Start a Habitat-supervised service from a package or artifact")
.aliases(&["st", "sta", "star"])
.arg(Arg::with_name("package")
.arg(Arg::with_name("pkg_ident_or_artifact")
.index(1)
.required(true)
.help("Name of package to start"))
.help("A Habitat package identifier (ex: acme/redis) or a filepath to a Habitat \
Artifact (ex: /home/acme-redis-3.0.7-21120102031201-x86_64-linux.hart)"))
.arg(arg_url())
.arg(arg_group())
.arg(arg_org())
4 changes: 1 addition & 3 deletions components/sup/src/topology/mod.rs
@@ -148,9 +148,7 @@ impl<'a> Worker<'a> {
UpdateStrategy::None => {}
_ => {
let pkg_lock_2 = pkg_lock.clone();
if let &Some(ref url) = config.url() {
pkg_updater = Some(package::PackageUpdater::start(url, pkg_lock_2));
}
pkg_updater = Some(package::PackageUpdater::start(config.url(), pkg_lock_2));
}
}

1 change: 0 additions & 1 deletion components/sup/tests/util/command.rs
@@ -12,7 +12,6 @@
// See the License for the specific language governing permissions and
// limitations under the License.

use std::io::prelude::*;
use std::io;
use std::process::{Command, Child, Stdio, ExitStatus};
use std::fmt;
