Merged
70 changes: 70 additions & 0 deletions DESIGN.md
@@ -0,0 +1,70 @@
Design
======

Manifest Format
---------------

A filepack manifest contains all information needed to verify the contents of a
directory. The `files` key of the manifest is a directory object mapping
filenames to directory entries. An entry is either another directory or a file,
in which case it contains the hash of the file contents and the length of the
file.

The length of the file is not strictly necessary for verification, but is
included so that truncation or extension can be explicitly identified, which
may help in understanding verification failures.
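
As an illustration of why recording the length helps, here is a minimal sketch
of a verification check that reports truncation and extension separately from
other modifications. The `FileEntry` and `Mismatch` types and the `check`
method are hypothetical names made up for this example, not filepack's actual
API.

```rust
use std::cmp::Ordering;

/// Hypothetical, simplified file entry: the expected hash and length.
pub struct FileEntry {
    pub hash: [u8; 32],
    pub size: u64,
}

/// Hypothetical outcome of a failed check.
#[derive(Debug, PartialEq)]
pub enum Mismatch {
    Truncated { expected: u64, actual: u64 },
    Extended { expected: u64, actual: u64 },
    Modified,
}

impl FileEntry {
    /// Compares the actual hash and length against the entry. The length is
    /// compared first, so a shorter or longer file is reported as truncated
    /// or extended rather than as a bare hash mismatch.
    pub fn check(&self, actual_hash: &[u8; 32], actual_size: u64) -> Option<Mismatch> {
        let (expected, actual) = (self.size, actual_size);
        match actual.cmp(&expected) {
            Ordering::Less => Some(Mismatch::Truncated { expected, actual }),
            Ordering::Greater => Some(Mismatch::Extended { expected, actual }),
            Ordering::Equal if *actual_hash != self.hash => Some(Mismatch::Modified),
            Ordering::Equal => None,
        }
    }
}
```

The hash alone would detect all three cases. The length only makes the failure
easier to interpret.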

File Hashes
-----------

The contents of files are hashed with
[BLAKE3](https://github.com/BLAKE3-team/BLAKE3), using the official Rust
implementation. BLAKE3 was chosen both for its speed, and for the fact that it
utilizes a Merkle tree construction. A Merkle tree allows for verified file
streaming and subrange inclusion proofs, which both seem useful in the context
of file hashing and verification.

Signatures
----------

Filepack allows for the creation of signatures over the contents of a manifest,
which thus commit to the contents of the directory covered by the manifest.
Signatures are made not over the serialized manifest, but over a fingerprint:
a Merkle tree hash computed from the contents of the manifest. This keeps
signatures independent of the manifest format, avoids issues with
canonicalization of the manifest JSON, avoids hash loops due to the inclusion
of signatures in the manifest itself, and allows proving the inclusion of files
covered by a signature using a Merkle receipt.

Fingerprints
------------

Although only manifest fingerprints are exposed externally, several types of
fingerprints are used internally, namely directory, entry, file, and message
fingerprints.

Fingerprints are constructed to be unique, both between and within types,
meaning that, barring a BLAKE3 collision, no two values that differ in type or
content have the same fingerprint.

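Fingerprints are BLAKE3 hashes. To guarantee that fingerprints are unique
between types, the hasher is first initialized with a length-prefixed string
unique to each type: the string `filepack:` followed by the type name, for
example `filepack:directory`, preceded by its length in bytes as a
little-endian `u64`.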
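
The prefix can be sketched as follows. To keep the example free of
dependencies, it builds the bytes that would be fed to the hasher rather than
hashing them. The `prefix_bytes` function is a hypothetical helper written for
this example, modelled on `FingerprintHasher::new` in this change.

```rust
/// Builds the bytes hashed before any fields: the length of the prefix string
/// as a little-endian u64, followed by the prefix string itself.
/// `type_name` is the lowercase type name, e.g. "file" or "directory".
pub fn prefix_bytes(type_name: &str) -> Vec<u8> {
    let prefix = format!("filepack:{type_name}");
    let mut bytes = Vec::with_capacity(8 + prefix.len());
    bytes.extend_from_slice(&(prefix.len() as u64).to_le_bytes());
    bytes.extend_from_slice(prefix.as_bytes());
    bytes
}
```

Because the length comes first, the end of the prefix is unambiguous, so the
prefix cannot be confused with the field data that follows it.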

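After the prefix, the value is hashed as a sequence of tag-length-value (TLV)
fields. Each field consists of a `u64` tag, the `u64` length of the value in
bytes, and the value itself. Both integers are little-endian.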

Fields are hashed in ascending tag order. A tag may be skipped, in the case of
an optional field, or repeated, in the case of a field containing multiple
values, but may never decrease.
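
The field encoding can be sketched in the same dependency-free way, recording
the bytes that would be hashed instead of hashing them. `FieldEncoder` and
`encode_repeated` are hypothetical names for this example. The layout follows
`FingerprintHasher::field` in this change.

```rust
/// Hypothetical stand-in for the hasher: it records the bytes that would be
/// hashed for each field.
pub struct FieldEncoder {
    bytes: Vec<u8>,
    last_tag: u64,
}

impl FieldEncoder {
    pub fn new() -> Self {
        Self { bytes: Vec::new(), last_tag: 0 }
    }

    /// Appends one field: tag and value length as little-endian u64s, then
    /// the value. Panics if `tag` is lower than the previous tag.
    pub fn field(&mut self, tag: u64, value: &[u8]) {
        assert!(tag >= self.last_tag, "unexpected tag {tag}");
        self.last_tag = tag;
        self.bytes.extend_from_slice(&tag.to_le_bytes());
        self.bytes.extend_from_slice(&(value.len() as u64).to_le_bytes());
        self.bytes.extend_from_slice(value);
    }

    pub fn finish(self) -> Vec<u8> {
        self.bytes
    }
}

/// Encodes several values under one repeated tag.
pub fn encode_repeated(tag: u64, values: &[&[u8]]) -> Vec<u8> {
    let mut encoder = FieldEncoder::new();
    for value in values {
        encoder.field(tag, value);
    }
    encoder.finish()
}
```

The per-field length is what keeps repeated fields unambiguous: the values
`"ab", "c"` and `"a", "bc"` concatenate to the same bytes, but their encodings
differ.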

Currently, no fingerprint test vectors exist, and the best documentation is
the code itself.

In particular, see:

- [FingerprintHasher](src/fingerprint_hasher.rs)
- [FingerprintPrefix](src/fingerprint_prefix.rs)
- [Manifest](src/manifest.rs)
- [Directory](src/directory.rs)
- [Entry](src/entry.rs)
- [File](src/file.rs)
- [Message](src/message.rs)
2 changes: 2 additions & 0 deletions README.md
@@ -304,6 +304,8 @@ Fingerprints are BLAKE3 hashes, constructed such that it is impossible to
 produce objects which are different, either in type or content, but which have
 the same fingerprint.

+For details on how fingerprints are calculated, see [DESIGN.md](DESIGN.md).
+
 Alternatives and Prior Art
 --------------------------

2 changes: 1 addition & 1 deletion src/directory.rs
@@ -54,7 +54,7 @@ impl Directory {
   }

   pub(crate) fn fingerprint(&self) -> Hash {
-    let mut hasher = ContextHasher::new(HashContext::Directory);
+    let mut hasher = FingerprintHasher::new(FingerprintPrefix::Directory);

     for (component, entry) in &self.entries {
       hasher.field(0, entry.fingerprint(component).as_bytes());
4 changes: 2 additions & 2 deletions src/entry.rs
@@ -2,14 +2,14 @@ use super::*;

 #[derive(Clone, Debug, Deserialize, PartialEq, Serialize)]
 #[serde(deny_unknown_fields, rename_all = "kebab-case", untagged)]
-pub(crate) enum Entry {
+pub enum Entry {
   Directory(Directory),
   File(File),
 }

 impl Entry {
   pub(crate) fn fingerprint(&self, component: &Component) -> Hash {
-    let mut hasher = ContextHasher::new(HashContext::Entry);
+    let mut hasher = FingerprintHasher::new(FingerprintPrefix::Entry);

     hasher.field(0, component.as_bytes());

2 changes: 1 addition & 1 deletion src/file.rs
@@ -9,7 +9,7 @@ pub struct File {

 impl File {
   pub(crate) fn fingerprint(&self) -> Hash {
-    let mut hasher = ContextHasher::new(HashContext::File);
+    let mut hasher = FingerprintHasher::new(FingerprintPrefix::File);
     hasher.field(0, self.hash.as_bytes());
     hasher.field(1, &self.size.to_le_bytes());
     hasher.finalize()
31 changes: 14 additions & 17 deletions src/context_hasher.rs → src/fingerprint_hasher.rs
@@ -1,32 +1,29 @@
 use super::*;

-pub(crate) struct ContextHasher {
+pub(crate) struct FingerprintHasher {
   hasher: Hasher,
   tag: u64,
 }

-impl ContextHasher {
+impl FingerprintHasher {
   pub(crate) fn field(&mut self, tag: u64, field: &[u8]) {
     assert!(tag >= self.tag, "unexpected tag {tag}");
     self.tag = tag;
-    self.integer(tag);
-    self.integer(field.len().into_u64());
+    self.hasher.update(&tag.to_le_bytes());
+    self.hasher.update(&field.len().into_u64().to_le_bytes());
     self.hasher.update(field);
   }

   pub(crate) fn finalize(self) -> Hash {
     self.hasher.finalize().into()
   }

-  fn integer(&mut self, n: u64) {
-    self.hasher.update(&n.to_le_bytes());
-  }
-
-  pub(crate) fn new(context: HashContext) -> Self {
-    Self {
-      hasher: Hasher::new_derive_key(&format!("filepack:{context}")),
-      tag: 0,
-    }
+  pub(crate) fn new(context: FingerprintPrefix) -> Self {
+    let mut hasher = Hasher::new();
+    let prefix = context.prefix();
+    hasher.update(&prefix.len().into_u64().to_le_bytes());
+    hasher.update(prefix.as_bytes());
+    Self { hasher, tag: 0 }
   }
 }

@@ -37,16 +34,16 @@ mod tests {
   #[test]
   fn contexts_produce_distinct_hashes() {
     let mut hashes = HashSet::new();
-    for context in HashContext::iter() {
-      assert!(hashes.insert(ContextHasher::new(context).finalize()));
+    for context in FingerprintPrefix::iter() {
+      assert!(hashes.insert(FingerprintHasher::new(context).finalize()));
     }
   }

   #[test]
   fn field_values_contribute_to_hash() {
     let mut hashes = HashSet::new();
     for value in 0..2 {
-      let mut hasher = ContextHasher::new(HashContext::Directory);
+      let mut hasher = FingerprintHasher::new(FingerprintPrefix::Directory);
       hasher.field(0, &[value]);
       assert!(hashes.insert(hasher.finalize()));
     }
@@ -55,7 +52,7 @@ mod tests {
   #[test]
   #[should_panic(expected = "unexpected tag 0")]
   fn tag_order() {
-    let mut hasher = ContextHasher::new(HashContext::File);
+    let mut hasher = FingerprintHasher::new(FingerprintPrefix::File);
     hasher.field(1, &[]);
     hasher.field(0, &[]);
   }
10 changes: 4 additions & 6 deletions src/hash_context.rs → src/fingerprint_prefix.rs
@@ -2,21 +2,19 @@ use super::*;

 #[derive(Clone, Copy, EnumIter, IntoStaticStr)]
 #[strum(serialize_all = "kebab-case")]
-pub(crate) enum HashContext {
+pub(crate) enum FingerprintPrefix {
   Directory,
   Entry,
   File,
   Message,
 }

-impl HashContext {
+impl FingerprintPrefix {
   fn name(self) -> &'static str {
     self.into()
   }
-}

-impl Display for HashContext {
-  fn fmt(&self, f: &mut Formatter) -> fmt::Result {
-    write!(f, "{}", self.name())
+  pub(crate) fn prefix(self) -> String {
+    format!("filepack:{}", self.name())
   }
 }
20 changes: 10 additions & 10 deletions src/lib.rs
@@ -18,12 +18,12 @@

 use {
   self::{
-    arguments::Arguments, component::Component, context_hasher::ContextHasher,
-    directory::Directory, display_path::DisplayPath, display_secret::DisplaySecret,
-    entries::Entries, entry::Entry, hash_context::HashContext, lint::Lint, lint_group::LintGroup,
-    message::Message, metadata::Metadata, options::Options, owo_colorize_ext::OwoColorizeExt,
-    path_error::PathError, private_key::PrivateKey, signature_error::SignatureError, style::Style,
-    subcommand::Subcommand, template::Template, utf8_path_ext::Utf8PathExt,
+    arguments::Arguments, component::Component, display_path::DisplayPath,
+    display_secret::DisplaySecret, entries::Entries, fingerprint_hasher::FingerprintHasher,
+    fingerprint_prefix::FingerprintPrefix, lint::Lint, lint_group::LintGroup, message::Message,
+    metadata::Metadata, options::Options, owo_colorize_ext::OwoColorizeExt, path_error::PathError,
+    private_key::PrivateKey, signature_error::SignatureError, style::Style, subcommand::Subcommand,
+    template::Template, utf8_path_ext::Utf8PathExt,
   },
   blake3::Hasher,
   camino::{Utf8Component, Utf8Path, Utf8PathBuf},
@@ -54,16 +54,15 @@ use {
 };

 pub use self::{
-  error::Error, file::File, hash::Hash, manifest::Manifest, public_key::PublicKey,
-  relative_path::RelativePath, signature::Signature,
+  directory::Directory, entry::Entry, error::Error, file::File, hash::Hash, manifest::Manifest,
+  public_key::PublicKey, relative_path::RelativePath, signature::Signature,
 };

 #[cfg(test)]
 use {assert_fs::TempDir, std::collections::HashSet, strum::IntoEnumIterator};

 mod arguments;
 mod component;
-mod context_hasher;
 mod directory;
 mod display_path;
 mod display_secret;
@@ -72,8 +71,9 @@ mod entry;
 mod error;
 mod file;
 mod filesystem;
+mod fingerprint_hasher;
+mod fingerprint_prefix;
 mod hash;
-mod hash_context;
 mod lint;
 mod lint_group;
 mod manifest;
2 changes: 1 addition & 1 deletion src/message.rs
@@ -6,7 +6,7 @@ pub(crate) struct Message {

 impl Message {
   pub(crate) fn digest(self) -> Hash {
-    let mut hasher = ContextHasher::new(HashContext::Message);
+    let mut hasher = FingerprintHasher::new(FingerprintPrefix::Message);
     hasher.field(0, self.fingerprint.as_bytes());
     hasher.finalize()
   }
2 changes: 1 addition & 1 deletion tests/fingerprint.rs
@@ -23,7 +23,7 @@ fn fingerprint() {

   dir.child("filepack.json").assert(json.clone());

-  let fingerprint = "aef130d1d67b911c301079bf05ab37d4160f96c15cc4d1832ea77a9a24a2a73e";
+  let fingerprint = "864e5111ebe431702448d7d7c3f9b962d5659f761fb4287049d52d6376a4c20e";

   cargo_bin_cmd!("filepack")
     .arg("fingerprint")
4 changes: 2 additions & 2 deletions tests/verify.rs
@@ -731,7 +731,7 @@ fn verify_fingerprint() {
     .args([
       "verify",
       "--fingerprint",
-      "aef130d1d67b911c301079bf05ab37d4160f96c15cc4d1832ea77a9a24a2a73e",
+      "864e5111ebe431702448d7d7c3f9b962d5659f761fb4287049d52d6376a4c20e",
     ])
     .current_dir(&dir)
     .assert()
@@ -749,7 +749,7 @@ fn verify_fingerprint() {
       "\
 fingerprint mismatch: `.*filepack\\.json`
 expected: 0000000000000000000000000000000000000000000000000000000000000000
-actual: aef130d1d67b911c301079bf05ab37d4160f96c15cc4d1832ea77a9a24a2a73e
+actual: 864e5111ebe431702448d7d7c3f9b962d5659f761fb4287049d52d6376a4c20e
 error: fingerprint mismatch\n",
     ))
     .failure();