Data Model

tarcieri edited this page Oct 7, 2014 · 62 revisions

The Cryptosphere exposes a "blob store" model similar to Amazon S3. All files in the system are immutable. Files provide both the base abstraction for all persistent distributed state in the system and a way to store user-provided data in a distributed manner. All files are encrypted. Access is controlled using a capability-based security model.

Files have two capability levels:

  • Read: Locate and decrypt all of the content in a given block tree. Readers can also Repair (see below) or provide only the Repair capability to others.
  • Repair: Only has access to a "shadow tree." Can enumerate and verify all blocks in the tree but cannot decrypt them

Files are composed of blocks which are the core encryption primitive of the system.

Threat Model

  • We are handing data to a potential attacker, which is securely identified by an ID and encrypted with a secret key
  • This portion of the threat model assumes we ask the potential attacker for data we originally gave them by ID, and they give something back to us, potentially the data we are interested in. For now we're unconcerned with them losing the data or whatever
  • They're going to give us something and we need to ensure it's what we expect, even if we lost the original data and have only an ID by which to access it
  • We would like to ensure the confidentiality of the data even when it's in the hands of an attacker, as long as they don't have the secret key for it
  • We would like to (electively) guard against the confirmation of file attack and learn the remaining information attack in convergent encryption systems by using a "convergence secret" (see below)
  • We would like for files to remain confidential even if an ID and the convergence secret are known to an attacker (but not the secret keys)

Diagram of an Immutable File

Immutable File Diagram

Diagram showing the structure and arrangement of blocks within the Cryptosphere:

  • K: Key Blocks
  • D: Data Blocks
  • S: Shadow Blocks
  • J: Join Blocks

Block Encryption Scheme

Blocks are the fundamental cryptographic building component of the system. Blocks contain between 0 and 1 Mebibyte (1048576 bytes) of plaintext, along with a small amount of metadata which is encrypted along with the plaintext. Every block in the entire system is convergently encrypted with a unique key, derived as the combination of the hash of the original content and an optional "convergence secret" which allows individuals, groups, or all users of the system globally to store data in a manner that always produces the same ciphertext for the given plaintext.

Version 0

The following describes the initial block format. Versioning is proactively included in the system in preparation for the event that security issues are discovered in the ciphers used in this version and they need to be changed out for newer, better ciphers without known vulnerabilities.

The following cryptographic primitives are used as part of a construction called Blake2bXSalsa20Poly1305:

BLAKE2b and secretbox are combined as follows:

  • p: is the 0 to 1 MiB plaintext to be encrypted
  • s: is an optional convergence secret of 0 (i.e. omitted) to 64 bytes in length
  • k: is the 32-byte symmetric key we will use, which we derive as BLAKE2b(p,32,s)
  • c: is the encrypted block computed as secretbox(k,0,p) meaning we encrypt p under k with a nonce of zero
  • i: is the ID of the block, computed as BLAKE2b(c,32,"crypt.block")

This scheme deterministically computes the same block for the same combination of plaintext p and convergence secret s, while also ensuring every block is encrypted under a unique key. Because we derive a unique key per block, we do not need to worry about nonces and can always supply zeroes.

We also calculate a unique identifier for the block by hashing the ciphertext under the key "block". While it would be nice to use BLAKE2's personalization strings here, for various reasons the string "block" is supplied as a key instead. This may be amended in future versions of the block format.

We can now locate and verify the integrity of any block in the system with knowledge of i, as the Cryptosphere provides a content addressable storage for blocks, i.e. given knowledge of i the Cryptosphere provides the ability to locate and obtain c (note: more details needed here). With knowledge of both i and k we can obtain the original plaintext of the block by using i to locate c then using k to decrypt c to obtain p.

A secure MAC (Poly1305) is used to verify the integrity of the data, however it's important that we also always verify that the hash of c matches i. This is because anyone with k can produce any ciphertexts they want, however it may not be the ciphertext we're interested in. To ensure we've received the desired ciphertext, we must ensure its hash matches what we're expecting. We can't check the hash of the plaintext that was used to derive k, because that would require knowing the convergence secret s used to calculate k, which may be undesirable to give away with the block. So integrity checking is twofold:

  • Calculate the BLAKE2b hash of c and ensure it matches i
  • Verify the integrity of c under k using Poly1305 (performed for us by secretbox)

Block Types

The Cryptosphere defines the following block types as the structural components of its data model:

Data Blocks

Data blocks store up to 1MiB of arbitrary data, in addition to their header. The header is 2 bytes. Once decrypted, data blocks have the following structure:

  • version byte ([a-z0-9]): indicates the version of the block format. The only allowed value is 0
  • D (0x44): indicates a data block
  • data (.{0,1048576}): up to 1MiB of arbitrary data

Key Blocks

Key blocks provide secret keys to multiple other sub-blocks, and are used to store files which are larger than 1MiB. Key blocks can point to multiple data sub-blocks, or potentially other list sub-blocks in a structure known as a Merkle tree, although the present version of the Cryptosphere does not yet support nested key blocks.

Key Blocks have the following structure:

  • version byte ([a-z0-9]): indicates the version of the block format. The only allowed value is 0
  • K (0x4b): indicates a key block
  • shadow block ID (.{32}): the ID of a block which contains only the IDs (but not the keys) of the blocks listed in this block (see "Shadow Blocks" below)
  • shadow block key (.{32}): the secret key to the shadow block. Together with the shadow block ID, provides a capability for verifying the integrity of the sub-blocks of this block
  • keys ((.{32}){1,32768}): The secret keys for between 1 and 32768 other sub-blocks which belong to this block. Each key block maps directly onto data blocks can therefore represent up to 32 Gibibytes (34359738368 bytes) of data.

In order to access a particular sub-block of a key block, a client must also obtain the corresponding shadow block which contains the IDs of the sub-blocks.

Shadow Blocks

Shadow blocks provide the verification capability for sub-blocks without revealing their contents. Each shadow block maps directly onto a key block, but contains only the IDs of the sub-blocks, rather than the secret keys.

Shadow Blocks have the following structure:

  • version byte ([a-z0-9]): indicates the version of the block format. The only allowed value is 0
  • S (0x53): indicates a shadow block
  • IDs ((.{32}){1,32768}): The IDs for between 1 and 32768 other sub-blocks which belong to this shadow block. Any peer with this block alone is able to verify the integrity of all of its sub-blocks, but cannot decrypt them without the corresponding key block

Join Blocks

Join blocks are used to shadow non-leaf levels in the Merkle tree. They join together two shadow blocks, identified in the diagram above as S1 and S2. S1 points to the next level of key blocks. S2 points to the next level of shadow blocks. In addition, join blocks contain the keys to all of the shadow blocks in the next level (i.e. the ones point to by S2). This allows someone with knowledge of the join block to perform a "deep verify" of the entire structure of the Merkle tree, without revealing its contents.

Join Blocks have the following structure:

  • version byte ([a-z0-9]): indicates the version of the block format. The only allowed value is 0
  • K (0x4b): indicates a key block
  • S1 block ID (.{32}): the ID of a shadow block which contains only the IDs (but not the keys) of the key blocks in the next level of the tree
  • S1 block key (.{32}): the secret key to the S1 block
  • S2 block ID (.{32}): the ID of a shadow block which contains only the IDs (but not the keys) of the shadow blocks in the next level of the tree
  • S2 block key (.{32}): the secret key to the S2 block
  • keys ((.{32}){1,32768}): The secret keys for between 1 and 32768 shadow blocks (whose IDs are stored in S2) in the next level of the tree

Pack Blocks

NOTE: the initial version of the Cryptosphere will not implement pack blocks. They are described here as a potential optimization to prevent slow, deeply nested recursive lookups

Pack Blocks allow embedding multiple, potentially recursively-nested sub-blocks within a single, outer block. This allows the entire subtrees of deeply nested structures to be represented as a single block, or potentially a smaller number of condensed (hence "pack") blocks.

Open Questions

Q: Is this combination of primitives redundant and unnecessarily slow?

Arguably in this sort of content addressable storage system where everything is authenticated through hash trees whose roots are signed, the Poly1305 MAC is redundant. Zooko argued against the use of a MAC in these sorts of systems in his blog post On the limits of the use cases for authenticated encryption.

Some pros and cons of a MAC here:

Pros: (in favor of XSalsa20Poly1305)

  • djb doesn't make mistakes (most of the time). Using cryptographer-supplied authenticated encryption gives us a larger safety margin
  • Cryptographically secure bug (or potential attack?) detection: guaranteed to blow up if we supply the wrong key or if the ciphertext is corrupt

Cons: (would favor "naked" Salsa20, potentially in combination with Blake2b)

  • We can authenticate content by its ID (hash of its ciphertext) alone
  • Anyone else who has access to the same key (e.g. potentially an attacker) can create ciphertexts under that same key. We need to check the Blake2b hash to ensure we have the correct content
  • Speed. Not having to compute the Poly1305 MAC would improve the performance of the system



  • Should compute hashes of both plaintext (unkeyed) and ciphertext
  • Shouldn't use convergence secret as key of plaintext hash. Hash the convergence secret with the hash of the plaintext after computing that unkeyed
  • Should specify how files are divided into chunks (i.e. can last block be <1MiB or is it padded?)
  • Should specify a file length
  • Block header format should be extensible
  • Versioning and compatibility suck
  • See Merkator File Spec

Suggested Reading

Please see P2P Storage section of our Reading List