Commit

Adding several BCRs. Renamed CID -> ARID.
wolfmcnally committed Sep 4, 2023
1 parent 852f729 commit bd51df4
Showing 11 changed files with 240 additions and 91 deletions.
4 changes: 3 additions & 1 deletion README.md
@@ -26,13 +26,15 @@ This repository contains research and proposals of interest to the blockchain co
| [BCR-2021-001](papers/bcr-2021-001-request.md) | UR Type Definitions for Transactions Between Airgapped Devices | Wolf McNally |
| [BCR-2021-002](papers/bcr-2021-002-digest.md) | Digests for Digital Objects | Wolf McNally |
| [BCR-2022-001](papers/bcr-2022-001-encrypted-message.md) | Encrypted Message | Wolf McNally |
| [BCR-2022-002](papers/bcr-2022-002-cid-common-identifier.md) | CID: Common Identifier | Wolf McNally |
| [BCR-2022-002](papers/bcr-2022-002-arid.md) | ARID: Apparently Random Identifier | Wolf McNally |
| [BCR-2023-001](papers/bcr-2023-001-compressed-message.md) | Compressed Message | Wolf McNally |
| [BCR-2023-002](papers/bcr-2023-002-known-value.md) | Known Values: A Compact, Deterministic Representation for Ontological Concepts | Wolf McNally |
| [BCR-2023-003](papers/bcr-2023-003-envelope-known-value.md) | Gordian Envelope Extension: Known Values | Wolf McNally |
| [BCR-2023-004](papers/bcr-2023-004-envelope-symmetric-encryption.md) | Gordian Envelope Extension: Symmetric Encryption | Wolf McNally |
| [BCR-2023-005](papers/bcr-2023-005-envelope-compression.md) | Gordian Envelope Extension: Compression | Wolf McNally |
| [BCR-2023-006](papers/bcr-2023-006-envelope-attachment.md) | Gordian Envelope Extension: Attachments | Wolf McNally |
| [BCR-2023-007](papers/bcr-2023-007-envelope-output-desc.md) | Bitcoin Output Descriptors in Gordian Envelopes | Wolf McNally |
| [BCR-2023-008](papers/bcr-2023-008-dcbor-date.md) | Preferred Encoding of Dates in dCBOR | Wolf McNally |

_Also see our [Testimony](https://github.com/BlockchainCommons/Testimony/blob/master/README.md) and our [Wallet Improvement Proposals](https://github.com/BlockchainCommons/wips/blob/master/README.md)._

2 changes: 1 addition & 1 deletion papers/bcr-2020-006-urtypes.md
@@ -75,7 +75,7 @@ This document also lists the tag, if any, defined for the particular CBOR struct
| `crypto-sskr` | 309 | SSKR (Sharded Secret Key Reconstruction) shard | [[BCR-2020-011]](bcr-2020-011-sskr.md) |
| `crypto-psbt` | 310 | Partially Signed Bitcoin Transaction (PSBT) | This document |
| `crypto-account` | 311 | BIP44 Account | [[BCR-2020-015]](bcr-2020-015-account.md) |
| `cid` | 40012 | Common Identifier | [[BCR-2022-002]](bcr-2022-002-cid-common-identifier.md) | |
| `arid` | 40012 | Apparently Random Identifier | [[BCR-2022-002]](bcr-2022-002-arid.md) | |
| `seed-digest` | 40013 | Seed digest | [BCFoundation] |
| `nonce` | 40014 | Cryptographic nonce | [SecureComponents] |
| `password` | 40015 | Hashed password (e.g., Scrypt) | [SecureComponents] |
6 changes: 6 additions & 0 deletions papers/bcr-2020-010-output-desc.md
@@ -10,6 +10,12 @@ Revised: November 8, 2021

---

### DEPRECATED: Superseded by Envelope-Based Output Descriptors

This document has been superseded by textual [Bitcoin output descriptors enclosed in Gordian Envelopes](bcr-2023-007-envelope-output-desc.md).

The content below is now deprecated and of historical interest only.

### Introduction

Output descriptors [OD-IN-CORE] [OSD], also called output script descriptors, are a way of specifying Bitcoin payment outputs that can range from a simple address to multisig and segwit using a simple domain-specific language. For more on the motivation for output descriptors, see [WHY-OD].
6 changes: 6 additions & 0 deletions papers/bcr-2020-015-account.md
@@ -10,6 +10,12 @@ Revised: December 6, 2021<br/>

---

### DEPRECATED: To be superseded by Envelope-Based BIP44 Accounts

This document will soon be superseded by a format using Gordian Envelope, which will be based on [Bitcoin output descriptors enclosed in Gordian Envelopes](bcr-2023-007-envelope-output-desc.md).

The content below is now deprecated and of historical interest only.

### Abstract

This BCR describes a data format that promotes standards-based sharing of [BIP44] account level xpubs and other information, allowing devices to join wallets with little to no user interaction required.
126 changes: 126 additions & 0 deletions papers/bcr-2022-002-arid.md
@@ -0,0 +1,126 @@
# ARID: Apparently Random Identifier

## BCR-2022-002

**© 2022 Blockchain Commons**

Authors: Wolf McNally, Christopher Allen, Shannon Appelcline<br/>
Date: Aug 10, 2022<br/>
Updated: Sep 3, 2023

---

## Introduction

Information systems use many kinds of identifiers for many purposes. The main purpose of an identifier is to uniquely point to an object, or *referent*, within a given domain. An identifier that is *universally* unique can be associated to any object or concept in all of existence and be relied on to be unique because it contains sufficient entropy (randomness) to ensure that it will, for every conceivable practical purpose, *never* collide with another such identifier.

## Survey

Universally unique identifiers have precedent in (for example) [UUIDs](https://en.wikipedia.org/wiki/Universally_unique_identifier), [URIs](https://en.wikipedia.org/wiki/Uniform_Resource_Identifier), and cryptographic digests.

### UUIDs

UUIDs are 128 bits in length and come in several different versions. Each version specifies several bitfields and their semantics. Version 4 is specified to be random, but is still not completely random: the specification does not require that cryptographically strong randomness always be used, and it reserves 6 bits (4 for the version and 2 for the variant) to identify it *as* a version 4 UUID, leaving 122 bits of actual randomness.

### URIs

URIs are (more or less) human-readable text. URI specifications usually focus on human-understandable semantics, and URIs are frequently hierarchical, starting with the `scheme` field, which describes a namespace within which the remainder of the URI is considered to point to a referent.

### Digests

A cryptographic hash algorithm such as SHA-256 or BLAKE3 maps a block of data of arbitrary size to a fixed-length "digest." This digest reveals nothing about the source image by itself, but can only be computed by applying the same algorithm to the same image. A digest can thereby be considered a "pointer" to a particular binary referent.
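As a minimal illustration of the "pointer" property described above, the following sketch uses SHA-256 from Python's standard `hashlib` module. The input strings are arbitrary examples chosen for this sketch.

```python
import hashlib

def digest(data: bytes) -> bytes:
    """Return the 32-byte SHA-256 digest of `data`."""
    return hashlib.sha256(data).digest()

a = digest(b"Hello, world!")
b = digest(b"Hello, world!")
c = digest(b"Hello, world?")

print(len(a))   # 32: the digest length is fixed regardless of input size
print(a == b)   # True: the same input always yields the same digest
print(a == c)   # False: a different input yields a different digest
```

Because anyone holding the same input can recompute the digest, a digest is tied to its referent in a way that an ARID, as defined below, must not be.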

## Introducing the ARID

We propose herein a standard for a cryptographically strong, universally unique identifier known as an Apparently Random Identifier, or ARID.

The goals for this form of identifier are:

* Non-correlatability
* Neutral semantics
* Open generation
* Minimum strength
* Cryptographic suitability

### Non-Correlatability

To be an ARID, the sequence of bits that comprise it MUST NOT be correlatable with its referent, nor any other ARID. Therefore, it MUST NOT be a hash or digest of another object.

The sequence of bits in an ARID MUST be statistically indistinguishable from pure entropy. Therefore one method of generating an ARID is to use a cryptographically strong random number generator.

However, the source of entropy for an ARID does not itself have to actually be random; it simply has to be indistinguishable from randomness without additional hidden information. One example would be when a sequence of ARIDs is generated from a ratcheting key generation algorithm. Knowing the current state of the ratchet and the correct ARID would give one the ability to ratchet the key to the next state and generate the next ARID in the sequence. A third-party observer would be unable to correlate the next ARID with the previous ARID without access to the secret ratchet state.
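The ratchet idea can be sketched as follows. This document does not specify a ratchet algorithm, so the construction here (HMAC-SHA256 with the labels `b"arid"` and `b"advance"`) and the `Ratchet` class are hypothetical choices made only for illustration.

```python
import hashlib
import hmac
import secrets

class Ratchet:
    """Hypothetical HMAC-SHA256 ratchet; not part of this specification."""

    def __init__(self, secret: bytes) -> None:
        if len(secret) != 32:
            raise ValueError("secret must be 32 bytes")
        self._state = secret

    def next_arid(self) -> bytes:
        # Derive the identifier and the next state under separate labels,
        # so that a published ARID does not reveal the ratchet state.
        arid = hmac.new(self._state, b"arid", hashlib.sha256).digest()
        self._state = hmac.new(self._state, b"advance", hashlib.sha256).digest()
        return arid

# Two parties that share the secret produce the same sequence of ARIDs.
shared_secret = secrets.token_bytes(32)
alice = Ratchet(shared_secret)
bob = Ratchet(shared_secret)

first, second = alice.next_arid(), alice.next_arid()
```

Each output is keyed by secret state, so an observer who sees only `first` and `second` has no practical way to link them, while `bob` can reproduce both in order.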

### Neutral Semantics

Existing identifiers frequently contain inherent type information (UUID version 4 identifies itself as such) and frequently specify the type of referent (URIs specify the `scheme` and often specify a referent type such as `.jpg` in their path).

ARIDs contain no type information. Statistically, they are uniformly random sequences of bits. If you merely encoded an ARID as a sequence of binary or hexadecimal digits, it would appear to be a random sequence.

Type information can be added at higher levels. When encoded as [CBOR](https://cbor.io/), an ARID is tagged with #6.40012. Tagged this way, the receiver of an ARID can still only determine that it *is* an ARID, and nothing about the type or nature of its referent.
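A hand-rolled sketch of that CBOR encoding is shown below. It assumes the tagged content is a 32-byte CBOR byte string, which this section does not state explicitly; the function name `encode_arid_cbor` is hypothetical, and a real implementation would more likely use a CBOR library.

```python
import secrets

ARID_TAG = 40012  # CBOR tag number given in the text
ARID_LEN = 32     # 256 bits

def encode_arid_cbor(arid: bytes) -> bytes:
    """Encode `arid` as CBOR tag 40012 wrapping a 32-byte byte string."""
    if len(arid) != ARID_LEN:
        raise ValueError("an ARID must be exactly 32 bytes")
    # Major type 6 (tag) with a 2-byte argument: 0xd9, then the tag number.
    tag_header = bytes([0xD9]) + ARID_TAG.to_bytes(2, "big")
    # Major type 2 (byte string) with a 1-byte length: 0x58, then the length.
    bytes_header = bytes([0x58, ARID_LEN])
    return tag_header + bytes_header + arid

encoded = encode_arid_cbor(secrets.token_bytes(ARID_LEN))
print(encoded[:5].hex())  # d99c4c5820
print(len(encoded))       # 37
```

The five header bytes are the only structure in the encoding; the remaining 32 bytes carry no type or referent information.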

In particular, this construct provides no information about the lifetime of the referent. The referent could exist persistently for all time, such as in a blockchain, or it could exist for milliseconds, as in a distributed function call.

This construct also provides no information as to the source of its bit sequence. Since the sequence is statistically random, it could have been generated by a cryptographic random number generator or a sequence of ratcheting keys, and either case would be indistinguishable to a third-party observer.

Higher-level semantics are provided by how an ARID is further tagged, or by how it is positioned in a larger structure, or both. For instance, a distributed function call could have a header that includes the construct `request(ARID(XXX))` where `request` is a CBOR tag indicating that the remainder of the structure specifies which function to call and with what parameters, and `ARID` specifies its tagged contents as conforming to the other requirements of this document. Positional information would include, for example, the position of the ARID within a header, or which field an ARID populates, such as `person: ARID(XXX)`. In this example, being the value of the `person` field is sufficient to use the ARID as a "person identifier" *unless* there is more than one distinct kind of "person", in which case another tag would be needed to disambiguate this.

### Open Generation

As mentioned above, *any* method of generating an ARID is allowed as long as it fulfills the other requirements of this document, chiefly:

1. statistically random bits, and
2. universal uniqueness.

### Minimum Strength

ARIDs must be a minimum of 256 bits (32 bytes) in length. At this time, there is no perceived need for ARIDs to be longer, and thus conformant processes that receive ARIDs MAY reject ARIDs that are longer or shorter than 256 bits, while processes that generate ARIDs SHOULD only generate ARIDs that are exactly 256 bits in length.
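A minimal sketch of both sides of this rule follows, using Python's standard `secrets` module as the cryptographically strong source. The function names `new_arid` and `accept_arid` are hypothetical, and the receiver here takes the strict option of rejecting any length other than 256 bits.

```python
import secrets

ARID_LEN = 32  # 256 bits

def new_arid() -> bytes:
    """Generate a fresh ARID from the operating system's CSPRNG."""
    return secrets.token_bytes(ARID_LEN)

def accept_arid(candidate: bytes) -> bool:
    """Receiver-side length check: accept only exactly 32 bytes."""
    return len(candidate) == ARID_LEN

arid = new_arid()
print(accept_arid(arid))       # True
print(accept_arid(arid[:16]))  # False: too short
```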

### Cryptographic Suitability

The foregoing notwithstanding, ARIDs MAY be used as inputs to cryptographic constructs such as ratcheting key algorithms, or used as additional entropy for random number generators, or salt for hashing algorithms, as long as the output of such algorithms is necessarily related to the ARID's referent.

For example, in the distributed call scenario, a caller might transmit a structure including `request(ARID(A))`, where A is an ARID generated from an iteration of a ratcheting key algorithm. The receiver compares `A` to its own internal state, rejecting the call if it does not match, and advancing the state of its ratchet if it does. The receiver computes the result of the call and returns a structure including `response(ARID(B))`, where B is generated from the new state of the ratchet. The caller receives the response and uses the algorithm to correlate `B` in the response to its call `A`, and if further exchanges are needed, uses the ratchet to produce the next expected transaction ID, `C`. Third parties viewing the exchange cannot correlate `A`, `B`, or `C`, and in particular, they cannot correlate a specific response to its call.
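The exchange above can be simulated in a few lines. As before, the ratchet construction (HMAC-SHA256 with fixed labels) and the names `Party`, `take_arid`, `matches`, and `handle_request` are hypothetical, since this document does not define a concrete algorithm or message format.

```python
import hashlib
import hmac
import secrets
from typing import Optional

def derive_arid(state: bytes) -> bytes:
    """Derive the ARID for a ratchet state (hypothetical construction)."""
    return hmac.new(state, b"arid", hashlib.sha256).digest()

def advance(state: bytes) -> bytes:
    """Step the ratchet to its next state (hypothetical construction)."""
    return hmac.new(state, b"advance", hashlib.sha256).digest()

class Party:
    """One end of the exchange, holding its copy of the secret ratchet state."""

    def __init__(self, secret: bytes) -> None:
        self.state = secret

    def take_arid(self) -> bytes:
        # Produce the ARID for the current state, then advance the ratchet.
        arid = derive_arid(self.state)
        self.state = advance(self.state)
        return arid

    def matches(self, arid: bytes) -> bool:
        # Constant-time comparison against the ARID expected at this state.
        return hmac.compare_digest(arid, derive_arid(self.state))

def handle_request(receiver: Party, request_arid: bytes) -> Optional[bytes]:
    """Return the response ARID, or None if the request is rejected."""
    if not receiver.matches(request_arid):
        return None              # state is left untouched on rejection
    receiver.take_arid()         # consume A, advancing the ratchet
    return receiver.take_arid()  # B comes from the new state

secret = secrets.token_bytes(32)
caller, receiver = Party(secret), Party(secret)

a = caller.take_arid()            # sent as request(ARID(A))
b = handle_request(receiver, a)   # returned as response(ARID(B))
correlated = b is not None and caller.matches(b)
caller.take_arid()                # consume B on the caller's side
c = caller.take_arid()            # next expected identifier, C
```

Only the two parties holding `secret` can tell that `a`, `b`, and `c` belong to the same conversation; a request carrying any other identifier is rejected without advancing the receiver's state.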

## Not to be Confused With

ARIDs MUST NOT be confused with any other sort of identifier or sequence of random or pseudorandom numbers.

* ARIDs MUST NOT be cast to or from other identifier types such as UUIDs, nor should they be considered isomorphic to any other type.
* ARIDs MUST NOT be cast from digests (hashes) or similar structures.
* ARIDs are not [nonces](https://en.wikipedia.org/wiki/Cryptographic_nonce). Unlike nonces, ARIDs always have a referent. ARIDs MUST NOT be used as nonces, and MUST NOT be created by casting from a nonce used anywhere else.
* ARIDs are not keys and MUST NOT be used as keys.
* ARIDs are not cryptographic seeds. They are generally not considered secret, and MUST NOT be used as secret key material from which keys or other secret constructs are derived.

## Q&A

### What advantage do ARIDs have over simply using hashes?

Hashes identify a fixed, immutable state of data. If the data changes, the hash changes. ARIDs, on the other hand, can serve as a stable identifier for mutable data structures. They provide universal uniqueness without tying them to a specific data snapshot, making them more versatile for identifying evolving or mutable referents.

### Why can't ARIDs be cast to or from other identifier types?

Casting ARIDs to or from other identifier types compromises their neutral semantics and could introduce correlation with their referent or with other ARIDs. It undermines the fundamental aim of being universally unique while remaining completely opaque regarding their origin or the data they reference.

### Why can't ARIDs be cast from digests?

Hashes like SHA-256 are deterministic and directly tied to the data they represent. This compromises the non-correlatability requirement of ARIDs. If you use a hash, anyone with the same input data could generate the same ARID, making it possible to correlate the identifier with its referent. This runs counter to the primary goals of ARIDs, which aim for complete opacity regarding their generation method and the data they are linked to.

### Why can't ARIDs be used as nonces?

ARIDs are designed to be universally unique identifiers tied to a referent, whereas nonces are often ephemeral and context-dependent. Using an ARID as a nonce could mislead readers into thinking it is meant to be associated with a specific object or event long-term. This discrepancy in purpose could cause semantic confusion and potential security risks.

### Why must ARIDs be exactly 256 bits in length?

ARIDs are set at 256 bits to meet a minimum threshold for cryptographic strength and universal uniqueness. Shorter lengths compromise these properties. Longer lengths don't offer proportionate benefits but increase computational and storage costs. Therefore, the 256-bit length is both sufficient and efficient.

### Why are ARIDs not considered secret key material?

ARIDs are not designed to be secret; their primary role is to serve as identifiers that are uncorrelated with their referents. Using them as secret key material would be a misuse of the structure and could compromise the security of cryptographic systems where actual secret keys are needed.

### How is universal uniqueness guaranteed for ARIDs when multiple generation methods are allowed?

The "universal uniqueness" of ARIDs comes from adhering to stringent entropy requirements. Regardless of the generation method—be it a cryptographically secure random number generator or a ratcheting key algorithm—the resulting ARID must be statistically indistinguishable from pure entropy and at least 256 bits long. The sheer scale of the entropy space for a 256-bit identifier effectively guarantees that the chance of collision, even when using multiple methods, is astronomically low. Therefore, as long as the entropy requirements are rigorously met, the "universal uniqueness" is practically assured.
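To put a number on "astronomically low", the sketch below evaluates the standard birthday upper bound, n(n-1)/2 divided by 2^256, on the probability of any collision among n identifiers. It assumes the identifiers are drawn uniformly and independently, and the count of 10^18 identifiers is an arbitrary figure chosen for illustration.

```python
from fractions import Fraction

def collision_bound(n: int, bits: int = 256) -> Fraction:
    """Birthday upper bound n(n-1)/2 / 2^bits on the collision probability."""
    return Fraction(n * (n - 1), 2 * 2**bits)

p = collision_bound(10**18)
print(float(p) < 1e-40)  # True
```

Even at a quintillion identifiers, the bound stays below 10^-40, which is why the length requirement rather than the generation method carries the uniqueness argument.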

### Why is it OK to use ARIDs as inputs to cryptographic constructs?

Using ARIDs as inputs to cryptographic constructs doesn't violate their non-correlatability or neutral semantics. It doesn't reveal information about the ARID or its referent, maintaining their core attributes. It simply utilizes their high-entropy nature for cryptographic operations.