SIP 1 - Efficient Operation Mapping #766

Closed · csuwildcat opened this issue Jun 23, 2020 · 21 comments
@csuwildcat (Member) commented Jun 23, 2020

  SIP: 1
  Upgrade-Type: Hard Fork (to guarantee outcomes)
  Title: Efficient DID Operation Mapping
  Author: Daniel Buchner <daniel.buchner@microsoft.com>
  Comments-Summary: No comments yet.
  Comments-URI: https://github.com/decentralized-identity/sidetree/sips/1.md
  Status: Draft
  Created: 2020-06-23

Summary

By segregating the proving data contained in the operation entries currently housed in the Anchor File and Map File (for Recovery, Deactivate, and Update operations), it is possible to realize a dramatic ~75% reduction in the minimum dataset required to trustlessly resolve DIDs.

The effect of moving this data to segregated Proving Files is that the Anchor and Map Files become lightweight, spam-protected operation indexes, allowing nodes of various configurations to defer acquisition of proving data in a JIT fashion.

Motivation

These changes would make initialization of many node types faster, more efficient, and, most importantly, operationally feasible for the average user-operator. Sustainable node operation on consumer hardware is a key requirement for any decentralized network of this class, so keeping network storage growth comfortably 'under the line' of the commodity storage cost curve and bandwidth growth curves is essential. While such curves lack precision, examining the trajectory of storage and bandwidth against the waning cadence of the Kryder's Law and Edholm's Law doubling conjectures suggests that 2-3 terabytes of growth per annum in a network's minimum required dataset is the top end of sustainability for a system that features peer-based replication of data and defers CPU-intensive tasks to a JIT compilation/resolution phase.

Requirements

  • Target an upper limit of 2-3 terabytes per year of growth for the minimum required dataset, assuming a sustained rate of 1000 operations per second (a back-of-the-envelope check of this budget follows this list).
  • Push as much data out of the primary indexing files (Anchor and Map Files) as possible.
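
As a rough sanity check of this budget, using the ~275-byte and ~65-byte per-entry figures cited later in this thread (an illustration, not normative numbers):

// Back-of-the-envelope check of the 2-3 TB/year budget (TypeScript).
// The entry sizes are figures cited later in this thread; the constant
// names are illustrative, not part of the spec.
const OPS_PER_SECOND = 1000;
const SECONDS_PER_YEAR = 365 * 24 * 60 * 60; // 31,536,000

const bytesPerYear = (entryBytes: number): number =>
  entryBytes * OPS_PER_SECOND * SECONDS_PER_YEAR;

const TB = 1e12;
console.log(bytesPerYear(275) / TB); // current full entries: ~8.67 TB/year
console.log(bytesPerYear(65) / TB);  // segregated index entries: ~2.05 TB/year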

Technical Proposal

The primary technical changes center on moving proving data out of the Anchor File and Map File, leaving those files to act as bare-minimum indexes that give a node global awareness of possible operations for any DID in the system. The proposed changes include the addition of two new intermediary files between the Anchor and Chunk Files. All changes to the existing Anchor and Map Files, as well as the new Proving Files, are as follows:

Anchor File

The Anchor File would be modified in the following ways:

  1. Add a new CAS URI link to a Retained Proving File, which contains the signed operation data that previously existed in the recover and deactivate operation entries.
  2. Add a new CAS URI link to a Transient Proving File, which contains the signed operation data that previously existed in the update operation entries of the Map File.
  3. Modify the create operation across the spec to reflect the fact that the recovery_commitment is the hash of the hash of the JWK value being committed to (the eventual reveal_value being the single hash of that JWK).
  4. Modify the recover and deactivate operation entries to include only the did_suffix and reveal_value properties. The reveal_value is the hash of the JWK in the signed_data object that was relocated to the Retained Proving File; its corresponding commitment is the hash of this reveal_value.
{
  "retained_proving_file": CAS_URI,
  "transient_proving_file": CAS_URI,
  "map_file": CAS_URI,
  "writer_lock_id": OPTIONAL_LOCKING_VALUE,
  "operations": {
    "create": [
      {
        "suffix_data": { // Base64URL encoded
          "delta_hash": DELTA_HASH,
          "recovery_commitment": COMMITMENT_HASH
        }
      },
      {...}
    ],
    "recover": [
      {
        "did_suffix": SUFFIX_STRING,
        "reveal_value": MULTIHASH_OF_JWK
      },
      {...}
    ],
    "deactivate": [
      {
        "did_suffix": SUFFIX_STRING,
        "reveal_value": MULTIHASH_OF_JWK
      },
      {...}
    ]
  }
}
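
To illustrate the deferred (JIT) acquisition model described in the Summary, here is a minimal sketch of how a light node might process an Anchor File: index the lightweight entries immediately, and fetch proving data only when a DID is actually resolved. The type shapes and the fetchFromCas helper are illustrative assumptions, not part of the spec.

interface AnchorFileIndexEntry {
  did_suffix: string;
  reveal_value: string; // multihash of the JWK
}

interface AnchorFile {
  retained_proving_file: string;  // CAS URI
  transient_proving_file: string; // CAS URI
  map_file: string;               // CAS URI
  operations: {
    recover?: AnchorFileIndexEntry[];
    deactivate?: AnchorFileIndexEntry[];
  };
}

// Hypothetical CAS client; a real node would fetch from IPFS or similar.
async function fetchFromCas(casUri: string): Promise<Uint8Array> {
  throw new Error(`CAS fetch not implemented: ${casUri}`);
}

class LightNode {
  private index = new Map<string, AnchorFileIndexEntry[]>();

  // Phase 1: build the global operation index from the lightweight entries.
  ingestAnchorFile(anchor: AnchorFile): void {
    for (const entry of anchor.operations.recover ?? []) {
      const list = this.index.get(entry.did_suffix) ?? [];
      list.push(entry);
      this.index.set(entry.did_suffix, list);
    }
    // deactivate (and, via the Map File, update) entries are indexed the same way
  }

  // Phase 2 (JIT): pull proving data only when a DID is resolved.
  async resolve(didSuffix: string, anchor: AnchorFile): Promise<void> {
    if (!this.index.has(didSuffix)) return;
    const provingBytes = await fetchFromCas(anchor.retained_proving_file);
    // ...verify each signed_data entry against its indexed reveal_value, then apply
  }
}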

Map File

The Map File would be modified in the following ways:

  1. Modify the update operation entries to include only the did_suffix and reveal_value properties. The reveal_value is the hash of the JWK in the signed_data object that was relocated to the Transient Proving File; its corresponding commitment is the hash of this reveal_value.
{
  "chunks": [
    { "chunk_file_uri": CHUNK_HASH },
    {...}
  ],
  "operations": {
    "update": [
      {
        "did_suffix": DID_SUFFIX,
        "reveal_value": MULTIHASH_OF_JWK
      },
      {...}
    ]
  }
}

Retained Proving File

The Retained Proving File will contain the following:

  1. The signed_data portions of the recover and deactivate operation entries that previously lived in the Anchor File are now present in the operations object under their respective properties, and MUST be ordered in the same index order as their corresponding entries in the Anchor File (see the index-join sketch after the structure below).
{
  "operations": {
    "recover": [
      {
        "signed_data": { // Base64URL encoded, compact JWS
          "protected": {...},
          "payload": {
            "recovery_commitment": COMMITMENT_HASH,
            "recovery_key": JWK_OBJECT,
            "delta_hash": DELTA_HASH
          },
          "signature": SIGNATURE_STRING
        }
      },
      {...}
    ],
    "deactivate": [
      {
        "signed_data": { // Base64URL encoded, compact JWS
          "protected": {...},
          "payload": {
            "did_suffix": SUFFIX_STRING,
            "recovery_key": JWK_OBJECT
          },
          "signature": SIGNATURE_STRING
        }
      },
      {...}
    ]
  }
}
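
Because the only linkage between an index entry and its proving data is array position, a node correlates them by zipping the two arrays. A minimal sketch, reusing the illustrative AnchorFileIndexEntry type from the earlier example:

interface RetainedProvingEntry {
  signed_data: string; // Base64URL-encoded compact JWS
}

// Pair each lightweight Anchor File entry with its signed_data by index.
// Failing on a length mismatch enforces the MUST-order requirement above.
function joinByIndex(
  indexEntries: AnchorFileIndexEntry[],
  provingEntries: RetainedProvingEntry[]
): Array<AnchorFileIndexEntry & RetainedProvingEntry> {
  if (indexEntries.length !== provingEntries.length) {
    throw new Error('Anchor File and Retained Proving File entry counts differ');
  }
  return indexEntries.map((entry, i) => ({ ...entry, ...provingEntries[i] }));
}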

Transient Proving File

The Transient Proving File will contain the following:

  1. The signed_data portions of the update operation entries that previously lived in the Map File are now present in the operations object under their respective properties, and MUST be ordered in the same index order as their corresponding entries in the Map File.
{
  "operations": {
    "update": [
      {
        "did_suffix": DID_SUFFIX,
        "signed_data": { // Base64URL encoded, compact JWS
          "protected": {...},
          "payload": {
            "update_key": JWK_OBJECT,
            "delta_hash": DELTA_HASH
          },
          "signature": SIGNATURE_STRING
        }   
      },
      {...}
    ]
  }
}

Operation Data Changes

  1. Commitments are now the hash of the hash of the revealed JWK value (i.e., the hash of the reveal_value), versus just the single hash, as they are currently. A sketch of this commit/reveal relationship follows.
  2. The revealed values in the Anchor and Map Files are the hash of the JWK, not the JWK itself, as is currently the case.
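
A minimal sketch of the new commit/reveal relationship, using Node's built-in crypto with SHA-256 standing in for the spec's multihash function (the actual algorithm, encoding, and JWK canonicalization details are out of scope here):

import { createHash } from 'crypto';

// SHA-256 hex as a stand-in for the spec's multihash; illustrative only.
const hash = (data: string): string =>
  createHash('sha256').update(data).digest('hex');

// The JWK would be canonicalized to a stable string before hashing.
const canonicalJwk = '{"crv":"secp256k1","kty":"EC","x":"...","y":"..."}';

const revealValue = hash(canonicalJwk); // published in Anchor/Map File entries
const commitment = hash(revealValue);   // the hash of the hash of the JWK

// Verification: a node checks the revealed value against the prior commitment.
console.assert(hash(revealValue) === commitment);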
csuwildcat added the sip label Jun 23, 2020
@OR13 (Contributor) commented Jun 23, 2020

@thehenrytsai @Therecanbeonlyone1969 any idea how this growth rate stacks up against bitcoin/ethereum growth rates?
Obviously those ledgers do things other than DIDs as well, but it would be interesting to put the "requirements" in the context of other real-world production systems.

@OR13 (Contributor) commented Jun 23, 2020

should we consider eliminating the base64url encoding at the same time to stretch the storage gain to the limit?

@troyronda (Collaborator) commented Jun 23, 2020

Suggest renaming "transient", as the eventual meaning is that it could be pruned after checkpoints rather than being transient at the current time.

@tplooker (Member) commented Jun 23, 2020

Suggested alternative syntax for the Anchor File:

{
  "map_file": CAS_URI,
  "writer_lock_id": OPTIONAL_LOCKING_VALUE,
  "operations": {
    "create": [
      {
        "file_ref": CAS_URI,
        "suffix_data": { // Base64URL encoded
          "delta_hash": DELTA_HASH,
          "recovery_commitment": COMMITMENT_HASH
        }
      },
      {...}
    ],
    "recover": [
      {
        "file_ref": CAS_URI,
        "did_suffix": SUFFIX_STRING,
        "reveal_value": MULTIHASH_OF_JWK
      },
      {...}
    ],
    "deactivate": [
      {
        "file_ref": CAS_URI,
        "did_suffix": SUFFIX_STRING,
        "reveal_value": MULTIHASH_OF_JWK
      },
      {...}
    ]
  }
}

file_ref could actually be a JSON pointer CAS_URI
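
For illustration only (this form is hypothetical, not something the comment specifies): such a file_ref could combine a CAS URI with an RFC 6901 JSON Pointer fragment addressing the entry's position inside the proving file.

// Hypothetical pointer-style reference into a proving file:
// <CAS URI>#<JSON Pointer to the matching entry>
const fileRef = 'Qm...provingFileHash#/operations/recover/3';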

Feedback:

  • Does not achieve the space-saving goals; the same CAS_URI would be repeated per operation.

@troyronda (Collaborator)

I think enabling checkpoints and pruning is important, so a structure that enables that aspect is useful.

@csuwildcat (Member, Author)

Just want to note that the current file structures already implicitly support the addition of a checkpoint/pruning mechanism. This is about reducing the minimum dataset required to run a light node by ~75% or more.

@OR13 (Contributor) commented Jun 23, 2020

I'm generally in favor of this proposal, but I'm a bit worried about how we go about implementing it.

Here is my proposal:

  1. We inventory the set of features we believe we are shipping support for in spec v1.
  2. We determine what level of testing is required to believe that a feature is supported in spec v1.
  3. We create issues to ensure those tests exist in the reference implementation.
  4. We close those issues when the tests exist.
  5. We publish spec v1 and the reference implementation, and we bump to v1.1.
  6. We open issues for the core set of features in v1.1 (probably the same as v1).
  7. We close those issues when we have tests that prove they work.
  8. We publish spec v1.1 and the reference implementation.

Vendors that don't have production customers can choose to skip spec v1 and jump to v1.1; vendors who can't "wipe their production database" can use spec v1 until spec v1.1 is ready to migrate to.

We target SIP-1 to spec v1.1.

@OR13 (Contributor) commented Jun 23, 2020

We need to be careful to have a stable, rigorous, and confidence-building release process and versioning system. I think it's dangerously confidence-destroying to rewrite versions and refuse to publish, versus choosing to publish regular versions with clear changes, tests, and documentation to support each release. (Our reference implementation does a good job of this; we need to ensure the spec does as well.)

@csuwildcat (Member, Author)

@OR13 how about we cut an official version of the spec, as it stands now, as 0.1.0, and use this change as an opportunity to do a proper minor version bump of the spec in accordance with the version descriptions in the spec?

@OR13 (Contributor) commented Jun 23, 2020

I'm fine as long as we cut a version before we attempt to implement a SIP. Ideally we try to make it as clean a version as we can, by closing out any low-hanging fruit before the cut.

@OR13 (Contributor) commented Jun 23, 2020

it can be v0.1.0 and SIP-1 can target v0.2.0 or whatever... features should be planned to target versions...

@csuwildcat (Member, Author)

Aside: are folks here OK if I do a PR to add this general SIP template as a start for that sort of thing? I was thinking of creating a SIP directory with MD files in it that would render just like our specs do.

@csuwildcat (Member, Author) commented Jun 24, 2020

@tplooker I don't think the pointer URI to a place inside the linked file is worth it if we can do the same thing via a 0-byte alternative, given it degrades the primary goal of SIP 1. However, if we changed our minds about it, we could always add it later in a way that Sidetree-based implementations could push out via a rather straightforward upgrade.

@csuwildcat (Member, Author)

@troyronda and others: if we don't want to go with Transient, what are some names for the files that will be cyclically eliminated after checkpoint pruning occurs?

@tplooker (Member)

To further optimize the above proposal, we could remove an additional base64 encoding of suffix_data if we instead relied on JCS to canonicalize the structure.
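
For context, JCS (RFC 8785) yields a deterministic byte representation of a JSON object, so hashes can be computed over the canonical bytes directly rather than over a base64url string. A minimal sketch, assuming the canonicalize npm package (an illustration, not a spec change):

import canonicalize from 'canonicalize'; // RFC 8785 JSON Canonicalization Scheme
import { createHash } from 'crypto';

const suffixData = {
  recovery_commitment: 'EiB...', // placeholder values for illustration
  delta_hash: 'EiA...',
};

// Canonicalization makes key order and whitespace deterministic, so the
// canonical string can be hashed directly with no base64url detour.
const canonical = canonicalize(suffixData)!;
const suffixDataHash = createHash('sha256').update(canonical).digest('hex');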

@OR13 (Contributor) commented Jun 30, 2020

Let's take the encoding/performance debate to #781.

Any tests/proof for the "75%" reduction claim being made here?

@csuwildcat (Member, Author)

@OR13 here's the test: the entries with proving data were 275 bytes, and the new size of the entries without proving data is 65 bytes, a reduction of ~76.4% in the minimum dataset required for a node to boot up and have a global index of all op entries.

@OR13 (Contributor) commented Jun 30, 2020

^ nice test, you must code a lot ; )

thehenrytsai added commits referencing this issue between Nov 25 and Dec 7, 2020, including:
* feat(ref-imp): #766 - added support to validate reveal value as a hash
* feat(ref-imp): #766 - added test for applying operation with different reveal value algorithm
* chore(ref-imp): hiked nodejs version support to 12 and 14
@thehenrytsai (Collaborator)

Fully implemented.

Reference implementation automation moved this from 2020 November to Done Dec 7, 2020