Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hash doc method #32

Merged
merged 13 commits into from
May 31, 2019
Merged

Hash doc method #32

merged 13 commits into from
May 31, 2019

Conversation

dhh1128
Copy link
Collaborator

@dhh1128 dhh1128 commented May 14, 2019

No description provided.

Signed-off-by: Daniel Hardman <daniel.hardman@gmail.com>
Signed-off-by: Daniel Hardman <daniel.hardman@gmail.com>
Signed-off-by: Daniel Hardman <daniel.hardman@gmail.com>
Signed-off-by: Daniel Hardman <daniel.hardman@gmail.com>
Signed-off-by: Daniel Hardman <daniel.hardman@gmail.com>
Signed-off-by: Daniel Hardman <daniel.hardman@gmail.com>
@Oskar-van-Deventer
Copy link
Contributor

Oskar-van-Deventer commented May 15, 2019

Excellent work! As mentioned, I like the general approach. Here are some minor comments.

  • "generator ... "
    Why is the separator a "-" here and not a ":"? Is there any relevant semantic difference? Am I correct in assuming that "1" is a version number?
    Also, are there already implementations that use the verkey-based Peer DID? If so, may the reuse of "1" create any compatibility issues?
  • "multihash ="
    Am I correct in assuming that forward compatibility is handled by the "1220" value, and that if ever sha256 gets depricated, the multihash part of the Peer DID specification gets updated?
  • "The genesis version must include at least one public key"
    I suggest to change this into "verification key". It is a best practice to keep the roles of verkey en enckey cleanly separated
  • "Remove the id field from the DID Doc."
    This may be ambiguous, as DID Doc is a nested document, with 'id' at several levels, also at the keys. How do we properly identify the 'main id field' that holds the DID value?
  • "lossy short form of a peer DID"
    I feel that there is an ambiguity here. The short form is useful for humans-reading-the-log. However, I don't see why the machine-readable log needs the short form. I can imagine there be two logs, or a parser that performs the shortening.
  • "convenient regex"
    Please check the regex. I see the "-" character used as literal and as for "a-f". Shouldn't there be an escape character? (Sorry, my regex knowledge is rusty ...)

@dhh1128
Copy link
Collaborator Author

dhh1128 commented May 15, 2019

"generator ... "
Why is the separator a "-" here and not a ":"? Is there any relevant semantic difference? Am I correct in assuming that "1" is a version number?

There was a proposal made in the DID spec working group that colons be used as separators for arguments. That would be incompatible with what we had before. The proposal was rejected, but it spooked me a bit that colon might end up having a messy conflict. Also, there is work afoot in the Sovrin/Indy space to support nested namespaces (network-of-networks) features, where the colon would be used as a namespace delimiter. So I thought switching to a different delimiter might be wise. I don't have any proof that the hyphen is dramatically better; it just seemed advisable.

Also, are there already implementations that use the verkey-based Peer DID? If so, may the reuse of "1" create any compatibility issues?

No implementations yet. If there were, we would of course have to keep "1" to maintain their features. For the time being, we can afford to have breaking changes.

"multihash ="
Am I correct in assuming that forward compatibility is handled by the "1220" value, and that if ever sha256 gets depricated, the multihash part of the Peer DID specification gets updated?

Yes, exactly.

"The genesis version must include at least one public key"
I suggest to change this into "verification key". It is a best practice to keep the roles of verkey en enckey cleanly separated

Good improvement. Will update.

"Remove the id field from the DID Doc."
This may be ambiguous, as DID Doc is a nested document, with 'id' at several levels, also at the keys. How do we properly identify the 'main id field' that holds the DID value?

Good point. I'll see if I can come up with clearer language.

"lossy short form of a peer DID"
I feel that there is an ambiguity here. The short form is useful for humans-reading-the-log. However, I don't see why the machine-readable log needs the short form. I can imagine there be two logs, or a parser that performs the shortening.

I guess my assumption was that logs are intended to be human-readable, not machine-readable. But I can see that this is not necessarily true (e.g., if someone is using Splunk). Do you have any suggestions about how to improve the wording in this section?

"convenient regex"
Please check the regex. I see the "-" character used as literal and as for "a-f". Shouldn't there be an escape character? (Sorry, my regex knowledge is rusty ...)

Several regex characters have special meaning inside the square brackets of character group. The dot (.) and the question mark can appear without escaping. The hyphen means a literal hyphen if it is the first character, but a range if it is between two other characters. It doesn't need to be escaped.

Signed-off-by: Daniel Hardman <daniel.hardman@gmail.com>
@Oskar-van-Deventer
Copy link
Contributor

Oskar-van-Deventer commented May 16, 2019

Hi Daniel,

Your response answers all my question. Thanks!

Do you have any suggestions about how to improve the wording in this section?

I am not sure whether we would need to specify the lossy short form at all. If we do, we could use text like this.
"If a log is not meant to be machine readable, but only human readable, then a lossy short form of peer DID can be used. In this case, the full DID is replaced by ... (further details ...)"

Signed-off-by: Daniel Hardman <daniel.hardman@gmail.com>
Signed-off-by: Daniel Hardman <daniel.hardman@gmail.com>
Copy link

@coder5876 coder5876 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good in general! I added some comments. Is the wording about truncating to 16 bytes incorrect since it doesn't match with the DID examples further down?

index.html Outdated
</ul>
<p>By basing the numeric value of the DID on the genesis version of the DID Doc, the DID can begin its
lifecycle with any number of keys and endpoints, and when the doc is signed or auth-encrypted by one of
the keys, the recipient can know it has not been modified since creation. The guarantees the initial

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The guarantees -> This guarantees

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in next commit

index.html Outdated
SecP256, Curve448, or similar. Such updates might change the length of <code>idstring</code> as well.</p>
Peer DIDs use an underlying number with high entropy, called the <em>numeric basis</em>, as the
source of their uniqueness. This version of
the spec only defines one scheme for generating the numeric basis (<code>generator</code>

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of explicitly saying that the generator field specifies how the numeric basis is generated, perhaps this field could be a simple version field to make it more flexible how different versions are defined?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. I'll update.

<p>Peer DIDs must be compared according to whatever case-sensitivity is implied by the
hash encoding they use. Hex is not case-sensitive, but if a future version of the spec
uses base58 or base64 encoding, values will be case sensitive. Therefore, it is strongly
recommended that routines that handle peer DIDs not normalize case unless they take into

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good recommendation.

index.html Outdated
<p>As with git commit hashes, it is likely that a small subset of the bits in the <em>numeric basis</em>
of a peer DID will be sufficiently unique for some contexts, and will be more convenient for human
interaction. Therefore, a lossy <em>short form</em> of a peer DID is recognized if the hash encoding
is hex. This form must consist of at least 4 bytes (8 hex characters) after the multibase and

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little nervous about recommending 4 byte short forms since that would be relatively easy to brute force and might cause trouble if systems have implicit assumptions about these being unique.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These short forms are not intended to carry a strong uniqueness guarantee--only to be "unique enough" that they could be used in contexts where a human is involved (e.g., when inspecting a log or checking a DID value on a screen). They have the same uniqueness property as short git commit hashes. Git has chosen 7 hex digits as a "good enough" uniqueness, and I was trying to do something similar.

I introduced this concept because I wanted the simplicity of hex instead of base58 or base64 or something--but I didn't want the length to be a problem for humans, and I was abandoning the idea of shortening the SHA256 hash to just the top half of its value, so the long form of the DID would be quite verbose. If we don't like this short form, then I'd vote to go back to base58.

Does that reasoning change your opinion in any way?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@christianlundkvist I'm thinking through your comment about brute forcing. In a log file, I don't think brute forcing introduces much risk, but if we're talking about reading off the screen, maybe. I'm pondering...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we said the short form of the DID was the first 4 hex digits of the longer hex value, followed by "." followed by 4 hex digits of CRC of the value as a whole? This would require brute forcing to find a different doc that had the same sha256 value (extremely unlikely), or a different doc that had a sha256 value with the same 4 initial hex digits, plus the same CRC (fairly unlikely, but not as unlikely as a sha256 collision). Would this allow the short form while preventing the attack?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never mind. That only gives 1-in-4-billion security. Still easy to brute force.

index.html Outdated
<li>Remove the main <code>id</code> field (the one that would normally contain the DID value) from the DID Doc.
This creates a variant of the peer DID doc known as the <em>stored variant</em>, as opposed to the
<em>resolved variant</em> that would have an actual DID value in the <code>id</code> field.</li>
<li>Calculate the SHA256 hash of the bytes of the stored variant of the DID Doc, and select the first 16

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So are we truncating the SHA256 hash as the DID? In that case the multihash designation will not technically be correct since the hash length and algorithm does not really match. Further down the example DID is 32 bytes.

Copy link
Collaborator Author

@dhh1128 dhh1128 May 23, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops. I meant to remove all mention of truncation. I thought I had caught them all, but I guess this one slipped past me. No, we shouldn't be truncating anymore. I'll fix this.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok sounds good.

</p>
<p>A valid <code>peer</code> DID might be: <code>did:peer:1:YTEFYzByfU2RwJPyULfLLn</code>.</p>
<p>A valid <code>peer</code> DID might be: <code>did:peer:1-F1220479cbc07c3f991725836a3aa2a581ca2029198aa420b9d99bc0e131d9f3e2cbe</code>.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This DID is a 32 byte hex string but above it is suggested that it be truncated to 16 bytes.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this one is right and the other is a mistake.

Signed-off-by: Daniel Hardman <daniel.hardman@gmail.com>
index.html Outdated
<p>Because peer DIDs are guaranteed to be globally unique at the moment of creation,
their 128-bit NSI will not exist on any other blockchain. Blockchain-based DID methods
<p>Because peer DIDs are globally unique at the moment of creation,
their <em>numeric basis</em> will not exist on any other blockchain unless until someone

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"unless until" - I think you mean "until" ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@mavarley
Copy link

+1 to this change being accepted; I think it gives flexibility to peer DIDs to be less transactional and more persistent (if required) - it lets them evolve with the relationship in which they have context.

Signed-off-by: Daniel Hardman <daniel.hardman@gmail.com>
@tplooker
Copy link
Contributor

+1 to this change also, I think the added flexibility is a great property.

Copy link

@llorllale llorllale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for this change for adding flexibility: decouples key rotation from the identifier, enabling continued use throughout the lifetime of a trusted context (if desired).

Copy link
Contributor

@kdenhartog kdenhartog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to the changes suggested here. I'm not sure the value of truncating is necessary to specify within the system. I think it's out of scope of the concern of the spec and is more of an implementation detail. For example, a full hash isn't that big of a deal when being copied from a log to a search field. If others would prefer to keep it, I'm not opposed to it though.

@coder5876
Copy link

I think it's a good idea by @kdenhartog to avoid specifying potential shortenings of the hash here. This seems more of a UX issue when presenting a DID to a user (which can hopefully be avoided in the long term).

Signed-off-by: Daniel Hardman <daniel.hardman@gmail.com>
Signed-off-by: Daniel Hardman <daniel.hardman@gmail.com>
@dhh1128
Copy link
Collaborator Author

dhh1128 commented May 31, 2019

I have removed the discussion of the short form for now. If we can think of a way to do it that is not susceptible to brute force attacks, I may add it back in, because I think brevity is a desirable option. I just got done confirming a transaction with an Ethereum multisig wallet, and although I copied and pasted the long hashes rather than retyping them, even checking the values on the screen was a bit difficult. I do agree that this is more of a UX issue than an underlying spec issue, though; maybe the UX problems should remain outside the doc. Anyway, it's gone for now.

Merging the PR as we've had substantial positive comments from collaborators and no dissenting votes.

@dhh1128 dhh1128 merged commit 52a492b into openssi:master May 31, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants