New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure Spec mitigates Double Spending by Colliding InternalH #738

Closed
defuse opened this Issue Feb 23, 2016 · 13 comments

Comments

@defuse
Contributor

defuse commented Feb 23, 2016

I think I found an attack that would let an attacker create as much money as they want for themselves at the cost of finding 128-bit hash collisions.

Background: A coin is a tuple (apk, v, ρ, r). The coin commitment CoinCommitment((apk, v, ρ, r)) is computed as:

InternalH = Leading128(CRH(apk || ρ))
k = CRH(r || InternalH)
cm = CRH(v || k)

The truncation of InternalH is done to make the commitment statistically-hiding, i.e no matter how much computational resources I have, I can't find out what v or apk are.

Attack: Because InternalH is only 128 bits, in 2^64 operations I can find some ρ != ρ' such that:

Leading128(CRH(apk || ρ)) = Leading128(CRH(apk || ρ'))

And therefore:

CoinCommitment((apk, v, ρ, r)) = CoinCommitment((apk, v, ρ', r))

To carry out a double-spend attack I first acquire some real Zcash (e.g. by buying it with USD), and then double spend it to myself as follows. Ignoring the faerie gold fix, I create a new coin for myself, but as I do so I make sure to find a collision (apk, v, ρ, r) and (apk, v, ρ', r) where ρ != ρ'. By completeness of the protocol, I'll be able to spend the (apk, v, ρ, r) coin, and so that's what I do. This will reveal the serial number PRFsnask(ρ). After that first spend has made it on to the ledger, as far as I can see, nothing prevents me from spending that coin again, using ρ', because:

  • I can still give a path from the root to CoinCommitment((apk, v, ρ', r)), namely the same path as I gave in the first spend (and because of the zero-knowledge property nobody will know the path is the same).
  • With high probability PRFsnask(ρ') is not equal to PRFsnask(ρ), and I can obviously still prove I computed PRFsnask(ρ') correctly inside the pour.
  • I can still satisfy spend authority (I know ask for the apk I used).
  • I have to use the same r (old) for both spends (otherwise the commitments won't collide), but again because of the zero-knowledge property I don't think anyone (except for me, the holder of ask) can tell it's been reused.

After the faerie gold fix, I'm forced to find colliding ρ by altering hsig, and I'm forced to commit to either ρ or ρ' because hsig gets published. This doesn't prevent the attack, since when I go to spend for the second time it is not going to be noticed that ρ' doesn't match the old hsig, at least not according to the current protocol.

Fix: Maybe CoinCommitment needs to be collision-resistant? Or maybe we should re-use hsig as a commitment to ρ and check it when I try to spend with ρ?

@defuse

This comment has been minimized.

Show comment
Hide comment
@defuse

defuse Feb 23, 2016

Contributor

In other words, CoinCommitment() isn't computationally binding, and the above attack is a consequence of that.

Contributor

defuse commented Feb 23, 2016

In other words, CoinCommitment() isn't computationally binding, and the above attack is a consequence of that.

@daira

This comment has been minimized.

Show comment
Hide comment
@daira

daira Feb 23, 2016

Contributor

Yes, this appears to work. Excellent catch!

After the faerie gold fix, I'm forced to find colliding ρ by altering hSig

Or φ.

Contributor

daira commented Feb 23, 2016

Yes, this appears to work. Excellent catch!

After the faerie gold fix, I'm forced to find colliding ρ by altering hSig

Or φ.

@ebfull

This comment has been minimized.

Show comment
Hide comment
@ebfull

ebfull Feb 23, 2016

Contributor

Awesome find @defuse!

I was also suspicious of the security of that function but didn't realize there was an attack at the time.

Contributor

ebfull commented Feb 23, 2016

Awesome find @defuse!

I was also suspicious of the security of that function but didn't realize there was an attack at the time.

@daira

This comment has been minimized.

Show comment
Hide comment
@daira

daira Feb 23, 2016

Contributor

In the proof of Balance in section D.3, the attack violates either Condition I, if the double-spend is within a single Pour (i.e. the doubled coin is spent in both inputs), or Condition II, if the double-spend occurs across Pours.

For Condition I, the failure is here:

(i) If cmold1 = cmold2, then the fact that snold1 = snold implies that the witness a contains two distinct openings of cmold1 (the first opening contains (aoldsk,1, ρold1), while the second opening contains (aoldsk,2, ρold2)). This violates the binding property of the commitment scheme COMM.

For Condition II, the failure is here:

However (as argued already above), if both transactions spend cm but produce different serial numbers, then the corresponding witnesses a, a contain different openings of cm. This contradicts the binding property of the commitment scheme COMM.

(Note that the proofs should really be distinguishing between COMMr and COMMs since they are different constructions, but we'll let that pass for now.)

Let's look at whether the above arguments actually follow from the definition of binding for a commitment scheme (spoiler: they do). Note that the Zerocash paper doesn't define binding at all; it only says:

Statistically-hiding commitments. We use a commitment scheme COMM where the binding property holds computationally, while the hiding property holds statistically. It is denoted {COMMx : {0, 1} → {0, 1}O(λ)}x where x denotes the commitment trapdoor. Namely, to reveal a commitment cm to a value z, it suffices to provide z and the trapdoor x; then one can check that cm = COMMx(z).

Finding a precisely-stated definition of computational binding security was suprisingly time-consuming. Wikipedia gives a silly definition that I won't confuse you with. (Exercise: beside being asymptotic, what else is wrong with it?) Definition 24.1 in http://www.zurich.ibm.com/~cca/sft13/smart08chaps2426.pdf will suffice:

A commitment scheme is said to be [computationally] binding if no [computationally bounded] adversary can win the following game.
• The adversary outputs a value c, plus values x and r which produce this commitment.
• The adversary must then output a value x′ ≠ x and a value r′ such that C(x, r) = C(x′, r′).

(r is the randomness, x is the committed value.) This indeed would be sufficient for the arguments in the security proof. COMMr would have this binding property if Leading128 ∘ CRH were collision-resistant, but obviously, that doesn't hold at a 128-bit security level. (Actually, we need that the composition of COMMr and COMMs is binding on all of the inputs, i.e. apk, ρ, and v, but the problem in this attack is with COMMr.)

Note that collision resistance of Leading128 ∘ CRH was not explicitly assumed in the construction of COMM in section 5.1. (Even if we didn't need that, we would still need Leading253 ∘ CRH to be collision-resistant, which is also not explicitly assumed, because of the truncation of hSig and ρ.)

Contributor

daira commented Feb 23, 2016

In the proof of Balance in section D.3, the attack violates either Condition I, if the double-spend is within a single Pour (i.e. the doubled coin is spent in both inputs), or Condition II, if the double-spend occurs across Pours.

For Condition I, the failure is here:

(i) If cmold1 = cmold2, then the fact that snold1 = snold implies that the witness a contains two distinct openings of cmold1 (the first opening contains (aoldsk,1, ρold1), while the second opening contains (aoldsk,2, ρold2)). This violates the binding property of the commitment scheme COMM.

For Condition II, the failure is here:

However (as argued already above), if both transactions spend cm but produce different serial numbers, then the corresponding witnesses a, a contain different openings of cm. This contradicts the binding property of the commitment scheme COMM.

(Note that the proofs should really be distinguishing between COMMr and COMMs since they are different constructions, but we'll let that pass for now.)

Let's look at whether the above arguments actually follow from the definition of binding for a commitment scheme (spoiler: they do). Note that the Zerocash paper doesn't define binding at all; it only says:

Statistically-hiding commitments. We use a commitment scheme COMM where the binding property holds computationally, while the hiding property holds statistically. It is denoted {COMMx : {0, 1} → {0, 1}O(λ)}x where x denotes the commitment trapdoor. Namely, to reveal a commitment cm to a value z, it suffices to provide z and the trapdoor x; then one can check that cm = COMMx(z).

Finding a precisely-stated definition of computational binding security was suprisingly time-consuming. Wikipedia gives a silly definition that I won't confuse you with. (Exercise: beside being asymptotic, what else is wrong with it?) Definition 24.1 in http://www.zurich.ibm.com/~cca/sft13/smart08chaps2426.pdf will suffice:

A commitment scheme is said to be [computationally] binding if no [computationally bounded] adversary can win the following game.
• The adversary outputs a value c, plus values x and r which produce this commitment.
• The adversary must then output a value x′ ≠ x and a value r′ such that C(x, r) = C(x′, r′).

(r is the randomness, x is the committed value.) This indeed would be sufficient for the arguments in the security proof. COMMr would have this binding property if Leading128 ∘ CRH were collision-resistant, but obviously, that doesn't hold at a 128-bit security level. (Actually, we need that the composition of COMMr and COMMs is binding on all of the inputs, i.e. apk, ρ, and v, but the problem in this attack is with COMMr.)

Note that collision resistance of Leading128 ∘ CRH was not explicitly assumed in the construction of COMM in section 5.1. (Even if we didn't need that, we would still need Leading253 ∘ CRH to be collision-resistant, which is also not explicitly assumed, because of the truncation of hSig and ρ.)

@daira

This comment has been minimized.

Show comment
Hide comment
@daira

daira Feb 23, 2016

Contributor

So let's see: we need to commit to apk, ρ, and v, which are a total of 256+256+64 = 576 bits. It's questionable whether we actually need the statistical hiding property, or whether we can get it (note that the scheme must also be noninteractive, which rules out some constructions in the literature) at reasonable circuit cost. Note that we don't have "everlasting anonymity" anyway because of the encryption. If we only need computational hiding and computational binding, then we could just do:

InternalH = CRH(apk || ρ)
cm = CRH(v || r || InternalH) where r is 192 bits

Contributor

daira commented Feb 23, 2016

So let's see: we need to commit to apk, ρ, and v, which are a total of 256+256+64 = 576 bits. It's questionable whether we actually need the statistical hiding property, or whether we can get it (note that the scheme must also be noninteractive, which rules out some constructions in the literature) at reasonable circuit cost. Note that we don't have "everlasting anonymity" anyway because of the encryption. If we only need computational hiding and computational binding, then we could just do:

InternalH = CRH(apk || ρ)
cm = CRH(v || r || InternalH) where r is 192 bits

@daira

This comment has been minimized.

Show comment
Hide comment
@daira

daira Feb 23, 2016

Contributor

Another option is to use the full SHA-256 hash:

cm = SHA-256(apk || v || ρ || r)

(Length extension attacks aren't a problem here.) This also requires two compression function evaluations but leaves more scope for possible future changes, because we're effectively saving 256 bits from the input to the second compression function (minus the minimum one byte of padding) by putting it in the chaining variable.

Edit: swap order of v and ρ to be the same order as in the definition of a coin and as encoded in a coin plaintext.

Contributor

daira commented Feb 23, 2016

Another option is to use the full SHA-256 hash:

cm = SHA-256(apk || v || ρ || r)

(Length extension attacks aren't a problem here.) This also requires two compression function evaluations but leaves more scope for possible future changes, because we're effectively saving 256 bits from the input to the second compression function (minus the minimum one byte of padding) by putting it in the chaining variable.

Edit: swap order of v and ρ to be the same order as in the definition of a coin and as encoded in a coin plaintext.

@defuse

This comment has been minimized.

Show comment
Hide comment
@defuse

defuse Feb 23, 2016

Contributor

Using two hashes or full SHA256 would be nicer because we could have better domain separation between things, and not have to worry as much about things like this: #500 (comment)

Contributor

defuse commented Feb 23, 2016

Using two hashes or full SHA256 would be nicer because we could have better domain separation between things, and not have to worry as much about things like this: #500 (comment)

@daira

This comment has been minimized.

Show comment
Hide comment
@daira

daira Feb 24, 2016

Contributor

I'm inclined to use the full hash for this. It isn't very much more complicated or expensive to implement the full hash in the circuit.

That leaves the question of whether we're also going to change the PRFs to use the full hash. Either way, there's still a domain separation issue: if we don't change the PRFs, then the IV will be the same for a use of the bare compression function and a use of the full hash (i.e. the intermediate chaining variable between the two blocks of the full hash will be the same as the output of some use of the bare compression function). This is fixable if we put the domain separation tag at the beginning.

Changing the PRFs to use the full hash means that we only have 504 bits of input available rather than 512, due to the padding. This is not necessarily a problem; we could truncate ask and φ to 244 bits (4 bits of domain separation, 8 bits padding) without significant loss of security, since those values do not need collision resistance.

[Edit: this was wrong, I'd forgotten the 64-bit length field. So we have 440 bits of input available, and would have had to truncate ask and φ by a further 64 bits to 180 bits, which would have been dubious.]

If the Merkle tree sticks with the SHA-256 compression function, then there seems little motivation to change the PRF to use the full hash (we would have to do the analysis assuming that the compression function is used in multiple places, anyway). If we change the Merkle tree to something else then I'd like to get rid of the only remaining use of the bare compression function for the PRF.

Contributor

daira commented Feb 24, 2016

I'm inclined to use the full hash for this. It isn't very much more complicated or expensive to implement the full hash in the circuit.

That leaves the question of whether we're also going to change the PRFs to use the full hash. Either way, there's still a domain separation issue: if we don't change the PRFs, then the IV will be the same for a use of the bare compression function and a use of the full hash (i.e. the intermediate chaining variable between the two blocks of the full hash will be the same as the output of some use of the bare compression function). This is fixable if we put the domain separation tag at the beginning.

Changing the PRFs to use the full hash means that we only have 504 bits of input available rather than 512, due to the padding. This is not necessarily a problem; we could truncate ask and φ to 244 bits (4 bits of domain separation, 8 bits padding) without significant loss of security, since those values do not need collision resistance.

[Edit: this was wrong, I'd forgotten the 64-bit length field. So we have 440 bits of input available, and would have had to truncate ask and φ by a further 64 bits to 180 bits, which would have been dubious.]

If the Merkle tree sticks with the SHA-256 compression function, then there seems little motivation to change the PRF to use the full hash (we would have to do the analysis assuming that the compression function is used in multiple places, anyway). If we change the Merkle tree to something else then I'd like to get rid of the only remaining use of the bare compression function for the PRF.

@daira

This comment has been minimized.

Show comment
Hide comment
@daira

daira Feb 24, 2016

Contributor

(Strictly speaking, full SHA-256 takes a bit string input and so we could hash 511 bits in a single compression function evaluation. Maybe we should do that; most SHA-256 implementations do not support bit string inputs but that's not really a problem.)

[Edit: 447 bits, not 511 bits.]

Contributor

daira commented Feb 24, 2016

(Strictly speaking, full SHA-256 takes a bit string input and so we could hash 511 bits in a single compression function evaluation. Maybe we should do that; most SHA-256 implementations do not support bit string inputs but that's not really a problem.)

[Edit: 447 bits, not 511 bits.]

@paragonie-scott

This comment has been minimized.

Show comment
Hide comment
@paragonie-scott

paragonie-scott Mar 8, 2016

Contributor

Instead of SHA-256, could SHA-512/256 be used? :)

Contributor

paragonie-scott commented Mar 8, 2016

Instead of SHA-256, could SHA-512/256 be used? :)

@daira

This comment has been minimized.

Show comment
Hide comment
@daira

daira Mar 12, 2016

Contributor

@paragonie-scott Thanks for the suggestion, but we don't have a zk-SNARK circuit implementation of SHA-512, and I suspect it has similar circuit complexity to SHA-256 with two blocks of input, in any case.

Contributor

daira commented Mar 12, 2016

@paragonie-scott Thanks for the suggestion, but we don't have a zk-SNARK circuit implementation of SHA-512, and I suspect it has similar circuit complexity to SHA-256 with two blocks of input, in any case.

@daira

This comment has been minimized.

Show comment
Hide comment
@daira

daira Mar 15, 2016

Contributor

This is fixed in the current spec (not in the current implementation).

Contributor

daira commented Mar 15, 2016

This is fixed in the current spec (not in the current implementation).

@nathan-at-least nathan-at-least changed the title from Double Spending by Colliding InternalH to Ensure Spec mitigates Double Spending by Colliding InternalH Mar 29, 2016

@daira daira closed this Apr 1, 2016

@daira

This comment has been minimized.

Show comment
Hide comment
@daira

daira Apr 6, 2016

Contributor

There's a variation of the attack that I overlooked, where you collide InternalH by varying apk, without necessarily changing ρ. (It's sufficient to know both corresponding ask; the serial numbers / nullifiers will be different because ask is the key input to PRFsn/nf.) Our new note commitment scheme blocks that variation as well.

Contributor

daira commented Apr 6, 2016

There's a variation of the attack that I overlooked, where you collide InternalH by varying apk, without necessarily changing ρ. (It's sufficient to know both corresponding ask; the serial numbers / nullifiers will be different because ask is the key input to PRFsn/nf.) Our new note commitment scheme blocks that variation as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment