shachain: clarify design in terms of binary tree, reverse indexes.
Olaoluwa Osuntokun came up with an alternative which used binary trees;
that's a much better way to explain it, so do that in design.txt and
update the implementation to work the same way.

Anthony Towns pointed out that the numbering is the reverse of the normal
hash chaining descriptions, so fix that too.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
rustyrussell committed Mar 8, 2016
1 parent bb88463 commit d23fb57
Showing 7 changed files with 248 additions and 169 deletions.
184 changes: 96 additions & 88 deletions ccan/crypto/shachain/design.txt
@@ -16,99 +16,107 @@ A simple system is a hash chain: we select a random seed value, the
hash it 1,000,000 times. This gives the first "random" number.
Hashed 999,999 times gives the second number, etc. ie:

R(1,000,000) = seed
R(N-1) = SHA256(R(N))
R(0) = seed
R(N+1) = SHA256(R(N))

This way the remote node needs only to remember the last R(N) it was
given, and it can calculate any R for N-1 or below.
given, and it can calculate any R for N+1 or above.

However, this means we need to generate 1 million hashes up front, and
then do almost as many hashes to derive the next number. That's slow.
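
As a rough illustration of the hash chain, here is a minimal Python sketch using the R(0) = seed, R(N+1) = SHA256(R(N)) numbering. The helper name chain_value and the all-zero seed are assumptions made for the example; they are not part of ccan.

```python
import hashlib

def chain_value(start: bytes, steps: int) -> bytes:
    """Apply SHA256 `steps` times: chain_value(R(N), k) gives R(N+k)."""
    value = start
    for _ in range(steps):
        value = hashlib.sha256(value).digest()
    return value

seed = b"\x00" * 32            # illustrative seed only; use a random one in practice
r5 = chain_value(seed, 5)      # the sender derives R(5) from the seed

# A receiver holding only R(5) can reach R(8) with three more hashes,
# but cannot go back towards R(4) or the seed.
r8_from_r5 = chain_value(r5, 3)
print(r8_from_r5 == chain_value(seed, 8))
```

The cost is visible in the loop: reaching a value far from the one you hold takes one hash per step, which is what makes a chain of a million values slow.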

A More Complex Solution
-----------------------

Instead of a one-dimensional chain, we can use two dimensions: 1000
chains of 1000 values each. Indeed, we can generate the "top" of
each chain much like we generated a single chain:

        Chain 1000      Chain 999        Chain 998       ........  Chain 1
        seed            SHA256(C1000)    SHA256(C999)    ........  SHA256(C2)

Now, deriving chain 1000 from seed doesn't quite work, because it'll
look like this chain, so we flip the lower bit to generate the chain:

        Chain 1000      Chain 999        Chain 998       ........  Chain 1
 1000   seed^1          SHA256(C1000)^1  SHA256(C999)^1  ........  SHA256(C2)^1
  999   SHA256(above)   SHA256(above)    SHA256(above)   ........  SHA256(above)
  998   SHA256(above)   SHA256(above)    SHA256(above)   ........  SHA256(above)
  ...

Now, we can get the first value to give out (chain 1, position 1) with
999 hashes to get to chain 1, and 999 hashes to get to the end of the
chain. 2000 hashes is much better than the 999,999 hashes it would
have taken previously.

Why Stop at 2 Dimensions?
-------------------------

Indeed, the implementation uses 64 dimensions rather than 2, and a
chain length of 2 rather than 1000, giving a worst-case of 63 hashes
to derive any of 2^64 values. Each dimension flips a different bit of
the hash, to ensure the chains are distinct.

For simplicity, I'll explain what this looks like using 8 dimensions,
ie. 8 bits. The seed value always sits at the maximum possible index, in
this case index 0xFF (b11111111).

To generate the hash for 0xFE (b11111110), we need to move down
dimension 0, so we flip bit 0 of the seed value, then hash it. To
generate the hash for 0xFD (b11111101) we need to move down dimension
1, so we flip bit 1 of the seed value, then hash it.

To reach 0xFC (b11111100) we need to move down dimension 1 then
dimension 0, in that order.

Spotting the pattern, it becomes easy to derive how to reach any value:

    hash = seed
    for bit in 7 6 5 4 3 2 1 0:
        if bit not set in index:
            flip(bit) in hash
            hash = SHA256(hash)

Handling Partial Knowledge
--------------------------

How does the remote node, which doesn't know the seed value, derive
subvalues?

Once it knows the value for index 1, it can derive the value for index
0 by flipping bit 0 of the value and hashing it. In effect, it can
always derive a value for any index where it only needs to clear bits.

So, index 1 gives index 0, but index 2 doesn't yield index 1. When
index 3 comes along, it yields 2, 1, and 0.

How many hash values will we have to remember at once? The answer is
equal to the number of dimensions. It turns out that the worst case
for 8 dimensions is 254 (0b11111110), for which we will have to
remember the following indices:

127 0b01111111
191 0b10111111
223 0b11011111
239 0b11101111
247 0b11110111
251 0b11111011
253 0b11111101
254 0b11111110

127 lets us derive any hash value for index <= 127. Similarly, 191
lets us derive anything > 127 but <= 191. 254 lets us derive only
itself.

When we get index 255 this collapses, and we only need to remember
that one index to derive everything.
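
The list of indices above can be checked with a short Python sketch. It assumes the "only clear bits" rule, written as the `(~from & to) == 0` test; the names can_derive_old and must_remember are made up for this example.

```python
def can_derive_old(from_index: int, to_index: int) -> bool:
    """We may only clear bits, so to_index must have no bit that from_index lacks."""
    return (~from_index & to_index) == 0

def must_remember(received):
    """Received indices that cannot be derived from any other received index."""
    received = list(received)
    return [i for i in received
            if not any(j != i and can_derive_old(j, i) for j in received)]

worst = must_remember(range(0, 255))    # indices 0..254 seen, 255 not yet
print(worst)                            # [127, 191, 223, 239, 247, 251, 253, 254]

collapsed = must_remember(range(0, 256))
print(collapsed)                        # [255]
```

The surviving indices are exactly those with a single zero bit, since the only index that could replace them is 255.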
A Tree Solution
---------------

A better solution is to use a binary tree, with the seed at the root.
The left child is the same as the parent, the right child is the
SHA256() of the parent with one bit flipped (corresponding to the
height).

This gives a tree like so:

                          seed
                        /      \
                       /        \
                      /          \
                     /            \
                  seed             SHA256(seed^1)
                 /    \               /      \
             seed   SHA256(seed^2)  SHA256(seed^1)  SHA256(SHA256(seed^1)^2)
Index:        0           1               2                    3

Clearly, giving R(2) allows you to derive R(3), giving R(1) allows you
to derive nothing new (you still have to remember R(2)), and giving
R(0) allows you to derive everything.

In pseudocode, this looks like the following for a 64 bit tree:

    generate_from_seed(index):
        value = seed
        # Walk down from the root, so the highest bit is handled first.
        for bit in 63 down to 0:
            if bit set in index:
                flip(bit) in value
                value = SHA256(value)
        return value
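
A runnable Python sketch of the same idea follows. Two things are assumptions here: the bits are walked from the most significant down, as the loop in the C derive() does, and flip_bit mirrors change_bit() in shachain.c. The example seed is arbitrary.

```python
import hashlib

def flip_bit(value: bytes, bit: int) -> bytes:
    """Flip one bit of a byte string (bit 0 is the lowest bit of byte 0)."""
    out = bytearray(value)
    out[bit // 8] ^= 1 << (bit % 8)
    return bytes(out)

def generate_from_seed(seed: bytes, index: int, bits: int = 64) -> bytes:
    value = seed
    for bit in reversed(range(bits)):       # root of the tree first
        if (index >> bit) & 1:              # right child: flip, then hash
            value = hashlib.sha256(flip_bit(value, bit)).digest()
    return value                            # left children leave value unchanged

seed = bytes(range(32))                     # illustrative seed only
r2 = generate_from_seed(seed, 2)
r3 = generate_from_seed(seed, 3)

# R(3) is one flip-and-hash step below R(2), so holding R(2) is enough.
print(r3 == hashlib.sha256(flip_bit(r2, 0)).digest())
```

Index 0 takes the left branch at every level, so generate_from_seed(seed, 0) is the seed itself.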


The Receiver's Tree
-------------------

To derive the value for an index N, you need to have the root of a tree
which contains it. That is the same as needing an index I which is N
rounded down in binary (ie. N with one or more of its lowest set bits
cleared): eg. if N is 0b001100001, you need 0b001100000, 0b001000000
or 0b000000000.
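
The "rounded down" indices can be listed by clearing the lowest set bit repeatedly. This is a small Python sketch; the function name derivable_from is an assumption for the example.

```python
def derivable_from(n: int) -> list:
    """Indices (other than n itself) whose value is enough to derive the value for n."""
    roots = []
    while n:
        n &= n - 1          # clear the lowest set bit: one step up the tree
        roots.append(n)
    return roots

roots = derivable_from(0b001100001)
print([format(i, "#011b") for i in roots])
```

Each step moves to the root of the next larger subtree, ending at index 0, which is the seed.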

Pseudocode:

    # Can we derive the value for to_index from from_index?
    can_derive(from_index, to_index):
        # to_index must be a subtree under from_index; this is the same as
        # saying that to_index must be the same as from_index up to the
        # trailing zeros in from_index.
        for bit in count_trailing_zeroes(from_index)..63:
            if bit set in from_index != bit set in to_index:
                return false
        return true

    # Derive a value from a lesser index: generalization of generate_from_seed()
    derive(from_index, to_index, from_value):
        assert(can_derive(from_index, to_index))
        value = from_value
        # Highest bit first, matching the walk down the tree.
        for bit in 63 down to 0:
            if bit set in to_index and not bit set in from_index:
                flip bit in value
                value = SHA256(value)
        return value
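
Here is a Python sketch of can_derive() and derive(), written to follow the mask test and the top-bit-first loop used in shachain.c. Treat it as an illustration under those assumptions rather than the reference code; the seed and the sample indices are arbitrary.

```python
import hashlib

BITS = 64

def flip_bit(value: bytes, bit: int) -> bytes:
    out = bytearray(value)
    out[bit // 8] ^= 1 << (bit % 8)
    return bytes(out)

def count_trailing_zeroes(index: int) -> int:
    if index == 0:
        return BITS
    return (index & -index).bit_length() - 1

def can_derive(from_index: int, to_index: int) -> bool:
    # Every bit at or above from_index's lowest set bit must match.
    mask = ~((1 << count_trailing_zeroes(from_index)) - 1)
    return ((from_index ^ to_index) & mask) == 0

def derive(from_index: int, to_index: int, from_value: bytes) -> bytes:
    assert can_derive(from_index, to_index)
    value = from_value
    branches = from_index ^ to_index        # bits set in to_index but not from_index
    for bit in reversed(range(BITS)):
        if (branches >> bit) & 1:
            value = hashlib.sha256(flip_bit(value, bit)).digest()
    return value

seed = bytes(range(32))                     # illustrative seed only
via_subtree = derive(0x1000, 0x1abc, derive(0, 0x1000, seed))
direct = derive(0, 0x1abc, seed)
print(via_subtree == direct)
```

Deriving through an intermediate subtree root gives the same value as deriving straight from the seed, which is what lets a receiver keep only subtree roots.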

If you are receiving values (in reverse order), you need to remember
up to 64 of them to derive all previous values. The simplest method
is to keep an array, indexed by the number of trailing zeroes in the
received index:

    # Receive a new value (assumes we receive them in order)
    receive_value(index, value):
        pos = count_trailing_zeroes(index)
        # We should be able to generate every lesser value, otherwise invalid
        for i in 0..pos-1:
            if derive(index, known[i].index, value) != known[i].value:
                return false
        known[pos].index = index
        known[pos].value = value
        return true

To derive a previous value, find an element in that array from which
you can derive the value you want, eg:

    # Find an old value
    regenerate_value(index):
        for i in known:
            if can_derive(i.index, index):
                return derive(i.index, index, i.value)
        fail
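
The receiver side can be sketched in Python as below. This is a toy under stated assumptions: an 8-bit tree instead of 64 so the whole index range can be walked quickly, a hypothetical Receiver class, and helpers that follow the can_derive/derive logic described above.

```python
import hashlib

BITS = 8    # small tree for the demo; the text uses 64

def flip_bit(value, bit):
    out = bytearray(value)
    out[bit // 8] ^= 1 << (bit % 8)
    return bytes(out)

def count_trailing_zeroes(index):
    return BITS if index == 0 else (index & -index).bit_length() - 1

def can_derive(from_index, to_index):
    mask = ~((1 << count_trailing_zeroes(from_index)) - 1)
    return ((from_index ^ to_index) & mask) == 0

def derive(from_index, to_index, from_value):
    assert can_derive(from_index, to_index)
    value = from_value
    for bit in reversed(range(BITS)):
        if ((from_index ^ to_index) >> bit) & 1:
            value = hashlib.sha256(flip_bit(value, bit)).digest()
    return value

class Receiver:
    def __init__(self):
        # One slot per possible trailing-zero count (0..BITS).
        self.known = [None] * (BITS + 1)

    def receive_value(self, index, value):
        pos = count_trailing_zeroes(index)
        for entry in self.known[:pos]:
            if entry is None:
                continue
            old_index, old_value = entry
            # Everything stored in a lower slot must follow from the new value.
            if not can_derive(index, old_index):
                return False
            if derive(index, old_index, value) != old_value:
                return False
        self.known[pos] = (index, value)
        return True

    def regenerate_value(self, index):
        for entry in self.known:
            if entry is not None and can_derive(entry[0], index):
                return derive(entry[0], index, entry[1])
        raise KeyError(index)

seed = bytes(range(32))                     # illustrative seed only
top = (1 << BITS) - 1
rx = Receiver()
for index in range(top, 99, -1):            # values arrive in reverse order
    rx.receive_value(index, derive(0, index, seed))
print(rx.regenerate_value(200) == derive(0, 200, seed))
```

After receiving indices 255 down to 100, the receiver holds at most BITS + 1 entries yet can still rebuild the value for 200.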

You can see the implementation for more optimized variants of the
above code.

Rusty Russell <rusty@rustcorp.com.au>
77 changes: 49 additions & 28 deletions ccan/crypto/shachain/shachain.c
@@ -5,15 +5,39 @@
 #include <string.h>
 #include <assert.h>
 
+#define INDEX_BITS ((sizeof(shachain_index_t)) * CHAR_BIT)
+
 static void change_bit(unsigned char *arr, size_t index)
 {
 	arr[index / CHAR_BIT] ^= (1 << (index % CHAR_BIT));
 }
 
-/* We can only ever *unset* bits, so to must only have bits in from. */
+static int count_trailing_zeroes(shachain_index_t index)
+{
+#if HAVE_BUILTIN_CTZLL
+	return index ? __builtin_ctzll(index) : INDEX_BITS;
+#else
+	int i;
+
+	for (i = 0; i < INDEX_BITS; i++) {
+		if (index & (1ULL << i))
+			break;
+	}
+	return i;
+#endif
+}
+
 static bool can_derive(shachain_index_t from, shachain_index_t to)
 {
-	return (~from & to) == 0;
+	shachain_index_t mask;
+
+	/* Corner case: can always derive from seed. */
+	if (from == 0)
+		return true;
+
+	/* Leading bits must be the same */
+	mask = ~((1ULL << count_trailing_zeroes(from))-1);
+	return ((from ^ to) & mask) == 0;
 }
 
 static void derive(shachain_index_t from, shachain_index_t to,
@@ -28,7 +52,7 @@ static void derive(shachain_index_t from, shachain_index_t to,
 	/* We start with the first hash. */
 	*hash = *from_hash;
 
-	/* This represents the bits set in from, and not to. */
+	/* This represents the bits set in to, and not from. */
 	branches = from ^ to;
 	for (i = ilog64(branches) - 1; i >= 0; i--) {
 		if (((branches >> i) & 1)) {
@@ -41,45 +65,42 @@ static void derive(shachain_index_t from, shachain_index_t to,
 void shachain_from_seed(const struct sha256 *seed, shachain_index_t index,
 			struct sha256 *hash)
 {
-	derive((shachain_index_t)-1ULL, index, seed, hash);
+	derive(0, index, seed, hash);
 }
 
 void shachain_init(struct shachain *chain)
 {
 	chain->num_valid = 0;
-	chain->max_index = 0;
+	chain->min_index = 0;
 }
 
 bool shachain_add_hash(struct shachain *chain,
 		       shachain_index_t index, const struct sha256 *hash)
 {
-	int i;
+	int i, pos;
 
 	/* You have to insert them in order! */
-	assert(index == chain->max_index + 1 ||
-	       (index == 0 && chain->num_valid == 0));
-
-	for (i = 0; i < chain->num_valid; i++) {
-		/* If we could derive this value, we don't need it,
-		 * not any others (since they're in order). */
-		if (can_derive(index, chain->known[i].index)) {
-			struct sha256 expect;
-
-			/* Make sure the others derive as expected! */
-			derive(index, chain->known[i].index, hash, &expect);
-			if (memcmp(&expect, &chain->known[i].hash,
-				   sizeof(expect)) != 0)
-				return false;
-			break;
-		}
+	assert(index == chain->min_index - 1 ||
+	       (index == (shachain_index_t)(-1ULL) && chain->num_valid == 0));
+
+	pos = count_trailing_zeroes(index);
+
+	/* All derivable answers must be valid. */
+	/* FIXME: Is it sufficient to check just the next answer? */
+	for (i = 0; i < pos; i++) {
+		struct sha256 expect;
+
+		/* Make sure the others derive as expected! */
+		derive(index, chain->known[i].index, hash, &expect);
+		if (memcmp(&expect, &chain->known[i].hash, sizeof(expect)))
+			return false;
 	}
 
-	/* This can happen if you skip indices! */
-	assert(i < sizeof(chain->known) / sizeof(chain->known[0]));
-	chain->known[i].index = index;
-	chain->known[i].hash = *hash;
-	chain->num_valid = i+1;
-	chain->max_index = index;
+	chain->known[pos].index = index;
+	chain->known[pos].hash = *hash;
+	if (pos + 1 > chain->num_valid)
+		chain->num_valid = pos + 1;
+	chain->min_index = index;
 	return true;
 }
