write some words about lease renewal secrets #1118

6 changes: 5 additions & 1 deletion docs/proposed/http-storage-node-protocol.rst
@@ -45,7 +45,7 @@ Glossary
(sometimes "slot" is considered a synonym for "storage index of a slot")

storage index
a short string which can address a slot or a bucket
a 16 byte string which can address a slot or a bucket
(in practice, derived by hashing the encryption key associated with contents of that slot or bucket)

write enabler
@@ -380,6 +380,10 @@ then the expiration time of that lease will be changed to 31 days after the time
If it does not match an existing lease
then a new lease will be created with this ``renew-secret`` which expires 31 days after the time of this operation.

``renew-secret`` and ``cancel-secret`` values must be 32 bytes long.
The server treats them as opaque values.
:ref:`Share Leases` gives details about how the Tahoe-LAFS storage client constructs these values.

In these cases the response is ``NO CONTENT`` with an empty body.
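
The add-or-renew behavior described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Tahoe-LAFS server code: `Lease`, `ShareLeases`, and `add_or_renew` are hypothetical names, leases are held in memory only, and comparing secrets with `hmac.compare_digest` is a choice made for this sketch.

```python
import hmac
from dataclasses import dataclass, field
from typing import List

LEASE_DURATION = 31 * 24 * 60 * 60  # 31 days, in seconds
SECRET_LENGTH = 32  # renew-secret and cancel-secret must be 32 bytes


@dataclass
class Lease:
    renew_secret: bytes
    cancel_secret: bytes
    expiration: float


@dataclass
class ShareLeases:
    """Hypothetical in-memory lease list for a single share."""
    leases: List[Lease] = field(default_factory=list)

    def add_or_renew(self, renew_secret: bytes, cancel_secret: bytes, now: float) -> Lease:
        # The server treats the secrets as opaque, but enforces their length.
        for secret in (renew_secret, cancel_secret):
            if len(secret) != SECRET_LENGTH:
                raise ValueError("lease secrets must be 32 bytes long")

        # A matching renew secret extends the existing lease.
        for lease in self.leases:
            if hmac.compare_digest(lease.renew_secret, renew_secret):
                lease.expiration = now + LEASE_DURATION
                return lease

        # Otherwise a new lease is created with this renew secret.
        lease = Lease(renew_secret, cancel_secret, now + LEASE_DURATION)
        self.leases.append(lease)
        return lease
```

Calling `add_or_renew` twice with the same renew secret leaves a single lease whose expiration is 31 days after the second call; a different renew secret adds a second lease.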

It is possible that the storage server will have no shares for the given ``storage_index`` because:
87 changes: 87 additions & 0 deletions docs/specifications/derive_renewal_secret.py
@@ -0,0 +1,87 @@

"""
This is a reference implementation of the lease renewal secret derivation
protocol in use by Tahoe-LAFS clients as of 1.16.0.
"""

from allmydata.util.base32 import (
    a2b as b32decode,
    b2a as b32encode,
)
from allmydata.util.hashutil import (
    tagged_hash,
    tagged_pair_hash,
)


def derive_renewal_secret(lease_secret: bytes, storage_index: bytes, tubid: bytes) -> bytes:
    assert len(lease_secret) == 32
    assert len(storage_index) == 16
    assert len(tubid) == 20

    bucket_renewal_tag = b"allmydata_bucket_renewal_secret_v1"
    file_renewal_tag = b"allmydata_file_renewal_secret_v1"
    client_renewal_tag = b"allmydata_client_renewal_secret_v1"

    client_renewal_secret = tagged_hash(lease_secret, client_renewal_tag)
    file_renewal_secret = tagged_pair_hash(
        file_renewal_tag,
        client_renewal_secret,
        storage_index,
    )
    peer_id = tubid

    return tagged_pair_hash(bucket_renewal_tag, file_renewal_secret, peer_id)

def demo():
    secret = b32encode(derive_renewal_secret(
        b"lease secretxxxxxxxxxxxxxxxxxxxx",
        b"storage indexxxx",
        b"tub idxxxxxxxxxxxxxx",
    )).decode("ascii")
    print("An example renewal secret: {}".format(secret))

def test():
    # These test vectors were created by instrumenting Tahoe-LAFS
    # bb57fcfb50d4e01bbc4de2e23dbbf7a60c004031 to emit `self.renew_secret` in
    # allmydata.immutable.upload.ServerTracker.query and then uploading a
    # couple files to a couple different storage servers.
    test_vector = [
        dict(lease_secret=b"boity2cdh7jvl3ltaeebuiobbspjmbuopnwbde2yeh4k6x7jioga",
             storage_index=b"vrttmwlicrzbt7gh5qsooogr7u",
             tubid=b"v67jiisoty6ooyxlql5fuucitqiok2ic",
             expected=b"osd6wmc5vz4g3ukg64sitmzlfiaaordutrez7oxdp5kkze7zp5zq",
        ),
        dict(lease_secret=b"boity2cdh7jvl3ltaeebuiobbspjmbuopnwbde2yeh4k6x7jioga",
             storage_index=b"75gmmfts772ww4beiewc234o5e",
             tubid=b"v67jiisoty6ooyxlql5fuucitqiok2ic",
             expected=b"35itmusj7qm2pfimh62snbyxp3imreofhx4djr7i2fweta75szda",
        ),
        dict(lease_secret=b"boity2cdh7jvl3ltaeebuiobbspjmbuopnwbde2yeh4k6x7jioga",
             storage_index=b"75gmmfts772ww4beiewc234o5e",
             tubid=b"lh5fhobkjrmkqjmkxhy3yaonoociggpz",
             expected=b"srrlruge47ws3lm53vgdxprgqb6bz7cdblnuovdgtfkqrygrjm4q",
        ),
        dict(lease_secret=b"vacviff4xfqxsbp64tdr3frg3xnkcsuwt5jpyat2qxcm44bwu75a",
             storage_index=b"75gmmfts772ww4beiewc234o5e",
             tubid=b"lh5fhobkjrmkqjmkxhy3yaonoociggpz",
             expected=b"b4jledjiqjqekbm2erekzqumqzblegxi23i5ojva7g7xmqqnl5pq",
        ),
    ]

    for n, item in enumerate(test_vector):
        derived = b32encode(derive_renewal_secret(
            b32decode(item["lease_secret"]),
            b32decode(item["storage_index"]),
            b32decode(item["tubid"]),
        ))
        assert derived == item["expected"], \
            "Test vector {} failed: {} (expected) != {} (derived)".format(
                n,
                item["expected"],
                derived,
            )
    print("{} test vectors validated".format(len(test_vector)))

test()
demo()
1 change: 1 addition & 0 deletions docs/specifications/index.rst
@@ -14,5 +14,6 @@ the data formats used by Tahoe.
URI-extension
mutable
dirnodes
lease
servers-of-happiness
backends/raic
69 changes: 69 additions & 0 deletions docs/specifications/lease.rst
@@ -0,0 +1,69 @@
.. -*- coding: utf-8 -*-

.. _share leases:

Share Leases
============

A lease is a marker attached to a share indicating that some client has asked for that share to be retained for some amount of time.
The intent is to allow clients and servers to collaborate to determine which data should still be retained and which can be discarded to reclaim storage space.
Zero or more leases may be attached to any particular share.

Renewal Secrets
---------------

Each lease is uniquely identified by its **renewal secret**.
This is a 32 byte string which can be used to extend the validity period of that lease.

To a storage server a renewal secret is an opaque value which is only ever compared to other renewal secrets to determine equality.

Storage clients will typically want to follow a scheme to deterministically derive the renewal secret for a particular share from information the client already holds about that share.
This allows a client to maintain and renew a single long-lived lease without maintaining additional local state.

The scheme in use in Tahoe-LAFS as of 1.16.0 is as follows.

* The **netstring encoding** of a byte string is the concatenation of:

  * the ASCII encoding of the base 10 representation of the length of the string
  * ``":"``
  * the string itself
  * ``","``

* The **sha256d digest** is the **sha256 digest** of the **sha256 digest** of a string.
Collaborator

This whole section would be a lot clearer if there was also some code included to demonstrate (with type annotations).

Member Author

something like this?

"""
This is a reference implementation of the lease renewal secret derivation
protocol in use by Tahoe-LAFS clients as of 1.16.0.
"""

from base64 import (
    b32encode,
)

from allmydata.util.hashutil import (
    tagged_hash,
    tagged_pair_hash,
)


def derive_renewal_secret(lease_secret: bytes, storage_index: bytes, tubid: bytes) -> bytes:
    assert len(lease_secret) == 32
    assert len(storage_index) == 20
    assert len(tubid) == 20

    bucket_renewal_tag = b"allmydata_bucket_renewal_secret_v1"
    file_renewal_tag = b"allmydata_file_renewal_secret_v1"
    client_renewal_tag = b"allmydata_client_renewal_secret_v1"

    client_renewal_secret = tagged_hash(lease_secret, client_renewal_tag)
    file_renewal_secret = tagged_pair_hash(
        file_renewal_tag,
        client_renewal_secret,
        storage_index,
    )
    peer_id = b32encode(tubid).lower().strip(b"=")

    return tagged_pair_hash(bucket_renewal_tag, file_renewal_secret, peer_id)

print(derive_renewal_secret(
    b"lease secretxxxxxxxxxxxxxxxxxxxx",
    b"storage indexxxxxxxx",
    b"tub idxxxxxxxxxxxxxx",
))

I'm not really sure where to put it, though, or what kind of quality control to try to apply to it ... I suppose I could try to observe a few derived renewal secrets from the actual client software and then capture them as test vectors ... But that code isn't exactly easy to invoke.

Collaborator

Yeah that is what I was thinking. You could just put it after the prose section.

I was able to validate that implementation by looking at existing code, other than the logic that calculates peer_id. But... that logic doesn't match the prose spec anyway, since it's supposed to be based on x509 certificate, not tub ID? So if you fix that line I think it's fine.

Member Author

The tubid is the sha1 hash of the x509 certificate. So I think it works out correctly the way I implemented it above. It's true that "tubid" is a Foolscap concept and the code that computes it is in Foolscap.

I didn't feel like grabbing cryptography and making the certificate an input to this function. The tubid is currently quite convenient to find (it's in every storage fURL) and that won't immediately change even after we drop Foolscap (because Tahoe will have to be capable of accepting fURLs for a long, long time).

So I'd like to leave this as-is for now but I would be happy to see it revisited in the near future to scrub all traces of Foolscap from the spec.

* The **sha256d tagged digest** is the **sha256d digest** of the concatenation of the **netstring encoding** of one string with one other unmodified string.
* The **sha256d tagged pair digest** is the **sha256d digest** of the concatenation of the **netstring encodings** of each of three strings.
* The **bucket renewal tag** is ``"allmydata_bucket_renewal_secret_v1"``.
* The **file renewal tag** is ``"allmydata_file_renewal_secret_v1"``.
* The **client renewal tag** is ``"allmydata_client_renewal_secret_v1"``.
* The **lease secret** is a 32 byte string, typically randomly generated once and then persisted for all future uses.
* The **client renewal secret** is the **sha256d tagged digest** of (**lease secret**, **client renewal tag**).
* The **storage index** is constructed using a capability-type-specific scheme.
  See ``storage_index_hash`` and ``ssk_storage_index_hash`` calls in ``src/allmydata/uri.py``.
* The **file renewal secret** is the **sha256d tagged pair digest** of (**file renewal tag**, **client renewal secret**, **storage index**).
* The **base32 encoding** is ``base64.b32encode`` lowercased and with trailing ``=`` stripped.
* The **peer id** is the **base32 encoding** of the SHA1 digest of the server's x509 certificate.
* The **renewal secret** is the **sha256d tagged pair digest** of (**bucket renewal tag**, **file renewal secret**, **peer id**).

A reference implementation is available.

.. literalinclude:: derive_renewal_secret.py
   :language: python
   :linenos:
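
For readers without a Tahoe-LAFS checkout, the digest definitions above can also be sketched using only the Python standard library. This is an illustrative sketch rather than the project's code: the helper names (`netstring`, `sha256d`, `tagged_digest`, `tagged_pair_digest`) are chosen here to mirror the prose, and, like the reference implementation, the sketch passes the 20 byte tub id through unchanged as the peer id.

```python
import base64
import hashlib


def netstring(s: bytes) -> bytes:
    # Decimal length, ":", the string itself, ",".
    return b"%d:%s," % (len(s), s)


def sha256d(data: bytes) -> bytes:
    # SHA-256 applied twice.
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def tagged_digest(tag: bytes, value: bytes) -> bytes:
    # Only the first string is netstring-encoded; the second is appended unmodified.
    return sha256d(netstring(tag) + value)


def tagged_pair_digest(tag: bytes, first: bytes, second: bytes) -> bytes:
    # All three strings are netstring-encoded before hashing.
    return sha256d(netstring(tag) + netstring(first) + netstring(second))


def b32encode(data: bytes) -> bytes:
    # base64.b32encode, lowercased, with trailing "=" stripped.
    return base64.b32encode(data).lower().rstrip(b"=")


def b32decode(text: bytes) -> bytes:
    # Restore the case and padding that b32encode removed.
    padding = b"=" * (-len(text) % 8)
    return base64.b32decode(text.upper() + padding)


def derive_renewal_secret(lease_secret: bytes, storage_index: bytes, tubid: bytes) -> bytes:
    client_renewal_secret = tagged_digest(
        lease_secret, b"allmydata_client_renewal_secret_v1")
    file_renewal_secret = tagged_pair_digest(
        b"allmydata_file_renewal_secret_v1", client_renewal_secret, storage_index)
    # Assumption carried over from the reference implementation: raw tub id bytes.
    return tagged_pair_digest(
        b"allmydata_bucket_renewal_secret_v1", file_renewal_secret, tubid)


print(b32encode(derive_renewal_secret(
    b"lease secretxxxxxxxxxxxxxxxxxxxx",
    b"storage indexxxx",
    b"tub idxxxxxxxxxxxxxx",
)).decode("ascii"))
```

The base32 test vectors in the reference implementation can be decoded with `b32decode` and fed to this function as a cross-check.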

Cancel Secrets
--------------

Lease cancellation is unimplemented.
Nevertheless,
a cancel secret is sent by storage clients to storage servers and stored in lease records.

The scheme for deriving the **cancel secret** in use in Tahoe-LAFS as of 1.16.0 is similar to that used to derive the **renewal secret**.

The differences are:

* Use of **client renewal tag** is replaced by use of **client cancel tag**.
* Use of **file renewal tag** is replaced by use of **file cancel tag**.
* Use of **bucket renewal tag** is replaced by use of **bucket cancel tag**.
* **client cancel tag** is ``"allmydata_client_cancel_secret_v1"``.
* **file cancel tag** is ``"allmydata_file_cancel_secret_v1"``.
* **bucket cancel tag** is ``"allmydata_bucket_cancel_secret_v1"``.
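
Because the two schemes differ only in their tags, one way to picture the relationship is a single derivation function parameterized by the kind of secret. The sketch below assumes exactly that; `derive_lease_secret`, `TAGS`, and the private helpers are hypothetical names for illustration and do not appear in Tahoe-LAFS, and the peer id is taken as an already-computed byte string.

```python
import hashlib

# (client tag, file tag, bucket tag) for each kind of secret.
TAGS = {
    "renewal": (
        b"allmydata_client_renewal_secret_v1",
        b"allmydata_file_renewal_secret_v1",
        b"allmydata_bucket_renewal_secret_v1",
    ),
    "cancel": (
        b"allmydata_client_cancel_secret_v1",
        b"allmydata_file_cancel_secret_v1",
        b"allmydata_bucket_cancel_secret_v1",
    ),
}


def _netstring(s: bytes) -> bytes:
    return b"%d:%s," % (len(s), s)


def _sha256d(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def derive_lease_secret(kind: str, lease_secret: bytes, storage_index: bytes, peer_id: bytes) -> bytes:
    client_tag, file_tag, bucket_tag = TAGS[kind]
    # Tagged digest: netstring of the first string, second string unmodified.
    client_secret = _sha256d(_netstring(lease_secret) + client_tag)
    # Tagged pair digests: netstrings of all three strings.
    file_secret = _sha256d(
        _netstring(file_tag) + _netstring(client_secret) + _netstring(storage_index))
    return _sha256d(
        _netstring(bucket_tag) + _netstring(file_secret) + _netstring(peer_id))


renewal = derive_lease_secret("renewal", b"l" * 32, b"s" * 16, b"p" * 20)
cancel = derive_lease_secret("cancel", b"l" * 32, b"s" * 16, b"p" * 20)
print(renewal != cancel)
```

With identical inputs the two kinds yield unrelated 32 byte values, so knowing a share's renewal secret does not reveal its cancel secret.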
1 change: 1 addition & 0 deletions newsfragments/3774.documentation
@@ -0,0 +1 @@
There is now a specification for the scheme which Tahoe-LAFS storage clients use to derive their lease renewal secrets.