Skip to content

Latest commit

 

History

History
546 lines (329 loc) · 29.7 KB

draft-20220616.md

File metadata and controls

546 lines (329 loc) · 29.7 KB

Spring '83

Discussion: This is an updated draft specification published in June 2022. You can find a narrative description of the protocol, and my contact information, in this newsletter.

Introduction

Spring '83 is a protocol that allows users to follow publishers on the internet -- who might be people, computer programs, or anything else -- in a way that's simple, expressive, and predictable.

The basic unit of the protocol is the board, which is an HTML fragment, limited to 2217 bytes, unable to execute JavaScript or load external resources, but otherwise unrestricted.

Each publisher maintains just one board. There is no concept of a history; think instead of a whiteboard that is amended or erased.

Spring '83 aspires to be:

Simple. This means the protocol is easy to understand and implement, even for a non-expert programmer.

Expressive. This means the protocol embraces the richness, flexibility, and chaos of modern HTML and CSS. It does not formalize interactions and relationships into database schemas.

(It also means the protocol doesn't provide any mechanism for replies, likes, favorites, or, indeed, feedback of any kind. Publishers are encouraged to use the flexibility of HTML to develop their own approaches, inviting readers to respond via email, join a live chat, send a postcard... whatever!)

Predictable. This means boards holds their place, maintaining a steady presence. It means also that clients only receive the boards they request, when they request them; there is no mechanism by which a server can "push" an unsolicited board.

In addition, Spring '83 is

Federated. This means boards live on different servers operated by different people. It means also that groups of peers share boards with one another, a strategy intended to support exploration and analysis of the full "board flow".

Spring '83 draws inspiration from many existing protocols and technologies; you can read about these in Discussion 1.

Implementation

The key words "must", "must not", "required", "shall", "shall not", "should", "should not", "recommended", "may", and "optional" in this document are to be interpreted as described in RFC 2119.

https://www.ietf.org/rfc/rfc2119.txt

Terminology

board: a fragment of HTML, not necessarily a valid HTML5 document, not more than 2217 bytes long, encoded in UTF-8.

publisher: the entity responsible for specifying a board's content.

key: a public key on the Ed25519 curve, formatted as 64 hex characters.

signature: a public key signature, formatted as 128 hex characters.

client: an application, web or standalone, that publishes, retrieves, and displays boards.

server: an application, reachable on the public internet, that accepts requests from clients to publish and retrieve boards.

listener: an application, reachable on the public internet, that receives boards not from clients but from servers, subscribing to the full "board flow" through a realm. (The listener's possible reasons for doing this are described later.)

peer: a server or listener.

realm: a group of peers who have agreed to circulate the full "board flow" using a simple gossip algorithm.

difficulty factor: a server's current requirement for new boards.

Publishers, keys, and servers

Publishers -- who might be people, computer programs, or anything else -- are identified and authorized by keypairs on the Ed25519 curve. The public part of the keypair, formatted as 64 hex characters, is the publisher's identifier, used to request their board from the server. The secret part of the keypair allows the publisher alone to edit their board.

See Discussion 2 for a consideration of the benefits (substantial) and drawbacks (likewise) of keypair identity schemes.

Each keypair corresponds to exactly one board. Again, there is no concept of a history; think instead of a whiteboard that is amended or erased.

Every publisher has a home server to which they transmit their board, and from which clients generally retrieve it. Publishers can always migrate to a new home server, taking their key with them. (This process is described in the section on clients, below.)

Because a publisher is identified by their key, boards are easy to verify. When a client requests a board for a particular key, the server might send an incorrect or modified response -- inserting an advertisement, perhaps -- but the client will detect the invalid signature, drop the board, and mark the server as untrustworthy.

Keys are globally unique, and they act as "coordination-free" identifiers, similar to UUIDs. Publishers don't need to register with any central authority. Instead, they need only to generate a keypair on the Ed25519 curve that conforms to a format requirement and then select, or start, a home server.

Aside: The use of keys for identification and authorization is, like many things in this protocol, motivated partially by a desire to keep implementation easy and stateless for server programmers and operators. Who wants to manage a complete login system? And send password reset emails? Not me!

A server is identified by a domain name and, optionally, a path. The transmission protocol is an HTTP API over TLS; therefore, the full URL for a board is

https://<host>/<path>/<key>

For example, a board currently hosted on the server bogbody.biz would be identified as

https://bogbody.biz/ca93846ae61903a862d44727c16fed4b80c0522cab5e5b8b54763068b83e0623

Discussion: A client may allow users to follow Spring '83 publishers alongside RSS feeds and other resources. In these situations, the client should distinguish Spring '83 URLs with a regex that recognizes conforming keys, or simply by making a request and noting the presence of a Spring-Version header. The ambiguity of Spring '83 URLs is somewhat intentional; they are just HTML fragments, and users can always preview them in a web browser.

Generating conforming keys

The protocol imposes a format requirement on keys. The content of Ed25519 keypairs is mostly (but not totally) random, so conforming keys can only be generated by trial and error.

The format requirement accomplishes two things at once:

  1. It presents a "client puzzle" which requires a one-time investment of compute resources to "solve". Like Hashcash, this provides a rudimentary form of abuse mitigation: a malicious publisher cannot generate conforming keys instantly and endlessly.

  2. It "bakes" some useful metadata into the key itself.

A conforming key's final seven hex characters must be 83e followed by four characters that, interpreted as MMYY, express a valid month and year in the range 01/00 .. 12/99. Formally, the key must match this regex:

/83e(0[1-9]|1[0-2])(\d\d)$/

Again, a conforming keypair can only be generated by trial and error. On a single thread in an Apple M1 chip, this is accomplished in tens of minutes, not seconds or less.

Aside: I am not totally sure it's impossible to "program" the content of an Ed25519 public key, particularly if you abandon security considerations. My impression is that it can't be done, but I could be wrong. If you have any insight into this, let me know. This format requirement could always be applied instead to the SHA-256 hash of the key.

The date "encoded" in those final four characters has teeth: the key is only valid in the two years preceding it, and expires at the end of the last day of the month specified. (This is analogous to a credit card expiration date.)

For example, the key

ca93846ae61903a862d44727c16fed4b80c0522cab5e5b8b54763068b83e0623

has an encoded expiration date of 0623, or 06/2023. This key is valid between 2021-06-01T00:00:00:00Z and 2023-07-01T00:00:00Z.

This expiration policy makes key rotation mandatory over the long term. Clients may implement features that make the process more convenient, even automatic, but the recurring "stress test" on the publisher-follower link is a feature, not a bug. The goal is to keep Spring '83 relationships "live" and engaged, with fresh opt-ins every two years at most.

The policy suggests, "you're going to lose your secret key eventually... why not lose it now?" and uses that loss as a mechanism to strengthen, rather than weaken, the network.

Discovering keys

The process of discovering a particular publisher's key, or discovering keys to follow generally, is not part of this specification.

However, it is presumed that a home page or profile page might contain a <link> element analogous to the kind used to specify RSS feeds. A client scanning a web page for an associated board should look for <link> elements with the type attribute set to text/board+html.

<link rel="alternate" type="text/board+html" href="https://bogbody.biz/ca93846ae61903a862d44727c16fed4b80c0522cab5e5b8b54763068b83e0623" />

Boards in the client

Much of the burden of the protocol falls on the client: to store a user's keypair securely, allow them to manage a collection of followed keys, and display boards safely.

The client must:

  • limit incoming boards to 2217 bytes
  • verify each board's cryptographic signature
  • situate each board inside its own Shadow DOM

The client must not:

  • execute any JavaScript included in the board
  • load any images, media, or fonts linked by the board

These two requirements should be satisfied with a Content Security Policy. The design of this policy is not part of this specification, but here's an example that works for a simple web-based client:

default-src 'none';
style-src 'self' 'unsafe-inline';
font-src 'self';
script-src 'self';
form-action *;
connect-src *;

It's important to allow unsafe-inline CSS so boards can style themselves!

Aside: Yes, HTML forms are allowed -- unless someone explains to me why this is a horrible idea 😝

Aside: The prohibition against images and other external resources is a matter of privacy and safety. Privacy, because it prevents the use of tracking pixels and other "transponders". Safety, because it lowers the stakes for malicious and illegal content. However, it is totally possible to imagine a future version of this protocol, Spring '84 or beyond, in which

  • a particular domain could be added to the CSP, providing a shared image catalog, analogous to the funky clip art choices offered to purchasers of classified ads. "You can use any pic you want, as long as it's on bukk.it!"
  • or, images and other external resources could simply be permitted.

The client should:

  • display each board in a region with an aspect ratio of either 1:sqrt(2) or sqrt(2):1
  • make available to boards the Spring '83 suite of CSS variables, listed in TKTK
  • open links in new windows or tabs

Spring '83 places no limitations on the HTML content of a board. Knowing that the client will not execute JavaScript, publishers will probably not want to spend their precious bytes on inert code... but they are free to do so.

Boards might be:

  • mini home pages, updated regularly with new work
  • lists of links to web pages recently read, perhaps with brief notes
  • daily logs, wiped clean each morning
  • yawlps to the universe

Beyond the standards described above, Spring '83 doesn't specify how the client should display boards. For example, the client may:

  • allow users to explore overflowing boards by scrolling
  • organize boards according to an abstruse algorithm
  • handle links to boards in a helpful way, even "transcluding" them (!)

Publishers should expect their boards to be placed on a 2D canvas alongside many others.

Boards with special client instructions

Clients should scan for the <link rel="next"> element:

<link rel="next" href="<URL>">

This element is used by publishers to migrate from key to key and server to server. When the client finds a <link rel="next"> element, it should retrieve the board at the URL, verify it normally, and update its record of the publisher's home server and/or key accordingly.

The board should contain only one one <link rel="next"> element. If it contains more than one, the client must honor the first and ignore the rest.

The client may also scan for arbitrary data stored in data-spring-* attributes throughout the board. These attributes and their uses will be defined by publishers and client developers.

Boards on the server

Spring '83 servers are, in operation, very similar to "plain old web" servers, with a few additional behaviors.

The server must

  • maintain a persistent store of boards
  • enforce a TTL, dropping boards over 22 days old
  • share newly-received boards with listeners

The transmission portion of Spring '83 operates over HTTP, using TLS, so it can be implemented easily using existing tools.

A Spring '83 server must respond at the following HTTP endpoints:

OPTIONS
PUT /<key>
GET /
GET /<key>

That's a pretty small API surface! We'll go through the endpoints one by one.

Serving clients: OPTIONS

Many clients will, by dint of being web apps, depend on CORS to retrieve boards from non-origin servers. Servers must support preflight OPTIONS requests to all endpoints, replying with:

HTTP/1.1 204 No Content
Access-Control-Allow-Methods: GET, PUT, OPTIONS
Access-Control-Allow-Origin: *
Access-Control-Allow-Headers: Content-Type, If-Modified-Since, Spring-Signature, Spring-Version
Access-Control-Expose-Headers: Content-Type, Last-Modified, Spring-Difficulty, Spring-Signature, Spring-Version

Publishing boards: PUT /<key>

To publish a board, a client must send a request of this form:

PUT /<key> HTTP/1.1
Content-Type: text/html;charset=utf-8
Spring-Version: 83
If-Unmodified-Since: <date and time in UTC, HTTP (RFC 5322) format>
Spring-Signature: <signature>

<board>

Upon receipt, servers should check the non-cryptographic part of the PUT request immediately, and, if necessary, respond with an error code.

Checking boards

Size

If the board is larger than 2217 bytes, the server must reject the PUT request with 413 Payload Too Large.

Timestamp

The If-Unmodified-Since header is transmitted as a convenience for caches and other intermediaries that "speak HTTP". The server must disregard it.

The "real" timestamp -- the one that will be verified cryptographically -- is transmitted as part of the board HTML. The client must include a <time> element with its datetime attribute set to a UTC timestamp in ISO 8601 format, without milliseconds: YYYY-MM-DDTHH:MM:SSZ.

For the convenience of server implementers, the <time> element must fit the following format exactly; "valid HTML" is not sufficient:

<time datetime="YYYY-MM-DDTHH:MM:SSZ">

The client should append the <time> element to the beginning of the board, allowing the server to "match fast".

The <time> element may have text content and a closing tag, but this isn't required. The client should include default CSS that hides the element; a publisher may choose to override that CSS, revealing it.

Aside: Is this too fiddly? I am trying to save server implementers the trouble of running full HTML parsers!

It's important that the timestamp is transmitted this way, because it needs to be signed along with the rest of the board's HTML. The client should add the <time> element automatically, just before signing and transmitting the board.

The server must reject the PUT request, returning 400 Bad Request, if

  • the request is transmitted without a <time> element; or
  • its <time> element's datetime attribute is not a UTC timestamp in ISO 8601 format; or
  • its <time> element's datetime attribute is set to a timestamp in the future.

The board should contain only one <time> element. If it contains more than one, the server must honor the first and ignore the rest.

If the board's timestamp is older than or equal to the timestamp of the server's version of the board, the server must reject the request, returning 409 Conflict.

These criteria are EXTREMELY important. The <time> element modulates a board's change over time; if it gets screwed up, the publisher could lose the ability to update their board.

Cryptographic validity

The server must verify that <signature> is <key>'s valid signature for the board, exactly as transmitted. If the board isn't properly signed, the server must reject the request, returning 401 Unauthorized.

Conforming keys

The key format requirements are described earlier in this specification, and reproduced here for convenience.

A conforming key's final seven hex characters must be "83e" followed by four characters that, interpreted as MMYY, express a valid month and year in the range 01/00 .. 12/99. Formally, the key must match this regex:

/83e(0[1-9]|1[0-2])(\d\d)$/

If the key does not match that regex, the server must reject the request, returning 403 Forbidden.

The key is only valid in the two years preceding its encoded expiration date, and expires at the end of the last day of the month specified. For example, the key

c761fd8e4abc6ee4ca6d0883a95b7f0c88d33835a085b382dfbfb435283e0623

has an encoded expiration date of 0623, or 06/2023. This key is valid between 2021-06-01T00:00:00:00Z and 2023-07-01T00:00:00Z.

If the key has expired or is set to a date more than two years in the future, the server must reject the request, returning 403 Forbidden.

Difficulty factor

If the server doesn't yet have any board stored for the key, then it must apply an additional check. The key's first 16 hex characters, interpreted as a 64-bit number, must be less than a threshold defined by the server's difficulty factor:

MAX_KEY_64 = (2**64 - 1)
key_64_threshold = round(MAX_KEY_64 * ( 1.0 - difficulty_factor))

If the key fails this check, the server must reject the PUT request, returning 403 Forbidden.

This check is not applied to keys for which the server already has a board stored. (You can read more about the difficulty factor later in this specification.)

If the difficulty factor is 1.0, the server must reject all PUT requests for new boards.

Aside: The difficulty factor check only operates on the key's first 64 bits for convenience, avoiding the use of special "big number" data types. JavaScript developers will, however, still have to use BigInt.

Final considerations

Spring '83 defines a test keypair:

public: ab589f4dde9fce4180fcf42c7b05185b0a02a5d682e353fa39177995083e0583
secret: 3371f8b011f51632fea33ed0a3688c26a45498205c6097c352bd4d079d224419

Servers must reject PUT requests for this key, returning 401 Unauthorized. (See the section detailing GET /<key>, below, for additional guidance.)

The server may use a denylist to block certain keys, rejecting all PUTs for those keys.

Aside: It's likely this would be a shared denylist, but I don't yet have a proposal for how that would operate. It fits into the realm, of course, and feels like part of the contract: you get to be included in the "board flow", but/and you need to honor this shared denylist. Where is the list stored? How is it updated? Does efficient matching become a problem if a denylist has 100,000 entries? Lots of puzzles here.

If the preceding checks are all met, the server must store the board and share it with listeners in its realm.

Listeners and realms: peer-to-peer PUT /<key>

ALERT: This section is extremely fiddly, and none of it is implemented in my demo server. I am leaving the text intact for the sake of discussion, but feel free to skip it.

A realm is a group of Spring '83 servers and listeners who have agreed to circulate new boards using a simple gossip algorithm. It provides a framework for aggregating "the whole network" in one place, something that's otherwise difficult to do in a federated system.

Why would a listener want to join a realm and receive all its PUT requests? Well, with access to full "board flow", the listener could

  • scan boards and produce a report of top links across the realm
  • calculate algorithmic “who to follow” recommendations and offer them to anyone who might be interested
  • scan for malicious or illegal content and alert server operators

In this way, listeners can add new "features" to the protocol.

A realm is described by a UTF-8 text document reachable on the public internet. Each of the document's lines lists one server. Lines containing only whitespace must be ignored. Lines that begin with # are comments and must be ignored. For example, the file https://spring83.net/supreme.txt might describe a very small realm:

# behold, the Realm Supreme
# to join, email robin@robinsloan.com

bogbody.biz
working.directory
publisher.pizza

# well, maybe it's not so Supreme

Aside: This is where I got frustrated and gave up on this section, at least for the time being. I feel like servers and listeners in a realm really ought to be able to authenticate each other's requests... but does that require each to have a public key listed in this text document? And to calculate an additional signature for all the boards they circulate? Suddenly, this is feeling sort of "heavy" and, perhaps, stupid.

There is no automatic or "trustless" way to join a realm. As with BGP, the foundational routing protocol of the internet: you gotta talk to somebody!

In the section below, the term "peer" means servers, who receive PUT requests from clients, AND listeners, who receive PUT requests from servers.

The peer should belong to one realm. It may belong to zero or many, but the protocol's design favors one. A server that belongs to zero realms can still store a publisher's board without any problem, providing it to clients of all kinds; it just won't provide that board to any listeners.

After receiving and verifying a new board, the peer must share it with N listeners selected randomly from the realm. N is

min(round(total_peer_count * 0.5), 5)

New boards should be transmitted to listeners asynchronously. The server should wait at least five minutes before sharing, and it may wait longer. In this way, the server can act as a buffer, absorbing and "compacting" rapid PUTs.

To share a board with a listener, the server must transmit a PUT request exactly like the one described above.

When the listener receives a PUT, it should respond immediately with 202 Accepted and place the board into a queue for asynchronous processing. (If the listener doesn't have a queue, it may respond synchronously.)

If a listener is not reachable or responds with an error code, the server must wait for a minimum timeout of 5 minutes before attempting to contact that listener again. If the listener is still not reachable, the server must apply a jittered backoff strategy. The server should cap its timeout at some maximum; one hour is recommended.

Relief: It's over. Thank goodness.

Difficulty factor: GET /

Spring '83 is intended to invite participation from people with "normal" compute resources, whether that means a home server with a public IP address or a free-tier cloud instance. The protocol includes, therefore, a mechanism to apply "backpressure" to publishers, slowing or stopping the creation of new boards.

The difficulty factor is a decimal number in the range 0.0 .. 1.0, and is calculated based on the maximum number of boards the server would like to store. As described earlier in this specification, the difficulty factor is used to apply an extra (and potentially quite stringent) check on new boards.

Aside: It's very possible that none of this is needed. Instead, when a server is under stress, it could just flip the sign on the door: CLOSED! Aha, yes, there's an elegant algorithm for you. Besides, the format requirement already imposes a substantial "cost" on the creation of new boards. Ugh, but this was so CLEVER. We are in "kill your darlings" territory here...

The calculation of the difficulty factor is not part of this specification, but here's an example formula that works well:

difficulty_factor = ( num_boards_stored / max_boards )**4

The exponent is included because the difficulty factor is only intended to dissuade new boards "at the margin"; after all, we're here to store and provide HTML, aren't we?

Let's consider the example of a server that is currently storing 60,000 boards and would like to store, at most, 100,000.

difficulty_factor = ( 60_000 / 100_000 )**4 = 0.1296

Using that difficulty factor, we can calculate the key threshold:

MAX_KEY_64 = (2**64 - 1)
key_64_threshold = MAX_KEY_64 * ( 1.0 - 0.1296 ) = <a 64-bit number>

The server must check the key's first 64 bits against that 64-bit number, as described earlier in this specification.

A client should determine a server's current difficulty factor using a request of this form:

GET / HTTP/1.1
Spring-Version: 83

The server's response must take the form:

HTTP/1.1 200 OK
Content-Type: text/html;charset=utf-8
Spring-Version: 83
Spring-Difficulty: <difficulty factor>

<server greeting>

Using that information, the client can choose either to spend the time required to generate a conforming key... or give up, and perhaps try later.

Retrieving boards: GET /<key>

To retrieve the board for <key> from its home server, the client must transmit a request of this form:

GET /<key> HTTP/1.1
Spring-Version: 83
If-Modified-Since: <date and time in UTC, RFC 5322 format>

If the server has a board for <key> but it is not newer than the timestamp specified in If-Modified-Since, it must respond with 304 Not Modified.

If the client omits If-Modified-Since, the server should return whatever board it has for <key>, if it has one.

The server's response must take the form:

HTTP/1.1 200 OK
Content-Type: text/html;charset=utf-8
Spring-Version: 83
Last-Modified: <date and time in UTC, HTTP (RFC 5322) format>
Spring-Signature: <signature>

<board>

The client must verify that <signature> is <key>'s valid signature for the board, exactly as transmitted. If the board isn't signed properly, the client must drop the response. It should communicate to the user that the server is either faulty or dishonest.

Spring '83 specifies a test keypair:

public: ab589f4dde9fce4180fcf42c7b05185b0a02a5d682e353fa39177995083e0583
secret: 3371f8b011f51632fea33ed0a3688c26a45498205c6097c352bd4d079d224419

The server should respond to GET requests for this key with an ever-changing board, generated internally, with a timestamp set to the time of the request. This board is provided to help client developers understand and troubleshoot their applications.

Forgetting boards

The server must store boards with a TTL of, at most, 22 days.

The server must provide identical responses to requests for

  • <key A>, for which it once stored a board, now deleted, and
  • <key B>, for which it never stored any board.

Finally, the server must not enumerate keys or boards for any requester; the server must only respond to requests for specific keys.

Error code quick reference

  • 400: Board was submitted with an impromper timestamp.
  • 401: Board was submitted without a valid signature.
  • 403: Board was submitted for a non-conforming key.
  • 404: No board for this key found on this server.
  • 409: Board was submitted with a timestamp older than the server's timestamp for this key.
  • 413: Board is larger than 2217 bytes.

Discussions

1: Inspirations

This protocol draws inspiration

  • from Secure Scuttlebutt: the power of cryptographic keypairs as identities
  • from Hashcash: the notion of guarding server resources with client puzzles
  • from ZeroTier: the spirit of "decentralize until it hurts, then centralize until it works".

and, most of all,

For those not familiar: QOTD operates over both TCP and UDP; the server responds to a TCP connection or a UDP datagram with a single brief message.

In fact, I very badly wanted Spring '83 to operate over UDP -- responding to a request with, in some cases, a single datagram packet: beautiful -- but the architecture of the modern internet makes that much more difficult today than it was in 1983, and, anyway, there's a universe of capable tooling for HTTP.

So, not without regret, Spring '83 goes with the flow.

2: The agony and ecstasy of public key cryptography

There are a lot of reasons NOT to use cryptographic keypairs as identities, not least of which: the certainty that a user will eventually lose their secret key.

But the profound magic trick of the signature: that it allows a piece of content to move through the internet, handed from server to server, impossible to tamper with... it's too good to pass up.

And the way public key cryptography allows anyone to "join" the system, without registering anywhere, simply by generating an appropriate keypair -- again, it's a trick so good it seems like it shouldn't work.

The trapdoor function at the heart of public key cryptography is mirrored in the trapdoor of its user experience: once a secret key is lost or compromised, the identity is done. Game over. The system offers no customer support, because the system is math, and all the angels are busy assisting other callers.

It's one of the harshest if/thens in all of computing, and a steep price to pay for the magic trick. Spring '83, given its other priorities and constraints, is willing to pay that price -- but only barely.

The compromise is mandatory key rotation, enforced with the time-of-use requirement. This policy suggests, "you're going to lose your secret key eventually... why not lose it now?" and uses that loss as a mechanism to strengthen, rather than weaken, the network.

Beyond that, there is plenty of space for Spring '83 clients to offer hosted ("custodial") secret keys, abstracting the keypair behind a more traditional username and password.