
(PREVIEW BRANCH) all the new specs combined #120

Closed
wants to merge 29 commits

Conversation


@bnewbold bnewbold commented Jun 15, 2023

@bnewbold bnewbold marked this pull request as ready for review June 15, 2023 02:42
@bnewbold bnewbold marked this pull request as draft June 15, 2023 02:43
cloudflare-workers-and-pages bot commented Jun 15, 2023

Deploying with Cloudflare Pages

Latest commit: 9b4560a
Status: ✅ Deploy successful!
Preview URL: https://50b1fc1b.atproto-website.pages.dev
Branch Preview URL: https://bnewbold-draft-specs-june.atproto-website.pages.dev

@warpfork

There are quite a few links to www.notion.so/blueskyweb/* which are non-public -- if those docs live in Notion but are public, maybe you want to get their share links instead? Notion isn't very good about redirecting non-logged-in users to the public versions of pages if they arrive at the internal links.

@bnewbold
Contributor Author

@warpfork missed some, thanks! These don't link to actual documents, they were relative links between spec pages in Markdown which Notion mangled. Will resolve these in the individual PRs.

@bnewbold
Contributor Author

TODO(bnewbold): "The first entry in the array for a given node should contain the full key, and a common prefix length of 0." => "must", not "should"

Clarify that prefix compression is required and deterministic.
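The prefix compression being discussed can be sketched as follows. This is a hypothetical illustration (function names are not from any real atproto implementation): within a node, each entry stores only the suffix after the longest common prefix with the previous entry's full key, and the first entry must carry the full key with a prefix length of 0.

```python
# Hypothetical sketch of deterministic key prefix compression for MST node
# entries. Each entry is (prefix_len, suffix); the first entry in a node
# must have prefix_len 0 and carry the full key.

def compress_keys(keys: list[bytes]) -> list[tuple[int, bytes]]:
    entries = []
    prev = b""
    for i, key in enumerate(keys):
        if i == 0:
            plen = 0  # first entry: full key, prefix length must be 0
        else:
            plen = 0
            while plen < min(len(prev), len(key)) and prev[plen] == key[plen]:
                plen += 1
        entries.append((plen, key[plen:]))
        prev = key
    return entries

def decompress_keys(entries: list[tuple[int, bytes]]) -> list[bytes]:
    keys, prev = [], b""
    for plen, suffix in entries:
        key = prev[:plen] + suffix
        keys.append(key)
        prev = key
    return keys
```

Because the prefix length is always the longest common prefix with the previous key, two implementations compressing the same sorted key list produce byte-identical entries, which is what "required and deterministic" buys for content addressing.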


## Server-to-server API
**Application:** APIs and record schemas for applications built on atproto are specified in [Lexicons](/specs/lexicon), which are referenced by [Namespaced Identifiers](/specs/nsid) (NSIDs). Application-specific aggregations (such as search) are provided by an Application View (AppView) service. Clients can include mobile apps, desktop software, or web interfaces.
Contributor

Hmm! AppView is new to me. Sounds like it overlaps a lot with BGS. Is BGS the architectural concept, and AppView is more of a service/interface?

@devinivy devinivy Jun 17, 2023

The BGS is primarily a "dumb" rehoster of content across PDSes, whereas the AppView provides the views specific to the Bluesky application. The BGS doesn't need to have semantic awareness of the content it's hosting, whereas the AppView is interpreting "like", "follow", "post", "profile", etc. records into more complex views to service an application. Roughly speaking, BGS is concerned with com.atproto.sync.* and com.atproto.repo.* lexicons, and the Bluesky AppView is concerned with app.bsky.* lexicons.


## Possible Future Changes

The validation rules for unexpected additional fields may change. For example, there could be a mechanism for Lexicons to indicate that a schema is "closed" and unexpected fields are not allowed, or a convention of field name prefixes (`x-`) to indicate unofficial extensions.
@snarfed snarfed Jun 17, 2023

Just FYI, the standards community has generally moved away from experimental X- prefixes, and agreed that they caused more harm than good. https://datatracker.ietf.org/doc/html/rfc6648#appendix-B . (Also relevant to your idea of including experimental in new lexicon NSIDs.) Same with vendor prefixes more recently, https://www.webstandards.org/2012/02/09/call-for-action-on-vendor-prefixes/index.html

Contributor Author

Thanks for the refs! We do need to think through best practices here more carefully. I guess we think of the primary extension mechanism (new lexicons) as being pretty clear, but that might be because we are in the unusual position of having full control over the most popular lexicons. Other folks sure do seem to want to stuff fields into these lexicons, and we should have a better answer.


Unexpected fields in data which otherwise conforms to the Lexicon should be ignored. When doing schema validation, they should be treated at worst as warnings. This is necessary to allow evolution of the schema by the controlling authority, and to be robust in the case of out-of-date Lexicons.

Third parties can technically insert any additional fields they want into data. This is not the recommended way to extend applications, but it is not specifically disallowed. One danger with this is that the Lexicon may be updated to include fields with the same field names but different types, which would make existing data invalid.
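The lenient behavior the excerpt describes (unknown fields ignored, surfaced at worst as warnings, declared fields still type-checked) could be sketched like this. All names here are illustrative, not from any real atproto SDK:

```python
# Hypothetical sketch of lenient Lexicon-style validation: fields not
# declared in the schema are ignored (collected as warnings) rather than
# failing validation; declared fields are still type-checked strictly.

def validate_record(record: dict, schema_fields: dict) -> list[str]:
    warnings = []
    for name, value in record.items():
        if name not in schema_fields:
            # Unknown field: ignore, but note it for diagnostics.
            warnings.append(f"unknown field ignored: {name}")
            continue
        expected_type = schema_fields[name]
        if not isinstance(value, expected_type):
            raise TypeError(f"field {name!r}: expected {expected_type.__name__}")
    return warnings
```

This keeps old validators forward-compatible with newer schema revisions: a record written against a newer Lexicon still validates, and the extra fields merely produce warnings.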
Contributor

💯

Contributor

...also one thing that's missing here is guidance on whether PDSes should accept records with unknown lexicons, and if so, how they should validate them, etc. Depends on lexicon resolution I guess?

Contributor

@pfrazee implied/said yes in bluesky-social/atproto#855:

> the goal is to make the PDS indifferent to application schemas and provide generic behaviors only; that way adding some new schema doesn't run the risk of losing forward progress

Contributor Author

Yeah, we definitely want PDS implementations to be more agnostic to the lexicons of data they host. Right now, in prod, we still have PDS+AppView conjoined, which is why we haven't loosened up on this yet.

We wanted to build a bit more confidence here by first actually deploying federation with the app.bsky lexicons, running PDS+BGS+AppView as distinct network services (a slow-roll transition happening right now), and probably by building an example separate/independent lexicon ourselves as a secondary app to see how that goes, before committing to this entire architecture. That does take time, though, and we should probably just get out of the way and let devs experiment in other spaces. One step would be to allow arbitrary un-validated records outside the com.atproto and app.bsky spaces, keeping those two on a shorter leash to start.

We also haven't worked out the "fetch Lexicon automatically over the network given the NSID" flow. There are a couple of options for how to do that, none super complex, but it isn't clear whether that will be reliable enough to require fetching and validation for new record types, or whether we'll end up with a scheme where PDS instances have some "popular" or "common" set of lexicons built-in (compiled-in?) and work with just those.


Note that very long handles cannot be resolved using this method if the additional `_atproto.` name segment pushes the overall name over the 253-character maximum for DNS queries. The HTTPS method will work for such handles.

DNSSEC is not required.
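The length constraint in the excerpt can be checked up front. A minimal sketch, assuming the `_atproto.` prefix and 253-character DNS limit described above (the function name is illustrative):

```python
# Hypothetical sketch: a handle resolved via DNS gets an "_atproto." label
# prepended, and the resulting name must fit within the 253-character DNS
# name limit. Handles that don't fit must fall back to the HTTPS method.

MAX_DNS_NAME = 253
DNS_PREFIX = "_atproto."

def can_resolve_via_dns(handle: str) -> bool:
    return len(DNS_PREFIX + handle) <= MAX_DNS_NAME
```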
Contributor

👍 😭

Contributor Author

Yeah... sort of poking the hornet's nest there.

maybe we should require IPv6 though? 😈

```
@@ -92,11 +92,68 @@ laptop.local
blah.arpa
```

### Usage and Implementation Guidelines
## Handle Resolution
@snarfed snarfed Jun 17, 2023

Thanks! This is all great.

As mentioned before, definitely questions here around how this interacts with federation, but that's for the future. And ideally guidance on when/how to deal with handles in records as alternatives to DIDs. (Sounds like consumers are generally expected to prefer DIDs but accept both interchangeably.)

| Type | HTTP Method |
|-|-|
|`query`|`GET`|
|`procedure`|`POST`|

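The mapping above is small enough to express directly. A minimal sketch (names are illustrative, not from any real atproto SDK) of dispatching a lexicon method type to its HTTP verb:

```python
# Sketch of the XRPC mapping shown above: "query" lexicon methods use
# HTTP GET, "procedure" methods use POST; anything else is rejected.
XRPC_HTTP_METHOD = {"query": "GET", "procedure": "POST"}

def http_method_for(lexicon_type: str) -> str:
    try:
        return XRPC_HTTP_METHOD[lexicon_type]
    except KeyError:
        raise ValueError(f"unsupported XRPC method type: {lexicon_type}")
```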
### Admin Token
Contributor

Interesting! I'm guessing this is primarily a legacy holdover, so it's more descriptive of the current prod PDS than normative, right? Otherwise it seems like this could be an implementation detail that doesn't really need to be standardized, or if it does, maybe it would fit better in the protocol or repository docs?

Contributor Author

Yeah, all the auth stuff is sort of temporary and we expect to improve a lot on this. But that isn't going to be a quick fix, and we want to be transparent and clear how the existing scheme works in the meanwhile.

We do expect to have some of the other things, like repo history and sync stream, firmed up sooner than auth, which is why we just don't include those at all yet.


If a client cannot keep up with the rate of messages, the server may send a "too slow" error and close the connection.

### Sequence Numbers
Contributor

Oof, this is a lot. Sure would be nice if we could get some of it for free by switching the namespace from PDS hostname + endpoint to user DID + collection and then depending on monotonically increasing TIDs instead. I guess that would mean servers and clients would have to keep track of O(users) sequence TIDs instead of just O(servers/clients + endpoints) sequence numbers though.

Contributor Author

This is under pretty active discussion internally, and with a few external tech advisors. Hard to compress all the conversation to a short summary. Also, realistically, this is fairly high up in the protocol stack and the sort of thing that might have multiple implementations over time. Eg, internally and at scale, can't really imagine not using Kafka for the big firehoses; maybe we'll have a formal "how to do this in Kafka within an org/consortium" spec or recommendation.

Contributor Author

Also, clarified that the sequence stuff is optional. Can totally create subscription Lexicons which don't include it.

@bnewbold
Contributor Author

Thanks for feedback from everybody on this group of changes!

Everything in this branch has now been merged and deployed to the live website. Future corrections should go in new issues or PRs.

There are some active discussions here, and this branch was linked from a couple places. I'll leave it for another day or two then close the PR (which IIUC will remove the PR preview site).

@bnewbold bnewbold closed this Jun 20, 2023