-
Notifications
You must be signed in to change notification settings - Fork 174
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
(PREVIEW BRANCH) all the new specs combined #120
Conversation
Deploying with
|
Latest commit: |
9b4560a
|
Status: | ✅ Deploy successful! |
Preview URL: | https://50b1fc1b.atproto-website.pages.dev |
Branch Preview URL: | https://bnewbold-draft-specs-june.atproto-website.pages.dev |
There's quite a few links to |
@warpfork missed some, thanks! These don't link to actual documents, they were relative links between spec pages in Markdown which Notion mangled. Will resolve these in the individual PRs. |
TODO(bnewbold): "The first entry in the array for a given node should contain the full key, and a common prefix length of 0." => must not should clarify that prefix compression is required and deterministic. |
content/specs/atp.md
Outdated
|
||
## Server-to-server API | ||
**Application:** APIs and record schemas for applications built on atproto are specified in [Lexicons](/specs/lexicon), which are are referenced by [Namespaced Identifiers](/specs/nsid) (NSIDs). Application-specific aggregations (such as search) are provided by an Application View (AppView) service. Clients can include mobile apps, desktop software, or web interfaces. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm! AppView is new to me. Sounds like it overlaps a lot with BGS. Is BGS the architectural concept, and AppView is more of a service/interface?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The BGS is primarily a "dumb" rehoster of content across PDSes, whereas the AppView provides the views specific to the Bluesky application. The BGS doesn't need to have semantic awareness of the content it's hosting, whereas the AppView is interpreting "like", "follow", "post", "profile", etc. records into more complex views to service an application. Roughly speaking, BGS is concerned with com.atproto.sync.*
and com.atproto.repo.*
lexicons, and the Bluesky AppView is concerned with app.bsky.*
lexicons.
|
||
## Possible Future Changes | ||
|
||
The validation rules for unexpected additional fields may change. For example, a mechanism for Lexicons to indicate that the schema is "closed" and unexpected fields are not allowed, or a convention around field name prefixes (`x-`) to indicate unofficial extension. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just FYI, the standards community has generally moved away from experimental X- prefixes, and agreed that they caused more harm than good. https://datatracker.ietf.org/doc/html/rfc6648#appendix-B . (Also relevant to your idea of including experimental
in new lexicon NSIDs.) Same with vendor prefixes more recently, https://www.webstandards.org/2012/02/09/call-for-action-on-vendor-prefixes/index.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the refs! We do need to think through what the best practice is here better. I guess we think of the primary extension mechanism (new lexicons) as being pretty clear, but that might be because we are in a weird perspective of having full control over the most popular lexicons. other folks sure do seem to want to stuff fields in these lexicons, and we should have a better answer.
|
||
Unexpected fields in data which otherwise conforms to the Lexicon should be ignored. When doing schema validation, they should be treated at worst as warnings. This is necessary to allow evolution of the schema by the controlling authority, and to be robust in the case of out-of-date Lexicons. | ||
|
||
Third parties can technically insert any additional fields they want in to data. This is not the recommended way to extend applications, but it is not specifically disallowed. One danger with this is that the Lexicon may be updated to include fields with the same field names but different types, which would make existing data invalid. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💯
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...also one thing that's missing here is guidance on whether PDSes should accept records with unknown lexicons, and if so, how they should validate them, etc. Depends on lexicon resolution I guess?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pfrazee implied/said yes in bluesky-social/atproto#855 :
the goal is to make the PDS indifferent to application schemas and provide generic behaviors only; that way adding some new schema doesnt run the risk of losing forward progress
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we definitely want PDS implementations to be more agnostic to the lexicons of data they host. Right now, in prod, we still have PDS+AppView conjoined, which is why we haven't loosened up on this yet.
We wanted to have a bit more confidence in this by first actually deploying federation with the app.bsky
lexicons working with PDS+BGS+AppView as distinct network services (happening as a slow-roll transition right now), and probably us ourselves doing an example separate/independent lexicon as a secondary app to see how that goes, before we are confident in this entire architecture. That does take time and we should probably just get out of the way and let devs experiment in other spaces though. One step would be to allow random un-validated records outside the com.atproto
and app.bsky
spaces, to keep those on a shorter leash to start.
We also haven't worked out the "fetch Lexicon automatically over the network given the NSID". A couple options on how to do that, none super complex, but not clear if that will be reliable enough to require fetching and validation for new record types, or to end up with a schema where PDS instances have some "popular" or "common" set of lexicons built-in (compiled-in?) and work with just those.
|
||
Note that very long handles can not be resolved using this method if the additional `_atproto.` name segment pushes the overall name over the 253 character maximum for DNS queries. The HTTPS method will work for such handles. | ||
|
||
DNSSEC is not required. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 😭
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah... sort of poking the hornets nest there.
maybe we should require IPv6 though? 😈
@@ -92,11 +92,68 @@ laptop.local | |||
blah.arpa | |||
``` | |||
|
|||
### Usage and Implementation Guidelines | |||
## Handle Resolution |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! This is all great.
As mentioned before, definitely questions here around how this interacts with federation, but that's for the future. And ideally guidance on when/how to deal with handles in records as alternatives to DIDs. (Sounds like consumers are generally expected to prefer DIDs but accept both interchangeably.)
|-|-| | ||
|`query`|`GET`| | ||
|`procedure`|`POST`| | ||
### Admin Token |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting! I'm guessing this is primarily a legacy holdover, so it's more descriptive of the current prod PDS than normative, right? Otherwise it seems like this could be an implementation detail that doesn't really need to standardized, or if it does, maybe it would fit better in the protocol or repository docs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, all the auth stuff is sort of temporary and we expect to improve a lot on this. But that isn't going to be a quick fix, and we want to be transparent and clear how the existing scheme works in the meanwhile.
We do expect to have some of the other things, like repo history and sync stream, firmed up sooner than auth, which is why we just don't include those at all yet.
|
||
If a client can not keep up with the rate of messages, the server may send a "too slow" error and close the connection. | ||
|
||
### Sequence Numbers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oof, this is a lot. Sure would be nice if we could get some of it for free by switching the namespace from PDS hostname + endpoint to user DID + collection and then depending on monotonically increasing TIDs instead. I guess that would mean servers and clients would have to keep track of O(users) sequence TIDs instead of just O(servers/clients + endpoints) sequence numbers though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is under pretty active discussion internally, and with a few external tech advisors. Hard to compress all the conversation to a short summary. Also, realistically, this is fairly high up in the protocol stack and the sort of thing that might have multiple implementations over time. Eg, internally and at scale, can't really imagine not using Kafka for the big firehoses; maybe we'll have a formal "how to do this in Kafka within an org/consortium" spec or recommendation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, clarified that the sequence stuff is optional. Can totally create subscription Lexicons which don't include it.
Thanks Emily! Co-authored-by: Emily Liu <emilyliu7321@gmail.com>
Co-authored-by: Emily Liu <emilyliu7321@gmail.com>
Co-authored-by: Daniel Holmgren <dtholmgren@gmail.com> Co-authored-by: Emily Liu <emilyliu7321@gmail.com>
Thanks for feedback from everybody on this group of changes! Everything in thing branch has now been merged and deployed in to the live website. Future corrections should go in new issues or PRs. There are some active discussions here, and this branch was linked from a couple places. I'll leave it for another day or two then close the PR (which IIUC will remove the PR preview site). |
Don't review or merge this branch!
This is just to have a combined PR preview to link and share.
Actual PRs for review: