RFC: Lexicon Resolution #3074

bnewbold · 2024-11-22T20:47:11Z

bnewbold
Nov 22, 2024
Maintainer

One of the missing pieces in atproto is a mechanism to resolve NSIDs to Lexicon schema definitions. While there are existing workarounds, such as just publishing schemas on projects websites or git repos, we think having a consistent mechanism is important to ensure interoperability, and to make the developer experience more accessible.

This document is a true “Request for Comments”: we have considered many design options, and narrowed down to a general structure, but there are some unresolved questions and we are looking for feedback from the atproto developer community before finalizing the design. We also discuss some design alternatives we considered but decided not to go with at the bottom.

This document is oriented towards atproto hackers, and assumes familiarity with existing atproto specifications, including the Lexicon schema system, NSID syntax, handles and DIDs, and repository layouts/semantics. This earlier community blog post may also be helpful background: https://docs.smokesignal.events/blog/lexicons-as-records/

Background and Framing

At a high level, the "authority" for Lexicons is rooted in the NSID being a transformed domain name. We want a way to resolve NSIDs to Lexicon schema definitions, in a way that verifies control of the relevant domain name.

This does not mean that NSIDs will always, or even often, be resolved or validated using this mechanism.

Who will need to resolve NSIDs to Lexicons, and when? Relays should not parse or validate record schemas. Client apps, AppViews and supporting app services (like feed generators), are expected to support specific versions of known Lexicons, and not do dynamic resolution or processing. Developer tooling might do live resolution at any time, but is something of an exception. PDS instances are in a position to validate records at creation time, which is helpful for keeping data clean and interoperable. However, PDS implementations and instances could do dynamic resolution and validation on an optional or “best effort” basis: they do not need to fetch or resolve every NSID they see, and can cache schemas when they do. More about PDS validation behavior:

One area we expect live resolution might be necessary and important is the definition of auth scopes, to support OAuth. There will be a separate proposal discussing how auth scopes will work in more detail, but they are expected to be implemented as Lexicon definitions, and require more dynamic fetching and validation by PDS instances. We expect to give focused guidance around how and when to resolve auth scope declarations.

For services which do dynamic resolution and validation, we think that making requests to aggregation/index services could be a better option instead of doing live resolution. Such services can act as CDNs, archives, and mediating services to resolve disputes or security issues in Lexicons. However, we want the abstract authority for Lexicons to rest in control of domain names, and for there to be a standard mechanism to access or crawl schemas, providing a clear bootstrap path for new indexing services.

So to summarize, we don’t expect a large throughput of resolution requests in the live network. If the resolution mechanism fails temporarily, there should not be immediate breakage in the network, and there should be social/governance mechanisms for the ecosystem to mitigate permanent resolution failure or hijacking of domains.

The current Lexicon language is a JSON format, and is relatively stable in both syntax/structure and features. However, we want to enable evolution in the future, which could even include moving to a non-JSON primary representation (something close to YAML or protobuf declarations). This doesn’t mean the syntax will or is even likely to evolve away from JSON, just that we want to leave that path open.

Something which we are flexible on is whether core protocol-level Lexicons have their “authority” in the protocol specification text, or via this Lexicon resolution mechanism. For example, com.atproto.sync.subscribeRepos, which describes the firehose format. We will probably enable/facilitate lexicon resolution for these endpoints as a convenience, but loosely expect the governance/authority to stick with the definition of the overall protocol, not with control of the atproto.com domain registration specifically. This whole bundle of questions is adjacent to lexicon resolution, but a bit special, and not a big focus for this document/RFC.

Keep in mind that Lexicon resolution is only a publishing mechanism. Projects, teams, and organizations can use whatever mechanism they like for development and discussion of Lexicons. For example, they can be versioned in git, and a CI/CD deployment hook can “publish” updates to an atproto account (repo). Lexicons also do not need to be “published” (or updated) until they are ready. Software can be prototyped against new Lexicons or new features all in the live network. Publishing is only necessary when interoperability becomes important. We do expect it to be a norm to publish Lexicons once a project gets traction: skipping publication indefinitely would be a warning sign about a project.

Syntax, Terminology, Use Cases

To clarify some details and terminology around NSID syntax and domain names, consider the following NSID: edu.university.dept.lab.blogging.getBlogPost

The last segment (getBlogPost) is called a “name”. It is not part of any domain name. The syntax is restricted by the current spec, and we generally want these to be safe for use as programming language function names or symbols.

All of the NSIDs/names with the same prefix we are calling a “group”. Eg, edu.university.dept.lab.blogging.postwould be in the same “group” asgetBlogPost. Note that post` here is also a “name”, and not directly related to a domain name.

The earlier part of the NSID (edu.university.dept.lab.blogging) can be flipped around as a domain name (blogging.lab.dept.university.edu). Syntactically, this could be a domain name, though we can’t tell just by looking at it if it is actually registered: we would need to do a network request (DNS resolution) to find out. In the current NSID specification, this is collectively called the “domain authority”.

One part of this domain is independently “registered”: university.edu. It is possible to use the Public Suffix List (PSL) to look at any domain name and infer some details about which parts are “registered” and which are sub-domains. But the PSL doesn’t capture all cases (it is continuously updated, and old software might not have a current/valid copy), and sub-structure within organizations is not always captured. In our example, dept.university.edu is probably an organisational sub-unit with it’s own authority. But this isn’t possible to know confidently, or even infer in most cases, certainly not without additional network requests.

Note that there might be parts of the domain name which are not really DNS-related. We have set a pattern for this in the app.bsky.* Lexicons using additional parts like feed in app.bsky.feed.post. (or, potentially, blogging in the above example). Some of the open questions below get in to how to deal with these structuring sub-domains.

One potential use-case we will discuss below is collective hosting of Lexicons (NSIDs). This could be a “Lexicon Hosting Hub”, with NSIDs like com.lexhub.project.someEndpoint (analogous to github.io). This might (eventually) get registered in the PSL. It could also be in the form of a standards body, like org.w3c.mentions.webMention. In these cases, the project sub-domains are somewhat separate authorities from the registered domains (even if the registered domain party has some ultimate DNS control).

A final consideration is app/brand domain names, like bsky.app or megacorp.com. These might have atproto handles (eg, @bsky.app) and associated accounts/repos with security concerns and access controls distinct from purely technical security concerns like control over Lexicon resolution.

Sketch Design

We have made a few high-level design decisions.

Authority over Lexicons should primarily remain in the global DNS system, as described above and in the existing NSID specification. We don’t want to abandon this overall design.

Lexicons schemas will be published as records in atproto repositories, either directly (the record is the schema) or via reference (the record references the schema by hash). NSIDs will resolve to DIDs, which resolve to an atproto repo. The records will be in a collection like com.atproto.lexicon.schema (final name TBD), and record keys as the NSID, which allows directly fetching the record (possibly including a signed MST “proof chain”). The full schema might be in the record itself, or might be pointed to by the record (for example, the schema file might be a blob, reference by CID). Either way, the repo mechanism provides authenticity (signing), replication, change detection (via firehose), and some form of content addressing (hashes). Having schemas in records also enables enumeration of all published schemas in the entire network, and can aid with “local” enumeration (aka, discover all the NSIDs in the same group).

If the Lexicon schema is directly included in records itself, as CBOR/JSON, we will not try to have a “meta Lexicon” which validates the Lexicon language itself. This might be possible and a fun side-project, but for now it is a distraction and could make smaller evolution of the language more difficult.

Somehow, there needs be a way to take an NSID string and convert it to a single domain name, which then resolves it to a DID (which identifies the repository). It should be possible to change DIDs (repositories) over the lifetime of the NSID. We do not want the NSID-to-domain process to involve multiple/iterative network requests to “discover” which specific domain name (in the sequence of sub-domains) is active. The primary reason for this is that we don’t want the resolution process to return different DIDs depending on network errors or transient resolution differences. A secondary reasons are efficiency (eg, latency and total volume of network requests) and security (not having to mitigate “request amplification”).

As discussed in the sections above, we are enthusiastic about the concept of “Lexicon Aggregators”, which index and provide discovery, history, safety checks, mirroring, and more. These are effectively AppViews for Lexicons.

Open Questions

NSID to Authority Mapping

Given an NSID, which domain names (and sub-domains) are relevant to authority? This is mostly a question about nested hostnames:

app.toy.record is simple: only toy.app could be relevant
app.bsky.feed.post: it isn’t clear just looking at just the NSID string that bsky.app is the relevant authority, and feed is as a sub-domain is a grouping/organization mechanism.
edu.university.dept.lab.blogging.getBlogPost is a possible NSID with multiple levels of real authority/power: institution, department, lab, project.
com.lexhub.bnewbold.post is an example where com.lexhub.* might be an open-registration hosting service (similar to io.github.*). It may or may not be in the Public Suffix List.

In most of these cases there is a “natural” administrative boundary if you research the organizations and individuals involved, but it isn’t always apparent just looking at the NSID where that is.

One solution would be to add syntax to NSIDs. In early days, we could have done something like app.bsky:feed.post, where the colon (or comma, etc) separates the “domain authority” part from a “compound name” part. This would be a disruptive syntax change today. An alternative is to add underscores to individual segments, like app.bsky._feed.post. When doing codegen etc, the underscore would be stripped; it just indicates the split. This would only work for new projects. Note that app.bsky._feed.sub.inner would be distinct from app.bsky.feed._sub.inner.

A second solution we are considering is to always resolve the entire group for every schema. Eg, app.bsky.feed is always the authority for app.bsky.feed.post. Note that many authorities can resolve to the same DID and repo.

Another option is to rely on the Public Suffix List to determine the shortest independent domain, and assume that is the authority. Eg, for app.bsky.feed.post, determine (offline, using PSL) that bsky.app is the registered domain, and take that as the authority.

There are several possible mechanisms we considered around iteratively (or concurrently) checking all the sub-domains to see which resolve at all, but we decided against that overall approach (see discussion in Sketch Design).

Repo Resolution Question

Given an authoritative domain, how do we map it to a repo?

Eg, if we have NSID app.toy.record, and identity that the authoritative domain is toy.app, how do we find the DID (and thus atproto repo) associated with that domain?

One approach is to use the existing handle/identity system. Authoritative domains would be configured as handles, using the existing handle resolution mechanisms. This could be done with a separate identity/repo per “group” for now, and later we might extend the identity system to include multiple handles (”aliases”) per DID.

An alternative is to use the same DNS TXT mechanism as for handles, but with a different prefix (_lexicon not _atproto). We may or may not support an HTTPS well-known mechanism: it could provide stronger authenticity (via the TLS PKI), but could require valid TLS certificates and HTTPS servers for a potentially large number of distinct hostnames. In other words, the namespace of handles and the namespace of authoritative NSID domains would be distinct, though they could be configured to overlap (aka, could add both _lexicon and _atproto TXT records pointing to the same DID).

Some of the considerations with this are:

complexity of introducing a new DNS record prefix, eg in documentation and SDK support
whether top-level “brand” accounts (eg, Bluesky accounts) are distinct from Lexicon-hosting repos. With respects to general security and separation-of-concerns, and ease of delegating Lexicon management to a separate service or organization.
ease of adding and using NSIDs on popular / high-value domains (eg, com.megacorp.*)
ability for casual devs to define new Lexicons with their existing accounts/handles, vs needing to do DNS setup

Schema Representation Question

This one is a bit more open-ended: we have two representative directions, but there are a whole spectrum of design options.

Some of the considerations here are the degree to which a Lexicon schema is a record (vs being “wrapped” or “enclosed” in a record), and ensuring that future version of the Lexicon schema language can have arbitrary features and syntax (eg, might look more like protobuf, YAML, XML, typescript defs, etc).

In none of these options is the full Lexicon language encoded as a record schema itself (aka, fully/circularly defined).

One direction is to make the records look very similar to current schema JSON files:

{
  "$type": "com.atproto.lexicon.schema",
  // "lexicon" and "id" fields would probably be required and typed
  "lexicon": 1,
  "id": "app.bsky.actor.profile",
  // "defs" would probably be `unknown` type, and optional (to allow evolution)
  "defs": {
    "main": {
      "type": "record",
      "key": "literal:self",
      "record": {
        [...]
      }
    }
  }
}

This effectively ends up using the “lexicon” field to guide parsing behavior, which moves some validation (eg, “defs” must exist) out of Lexicon validation and in to application (or really SDK) logic.

At the opposite end of the spectrum, the entire schema definition could be encoded a blob, which only gets referenced by the record. The schema would have some MimeType. Using a blob here would result in a file hash (CID) that can be used to support Lexicon integrity via Lexicon aggregator services. One tradeoff is that it requires a network roundtrip to fetch the Lexicon contents.

{
  "$type": "com.atproto.lexicon.schema",
  "id": "app.bsky.feed.post",
  "schema": {
    "$type": "blob",
    "ref": { "$link": "bafycid..." }
    "size": 1495,
    "mimeType": "application/json"
  }
}

A related option would be embedding the file as string or bytes in the record.

A middle path would be to use the “open union” pattern that Lexicons provide:

{
  "$type": "com.atproto.lexicon.schema",
  // top-level open union
  "schema": {
    "$type": "com.atproto.lexicon.schema#V1",
    // the "V1" object could be just a single field ("schema") with type "unkonwn"
    "schema": {
      // this is the raw existing Lexicon schema JSON, with no types
		  "lexicon": 1,
		  "id": "app.bsky.actor.profile",
		  "defs": {
		    "main": {
		      "type": "record",
		      "key": "literal:self",
		      "record": {
		        [...]
		      }
		    }
		  }
    }
  }
}

The aesthetics of having three distinct NSIDs in the later example is off putting, and it feels like a fair amount of ceremony and boilerplate for such a core protocol feature.

Alternatives Considered

HTTP well-known Endpoints

For example, could have a fixed well-known endpoint like /.well-known/atproto-lexicons which would return a JSON object, where each key is an NSID, and the value is the schema. Or we could have endpoints like /.well-known/atproto-lexicon/com.atproto.aaa.bbb.ccc.ddd.apiEndpoint which would return just that single Lexicon. Or some combination of multiple endpoints and URL structures.

Any JSON format can be returned from such an endpoint, with very flexible size constraints. HTTP content negotiation could be used in the future to allow multiple response types, for example different schema language versions. Content negotiation could make basic hosting harder though.

Advantages:

simple to describe to regular devs: doesn't require understanding all of atproto, implementation is very simple (just HTTPS request)
keeps Lexicon a relatively modular component of atproto: could use Lexicon schemas without the rest of atproto (aka, without repos, DIDs, handles)
few network requests/round-trips in most cases when resolving Lexicons
authority is directly tied to domain name control, with few intermediate abstractions

Disadvantages:

potential IT friction for large orgs/brands hosting idiomatic NSIDs. Eg, imagine com.megacorp.blog as an NSID: needs to have a file hosted on some specific path.
no authentication, redistribution, or change detection (aka, no firehose)

DID Service Entry

Instead of storing Lexicons in repositories, we could resolve NSIDs to a DID (using one of the discussed options), then look for a “LexiconHostingService” service entry in the DID document. Then connect to that service, and use defined Lexicon endpoints (in the com.atproto.lexicon.* namespace) to query and enumerate endpoints. The DID would not need a “full” atproto account or repo, just a valid DID doc with this one entry.

Hosting services could be shared by many groups and projects.

A related idea would be transforming NSIDs to a domain, resolving via did:web , and then finding a service entry in that DID document.

Some downsides of this approach are introducing a new network role (Lexicon host) to the core protocol; not having a mechanism for change detection (aka, no firehose); and not getting authentication (signatures).

bleonard252 · 2024-11-22T21:29:35Z

bleonard252
Nov 22, 2024

For a little background on my thoughts, I'm currently considering "XRPC" to be the API pattern ATProto uses to define endpoints; that is, Lexicon-defined, NSID-scoped HTTP(S) endpoints under /xrpc/. (Whether or not this is the official stance.) ATProto then is built on top of XRPC. It could be useful to support non-ATProto use cases of the XRPC pattern; one place I've thought a similar mechanism should be used is in the Mastodon API, since extensions on top of it and other replacement APIs can become incompatible with unspecific scoping.

The way I've thought about doing lexicon discovery is by storing them as records in an ATProto repository, and indexing and searching them with whatever mechanisms we have (relays?). This would be fine for ATProto-specific use cases of the Lexicons system, like records, but what if a non-ATP XRPC implementation wanted to resolve NSIDs? In that case, they would probably have to use a known, centralized API (XRPC at directory.lexicons endpoints, at lexicons.directory by default, perhaps) or implement at least some version of the AT Protocol.

If you used relay search or indexed from a filtered firehose, you wouldn't need to determine the domain name. You'd just want to make sure that it has a handle (even if it's not the "default" one) which validates as part of the NSID "prefix": so, @university.edu and @dept.university.edu would be allowed to publish records for NSID edu.university.dept.lab.blogging.getBlogPost, but @blogging.lab would not. (And the last segment would probably not count for this check.) A chokepoint service, like a hypothetical lexicons.directory, that lets clients more easily look up lexicons (ideally, PDSes wouldn't use this, but development tools might) would not save/update lexicons that don't pass that check.

I've also thought one use for resolving NSIDs would be to figure out where an unknown XRPC endpoint should be proxied to, or what AppView or client is known to understand a certain record type (i.e. in an "embedded record" in a Bluesky post). This would take adding an optional field to the lexicon, probably for a domain name or base URL.

1 reply

bnewbold Nov 25, 2024
Maintainer Author

We are interested in supporting non-atproto uses of Lexicons, to some degree. This was a consideration in choosing whether to repositories/records for storage. We think the benefits are worth relying on some other atproto concepts, and "lexicon aggregators" will make it possible to bridge out of non-atproto software.

We want to keep Lexicon definitions separate from service providers. Having optionality and "credible exit" from service providers is one of the major design goals of atproto. Blessing specific service providers at a machine-readable (even if only a "default" or "recommended") feels like going a bit too far in that direction. The domain name of an NSID aligning with a specific provider is already a pretty heavy signal / indicator.

mary-ext · 2024-11-22T22:13:17Z

mary-ext
Nov 22, 2024

The one consistent problem I've seen with putting lexicon schemas in records is ease of maintainability and sovereignty.

Both of these property seem important to me or anyone that wants to build a "toy" or any small project that makes full use of the network's capabilities.

With a well-known endpoint, the schemas can be served alongside the reference AppView and it'll stay up to date for the duration of its existence. It also doesn't require setting up a PDS just for said schemas to be owned independently.

4 replies

snarfed Nov 24, 2024

If you're a lexicon publisher, you don't necessarily need to run your own PDS, right? You do have to own the domain or subdomain for the lexicons' NSIDs, but as long as you do that, you control the resolution of a given lexicon NSID to a repo DID. You can then choose to self-host that DID's repo in your own PDS, or in anyone else's, and you can still have some assurance of control over that repo due to ATProto itself, account portability and rotation keys etc, right?

bnewbold Nov 25, 2024
Maintainer Author

You don't need to run a PDS, but some account/repo is necessary. If you want full self-hosted "sovereignty", then PDS hosting would potentially be in the mix.

The low-stakes starting pattern we imagine is devs using their own personal accounts for Lexicon hosting. If the project takes off and they don't like conflating their personal account with an overall project, they can update the resolution to a different DID (and repo).

Having the Lexicons tied to a project domain has some upsides, but really ties the lifetime of the schemas to the lifetime of the project domain. Keeping some DNS records resolving is fairly low-stakes, and atproto repos generally persist in the network by default. Keeping well-known files up and working isn't necessarily that much more work, but is less likely to passively keep going without maintenance (in my opinion/experience). In any case, the lexicon indexing/aggregation services can help.

bnewbold Nov 25, 2024
Maintainer Author

I take @mary-ext's option pretty seriously, so I do want to hear more about PDS independence concerns, which were mentioned in a parallel discussion thread as well.

mary-ext Nov 25, 2024

for what it's worth I'm also defining a really specific kind of "small projects" here, I just thought that it seemed worth mentioning because we definitely want this to be very approachable for anyone starting out.

tom-sherman · 2024-11-22T22:41:53Z

tom-sherman
Nov 22, 2024

whether top-level “brand” accounts (eg, Bluesky accounts) are distinct from Lexicon-hosting repos. With respects to general security and separation-of-concerns, and ease of delegating Lexicon management to a separate service or organization.

With oauth, is this really a concern? Such services would be restricted to only perform lexicon updates and nothing else.

2 replies

bnewbold Nov 25, 2024
Maintainer Author

OAuth helps protect relationships with individual authorized clients/services. But overall account security is still a "thing" and always will be. Eg, who is approving which OAuth grants.

matthieusieben Nov 25, 2024
Collaborator

OAuth scopes allow a user ("account") to give a reduced access to their resources to a particular client ("app").

OAuth scopes won't help much when it comes to delegating permissions to access an account's resources from another account.

I guess that giving delegated access to particular records in a PDS is something the PDS could implement (even off-spec). But that's kind of off topic.

tom-sherman · 2024-11-22T22:44:45Z

tom-sherman
Nov 22, 2024

I don't understand the requirement or need to move away from JSON as the lexicon definition language, why do you want to leave that option open?

I know JSON isn't an ideal authoring format, but I'd prefer for innovation of this to happen in userspace and have any DSLs compile to JSON lexicon defs.

5 replies

bnewbold Nov 25, 2024
Maintainer Author

There is a concrete thing which is that Paul has wanted to have a more DSL syntax since the early days (2022?). He isn't super involved in the design of this lex resolution mechanism, but we want to keep that path open for him.

More broadly, we think editing and reading JSON is a bit of a pain. We definitely use shorthand internally when designing Lexicons. It will probably always be possible to have some "intermediate representation" as JSON. But I believe it is good to use and share the actual format used for editing. This principle is sometimes encoded in open licensing, where you want people to share the "format used for development" (eg, open hardware licenses, some FLOSS licenses). An exception would be if there is a true one-to-one mapping between representations.

More broadly, it just shouldn't be too hard to keep this door open!

bleonard252 Nov 25, 2024

In that case, having the "original" form saved alongside the term, as equals, might be preferable. So, assuming it's a record, you'd want to wrap the whole "binding" Lexicon into one field in the record, and have the "original," optional, in another field beside it. You could probably put extra metadata around it, like what format the "original" is in, what version the Lexicon is, maybe information for display like a title, description, avatar, etc.

Maybe this model helps explain it:

{
  "$type": "com.atproto.lexicon.schema",
  "lexicon": {
    "lexicon": 1,
    "id": "app.bsky.actor.profile",
    "defs": {
      "main": {
        "type": "record",
        "key": "literal:self",
        "record": {
          [...]
        }
      }
    }
  },
  "originalFormat": "application/vnd.bluesky.lexicon_dsl",
  "original": "..." // this could be a blob ref
}

I expect that, if you went with publishing it this way, the network cost for the "original" field as a blobref would be acceptable, since it would generally only be useful for human-initiated actions like viewing or editing that info.

nklisch Nov 25, 2024

The amount of tooling for JSON and it's maturity in every language means that it's a fairly reasonable 'base language' for the schemas.

Other DLSs can just complie to JSON, I am not sure the value of encoding a bespoke format makes much sense.
I could see it be a valid alternative, if tooling exists to translate from the DSL to JSON. But moving off JSON as the base would mean that choices could be made in the future DSL that JSON doesn't easy complie to. When this happens, all the tooling built on JSON would run into issues.

A primary or ever permanent translation between the two seems like the right call imo.

bleonard252 Nov 29, 2024

@nklisch, the purpose of also sharing the original format is for collaboration and transparency. I expect any Lexicon DSL will be compiled to JSON to actually be consumed.

nklisch Nov 29, 2024

Ah k, that makes more sense. I mostly was concerned that the base level schema object uses for validations and the like, was planned to move off of JSON. That seemed like a mistake to me.

If it's at a higher layer and gets compiled to JSON and is just reference for human readability, that makes sense.

DavidBuchanan314 · 2024-11-22T23:24:31Z

DavidBuchanan314
Nov 22, 2024

I think that ultimately, consensus about validity/semantics of lexicons happens between application developers, on a social level. Lexicon just helps us write those decisions down.

If a critical mass of clients/PDSes/appviews decide that the app.bsky.feed.post lexicon now has a text length limit of 500 graphemes, it does! - and it doesn't matter what the owners of the bsky.app DNS name think :P

I think the reverse-DNS notation should serve as a guide for authority. It should make it hard for us to accidentally step on each others toes, but I don't think it needs to be directly machine-resolvable with hard rules (so we can stop worrying about how to parse out the authority section from an NSID, for example).

In the example of edu.university.dept.lab.blogging.getBlogPost, what happens if the project moves to a new department? Or the primary developer leaves and wishes to continue the project independently? If we had to worry about keeping DNS records updated, or .well-known document paths in place, I think it'd be a bit of a mess (even if everyone is collaborating amicably). If the lexicon was popular "in the wild", changing the NSID would be impractical.

I was intrigued to learn about the "Definitely Typed" project, part of the typescript ecosystem: https://github.com/DefinitelyTyped/DefinitelyTyped. It's a giant monorepo that collects community-sourced type annotation information for JavaScript libraries that don't ship with their own definitions. This article gives a bit more background on why it exists, and potential alternative approaches https://johnnyreilly.com/symbiotic-definitely-typed - I think there's some overlap with the "lexicon resolution problem".

I'd be in favour of a similar git monorepo approach for lexicon resolution. A given monorepo would need to be managed centrally, but specific namespaces could be delegated to other git repos. If there's a really big controversial consensus-divergence, the repo can be forked - and then you can start convincing other developers to start using your forked definitions.

I do realise I'm almost reinventing DNS here (wrt. delegating control of sub-namespaces), but "fork the DNS root namespace" is completely non-viable¹, whereas maintaining a forked git repo is (IMHO) just the right amount of friction.

I'm not necessarily attached to the idea of using git specifically, I just think the tooling that already surrounds it could be a good starting point, and it would fit neatly into existing developer workflows. A big repo collecting known lexicon defs would be generally useful to have, even if it's not part of the "lexicon resolution" process proper.

The big question with this approach would be, who is going to volunteer to maintain the initial "root namespace"?

someone should tell HNS ↩

5 replies

tom-sherman Nov 23, 2024

I personally wouldn't point to DefinitelyTyped as a model to copy for something as greenfield as lexicon resolutions. @types served the purpose of bootstrapping adoption of typescript by providing third party types to the several hundred thousand JS packages that existed at the time.

It has a bunch of drawbacks that I won't go into here, but safe to say that a package having @types is a smell at this point in the TS ecosystem.

I think though the system you're describing works alongside the protocol-level lexicon publishing described in this RFC. It's a layer above. A public NSID prefix could exist that runs off of a central GitHub repo or it could be something that looks like a package manager.

futurGH Nov 23, 2024

Yeah, my understanding is that a ridiculous number of person-hours go into maintaining DT — my immediate reaction is that something like that would to some degree stifle iteration, and a scenario where two different "lexicon authority" repos contain conflicting versions of a lexicon sounds like a nightmare

DavidBuchanan314 Nov 23, 2024

My vague thought is that you can reduce the central workload with more delegation. Less of a monorepo, and more of an index that points to other repos. But either way, yes, the major downside of this sort of "managed" approach is that it will require more maintenance effort

snarfed Nov 24, 2024

I think that ultimately, consensus about validity/semantics of lexicons happens between application developers, on a social level. Lexicon just helps us write those decisions down.

If a critical mass of clients/PDSes/appviews decide that the app.bsky.feed.post lexicon now has a text length limit of 500 graphemes, it does! - and it doesn't matter what the owners of the bsky.app DNS name think :P

This is a bold interpretation of lexicon/app ownership! It's clearly the open source ethos, community and consensus and forking, etc. I raised a similar question in #2885 (reply in thread):

I was maybe thinking more about whether third parties could or should build appviews for lexicons/apps they don't own, beyond merely running independent instances of the reference implementation. Any nontrivial app will have nontrivial product semantics and behavior that are built into their appview. Would third party appviews be expected to try to match those semantics? Or is part of appview adversarial interop that they could offer alternative semantics? Either way could be reasonable, or maybe other ways entirely, but the answer obviously impacts what it means to "own" an app or set of lexicons, and what the ecosystem norms and expectations are.

I'm not sure I fully agree with the aggressive community ownership stance, but I definitely do get it. I guess I'd like to hear more of a clear opinion from the Bluesky team themselves on how they envision ownership and semantics for a given app/lexicon evolving over time. If foo.org wants to build an independent appview implementation for bar.com's app, I get that foo.org can do whatever they want, but I'm curious what the team thinks they should do, ie what the rights and responsibilities are. Should foo.org strive to be exactly compatible? Actually do whatever they want? Only diverge moderately, in controlled ways that they communicate explicitly to their users? Something else entirely?

bnewbold Nov 25, 2024
Maintainer Author

I think that ultimately, consensus about validity/semantics of lexicons happens between application developers, on a social level.

We are pretty sympathetic to this viewpoint! This is the realpolitik, and we have tried to indicate that.

What we don't want is a norm of total free-for-all, with folks using mil.army.* for the lulz, or endless forking and re-litigation of intentional design decisions like post character counts. We think creating new NSIDs is the best "exit" in almost all cases, and that collective vetos/overrides against the party "owning" the namespace should be an uncommon exception. Perhaps even never used! But the rip-cord is there if needed.

Having everything bottlenecked on a global git repo (or small collection of them) feels like a big bummer and headache to me. I know this is a common pattern (Homebrew, crates.io, Julia, etc), but it feels worst of both roles: either there is meaningful review (which is valuable, like Debian, but also slow, like Debian), or there is no review (like npm) and there is not much advantage over an indexing service.

I think the escape hatch for folks that do want something more reviewed/collective is to build such a think on top of (or underneath?) the proposed Lexicon resolution mechanism. Folks can totally run an org.lexhub.* namespace out of a git repo, and have CI hooks validate and update all schemas in to an atproto repo. Conversely, a bot can subscribe to the firehose, identify new schemas, and submit them as automatic PRs to a repo, which humans can review (or override).

futurGH · 2024-11-23T00:12:22Z

futurGH
Nov 23, 2024

I've always preferred the well-known route as the least bad option. In a sense Lexicon feels "lower level" than PDSes/repos/etc. Maintaining a dedicated repo feels like a significantly greater lift than hosting a file. I don't think authentication or change detection is a huge concern — on a social level, "this record was validated by services that I know do that, but it's considered invalid according to the current lexicon file" has a pretty obvious answer.

5 replies

bleonard252 Nov 23, 2024

Maintaining a dedicated repo

I don't think it would have to be "dedicated." It would either be on one's personal ATProto repository, or on an associated project account. It might be useful to be able to move lexicon definition records between repos, though... maybe just a re-upload, delete, and validate would be enough? (Unless you meant a Git repo, in which case forget I said any of this.)

mary-ext Nov 23, 2024

If you want your schemas to be hosted independently, you do have to run a PDS, regardless of whether you personally would use it or not.

It seems like a big blocker for small projects wanting independence to benefit from lexicon validations.

bnewbold Nov 25, 2024
Maintainer Author

@futurGH: as a counterpoint, I think for some folks it is actually a fair amount of friction to deploy /.well-known/ endpoints, especially if there are a bunch of them. From single-purpose domains with a single codebase, it is pretty easy.

But if you have a real website running, it can involve a bunch of reverse-proxy config, or integration in to unrelated code. It might also require setting up a bunch of sub-domains and TLS certs. For example, I run robocracy.org on a sever with a bunch of other stuff going on. If I wanted to run org.robocracy.blog.blah, i'd probably need to set up new Let's Encrypt certs and a few nginx configs. And then update on that specific server every time I make a change. For the app.bsky.* namespace, it would be a big pain to integrate with our web app deploys, so we'd probably need to figure out some load-balancer work-around with a separate hosting project. This is all doable, but it sure isn't as convenient as just creating a new account and pointing some DNS records at it.

bleonard252 Nov 25, 2024

Concurring with bnewbold here: it might be a lot more difficult to host on well-known if, say, you're using hosted WordPress for the domain you want to publish to. Plus, depending on how the NSID-to-authority resolution works, you might not be able to use a subdomain for it -- meaning you'd need to fork over a lot of extra money for, at least, a brand new domain for every toy project; and although many of us do that anyway, that kind of financial irresponsibility is probably not something we want to encourage :)

Using ATProto repos does circumvent this. It does make things a little more difficult; however, I think the concerns about PDS hosting are probably unimportant: who on earth is developing on ATProto without already having an account? Like @bnewbold said upthread:

The low-stakes starting pattern we imagine is devs using their own personal accounts for Lexicon hosting.

snarfed Nov 25, 2024

Concurring with bnewbold here: it might be a lot more difficult to host on well-known

...or if you're at eg a 100k person tech company, and you're a small team somewhere in the depths of that big org doing a small prototype or new project.

johnspurlock · 2024-11-23T00:15:21Z

johnspurlock
Nov 23, 2024

+1 to .well-known or even dns txt, since the naming convention is reverse-dns based

3 replies

tom-sherman Nov 23, 2024

The issues around managing certificates for subdomains puts me off well-known methods.

The coolest part of this proposal are public lexicon hosts that contain many lexicons that are managed by different repos. This becomes much harder to do without a DNS method I think.

orthanc Nov 23, 2024

The firehose aspect and allowing for them to be aggregated does seem to create some very interesting possibilities. E.g. a central lookups and indexes that people can use to find and resolve lexicons and possibly even format conversions.

I can imagine a service similar to the various Javascript CDNs that pull from npm that allows people to request a lexicon as JSON schema, as Protobuf, as XSD etc as well as various discovery services built on the at protocol sync.

bleonard252 Nov 23, 2024

or even dns txt

The coolest part of this proposal are public lexicon hosts that contain many lexicons that are managed by different repos. This becomes much harder to do without a DNS method I think.

I think you're on to something with this! If nothing else, you could try to do a DNS lookup for, i.e. _atproto-lexicon as a shortcut, reference, validation, discovery, etc. to the AT URI of the definition record, or the URL (or base URI, or whatever) of the lexicon schema file. You could still do the crawling/firehose approach to discover them in advance of needing them, too, i.e. for a documentation site.

edited to add: I commented about how potential non-ATProto Lexicon users might need to "implement at least some version of the AT Protocol" to be able to look them up, but having a shortcut in DNS would make this a lot easier, without needing to centralize: they look up that DNS record, and if it's a correct/valid at:// URI, that can be resolved into a request to <pds>/xrpc/com.atproto.repo.getRecord, and voilá, now you have the Lexicon! Also, you wouldn't need to validate based on a handle: you can validate based on whether the DNS record references that ATProto record. Plus like @tom-sherman said, it would make Lexicon definitions repo-independent.

orthanc · 2024-11-23T01:49:47Z

orthanc
Nov 23, 2024

I feel like there's an abuse vector here I've not seen addressed. We've seen with NPM a number of situations where the owners of popular libraries revoke them or publish deliberately breaking or even malicious changes as a form of protest. These have generally been resolved b y NPM taking over control of the library and restoring it at a good known state while people fork to create a path forward.

In the context of lexicon this feels even more problematic because it would not be possible to change the NSID in existing records to allow mapping new data over to a fork. There also isn't AFAIK an reference to the version of the lexicon that a record should be validated against.

So it feels like there is possible issue if there are elements of the system that automatically verify against resolved lexicons and the author of a moderately popular lexicon deliberately replaced it with an incompatible version. Without some form of central authority around lexicons this would seem to then require updates to every element of the network that was doing automated validation to resolve the issue.

Maybe this is just about being very clear about the places where it's appropriate to validate. But even if it was a PDS doing verification on new records only, this has the potential to break the ability to migrate to a new PDS.

It feels like the only answer to a deliberate and unwanted breaking change like this would be for every consumer of the lexicon lookup to have a local override for that particular lexicon.

Alternatively, maybe source of truth sits with DNS as is proposed, but most consumers are not resolving directly but rather resolving through aggregators where the aggregator has the power to override the lexicon in these cases.

5 replies

bleonard252 Nov 23, 2024

One way to reduce the impact of this is by validating that changed lexicons don't have breaking changes. I don't remember where, but the ATProto docs said somewhere that if a breaking change is required, it is to be done in an entirely new lexicon. This doesn't address what happens on first lookups, though. Plus it would require strictly defining what is and is not a "breaking change" to be validated against.

snarfed Nov 24, 2024

Sounds like you're remembering https://atproto.com/specs/lexicon#lexicon-evolution :

Lexicons are allowed to change over time, within some bounds to ensure both forwards and backwards compatibility. The basic principle is that all old data must still be valid under the updated Lexicon, and new data must be valid under the old Lexicon.

Any new fields must be optional

Non-optional fields can not be removed. A best practice is to retain all fields in the Lexicon and mark them as deprecated if they are no longer used.

Types can not change

Fields can not be renamed

If larger breaking changes are necessary, a new Lexicon name must be used.

I think this largely addresses the compatibility concern, but also that concern seems somewhat orthogonal to this RFC. It existed before lexicon publishing and resolution, and it will probably exist after. We can address it, but we don't have to do so inside the lexicon resolution mechanism.

orthanc Nov 24, 2024

but also that concern seems somewhat orthogonal to this RFC

The reason I'm raising it here is because introducing an automated resolution mechanism makes this an operational issue rather than just a policy issue.

As I understand it, that statement on lexicon evolution is a guideline or policy. There is not technical enforcement in the system that lexicons are evolved in a backwards compatible way. Which is fine at the moment, should someone violate that for a widely used lexicon people would not adopt the changes and the conflicting drivers would be resolved by discussion at the human level.

By introducing an automated lexicon resolution mechanism that distributes control over the lexicon and expecting it to be used for validation of records into PDS and similar we're introducing a situation where breaking changes can be introduced into a lexicon at runtime potentially creating disruption to the network as existing clients etc would be unable to submit previously valid records, migrations between PDS might be blocked etc.

In the current state of affairs these would only happen when a new version of the PDS is deployed after pulling in a lexicon change so there is an obvious rollback path if issues were discovered.

Where as in the case of a deliberate disruptive breakage with dynamic resolution we potentially have an issue introduced to a lot of independently developed and operated elements of the atmosphere where the only path forward is to release and deploy a change to override or ignore that particular NSID.

So it is a bit orthogonal to the specific mechanism of dynamic resolution but it is an issue being introduced by dynamic resolution.

The bit that does relate quite specifically to the questions on this RFC is that the mitigation to this issue taken by most registries is that there is an ability for a central body to override such disruptive actions. While it is always contravention when done it happens.

The resolution mechanism proposed here does not allow for that.

snarfed Nov 24, 2024

By introducing an automated lexicon resolution mechanism that distributes control over the lexicon and expecting it to be used for validation of records into PDS and similar

This is the key point. Distributed control over lexicons and record validation aren't new with this proposal; they've been there since the beginning. PDSes and appviews have always validated records against lexicons, and ATProto app developers have always authored and published their lexicons independently.

Having said that, it sounds like @bnewbold agrees with you on the operational risks, and tried to mitigate them here. From near the top:

Who will need to resolve NSIDs to Lexicons, and when? Relays should not parse or validate record schemas. Client apps, AppViews and supporting app services (like feed generators), are expected to support specific versions of known Lexicons, and not do dynamic resolution or processing. Developer tooling might do live resolution at any time, but is something of an exception. PDS instances are in a position to validate records at creation time, which is helpful for keeping data clean and interoperable. However, PDS implementations and instances could do dynamic resolution and validation on an optional or “best effort” basis: they do not need to fetch or resolve every NSID they see, and can cache schemas when they do. More about PDS validation behavior:

bluesky-social/atproto-website#353
https://bsky.app/profile/bnewbold.net/post/3l6fyxsw7za2o

bnewbold Nov 25, 2024
Maintainer Author

This is an excellent question to raise, and the mitigations are exactly what others have mentioned: primarily using intermediate indexing services, which can compare new Lexicons against historical versions. These can do automated validation for most situations, and flag "breaking" changes for human review and approval (for example).

What we are not particularly comfortable is a true "global authority" which can uniquely make these decisions, controlling/owning the namespace. We would much rather have a system which allows multiple indexing/validating services to emerge. Even if one index ends up being dominant, the mechanism would make "exit" to a new provider much more credible than a system like npm.

An example in this design space is the Go package system, pkg.go.dev (which is not without it's own centralization controversies!).

chapel · 2024-11-23T12:10:09Z

chapel
Nov 23, 2024

When I was thinking about this issue while learning all the ins and outs (still a lot of ins and outs I don't know), I came to the conclusion that using two aspects that have been proposed and brought up cover at least the logical overhead of hosting and resolving the lexicon schemas.

The first is to resolve to the base DNS record, app.bsky.* maps to bsky.app regardless of how long and how many namespaces there are on the Lexicon. This solves lookup because you need to own the root domain and you only need one cert.

The second is using .well-known/atproto-lexicon (could be named something else) with a file based approach, meaning that each lexicon schema would have a file named after the full lexicon name/id. This makes resolution consistent and easy to reason about and ultimately easier to maintain.

I'm not sure large orgs would like other approaches any more than the above but the above would map well to all levels of developer and not need as much setup or investment.

As said this doesn't solve anything with versioning. I do wonder if there is an aspect of also having a published atproto record or something. Just thinking about this part though brings up so many questions I am not sure I could answer myself.

1 reply

matthieusieben Nov 25, 2024
Collaborator

All domain names in the UK end with .co.uk. In order to avoid that every uk.co.* NSID resolves to co.uk (which is not a valid domain name), the "Public suffix list" must be used. This is an issue as stated by @bnewbold in the original post.

bb010g · 2024-11-24T10:11:27Z

bb010g
Nov 24, 2024

It is possible to use the Public Suffix List (PSL) to look at any domain name and infer some details about which parts are “registered” and which are sub-domains. But the PSL doesn’t capture all cases (it is continuously updated, and old software might not have a current/valid copy), and sub-structure within organizations is not always captured.

With regards to the PSL, those fetching lexicons from the Internet can also go and fetch any updates to the PSL beforehand. Your concerns about sub-structure within organizations stand. I think it's worth noting that those concerns also apply to web development, where organizations would not receive proper cross-origin protections in user agents.

+1 for formalizing sanity checking against the PSL.

0 replies

ngerakines · 2024-11-25T04:06:28Z

ngerakines
Nov 25, 2024

I want to start with the requirements that I'm trying to solve. Not all of these are a part of this discussion, but I'm including them for transparency.

I want a way to distribute the Smoke Signal lexicon so other applications and systems can resolve references to lexicon types to schemas.
I want a way to prove that a given Smoke Signal lexicon record is valid and authentic.
I want to incorporate access control in a way that prevents bad actors from causing damage to Smoke Signal records.
I want people to have a "credible exit" away from Smoke Signal.
I want to support changes to the Smoke Signal lexicon either through record versioning or lexicon schema controls
I'm interested in introducing governance and attestation to the Smoke Signal lexicon

At a high level, the "authority" for Lexicons is rooted in the NSID being a transformed domain name.

This is the core of the problem as domains are immutable and fixed things in the ATProtocol ecosystem. DNS dramatically lowers the bar and makes several key issues disappear. Still, it also seems contrary to what did-method-plc achieves and a peculiar compromise given all of the work that has gone into identity and data portability in the ATMosphere.

This does not mean that NSIDs will always, or even often, be resolved or validated using this mechanism.

I think any internet engineer who has released production specs and resources understands the inherent permanence that the release process implies. This document exists because there is an existing thing in the wild, and we're trying to work through some gaps.

...
However, we want the abstract authority for Lexicons to rest in control of domain names, and for there to be a standard mechanism to access or crawl schemas, providing a clear bootstrap path for new indexing services.
...
However, PDS implementations and instances could do dynamic resolution and validation on an optional or “best effort” basis: they do not need to fetch or resolve every NSID they see, and can cache schemas when they do.
...
We do expect it to be a norm to publish Lexicons once a project gets traction: skipping publication indefinitely would be a warning sign about a project.
...

This makes it challenging to satisfy the requirements of supporting a "credible exit". This hard rule makes conflict unavoidable. Staying fixed with the DNS model makes these very real and common things way more problematic.

If I ever lose control of the "smokesignal.events" domain, a bad actor could immediately cause introduce breaking changes that could cause validation and verification of records to fail.
- Domain registration lapse -- Oops
- Domain seizure -- Section 230 poofs and court orders seizure
- Security incident -- DNSimple or my account is hacked
- Asset Transfer -- $1,000,000,000 and I get a cat and a comfy chair and I'm all of a sudden not super worried about what the new owner does
Domain Changing / Rebranding -- Hypothetically, if "events.smokesignal.calendar.*" is ever widely adopted and absorbed into a larger standards body, it can never reflect that in the NSID. That also means I can never rebrand Smoke Signal to anything else or use a different domain for any reason.
Domain rebalancing -- "edu.gerakines.students" to "edu.gerakines.student-life.profiles"
Squatting and impersonation -- A bad actor with a lexicon based on a domain they don't own is no different than a good actor not publishing their lexicon schema. I could start publishing a lexicon at "com.facebook.events" today knowing full well that Meta / Facebook has no intention to publish a lexicon any time soon.

Counter proposal: did-method-lex

Consider an alternative where each lexicon type and method was prefixed with a decentralized identifier that is based on a cryptographic hash of it's genesis record. It sounds familiar because you've already solved this problem and it would check all of the boxes:

The identifier can leverage suffixes for types and methods accordingly. Exmaple did:lex:ikkymg724jpihrelrmxllicb:calendars.event
The DID document can have management keys (rotation keys) for making changes to the document.
The DID document doesn't need to include the full schema but can instead point to one or more HTTPS sources with whatever schema data (e.g., JSON files) and documentation (websites or repositories) are relevant.
Governance is possible because of the paved-path of rotation keys.
You don't need a PDS and supports XRPC without anything that is actually atprotocol specific

With a formal "meta" schema in the form of a DID document, you can also explore interesting topics like lexicon composition and mixins.

If you really want to create human friendly short-cuts, you can do so using the same and existing resolution methods but with minor and distinct changes: _lexicon.smokesignal.events and https://smokesignal.events/.well-known/lexicon-did

2 replies

bnewbold Nov 25, 2024
Maintainer Author

Would this be an extension of the NSID syntax? What would go in AT-URIs and repo collection path segments?

The proposal (and some conversation threads above) touch a bit on how social consensus is an ultimate hedge against catastrophic loss of domain control, and that Lexicon indexing/aggregation services can be a pragmatic filter and harm-reduction point. Do you think those mechanisms are insufficient?

ngerakines Nov 25, 2024

Would this be an extension of the NSID syntax? What would go in AT-URIs and repo collection path segments?

Anything that consists of a "path" beyond the DID. With did-method-web, there is a "default" path, but that doesn't really exist in this case, so some type or method reference is necessary.

I think what really resonates with me is the concept of authority and where that rests. For a user in the atmosphere, authority isn’t their handle, but with the DID. The DID doc tells systems where their repository can be found, who can make change their account (with management keys), and who they are also known as. Their handle is just an alias (the “also” in alsoKnownAs). Handles come and go and it’s even “easy” to change them.

I think lexicon identifiers need the same strong assurances. Having a bidirectional alsoKnownAs (representing consent between DID document and DNS managers). The DNS aspect of NSIDs as part of lexicon resolution, to me, is a feature that is ephemeral because DNS isn’t permanent.

Authority over Lexicons should primarily remain in the global DNS system, as described above and in the existing NSID specification. We don’t want to abandon this overall design.

Agreed and FWIW I don’t think it should be abandoned. I just want to pivot the emphasis of authority from DNS to a DID or record.

The proposal (and some conversation threads above) touch a bit on how social consensus is an ultimate hedge against catastrophic loss of domain control, and that Lexicon indexing/aggregation services can be a pragmatic filter and harm-reduction point. Do you think those mechanisms are insufficient?

Ultimately, I agree that a strong community and good communication is great safeguard for some of these worst-case-scenarios.

imax9000 · 2024-11-25T06:28:34Z

imax9000
Nov 25, 2024

My 2 cents: resolution should be performed at build time. Runtime resolution shouldn't be a thing, or at least not commonly relied upon.

4 replies

ngerakines Nov 25, 2024

That really makes it tricky for a PDS to encounter a new record type and then be able to validate record data against the schema. Some amount of support for runtime interaction is needed.

imax9000 Nov 25, 2024

Which is why I think that PDS shouldn't be in the business of validating application-specific records.

Having the way data is interpreted change significantly at runtime is a substantial increase in the possibility of critical bugs. Without a lot of work dedicated to securing this part and making it as robust as possible - it would be very close to giving the attacker the ability to call eval() inside your code.

Thankfully, lexicon is not Turing-complete (yet), so at least the normal processing should have a bounded cost. When it does become Turing-complete - then it would be trivial to inject infinite loop into anything that does runtime resolution.

imax9000 Nov 25, 2024

Let me put it differently: I'd be extremely uncomfortable running and/or storing important data into anything that does schema resolution at runtime. If at all possible without breaking compatibility - I'd be ripping out such functionality before using the code.

bleonard252 Nov 25, 2024

For the lexicons a given software handles (you call this application-specific), they should have that built in at build-time, 100%. For PDSes, that would be the /xrpc/com.atproto.* endpoints. For the Bluesky AppView, it should have bundled, unchangeable lexicons for /xrpc/app.bsky.* and app.bsky.* record types. A given piece of software should not attempt to do dynamic resolution of its own domain -- the "authoritative" software knows best.

I agree that validating records (or XRPC things that aren't com.atproto.*) probably shouldn't be the purview of the PDS; that should probably be at the appview level, since I don't know of any repo records that are ATProto-wide, which would elicit PDS validation.

ngerakines · 2024-11-25T13:41:40Z

ngerakines
Nov 25, 2024

After reading through everyone's comments and spending additional time processing the RFC content, we're missing something important in the discussion: repository-like grouping that spans lexicons.

Lexicons support packaging/namespacing of types and methods (events.smokesignal.calendar.event with implied "#main", events.smokesignal.calendar.event#scheduled, events.smokesignal.calendar.location#place, etc.), and their use is often spanning multiple types within and sometimes across namespaces. For example, a Smoke Signal RSVP record references package-local events.smokesignal.calendar.rsvp#interested and the system type com.atproto.repo.strongRef.

When we discuss resolving lexicon references to schemas, I think we should also consider how multiple lexicons are resolved in that context.

This leads me to think about the Golang ecosystem, and its support for interacting with packages in the context of a "go.mod" file. It supports a handful of directives like "require" and "replace" that could be really useful to think about here.

Consider a lexicon-scope structure that looks like this:

lexicon: "v1.0.0"
lexicons:
  # Resolve this lexicon to "HEAD" and use it as-is
- "com.atproto.repo"
  # Use app.bsky.feed but from a specific record
- lexicon: "app.bsky.feed"
  source: "at://did:plc:vdji24mx5mz2aiuv63ddxoy6/lexicon/xsxmb9tw6q"
  cid: "bafyrei..megcz3a"

This would impact several things. First, lexicon validation could include versioning in important ways that have strong assurances across packages and applications. Yes, there are backward compatibility mechanics in the spec, but I think they're impractical. This would provide developers with a strong assurance that if they reference something outside of their own lexicon, they can reference it at a specific point in time. It's like having a lock file for lexicon resolution.

This would also allow developers to "fork" lexicons. In this second scenario, let's say that Paul is working on an update to app.bsky.feed.post that introduces a new type for rich text blocks, but it hasn't been released yet. If I want to start experimenting with that in Smoke Signal, that isn't really possible. With a source (+ optional CID) "rewrite" rule, I can more easily look at future types and reference lexicon "forks."

7 replies

ngerakines Nov 25, 2024

One other side effect of the did-method-lex document is that those sources could be multiple types and can change using the same strong protections that did-method-plc enjoys. All of these could (should) be valid:

{
  "source": "at://did:web:ngerakines.me/lexicon/rjr8ld8gwp",
  "lexicon": "events.smokesignal"
}

{
  "source": "https://smokesignal.events/lexicon.json",
  "lexicon": "events.smokesignal"
}

{
  "source": "git://github.com/astrenoxcoop/smokesignal-lexicon.git",
  "ref": "refs/heads/main",
  "lexicon": "events.smokesignal"
}

ngerakines Nov 25, 2024

With did-method-lexicon being a commit log, it also means that a single persistent DID can be used to represent incremental changes to the spec, that those changes can be undone, and that a DID can be tombstoned and discouraged from use.

ngerakines Nov 25, 2024

This model also leads into the replace ("rewrite") mechanic.

Consider the scenario where everything under "events.smokesignal" is accepted by a standards body and becomes "lexicon.standards.events". To allow this change to be backwards compatible, the lexicon resolver and validator would need to understand that any instances of "events.smokesignal" in types or methods should actually be "lexicon.standards.events".

In the above example, consider the change via a lex_operation to "did:lex:lbefcmief2t4fzi275qgy6e6" that looks like:

{
  "lexicon": 1,
  "requirements": [
    "com.atproto",
    {
      "source": "at://did:web:pfrazee.com/lexicon/xsxmb9tw6q",
      "lexicon": "app.bsky.feed"
    },
    {
      "source": "at://did:web:ngerakines.me/lexicon/rjr8ld8gwp",
      "lexicon": "lexicon.standards.events",
      "replace": "events.smokesignal"
    }
  ]
}

That introduction of "replace" would say that any time a prefix of "events.smokesignal" is encountered, treat it as "lexicon.standards.events". Because the change is part of the primary did-method-lexicon DID, any current records referencing it with "$lexicon" would start consuming lexicon.standards.events.calendar.feed types using the rewritten records.

bleonard252 Nov 25, 2024

Concurring with this. It also means you only have to do authority lookup once.

bnewbold Nov 28, 2024
Maintainer Author

There are a bunch of really interesting ideas in here. The analogy software library systems and the affordances they have evolved over time are strong, and we (Bluesky) have had to do a fair amount of juggling of JSON between repos when experimenting with changes across repos.

Right now, we feel like we are already pushing our complexity budget with atproto. There is a tension between what would be best long-term, with folks who are working deeply with the protocol. Versus getting new devs to jump in and start experimenting.

Overall, versioning of Lexicons is something we have a basic plan for, but is going to need more exploration. I think several of the ideas above could be tooling build on lexicon v1, and maybe find there way in to a lexicon v2.

bnewbold · 2024-11-28T04:20:11Z

bnewbold
Nov 28, 2024
Maintainer Author

We looked over all this feedback today (thanks everybody!), and had a few more internal conversations.

Here is where we are "strongly leaning":

will not have schema files as blobs in lexicon v1 (all the schema data will be in the record itself)
the record layout will look basically just like current Lexicon schema JSON files, with a $type field. the lexicon integer field will be top-level, and how versions are distinguished. in other words, we aren't using "lexicon design best practices" for extensibility (eg, an open union), we will just use that lexicon field
the CID of the record "is" the CID of the Lexicon, and can be used as a distinguishing identifier (if not really a "version")
not going to do NSID syntax segmentation to indicate the "authoritative" host (aka, not doing app.bsky._feed.post)
from a security perspective, regular DNS TXT records are acceptable as part of Lexicon resolution, contingent on us designing some more mitigations specific to lex resolution and auth scopes; and publishing some general security considerations which also apply to handle DNS TXT records

We are still undecided about using the existing handle resolution mechanism (aka, _atproto.* DNS TXT, or HTTPS well-known), or a new _lexicon.* DNS TXT alone. This is the last major design decision.

6 replies

futurGH Nov 28, 2024

If someone wants to have both a handle and a namespace on a single hostname it can either be in two TXT records or a simple space to separate the two key-value segments:

I don't want to have to parse that 😅 not to mention all the existing handle resolution code that would suddenly be trying to resolve a DID of did:plc:gy4wcyjtmm2wiobwgq4demtb lexicon=at://gy4wcyjtmm2wiobwgq4demtb/lexicon/whatever

This makes sense though, looking forward to seeing it!

ngerakines Nov 28, 2024

@futurGH I agree, but the unfortunate reality is that a lot of DNS providers and DNS libraries support multiple TXT records as one big squashed string with a space.

bnewbold Nov 28, 2024
Maintainer Author

Starting with the NSID events.smokesignal.calendar.event:

strip .event suffix and reverse to calendar.smokesignal.events
resolve that hostname to a DID (either as a handle, or a new DNS TXT record type)
lookup the DID, find PDS hosting location. do not do bi-directional validation of the handle and/or DNS TXT record from the DID doc
record will have the rkey events.smokesignal.calendar.event (the original NSID), so the AT-URI would be like: at://did:plc:abc123/com.atproto.lexicon.schema/events.smokesignal.calendar.event. note that this is a form of bi-directional validation (aka, the repo rkey points back to the NSID)
parse the record. it should have the correct $type (like all records), and lexicon should be 1 (for this version of the protocol). then parse id and confirm it matches the original NSID (and the record key)
you can enumerate all rkeys in this repo to "discover" possible other NSIDs. this definitely works for all the NSIDs in the same "group" (exact same except for final "name"), and might be a hint for other prefixes/domains, but doesn't robustly enumerate everything under all sub-domains (for that, use a lexicon aggregator service)

ngerakines Nov 28, 2024

Awesome. Ship it

bleonard252 Nov 29, 2024

if it doesn't resolve at i.e. calendar.smokesignal.events, should it try looking for it instead at smokesignal.events? (I'd stop at two segments, or one plus a PSL prefix, i.e. the last one checked for io.github.user.long.example to be user.github.io)

bnewbold · 2024-12-04T18:48:50Z

bnewbold
Dec 4, 2024
Maintainer Author

As another update, we decided in the direction of _lexicon.* DNS TXT records for Lexicon resolution, distinct from the "handle" mechanism. There will not be an HTTPS well-known mechanism, at least to start, just DNS. There will not be bi-directionality in the DID document. The format of the DNS TXT value will be the same/similar to the handle _atproto.* records (did=did:plc:abc123).

The next steps will be to do some more implementation experiments to prove out the design, including publishing some example Lexicons. Then we'll add a spec document and build out a "lexicon aggregator" service, with it's own API for doing queries and look-ups.

3 replies

bb010g Dec 5, 2024

What was the final reasoning for preferring avoiding AT Protocol handles?

QuietImCoding Dec 6, 2024

I came here to suggest some stuff but looks like the team has already decided on all the things I was going to recommend. Looking forward to seeing how this evolves!

matthieusieben Dec 7, 2024
Collaborator

What was the final reasoning for preferring avoiding AT Protocol handles?

One of the main reasons is that handles are a 1to1 bidirectional mapping with a did. We want to be able to point multiple namespaces to the same did, without requiring bi-directionnality.

Another reason is that acme.com might be an account managed by a marketing team, while the lexicon are managed by devs. We didn't want to have to introduce permissions at the repo level.

Also, we are not thrilled at the idea of people being able to use the bsky.social for their NSIDs.

RFC: Lexicon Resolution #3074

bnewbold Nov 22, 2024 Maintainer

Background and Framing

Syntax, Terminology, Use Cases

Sketch Design

Open Questions

NSID to Authority Mapping

Repo Resolution Question

Schema Representation Question

Alternatives Considered

HTTP well-known Endpoints

DID Service Entry

Replies: 15 comments · 53 replies

bnewbold Nov 25, 2024 Maintainer Author

bnewbold Nov 25, 2024 Maintainer Author

bnewbold Nov 25, 2024 Maintainer Author

bnewbold Nov 25, 2024 Maintainer Author

matthieusieben Nov 25, 2024 Collaborator

bnewbold Nov 25, 2024 Maintainer Author

Footnotes

bnewbold Nov 25, 2024 Maintainer Author

bnewbold Nov 25, 2024 Maintainer Author

bnewbold Nov 25, 2024 Maintainer Author

matthieusieben Nov 25, 2024 Collaborator

Counter proposal: did-method-lex

bnewbold Nov 25, 2024 Maintainer Author

bnewbold
Nov 22, 2024
Maintainer

Replies: 15 comments 53 replies

bnewbold Nov 25, 2024
Maintainer Author

bnewbold Nov 25, 2024
Maintainer Author

bnewbold Nov 25, 2024
Maintainer Author

bnewbold Nov 25, 2024
Maintainer Author

matthieusieben Nov 25, 2024
Collaborator

bnewbold Nov 25, 2024
Maintainer Author

bnewbold Nov 25, 2024
Maintainer Author

bnewbold Nov 25, 2024
Maintainer Author

bnewbold Nov 25, 2024
Maintainer Author

matthieusieben Nov 25, 2024
Collaborator

bnewbold Nov 25, 2024
Maintainer Author