consider not using _ to prefix extensions #19

Open
kr opened this Issue May 18, 2017 · 17 comments

kr commented May 18, 2017

From RFC 6648 ‘Deprecating the "X-" Prefix and Similar Constructs in Application Protocols’:

   Historically, designers and implementers of application protocols
   have often distinguished between standardized and unstandardized
   parameters by prefixing the names of unstandardized parameters with
   the string "X-" or similar constructs.  In practice, that convention
   causes more problems than it solves.  Therefore, this document
   deprecates the convention for newly defined parameters with textual
   (as opposed to numerical) names in application protocols.

This spec's _ prefix seems to qualify as a "similar construct".

Maybe it's too late to do anything about this, and even if not, maybe the arguments in that RFC are not persuasive enough. But I wanted to raise the issue anyway, just in case.

kr May 18, 2017

Contributor

For convenience, here's the main point:

Appendix B.  Analysis

   The primary problem with the "X-" convention is that unstandardized
   parameters have a tendency to leak into the protected space of
   standardized parameters, thus introducing the need for migration from
   the "X-" name to a standardized name.  Migration, in turn, introduces
   interoperability issues (and sometimes security issues) because older
   implementations will support only the "X-" name and newer
   implementations might support only the standardized name.  To
   preserve interoperability, newer implementations simply support the
   "X-" name forever, which means that the unstandardized name has
   become a de facto standard (thus obviating the need for segregation
   of the name space into standardized and unstandardized areas in the
   first place).

kr May 18, 2017

Contributor

For example, a hypothetical alternate version of the ‘Extensions’ section might have said:

Publishers can use custom objects in JSON Feeds. Any names that aren't described on this page are custom. Custom objects can appear anywhere in a feed.

It’s good practice to name an extension with a company or service name, to provide a clue right away as to what it’s for and who made it. However, if your custom object is useful to most people who read and write feeds, consider the possibility that it might end up becoming a de facto standard whether you want it to or not. If that seems likely, choose a suitably general name.

svenluijten May 18, 2017

This could also be solved by introducing a new top-level object: extra. It might look like this:

{
  "extra": {
    "my_vendor": {
      "some_key": "value"
    }
  }
}

This extra object would obviously be optional.
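
As a rough sketch of how a reader might consume this layout (in Python): the `extra` key and the `my_vendor` name come from the example above, while the helper `vendor_extension` is hypothetical and not part of any JSON Feed library.

```python
import json


def vendor_extension(feed: dict, vendor: str) -> dict:
    """Return the vendor's block from the optional "extra" object, or {}."""
    extra = feed.get("extra")
    if not isinstance(extra, dict):
        # "extra" is optional, so a missing or malformed value is not an error.
        return {}
    block = extra.get(vendor)
    return block if isinstance(block, dict) else {}


feed = json.loads('{"extra": {"my_vendor": {"some_key": "value"}}}')
print(vendor_extension(feed, "my_vendor"))
print(vendor_extension({}, "my_vendor"))
```

Everything nonstandard sits under one key, at the cost of one extra level of nesting for every extension.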

manton May 18, 2017

Collaborator

Thanks @kr and @svenluijten. Good to know about RFC 6648. I lean toward keeping _ because it reads cleanly compared to extra nesting, and it is obvious what is not part of the specification, but I don't think I mind extensions that use generic names. For example, in Micro.blog I decided to use _microblog because the extra fields could just as easily be part of a different service. (We debated a few other more unique choices like reverse-DNS strings, etc. but they added a lot of clutter to the format.)
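
To illustrate the "obvious what is not part of the specification" point, here is a minimal Python sketch that separates extension keys from everything else by looking only at the `_` prefix. The helper name `split_extensions` and the contents of the `_microblog` object are made up for illustration.

```python
def split_extensions(obj: dict) -> tuple:
    """Split a feed or item object into (unprefixed, extensions) by the _ prefix."""
    unprefixed, extensions = {}, {}
    for key, value in obj.items():
        # Only the key name is inspected; no list of spec-defined names is needed.
        target = extensions if key.startswith("_") else unprefixed
        target[key] = value
    return unprefixed, extensions


# "example_field" is a placeholder, not a real Micro.blog field.
item = {"id": "1", "title": "Hello", "_microblog": {"example_field": "value"}}
unprefixed, extensions = split_extensions(item)
print(sorted(unprefixed))
print(sorted(extensions))
```

The check needs no knowledge of which names the spec defines, which is the readability argument for the prefix.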

kornelski May 18, 2017

A popular extension will become a de-facto standard. This has happened many times on the web (apple-touch-icon, meta viewport, tons of -webkit-* CSS supported cross-browser, XFF HTTP header, etc.).

The HTML standard ended up using a central registry for this (https://wiki.whatwg.org/wiki/MetaExtensions) with a very low bar for registration of new names, and it seems to work fine.

So I'd suggest just asking people to coordinate. Decentralized zero-contact extensibility sounds nice, but in reality we can talk to each other, especially since extensions will require cooperation between feed creators and consumers anyway.

ttepasse May 19, 2017

Given the history of RSS extensions, coordination seems too hard. Following RFC 6648 gives you something like this:

"https://blueshed-podcasts.com/json-feed-extension": {
    "explicit": false,
    "copyright": "1948 by George Orwell",
    "owner": "Big Brother and the Holding Company",
    "subtitle": "All shouting, all the time. Double. Plus. Good."
}

Possible spec text:

  • A top-level or item-level key which is a valid IRI or URI denotes an extension object.
  • The Resource behind the URI SHOULD document the extension.
  • A parser which doesn't recognize the URI should ignore it.

Of course the developer behind the extension spec can and will go out of business and will delete the docs, as happened to so many RSS extension documents. But if one or more extensions in the same subject space get popular and are useful for the whole ecosystem, like the podcasting extensions to RSS, I'd argue that it is the job of the spec to document those extensions instead of leaving the subject open to different extensions by different sources which can all go under.
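
A parser following the proposed spec text might look roughly like this Python sketch. It narrows "valid IRI or URI" to absolute http(s) URLs, which is a simplifying assumption, and the function names `is_uri_key` and `recognized_extensions` are hypothetical.

```python
from urllib.parse import urlsplit


def is_uri_key(key: str) -> bool:
    """Treat a key as an extension URI if it parses as an absolute http(s) URL."""
    try:
        parts = urlsplit(key)
    except ValueError:
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def recognized_extensions(obj: dict, known_uris: set) -> dict:
    """Keep only the URI-keyed extension objects this parser knows; ignore the rest."""
    return {
        key: value
        for key, value in obj.items()
        if is_uri_key(key) and key in known_uris and isinstance(value, dict)
    }


KNOWN = {"https://blueshed-podcasts.com/json-feed-extension"}
feed = {
    "title": "Example",
    "https://blueshed-podcasts.com/json-feed-extension": {"explicit": False},
    "https://example.org/unknown-extension": {"anything": 1},
}
print(recognized_extensions(feed, KNOWN))
```

The lookup is an exact string match, so the same extension written with `http://` or with a trailing slash would be silently ignored.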

kornelski May 19, 2017

AFAIK RSS did not even attempt to have any sort of central coordination. There was no registry. The 2.0 spec didn't even have a clear way to provide feedback.

They've just said to use XML namespaces, and XML namespaces were generally misunderstood and incorrectly implemented (e.g. Sparkle required a specific prefix name instead of using NS URLs).

URL keys are longer, much harder to remember and easier to get wrong (was it http or https? trailing slash?).

And they're solving the problem of decentralized extensibility with zero contact, instead of the problem of "we're all going to have to use this weird key if this extension becomes de-facto standard".

kornelski May 19, 2017

Let's say message stickers become the new rage and somehow every podcast will have to have a sticker pack. We could have:

  • _sticker key. Maybe there will be one proprietary format from the first-mover, or a few competing ones clashing, either by accident or because they disliked the first proprietary one.

  • same as above, but with addition of _sticker2, and _better_sticker, so every feed will end up having all three just in case.

  • _vendor-name-sticker, and everyone will forever use the vendor's name for the key, even after the vendor goes out of business. It will either be added to the official spec ("all _-prefixed keys are extensions, except _vendor-name-sticker, which is totally standard now!"), or it won't be in the spec, and everyone will have to reverse-engineer it or find an alternative "spec" on some forum or StackOverflow answer.

  • http://json.feed.vendor.com/ns/extensions/sticker/1.0, same mess as above, just longer to type, and some people will get confused why it doesn't work when they "correct" the URL to https.

  • OR a sticker key! If the spec asks people to make a GitHub issue, then the first person to attempt an implementation will get the name. Maybe they will even get feedback on the format, and other implementors will find it and cooperate instead of duplicating it. And later it could be added to the official spec without any legacy mess carried over.

manton May 19, 2017

Collaborator

I think @pornel's examples highlight why we went with _sticker. All extension styles have downsides, so best to go with the one that is the most simple and readable. I've seen a few extensions in the wild already, and they don't overwhelm the document. (If an extension becomes widely used, makes sense to file a request to incorporate it into the spec.)

kr May 20, 2017

Contributor

Going with just sticker would have been (as far as I can tell) the most simple and readable, even more so than _sticker. No centralized registry is necessary. Same spec, same guidelines, same everything else, just… no underscore.

If an extension becomes widely used, makes sense to file a request to incorporate it into the spec.

Yes! The interesting part is what happens then?

There's a popular extension called _sticker. Everyone loves it and agrees it should be put in the spec, but there's lots of software out in the wild that supports only the underscore-name. You can't change the name because that would fail to interoperate with existing software (remember Rule #1). You can't standardize the existing name because the spec explicitly promises never to specify underscore-names, even in the future. What do you do?
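
In practice the likely outcome is that consumers accept both names indefinitely. A small Python sketch of that situation follows; the standardized name `sticker` is hypothetical, and `read_sticker` is an illustrative helper rather than anything from the spec.

```python
def read_sticker(item: dict):
    """Return the sticker object, preferring the standardized name over the legacy one."""
    # Both names have to be checked for as long as old producers exist.
    for name in ("sticker", "_sticker"):
        value = item.get(name)
        if isinstance(value, dict):
            return value
    return None


new_item = {"id": "1", "sticker": {"url": "https://example.com/a.png"}}
old_item = {"id": "2", "_sticker": {"url": "https://example.com/b.png"}}
print(read_sticker(new_item))
print(read_sticker(old_item))
```

The fallback branch can never be removed safely, which is the interoperability cost RFC 6648 describes.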

manton May 20, 2017

Collaborator

It's similar to what happens with CSS extensions, e.g. -webkit-opacity. If it's incorporated into the official spec, people use both for a little while, then eventually we forget about the extension and just use the standard name. It's not perfect, but I think it's better than the conflicts that could happen if there's no prefix on extensions.

kr May 21, 2017

Contributor

Yeah, so with CSS, basically only web browsers can add extensions. (Because they're the only software that interprets CSS that people have to interact with over the network.) And they basically only do this when they're trying to get their extension into the standard. And even then it only works because almost all browsers have aggressive auto-update systems. Everyone who's writing CSS files knows this from the start, so they plan for it and it's less of a headache. (But still a big headache that takes years to resolve for each extension.)

The current situation strikes me as more similar to HTTP header fields. The classic example is X-Forwarded-For. There's a standardized Forwarded field (RFC 7239), but nobody uses it. Everybody still uses X-Forwarded-For because that's the one that works. With X-Frame-Options (https://tools.ietf.org/html/rfc7034), they realized it would be futile to try to rename it and didn't even bother; they just standardized the X name.

The difference is, with CSS there are basically only five programs in the world you need to worry about, and they auto-update like nobody's business. With HTTP there are hundreds, and many of them are pretty conservative (i.e. slow or never) with updates.

I think JSON feed (especially if it is successful — and I really hope it will be! I love this spec, should've said so to start with) will be more on the HTTP end of the spectrum. As a machine-readable and machine-writable format, it'll have various network services that generate, transform, and interpret it. A feed aggregator is one example. If there is a widely used extension, say, _forwarded_for, that shows the URLs of the upstream feeds, you'll never manage to rename it to forwarded_for even if you want to.

You can specify the new name, but everyone will just keep using the name that is generated by the feed server on their random free multi-tenant web host and still works even on their cousin's five-year-old PC.

In short: I predict the _ prefix will make things more complicated.

the conflicts that could happen if there's no prefix on extensions

I don't totally follow here. Do you have an example of a conflict that would happen with no prefix on extensions, and that would be prevented by putting an underscore in front?

Note that new nonstandard HTTP header fields (these days) don't use any prefix (e.g. DNT), and conflicts are not a problem in practice.

manton May 21, 2017

Collaborator

Good points. You're right that CSS is not the best comparison, although it's still similar in that I think the most popular extensions will be proposed by feed readers.

As for conflicts, let's imagine that sticker in one extension contains a pair of string values (maybe full_url and thumbnail_url), and in another extension is an object with different members (maybe type, size, and url). It's true that the conflict is there whether the field has an underscore prefix or not. But the problem gets worse if there's no prefix, because if we bring one version of the extension into the spec, and new feed readers adopt it, those feed readers will be surprised when they encounter the same field name with a completely different structure underneath it.

In this scenario it actually becomes difficult to even check whether a feed is valid. As soon as we promoted an extension with a conflicting name like that into the spec, we'd actually invalidate other feeds that used to be perfectly fine (the sticker extension that used the thumbnail_url value, for example, when the spec now says it must be an object with 3 different required members). The prefix should prevent that from happening.
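
The scenario can be made concrete with a short Python sketch. It assumes the spec promoted the second shape (an object requiring type, size, and url) under the unprefixed name `sticker`; the validator functions and all example values are invented for illustration.

```python
REQUIRED_MEMBERS = ("type", "size", "url")


def is_valid_sticker(value) -> bool:
    """Check a value against the hypothetical promoted shape."""
    return isinstance(value, dict) and all(name in value for name in REQUIRED_MEMBERS)


def validate_item(item: dict) -> list:
    """Return validation errors; underscore-prefixed keys are never checked."""
    errors = []
    if "sticker" in item and not is_valid_sticker(item["sticker"]):
        errors.append("sticker: expected an object with type, size, and url")
    return errors


# The competing extension's shape, which used full_url and thumbnail_url.
other_shape = {
    "full_url": "https://example.com/s.png",
    "thumbnail_url": "https://example.com/s-thumb.png",
}
print(validate_item({"sticker": other_shape}))
print(validate_item({"_sticker": other_shape}))
```

The unprefixed feed becomes invalid once the name is standardized, while the same data under `_sticker` is left alone.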

manton May 21, 2017

Collaborator

As @sonicdoe pointed out in another issue, extensions have to be objects, so my sticker example above was originally too much of a simplification. I've edited it to fix this. I think the point is the same either way. Thanks!

kornelski May 21, 2017

The situation with one key having two different bodies in two different implementations may happen indeed, but I think it's not too bad for two reasons: a) it can be made less likely, b) even if it happens, it can be resolved.

It can be made less likely by tracking what extensions are used in the wild, and asking people to propose/register extensions they use. It is in implementors' interest to have their extensions widely known, conflict-free, and compatible with other clients.

And even when a conflict happens, it can be fixed:

  • It will be noticed. Anyone who spots a conflict can file an issue, and the community can figure out how to coordinate.
  • Implementors can converge on one implementation (e.g. choose that "url" is the right key, but use "full_url" as a fallback). This works even when one side is uncooperative.
  • In the worst case, it may be possible for producers to make both work (e.g. provide all the keys or mangle the syntax to hide it from some bad client).
  • The official spec can pick one side to help other implementations fade away.
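
The "converge on one implementation" option could be sketched in Python like this, using the "url" and "full_url" key names from the list above. The helper `sticker_url` is hypothetical.

```python
def sticker_url(sticker: dict):
    """Return the sticker URL, trying the agreed key first and the rival key second."""
    for key in ("url", "full_url"):
        value = sticker.get(key)
        if isinstance(value, str) and value:
            return value
    return None


print(sticker_url({"url": "https://example.com/a.png"}))
print(sticker_url({"full_url": "https://example.com/b.png"}))
```

A consumer can ship this without any cooperation from producers that still emit "full_url".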


kr May 21, 2017

Contributor

Aha, I see. Yeah, of course I can't say that such a conflict wouldn't happen. (But as @pornel mentioned it's in everyone's interest to avoid conflicts, so they seem unlikely. This matches what's happened so far with HTTP.)

In that unfortunate scenario, I agree the spec wouldn't want to pick just one of the two sticker extensions to standardize. I think a reasonable thing to do would be to use a new name such as badge or stickie or whatever. Then this problem reduces to the same problem as with _ prefixes, except now it only happens when there is actually a conflict.

But I appreciate that all that might end up still being a confusing situation. Thanks for hearing me out.

dshanske May 2, 2018

Reading the spec, it says the prefix applies to new objects, not necessarily to extra, established properties inside an existing object. For example, I just added a syndication property to items based on alternate copies of it. This isn't an object.
