Rules around the URL casing #745

yordis · 2021-05-04T02:51:28Z

Hey folks, hopefully I didn't miss this in the docs; my apologies if I did.

I am trying to find a rule about URL casing: should my URLs be case-insensitive or not? And should I use dashes or underscores to separate words?

Would appreciate some alignment related to the topic.

Thanks in advance,

shaneqld · 2021-05-31T04:39:43Z

Your URLs consist of the hostname (lowercase, case insensitive), version (e.g. v1beta1, which is lowercase), resource name and optionally custom method.

AIP-122 details resource names. Specifically:

Collection identifiers must be in camelCase

user-settable resource IDs should conform to RFC-1034; which restricts to letters, numbers, and hyphen
user-settable resource IDs should restrict letters to lower-case

AIP-136 details custom methods. Specifically:

If word separation is required, camelCase must be used

Here's an excerpt from a Google API showing this in action:

post: "/v3beta1/{parent=projects/*/locations/*/agents/*/sessions/*}/entityTypes"

In the example above, projects, locations, agents, sessions and entityTypes are collection identifiers (camelCase), whereas the * characters stand for resource IDs (lowercase letters, numbers and hyphens).

And another example using a custom method:

post: "/v3beta1/{session=projects/*/locations/*/agents/*/sessions/*}:matchIntent"

As for whether the path should be case-sensitive, my guess is yes.
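To make the conventions quoted above concrete, here is a minimal Python sketch that checks a resource path against them. The two patterns and the `check_path` helper are my own paraphrase of the AIP-122 and AIP-136 excerpts, not official AIP tooling, and they ignore the length and first/last-character rules.

```python
import re

# Assumed patterns paraphrasing the AIP excerpts quoted above; not official AIP tooling.
CAMEL_CASE = re.compile(r"[a-z][a-zA-Z0-9]*")   # collection identifiers and custom verbs
RESOURCE_ID = re.compile(r"[a-z0-9-]+")         # lowercase letters, digits, hyphen


def check_path(path):
    """Return a list of naming problems for a path such as
    'projects/p1/locations/us/agents/a1/sessions/s1:matchIntent'."""
    problems = []
    # A custom method, if present, follows the first colon.
    resource, _, verb = path.partition(":")
    if verb and not CAMEL_CASE.fullmatch(verb):
        problems.append(f"custom method {verb!r} is not camelCase")
    # Segments alternate: collection identifier, resource ID, collection, ...
    for index, segment in enumerate(resource.split("/")):
        if index % 2 == 0:
            if not CAMEL_CASE.fullmatch(segment):
                problems.append(f"collection {segment!r} is not camelCase")
        elif not RESOURCE_ID.fullmatch(segment):
            problems.append(f"resource ID {segment!r} has disallowed characters")
    return problems
```

Under these assumptions, a path like `projects/my-proj/locations/us-central1/agents/a1/sessions/s1:matchIntent` yields no problems, while `projects/My_Proj/entity_types` is flagged twice (once for the resource ID, once for the collection identifier).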

gibson042 · 2021-06-01T19:51:16Z

That section of AIP-122 doesn't even make sense... RFC 1034 describes DNS, and section 3.5 covers the preferred syntax for host names (note: not arbitrary domain names) and was superseded by RFC 1123 anyway (to allow the first character of any label in an Internet host name to be an ASCII letter or digit). Note also that each DNS domain name is a sequence of labels, and "conform to RFC-1034" fails to convey the intent, which seems to be "be valid as a DNS label in an Internet host name conforming with the preferred syntax of RFC 1034 {as updated by,without the updates of} RFC 1123". That said, though, I can't imagine why API resource names should have anything to do with DNS domain name labels, and Unicode Standard Annex #31 (IDENTIFIER AND PATTERN SYNTAX) would seem much more applicable.

yordis · 2021-06-01T19:56:27Z

How valuable would it be to lowercase everything and use either - or _ (only one) to avoid bikeshedding?

Or use camelCase but never allow acronyms, using uppercase only to separate words 🤷🏻

I am getting lost reading the AIP, and I am craving more rules that avoid bikeshedding

lukesneeringer · 2021-07-12T18:25:03Z

@gibson042 Can you send a PR with what you think it ought to say?

gibson042 · 2021-07-13T07:09:22Z

Can you send a PR with what you think it ought to say?

@lukesneeringer Yes, but I'd really like someone to explain the intent of the current text first. The Resource ID segments section seems clear, albeit with a surprising DNS RFC reference. IIUC, user-settable resource IDs should be matched by /^[a-z]([a-z0-9-]{0,61}[a-z0-9])?$/ (which allows consecutive dashes that would be disallowed by UAX #31, and disallows underscores that would be allowed). But the main Guidance section is unclear: RFC 1123 "Internet host name" updates the "Preferred name syntax" from RFC 1034, but neither of those documents has a concept of "DNS names" as suggested by the AIP. I suspect the AIP intends to reference the host name label production (i.e., the parts of a domain name that are separated by unescaped dots), although again it is not clear why URL path segments should have any connection to that concept.
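For anyone who wants to poke at that pattern, here is a small Python sketch of it. The helper name `is_valid_resource_id` is hypothetical, and `fullmatch` stands in for the explicit `^...$` anchors.

```python
import re

# The pattern from the comment above; fullmatch() replaces the ^...$ anchors.
RESOURCE_ID = re.compile(r"[a-z]([a-z0-9-]{0,61}[a-z0-9])?")


def is_valid_resource_id(value):
    """True if value is 1 to 63 characters: an initial lowercase letter,
    inner lowercase alphanumerics or dashes, and a terminal alphanumeric."""
    return RESOURCE_ID.fullmatch(value) is not None


for candidate in ["my-resource-1", "a--b", "my_resource", "1abc", "abc-", "a" * 64]:
    print(candidate, is_valid_resource_id(candidate))
```

The first two candidates are accepted (including the consecutive dashes) and the remaining four are rejected: an underscore, a leading digit, a trailing dash, and a 64-character value.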

For reference, RFC 3986 defines URL path segments as consisting of any combination of unreserved (ASCII alphanumeric/dash/dot/underscore/tilde), pct-encoded, sub-delims (any of !$&'()*+,;=), ":", and "@", and assigns special treatment to empty segments (disallowed after an initial slash for some paths), dot segments (. and .., subject to relative interpretation), and segments containing a colon (disallowed as the initial segment of a relative reference, where it would be confusable with the separator after an initial URI scheme). AIPs can restrict path segments to a smaller set, but is there any reason for that restriction to correspond with the unrelated DNS host name label?
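As a rough illustration of how much wider that RFC 3986 character set is, the segment grammar described above can be sketched in Python as follows. The names `SEGMENT`, `is_rfc3986_segment`, and `is_dot_segment` are hypothetical, and the sketch only checks the character set, not the context-dependent rules for empty or colon-containing segments.

```python
import re

# segment = *pchar, where pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
SEGMENT = re.compile(r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})*")


def is_rfc3986_segment(value):
    """True if value uses only characters RFC 3986 permits in a path segment."""
    return SEGMENT.fullmatch(value) is not None


def is_dot_segment(value):
    """Dot segments get relative-reference treatment during resolution."""
    return value in (".", "..")


for candidate in ["entityTypes", "my_resource~1", "a%20b", "a b", "a/b", "a%2"]:
    print(candidate, is_rfc3986_segment(candidate))
```

The first three candidates pass and the last three fail (a raw space, a slash, and an incomplete percent-encoding), which shows that underscores, tildes, and mixed case are all permitted at the URL level and that any narrower rule is purely an AIP choice.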

There's nothing wrong with "up to 63 ASCII characters with an initial letter, terminal alphanumeric, and inner alphanumerics and dashes, preferably all lowercase"; it's just strange to couple it to DNS. Should I assume that we want to keep the general concept and drop the coupling, or should it go further and allow punctuation such as underscore (which is an ID_Continue character supported in Unicode identifiers)?
