Future spec suggestion: include an (optional) canonical_url in the response payload as a core attribute
#238
The mailing list seems dead, so this seems to be the best place for a suggestion / wishlist item for future specs (if there is one). I didn't see this covered in the history of either.
It would be nice if the spec encouraged/suggested that providers include the "canonical" URL in the payload.
Why?
We consume a fair amount of data via oEmbed, and incur a fair bit of extra traffic because multiple distinct URLs often represent a single canonical URL.
For example (and with the understanding that kw args are for internal use and do not affect the oEmbed response), the HTML versions of these endpoints all declare the same canonical URL of
https://example.com/a
and have an identical response:

In addition to various business-logic reasons, we need to use the canonical URL as the cache key in order to keep our cache size under control. To achieve that, we must first make an HTTP request for the actual HTML document to discern the canonical URL, then fetch the oEmbed response for it if there is a cache miss.
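To make the current workaround concrete, here is a minimal sketch of the two-step lookup described above. The names `extract_canonical`, `get_embed`, `fetch_html`, and `fetch_oembed` are hypothetical and only for illustration; the fetchers are passed in as callables so the sketch does not assume any particular HTTP client or our actual implementation.

```python
from html.parser import HTMLParser
from typing import Callable, Dict, Optional


class CanonicalLinkParser(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        # rel is a space-separated token list, e.g. rel="canonical"
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr_map.get("href"):
            self.canonical = attr_map["href"]


def extract_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in the HTML, or None if absent."""
    parser = CanonicalLinkParser()
    parser.feed(html)
    return parser.canonical


def get_embed(
    url: str,
    fetch_html: Callable[[str], str],     # hypothetical: returns the page body
    fetch_oembed: Callable[[str], dict],  # hypothetical: returns the oEmbed JSON
    cache: Dict[str, dict],
) -> dict:
    # Step 1: download the full HTML page just to learn the canonical URL.
    canonical = extract_canonical(fetch_html(url)) or url
    # Step 2: only call the oEmbed endpoint on a cache miss.
    if canonical not in cache:
        cache[canonical] = fetch_oembed(canonical)
    return cache[canonical]
```

The expensive part is step 1: every incoming URL variant costs a full HTML download, even when the oEmbed response is already cached.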
While an average oEmbed response from upstream providers is around 3 KB, the full HTML pages are usually in the 300-800 KB range. Providing the canonical URL in the oEmbed payload would allow HTML retrieval/parsing to be skipped, and shift all interaction to the oEmbed API.
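As a rough sketch of what a consumer could do if providers returned the proposed field: the `canonical_url` key below is the suggestion in this issue, not part of the current oEmbed spec, and the `aliases` map and `get_embed_via_payload` name are assumptions made for illustration only.

```python
from typing import Callable, Dict


def get_embed_via_payload(
    url: str,
    fetch_oembed: Callable[[str], dict],  # hypothetical: returns the oEmbed JSON
    cache: Dict[str, dict],               # canonical URL -> oEmbed payload
    aliases: Dict[str, str],              # requested URL -> canonical URL
) -> dict:
    # If this exact URL was resolved before, no network request is needed.
    canonical = aliases.get(url)
    if canonical is not None and canonical in cache:
        return cache[canonical]

    # One small oEmbed request replaces the full HTML download.
    payload = fetch_oembed(url)

    # `canonical_url` is the PROPOSED optional field; fall back to the
    # requested URL when a provider does not send it.
    canonical = payload.get("canonical_url") or url
    aliases[url] = canonical

    # Keep a single cached payload per canonical URL.
    cache.setdefault(canonical, payload)
    return cache[canonical]
```

With this shape, each new URL variant costs one small oEmbed request instead of a full page fetch, and the cache still holds only one entry per canonical URL.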
Good examples of this phenomenon are YouTube and Instagram. Both properties are built on the concept of sharing links and capitalize on oEmbed integrations, but also have numerous URL patterns/generators that will point to a given canonical URL.