Hypermedia

On RESTfulness

Hypermedia

A true RESTful API must satisfy the Fielding constraints[1], but in practice these seem to have imposed a high engineering barrier to entry. Anecdotally, this -- with the marketing industry behind it -- could partially explain the abuse of the term coming to mean something like "JSON (or XML) over HTTP".

This "quasi-RESTfulness" has been seen as an iterative step towards true RESTfulness[2], whereby API developers currently understand the concept of resources and, increasingly, the constraints of HTTP verb semantics. However, the HATEOAS (hypermedia as the engine of application state) constraint has struggled to gain widespread adoption.

Why is this? While HATEOAS isn't a particularly "friendly" term, it is conceptually very simple and, importantly, familiar insofar as it can be seen as a generalisation of linking within traditional web pages (i.e., HTML's anchor element). It is therefore reasonable to consider the connected graph model manifested by hypermedia to be established lore and not the reason for the lack of traction.

Could it therefore be due to the mechanism for embedding application semantics? Again, while not as widely used, link relations are both simple and have precedent in traditional HTML usage. The only potential point of friction is that of standardisation and usage: The IANA registry for link relations[3] is very bare, to such a degree that one is forced to define application specific relation types. Given the average developer's propensity for creativity, defining such types is no obstacle in itself; the only issue here would be one of standardisation.

Whilst any decent developer will care about standardisation, it's clear that standards don't materialise from the ether. Furthermore, good developers are wont to hack! This, too, cannot be the root cause.

In reality, there appear to be two fundamental issues that could explain the hesitance towards true RESTfulness:

1. The traditional RPC model of an API does not cleanly map into a RESTful architecture (i.e., what is essentially a database). While not necessarily born of a lack of understanding, this impedance mismatch is at the heart of the problem. However, first a more systemic issue needs to be addressed.
2. That is, the practical problems associated with tooling and, critically, the mechanisms for representing the graph in the first place.

Hypermedia Type

JSON is popular because it is simple, very lightweight and can be easily consumed by JavaScript, in which the majority of web API client logic is ultimately implemented due to its incumbency in the browser space. However, JSON is not, in itself, a hypermedia format and, as such, a lot of the technical discussion around HATEOAS derives from this deficiency.

The standard resolution is to "hack" special hypermedia properties into JSON documents, but many solutions are either lacking or overly complicated. A less common, but increasingly popular solution -- used on its own, or in conjunction with the former -- is to use the HTTP Link response header to associate the resource (e.g., in plain JSON) with another (or several others) that describes its schema. This approach comes with the potential for very fine fidelity, but at the cost of maintenance and, presuming links are to be dereferenced, extraneous HTTP requests (or caching).
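As a rough illustration of the Link header approach, the sketch below extracts link relations from a header value. The URLs and the `describedby` usage are assumptions chosen for the example, and the parser is deliberately naive: it handles only simple values (no commas or semicolons inside quoted parameters) and is not a complete implementation of the header grammar.

```python
import re

# One link-value, e.g. <https://example.org/schema>; rel="describedby"
_LINK_VALUE = re.compile(r'<([^>]*)>((?:\s*;\s*[^,;]+)*)')
# One parameter, either quoted (rel="next") or bare (rel=next)
_PARAM = re.compile(r';\s*([\w*-]+)\s*=\s*(?:"([^"]*)"|([^\s";,]+))')


def parse_link_header(value):
    """Return {relation: [target, ...]} for a simple Link header value."""
    links = {}
    for target, params in _LINK_VALUE.findall(value):
        for name, quoted, bare in _PARAM.findall(params):
            if name.lower() != "rel":
                continue
            # A single rel parameter may list several space-separated relations.
            for rel in (quoted or bare).split():
                links.setdefault(rel, []).append(target)
    return links


# Hypothetical header accompanying a plain JSON response.
header = ('<https://api.example.org/schemas/person>; rel="describedby", '
          '<https://api.example.org/people?page=2>; rel="next"')
links = parse_link_header(header)
```

Whether the client then dereferences the `describedby` target, caches it or ignores it is exactly the maintenance and request overhead trade-off described above.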

The experience from these attempts can be used to prioritise constraints for the development of a more appropriate form to represent the graph:

1. The data format must have minimal friction against adoption by both client and server developers. While ultimately consumed by machines, human readability is important for a holistic understanding of state. Given the popularity of JSON, this realistically means a JSON derivative with the barest of augmentation.
2. The data format should allow arbitrary linking of any data element, with a link relation defining its application semantics, per RFC5988[4].
3. Protocol semantics (i.e., HTTP verbs) should be associated with all resources.
4. Human readable documentation can be associated with resources (the above are all to be considered machine readable).
5. HTTP requests should be kept to a minimum (i.e., IRIs are dereferenced on, at most, an as-needed basis).

Prior Art

(X)HTML and XML

To a degree, traditional HTML satisfies most of the above constraints. It only really falls down in two respects:

1. Only the GET and POST protocol semantics, between resources, can be encoded. This is rather limited and has led to an overloading of the POST method in practice.
2. HTML comes with a lot of unnecessary baggage that isn't required by an API, but is eminently useful for human-readable hypermedia (its original intention). As such, there is more friction imposed on the developer to either generate or parse this both economically and consistently.

One interesting feature of (X)HTML is that links can be both associated with a resource (in the head element and Link response header) and also inline, specific to arbitrary regions within the document (via anchors). For the purpose of API state, links specific to the resource are clearly necessary; anything at a finer granularity would be a "nice to have".
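The two granularities of HTML linking can be sketched with the standard library's `html.parser`. The page content, the `manager` relation and the "resource"/"inline" labels are assumptions made for illustration, not part of any specification.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (scope, relation, href) triples from link and anchor elements."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("link", "a"):
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if href is None:
            return
        # <link> describes the resource as a whole; <a> is tied to a region of it.
        scope = "resource" if tag == "link" else "inline"
        for rel in (attributes.get("rel") or "").split():
            self.links.append((scope, rel, href))


# Hypothetical HTML representation of a person resource.
PAGE = """
<html>
  <head><link rel="next" href="/people?page=2"></head>
  <body><p>Managed by <a rel="manager" href="/people/baz">Baz</a>.</p></body>
</html>
"""

collector = LinkCollector()
collector.feed(PAGE)
```

Anchors without a `rel` attribute are skipped here, since without a relation they carry no machine-readable application semantics.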

Regardless of this, given these actually quite minor failings, it is reasonable for a RESTful web API to support HTML output (e.g., if that's all a client will accept) for read-only consumption.

XML resolves the second problem, somewhat, by using a specific and typed (using DTDs) vocabulary. Arbitrary linking is supported using XLink[5]. While this therefore makes it a worthy candidate for RESTful APIs, it is out of favour due to its verbosity and the extra machinery required for parsing. Nonetheless, some effort would be worth pursuing for a modicum of support.

HAL

HAL[6] is a hypermedia type that extends JSON and XML with a handful of special properties to facilitate linking and link relations.

Specifically, each resource (state) comes with a set of links indexed by their relation, and optional embedded state (following the same pattern). This affords an arbitrarily nested structure of state with associated links.

While minimalistic, HAL lacks any mechanism for specifying protocol semantics on links, deferring instead to human readable documentation. Moreover, its semi-nested structure lends an odd semantic weight to the head resource.
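A minimal sketch of consuming HAL's JSON form follows. The `_links` and `_embedded` properties come from the HAL specification; the order document, its relations and the helper names `hrefs` and `walk` are hypothetical.

```python
import json

# Hypothetical HAL representation of an order.
HAL_DOC = json.loads("""
{
  "_links": {
    "self": {"href": "/orders/123"},
    "customer": {"href": "/customers/7"}
  },
  "total": 30.0,
  "_embedded": {
    "items": [
      {"_links": {"self": {"href": "/orders/123/items/1"}}, "sku": "abc"}
    ]
  }
}
""")


def hrefs(resource, rel):
    """Return the hrefs for a relation; HAL allows one link object or a list."""
    link = resource.get("_links", {}).get(rel, [])
    if isinstance(link, dict):
        link = [link]
    return [item["href"] for item in link]


def walk(resource):
    """Yield the resource and every embedded resource, depth first."""
    yield resource
    for embedded in resource.get("_embedded", {}).values():
        children = embedded if isinstance(embedded, list) else [embedded]
        for child in children:
            yield from walk(child)


customer = hrefs(HAL_DOC, "customer")
all_selfs = [hrefs(r, "self")[0] for r in walk(HAL_DOC)]
```

Note that nothing in a link object says which HTTP methods the target accepts, which is the gap in protocol semantics noted above.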

Siren

Siren[7] takes a more involved approach than HAL, by encoding both links and actions that can be performed against a resource. In a sense, it mirrors HTML, but with the emphasis on data and relationships rather than human readable text.

Importantly, arbitrary protocol semantics (unlike HTML's restriction to GET and POST) can be encoded into "actions". The fidelity afforded by the specification allows one to define the type level details of any request a resource may accept (much like HTML's form elements).

All this comes at the cost of complexity: Both in terms of developer adoption and interoperability between API deployments and clients.
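To make the idea of "actions" concrete, here is a sketch that turns a Siren action into the parts of an HTTP request. The entity, its `add-item` action and the `build_request` helper are hypothetical; the property names (`actions`, `fields`, `method`, `href`, `type`) follow the Siren specification, and only form-encoded bodies are handled.

```python
import json
from urllib.parse import urlencode

# Hypothetical Siren entity describing an order.
SIREN_DOC = json.loads("""
{
  "class": ["order"],
  "properties": {"status": "pending"},
  "actions": [
    {
      "name": "add-item",
      "method": "POST",
      "href": "/orders/123/items",
      "type": "application/x-www-form-urlencoded",
      "fields": [
        {"name": "sku", "type": "text"},
        {"name": "quantity", "type": "number"}
      ]
    }
  ],
  "links": [{"rel": ["self"], "href": "/orders/123"}]
}
""")


def build_request(entity, action_name, values):
    """Return (method, href, content_type, body) for a named action."""
    action = next(a for a in entity.get("actions", []) if a["name"] == action_name)
    allowed = [field["name"] for field in action.get("fields", [])]
    unknown = set(values) - set(allowed)
    if unknown:
        raise ValueError(f"fields not declared by the action: {sorted(unknown)}")
    # Encode the supplied values in the order the action declares its fields.
    body = urlencode([(name, values[name]) for name in allowed if name in values])
    # Siren defaults to GET and form encoding when these are omitted.
    return (action.get("method", "GET"), action["href"],
            action.get("type", "application/x-www-form-urlencoded"), body)


request = build_request(SIREN_DOC, "add-item", {"sku": "abc", "quantity": 2})
```

Even this small helper has to deal with defaults, field declarations and encodings, which hints at the adoption cost mentioned above.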

Linked Profiles

Linked profiles[8], via the HTTP Link response header, while not a hypermedia type, can use the same mechanism to encode semantics out-of-band from the resource. The arbitrariness of what is linked -- presuming it is dereferenceable, which is not a requirement of the specification -- gives a huge gamut of augmentation that can be applied to any resource.

However, at the same time, this places undue burden on developers, who must either support the consumption of referenced profiles (as well as curate a repository of such profiles) or hardcode the semantics implied by such links.

Some attempts to standardise profile resources have been, or are being, made (e.g., ALPS[9]). However, they have yet to fully crystallise and, again, don't fully resolve or even address the mechanism's shortcomings.

JSON-LD and Hydra

JSON-LD[10] can be seen as a type annotation system on top of standard JSON. This has the benefit that JSON-LD data can simply be inserted into existing APIs that serve plain JSON to augment it with both application semantics and linking, under a particular context.

The standard notion of link relations is present, if somewhat indirectly: JSON-LD's identifier annotation implies the relation. Moreover, Hydra[11] can be used in conjunction with JSON-LD to define additional vocabulary to facilitate protocol semantics and request templates.

The strength of JSON-LD, with Hydra, is also its weakness: Allowing the contextualisation to be shimmed into existing data makes for easy adoption, while maintaining separation of concerns; however, the resulting circumlocution of data and schema complicates what is, essentially, a fairly straightforward concept.
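The "type annotation" idea can be sketched as follows. This is a toy subset written for illustration, not the JSON-LD expansion algorithm, whose real output has a different, more verbose shape. The vocabulary IRIs, the person document and the `expand_terms` helper are all hypothetical; only the `@context`, `@id` and `@type` keywords are taken from JSON-LD.

```python
import json

# Plain JSON augmented with a context; the vocabulary IRIs are made up.
DOC = json.loads("""
{
  "@context": {
    "name": "https://vocab.example.org/name",
    "manager": {"@id": "https://vocab.example.org/manager", "@type": "@id"}
  },
  "@id": "/people/foo",
  "name": "Foo",
  "manager": "/people/bar"
}
""")


def expand_terms(doc):
    """Replace context-defined keys with their IRIs (toy subset of expansion)."""
    context = doc.get("@context", {})
    expanded = {}
    for key, value in doc.items():
        if key == "@context":
            continue
        definition = context.get(key, key)
        iri = definition["@id"] if isinstance(definition, dict) else definition
        # A term typed as "@id" holds a link to another resource, not a literal.
        is_link = isinstance(definition, dict) and definition.get("@type") == "@id"
        expanded[iri] = {"@id": value} if is_link else value
    return expanded


expanded = expand_terms(DOC)
```

The existing `name` and `manager` properties are untouched by the annotation, which is the ease of adoption noted above; the indirection through the context is the corresponding cost.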

(JSON API[12] is very similar to JSON-LD, with Hydra, but seems more restrictive and opinionated in its design.)

Collection+JSON

Credit should go to Collection+JSON[13] for paving the way towards better thought out and justified representations for facilitating truly RESTful APIs.

In particular, it represents resources as a list of items (hence "Collection"), each with their own state. The resource as a whole is associated with various relational links, as are individual items, which also comprise its data. In addition to this, mechanisms for defining search queries, item templates (cf. HTML's form tags), linked profiles and even error responses also exist.

While Collection+JSON caters for a wide spectrum of eventualities, it does so by cramming everything, albeit optionally, into the format but at the expense of detail. Moreover, it explicitly demarcates various aspects of a resource's state -- so one must context switch -- rather than recognising self-similar patterns.
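The demarcation can be seen in a small sketch. The document below is a hypothetical people collection using the property names from the Collection+JSON specification; `item_state` and `fill_template` are illustrative helpers, not part of any library.

```python
import json

# Hypothetical Collection+JSON document for a people collection.
CJ_DOC = json.loads("""
{"collection": {
  "version": "1.0",
  "href": "/people",
  "links": [{"rel": "profile", "href": "/profiles/person"}],
  "items": [
    {"href": "/people/foo",
     "data": [{"name": "full-name", "value": "Foo", "prompt": "Full name"}],
     "links": [{"rel": "manager", "href": "/people/baz", "prompt": "Manager"}]}
  ],
  "queries": [
    {"rel": "search", "href": "/people/search",
     "data": [{"name": "q", "value": ""}]}
  ],
  "template": {"data": [{"name": "full-name", "value": ""}]}
}}
""")


def item_state(item):
    """Flatten an item's name/value data array into a plain dict."""
    return {datum["name"]: datum.get("value") for datum in item.get("data", [])}


def fill_template(collection, values):
    """Build a write payload from the collection's template and given values."""
    data = [
        {"name": datum["name"],
         "value": values.get(datum["name"], datum.get("value", ""))}
        for datum in collection["template"]["data"]
    ]
    return {"template": {"data": data}}


collection = CJ_DOC["collection"]
first = item_state(collection["items"][0])
payload = fill_template(collection, {"full-name": "Bar"})
```

Items, queries and the template each carry a near-identical `data` array, yet each lives under its own top-level property and needs its own handling: the self-similar pattern the format does not exploit.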

Discussion

A criticism of the quest for real RESTfulness is that a software agent could not navigate an API graph -- regardless of its representation -- in a meaningful way without sufficient intelligence guiding it. Given that present technology cannot fulfil this requirement, one must defer to human intervention, which therefore makes the effort of curating a suitably rich graph structure pointless from the outset.

If the only purpose of a RESTful API is to facilitate fully automatic clients, then this is a valid concern. However, presenting resources to human agents with the full spectrum of potential interactions and relations, abstracting away semantics from presentation, is clearly of utility. Moreover, given a client with a sufficiently rich ontology, it wouldn't be impossible for it to judiciously explore a given graph with the intention of fulfilling a specific purpose.

For example, imagine a DSL that maps over a directory service ontology. A statement such as "manager of foo becomes bar", applied against an API entry point, could be interpreted as follows:

1. The manager keyword is identified in the ontology as having the type of people, so the client knows it's dealing with HR data.
2. A GET request to the entry point is made and its response is analysed to look for a link with a people (or synonymous) relation: say, /people.
3. A GET request can now be made on /people/foo to retrieve the state of the person foo; a manager link relation is found in foo's representation. (While not strictly necessary, a HEAD request can also be made to /people/bar to ensure bar dereferences.)
4. An OPTIONS request is made on /people/foo to check that the becomes keyword -- which is interpreted as an idempotent update (i.e., PUT) -- can be satisfied.
5. The PUT request is constructed accordingly and submitted to /people/foo. The client waits for a 2xx status code (e.g., 204 No Content), or otherwise, and reports the outcome back to the user.
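The steps above can be sketched against an in-memory stand-in for the API. Everything here is an assumption made for illustration: the resource layout, the `links` and `allow` fields, the one-entry ontology and the toy `request` transport, which returns a status and payload rather than making HTTP calls.

```python
# Hypothetical resources, keyed by path.
RESOURCES = {
    "/": {"links": {"people": "/people"}},
    "/people/foo": {"links": {"manager": "/people/baz"}, "allow": ["GET", "PUT"]},
    "/people/bar": {"links": {}, "allow": ["GET", "PUT"]},
}
ONTOLOGY = {"manager": "people"}  # keyword -> relation to look for at the entry point


def request(method, path, body=None):
    """Toy transport: return (status, payload) for a method and path."""
    resource = RESOURCES.get(path)
    if resource is None:
        return 404, None
    if method == "GET":
        return 200, resource
    if method == "HEAD":
        return 200, None
    if method == "OPTIONS":
        return 204, resource.get("allow", ["GET"])
    if method == "PUT" and "PUT" in resource.get("allow", []):
        RESOURCES[path] = body  # PUT replaces the whole representation
        return 204, None
    return 405, None


def run(statement, entry_point="/"):
    """Interpret '<relation> of <subject> becomes <target>'; return a status."""
    relation, _, subject, _, target = statement.split()
    collection_rel = ONTOLOGY[relation]                    # step 1
    _, entry = request("GET", entry_point)                 # step 2
    collection = entry["links"][collection_rel]
    subject_path = f"{collection}/{subject}"
    status, person = request("GET", subject_path)          # step 3
    if status != 200 or relation not in person["links"]:
        return 404
    target_path = f"{collection}/{target}"
    if request("HEAD", target_path)[0] != 200:
        return 404
    _, allowed = request("OPTIONS", subject_path)          # step 4
    if "PUT" not in allowed:
        return 405
    updated = {**person, "links": {**person["links"], relation: target_path}}
    status, _ = request("PUT", subject_path, updated)      # step 5
    return status


status = run("manager of foo becomes bar")
```

The client hardcodes no paths beyond the entry point; what it does need is the ontology that tells it which relations to follow.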

The point, which justifies a key design constraint, is that while link relations are the mechanism for encoding application semantics, how they are interpreted is outside the scope of the API itself. (As link relations can be IRIs, one could imagine a link relation repository that could double as a client ontology per the above.)

As far as representations go, one option would be to draw a line in the sand and just pick a hypermedia type that satisfies the most salient design requirements, accepting any deficiencies as a compromise. This is certainly the most pragmatic approach, in an engineering sense -- plus, preferably, one would want to avoid media-type proliferation -- but it appears that the proposed solutions (mentioned above, or otherwise) are lacking in one way or another. That said, the ideas they suggest are certainly worthwhile and can be built upon or adapted into a more productive system.

Indeed, of all the proposed hypermedia types, Collection+JSON is the closest in terms of satisfying our design constraints. That said, it too is not ideal:

1. While relatively straightforward, Collection+JSON's syntax is a bit cluttered. This can be tidied up by closer modelling of the graph.
2. It has an "all or nothing" approach where metadata is concerned, which complicates things further. Moreover, the metadata it admits -- which mimic HTML's form elements, like Siren -- are a bit constrained.
3. Collection+JSON's "query templates" are a good idea, but not fully fledged.

Next > Modelling the Graph

References

[1] Fielding, R. T. (2000) Architectural Styles and the Design of Network-Based Software Architectures; University of California, Irvine; PhD Thesis
[2] Fowler, M. (2010), based on Richardson, L. (2008) Richardson Maturity Model
[3] Nottingham, M. et al (eds.) (Retr. 2015) Link Relations; IANA Registry
[4] Nottingham, M. (2010) Web Linking; IETF RFC5988
[5] DeRose, S. et al (eds.) (2001) XML Linking Language (XLink) Version 1.0; W3C Recommendation
[6] Kelly, M. (2013) HAL - Hypertext Application Language
[7] Swiber, K. (2014) Siren: A Hypermedia Specification for Representing Entities
[8] Wilde, E. (2013) The 'profile' Link Relation Type; IETF RFC6906
[9] Amundsen, M. et al (2015) Application-Level Profile Semantics (ALPS); IETF Internet-Draft
[10] Sporny, M. et al (2014) JSON-LD 1.0; W3C Recommendation
[11] Lanthaler, M. (2013) Hydra: Hypermedia-Driven Web APIs
[12] Klabnik, S. et al (eds.) (2013) JSON API
[13] Amundsen, M. (2013) Collection+JSON - Hypermedia Type
