
How to (and how not to) design REST APIs

Jeff Schnitzer · Oct 30, 2023

I have grown weary of seeing the same mistakes repeated in REST APIs - even new ones - so I thought it might be nice to write down a set of best practices. And poke fun at a couple of widely-used APIs. Much of this should be "duh", but there might be a few rules you haven't thought of yet.

Rule #1: DO use plural nouns for collections

It's an arbitrary convention, but it's well-established and I've found violations tend to be a leading indicator of "this API will have rough edges".

# GOOD
GET /products   # get all the products
GET /products/{product_id} # get one product

# BAD
GET /product/{product_id}

Rule #2: DON'T add unnecessary path segments

A common mistake seems to be trying to build your relational model into your URL structure. Etsy's new API is full of this kind of thing:

# GOOD
GET /v3/application/listings/{listing_id}

# BAD
PATCH /v3/application/shops/{shop_id}/listings/{listing_id}
GET /v3/application/shops/{shop_id}/listings/{listing_id}/properties
PUT /v3/application/shops/{shop_id}/listings/{listing_id}/properties/{property_id}

The {listing_id} is globally unique; there's no reason for {shop_id} to be part of the URL. Besides irritating your developers with extra clutter, it inevitably causes problems when your "invariant" changes down the road - say, a listing moves to a different shop or can be listed in multiple shops.

I've seen this mistake repeated over and over; I can only assume it's a manifestation of someone's OCD:

GET /shop/{shop_id}/listings              # normal, expected
GET /shop/{shop_id}/listings/{listing_id} # pretty when lined up like this, but otherwise dumb
GET /listings/{listing_id}                # would have been a much better endpoint

Which is not to say that compound URLs don't make sense - use them when you genuinely have compound keys.

# When {option_id} is not globally unique
GET /listings/{listing_id}/options/{option_id}

Rule #3: DON'T add .json or other extensions to the URL

This seems to have been some sort of default behavior of Rails, so it shows up intermittently in public APIs. Shopify gets shame here.

  • URLs are resource identifiers, not representations. Adding representation information to the URL means there's no canonical URL for a 'thing'. Clients may have trouble uniquely identifying 'things' by URL.
  • "JSON" is not even a complete specification of the representation. What transfer encoding, for example?
  • HTTP already offers headers (Accept, Accept-Charset, Accept-Encoding, Accept-Language) to negotiate representations.
  • Putting stock text at the end of URLs annoys the people writing clients.
  • JSON should be the default anyway.

Back in the 2000s there might have been some question about whether clients want JSON or XML, but here in the 2020s it has been settled. Return JSON, and if clients want to negotiate for something else, rely on the standard HTTP headers.
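
From the client side, that negotiation is just a header. Here's a minimal sketch using Python's requests library (the host and path are hypothetical):

import requests

# The URL identifies the resource; the Accept headers negotiate the representation.
resp = requests.get(
    "https://api.example.com/products/prod_123",
    headers={"Accept": "application/json", "Accept-Language": "en"},
)
product = resp.json()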

Rule #4: DON'T return arrays as top level responses

The top level response from an endpoint should always be an object, never an array.

# GOOD
GET /things returns:
{ "data": [{ ...thing1...}, { ...thing2...}] }

# BAD
GET /things returns:
[{ ...thing1...}, { ...thing2...}]

The problem is that it's very hard to make backwards compatible changes when you return arrays. Objects let you make additive changes.

The most common evolution here is adding pagination. You can always add totalCount or hasMore fields and old clients will continue to work. If your endpoint returns a top-level array, you will need a whole new endpoint.
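
To sketch that evolution (Flask, with a hypothetical /things endpoint and an in-memory list standing in for a real datastore), the object envelope lets pagination fields ride along without breaking existing clients:

from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical in-memory data standing in for a real datastore.
THINGS = [{"id": str(i), "name": f"thing {i}"} for i in range(1, 51)]

@app.get("/things")
def list_things():
    limit = int(request.args.get("limit", 10))
    offset = int(request.args.get("offset", 0))
    page = THINGS[offset:offset + limit]
    # Because the top level is an object, these fields can be added later
    # without breaking clients that only read "data".
    return jsonify({
        "data": page,
        "totalCount": len(THINGS),
        "hasMore": offset + limit < len(THINGS),
    })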

Rule #5: DON'T return map structures

I often see map structures used for collections in JSON responses. Return an array of objects instead.

# BAD
GET /things returns:
{
    "KEY1": { "id": "KEY1", "foo": "bar" },
    "KEY2": { "id": "KEY2", "foo": "baz" },
    "KEY3": { "id": "KEY3", "foo": "bat" }
}

# GOOD (also note application of Rule #4)
GET /things returns:
{
    "data": [
        { "id": "KEY1", "foo": "bar" },
        { "id": "KEY2", "foo": "baz" },
        { "id": "KEY3", "foo": "bat" }
    ]   
}

Map structures in JSON are bad:

  • The key information is redundant and adds noise to the wire
  • Unnecessary dynamic keys create headaches for people working in typed languages
  • Whatever you think a "natural" key is can change, or clients may want a different grouping

Converting an array of objects to a map is a one-liner in most languages. If your client wants efficient random-access to the collection of objects, they can create that structure. You don't need to put that on the wire.
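
For instance, assuming a hypothetical endpoint shaped like the GOOD example above, rebuilding the lookup map on the client is one line of Python:

import requests

# Hypothetical endpoint returning {"data": [{"id": ...}, ...]} per Rule #4.
things = requests.get("https://api.example.com/things").json()["data"]

# Client-side random access, built in one line:
things_by_id = {thing["id"]: thing for thing in things}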

The worst thing about returning map structures is that your conceptual keys may change over time, and the only way to migrate is to break backwards compatibility. OpenAPI is a cautionary tale - the move from v3 to the proposed v4 is full of unnecessary breaking changes because the spec leans heavily on map structures instead of arrays.

# OpenAPI v3 structure
{
    "paths": {
        "/speakers": {
            "post": { ...information about the endpoint...}
        }
    }
}

# Proposed OpenAPI v4 structure, which names requests by adding a new 
# map layer (eg "createSpeaker").
{
    "paths": {
        "speakers": {
            "requests": {
                "createSpeaker": {
                    "method": "post",
                    ...rest of the endpoint info...
                }
            }
        }
    }
}

If this were a flatter list structure, adding a name to an object would be a nonbreaking change:

# Hypothetical flat array structure
{
    "requests": [
        {
            name: "createSpeaker"
            path: "speakers",
            method: "post",
            ...etc...
        }
    ]
}

Exception to the no-map rule

The exception to the no-map rule is simple key/value pairs, like Stripe's metadata.

# OK
{ 
    "key1": "value1",
    "key2": "value2"
}

Nobody will fault you for this structure. But if the values are more than simple strings, prefer arrays of objects instead.

Rule #6: DO use strings for all identifiers

Always use strings for object identifiers, even if your internal representation (i.e. the database column type) is numeric. Just stringify the number.

# BAD
{ "id": 123 }

# GOOD
{ "id": "123" }

A great API will outlast you, your implementation code, and the company that created it. In that time your infrastructure might be rewritten on a different technology platform, migrated to a new database, or merged with another database that contains conflicting IDs.

String IDs are incredibly flexible. Strings can encode version information or segment ID ranges. Strings can encode composite keys. Numeric IDs put a straitjacket on future developers.
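
As one illustration of that flexibility, a composite key can be packed into a single opaque string ID. This is only a sketch; the delimiter and field names are made up:

# Hypothetical composite key: an option that is only unique within its listing.
def encode_option_id(listing_id: str, option_id: str) -> str:
    # Clients treat the result as an opaque string and never parse it.
    return f"{listing_id}:{option_id}"

def decode_option_id(public_id: str) -> tuple[str, str]:
    listing_id, option_id = public_id.split(":", 1)
    return listing_id, option_id

assert decode_option_id(encode_option_id("lst_42", "opt_7")) == ("lst_42", "opt_7")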

I once worked on a system that (because of a database merge) had to segment numeric ID ranges by giving one group positive IDs, the other negative IDs. Aside from the general ugliness, you can only do this segmentation once.

As a bonus, if all your ID fields are strings, client developers working in typed languages don't need to think about which type to use. Just use strings!

Rule #7: DO prefix your identifiers

If your application is at all complicated, you'll end up with a lot of different object types. Keeping opaque IDs straight is a mental challenge for both you and your client developers. You can dramatically improve the ergonomics of your API by making different types of IDs self-describing.

  • Stripe's identifiers have short alphabetic prefixes followed by an underscore: in_1MVpWEJVZPfyS2HyRgVDkwiZ
  • Shopify's GraphQL identifiers look like URLs (though their REST API IDs are numeric, boo): gid://shopify/FulfillmentOrder/1469358604360

It doesn't matter what format you use, as long as 1) they're visually distinct and 2) they don't change.

Everyone will appreciate the reduced support load when you can instantly tell the difference between an "order line item ID", a "fulfillment order line item ID", and an "invoice item line item ID".
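
Minting prefixed IDs is cheap. A minimal sketch in Python (the prefixes and token length here are made up):

import secrets

# Hypothetical prefixes - the only requirements are that they're visually
# distinct and that they never change once published.
PREFIXES = {"order": "ord", "invoice": "inv", "fulfillment_order": "ful"}

def new_id(kind: str) -> str:
    # e.g. "ord_3f9a1c0b2d4e..." - a stable prefix plus a random hex token
    return f"{PREFIXES[kind]}_{secrets.token_hex(12)}"

print(new_id("order"))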

Rule #8: DON'T use 404 to indicate "not found"

The HTTP spec says you should use 404 to indicate that a resource was not found. A literal interpretation suggests you should return 404 for GET/PUT/DELETE/etc requests to an ID that does not exist. Please do not do this - hear me out.

When calling (say) GET /things/{thing_id} for a thing that doesn't exist, the response should indicate that 1) the server understood your request, and 2) the thing wasn't found. Unfortunately, a 404 response does not guarantee #1. There are many layers of software that can return 404 to a request, some of which you may have no control over:

  • Misconfigured client hitting the wrong URL
  • Misconfigured proxies (client end and server end)
  • Misconfigured load balancers
  • Misconfigured routing tables in the server application

Returning HTTP 404 for "thing not found" is almost like returning HTTP 500 - it could mean the thing doesn't exist, or it could mean something went wrong; the client cannot be sure which.

This is not a minor problem. One of the hardest things about distributed systems is maintaining consistency. Let's say you want to delete a resource from two systems (Alpha and Bravo) and all you have is a simple REST API (no two-phase-commit):

  1. In a single database transaction, SystemAlpha deletes Thing123 and enqueues a NotifyBravo job
  2. The NotifyBravo job runs, calling DELETE /things/Thing123 on SystemBravo

This works because the queue will retry jobs until success. But it may also retry jobs that have succeeded; queues are at-least-once, not exactly-once.

Since successfully executed DELETE jobs may be retried anyway, jobs must treat the "not found" response as success. If you treat 404 as success, and a failure in your stack returns 404, your job will be removed from the queue and your delete will not propagate. I have seen this happen in real life.
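
Here's a sketch of that queue job in Python, assuming a hypothetical SystemBravo base URL and a server that returns 410 GONE (see below) for a thing it doesn't have:

import requests

def notify_bravo(thing_id: str) -> None:
    """Queue job: propagate a delete to SystemBravo. Raising makes the queue retry."""
    resp = requests.delete(f"https://bravo.example.com/things/{thing_id}")

    # 2xx: deleted just now. 410: already gone (perhaps an earlier attempt
    # succeeded). Either way the delete has propagated, so the job can complete.
    if resp.ok or resp.status_code == 410:
        return

    # Anything else - including a 404 from a misconfigured proxy, load balancer,
    # or routing table - is ambiguous, so raise and let the queue retry.
    raise RuntimeError(f"DELETE /things/{thing_id} failed with {resp.status_code}")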

You could simply have DELETE return 200 OK (or 204 No Content) when deleting a nonexistent thing - it makes sense, and I think it's an acceptable answer for DELETE. But some analogue of this issue exists for GET, PUT, PATCH, and other methods.

My advice is to pick another 400-level error code that clients can interpret as "I understand what you're asking for, but I don't have it". I use 410 GONE. This diverges slightly from the original intended meaning of 410 ("it existed before, but it doesn't now"), but hardly anyone uses 410 for that purpose, it's reasonably self-explanatory, and, unlike inventing your own 4XX code, there's no risk that a future HTTP spec will assign your number a conflicting meaning.
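
On the server side, a sketch of that convention (Flask, with a hypothetical in-memory store standing in for a real database):

from flask import Flask, jsonify

app = Flask(__name__)

# Hypothetical in-memory store standing in for a real database.
THINGS = {"th_123": {"id": "th_123", "name": "widget"}}

@app.get("/things/<thing_id>")
def get_thing(thing_id):
    thing = THINGS.get(thing_id)
    if thing is None:
        # 410 says "I understood the request; the thing isn't here" - distinct
        # from the 404s produced by proxies, load balancers, and bad routes.
        return jsonify({"message": f"No such thing: {thing_id}", "type": "Gone"}), 410
    return jsonify(thing)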

But almost any strategy is better than returning 404 for entity not found.

Rule #9: BE consistent

Mostly this section is here so that I can poke fun at Shopify. There are 6 subtly different schemas for an Address in their REST API:

  • DraftOrder has most of the address fields you would expect, including name, first_name, last_name, province, province_code, country, country_code
  • Customer Address adds country_name
  • Order billing_address adds latitude, longitude but shipping_address does not
  • Checkout billing_address and shipping_address have no name (but still have first_name, last_name)
  • AssignedFulfillmentOrder destination is missing name, province_code, country_code (but still has first_name, last_name, and the full country and province names)
  • Location has name but not first_name or last_name - at least this one makes sense

This is maddening. It feels like someone at Shopify is toying with us: "Simon says there's a country field. Simon says there's a country_name field. There's a country_code field. Hahaha, null pointer exception for you!"

Rule #10: DO use a structured error format

If you're building the backend for a simple website you can probably ignore this section. But if you're building a large system with multiple layers of REST services, you can save yourself a lot of headache by establishing a standard error format up front.

My error formats tend to look something like this, roughly shaped like a (Java) exception:

{
  "message": "You do not have permission to access this resource",
  "type": "Unauthorized",
  "types": ["Unauthorized", "Security"],
  "cause": { ...recurse for nested any exceptions... }
}

The standard error format (with a nested cause) means that you can wrap and rethrow errors multiple layers deep:

ServiceAlpha -> ServiceBravo -> ServiceCharlie -> ServiceDelta

If ServiceDelta raises an error, ServiceAlpha can return (or log) the complete chain, including the root cause. This is much easier to debug than combing through the logs on four different systems - even with centralized logging.
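
Here's a sketch of how that chain can be built in Python. The field names match the example above, but this serialization is just one way to do it:

def error_body(exc: BaseException) -> dict:
    """Serialize an exception chain into the error shape above."""
    body = {
        "message": str(exc),
        "type": type(exc).__name__,
        "types": [cls.__name__ for cls in type(exc).__mro__ if cls is not object],
    }
    cause = exc.__cause__ or exc.__context__
    if cause is not None:
        body["cause"] = error_body(cause)  # recurse for nested causes
    return body

# ServiceBravo wrapping and rethrowing an error that originated deeper in the chain:
try:
    try:
        raise PermissionError("You do not have permission to access this resource")
    except PermissionError as inner:
        raise RuntimeError("ServiceCharlie request failed") from inner
except RuntimeError as outer:
    print(error_body(outer))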

Rule #11: DO provide idempotence mechanisms

I've had this exact conversation multiple times with various service providers (including ones that print and ship physical goods!):

Jeff: How can I ensure that I don't submit duplicate orders?

Service's tech team: Can't you just only submit the order once?

No, no I cannot.

I don't want to dwell too much on the 'why' of this; most people who read this far probably already understand the problem. The quick example I always send back is this one:

  1. I submit the order
  2. The network fails and I get a timeout instead of 200 OK
  3. I don't know if the order succeeded or failed

There are other failure modes, but the upshot is that without some help from the API provider, there's no way to guarantee "once and only once" behavior. This is the Two Generals' Problem again.

There are generally two good ways and one crappy way to support this. The good ways:

  1. Let the client submit an "idempotency key" or "customer reference key" with each create request and enforce the uniqueness of this key. Maybe you store it in the database with a unique constraint; maybe you put it in ephemeral storage and only guarantee uniqueness for a time period. The latter is how Stripe works.
  2. Let the client pick IDs! From an implementation perspective, this is much like #1. You may have an internal ID that the client never sees, but you keep a unique constraint on {client_id, client_thing_id}. This style tends to produce ergonomic APIs, at the cost of a little implementation complexity.

The crappy way to support idempotence:

  1. Provide an endpoint to list recent transactions. Prior to every submission, clients must look through recent transactions and make sure that the create hasn't already been done. This is crappy because it requires create operations to be serialized (lest you introduce race conditions), it's slow, and maintaining an adequate safety window (duration) in a busy system may require fetching excessive amounts of data.

Now that your API offers an idempotence mechanism, there's one more major consideration: How do you inform the client that there's a conflict? There are two schools of thought:

  1. Return an error. When a client submits a duplicate idempotency key, I like to return 409 CONFLICT. There is one trick here - unless you're using user-submitted IDs (strategy #2 above), you need to include the "real" ID in the error message or otherwise provide a mechanism to look up the ID by idempotency key. When the client gets a 409 CONFLICT response, it says "oh, already done" and records the created ID.
  2. Return the same response that the client should have gotten the first time. This allows clients to be a little dumber - they don't have to explicitly code up a CONFLICT error handler - but it significantly complicates server implementation. You need to store all responses for a period of time and you need to validate that the client sent the exact same parameters with each request. Stripe chose this route.

TL;DR: There are a few ways of enabling idempotent behavior, but you won't go wrong with the recipe below:

  • Have the client submit an idempotency key (aka "customer reference ID") with each POST/create operation
  • Store it in the database with a unique constraint on {client_id, idempotency_key}
  • Return 409 CONFLICT when you violate the unique constraint
  • Provide the original ID in the 409 response body
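
Here's a minimal sketch of that recipe (Flask plus SQLite; the table layout, header names, and ID format are all assumptions):

import secrets
import sqlite3

from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical schema - the unique constraint is what enforces idempotence.
db = sqlite3.connect(":memory:", check_same_thread=False)
db.execute("""
    CREATE TABLE orders (
        id TEXT PRIMARY KEY,
        client_id TEXT NOT NULL,
        idempotency_key TEXT NOT NULL,
        body TEXT NOT NULL,
        UNIQUE (client_id, idempotency_key)
    )
""")

@app.post("/orders")
def create_order():
    client_id = request.headers["X-Client-Id"]    # assumed auth/identity header
    key = request.headers["Idempotency-Key"]      # client-chosen idempotency key
    order_id = f"ord_{secrets.token_hex(12)}"
    try:
        db.execute(
            "INSERT INTO orders (id, client_id, idempotency_key, body) VALUES (?, ?, ?, ?)",
            (order_id, client_id, key, request.get_data(as_text=True)),
        )
        db.commit()
    except sqlite3.IntegrityError:
        # Duplicate submission: 409 CONFLICT with the originally created ID.
        row = db.execute(
            "SELECT id FROM orders WHERE client_id = ? AND idempotency_key = ?",
            (client_id, key),
        ).fetchone()
        return jsonify({"type": "Conflict",
                        "message": "Duplicate idempotency key",
                        "id": row[0]}), 409
    return jsonify({"id": order_id}), 201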

Conclusion

What rules am I missing? Let me know on Hacker News (link TBD)