Skip to content
API Standards for DOL
Branch: master
Clone or download
Pull request Compare This branch is 48 commits ahead, 4 commits behind WhiteHouse:master.
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
README.md

README.md

DOL API Standards

This document captures DOL's view of API best practices and standards as can be applied to the DOL-wide API. We will incorporate these into our work.

Design for common use cases

DOL's API is designed with varying uses in mind:

  • Bulk data. Clients often wish to establish their own copy of the API's dataset in its entirety. For example, someone might like to build their own search engine on top of the dataset, using different parameters and technology than the "official" API allows. If the API can't easily act as a bulk data provider, provide a separate mechanism for acquiring the backing dataset in bulk.
  • Staying up to date. Especially for large datasets, clients may want to keep their dataset up to date without downloading the data set after every change. If this is a use case for the API, prioritize it in the design.
  • Driving expensive actions. What would happen if a client wanted to automatically send text messages to thousands of people or light up the side of a skyscraper every time a new record appears? Consider whether the API's records will always be in a reliable unchanging order, and whether they tend to appear in clumps or in a steady stream. Generally speaking, consider the "entropy" an API client would experience.\

Using one's own API (dogfooding)

The #1 best way to understand and address the weaknesses in an API's design and implementation is to use it in a production system.

Whenever feasible, design applications to integrate with the API.

Point of contact

Have an obvious mechanism for clients to report issues and ask questions about the API.

When using GitHub for an API's code, use the associated issue tracker. In addition, publish an email address for direct, non-public inquiries.

Notifications of updates

Have a simple mechanism for clients to follow changes to the API.

Common ways to do this include a mailing list, or a dedicated developer blog with an RSS feed.

API Endpoints

An "endpoint" is a combination of two things:

  • The verb (e.g. GET or POST)
  • The URL path (e.g. /posts)

Information can be passed to an endpoint in either of two ways:

  • The URL query string (e.g. ?year=2014)
  • HTTP headers (e.g. X-Api-Key: my-key)

When people say "RESTful" nowadays, they really mean designing simple, intuitive endpoints that represent unique functions in the API.

Generally speaking:

  • Avoid single-endpoint APIs. Don't jam multiple operations into the same endpoint with the same HTTP verb.
  • Prioritize simplicity. It should be easy to guess what an endpoint does by looking at the URL and HTTP verb, without needing to see a query string.
  • Endpoint URLs should advertise resources, and avoid verbs.

Some examples of these principles in action:

JSON As the Default

JSON is an excellent, widely supported transport format, suitable for many web APIs.

Supporting JSON as the default will support developers' needs in this day and age. XML and CSV will be optional formats for those with legacy code or who want bulk downloads.

General JSON guidelines:

  • Responses should be a JSON object (not an array). Using an array to return results limits the ability to include metadata about results, and limits the API's ability to add additional top-level keys in the future.
  • Don't use unpredictable keys. Parsing a JSON response where keys are unpredictable (e.g. derived from data) is difficult, and adds friction for clients.
  • Use under_score case for keys. Different languages use different case conventions. JSON uses under_score, not camelCase.

Use a consistent date format

And specifically, use ISO 8601, in UTC.

For just dates, that looks like 2013-02-27. For full times, that's of the form 2013-02-27T10:00:00Z.

This date format is used all over the web, and puts each field in consistent order -- from least granular to most granular.

API Keys

These standards do take a position on whether or not to use API keys. DOL uses them to provide detailed metrics to the API management team and senior leadership as well as to the developers themselves. API keys will be required and will be sent through the header.

In the future, the API may allow some sort of very limited unauthenticated access, without keys to allow newcomers to use and experiment with the API in demo environments and with simple curl/wget/etc. requests.

Error handling

Handle all errors (including otherwise uncaught exceptions) and return a data structure in the same format as the rest of the API.

For example, a JSON API might provide the following when an uncaught exception occurs:

{
  "message": "Description of the error.",
  "exception": "[detailed stacktrace]"
}

HTTP error responses should conform to the HTTP spec.

HTTP responses with error details should use a 4XX status code to indicate a client-side failure (such as invalid authorization, or an invalid parameter), and a 5XX status code to indicate server-side failure (such as an uncaught exception).

Transaction Logging

All API requests will be logged to provide the API management team and individual developers detailed metrics. Nothing that can identify a user of an app that consumes the API will be logged.

Pagination

DOL will follow a "no pagination first" policy. If pagination is required by the user to navigate datasets or it's necessary due to technical contstraints on the part of a data source, use the method that makes the most sense for the API's data.

Parameters

Common patterns:

  • page and per_page. Intuitive for many use cases. Links to "page 2" may not always contain the same data.
  • offset and limit. This standard comes from the SQL database world, and is a good option when you need stable permalinks to result sets.
  • since and limit. Get everything "since" some ID or timestamp. Useful when it's a priority to let clients efficiently stay "in sync" with data. Generally requires result set order to be very stable.

Metadata

Include enough metadata so that clients can calculate how much data there is, and how and whether to fetch the next set of results.

Example of how that might be implemented:

{
  "results": [ ... actual results ... ],
  "pagination": {
    "count": 2340,
    "page": 4,
    "per_page": 20
  }
}

Always use HTTPS

Any new API should use and require HTTPS encryption (using TLS/SSL) with 2048 bit encryption at a minimum. HTTPS provides:

  • Security. The contents of the request are encrypted across the Internet.
  • Authenticity. A stronger guarantee that a client is communicating with the real API.
  • Privacy. Enhanced privacy for apps and users using the API. HTTP headers and query string parameters (among other things) will be encrypted.
  • Compatibility. Broader client-side compatibility. For CORS requests to the API to work on HTTPS websites -- to not be blocked as mixed content -- those requests must be over HTTPS.

HTTPS should be configured using modern best practices, including ciphers that support forward secrecy, and HTTP Strict Transport Security. This is not exhaustive: use tools like SSL Labs to evaluate an API's HTTPS configuration.

For an existing API that runs over plain HTTP, the first step is to add HTTPS support, and update the documentation to declare it the default, use it in examples, etc.

Then, evaluate the viability of disabling or redirecting plain HTTP requests. See GSA/api.data.gov#34 for a discussion of some of the issues involved with transitioning from HTTP->HTTPS.

Server Name Indication

If you can, use Server Name Indication (SNI) to serve HTTPS requests.

SNI is an extension to TLS, first proposed in 2003, that allows SSL certificates for multiple domains to be served from a single IP address.

Using one IP address to host multiple HTTPS-enabled domains can significantly lower costs and complexity in server hosting and administration. This is especially true as IPv4 addresses become more rare and costly. SNI is a Good Idea, and it is widely supported.

However, some clients and networks still do not properly support SNI. As of this writing, that includes:

  • Internet Explorer 8 and below on Windows XP
  • Android 2.3 (Gingerbread) and below.
  • All versions of Python 2.x (a version of Python 2.x with SNI is planned).
  • Some enterprise network environments have been configured in some custom way that disables or interferes with SNI support. One identified network where this is the case: the White House.

When implementing SSL support for an API, evaluate whether SNI support makes sense for the audience it serves.

Use UTF-8

Just use UTF-8.

Expect accented characters or "smart quotes" in API output, even if they're not expected.

An API should tell clients to expect UTF-8 by including a charset notation in the Content-Type header for responses.

An API that returns JSON should use:

Content-Type: application/json; charset=utf-8

CORS

For clients to be able to use an API from inside web browsers, the API must enable CORS.

For the simplest and most common use case, where the entire API should be accessible from inside the browser, enabling CORS is as simple as including this HTTP header in all responses:

Access-Control-Allow-Origin: *

It's supported by every modern browser, and will Just Work in many JavaScript clients, like jQuery.

For more advanced configuration, see the W3C spec or Mozilla's guide.

What about JSONP?

JSONP is not secure or performant. If IE8 or IE9 must be supported, use Microsoft's XDomainRequest object instead of JSONP. There are libraries to help with this.

You can’t perform that action at this time.