htmldoc api

David E. Wheeler edited this page Mar 23, 2017 · 7 revisions


  • Name: htmldoc
  • Returns: text/html; charset=utf-8
  • URI Template Variables: {dist}, {version}, {+docpath}
  • Availability: API Server

Returns an HTML representation of a documentation file in a single release of a distribution. To fetch a document, you must know the distribution name, the release version, and the path to the doc file. This data is available from the dist API and the meta API provided by an API server (but not by a mirror server). Their responses contain a docs key that looks something like:

"docs": {
   "README": {
      "title": "pair 0.1.2"
   },
   "doc/pair": {
      "abstract": "A key/value pair data type",
      "title": "pair 0.1.2"
   }
}

The keys of the docs object (README and doc/pair here) are the values to use for the {+docpath} variable. Similar data may be available under the provides key, e.g.:

"provides": {
   "pair": {
      "abstract": "A key/value pair data type",
      "docfile": "doc/",
      "docpath": "doc/pair",
      "file": "sql/pair.sql",
      "version": "0.1.2"
   }
}

Here it's the docpath key that provides the needed value.
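
As a rough illustration of the two lookups described above, here is a minimal sketch that collects candidate {+docpath} values from a decoded metadata document. It uses only the core JSON::PP module. The docpaths function name and the trimmed sample data are assumptions made for this example, not part of the API:

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Sample metadata, trimmed to the two keys discussed above.
my $meta = decode_json(<<'JSON');
{
   "docs": {
      "README":   { "title": "pair 0.1.2" },
      "doc/pair": { "abstract": "A key/value pair data type", "title": "pair 0.1.2" }
   },
   "provides": {
      "pair": {
         "abstract": "A key/value pair data type",
         "docpath": "doc/pair",
         "file": "sql/pair.sql",
         "version": "0.1.2"
      }
   }
}
JSON

# Hypothetical helper: gather doc paths from both "docs" and "provides".
sub docpaths {
    my ($meta) = @_;
    my %seen;
    # Every key of the "docs" object is a doc path.
    $seen{$_} = 1 for keys %{ $meta->{docs} || {} };
    # Entries under "provides" may carry a "docpath" key.
    for my $ext (values %{ $meta->{provides} || {} }) {
        $seen{ $ext->{docpath} } = 1 if defined $ext->{docpath};
    }
    return sort keys %seen;
}

print "$_\n" for docpaths($meta);
```

The hash removes duplicates, since the same path can appear under both keys.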

Document Structure

The documents served by this API are not full HTML documents, but partial documents. There are no <html>, <head>, or <body> elements. The root element is a <div> with the ID pgxndoc.

This element in turn contains two <div> elements. The first has the ID pgxntoc and contains a table of contents for the document. The table of contents contains links to all <h1>, <h2>, and <h3> elements in the document.

Speaking of the document, it makes up the contents of the second <div> element, which has the ID pgxnbod. The links from the table of contents are made to IDs in the corresponding <h1>, <h2>, and <h3> elements. All of these are generated by the API server from original, sanitized documentation included in the distributions. Some examples:

Here's how those same files are used on the main PGXN site by PGXN::Site:

Here's how the HTML served by the API is structured:

<div id="pgxndoc">
  <div id="pgxntoc">
    <ul class="pgxntocroot">
      <li><a href="#pair.0.1.2">pair 0.1.2</a>
        <ul>
          <li><a href="#Synopsis">Synopsis</a></li>
          <li><a href="#Description">Description</a></li>
          <li><a href="#Author">Author</a></li>
        </ul>
      </li>
    </ul>
  </div>
  <div id="pgxnbod">
<h1 id="pair.0.1.2">pair 0.1.2</h1>

<h2 id="Synopsis">Synopsis</h2>

<pre><code>% CREATE EXTENSION pair;

% SELECT pair('foo', 'bar');

% SELECT 'foo' ~&gt; 'bar';
</code></pre>

<h2 id="Description">Description</h2>

<p>This library contains a single PostgreSQL extension, a key/value pair data
type called <code>pair</code>, along with a convenience function for constructing
key/value pairs. It's just a simple thing, really: a two-value composite type
that can store any type of value in its slots, which are named "k" and "v".</p>

<h2 id="Author">Author</h2>

<p><a href="">David E. Wheeler</a></p>
  </div>
</div>
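
Because the table of contents lives in its own <div>, a client can pull it out separately, for example to render it in a sidebar. The following is a minimal sketch under simplifying assumptions: toc_entries is a hypothetical helper, and it relies on regular expressions matching the layout shown above rather than a real HTML parser, so it would miss link titles that contain inline markup.

```perl
use strict;
use warnings;

# Hypothetical helper: return [anchor, title] pairs for the links found
# inside the pgxntoc <div> of an htmldoc fragment.
sub toc_entries {
    my ($html) = @_;
    # Take everything from the TOC div up to the start of the body div.
    my ($toc) = $html =~ m{<div id="pgxntoc">(.*?)<div id="pgxnbod">}s
        or return;
    my @entries;
    while ($toc =~ m{<a href="#([^"]+)">([^<]*)</a>}g) {
        push @entries, [$1, $2];
    }
    return @entries;
}

# A shortened fragment with the structure shown above.
my $html = <<'HTML';
<div id="pgxndoc">
  <div id="pgxntoc">
    <ul class="pgxntocroot">
      <li><a href="#pair.0.1.2">pair 0.1.2</a>
        <ul>
          <li><a href="#Synopsis">Synopsis</a></li>
        </ul>
      </li>
    </ul>
  </div>
  <div id="pgxnbod">
<h1 id="pair.0.1.2">pair 0.1.2</h1>
<h2 id="Synopsis">Synopsis</h2>
  </div>
</div>
HTML

printf "%-12s %s\n", @$_ for toc_entries($html);
```

For anything beyond quick scripts, a proper HTML parser selecting on the pgxntoc ID would likely be more robust.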


Department of Sanitation

Documentation is generated by the API server by parsing any documents in a distribution that are supported by Text::Markup. The API server then sanitizes the HTML, removing insecure tags (such as <script>) and unwanted attributes (such as id and class). The document is then written out with the table of contents. The result should be safe to use on any web site. Style the document and its table of contents as you see fit with CSS; scoping your rules under #pgxndoc limits their effects to the document.
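
To make the scoping idea concrete, here is a minimal sketch that wraps a fragment in a complete page. The wrap_doc function and the specific CSS rules are illustrative assumptions; the only detail taken from the API is that every selector starts with #pgxndoc, so the rules cannot affect the rest of the site.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap an htmldoc fragment in a minimal HTML page.
# Every CSS rule is scoped under #pgxndoc.
sub wrap_doc {
    my ($title, $fragment) = @_;
    # Escape the title, since it is inserted as text rather than markup.
    my %esc = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');
    $title =~ s/([&<>"])/$esc{$1}/g;
    return <<"HTML";
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>$title</title>
<style>
#pgxndoc #pgxntoc { float: right; margin: 0 0 1em 1em; }
#pgxndoc #pgxntoc ul { list-style: none; padding-left: 1em; }
#pgxndoc #pgxnbod pre { overflow: auto; }
</style>
</head>
<body>
$fragment
</body>
</html>
HTML
}

print wrap_doc('pair 0.1.2',
    '<div id="pgxndoc"><div id="pgxntoc"></div><div id="pgxnbod"></div></div>');
```

The fragment itself is inserted unchanged, on the assumption that the API server has already sanitized it.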

Perl Example

Assuming you have retrieved the JSON document from the index API and decoded it into the hash reference $table, you can fetch and print the docs/html/mpq documentation in distribution "pgmp" version 1.0.0b3 from an API server like so:

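use v5.10;    # enables say
use URI::Template;
use HTTP::Tiny;

# Expand the htmldoc template from the index with the document's coordinates.
my $tmpl = URI::Template->new($table->{htmldoc});
my $uri = $tmpl->process({
    dist    => 'pgmp',
    version => '1.0.0b3',
    docpath => 'docs/html/mpq',
});

# Concatenation stringifies the URI object returned by process(). Prepend
# the API server's base URL if the template expands to a relative path.
my $req  = HTTP::Tiny->new;
my $res  = $req->get('' . $uri);
die "Request failed: $res->{status} $res->{reason}\n" unless $res->{success};
say $res->{content};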
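
In a longer program you may want to separate response checking from the request itself. The sketch below shows one way to do that with the response hash that HTTP::Tiny returns (success, status, reason, and content keys). The doc_content name is an assumption for this example, and the hand-built hashes stand in for real responses so the code runs without network access.

```perl
use strict;
use warnings;

# Hypothetical helper: return the document body from an HTTP::Tiny
# response hash, or die with a message describing what went wrong.
sub doc_content {
    my ($res) = @_;
    return $res->{content} if $res->{success};
    # HTTP::Tiny reports internal errors (DNS failures, timeouts) as
    # status 599, with the error text in the content field.
    die "Request failed: $res->{content}\n" if $res->{status} == 599;
    die "Unexpected response: $res->{status} $res->{reason}\n";
}

# Exercise the helper with hashes shaped like HTTP::Tiny responses.
print doc_content({
    success => 1, status => 200, reason => 'OK',
    content => qq{<div id="pgxndoc"></div>\n},
});

my $ok = eval {
    doc_content({ success => '', status => 500, reason => 'Oops', content => '' });
    1;
};
print "error: $@" unless $ok;
```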