Index Format

Eugene Lazutkin edited this page Jun 13, 2026 · 2 revisions

A wiki-search index is a single, self-describing JSON document. A client assumes nothing beyond this contract: it validates the version and required fields, then builds result links purely from the index's own metadata. Any site that emits this shape is searchable.

This is the human-facing summary; the repo's INDEX-FORMAT.md is the source of truth.

Shape

{
  "v": 1,
  "site": {
    "name": "wiki-search wiki",
    "urlTemplate": "https://github.com/uhop/wiki-search/wiki/{page}",
    "fragments": true
  },
  "docs": [
    { "id": 0, "page": "Architecture", "title": "Architecture",
      "heading": "The pieces", "anchor": "the-pieces", "text": "full section text…" }
  ]
}

Fields

  • v — format version (1). Clients reject versions they don't understand.
  • site.name — human label for the corpus.
  • site.urlTemplate — result-URL template; must contain {page}. No hardcoded host.
  • site.fragments — true if the target renders Text Fragments; when false, clients omit :~:text=.
  • docs[].id — stable integer, sequential in build order.
  • docs[].page — the {page} substitution (for GitHub wikis, the page's URL segment, Foo-Bar). May also be a relative URL path (e.g. ../blob/main/README.md) to point a result at a file outside the wiki — see Folding in non-wiki files.
  • docs[].title — page display title.
  • docs[].heading — section heading (falls back to the page title for the page top).
  • docs[].anchor — in-page anchor slug; "" means the page top.
  • docs[].text — plain-text section body (markdown stripped), for the engine.

Building a result URL

base = urlTemplate.replace("{page}", encodePerSegment(doc.page))
hash = doc.anchor                          (omit if empty)
text = ":~:text=" + matchedPhrase          (only if site.fragments and a phrase)
url  = base + ("#" + hash + text  if any)

encodePerSegment runs encodeURIComponent on each /-separated segment and rejoins with /. For an ordinary single-segment page it's identical to encodeURIComponent; for a relative page it keeps the / and .. literal so the browser can normalize the dot-segments (see below).
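
The recipe above can be sketched in JavaScript as follows. This is a minimal illustration, not the project's actual client code: buildResultUrl and encodePhrase are hypothetical names, and percent-encoding the matched phrase (including "-", which separates prefix and suffix in the Text Fragments syntax) is an assumption that the pseudocode leaves unspecified.

```javascript
// Encode each /-separated segment on its own, so "/" and ".." stay literal.
function encodePerSegment(page) {
  return page
    .split("/")
    .map((segment) => encodeURIComponent(segment))
    .join("/");
}

// Assumption: the phrase is percent-encoded before it goes into the directive.
// encodeURIComponent already covers "&" and ","; "-" is escaped separately.
function encodePhrase(phrase) {
  return encodeURIComponent(phrase).replace(/-/g, "%2D");
}

// Build a result URL from the index's site metadata and one docs[] entry.
// matchedPhrase is optional; how it is picked is up to the client.
function buildResultUrl(site, doc, matchedPhrase) {
  // A replacer function keeps "$" sequences in the page from being interpreted.
  const base = site.urlTemplate.replace("{page}", () => encodePerSegment(doc.page));
  const hash = doc.anchor || "";
  const text =
    site.fragments && matchedPhrase ? ":~:text=" + encodePhrase(matchedPhrase) : "";
  // Append "#" only when there is an anchor or a text directive to carry.
  return hash || text ? base + "#" + hash + text : base;
}
```

With the site and docs entry from the Shape example, buildResultUrl(site, doc) would give https://github.com/uhop/wiki-search/wiki/Architecture#the-pieces, and passing a phrase appends the :~:text= directive after the anchor.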

Folding in non-wiki files

A repo file (e.g. a README.md that doubles as API docs) can be indexed alongside the wiki pages by giving its docs a relative page that climbs out of the wiki path:

{ "id": 12, "page": "../blob/main/README.md", "title": "my-tool",
  "heading": "Usage", "anchor": "usage", "text": "" }

Against the canonical …/wiki/{page} template this becomes …/wiki/../blob/main/README.md, which the browser normalizes to …/blob/main/README.md on navigation (the .. drops the wiki segment before the request is sent). Anchors and :~:text= apply just as they do for wiki pages. It's a pure page-value convention — no new fields, so the index stays v: 1. The builder emits these via --file <path> (see Add Search).
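
The dot-segment normalization can be checked with the standard URL class, which applies the same path rules a browser uses when it navigates. This is only a sketch for inspecting the effect: normalizeResultUrl is a hypothetical helper, and the template is the one from the Shape example. A client does not need to normalize anything itself.

```javascript
// Parse the raw result URL; the URL parser resolves ".." against the path.
function normalizeResultUrl(rawUrl) {
  return new URL(rawUrl).href;
}

const raw = "https://github.com/uhop/wiki-search/wiki/../blob/main/README.md#usage";
console.log(normalizeResultUrl(raw));
// The ".." removes the "wiki" segment, so the path becomes
// /uhop/wiki-search/blob/main/README.md and the "#usage" anchor is kept.
```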

Validation (verify-or-explain)

The client runs the checks below and, on any failure, shows a specific message rather than a blank box:

  1. the index is fetchable (else 404 / network / no CORS);
  2. it is valid JSON;
  3. v is supported (else "format vN unsupported — app or index out of date");
  4. site.urlTemplate is present and contains {page};
  5. docs is a non-empty array, each entry having page, title, text.
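
Checks 2 to 5 could be sketched as a single function over the fetched text. This is an illustration under stated assumptions, not the project's implementation: validateIndexText and SUPPORTED_VERSIONS are hypothetical names, the messages other than the version one are invented for the example, and "having page, title, text" is read here as "each is a string" (an empty text still passes, as in the non-wiki example). Check 1, fetching, is left to the caller.

```javascript
// Assumption: this client understands only format version 1.
const SUPPORTED_VERSIONS = new Set([1]);

// Returns null when the index passes, otherwise a message to show the user.
function validateIndexText(text) {
  let index;
  try {
    index = JSON.parse(text); // check 2: valid JSON
  } catch {
    return "index is not valid JSON";
  }
  if (index === null || typeof index !== "object") {
    return "index is not a JSON object";
  }
  // check 3: supported version (message wording taken from the list above)
  if (!SUPPORTED_VERSIONS.has(index.v)) {
    return `format v${index.v} unsupported — app or index out of date`;
  }
  // check 4: urlTemplate present and containing {page}
  const template = index.site && index.site.urlTemplate;
  if (typeof template !== "string" || !template.includes("{page}")) {
    return "site.urlTemplate is missing or lacks {page}";
  }
  // check 5: non-empty docs array whose entries carry page, title, text
  if (!Array.isArray(index.docs) || index.docs.length === 0) {
    return "docs must be a non-empty array";
  }
  const bad = index.docs.findIndex(
    (d) => !d || ["page", "title", "text"].some((key) => typeof d[key] !== "string")
  );
  if (bad !== -1) {
    return `docs[${bad}] lacks page, title, or text`;
  }
  return null; // unknown extra fields are ignored
}
```

Returning a message instead of throwing keeps the caller's job simple: a non-null result goes straight into the UI in place of the search box.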

Versioning

v increases only on a breaking change. Additive optional fields don't bump v; clients ignore unknown fields, and a client meeting a higher v than it knows stops and says so rather than guessing.
