usepennington/dewey

DeweySearch

A standalone, dependency-free static search index. Build a sharded, Pagefind-style inverted index at build time in C#; query it entirely client-side with a tiny JavaScript client that fetches only the shards a query touches. No server, no service, no third-party runtime.

DeweySearch is the search engine extracted from Pennington, with zero ties to any content model — feed it documents, get back JSON artifacts.

How it works

documents ──▶ IndexBuilder.Build() ──▶ SearchIndex ──▶ ToFiles()
                                                          │
                                   index.json, t-{prefix}.json, f-{docId}.json
                                                          │
                                                          ▼
                                   DeweySearchEngine (dewey-search.js)  ◀── browser query
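
The t-{prefix}.json naming in the diagram suggests how a query can touch only a few files. The sketch below is an illustration, not the shipped client: it assumes a shard is named after the first ShardPrefixLength characters of an already tokenized term, and shardFilesFor is a hypothetical helper name.

```javascript
// Hypothetical helper: map query terms to the shard files they would need.
// Assumption: a shard is keyed by the first `prefixLength` characters of a term;
// terms shorter than the prefix length use the whole term.
function shardFilesFor(terms, prefixLength = 2) {
  const files = new Set();
  for (const term of terms) {
    if (term.length === 0) continue; // nothing to look up
    files.add(`t-${term.slice(0, prefixLength)}.json`);
  }
  return [...files].sort();
}

// "routing" and "routes" share the "ro" prefix, so they share one shard.
console.log(shardFilesFor(['routing', 'routes', 'wildcard']));
// → [ 't-ro.json', 't-wi.json' ]
```

Under this assumption, a longer prefix gives more, smaller shards, and a shorter prefix gives fewer, larger ones.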

The C# builder and the JS client share a byte-for-byte tokenizer/stemmer contract so that build-time index keys and client-time query terms always agree. Both are pinned by the shared fixtures under conformance/.
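
A fixture-driven check of such a contract might look like the sketch below. Everything in it is an assumption made for illustration: the fixture shape ({ input, tokens }), the failingFixtures helper, and the toy tokenizer, which is not DeweySearch's tokenizer or stemmer.

```javascript
// Toy tokenizer for illustration only; NOT DeweySearch's actual tokenizer/stemmer.
function toyTokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Hypothetical fixture shape; the real files under conformance/ may differ.
const fixtures = [
  { input: 'Routing Guide', tokens: ['routing', 'guide'] },
  { input: 'Routes & Wildcards', tokens: ['routes', 'wildcards'] },
];

// Returns the fixtures whose expected tokens the tokenizer fails to reproduce.
function failingFixtures(tokenize, cases) {
  return cases.filter(({ input, tokens }) =>
    JSON.stringify(tokenize(input)) !== JSON.stringify(tokens));
}

console.log(failingFixtures(toyTokenize, fixtures).length); // → 0
```

Running the same fixture list through both runtimes is what lets a mismatch surface as a test failure instead of a silently missed search hit.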

Packages

| Package | What it is |
| --- | --- |
| DeweySearch (NuGet) | The BCL-only index builder, tokenizer, stemmer, and wire records. |
| DeweySearch.Web (NuGet) | ASP.NET Core static-asset delivery of the client at _content/DeweySearch.Web/dewey-search.js. |

Building an index (C#)

using System.Collections.Generic; // Dictionary<,>
using System.IO;                  // File, Path
using DeweySearch;

var documents = new[]
{
    new SearchDocument(
        Url: "/guide/routing/",
        Title: "Routing Guide",
        Description: "Configure routing for your pages",
        Headings: "Routes Wildcards",
        Body: "Long-form plain-text body…",
        Priority: 5,
        Facets: new Dictionary<string, string[]>
        {
            ["section"] = ["Guides"],
            ["tag"]     = ["routing", "beginner"],
        }),
};

var index = new IndexBuilder(new IndexOptions { ShardPrefixLength = 2 }).Build(documents);

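// Write the artifacts wherever the client will fetch them from.
var outputDir = Path.Combine("wwwroot", "search", "en");
Directory.CreateDirectory(outputDir); // no-op if the directory already exists
foreach (var (name, bytes) in index.ToFiles())
{
    File.WriteAllBytes(Path.Combine(outputDir, name), bytes);
}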

Facets are an open dictionary — any dimension you put on a document (section, tag, area, author, …) is interned, id-mapped, and shipped in the manifest for client-side filtering. DeweySearch has no built-in notion of what a facet means.
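
To make "interned and id-mapped" concrete, here is a minimal sketch of one way facet values could be mapped to small integer ids and then used for filtering. The data layout, internFacets, and filterByFacet are hypothetical; the actual manifest format is not described here and may differ.

```javascript
// Hypothetical interning: each distinct "dimension:value" pair gets a small integer id.
function internFacets(docs) {
  const ids = new Map(); // e.g. "section:Guides" -> 0
  const docFacetIds = docs.map((doc) => {
    const out = [];
    for (const [dimension, values] of Object.entries(doc.facets)) {
      for (const value of values) {
        const key = `${dimension}:${value}`;
        if (!ids.has(key)) ids.set(key, ids.size); // next free id
        out.push(ids.get(key));
      }
    }
    return out;
  });
  return { ids, docFacetIds };
}

// Returns the indexes of documents carrying the given facet value.
function filterByFacet(interned, dimension, value) {
  const id = interned.ids.get(`${dimension}:${value}`);
  if (id === undefined) return [];
  return interned.docFacetIds.flatMap((facetIds, docId) =>
    facetIds.includes(id) ? [docId] : []);
}

const interned = internFacets([
  { facets: { section: ['Guides'], tag: ['routing', 'beginner'] } },
  { facets: { section: ['Reference'], tag: ['routing'] } },
]);
console.log(filterByFacet(interned, 'tag', 'routing'));    // → [ 0, 1 ]
console.log(filterByFacet(interned, 'section', 'Guides')); // → [ 0 ]
```

Because the dimension name is just part of the key, nothing in the sketch needs to know what "section" or "tag" means.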

Querying (JavaScript)

<script src="/_content/DeweySearch.Web/dewey-search.js"></script>
<script>
  // Classic scripts do not allow top-level await, so run the query inside an async function.
  (async () => {
    const engine = new DeweySearchEngine('/search/en'); // directory holding the artifacts
    const results = await engine.search('routing');      // [{ docId, score, fields }, …]
    const F = DeweySearchEngine.FieldFlags;
    for (const { docId, fields } of results) {
      const doc = engine.docEntry(docId);                // title, url, facets
      // `fields` is the OR of fields the query matched in; skip the body snippet on a heading hit.
      const headingHit = fields & (F.Title | F.Heading);
      const fragment = headingHit ? null : await engine.loadFragment(docId); // body excerpt, on demand
      // Render `doc` (and `fragment`, when present) into the results list here.
    }
  })();
</script>

The client fetches index.json once, then only the term-prefix shards a query touches and the fragments for results actually shown — the whole index is never downloaded. It supports BM25 ranking, field boosts (title/heading/description/body), prefix completion, bounded typo-tolerant fuzzy matching, and synonyms.
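
For readers unfamiliar with BM25, the sketch below shows the standard per-term formula combined with per-field boosts. It is an illustration under stated assumptions, not the client's code: the constants K1 and B are common defaults, the boost values are invented, and summing boosted per-field scores is only one of several ways to combine fields.

```javascript
const K1 = 1.2;  // term-frequency saturation (common default, assumed)
const B = 0.75;  // length normalization strength (common default, assumed)
const BOOSTS = { title: 3, heading: 2, description: 1.5, body: 1 }; // hypothetical values

// Inverse document frequency: rarer terms weigh more.
function idf(docCount, docsWithTerm) {
  return Math.log(1 + (docCount - docsWithTerm + 0.5) / (docsWithTerm + 0.5));
}

// Saturating term-frequency component, normalized by field length.
function bm25Term(tf, fieldLength, avgFieldLength) {
  const norm = 1 - B + B * (fieldLength / avgFieldLength);
  return (tf * (K1 + 1)) / (tf + K1 * norm);
}

// termStats: [{ docsWithTerm, fields: { title: { tf, length, avgLength }, … } }]
function scoreDoc(termStats, docCount) {
  let score = 0;
  for (const { docsWithTerm, fields } of termStats) {
    const weight = idf(docCount, docsWithTerm);
    for (const [field, { tf, length, avgLength }] of Object.entries(fields)) {
      score += weight * BOOSTS[field] * bm25Term(tf, length, avgLength);
    }
  }
  return score;
}

const stats = { tf: 1, length: 3, avgLength: 3 };
const titleHit = scoreDoc([{ docsWithTerm: 1, fields: { title: stats } }], 10);
const bodyHit = scoreDoc([{ docsWithTerm: 1, fields: { body: stats } }], 10);
console.log(titleHit > bodyHit); // → true
```

With identical term statistics, the title match outranks the body match purely because of its boost.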

Each result carries fields — the OR of the field flags the match landed in (DeweySearchEngine.FieldFlags: Title, Heading, Description, Body) — so the UI can branch on where a query hit, for example dropping the body snippet when a result already matched in its heading.
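
The bit-flag pattern behind fields can be sketched as follows. The numeric values here are assumptions chosen for illustration (the real ones live on DeweySearchEngine.FieldFlags), and matchedFieldNames and needsBodySnippet are hypothetical helpers.

```javascript
// Hypothetical bit values; read the real ones from DeweySearchEngine.FieldFlags.
const FieldFlags = Object.freeze({ Title: 1, Heading: 2, Description: 4, Body: 8 });

// Lists the names of every field whose bit is set in `fields`.
function matchedFieldNames(fields) {
  return Object.keys(FieldFlags).filter((name) => (fields & FieldFlags[name]) !== 0);
}

// A body snippet is only worth loading when neither the title nor a heading matched.
function needsBodySnippet(fields) {
  return (fields & (FieldFlags.Title | FieldFlags.Heading)) === 0;
}

const fields = FieldFlags.Heading | FieldFlags.Body;
console.log(matchedFieldNames(fields)); // → [ 'Heading', 'Body' ]
console.log(needsBodySnippet(fields));  // → false
```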

Repository layout

  • src/DeweySearch/ — the index builder and the cross-language tokenizer/stemmer (BCL-only).
  • src/DeweySearch.Web/ — Razor Class Library that ships the JS client as a static web asset.
  • js/ — the canonical dewey-search.js (shipped via DeweySearch.Web) and its contract tests.
  • conformance/ — shared fixtures both runtimes assert against.
  • tests/DeweySearch.Tests/ — C# unit + contract tests.

Build & test

dotnet test DeweySearch.slnx       # C# engine + cross-language contract
cd js && node --test         # JS client + same contract fixtures

License

MIT
