Skip to content


Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP
C++ Java
branch: master
Failed to load latest commit information.
build-dawg-cpp Clean up
src First commit! Update


DAWGs (directed acyclic word graphs) are minimal, highly compressed dictionaries implemented using deterministic acyclic finite state automata, providing lightning fast lookup speeds.

Some properties:

  • Prefix data structure (just like a trie)
  • Optimal lookup complexity O(length)
  • Small size, fast loading (stored as an array of ints)

This Java class provides read-only access to files created by the dawgdic C++ library and DAWG Python module.

Taken from:

Building in C++

A single C++ source file "dawgdic-build.cpp" which uses the dawgdic library is included in this package to build a dawg from a lexicon file, which has sorted lexicons separated by line feeds (not CRLF), and write it to disk. This code is not rigorously tested, so may give nasty suprises!

g++ -O3 dawgdic-build.cpp -o dawgdic-build
dawgdic-build lexicons.txt lexicon-dawg.dawg

Building in Python

Using the DAWG module, building the dawg is as simple as

import dawg
d = dawg.DAWG(data)'lexicon-dawg.dawg')
Something went wrong with that request. Please try again.