Skip to content

Goatherd\Commons\Word

goatherd edited this page Jun 25, 2012 · 4 revisions

Goatherd\Commons\Word

Generic digital search tree (trie) implementation approach.

There are several proposals for very basic tree or trie data structures in PHP. But most use inefficient node-object restricting there practical use.

Should solve

  • implemented using arrays
  • optimised for retrieval (iteration, search)
  • optimised for de-serialisation (speed and size)
  • must hold millions of short words (<50 chars) within reasonable time and space
  • must provide iteration logic
  • must provide serialisation logic
  • must provide fuzzy search
  • must provide prefix search (list all words with common prefix)
  • may provide suffix search
  • must provide a generic trie
  • should provide a generic tree implementation
  • should provide a balanced ternary search tree (TST)
  • must provide a (read-only) DAG-transformation for TST and generic tries

Proposal state

Initial profiling done. Trie, DAG and TST approach tested. Interface design started.

Roadmap

Initial interface proposal and sample implementation scheduled for July 2012.

Clone this wiki locally