SpellChecker

A spelling corrector written in modern C++ and tested with Catch2, inspired by Peter Norvig's essay "How to Write a Spelling Corrector".

On average, running the correction function takes 1.4 seconds per 10 words.

#include <iostream>
// Also include the project's SpellChecker header here.

int main() {
  // Build the checker from the word-frequency file.
  SpellChecker checker("training.txt");

  std::cout << checker.correction("expresion") << std::endl; // expression
  std::cout << checker.correction("thea") << std::endl; // the
  std::cout << checker.correction("helpo") << std::endl; // hello
  std::cout << checker.correction("queot") << std::endl; // quote
  std::cout << checker.correction("peotry") << std::endl; // poetry
}

How It Works

  1. The SpellChecker's constructor reads word_freq.txt line by line. Each line contains an English word, a space, and that word's frequency.

  2. Given a word, the SpellChecker generates a set of candidate corrections: every string at an edit distance of one or two from the original word. Most of these strings are not real words, so the set is filtered down to known English words.

  3. The most likely correction is the remaining candidate with the highest frequency.
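
The three steps above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the names load_frequencies, edits1, and correction, the FreqTable alias, and the lowercase a-z alphabet are all assumptions made for the example. It pools candidates at distance one and two and picks the most frequent, as described above. Norvig's essay additionally prefers an already-known word and nearer candidates, which this sketch does not do.

```cpp
#include <algorithm>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical names throughout; the real SpellChecker class may differ.
using FreqTable = std::unordered_map<std::string, long>;
using WordSet = std::unordered_set<std::string>;

// Step 1: parse lines of the form "<word> <frequency>".
FreqTable load_frequencies(std::istream& in) {
  FreqTable freq;
  std::string word;
  long count = 0;
  while (in >> word >> count) freq[word] = count;
  return freq;
}

// Step 2 (building block): every string exactly one edit away from `word`,
// using deletes, transposes, replaces, and inserts over a-z (assumed alphabet).
WordSet edits1(const std::string& word) {
  static const std::string letters = "abcdefghijklmnopqrstuvwxyz";
  WordSet out;
  for (std::size_t i = 0; i <= word.size(); ++i) {
    const std::string left = word.substr(0, i);
    const std::string right = word.substr(i);
    if (!right.empty()) out.insert(left + right.substr(1));  // delete
    if (right.size() > 1)                                     // transpose
      out.insert(left + right[1] + right[0] + right.substr(2));
    for (char c : letters) {
      if (!right.empty()) out.insert(left + c + right.substr(1));  // replace
      out.insert(left + c + right);                                 // insert
    }
  }
  return out;
}

// Steps 2 and 3: collect known words within two edits, then return the one
// with the highest frequency. Falls back to the input if nothing is known.
std::string correction(const std::string& word, const FreqTable& freq) {
  WordSet candidates;
  for (const auto& e1 : edits1(word)) {
    if (freq.count(e1)) candidates.insert(e1);
    for (const auto& e2 : edits1(e1))
      if (freq.count(e2)) candidates.insert(e2);
  }
  if (candidates.empty()) return word;
  return *std::max_element(
      candidates.begin(), candidates.end(),
      [&freq](const std::string& a, const std::string& b) {
        return freq.at(a) < freq.at(b);
      });
}
```

To try it without a file, load_frequencies can be fed a std::istringstream holding a few "word frequency" lines and the resulting table passed to correction. Generating the full distance-two set for every distance-one string is the expensive part, which is consistent with corrections taking a noticeable fraction of a second each.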

See It In Action

https://repl.it/@awdavids/SpellChecker