Fonetic

Text search utils, fuzzy search algorithms and collections to make their use easier.

Main packages and classes:

####text

Word — a representation of a piece of text as a string value that internally holds a mapping to the source it was originally extracted from. Across transformations this mapping stays unchanged, or is adjusted accordingly when the word length changes, so after all operations it is simple to align the resulting word with its original source. Word implements CharSequence over its value, which makes it easy to use in search algorithms, utility methods, etc.;
Words — utility class to produce Words and play with them (extract from String, join, split etc).

This may be useful when you work with documents containing markup tags and other special entities: you first extract the text from the document as a collection of Words, then process or modify them, and finally apply the modifications to the source, leaving the markup untouched. (For example, when you need to search for and highlight dictionary entries in an HTML document.)
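The extract, process, and apply-back workflow described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `MappedWord`, `MappedWords`, `extract`, and `highlight` are hypothetical names invented for this sketch and are not Fonetic's actual `Word`/`Words` API. It also assumes that markup consists only of `<...>` tags and that a word is a run of letters or digits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical stand-in for a source-mapped word; NOT Fonetic's actual Word class. */
final class MappedWord implements CharSequence {
    final String value; // current (possibly transformed) text
    final int start;    // start offset in the source, inclusive
    final int end;      // end offset in the source, exclusive

    MappedWord(String value, int start, int end) {
        this.value = value;
        this.start = start;
        this.end = end;
    }

    /** Transforms the value but keeps pointing at the same source span. */
    MappedWord toLowerCase() {
        return new MappedWord(value.toLowerCase(Locale.ROOT), start, end);
    }

    @Override public int length() { return value.length(); }
    @Override public char charAt(int index) { return value.charAt(index); }
    @Override public CharSequence subSequence(int from, int to) { return value.subSequence(from, to); }
    @Override public String toString() { return value; }
}

/** Hypothetical helper methods; NOT Fonetic's actual Words class. */
final class MappedWords {
    /** Extracts runs of letters/digits outside <...> tags, remembering their source offsets. */
    static List<MappedWord> extract(String source) {
        List<MappedWord> words = new ArrayList<>();
        boolean inTag = false;
        int wordStart = -1;
        // Iterate one position past the end so a trailing word is flushed too.
        for (int i = 0; i <= source.length(); i++) {
            char c = i < source.length() ? source.charAt(i) : '\0';
            boolean wordChar = !inTag && Character.isLetterOrDigit(c);
            if (wordChar && wordStart < 0) {
                wordStart = i;
            } else if (!wordChar && wordStart >= 0) {
                words.add(new MappedWord(source.substring(wordStart, i), wordStart, i));
                wordStart = -1;
            }
            if (c == '<') inTag = true;
            else if (c == '>') inTag = false;
        }
        return words;
    }

    /** Wraps the source span of each word equal to term (ignoring case) in <b> tags. */
    static String highlight(String source, List<MappedWord> words, String term) {
        StringBuilder out = new StringBuilder(source);
        // Walk backwards so earlier offsets stay valid after each insertion.
        for (int i = words.size() - 1; i >= 0; i--) {
            MappedWord w = words.get(i);
            if (w.toLowerCase().toString().equals(term)) {
                out.insert(w.end, "</b>");
                out.insert(w.start, "<b>");
            }
        }
        return out.toString();
    }
}
```

With this sketch, extracting from `<p>Hello <i>World</i></p>` and highlighting the term `world` wraps only the characters of `World` in the source, and the surrounding tags stay where they were. The comparison runs on the lowercased word, while the insertion uses the offsets recorded at extraction time.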

####search

FoneticSearch — an original algorithm that searches text for phonetically similar occurrences of a pattern. The main goal is to tolerate not only phonetic variations but also non-phonetic misspellings, which the commonly used Metaphone and Soundex don't handle: with those, a misspelled word is very likely to be encoded differently from the original;
LcsSearch — search for matches using gapped longest common subsequence.
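For readers unfamiliar with the longest common subsequence, here is a minimal sketch of the textbook dynamic-programming computation of its length. It is not the code of `LcsSearch`, and it does not model whatever gap handling that class applies. The class name `LcsSketch` and the method `lcsLength` are hypothetical.

```java
/** Textbook LCS length; a hypothetical sketch, NOT Fonetic's LcsSearch. */
final class LcsSketch {
    /** Returns the length of the longest common subsequence of a and b. */
    static int lcsLength(CharSequence a, CharSequence b) {
        // Two rolling rows of the DP table; index 0 stays 0 (empty prefix of b).
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                curr[j] = a.charAt(i - 1) == b.charAt(j - 1)
                        ? prev[j - 1] + 1                  // extend a common subsequence
                        : Math.max(prev[j], curr[j - 1]);  // skip a char in a or in b
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }
}
```

For example, `LcsSketch.lcsLength("phonetic", "fonetic")` yields 6, the length of the shared subsequence `onetic`. Because the method accepts any `CharSequence`, it would also accept source-mapped word objects of the kind described in the text section.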

####collect [in progress, subject to change] — collections to help in search algorithms, including:

CharMap<V> — fast and simple char-keyed map;
SimpleTrieMap<V> — implementation of Map<String, V> and TrieMap<V> based on compressed trie.
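To illustrate the idea of a trie-backed string map, below is a minimal sketch of a plain (uncompressed) trie. It is an assumption-laden illustration only: `TrieSketch` and its methods are hypothetical, it does not implement `Map<String, V>`, and it does not compress single-child chains the way a compressed trie would. Child links use a `HashMap` here, where a char-keyed map such as `CharMap<V>` would be the natural fit.

```java
import java.util.HashMap;
import java.util.Map;

/** Plain, uncompressed trie map; a hypothetical sketch, NOT Fonetic's SimpleTrieMap. */
final class TrieSketch<V> {
    private static final class Node<V> {
        final Map<Character, Node<V>> children = new HashMap<>();
        V value;
        boolean terminal; // true if a key ends at this node
    }

    private final Node<V> root = new Node<>();

    /** Associates value with key, creating nodes along the path as needed. */
    void put(String key, V value) {
        Node<V> node = root;
        for (int i = 0; i < key.length(); i++) {
            node = node.children.computeIfAbsent(key.charAt(i), c -> new Node<>());
        }
        node.value = value;
        node.terminal = true;
    }

    /** Returns the value stored for key, or null if key was never put. */
    V get(String key) {
        Node<V> node = find(key);
        return node != null && node.terminal ? node.value : null;
    }

    /** Returns true if the path for prefix exists in the trie. */
    boolean hasPrefix(String prefix) {
        return find(prefix) != null;
    }

    // Follows s character by character; returns null if the path breaks off.
    private Node<V> find(String s) {
        Node<V> node = root;
        for (int i = 0; i < s.length() && node != null; i++) {
            node = node.children.get(s.charAt(i));
        }
        return node;
    }
}
```

The prefix check is what makes a trie useful to search algorithms: a scan over text can abandon a candidate as soon as `hasPrefix` fails, without comparing against every stored key.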

A few small demos to see how it works: src/main/java/demos

Salauyou/Fonetic
