FuzzyCompare

Fuzzy string comparison library

ComparisonMethods

The class provides three static methods:

JaroSimilarity: calculates the Jaro similarity between two given ReadOnlySpan<char> inputs. It returns a float value that ranges from 0 to 1, representing the similarity between the two inputs.
JaroWinklerSimilarity: calculates the Jaro-Winkler similarity between two given ReadOnlySpan<char> inputs. It returns a float value that ranges from 0 to 1, representing the similarity between the two inputs. The p parameter is optional and it sets the scaling factor for the common prefix length adjustment.
LevenshteinDistance: calculates the Levenshtein distance between two given ReadOnlySpan<char> inputs. It returns an int value that represents the minimum number of single-character edits (insertions, deletions, or substitutions) required to change one input into the other.

FuzzyStringComparer

The FuzzyStringComparer class provides a way to compare strings using a fuzzy matching algorithm. The main purpose of the FuzzyStringComparer class is to determine the similarity between two strings, by calculating a similarity score based on the number of matching characters and their positions within the strings.

It provides a constructor that takes an optional CultureInfo parameter, which can be used to specify the culture to use for string comparison. By default, the FuzzyStringComparer class uses the current culture for case-insensitive string comparison.

The FuzzyStringComparer class uses a fuzzy matching algorithm that takes into account several factors, including the length of the strings, the number of matching characters, and the positions of the matching characters within the strings. The algorithm is designed to be tolerant of small differences between the strings, such as typos, misspellings, or minor variations in formatting.

Tokenizer

The Tokenizer class is a utility class that provides methods for tokenizing strings and spans of characters. The main purpose of the class is to break down a string or span of characters into individual tokens, which are then returned as a token enumerator.

The class provides two main methods for tokenizing strings and spans of characters: Tokenize and EnumerateTokens. The Tokenize method takes a string as input and returns an TokenEnumerator object that can be used to iterate through the tokens in the string. The EnumerateTokens method takes a read-only span of characters as input and returns a TokenRefEnumerator object that can be used to iterate through the tokens in the span.

Name		Name	Last commit message	Last commit date
Latest commit History 19 Commits
FuzzyCompare		FuzzyCompare
FuzzyCompareTests		FuzzyCompareTests
.gitignore		.gitignore
Directory.Build.props		Directory.Build.props
Directory.Build.targets		Directory.Build.targets
FuzzyCompare.sln		FuzzyCompare.sln
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

FuzzyCompare

ComparisonMethods

FuzzyStringComparer

Tokenizer

About

Releases

Packages

Languages

License

vborovikov/fuzzy

Folders and files

Latest commit

History

Repository files navigation

FuzzyCompare

ComparisonMethods

FuzzyStringComparer

Tokenizer

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages