Fuzzy Matching

Jo Jaquinta edited this page Mar 2, 2019 · 6 revisions

Many match functions return a map containing details of a match that has occurred. These maps all use the same keys, detailed here. Some keys are only present for certain match methods, as noted.

confidence is a number, from 0 to 5. The lower the value, the more confident the match. A value of 0 means it is a perfect match. A value of 5 means it does not match at all.

matchedWord string. The value to which the match relates, i.e. what it was calculated against.

method is a number. It indicates the algorithm used to produce the match. 0 = TEXT, 1 = IPA, 2 = CAVERPHONE, 3 = DOUBLE_METAPHONE.

originalWord string. The word that was passed in to be matched.

matchedIPA string. Is only returned if an IPA match was conducted. It is the rendering of the matched word in the International Phonetic Alphabet.

originalIPA string. Is only returned if an IPA match was conducted. It is the rendering of the original word in the International Phonetic Alphabet.

matchedCaverphone string. Is only returned if a CAVERPHONE match was conducted. It is the rendering of the matched word with the Caverphone algorithm.

originalCaverphone string. Is only returned if a CAVERPHONE match was conducted. It is the rendering of the original word with the Caverphone algorithm.

matchedDoubleMetaphone string. Is only returned if a DOUBLE_METAPHONE match was conducted. It is the rendering of the matched word with the Double Metaphone algorithm.

originalDoubleMetaphone string. Is only returned if a DOUBLE_METAPHONE match was conducted. It is the rendering of the original word with the Double Metaphone algorithm.

matchedObject object. If an object or verb match was done, this is the object that was matched.

matchedVerb string. If a verb match was done, this is the name of the verb that was matched.
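
To make the layout concrete, here is a minimal sketch of a match map modeled as a Python dictionary. The key names and method numbers come from the descriptions above; the sample words and the helper describe_match are invented for illustration and are not part of the documented API.

```python
# Method codes as listed above.
METHOD_NAMES = {0: "TEXT", 1: "IPA", 2: "CAVERPHONE", 3: "DOUBLE_METAPHONE"}

# Invented sample values for a TEXT match. The IPA, Caverphone and
# Double Metaphone keys are absent because those methods were not used.
sample_match = {
    "confidence": 1,
    "matchedWord": "lantern",
    "method": 0,
    "originalWord": "lantren",
}


def describe_match(match):
    """Hypothetical helper: summarize a match map in one line."""
    method = METHOD_NAMES.get(match["method"], "UNKNOWN")
    if match["confidence"] == 0:
        quality = "perfect match"
    elif match["confidence"] >= 5:
        quality = "no match"
    else:
        quality = "partial match"
    return f'{match["originalWord"]} -> {match["matchedWord"]} ({method}, {quality})'


print(describe_match(sample_match))
```

Because the method-specific keys are optional, code reading a match map should check for their presence (here, sample_match.get("matchedIPA") yields None) rather than assume they exist.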

Function: map match_word (str word, str pattern)

This does a fuzzy match of word against pattern. The pattern may contain a *, which makes it a prefix match: word only needs to start with everything ahead of the *. If pattern contains spaces, they are taken to be part of the pattern.

The return value is a match map, as above.
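
The pattern rules can be sketched in Python as follows. This covers only the literal part of the comparison (the * prefix rule and the treatment of spaces), not the fuzzy scoring, which this page does not specify. The function names are hypothetical, and the use of plain case-sensitive string comparison is an assumption.

```python
def split_pattern(pattern):
    """Split a pattern at its first '*' into (prefix, is_prefix_match)."""
    prefix, star, _rest = pattern.partition("*")
    return prefix, star == "*"


def literal_match(word, pattern):
    """Return True if word satisfies pattern without any fuzzy scoring.

    Assumption: comparison is plain, case-sensitive string comparison.
    """
    prefix, is_prefix_match = split_pattern(pattern)
    if is_prefix_match:
        # Everything ahead of the '*' must appear at the start of word.
        return word.startswith(prefix)
    # No '*': the whole pattern, spaces included, is compared as one string.
    return word == prefix
```

For example, literal_match("lantern", "lan*") is true, while literal_match("lamp", "lan*") is not. A pattern such as "brass lantern" is compared as a single string because its space is part of the pattern.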

Function: map match_object (str word, obj object)

This does a fuzzy match of word against text renderings of object. It matches against any string returned by the object's title() verb, if that verb exists; if not, it matches against the name property of the object. It also matches against any elements of the alias property, if one is present.

The return value is a match map, as above.
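
The way candidate strings are gathered from an object can be sketched in Python like this. SampleObject and candidate_strings are hypothetical stand-ins, and the sketch assumes title() returns a single string and alias is a list of strings.

```python
class SampleObject:
    """Hypothetical stand-in for an object with name, alias and title()."""

    def __init__(self, name, alias=None, title=None):
        self.name = name
        if alias is not None:
            self.alias = alias
        if title is not None:
            # Assumption: title() returns a single string.
            self.title = lambda: title


def candidate_strings(obj):
    """Collect the strings a word would be matched against."""
    title = getattr(obj, "title", None)
    if callable(title):
        candidates = [title()]   # title() takes precedence over name
    else:
        candidates = [obj.name]  # fall back to the name property
    # Aliases are always added when the property is present.
    candidates.extend(getattr(obj, "alias", []))
    return candidates
```

Each string in the resulting list would then be compared against word, and the most confident result kept.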

Function: list match_list (str word, list patterns, int limit)

This does a fuzzy match of word against all the elements in patterns. It returns the most confident matches (those with the lowest confidence values), up to the given limit. If no limit is given, 1 is assumed. The patterns should contain strings, objects or lists.

The result is a list of match maps. There will be no more than limit values returned, but there may be fewer if there are fewer original elements to match against, or if there are not enough matches of sufficient merit.
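
The selection step can be sketched in Python as below. The function best_matches is hypothetical, and the cut-off for "sufficient merit" is an assumption: this sketch discards only matches with a confidence of 5 (no match at all), whereas the real threshold is not stated on this page.

```python
def best_matches(matches, limit=1, no_match=5):
    """Return up to `limit` match maps, most confident (lowest value) first.

    Assumption: only matches whose confidence equals or exceeds `no_match`
    are treated as lacking sufficient merit.
    """
    usable = [m for m in matches if m["confidence"] < no_match]
    # sorted() is stable, so equally confident matches keep their input order.
    return sorted(usable, key=lambda m: m["confidence"])[:limit]


# Invented sample data for illustration.
candidates = [
    {"matchedWord": "lamp", "confidence": 3},
    {"matchedWord": "lantern", "confidence": 0},
    {"matchedWord": "sword", "confidence": 5},
]
print([m["matchedWord"] for m in best_matches(candidates, limit=5)])
```

Even with a limit of 5, only two maps come back here, since one candidate does not match and there are only three to begin with.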

Function: map match_verb (str word, obj object)

This does a fuzzy match of word against the verbs defined on object or any of its parents. Spaces in a verb's name are taken to denote aliases, and a * is a prefix match, as described above. Verbs with a profile of dobj=this, prep=none, iobj=this are ignored.

The result is a match map of the highest confidence match of the elements considered for the object.
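
How the candidate verb patterns are collected can be sketched in Python as follows. SampleVerbHolder and candidate_verb_patterns are hypothetical, verbs are modeled as (name, (dobj, prep, iobj)) tuples, and the sketch assumes each object has at most one parent. None of these representations come from the documented API.

```python
IGNORED_PROFILE = ("this", "none", "this")


class SampleVerbHolder:
    """Hypothetical object with verbs and an optional single parent."""

    def __init__(self, verbs, parent=None):
        self.verbs = verbs    # list of (name, (dobj, prep, iobj))
        self.parent = parent  # assumption: single-parent chain


def candidate_verb_patterns(obj):
    """Return (alias, full verb name) pairs for obj and its ancestors."""
    patterns = []
    current = obj
    while current is not None:
        for name, profile in current.verbs:
            if profile == IGNORED_PROFILE:
                continue  # this/none/this verbs are skipped
            for alias in name.split():
                # Each space-separated part of the name is its own alias.
                patterns.append((alias, name))
        current = current.parent
    return patterns
```

Each alias would then be matched against word using the same rules as match_word(), including the * prefix rule.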

Function: list match_verbs (str word, list objects, int limit)

objects is assumed to be a list of objects. This does a fuzzy match of word against the verbs defined on each object in the list, as defined for match_verb().

The result is a list of match maps for the highest-confidence matches among the verbs considered on the objects.