A v1 AutoHotkey package that finds the degree of similarity between strings, based on the Sørensen–Dice coefficient, which is mostly better than Levenshtein distance.

Chunjee/stringc.ahk

Finds the degree of similarity between two strings, based on Dice's coefficient, which is mostly better than Levenshtein distance.
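
For reference, the Sørensen–Dice coefficient of two sets X and Y is twice the size of their overlap divided by the sum of their sizes. The worked arithmetic below assumes X and Y are the adjacent character pairs (bigrams) of each string. That is a common convention and an assumption here, not something this README states about the package's internals. Under that assumption, the arithmetic agrees with the 0.67 and 0.80 results shown in the examples for "test"/"testing" and "healed"/"sealed".

```latex
s(X, Y) = \frac{2\,|X \cap Y|}{|X| + |Y|}

% "healed" -> {he, ea, al, le, ed},  "sealed" -> {se, ea, al, le, ed}
% shared pairs: {ea, al, le, ed}
s = \frac{2 \cdot 4}{5 + 5} = 0.80

% "test" -> {te, es, st},  "testing" -> {te, es, st, ti, in, ng}
% shared pairs: {te, es, st}
s = \frac{2 \cdot 3}{3 + 6} \approx 0.67
```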

Installation

In a terminal or command line navigated to your project folder:

npm install stringc.ahk

In your code, only export.ahk needs to be included:

#Include %A_ScriptDir%\node_modules
#Include stringc.ahk\export.ahk
ostringc := new stringc()

ostringc.compare("test", "testing")
; => 0.67
ostringc.compare("Hello", "hello")
; => 1.0

API

Including the module provides a class stringc with three methods: .compare, .compareAll, and .bestMatch.

compare(string1, string2, [function])

Returns a fraction between 0 and 1, which indicates the degree of similarity between the two strings. 0 indicates completely different strings, 1 indicates identical strings. The comparison is case-insensitive.

Arguments

string1 (string): The first string

string2 (string): The second string

function (function): A function to be applied to both strings prior to comparison.

The order of string1 and string2 does not affect the result.

Returns

(Number): A fraction from 0 to 1, both inclusive. Higher number indicates more similarity.

Example
stringc.compare("healed", "sealed")
; => 0.80

stringc.compare("Olive-green table for sale, in extremely good condition."
	, "For sale: table in very good  condition, olive green in colour.")
; => 0.71

stringc.compare("Olive-green table for sale, in extremely good condition."
	, "For sale: green Subaru Impreza, 210,000 miles")
; => 0.30

stringc.compare("Olive-green table for sale, in extremely good condition."
	, "Wanted: mountain bike with at least 21 gears.")
; => 0.11

compareAll(targetStrings, mainString, [function])

Compares mainString against each string in targetStrings.

Arguments

targetStrings (array): Each string in this array will be matched against the main string.

mainString (string): The string to match each target string against.

function (function): A function to be applied to each element prior to comparison.

Returns

(Object): An object with a ratings property, which gives a similarity rating for each target string, and a bestMatch property, which specifies which target string was most similar to the main string. The ratings array is sorted from highest rating to lowest.

Example
stringc.compareAll(["For sale: green Subaru Impreza, 210,000 miles"
	, "For sale: table in very good condition, olive green in colour."
	, "Wanted: mountain bike with at least 21 gears."]
	, "Olive-green table for sale, in extremely good condition.")
; =>
{ ratings:
	[{ target: "For sale: table in very good condition, olive green in colour.",
		rating: 0.71 },
	{ target: "For sale: green Subaru Impreza, 210,000 miles",
		rating: 0.30 },
	{ target: "Wanted: mountain bike with at least 21 gears.",
		rating: 0.11 }],
	bestMatch:
	{ target: "For sale: table in very good condition, olive green in colour.",
		rating: 0.71 } }
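
As a minimal sketch of consuming that return value, the snippet below walks the ratings array and reads bestMatch. It assumes the install layout from the Installation section and uses only the documented ratings, bestMatch, target, and rating properties. The listings array, the search string, and the report variable are hypothetical names chosen for illustration.

```autohotkey
#Include %A_ScriptDir%\node_modules
#Include stringc.ahk\export.ahk
ostringc := new stringc()

; Hypothetical sample data, chosen only for illustration
listings := ["For sale: green Subaru Impreza, 210,000 miles"
	, "Wanted: mountain bike with at least 21 gears."]
result := ostringc.compareAll(listings, "Green Subaru for sale")

; Walk the ratings array (documented as sorted from highest to lowest)
report := ""
for index, entry in result.ratings
	report .= index ". " entry.target " => " entry.rating "`n"

; bestMatch is documented as holding the most similar target and its rating
MsgBox, % "Best: " result.bestMatch.target "`n`n" report
```

Reading result.bestMatch.target directly avoids re-scanning the array when only the closest string is needed.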

bestMatch(targetStrings, mainString, [function])

Compares mainString against each string in targetStrings.

Arguments

targetStrings (array): Each string in this array will be matched against the main string.

mainString (string): The string to match each target string against.

function (function): A function to be applied to each element prior to comparison.

Returns

(String): The target string that is most similar to mainString.

Example
stringc.bestMatch([" hard to    "
	, "hard to"
	, "Hard 2"]
	, "Hard to")
; => "hard to"
