Ukkonen's Approximate String Matching algorithm
Switch branches/tags
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Type Name Latest commit message Commit time
Failed to load latest commit information.

Ukkonen - Approximate String Matching

npm version Build Status

This project implements the Approximate String Matching algorithm by Esko Ukkonen extended with ideas from An Extension of Ukkonen's Enhanced Dynamic Programming ASM Algorith by Hal Berghel and David Roach.

Ukkonen's algorithm is very competitive with the Levenshtein distance and for longer strings it is much more performant than Levenshtein distance.

In addition to being a competitive alternative to Levenshtein distance, Ukkonen's algorithm also allows you to provide a threshold for the distance which increases the performance even more for texts that are longer than the threshold.

HTML diffing using Levenshtein HTML diffing using Ukkonen's algorithm

Above you can see the different of using Levenshtein distance and Ukkonen's algorithm for matching sub-trees when diffing HTML.


npm install --save ukkonen


You can find the distance between the strings Ukkonen and Levenshtein the following way:

var ukkonen = require('ukkonen')

assert.equal(ukkonen('Ukkonen', 'Levenshtein'), 8)

If you want to limit the distance by a given threshold:

var ukkonen = require('ukkonen')

assert.equal(ukkonen('Ukkonen', 'Levenshtein', 6), 6)
assert.equal(ukkonen('Ukkonen', 'Levenshtein', 10), 8)

Platform support

The library is ES5 and will work with any JavaScript bundler in the browser as well as Node versions with ES5 support.


I have benchmarked the library against the fastest Levenshtein distance implementation on NPM.

              Edit distance one word
 245,499 op/s » ukkonen
 502,333 op/s » leven

              Edit distance on sentence with small differences
 767,359 op/s » ukkonen
 139,628 op/s » leven

              Edit distance on paragraphs with small differences
 237,857 op/s » ukkonen
   2,670 op/s » leven

              Edit distance on longer texts with small differences
 112,547 op/s » ukkonen
     683 op/s » leven

              Edit distance on longer texts with many differences
     372 op/s » ukkonen
     416 op/s » leven

              Edit distance on longer texts with small differences and a threshold of 10
 127,725 op/s » ukkonen
     678 op/s » leven

              Edit distance on longer texts with many differences and a threshold of 40
  84,959 op/s » ukkonen
     425 op/s » leven


Obviously the authors of the papers describing the algorithm Esko Ukkonen, Hal Berghel and David Roach.

I stole a lot of ideas from Sindre Sorhus's leven library and I also used it to test my implementation against.


MIT © Sune Simonsen