Damerau Levenshtein edit distance in Perl6
Latest commit 96ae611 Apr 10, 2016 @ugexe Fix Out-of-memory error
Native int/str were causing a memory error for an unknown reason

Text::Levenshtein::Damerau

Levenshtein and Damerau Levenshtein edit distances

Synopsis

use Text::Levenshtein::Damerau;

say dld('Neil','Niel'); # Damerau Levenshtein distance
# prints 1

say ld('Neil','Niel'); # Levenshtein distance
# prints 2

Description

Returns the true Levenshtein or Damerau Levenshtein edit distance between two strings. The Damerau Levenshtein distance additionally counts a transposition of two adjacent characters as a single edit.
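
To make the difference between the two distances concrete, here is a minimal, self-contained sketch of the textbook dynamic programming approach. The name sketch-distance and its :transpositions flag are hypothetical and not part of this module. Note also that the transposition branch implements the simpler restricted variant (often called optimal string alignment), which may differ from the true Damerau Levenshtein distance the module computes when a transposed pair is edited again.

```raku
# Hypothetical helper for illustration only; not the module's implementation.
# Computes the Levenshtein distance, or with :transpositions the restricted
# "optimal string alignment" distance, using a full distance table.
sub sketch-distance(Str $source, Str $target, Bool :$transpositions = False --> Int) {
    my @s = $source.comb;
    my @t = $target.comb;

    # @d[$i][$j] = distance between the first $i chars of the source
    # and the first $j chars of the target
    my @d;
    @d[$_][0] = $_ for 0 .. @s.elems;
    @d[0][$_] = $_ for 0 .. @t.elems;

    for 1 .. @s.elems -> $i {
        for 1 .. @t.elems -> $j {
            my $cost = @s[$i - 1] eq @t[$j - 1] ?? 0 !! 1;
            @d[$i][$j] = min(
                @d[$i - 1][$j] + 1,          # deletion
                @d[$i][$j - 1] + 1,          # insertion
                @d[$i - 1][$j - 1] + $cost,  # substitution or match
            );

            # Adjacent transposition: "ei" <-> "ie" costs one edit
            if $transpositions
               && $i > 1 && $j > 1
               && @s[$i - 1] eq @t[$j - 2]
               && @s[$i - 2] eq @t[$j - 1] {
                @d[$i][$j] = min(@d[$i][$j], @d[$i - 2][$j - 2] + 1);
            }
        }
    }
    return @d[@s.elems][@t.elems];
}

say sketch-distance('Neil', 'Niel');                   # 2
say sketch-distance('Neil', 'Niel', :transpositions);  # 1
```

Without transpositions, turning 'Neil' into 'Niel' takes two substitutions; with them, swapping the adjacent 'e' and 'i' is a single edit.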

use Text::Levenshtein::Damerau;

my @names = 'John','Jonathan','Jose','Juan','Jimmy';
my $name_misspelling = 'Jonh';

my $dl = Text::Levenshtein::Damerau.new(
    max             => 0,       # default
    targets         => @names,  # required
    sources         => [$name_misspelling]
);

say "Let's search for a 'John' but mistyped...";
my %results = $dl.get_results;

# Display each source, target, and the distance
for %results.kv -> $source, $targets {
    for $targets.kv -> $target, $dld {
        say "source:$source\ttarget:$target\tdld:" ~ ($dld // "<max exceeded>");
    }
}

# More info
say "----------------------------";
say "\$dl.best_distance:        {$dl.best_distance}";
say "-";
say "\$dl.targets:              {~$dl.targets}";
say "\$dl.best_target:          {$dl.best_target}";
say "-";
say "\@names:                   {~@names}";

Routines

  • dld

    Damerau Levenshtein Distance (Levenshtein Distance including transpositions)

    Arguments: $source, $target, $max?

    $max distance. 0 = unlimited. Default = 0

    Returns: Int that represents the edit distance between the two arguments. If $max is set and the distance exceeds it, the calculation stops as early as possible and the undefined Int type object is returned instead.

    use Text::Levenshtein::Damerau;
    say dld('AABBCC','AABCBCD');
    # prints 2
    
    # Max edit distance of 1
    say dld('AABBCC','AABCBCD',1); # distance is 2
    # prints Int

  • ld

    Levenshtein Distance (no transpositions)

    Arguments: $source, $target, $max?

    $max distance. 0 = unlimited. Default = 0

    Returns: Int that represents the edit distance between the two arguments. If $max is set and the distance exceeds it, the calculation stops as early as possible and the undefined Int type object is returned instead.

    use Text::Levenshtein::Damerau;
    say ld('AABBCC','AABCBCD');
    # prints 2
    
    # Max edit distance of 1
    # Uses regular Levenshtein distance (no transpositions)
    say ld('AABBCC','AABCBCD',1); # distance is 2
    # prints Int
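
The $max cutoff described for both routines can be sketched as follows. This is only one way such an early exit could work, under the assumption that an exceeded limit is signalled by returning the undefined Int type object; the function name sketch-ld-capped is hypothetical and the module's actual implementation may differ.

```raku
# Hypothetical helper for illustration only; not the module's implementation.
# Levenshtein distance with an optional cutoff: returns the undefined Int
# type object as soon as the distance is known to exceed $max (0 = unlimited).
sub sketch-ld-capped(Str $source, Str $target, Int $max = 0) {
    my @s = $source.comb;
    my @t = $target.comb;

    # The length difference is a lower bound on the distance
    return Int if $max > 0 && abs(@s.elems - @t.elems) > $max;

    # Only the previous row of the distance table is needed
    my @prev = 0 .. @t.elems;
    for 1 .. @s.elems -> $i {
        my @curr = $i;
        for 1 .. @t.elems -> $j {
            my $cost = @s[$i - 1] eq @t[$j - 1] ?? 0 !! 1;
            @curr[$j] = min(@prev[$j] + 1, @curr[$j - 1] + 1, @prev[$j - 1] + $cost);
        }
        # The smallest value in a row never decreases in later rows,
        # so the final distance cannot drop back under $max
        return Int if $max > 0 && @curr.min > $max;
        @prev = @curr;
    }

    my $distance = @prev[@t.elems];
    return $max > 0 && $distance > $max ?? Int !! $distance;
}

say sketch-ld-capped('Neil', 'Niel');             # 2
say sketch-ld-capped('Neil', 'Niel', 1).defined;  # False
```

Because the result is undefined when the limit is exceeded, callers can test it with .defined or supply a fallback with the // operator.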

Methods

  • new

    Damerau Levenshtein Distance (Levenshtein Distance including transpositions)

    Arguments: :@sources, :@targets, :$max? (passed as named arguments, e.g. sources => [...])

    $max distance. 0 = unlimited. Default = 0

    Creates a new object. For now, its only purpose is to let you call get_results on it.

  • get_results

    Generates %results and sets the following attributes:

    $.best_distance
    $.best_target
    $.best_source if @.sources.elems > 1
    %.results
    # %results{$source}{$target} = $distance_result

    Returns: %.results
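
As a rough illustration of the nested %results shape, the sketch below walks such a hash and picks the closest target for each source. The data is made up (as if $max had been set to 2, so one distance is undefined), and closest-target is a hypothetical helper, not something the module provides.

```raku
# Hypothetical helper, not part of the module: picks the closest target
# from one inner hash of %results, ignoring undefined distances.
sub closest-target(%targets) {
    %targets.grep(*.value.defined).min(*.value);
}

# Made-up data in the documented shape:
# %results{$source}{$target} = $distance_result
my %results = Jonh => { John => 1, Jose => 2, Jonathan => Int };

for %results.kv -> $source, %targets {
    my $best = closest-target(%targets);
    say "$source -> {$best.key} (distance {$best.value})";
}
# prints Jonh -> John (distance 1)
```

The helper returns a Pair, so the target name and its distance are available as .key and .value.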

Bugs

Please report bugs to:

https://github.com/ugexe/Perl6-Text--Levenshtein--Damerau/issues

Author

Nick Logan nlogan@gmail.com