-
Notifications
You must be signed in to change notification settings - Fork 0
Does a set of code words form a uniquely decodable code?
ap/UDCode
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
use strict; use warnings; package UDCode; our $VERSION = '1.04'; BEGIN { require Exporter; *import = \&Exporter::import } our @EXPORT = qw(is_udcode ud_pair); =head1 NAME UDCode - Does a set of code words form a uniquely decodable code? =head1 SYNOPSIS use UDCode; if (is_udcode(@words)) { ... } my ($x1, $x2) = ud_pair(@words); =head1 DESCRIPTION A I<code> is a set of strings, called the I<code words>. A code is I<uniquely decodable> if any string I<S> that is a concatenation of code words is so in I<exactly one way>. For example, the code C<('ab', 'abba', 'b')> is I<not> uniquely decodable, because C<'abba' . 'b' eq 'ab' . 'b' . 'ab'>. But the code C<('a', 'ab', 'abb')> I<is> uniquely decodable, because there is no such pair of sequences of code words. This module provides a pair of functions to tell whether a set of code words is a uniquely decodable code, and to find an example of sequences of code words whose concatenations are the same, if there is such a pair. =head1 INTERFACE =head2 C<is_udcode> C<is_udcode(@words)> returns true if and only if the specified code is uniquely decodable. =cut sub is_udcode { my $N = my ($a, $b) = ud_pair(@_); return $N == 0; } =head2 C<ud_pair> If C<@words> is not a uniquely decodable code, then C<ud_pair(@words)> returns a proof of that fact, in the form of two distinct sequences of code words whose concatenations are equal. If C<@words> is not uniquely decodable, then C<ud_pair> returns references to two arrays of code words, C<$a>, and C<$b>, such that: join("", @$a) eq join("", @$b) For example, given C<@words = qw(ab abba b)>, C<ud_pair> might return the two arrays C<["ab", "b", "ab"]> and C<["abba", "b"]>. If C<@words> is uniquely decodable, C<ud_pair> returns false. =cut sub ud_pair { # Code words my @c = @_; # $h{$x} = [$y, $z] means that $x$y eq $z my %h; # Queue my @q; for my $c1 (@c) { for my $c2 (@c) { next if $c1 eq $c2; if (is_prefix_of($c1, $c2)) { my $x = subtract($c1, $c2); $h{$x} = [[$c1], [$c2]]; push @q, $x; } } } while (@q) { my $x = shift @q; return unless defined $x; my ($a, $b) = @{$h{$x}}; for my $c (@c) { die unless defined $b; # Can't happen # $a$x eq $b my $y; if (is_prefix_of($c, $x)) { $y = subtract($c, $x); next if exists $h{$y}; # already tried this $h{$y} = [[@$a, $c], $b]; push @q, $y; } elsif (is_prefix_of($x, $c)) { $y = subtract($x, $c); next if exists $h{$y}; # already tried this $h{$y} = [$b, [@$a, $c]]; push @q, $y; } return @{$h{""}} if defined($y) && $y eq ""; } } return; # failure } sub is_prefix_of { index($_[1], $_[0]) == 0; } sub subtract { substr($_[1], length($_[0])); } =head1 AUTHOR Mark Jason Dominus =head1 COPYRIGHT AND LICENSE This software is hereby released into the public domain. You may use, modify, or distribute it for any purpose whatsoever without restriction. =cut 1;
About
Does a set of code words form a uniquely decodable code?