Skip to content
/ UDCode Public

Does a set of code words form a uniquely decodable code?

Notifications You must be signed in to change notification settings

ap/UDCode

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

use strict; use warnings;

package UDCode;

our $VERSION = '1.04';

BEGIN { require Exporter; *import = \&Exporter::import }
our @EXPORT = qw(is_udcode ud_pair);

=head1 NAME

UDCode - Does a set of code words form a uniquely decodable code?

=head1 SYNOPSIS

        use UDCode;

        if (is_udcode(@words)) { ... }

        my ($x1, $x2) = ud_pair(@words);

=head1 DESCRIPTION

A I<code> is a set of strings, called the I<code words>. A code is
I<uniquely decodable> if any string I<S> that is a concatenation of
code words is so in I<exactly one way>.

For example, the code C<('ab', 'abba', 'b')> is I<not> uniquely
decodable, because C<'abba' . 'b' eq 'ab' . 'b' . 'ab'>. But the code
C<('a', 'ab', 'abb')> I<is> uniquely decodable, because there is no such
pair of sequences of code words.

This module provides a pair of functions to tell whether a set of
code words is a uniquely decodable code, and to find an example of
sequences of code words whose concatenations are the same, if there is
such a pair.

=head1 INTERFACE

=head2 C<is_udcode>

C<is_udcode(@words)> returns true if and only if the specified code is
uniquely decodable.

=cut

sub is_udcode {
  my $N = my ($a, $b) = ud_pair(@_);
  return $N == 0;
}

=head2 C<ud_pair>

If C<@words> is not a uniquely decodable code, then C<ud_pair(@words)>
returns a proof of that fact, in the form of two distinct sequences of
code words whose concatenations are equal.

If C<@words> is not uniquely decodable, then C<ud_pair> returns
references to two arrays of code words, C<$a>, and C<$b>, such that:

	join("", @$a) eq join("", @$b)

For example, given C<@words = qw(ab abba b)>, C<ud_pair> might return
the two arrays C<["ab", "b", "ab"]> and C<["abba", "b"]>.

If C<@words> is uniquely decodable, C<ud_pair> returns false.

=cut

sub ud_pair {
  # Code words
  my @c = @_;

  # $h{$x} = [$y, $z]  means that $x$y eq $z
  my %h;

  # Queue
  my @q;

  for my $c1 (@c) {
    for my $c2 (@c) {
      next if $c1 eq $c2;
      if (is_prefix_of($c1, $c2)) {
        my $x = subtract($c1, $c2);
        $h{$x} = [[$c1], [$c2]];
        push @q, $x;
      }
    }
  }

  while (@q) {
    my $x = shift @q;
    return unless defined $x;

    my ($a, $b) = @{$h{$x}};
    for my $c (@c) {
      die unless defined $b;      # Can't happen
      # $a$x eq $b

      my $y;
      if (is_prefix_of($c, $x)) {
        $y = subtract($c, $x);
        next if exists $h{$y};  # already tried this
        $h{$y} = [[@$a, $c], $b];
        push @q, $y;
      } elsif (is_prefix_of($x, $c)) {
        $y = subtract($x, $c);
        next if exists $h{$y};  # already tried this
        $h{$y} = [$b, [@$a, $c]];
        push @q, $y;
      }

      return @{$h{""}} if defined($y) && $y eq "";
    }
  }
  return;                       # failure
}

sub is_prefix_of {
  index($_[1], $_[0]) == 0;
}

sub subtract {
  substr($_[1], length($_[0]));
}

=head1 AUTHOR

Mark Jason Dominus

=head1 COPYRIGHT AND LICENSE

This software is hereby released into the public domain.  You may use,
modify, or distribute it for any purpose whatsoever without restriction.

=cut

1;

About

Does a set of code words form a uniquely decodable code?

Topics

Resources

Stars

Watchers

Forks

Languages