bucketsort - bucket sort that can be used for general purpose
Perl C
Switch branches/tags
Nothing to show
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
Makefile
README
bench-num.c
bench-str.c
benchmark.pod
bucketsort-omp.c
bucketsort.c
bucketsort.h
bucketsort.pod
main.c

README

NAME
    bucketsort - bucket sort

SYNOPSIS
      #include "bucketsort.h"
      int bucketsort(
        void        **base,
        size_t        nmemb,
        keyaccessor_t key,
        indexer_t     idx,
        comparator_t  cmp
      );

DESCRIPTION
    The bucketsort() is an implementations of bucket sort.

  base
    The reference to the array of pointers to the element (or integers whose
    size equals to that of pointer) to be sorted in-place.

  nmemb
    The nember of elements in the array.

  key
    The function pointer which takes the reference to the array element and
    returns the reference to the key.

      void *key(void *element);

    When "NULL" the default function is used which is defined as:

      inline void *identity(void *v)
      {
        return v;
      }

  idx
    The function pointer which takes the reference to the key and the
    recursion level of the caller and returns the integer value between 0
    and 255, which is the bucket id.

    When "NULL" the default function is used which is defined as:

      inline int charAt(void *v, size_t i)
      {
        char *s = (char *) v;
        return s ? strlen(s) > i ? s[i] : 0 : 0;
      };

    In other words, default acts the same as the MSD radix sort with
    variable lengths.

  cmp
    The Function pointer which compares two keys. This is needed to allow
    elements with same keys. Its signature is the same as the one you pass
    to "qsort()", "mergesort()" and "heapsort()" in libc.

    When "NULL", it defaults to:

      inline int stringcmp(void *a, void *b)
      {
        return strcmp(a ? a : "", b ? b : "");
      }

EXAMPLE
    See main.c which is included in the distro.

IMPLEMENTATION DETAILS
    The bucket is implemented as a queue by linked lists. The whole array is
    converted to the linked list before sorted then written back to the
    array. If you like you can use the list sorter directly as:

      list_t bucketsort_l(
        list_t        src,
        size_t        nmemb,
        keyaccessor_t key
        indexer_t     idx,
        comparator_t  cmp
      );

    where:

      typedef struct {
        void *car, *cdr;
      } _cons_t, *list_t;

ACHILLES'S HEEL AND ITS SHIELD
    The obvious drawback of the recursive bucket sort is that it recurses
    deeply when it comes accross data (strings) with long common prefix.
    Given:

      aa..a  # "a" x n
      aa..aa # "a" x (n+1)

    It recurses n times.

    To cope with such cases, "bucketsort()" now falls back to the merge sort
    after "BUCKETSORT_DEPTH", 256 in bucketsort.h. The merge sort used there
    is defined as:

      list_t mergesort_l(
        list_t        src,
        comparator_t  cmp
      );

RETURN VALUES
    The bucketsort() function returns the value 0 if successful; otherwise
    the value -1 is returned and the global variable errno is set to
    indicate the error.

ERRORS
  ENOMEM
    When "bucketsort()" fails to allocate the memory needed to build the
    list, which is exactly twice the size of the original array since each
    element takes two pointers.

SEE ALSO
    sort(3), qsort(3), megesort(3), heapsort(3), radixsort(3)

    <http://en.wikipedia.org/wiki/Bucket_sort>

    Engineering Radix Sort
      <http://usenix.org/publications/compsystems/1993/win_mcilroy.pdf>

      Consequently this program is a generalized verion of the Program A in
      the paper. Their program (and radixsort(1) in 4.4 BSD) is by limiting
      its application to string sorting.