Natural sorting in Perl 6

    use v6;
    use Sort::Naturally;

    # sort strings containing a mix of letters and digits sensibly
    my @a =
      <1 11 100 141 14th 2 210 21 30 3rd d1 Any any d10 D2 D21 d3 aid Are ANY>;
    say ~@a.nsort;
    # or
    say @a.sort( { $^a ncmp $^b } ).join(' ');

@a.nsort yields:

    1 2 3rd 11 14th 21 30 100 141 210 aid ANY Any any Are d1 D2 d3 d10 D21

compared to @a.sort:

    1 100 11 141 14th 2 21 210 30 3rd ANY Any Are D2 D21 aid any d1 d10 d3

    # or sort a list of dotted quad notation IP addresses:
    my @ips = ((0..255).roll(4).join('.') for 0..99);

    .say for nsort @ips;


Similar, though not identical, to the Perl 5 Sort::Naturally. When sorting
strings that contain groups of digits, it sorts the groups of consecutive digits
by "order of magnitude", then lexically by lower-cased terms. "Order of
magnitude" is something of a simplification: Sort::Naturally doesn't try to
interpret or evaluate a group of digits as a number; it just counts how many
digits are in each group and uses that count as the group's order of magnitude.
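The digit-counting idea can be sketched as a sort key. This is only an illustration and not necessarily how Sort::Naturally implements it: the helper name natural-key and the exact encoding (a literal 0, then a character whose code point is the digit count) are assumptions made for this example.

```perl6
use v6;

# Hypothetical helper, NOT part of Sort::Naturally's documented API.
# Lower-case the string, then prefix every run of digits with '0' plus a
# character whose code point equals the length of the run, so that shorter
# digit groups compare before longer ones under a plain string comparison.
# The original string is appended after a NUL as a case tie-breaker.
sub natural-key (Str() $s) {
    $s.lc.subst(/ \d+ /, -> $m { '0' ~ $m.Str.chars.chr ~ $m.Str }, :g)
      ~ "\x0" ~ $s
}

my @a = <1 11 100 14th 2 210 21 30 3rd d10 D2 d3>;

# A one-argument routine passed to sort is used as a key extractor.
say @a.sort(&natural-key).join(' ');
# 1 2 3rd 11 14th 21 30 100 210 D2 d3 d10
```

Because the key only records how many digits a group has, no group is ever converted to a number, which is what the implications below follow from.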

The implications are:

    It doesn't understand the (non)significance of leading zeros in numbers;
    0010, 0100 and 1000 are all treated as being of the same order of magnitude
    and will all be sorted to be after 20 and 200. This is the correct behavior
    for strings of digits where leading zeros are significant, like zip codes or
    phone numbers.

    It doesn't understand floating point numbers; the groups of digits before
    and after a decimal point are treated as two separate numbers for sorting.

    It doesn't understand or deal with scientific or exponential notation.

However, that also means:

    You are not limited by number magnitude. It will sort arbitrarily large
    numbers with ease regardless of the math capability of your hardware/OS.

    It is quite speedy. (For liberal values of speedy.) Since it doesn't need to
    interpret numbers by value it eliminates a lot of code that would do that.

Sort::Naturally could have been modified to ignore leading zeros, and in fact I
experimented with that a bit, but ran into issues with strings where leading zeros
WERE significant. Just remember, it is for sorting strings, not numbers. It
makes some attempt at treating groups of digits in a kind of numbery way, but
they are still strings. If you truly want to sort numbers, use a numeric sort.
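To make the strings-versus-numbers distinction concrete, the sketch below sorts the same list both ways. The sample values are made up for illustration, and the numeric sort uses the built-in sort with a numeric key rather than anything from this module.

```perl6
use v6;
use Sort::Naturally;

# Made-up sample data: digit strings where leading zeros may matter.
my @codes = '0100', '20', '1000', '200', '0010';

# Natural sort: groups are ordered by digit count, so the four-digit
# strings are expected to come after 20 and 200, as described above.
say @codes.nsort.join(' ');

# Numeric sort: each string is coerced to a number and compared by value.
say @codes.sort(+*).join(' ');
```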


Sort::Naturally has at its heart a sorting block modifier transformation
routine: &naturally. This performs a transform of the sort terms so they will end
up in the natural order.

C<nsort> is meant to be used as the primary natural sorting routine. It is
syntactic sugar for C<sort( { .&naturally } )> and may be called either as a
method or a sub: C<@array.nsort> or C<nsort @array>.

C<ncmp> is meant to be used in sort blocks. It is syntactic sugar for C<sort( {
$^a.naturally cmp $^b.naturally } )>. Useful when you need to do secondary
sorting.

Say you have a hash containing the words in a document, with the words as keys
and the number of times each appears as values. You could sort by word
frequency, then naturally, as follows:

    ("%words{$_}, $_").say
      for sort {%words{$^b} <=> %words{$^a} || $^a ncmp $^b}, %words.keys;

Note: using a sort block with an arity > 1 will disable the default Schwartzian
Transform and may be very slow. If that is an issue either do a manual
Schwartzian Transform or some kind of caching of terms.
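A manual Schwartzian Transform for the word-frequency case might look like the following. It is a sketch that assumes naturally is callable as an exported sub, as the .&naturally usage above suggests; the contents of %words are made up for illustration.

```perl6
use v6;
use Sort::Naturally;

# Made-up sample data: word => number of occurrences.
my %words = file10 => 3, file9 => 3, the => 7, File2 => 1;

# Decorate: [word, count, natural key]; the key is computed once per word.
my @decorated = %words.keys.map({ [ $_, %words{$_}, .&naturally ] });

# Sort: by count descending, then by the precomputed natural key.
my @ordered = @decorated.sort({ $^b[1] <=> $^a[1] || $^a[2] cmp $^b[2] });

# Undecorate: keep only the original word.
my @sorted = @ordered.map({ .[0] });

"%words{$_}, $_".say for @sorted;
```

The comparison block still has an arity of 2, but it only compares values that were computed up front, so the transform runs once per word instead of on every comparison.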

nsort will flatten all parameters passed to it into lists whether called as a
method or sub. (Previously, method calling conventions required manual
flattening.) This means (@a1,@a2,@a3).nsort will return a list made up of the
contents of each of the arrays sorted in natural order, rather than a list of
the three arrays nsorted on their names. This seems closer to what would be
expected, following the principle of least astonishment.
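A short illustration of the flattening behavior, with made-up array contents. If nsort flattens as described, the four strings come back as one naturally ordered list rather than as two arrays.

```perl6
use v6;
use Sort::Naturally;

# Made-up sample data.
my @a1 = 'b10', 'a2';
my @a2 = 'a10', 'b9';

# Method form on a list of arrays.
say (@a1, @a2).nsort.join(' ');

# Sub form with the same arguments.
say nsort(@a1, @a2).join(' ');
```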


Perl 5 Sort::Naturally has an odd convention in that numbers at the beginning of
strings are sorted in ASCII order (digits sort before letters) but numbers
embedded inside strings are sorted in non-ASCII order (digits sort after
letters). While this is just plain strange in my opinion, some people may rely
on or prefer this behavior, so the Perl 6 Sort::Naturally has "p5 compatibility
mode" routines. These are analogues of the primary routines, prefixed with p5.

C<p5naturally()>, C<p5nsort()> and C<p5ncmp>. Used identically to the p6
routines.

For comparison:

    ('   sort:',<foo12z foo foo13a fooa Foolio Foo12a foolio foo12 foo12a 9x 14>\
      .sort).join(' ').say;
    ('  nsort:',<foo12z foo foo13a fooa Foolio Foo12a foolio foo12 foo12a 9x 14>\
      .nsort).join(' ').say;
    ('p5nsort:',<foo12z foo foo13a fooa Foolio Foo12a foolio foo12 foo12a 9x 14>\
      .p5nsort).join(' ').say;

       sort: 14 9x Foo12a Foolio foo foo12 foo12a foo12z foo13a fooa foolio
      nsort: 9x 14 foo foo12 Foo12a foo12a foo12z foo13a fooa Foolio foolio
    p5nsort: 9x 14 foo fooa Foolio foolio foo12 Foo12a foo12a foo12z foo13a



Tests and the p5* routines will fail under locales that specify lower case
letters to sort before upper case (notably EBCDIC locales). They will still
sort consistently, just not in the order advertised. I can probably implement
some kind of run time check to modify the behavior based on current locale.
I'll look into it more seriously later if necessary.

Load on demand is not working yet. By default all methods and subroutines are
imported into the GLOBAL namespace. More an annoyance than a bug. 


Stephen Schulze (often seen lurking on and #perl6 IRC as


Licensed under The Artistic 2.0; see LICENSE.
