Modified to work around module load order bug, flatten method call arguments, added tests and updated README
1 parent 0417c5b commit b6f8f07f5560da02dd9df16c59624abaa54fca23 @thundergnat committed Nov 18, 2010
Showing with 270 additions and 227 deletions.
  1. +158 −149 README
  2. +37 −25 lib/Sort/Naturally.pm6
  3. +75 −53 t/01-basic.t
307 README
@@ -1,149 +1,158 @@
-NAME
-
-Sort::Naturally
-
-
-SYNOPSIS
-
- use v6;
- use Sort::Naturally;
-
- # sort strings containing a mix of letters and digits sensibly
- my @a =
- <1 11 100 14th 2 210 21 30 3rd d1 Any any d10 D2 D21 d3 aid Are ANY >;
-
- say ~@a.nsort;
- # or
- say @a.sort( { $^a ncmp $^b } ).join(' ');
-
-@a.nsort yields:
-
- 1 2 3rd 11 14th 21 30 100 141 210 aid ANY Any any Are d1 D2 d3 d10 D21
-
-compared to @a.sort:
-
- 1 100 11 141 14th 2 21 210 30 3rd ANY Any Are D2 D21 aid any d1 d10 d3
-
-
-
- # or sort a list of dotted quad notation IP addresses:
- my @ips = ((0..255).roll(4).join('.')for 0..99);
-
- .say for nsort @ips;
-
-
-DESCRIPTION
-
-Similar though not identical to the Perl 5 Sort::Naturally. When sorting strings
-that contain groups of digits, will sort the groups of consecutive digits by
-"order of magnitude", then lexically by lowercased terms. Order of magnitude is
-something of a simplification. Sort::Naturally does't try to interpret or
-evaluate a group of digits as a number, it just counts how many digits are in
-each group and uses that as its order of magnitude.
-
-The implications are:
-
- It doesn't understand the (non)significance of leading zeros in numbers;
- 0010, 0100 and 1000 are all treated as being of the same order of magnitude
- and will all be sorted to be after 20 and 200. This is the correct behavior
- for strings of digits where leading zeros are significant, like zip codes or
- phone numbers.
-
- It doesn't understand floating point numbers; the groups of digits before
- and after a decimal point are treated as two separate numbers for sorting
- purposes.
-
- It doesn't understand or deal with scientific or exponential notation.
-
-However, that also means:
-
- You are not limited by number magnitude. It will sort arbitrarily large
- numbers with ease regardless of the math capability of your hardware/OS.
-
- It is quite speedy. (For liberal values of speedy.) Since it doesn't need
- to interpret numbers by value it eliminates a lot of code that would do that.
-
-Sort::Naturally could have been modified to ignore leading zeros, and in fact I
-experimented with that bit, but ran into issues with strings where leading zeros
-WERE significant. Just remember, it is for sorting strings, not numbers. It
-makes some attempt at treating groups of digits in a kind of numbery way, but
-they are still strings. If you truly want to sort numbers, use a numeric sort.
-
-
-USAGE
-
-Sort::Naturally has at its heart a sorting block modifier transformation
-routine: &naturally. This performs a tranform of the sort terms so they will end
-up in the natural order.
-
-C<nsort> is meant to be used the primary natural sorting routine. Syntactic
-sugar for C<sort( { .&naturally } )>. May be called either as a method or a sub:
-C<@array.nsort> or C<nsort @array>.
-
-C<ncmp> is meant to be used in sort blocks. Syntactic sugar for C<sort( {
-$^a.naturally cmp $^b.naturally } )>. Useful when you need to do secondary
-sorts.
-
-Say you have a hash containg the words in a document with the keys being the
-number of times each appears. You could sort by word frequency, then naturally
-as follows:
-
- ("%words{$_}, $_").say
- for sort {%words{$^b} <=> %words{$^a} || $^a ncmp $^b}, %words.keys;
-
-Note: using a sort block with an arity > 1 will disable the default Schwartzian
-Transform and may be very slow. If that is an issue either do a manual
-Schwartzian Transform or some kind of caching of terms.
-
-
-BACKWARD COMPATIBILITY
-
-Perl 5 Sort::Naturally has an odd convention in that numbers at the beginning of
-strings are sorted in ASCII order (digits sort before letters) but numbers
-embedded inside strings are sorted in non-ASCII order (digits sort after
-letters). While this is just plain strange in my opinion, some people may rely
-on or prefer this behavior so perl6 Sort::Naturally has "p5 compatibility mode"
-routines. These are analogues of the primary routines prepended with p5.
-
-C<p5naturally()>, C<p5nsort()> and C<p5ncmp>. Used identically to the p6
-versions.
-
-for comparison:
-
-
- (' sort:',<foo12z foo foo13a fooa Foolio Foo12a foolio foo12 foo12a 9x 14>\
- .sort).join(' ').say;
- (' nsort:',<foo12z foo foo13a fooa Foolio Foo12a foolio foo12 foo12a 9x 14>\
- .nsort).join(' ').say;
- ('p5nsort:',<foo12z foo foo13a fooa Foolio Foo12a foolio foo12 foo12a 9x 14>\
- .p5nsort).join(' ').say;
-
-yeilds:
-
- sort: 14 9x Foo12a Foolio foo foo12 foo12a foo12z foo13a fooa foolio
- nsort: 9x 14 foo foo12 Foo12a foo12a foo12z foo13a fooa Foolio foolio
- p5nsort: 9x 14 foo fooa Foolio foolio foo12 Foo12a foo12a foo12z foo13a
-
-
-
-BUGS
-
-Tests and the p5* routines will fail under locales that specify lower case
-letters to sort before upper case. (EBCDIC locales notably). They will still
-sort consistently, just not in the order advertised. I can probably implement
-some kind of run time check to modify the behavior based on current locale.
-I'll look into it more seriously later if necessary.
-
-Load on demand is not working yet. By default all methods and subroutines are
-imported into the lexical scope. More an annoyance than a bug.
-
-
-AUTHOR
-
-Stephen Schulze (often seen lurking on perlmonks.org and #perl6 IRC as
-thundergnat)
-
-LICENSE
-
-Licensed under The Artistic 2.0; see LICENSE.
-
+NAME
+
+Sort::Naturally
+
+
+SYNOPSIS
+
+ use v6;
+ use Sort::Naturally;
+
+ # sort strings containing a mix of letters and digits sensibly
+ my @a =
+ <1 11 100 14th 2 210 21 30 3rd d1 Any any d10 D2 D21 d3 aid Are ANY >;
+
+ say ~@a.nsort;
+ # or
+ say @a.sort( { $^a ncmp $^b } ).join(' ');
+
+@a.nsort yields:
+
+ 1 2 3rd 11 14th 21 30 100 141 210 aid ANY Any any Are d1 D2 d3 d10 D21
+
+compared to @a.sort:
+
+ 1 100 11 141 14th 2 21 210 30 3rd ANY Any Are D2 D21 aid any d1 d10 d3
+
+
+
+ # or sort a list of dotted quad notation IP addresses:
+ my @ips = ((0..255).roll(4).join('.')for 0..99);
+
+ .say for nsort @ips;
+
+
+DESCRIPTION
+
+Similar though not identical to the Perl 5 Sort::Naturally. When sorting strings
+that contain groups of digits, will sort the groups of consecutive digits by
+"order of magnitude", then lexically by lower-cased terms. Order of magnitude is
+something of a simplification. Sort::Naturally doesn't try to interpret or
+evaluate a group of digits as a number, it just counts how many digits are in
+each group and uses that as its order of magnitude.
+
+The implications are:
+
+ It doesn't understand the (non)significance of leading zeros in numbers;
+ 0010, 0100 and 1000 are all treated as being of the same order of magnitude
+ and will all be sorted to be after 20 and 200. This is the correct behavior
+ for strings of digits where leading zeros are significant, like zip codes or
+ phone numbers.
+
+ It doesn't understand floating point numbers; the groups of digits before
+ and after a decimal point are treated as two separate numbers for sorting
+ purposes.
+
+ It doesn't understand or deal with scientific or exponential notation.
+
+However, that also means:
+
+ You are not limited by number magnitude. It will sort arbitrarily large
+ numbers with ease regardless of the math capability of your hardware/OS.
+
+ It is quite speedy. (For liberal values of speedy.) Since it doesn't need to
+ interpret numbers by value it eliminates a lot of code that would do that.
+
+Sort::Naturally could have been modified to ignore leading zeros, and in fact I
+experimented with that bit, but ran into issues with strings where leading zeros
+WERE significant. Just remember, it is for sorting strings, not numbers. It
+makes some attempt at treating groups of digits in a kind of numbery way, but
+they are still strings. If you truly want to sort numbers, use a numeric sort.
+
+
+USAGE
+
+Sort::Naturally has at its heart a sorting block modifier transformation
+routine: &naturally. This performs a transform of the sort terms so they will end
+up in the natural order.
+
+C<nsort> is meant to be used as the primary natural sorting routine. Syntactic
+sugar for C<sort( { .&naturally } )>. May be called either as a method or a sub:
+C<@array.nsort> or C<nsort @array>.
+
+C<ncmp> is meant to be used in sort blocks. Syntactic sugar for C<sort( {
+$^a.naturally cmp $^b.naturally } )>. Useful when you need to do secondary
+sorts.
+
+Say you have a hash containing the words in a document with the values being the
+number of times each appears. You could sort by word frequency, then naturally
+as follows:
+
+ ("%words{$_}, $_").say
+ for sort {%words{$^b} <=> %words{$^a} || $^a ncmp $^b}, %words.keys;
+
+Note: using a sort block with an arity > 1 will disable the default Schwartzian
+Transform and may be very slow. If that is an issue either do a manual
+Schwartzian Transform or some kind of caching of terms.
+
+nsort will flatten all parameters passed to it into lists whether called as a
+method or sub. (Previously, method calling conventions required manual
+flattening.) This means (@a1,@a2,@a3).nsort will return a list comprised of the
+contents of each of the arrays sorted in natural order, rather than a list of
+the three arrays nsorted on their names. This seems to be more what would be
+expected following the <a
+href="http://en.wikipedia.org/wiki/Principle_of_least_astonishment">Principle of
+least astonishment</a>.
+
+
+BACKWARD COMPATIBILITY
+
+Perl 5 Sort::Naturally has an odd convention in that numbers at the beginning of
+strings are sorted in ASCII order (digits sort before letters) but numbers
+embedded inside strings are sorted in non-ASCII order (digits sort after
+letters). While this is just plain strange in my opinion, some people may rely
+on or prefer this behavior so perl6 Sort::Naturally has "p5 compatibility mode"
+routines. These are analogues of the primary routines prepended with p5.
+
+C<p5naturally()>, C<p5nsort()> and C<p5ncmp>. Used identically to the p6
+versions.
+
+for comparison:
+
+
+ (' sort:',<foo12z foo foo13a fooa Foolio Foo12a foolio foo12 foo12a 9x 14>\
+ .sort).join(' ').say;
+ (' nsort:',<foo12z foo foo13a fooa Foolio Foo12a foolio foo12 foo12a 9x 14>\
+ .nsort).join(' ').say;
+ ('p5nsort:',<foo12z foo foo13a fooa Foolio Foo12a foolio foo12 foo12a 9x 14>\
+ .p5nsort).join(' ').say;
+
+yields:
+
+ sort: 14 9x Foo12a Foolio foo foo12 foo12a foo12z foo13a fooa foolio
+ nsort: 9x 14 foo foo12 Foo12a foo12a foo12z foo13a fooa Foolio foolio
+ p5nsort: 9x 14 foo fooa Foolio foolio foo12 Foo12a foo12a foo12z foo13a
+
+
+
+BUGS
+
+Tests and the p5* routines will fail under locales that specify lower case
+letters to sort before upper case. (EBCDIC locales notably). They will still
+sort consistently, just not in the order advertised. I can probably implement
+some kind of run time check to modify the behavior based on current locale.
+I'll look into it more seriously later if necessary.
+
+Load on demand is not working yet. By default all methods and subroutines are
+imported into the lexical scope. More an annoyance than a bug.
+
+
+AUTHOR
+
+Stephen Schulze (often seen lurking on perlmonks.org and #perl6 IRC as
+thundergnat)
+
+LICENSE
+
+Licensed under The Artistic 2.0; see LICENSE.
+
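The flattening behavior described in the updated README can be tried out with a small stand-alone sketch. It is written for current Raku (the language formerly called Perl 6) rather than the 2010 Rakudo this commit targets, and the names `natural-key` and `nsort-sketch` are hypothetical stand-ins, not part of the module. It only assumes that a slurpy parameter flattens its arguments in the way the README describes for nsort.

```raku
# Hypothetical helper for illustration; it mirrors the README's description
# but is not the module's own code.
sub natural-key(Str() $s) {
    # Prefix each digit run with "0" plus a character whose code point is the
    # run's length, so shorter digit runs sort first; append the original
    # string after a NUL so ties fall back to the original spelling.
    $s.lc.subst(/(\d+)/, -> $/ { 0 ~ $0.chars.chr ~ $0 }, :g) ~ "\x0" ~ $s
}

# A slurpy *@items parameter flattens every array passed in, so the
# contents of the arrays are sorted rather than the arrays themselves.
sub nsort-sketch(*@items) { @items.sort(&natural-key) }

my @a1 = <b10 b2>;
my @a2 = <a3 a20>;

say nsort-sketch(@a1, @a2).join(' ');    # a3 a20 b2 b10
```

The result is a single sorted list built from both arrays, which is the outcome the README paragraph on flattening argues is the least surprising one.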
62 lib/Sort/Naturally.pm6
@@ -1,25 +1,37 @@
-module Sort::Naturally:ver<0.1.0>;
-use v6;
-use MONKEY_TYPING;
-
-augment class Any {
- method nsort is export(:standard) { self.list.sort( { .&naturally } ) };
- method p5nsort is export(:p5) { self.list.sort( { .&p5naturally } ) };
-}
-
-sub infix:<ncmp>($a, $b) is export(:standard) {$a.&naturally cmp $b.&naturally}
-
-sub infix:<p5ncmp>($a, $b) is export(:p5) {$a.&p5naturally cmp $b.&p5naturally}
-
-sub naturally ($a) is export(:standard) {
- $a.lc.subst(/(\d+)/, ->$/ { 0 ~ $0.chars.chr ~ $0 }, :g) ~ "\x0" ~ $a
-}
-
-sub p5naturally ($a) is export(:p5) {
- $a.lc.subst(/^(\d+)/, -> $/ { 0 ~ $0.chars.chr ~ $0 } )\
- # Less than awesome use of captures, but rakudo doesn't have <?after ...>
- # lookaround implemented yet. Really should be:
- # .subst(/<?after \D>(\d+)/, -> $/ { 'z{' ~ $0.chars.chr ~ $0 }, :g)
- .subst(/(\D)(\d+)/, -> $/ { $0 ~ 'z{' ~ $1.chars.chr ~ $1 }, :g)
- ~ "\x0" ~ $a
-}
+module Sort::Naturally:ver<0.1.1>;
+use v6;
+use MONKEY_TYPING;
+
+augment class Any {
+ multi method nsort is export(:standard) { self.list.flat.sort( { .&naturally } ) };
+ multi method p5nsort is export(:p5) { self.list.flat.sort( { .&p5naturally } ) };
+}
+
+
+# Work-around multi subs nsort and p5nsort. Shouldn't be necessary but
+# module load order can affect class method exporting and make them unfindable.
+
+multi sub nsort(*@a) is export(:standard) { @a.nsort; };
+multi sub p5nsort(*@a) is export(:p5) { @a.p5nsort };
+
+
+# Sort modifier block routines
+
+sub infix:<ncmp>($a, $b) is export(:standard) {$a.&naturally cmp $b.&naturally}
+sub infix:<p5ncmp>($a, $b) is export(:p5) {$a.&p5naturally cmp $b.&p5naturally}
+
+
+# core routines to actually do the transformation for sorting
+
+sub naturally ($a) is export(:standard) {
+ $a.lc.subst(/(\d+)/, ->$/ { 0 ~ $0.chars.chr ~ $0 }, :g) ~ "\x0" ~ $a
+}
+
+sub p5naturally ($a) is export(:p5) {
+ $a.lc.subst(/^(\d+)/, -> $/ { 0 ~ $0.chars.chr ~ $0 } )\
+ # Less than awesome use of captures, but rakudo doesn't have
+ # <?after ...> lookbehind implemented yet for .subst(). Really should be:
+ # .subst(/<?after \D>(\d+)/, -> $/ { 'z{' ~ $0.chars.chr ~ $0 }, :g)
+ .subst(/(\D)(\d+)/, -> $/ { $0 ~ 'z{' ~ $1.chars.chr ~ $1 }, :g)
+ ~ "\x0" ~ $a
+}
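To see what the transformation in `naturally` does to sort order, here is a minimal sketch for current Raku. The sub name `natural-key` is hypothetical and the body is adapted from the diff above, so treat it as an illustration of the technique rather than the module's published interface. It shows the README's point about leading zeros: digit runs are ranked by length first, then compared as strings.

```raku
# Hypothetical stand-alone copy of the key transform, adapted from the diff.
sub natural-key(Str() $s) {
    # Each digit run becomes "0" ~ chr(length) ~ digits, so a 2-digit run
    # sorts before any 3-digit run, whatever the digits are.
    $s.lc.subst(/(\d+)/, -> $/ { 0 ~ $0.chars.chr ~ $0 }, :g) ~ "\x0" ~ $s
}

# Four-digit strings sort after "20" and "200" even with leading zeros.
say ('1000', '20', '0100', '200', '0010').sort(&natural-key).join(' ');
# 20 200 0010 0100 1000

# Letters are compared lower-cased; embedded digit runs are ranked by length.
say <d10 d1 D2 d3>.sort(&natural-key).join(' ');
# d1 D2 d3 d10
```

Because the key is an ordinary string, the sort block has arity 1 and the comparison stays a plain string comparison, which is why no numeric conversion is needed.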