Faq: string common operations

tinmarino · tinmarino · commit 27649529e6b3 · 2020-03-28T14:46:46.000-03:00
diff --git a/doc/Language/faq.pod6 b/doc/Language/faq.pod6
@@ -299,6 +299,77 @@ dependencies within a single compilation unit (e.g., file) are possible
 through stubbing. Therefore another possible solution is to move
 classes into the same compilation unit.
 
+
+=head1 Common operations
+
+=head2 String: How can I parse and get a L<number|/language/numerics> from a L<string|/type/Str>?
+
+Use the L<+ prefix|/language/operators#prefix_+>:
+
+=begin code :ok-test<dd>
+say +"42.123456789123456789"; # OUTPUT: «42.123456789123456789␤»
+say +"42.4e-2";               # OUTPUT: «0.424␤»
+=end code
+
+This prefix can numify any string you could enter as a L<literal number|/language/syntax#Number_literals>.
+The L<val|/routine/val> routine converts a string to an L<allomorph|/language/glossary#Allomorph>.
+The L<unival|/routine/unival> routine converts a single Unicode codepoint to its numeric value.
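+
+As a minimal sketch of the two routines mentioned above (the sample strings are chosen for illustration only):
+
+=begin code
+say val("42").^name;  # OUTPUT: «IntStr␤»
+say unival("½");      # OUTPUT: «0.5␤»
+=end code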
+
+=head2 String: How can I check if a string contains a substring and, if so, how can I get the indices of the matches?
+
+Use L<.contains|/type/Str#method_contains> or L<.indices|/type/Str#method_indices>:
+
+=begin code :ok-test<dd>
+say "az and az and az again".contains("az"); # OUTPUT: «True␤»
+say "az and az and az again".indices("az");  # OUTPUT: «(0 7 14)␤»
+=end code
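+
+As a small illustration (the non-matching substring "zz" is an arbitrary example), L<.indices|/type/Str#method_indices> returns an empty list when nothing matches, so its result can also serve as the check:
+
+=begin code
+say "az and az".indices("zz");     # OUTPUT: «()␤»
+say so "az and az".indices("zz");  # OUTPUT: «False␤»
+=end code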
+
+=head2 String: How can I get the hexadecimal representation of a string?
+
+First convert it to a L<Blob|/type/Blob> with L<.encode|/routine/encode>, then call C<.base(16)> on each byte:
+
+=begin code :ok-test<dd>
+say "I ❤ 🦋".encode>>.base(16);  # OUTPUT: «(49 20 E2 9D A4 20 F0 9F A6 8B)␤»
+=end code
+
+Note that L<.gist|/routine/gist> or L<.raku|/routine/perl> methods are your friends when L<debugging|/programs/01-debugging>:
+
+=begin code
+say "I ❤ 🦋".encode.raku;  # OUTPUT: «utf8.new(73,32,226,157,164,32,240,159,166,139)␤»
+say "I ❤ 🦋".encode.gist;  # OUTPUT: «utf8:0x<49 20 E2 9D A4 20 F0 9F A6 8B>␤»
+=end code
+
+=head2 String: How can I remove some characters from a string by index?
+
+Use L<.comb|/routine/comb> to turn it into a L<Seq|/type/Seq>, then the L<(-) infix|/language/operators#infix_(-),_infix_\\> to remove the unwanted indices:
+
+=begin code :ok-test<dd>
+say '0123456789'.comb[(^* (-) (1..3, 8).flat).keys.sort].join;  # OUTPUT: «045679␤»
+=end code
+
+If the string is large, L<.comb|/routine/comb> can take time. In that case, L<.substr-rw|/routine/substr-rw> is faster:
+
+=begin code :ok-test<dd>
+multi postcircumfix:<[- ]> (Str:D $str is copy, +@indices) {
+    for @indices.reverse {  # delete from the end so earlier indices stay valid
+        when Int   { $str.substr-rw($_,1) = '' }
+        when Range { $str.substr-rw($_  ) = '' }
+    }
+    return $str;
+}
+
+say '0123456789'[- 1..3, 8 ];  # OUTPUT: «045679␤»
+=end code
+
+=head2 String: How can I split a string into equal parts?
+
+L<.comb|/routine/comb> accepts an L<Int|/type/Int> or a L<Regex|/type/Regex>:
+
+=begin code :ok-test<dd>
+.say for 'abcdefghijklmnopqrstuvwxyz'.comb: 8;  # OUTPUT: «abcdefgh␤ijklmnop␤qrstuvwx␤yz␤»
+.say for 'abcdefg4444hijklmnop4444qrstuvwxyz'.comb: /..\d+../;  # OUTPUT: «fg4444hi␤op4444qr␤»
+=end code
+
 =head1 Language features
 
 X<|Data::Dumper (FAQ)>