Commit 8fd8c8d

Fix and explain .slurp mangling newlines
Resolves #1578, or so I think.
1 parent 136acc7 commit 8fd8c8d

1 file changed: +24 -14 lines changed

doc/Language/traps.pod6

Lines changed: 24 additions & 14 deletions
@@ -535,7 +535,7 @@ and you can use C<:i> (C<:ignorecase>) adverb instead.
 
 =head1 Pairs
 
-=head2 Constants on LHS of pair notation
+=head2 Constants on the LHS of pair notation
 
 Consider this code:
 
@@ -559,6 +559,7 @@ string literal, as long as it looks like an identifier.
 
 To avoid this, use C«(Dog) => 42» or C«::Dog => 42».
 
+
 =head1 Operators
 
 Some operators commonly shared among other languages were repurposed in Perl 6 for other, more common, things:
@@ -962,11 +963,11 @@ L«C<Str>|/type/Str#routine_lines». The trap arises if you start
 assuming that both split data the same way.
 
 =begin code :skip-test
-say $_.perl for $*IN.lines # or just 「lines」
-# OUTPUT:
-# "foox"
-# "fooy\rbar"
-# "fooz"
+say $_.perl for $*IN.lines # .lines called on IO::Handle
+# OUTPUT:
+# "foox"
+# "fooy\rbar"
+# "fooz"
 =end code
 
 As you can see in the example above, there was a line which contained
@@ -979,12 +980,12 @@ systems. Therefore, it will split by all possible variations of a
 newline.
 
 =begin code :skip-test
-say $_.perl for $*IN.slurp.lines # or just 「slurp.lines
-# OUTPUT:
-# "foox"
-# "fooy"
-# "bar"
-# "fooz"
+say $_.perl for $*IN.slurp(:bin).decode.lines # .lines called on a Str
+# OUTPUT:
+# "foox"
+# "fooy"
+# "bar"
+# "fooz"
 =end code
 
 The rule is quite simple: use
@@ -997,7 +998,16 @@ Use C<$data.split(“\n”)> in cases where you need the behavior of
 L«C<IO::Handle.lines>|/type/IO::Handle#routine_lines» but the original
 L<IO::Handle> is not available.
 
-=comment RT #131923
+=comment RT#132154
+
+Note that if you really want to slurp the data first, then you will
+have to use C<.IO.slurp(:bin).decode.split(“\n”)>. Notice how we use
+C<:bin> to prevent it from doing the decoding, only to call C<.decode>
+later anyway. All that is needed because C<.slurp> is assuming that
+you are working with text and therefore it attempts to be smart about
+newlines.
+
+=comment RT#131923
 
 If you are using L<Proc::Async>, then there is currently no easy way
 to make it split data the right way. You can try reading the whole
@@ -1056,7 +1066,7 @@ whenever $proc.print: “one\ntwo\nthree\nfour” {
 }
 =end code
 
-=head2 Using <.stdout> without <.lines>
+=head2 Using C<.stdout> without C<.lines>
 
 Method <.stdout> of L<Proc::Async> returns a supply that emits
 I<chunks> of data, not lines. The trap is that sometimes people assume
