=encoding utf8
=head1 NAME
perl5140delta - what is new for perl v5.14.0
=head1 DESCRIPTION
This document describes differences between the 5.12.0 release and
the 5.14.0 release.
If you are upgrading from an earlier release such as 5.10.0, first read
L<perl5120delta>, which describes differences between 5.10.0 and
5.12.0.
Some of the bug fixes in this release have been backported to subsequent
releases of 5.12.x. Those are indicated with the 5.12.x version in
parentheses.
=head1 Notice
As described in L<perlpolicy>, the release of Perl 5.14.0 marks the
official end of support for Perl 5.10. Users of Perl 5.10 or earlier
should consider upgrading to a more recent release of Perl.
=head1 Core Enhancements
=head2 Unicode
=head3 Unicode Version 6.0 is now supported (mostly)
Perl comes with the Unicode 6.0 database updated with
L<Corrigendum #8|http://www.unicode.org/versions/corrigendum8.html>,
with one exception noted below.
See L<http://unicode.org/versions/Unicode6.0.0/> for details on the new
release. Perl does not support any Unicode provisional properties,
including the new ones for this release.
Unicode 6.0 has chosen to use the name C<BELL> for the character at U+1F514,
which is a symbol that looks like a bell, and is used in Japanese cell
phones. This conflicts with the long-standing Perl usage of having
C<BELL> mean the ASCII C<BEL> character, U+0007. In Perl 5.14,
C<\N{BELL}> continues to mean U+0007, but its use generates a
deprecation warning message unless such warnings are turned off. The
new name for U+0007 in Perl is C<ALERT>, which corresponds nicely
with the existing shorthand sequence for it, C<"\a">. C<\N{BEL}>
means U+0007, with no warning given. The character at U+1F514 has no
name in 5.14, but can be referred to by C<\N{U+1F514}>.
In Perl 5.16, C<\N{BELL}> will refer to U+1F514; all code
that uses C<\N{BELL}> should be converted to use C<\N{ALERT}>,
C<\N{BEL}>, or C<"\a"> before upgrading.
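As a minimal sketch of the conversion recommended above, the three replacement spellings can be checked to denote the same character. The C<use charnames ':full'> line is an assumption about how the names are loaded; it is not taken from the text above.

```perl
use strict;
use warnings;
use charnames ':full';    # assumption: load the full Unicode name table

# Each spelling below denotes U+0007 and avoids the deprecated \N{BELL}.
my @spellings = ("\N{ALERT}", "\N{BEL}", "\a");

printf "U+%04X\n", ord($_) for @spellings;
```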
=head3 Full functionality for C<use feature 'unicode_strings'>
This release provides full functionality for C<use feature
'unicode_strings'>. Under its scope, all string operations executed and
regular expressions compiled (even if executed outside its scope) have
Unicode semantics. See L<feature/"the 'unicode_strings' feature">.
However, see L</Inverted bracketed character classes and multi-character folds>,
below.
This feature avoids most forms of the "Unicode Bug" (see
L<perlunicode/The "Unicode Bug"> for details). If there is any
possibility that your code will process Unicode strings, you are
I<strongly> encouraged to use this subpragma to avoid nasty surprises.
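The effect can be sketched with a single native 8-bit character. This is only an illustration; the variable names are hypothetical, and it assumes no C<use locale> and no C<use v5.12> (or later) line is in effect.

```perl
use strict;
use warnings;

my $e_acute = "\xE9";    # LATIN SMALL LETTER E WITH ACUTE as a native 8-bit string

# Outside the feature's scope this string gets ASCII-only rules.
my $word_without = $e_acute =~ /\w/ ? 1 : 0;
my $uc_without   = uc $e_acute;

my ($word_with, $uc_with);
{
    use feature 'unicode_strings';

    # Inside the scope the same string is treated as the character U+00E9.
    $word_with = $e_acute =~ /\w/ ? 1 : 0;
    $uc_with   = uc $e_acute;
}

printf "without: \\w=%d uc=%02X\n", $word_without, ord $uc_without;
printf "with:    \\w=%d uc=%02X\n", $word_with,    ord $uc_with;
```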
=head3 C<\N{I<NAME>}> and C<charnames> enhancements
=over
=item *
C<\N{I<NAME>}> and C<charnames::vianame> now know about the abbreviated
character names listed by Unicode, such as NBSP, SHY, LRO, ZWJ, etc.; all
customary abbreviations for the C0 and C1 control characters (such as
ACK, BEL, CAN, etc.); and a few new variants of some C1 full names that
are in common usage.
=item *
Unicode has several I<named character sequences>, in which particular sequences
of code points are given names. C<\N{I<NAME>}> now recognizes these.
=item *
C<\N{I<NAME>}>, C<charnames::vianame>, and C<charnames::viacode>
now know about every character in Unicode. In earlier releases of
Perl, they didn't know about the Hangul syllables or several
CJK (Chinese/Japanese/Korean) characters.
=item *
It is now possible to override Perl's abbreviations with your own custom aliases.
=item *
You can now create a custom alias of the ordinal of a
character, known by C<\N{I<NAME>}>, C<charnames::vianame()>, and
C<charnames::viacode()>. Previously, aliases had to be to official
Unicode character names. This made it impossible to create an alias for
unnamed code points, such as those reserved for private
use.
=item *
The new function C<charnames::string_vianame()> is a run-time version
of C<\N{I<NAME>}>, returning the string of characters whose Unicode
name is its parameter. It can handle Unicode named character
sequences, whereas the pre-existing C<charnames::vianame()> cannot,
as the latter returns a single code point.
=back
See L<charnames> for details on all these changes.
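The difference between the run-time lookup functions listed above can be sketched as follows. The character chosen is arbitrary, and C<use charnames ':full'> is assumed as the way to load the module.

```perl
use strict;
use warnings;
use charnames ':full';

# vianame() returns a single code point (an integer).
my $code = charnames::vianame("LATIN SMALL LETTER A");

# string_vianame() returns the character(s) as a string.
my $char = charnames::string_vianame("LATIN SMALL LETTER A");

# viacode() maps a code point back to its official name.
my $name = charnames::viacode(0x61);

printf "%d %s %s\n", $code, $char, $name;
```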
=head3 New warnings categories for problematic (non-)Unicode code points
Three new warnings subcategories of "utf8" have been added. These
allow you to turn off some "utf8" warnings, while allowing
other warnings to remain on. The three categories are:
C<surrogate> when UTF-16 surrogates are encountered;
C<nonchar> when Unicode non-character code points are encountered;
and C<non_unicode> when code points above the legal Unicode
maximum of 0x10FFFF are encountered.
=head3 Any unsigned value can be encoded as a character
With this release, Perl is adopting a model that any unsigned value
can be treated as a code point and encoded internally (as utf8)
without warnings, not just the code points that are legal in Unicode.
However, unless utf8 or the corresponding sub-category (see previous
item) of lexical warnings have been explicitly turned off, outputting
or executing a Unicode-defined operation such as upper-casing
on such a code point generates a warning. Attempting to input these
using strict rules (such as with the C<:encoding(UTF-8)> layer)
will continue to fail. Prior to this release, handling was
inconsistent and in places, incorrect.
Unicode non-characters, some of which previously were erroneously
considered illegal in places by Perl, contrary to the Unicode Standard,
are now always legal internally. Inputting or outputting them
works the same as with the non-legal Unicode code points, because the Unicode
Standard says they are (only) illegal for "open interchange".
=head3 Unicode database files not installed
The Unicode database files are no longer installed with Perl. This
doesn't affect any functionality in Perl and saves significant disk
space. If you need these files, you can download them from
L<http://www.unicode.org/Public/zipped/6.0.0/>.
=head2 Regular Expressions
=head3 C<(?^...)> construct signifies default modifiers
An ASCII caret C<"^"> immediately following a C<"(?"> in a regular
expression now means that the subexpression does not inherit surrounding
modifiers such as C</i>, but reverts to the Perl defaults. Any modifiers
following the caret override the defaults.
Stringification of regular expressions now uses this notation.
For example, C<qr/hlagh/i> would previously be stringified as
C<(?i-xsm:hlagh)>, but now it's stringified as C<(?^i:hlagh)>.
The main purpose of this change is to allow tests that rely on the
stringification I<not> to have to change whenever new modifiers are added.
See L<perlre/Extended Patterns>.
This change is likely to break code that compares stringified regular
expressions with fixed strings containing C<?-xism>.
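A short sketch of both effects: the new stringification, and the fact that an interpolated C<(?^:...)> group reverts to the defaults instead of inheriting C</i>. The pattern and variable names are illustrative only.

```perl
use strict;
use warnings;

my $inner = qr/b/;              # stringifies as (?^:b)
my $outer = qr/a${inner}c/i;    # the inner group does not inherit /i

my $inner_string = "$inner";
my $outer_string = "$outer";

# "A" and "c" are matched case-insensitively; "b" must be lower case.
my $matches_Abc = "Abc" =~ $outer ? 1 : 0;
my $matches_aBc = "aBc" =~ $outer ? 1 : 0;

print "$inner_string $outer_string $matches_Abc $matches_aBc\n";
```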
=head3 C</d>, C</l>, C</u>, and C</a> modifiers
Four new regular expression modifiers have been added. These are mutually
exclusive: only one can be turned on at a time.
=over
=item *
The C</l> modifier says to compile the regular expression as if it were
in the scope of C<use locale>, even if it is not.
=item *
The C</u> modifier says to compile the regular expression as if it were
in the scope of a C<use feature 'unicode_strings'> pragma.
=item *
The C</d> (default) modifier is used to override any C<use locale> and
C<use feature 'unicode_strings'> pragmas in effect at the time
of compiling the regular expression.
=item *
The C</a> regular expression modifier restricts C<\s>, C<\d> and C<\w> and
the POSIX (C<[[:posix:]]>) character classes to the ASCII range. Their
complements and C<\b> and C<\B> are correspondingly
affected. Otherwise, C</a> behaves like the C</u> modifier, in that
case-insensitive matching uses Unicode semantics.
If the C</a> modifier is repeated, then additionally in case-insensitive
matching, no ASCII character can match a non-ASCII character.
For example,
"k" =~ /\N{KELVIN SIGN}/ai
"\xDF" =~ /ss/ai
match but
"k" =~ /\N{KELVIN SIGN}/aai
"\xDF" =~ /ss/aai
do not match.
=back
See L<perlre/Modifiers> for more detail.
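The following sketch contrasts the modifiers on two sample characters. It assumes no C<use locale> or C<unicode_strings> pragma is in effect; the hash keys are just labels for the printout.

```perl
use strict;
use warnings;

my $arabic_one = "\x{0661}";    # ARABIC-INDIC DIGIT ONE
my $e_acute    = "\xE9";        # e with acute, as a native 8-bit string

my %result = (
    'digit, no modifier' => ($arabic_one =~ /\A\d\z/  ? 1 : 0),  # Unicode digit matches
    'digit, /a'          => ($arabic_one =~ /\A\d\z/a ? 1 : 0),  # \d limited to 0-9
    'word, /d'           => ($e_acute    =~ /\A\w\z/d ? 1 : 0),  # traditional rules
    'word, /u'           => ($e_acute    =~ /\A\w\z/u ? 1 : 0),  # Unicode rules
);

print "$_: $result{$_}\n" for sort keys %result;
```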
=head3 Non-destructive substitution
The substitution (C<s///>) and transliteration
(C<y///>) operators now support an C</r> option that
copies the input variable, carries out the substitution on
the copy, and returns the result. The original remains unmodified.
my $old = "cat";
my $new = $old =~ s/cat/dog/r;
# $old is "cat" and $new is "dog"
This is particularly useful with C<map>. See L<perlop> for more examples.
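A minimal sketch of the C<map> use case, with made-up input data:

```perl
use strict;
use warnings;

my @raw = ("  alpha ", "beta  ", " gamma");

# s///r returns the modified copy, so @raw is left untouched.
my @trimmed = map { s/^\s+|\s+$//gr } @raw;

# tr///r (also spelled y///r) behaves the same way.
my $shout = "quiet" =~ tr/a-z/A-Z/r;

print join(",", @trimmed), " $shout\n";
```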
=head3 Re-entrant regular expression engine
It is now safe to use regular expressions within C<(?{...})> and
C<(??{...})> code blocks inside regular expressions.
These blocks are still experimental, however, and still have problems with
lexical (C<my>) variables and abnormal exiting.
=head3 C<use re '/flags'>
The C<re> pragma now has the ability to turn on regular expression flags
until the end of the lexical scope:
use re "/x";
"foo" =~ / (.+) /; # /x implied
See L<re/"'/flags' mode"> for details.
=head3 \o{...} for octals
There is a new octal escape sequence, C<"\o">, in doublequote-like
contexts. This construct allows large octal ordinals beyond the
current max of 0777 to be represented. It also allows you to specify a
character in octal which can safely be concatenated with other regex
snippets and which won't be confused with being a backreference to
a regex capture group. See L<perlre/Capture groups>.
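As a brief illustration (the values are arbitrary), C<\o{...}> can express ordinals above 0777 and is never read as a backreference:

```perl
use strict;
use warnings;

my $letter = "\o{101}";     # octal 101 is decimal 65, the letter "A"
my $wide   = "\o{1000}";    # octal 1000 is decimal 512, beyond \777

# In a pattern, \o{1} is the character U+0001, not a reference to group 1.
my $matched = "a\x{1}" =~ /\A(a)\o{1}\z/ ? 1 : 0;

printf "%s %d %d\n", $letter, ord $wide, $matched;
```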
=head3 Add C<\p{Titlecase}> as a synonym for C<\p{Title}>
This synonym is added for symmetry with the Unicode property names
C<\p{Uppercase}> and C<\p{Lowercase}>.
=head3 Regular expression debugging output improvement
Regular expression debugging output (turned on by C<use re 'debug'>) now
uses hexadecimal when escaping non-ASCII characters, instead of octal.
=head3 Return value of C<delete $+{...}>
Custom regular expression engines can now determine the return value of
C<delete> on an entry of C<%+> or C<%->.
=head2 Syntactical Enhancements
=head3 Array and hash container functions accept references
B<Warning:> This feature is considered experimental, as the exact behaviour
may change in a future version of Perl.
All builtin functions that operate directly on array or hash
containers now also accept unblessed hard references to arrays
or hashes:
|----------------------------+---------------------------|
| Traditional syntax         | Terse syntax              |
|----------------------------+---------------------------|
| push @$arrayref, @stuff    | push $arrayref, @stuff    |
| unshift @$arrayref, @stuff | unshift $arrayref, @stuff |
| pop @$arrayref             | pop $arrayref             |
| shift @$arrayref           | shift $arrayref           |
| splice @$arrayref, 0, 2    | splice $arrayref, 0, 2    |
| keys %$hashref             | keys $hashref             |
| keys @$arrayref            | keys $arrayref            |
| values %$hashref           | values $hashref           |
| values @$arrayref          | values $arrayref          |
| ($k,$v) = each %$hashref   | ($k,$v) = each $hashref   |
| ($k,$v) = each @$arrayref  | ($k,$v) = each $arrayref  |
|----------------------------+---------------------------|
This allows these builtin functions to act on long dereferencing chains
or on the return value of subroutines without needing to wrap them in
C<@{}> or C<%{}>:
push @{$obj->tags}, $new_tag; # old way
push $obj->tags, $new_tag; # new way
for ( keys %{$hoh->{genres}{artists}} ) {...} # old way
for ( keys $hoh->{genres}{artists} ) {...} # new way
=head3 Single term prototype
The C<+> prototype is a special alternative to C<$> that acts like
C<\[@%]> when given a literal array or hash variable, but will otherwise
force scalar context on the argument. See L<perlsub/Prototypes>.
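A sketch of how a function might use the C<+> prototype. The function C<item_count> is hypothetical and exists only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: literal arrays and hashes arrive as references,
# anything else arrives as a scalar.
sub item_count(+) {
    my ($arg) = @_;
    return scalar @$arg      if ref $arg eq 'ARRAY';
    return scalar keys %$arg if ref $arg eq 'HASH';
    return 1;                # a plain scalar counts as one item
}

my @list = (10, 20, 30);
my %map  = (a => 1, b => 2);

my @counts = (
    item_count(@list),     # passed as \@list
    item_count(%map),      # passed as \%map
    item_count([1, 2]),    # already a reference, passed as is
    item_count("x"),       # plain scalar
);

print "@counts\n";
```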
=head3 C<package> block syntax
A package declaration can now contain a code block, in which case the
declaration is in scope inside that block only. So C<package Foo { ... }>
is precisely equivalent to C<{ package Foo; ... }>. It also works with
a version number in the declaration, as in C<package Foo 1.2 { ... }>,
which is its most attractive feature. See L<perlfunc>.
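A small sketch of the block form with a version number. The package name C<Counter> and its contents are invented for the example.

```perl
use strict;
use warnings;

package Counter 1.2 {
    my $count = 0;                     # visible only inside the block
    sub next_id { return ++$count }
}

# The package declaration ended with the block, so this is main again.
my $current_package = __PACKAGE__;
my @ids             = (Counter::next_id(), Counter::next_id());
my $version         = Counter->VERSION;

print "$current_package @ids $version\n";
```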
=head3 Statement labels can appear in more places
Statement labels can now occur before any type of statement or declaration,
such as C<package>.
=head3 Stacked labels
Multiple statement labels can now appear before a single statement.
=head3 Uppercase X/B allowed in hexadecimal/binary literals
Literals may now use either upper case C<0X...> or C<0B...> prefixes,
in addition to the already supported C<0x...> and C<0b...>
syntax [perl #76296].
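For illustration, both prefixes in a literal, plus a round-trip through C<sprintf> with the C<%#X> format (which emits an upper-case C<0X> prefix):

```perl
use strict;
use warnings;

my $hex = 0X1F;     # same value as 0x1F
my $bin = 0B101;    # same value as 0b101

# sprintf "%#X" produces "0X10", which eval can now parse as a literal.
my $text       = sprintf "%#X", 0x10;
my $round_trip = eval $text;

print "$hex $bin $text $round_trip\n";
```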
C, Ruby, Python, and PHP already support this syntax, and it makes
Perl more internally consistent: a round-trip with C<eval sprintf
"%#X", 0x10> now returns C<16>, just like C<eval sprintf "%#x", 0x10>.
=head3 Overridable tie functions
C<tie>, C<tied> and C<untie> can now be overridden [perl #75902].
=head2 Exception Handling
To make them more reliable and consistent, several changes have been made
to how C<die>, C<warn>, and C<$@> behave.
=over
=item *
When an exception is thrown inside an C<eval>, the exception is no
longer at risk of being clobbered by destructor code running during unwinding.
Previously, the exception was written into C<$@>
early in the throwing process, and would be overwritten if C<eval> was
used internally in the destructor for an object that had to be freed
while exiting from the outer C<eval>. Now the exception is written
into C<$@> last thing before exiting the outer C<eval>, so the code
running immediately thereafter can rely on the value in C<$@> correctly
corresponding to that C<eval>. (C<$@> is still also set before exiting the
C<eval>, for the sake of destructors that rely on this.)
Likewise, a C<local $@> inside an C<eval> no longer clobbers any
exception thrown in its scope. Previously, the restoration of C<$@> upon
unwinding would overwrite any exception being thrown. Now the exception
gets to the C<eval> anyway. So C<local $@> is safe before a C<die>.
Exceptions thrown from object destructors no longer modify the C<$@>
of the surrounding context. (If the surrounding context was exception
unwinding, this used to be another way to clobber the exception being
thrown.) Previously such an exception was
sometimes emitted as a warning, and then either was
string-appended to the surrounding C<$@> or completely replaced the
surrounding C<$@>, depending on whether that exception and the surrounding
C<$@> were strings or objects. Now, an exception in this situation is
always emitted as a warning, leaving the surrounding C<$@> untouched.
In addition to object destructors, this also affects any function call
run by XS code using the C<G_KEEPERR> flag.
=item *
Warnings for C<warn> can now be objects in the same way as exceptions
for C<die>. If an object-based warning gets the default handling
of writing to standard error, it is stringified as before with the
filename and line number appended. But a C<$SIG{__WARN__}> handler now
receives an object-based warning as an object, where previously it
was passed the result of stringifying the object.
=back
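The three behaviours described above can be sketched as follows. The classes C<Guard> and C<My::Warning> are hypothetical and serve only as fixtures for the demonstration.

```perl
use strict;
use warnings;

{
    package Guard;    # hypothetical class whose destructor uses eval
    sub new     { return bless {}, shift }
    sub DESTROY { eval { 1 } }    # a successful eval resets $@ internally
}

# 1. A destructor run during unwinding no longer clobbers the exception.
eval { my $guard = Guard->new; die "outer failure\n" };
my $after_destructor = $@;

# 2. local $@ inside the eval no longer hides the exception.
eval { local $@; die "inner failure\n" };
my $after_local = $@;

{
    package My::Warning;    # hypothetical warning class
    sub new { my ($class, %args) = @_; return bless {%args}, $class }
}

# 3. A __WARN__ handler receives an object-based warning as an object.
my $received;
{
    local $SIG{__WARN__} = sub { $received = shift };
    warn My::Warning->new(code => 42);
}

print $after_destructor, $after_local, ref($received), "\n";
```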
=head2 Other Enhancements
=head3 Assignment to C<$0> sets the legacy process name with prctl() on Linux
On Linux the legacy process name is now set with L<prctl(2)>, in
addition to altering the POSIX name via C<argv[0]>, as Perl has done
since version 4.000. Now system utilities that read the legacy process
name such as I<ps>, I<top>, and I<killall> recognize the name you set when
assigning to C<$0>. The string you supply is truncated at 16 bytes;
this limitation is imposed by Linux.
=head3 srand() now returns the seed
This allows programs that need to have repeatable results not to have to come
up with their own seed-generating mechanism. Instead, they can use srand()
and stash the return value for future use. One example is a test program with
too many combinations to test comprehensively in the time available for
each run. It can test a random subset each time and, should there be a failure,
log the seed used for that run so this can later be used to produce the same results.
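A minimal sketch of that workflow: record the seed, then replay the same pseudo-random sequence later. The sample size and range are arbitrary.

```perl
use strict;
use warnings;

my $seed = srand();    # seeds the generator and returns the seed used

my @first_run = map { int(rand(1000)) } 1 .. 5;

# Later (for example after a failure), replay with the logged seed.
srand($seed);
my @second_run = map { int(rand(1000)) } 1 .. 5;

print "seed $seed reproduces the run\n" if "@first_run" eq "@second_run";
```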
=head3 printf-like functions understand post-1980 size modifiers
Perl's printf and sprintf operators, and Perl's internal printf replacement
function, now understand the C90 size modifiers "hh" (C<char>), "z"
(C<size_t>), and "t" (C<ptrdiff_t>). Also, when compiled with a C99
compiler, Perl now understands the size modifier "j" (C<intmax_t>)
(but this is not portable).
So, for example, on any modern machine, C<sprintf("%hhd", 257)> returns "1".
=head3 New global variable C<${^GLOBAL_PHASE}>
A new global variable, C<${^GLOBAL_PHASE}>, has been added to allow
introspection of the current phase of the Perl interpreter. It's explained in
detail in L<perlvar/"${^GLOBAL_PHASE}"> and in
L<perlmod/"BEGIN, UNITCHECK, CHECK, INIT and END">.
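As a rough illustration, the variable can be sampled at compile time and at run time. This sketch assumes the code is run as the main program file; inside a string C<eval> executed at run time, the C<BEGIN> block would report a different phase.

```perl
use strict;
use warnings;

our @seen;

BEGIN { push @seen, ${^GLOBAL_PHASE} }    # while the main program compiles
push @seen, ${^GLOBAL_PHASE};             # while the main program runs

print "@seen\n";
```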
=head3 C<-d:-foo> calls C<Devel::foo::unimport>
The syntax B<-d:foo> was extended in 5.6.1 to make B<-d:foo=bar>
equivalent to B<-MDevel::foo=bar>, which expands
internally to C<use Devel::foo 'bar'>.
Perl now allows prefixing the module name with B<->, with the same
semantics as B<-M>; that is:
=over 4
=item C<-d:-foo>
Equivalent to B<-M-Devel::foo>: expands to
C<no Devel::foo> and calls C<< Devel::foo->unimport() >>
if that method exists.
=item C<-d:-foo=bar>
Equivalent to B<-M-Devel::foo=bar>: expands to C<no Devel::foo 'bar'>,
and calls C<< Devel::foo->unimport("bar") >> if that method exists.
=back
This is particularly useful for suppressing the default actions of a
C<Devel::*> module's C<import> method whilst still loading it for debugging.
=head3 Filehandle method calls load L<IO::File> on demand
When a method call on a filehandle would die because the method cannot
be resolved and L<IO::File> has not been loaded, Perl now loads L<IO::File>
via C<require> and attempts method resolution again:
open my $fh, ">", $file;
$fh->binmode(":raw"); # loads IO::File and succeeds
This also works for globs like C<STDOUT>, C<STDERR>, and C<STDIN>:
STDOUT->autoflush(1);
Because this on-demand load happens only if method resolution fails, the
legacy approach of manually loading an L<IO::File> parent class for partial
method support still works as expected:
use IO::Handle;
open my $fh, ">", $file;
$fh->autoflush(1); # IO::File not loaded
=head3 Improved IPv6 support
The C<Socket> module provides new affordances for IPv6,
including implementations of the C<Socket::getaddrinfo()> and
C<Socket::getnameinfo()> functions, along with related constants and a
handful of new functions. See L<Socket>.
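A hedged sketch of the two functions on a numeric IPv6 address. The address, port, and hint values are chosen for the example, and C<AI_NUMERICHOST> is used so that no name lookup is attempted.

```perl
use strict;
use warnings;
use Socket qw(
    getaddrinfo getnameinfo
    AF_INET6 SOCK_STREAM
    AI_NUMERICHOST NI_NUMERICHOST NI_NUMERICSERV
);

# Parse a numeric IPv6 address and port into packed socket addresses.
my ($err, @results) = getaddrinfo(
    "::1", "80",
    { flags => AI_NUMERICHOST, socktype => SOCK_STREAM },
);
die "getaddrinfo failed: $err\n" if $err;

my $first = $results[0];

# Convert the packed address back into printable host and port strings.
my ($name_err, $host, $port) =
    getnameinfo($first->{addr}, NI_NUMERICHOST | NI_NUMERICSERV);
die "getnameinfo failed: $name_err\n" if $name_err;

print "IPv6 address $host, port $port\n" if $first->{family} == AF_INET6;
```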
=head3 DTrace probes now include package name
The C<DTrace> probes now include an additional argument, C<arg3>, which contains
the package the subroutine being entered or left was compiled in.
For example, using the following DTrace script:
perl$target:::sub-entry
{
    printf("%s::%s\n", copyinstr(arg3), copyinstr(arg0));
}
and then running:
$ perl -e 'sub test { }; test'
C<DTrace> will print:
main::test
=head2 New C APIs
See L</Internal Changes>.
=head1 Security
=head2 User-defined regular expression properties
L<perlunicode/"User-Defined Character Properties"> documented that you can
create custom properties by defining subroutines whose names begin with
"In" or "Is". However, Perl did not actually enforce that naming
restriction, so C<\p{foo::bar}> could call foo::bar() if it existed. The documented
convention is now enforced.
Also, Perl no longer allows tainted regular expressions to invoke a
user-defined property. It simply dies instead [perl #82616].
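For reference, a property that follows the enforced naming convention might look like this. C<IsVowel> is a hypothetical property defined only for the sketch.

```perl
use strict;
use warnings;

# The name begins with "Is", as the convention requires.
sub IsVowel {
    # One hexadecimal code point per line.
    return join "", map { sprintf "%04X\n", ord($_) } qw(a e i o u);
}

my @vowels = grep { /\A\p{IsVowel}\z/ } qw(a b e x u);

print "@vowels\n";
```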
=head1 Incompatible Changes
Perl 5.14.0 is not binary-compatible with any previous stable release.
In addition to the sections that follow, see L</C API Changes>.
=head2 Regular Expressions and String Escapes
=head3 Inverted bracketed character classes and multi-character folds
Some characters match a sequence of two or three characters in C</i>
regular expression matching under Unicode rules. One example is
C<LATIN SMALL LETTER SHARP S> which matches the sequence C<ss>.
'ss' =~ /\A[\N{LATIN SMALL LETTER SHARP S}]\z/i # Matches
This, however, can lead to very counter-intuitive results, especially
when inverted. Because of this, Perl 5.14 does not use multi-character C</i>
matching in inverted character classes.
'ss' =~ /\A[^\N{LATIN SMALL LETTER SHARP S}]+\z/i # ???
This should match any sequence of characters that is neither C<SHARP S>
nor what C<SHARP S> matches under C</i>. C<"s"> isn't C<SHARP S>, but
Unicode says that C<"ss"> is what C<SHARP S> matches under C</i>. So
which one "wins"? Do you fail the match because the string has C<ss> or
accept it because it has an C<s> followed by another C<s>?
Earlier releases of Perl did allow this multi-character matching,
but due to bugs, it mostly did not work.
=head3 \400-\777
In certain circumstances, C<\400>-C<\777> in regexes have behaved
differently than they behave in all other doublequote-like contexts.
Since 5.10.1, Perl has issued a deprecation warning when this happens.
Now, these literals behave the same in all doublequote-like contexts,
namely to be equivalent to C<\x{100}>-C<\x{1FF}>, with no deprecation
warning.
When used with the command-line option B<-0>, C<\400>-C<\777> retain their
conventional meaning: they slurp whole input files. Previously, this
was documented only for B<-0777>.
Because of various ambiguities, you should use the new
C<\o{...}> construct to represent characters in octal instead.
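A quick check of the equivalence stated above, in a string and in a pattern, alongside the unambiguous C<\o{...}> spelling:

```perl
use strict;
use warnings;

my $from_octal  = "\400";       # now always the character U+0100
my $from_hex    = "\x{100}";
my $from_braced = "\o{400}";    # the recommended, unambiguous form

# The same escape inside a regex (no capture groups are present).
my $in_regex = $from_hex =~ /\A\400\z/ ? 1 : 0;

printf "%X %X %X %d\n",
    ord $from_octal, ord $from_hex, ord $from_braced, $in_regex;
```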
=head3 Most C<\p{}> properties are now immune to case-insensitive matching
For most Unicode properties, it doesn't make sense to have them match
differently under C</i> case-insensitive matching. Doing so can lead
to unexpected results and potential security holes. For example
m/\p{ASCII_Hex_Digit}+/i
could previously match non-ASCII characters because of the Unicode
matching rules (although there were several bugs with this). Now
matching under C</i> gives the same results as non-C</i> matching except
for those few properties where people have come to expect differences,
namely the ones where casing is an integral part of their meaning, such
as C<m/\p{Uppercase}/i> and C<m/\p{Lowercase}/i>, both of which match
the same code points as matched by C<m/\p{Cased}/i>.
Details are in L<perlrecharclass/Unicode Properties>.
User-defined property handlers that need to match differently under C</i>
must be changed to read the new boolean parameter passed to them, which
is non-zero if case-insensitive matching is in effect and 0 otherwise.
See L<perlunicode/User-Defined Character Properties>.
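The distinction can be sketched with one casing property and one ordinary property. C<\p{ASCII}> and the Kelvin sign are used here as an illustrative pair; they are not taken from the text above.

```perl
use strict;
use warnings;

my $upper_a = "A";

# Casing properties still respond to /i.
my $lower_plain = $upper_a =~ /\A\p{Lowercase}\z/  ? 1 : 0;
my $lower_fold  = $upper_a =~ /\A\p{Lowercase}\z/i ? 1 : 0;

# KELVIN SIGN case-folds to ASCII "k", but \p{ASCII} ignores /i.
my $kelvin       = "\N{U+212A}";
my $kelvin_ascii = $kelvin =~ /\A\p{ASCII}\z/i ? 1 : 0;

print "$lower_plain $lower_fold $kelvin_ascii\n";
```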
=head3 \p{} implies Unicode semantics
Specifying a Unicode property in the pattern indicates
that the pattern is meant for matching according to Unicode rules, the way
C<\N{I<NAME>}> does.
=head3 Regular expressions retain their localeness when interpolated
Regular expressions compiled under C<use locale> now retain this when
interpolated into a new regular expression compiled outside a
C<use locale>, and vice-versa.
Previously, one regular expression interpolated into another inherited
the localeness of the surrounding regex, losing whatever state it
originally had. This is considered a bug fix, but may trip up code that
has come to rely on the incorrect behaviour.
=head3 Stringification of regexes has changed
Default regular expression modifiers are now notated using
C<(?^...)>. Code relying on the old stringification will fail.
This is so that when new modifiers are added, such code won't
have to keep changing each time this happens, because the stringification
will automatically incorporate the new modifiers.
Code that needs to work properly with both old- and new-style regexes
can avoid the whole issue by using (for perls since 5.9.5; see L<re>):
use re qw(regexp_pattern);
my ($pat, $mods) = regexp_pattern($re_ref);
If the actual stringification is important or older Perls need to be
supported, you can use something like the following:
# Accept both old and new-style stringification
my $modifiers = (qr/foobar/ =~ /\Q(?^/) ? "^" : "-xism";
And then use C<$modifiers> instead of C<-xism>.
=head3 Run-time code blocks in regular expressions inherit pragmata
Code blocks in regular expressions (C<(?{...})> and C<(??{...})>) previously
did not inherit pragmata (strict, warnings, etc.) if the regular expression
was compiled at run time as happens in cases like these two:
use re "eval";
$foo =~ $bar; # when $bar contains (?{...})
$foo =~ /$bar(?{ $finished = 1 })/;
This bug has now been fixed, but code that relied on the buggy behaviour
may need to be fixed to account for the correct behaviour.
=head2 Stashes and Package Variables
=head3 Localised tied hashes and arrays are no longer tied
In the following:
tie @a, ...;
{
    local @a;
    # here, @a is now a new, untied array
}
# here, @a refers again to the old, tied array
Earlier versions of Perl incorrectly tied the new local array. This has
now been fixed. This fix could however potentially cause a change in
behaviour of some code.
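A runnable variant of the outline above, using C<Tie::StdArray> from the core C<Tie::Array> module as a stand-in tie class (an assumption made for the example):

```perl
use strict;
use warnings;
use Tie::Array;    # provides Tie::StdArray, a plain tied-array class

our @items;
tie @items, 'Tie::StdArray';
push @items, "kept";

my ($tied_inside, $size_inside);
{
    local @items;
    $tied_inside = tied(@items) ? 1 : 0;    # the local array is not tied
    $size_inside = scalar @items;           # and starts out empty
}

my $tied_after  = tied(@items) ? 1 : 0;     # the original tie is back
my $value_after = $items[0];

print "$tied_inside $size_inside $tied_after $value_after\n";
```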
=head3 Stashes are now always defined
C<defined %Foo::> now always returns true, even when no symbols have yet been
defined in that package.
This is a side-effect of removing a special-case kludge in the tokeniser,
added for 5.10.0, to hide side-effects of changes to the internal storage of
hashes. The fix drastically reduces hashes' memory overhead.
Calling defined on a stash has been deprecated since 5.6.0, warned on
lexicals since 5.6.0, and warned for stashes and other package
variables since 5.12.0. C<defined %hash> has always exposed an
implementation detail: emptying a hash by deleting all entries from it does
not make C<defined %hash> false. Hence C<defined %hash> is not valid code to
determine whether an arbitrary hash is empty. Instead, use the behaviour
of an empty C<%hash> always returning false in scalar context.
=head3 Clearing stashes
Stash list assignment C<%foo:: = ()> used to make the stash temporarily
anonymous while it was being emptied. Consequently, any of its
subroutines referenced elsewhere would become anonymous, showing up as
"(unknown)" in C<caller>. They now retain their package names such that
C<caller> returns the original sub name if there is still a reference
to its typeglob and "foo::__ANON__" otherwise [perl #79208].
=head3 Dereferencing typeglobs
If you assign a typeglob to a scalar variable:
$glob = *foo;
the glob that is copied to C<$glob> is marked with a special flag
indicating that the glob is just a copy. This allows subsequent
assignments to C<$glob> to overwrite the glob. The original glob,
however, is immutable.
Some Perl operators did not distinguish between these two types of globs.
This would result in strange behaviour in edge cases: C<untie $scalar>
would not untie the scalar if the last thing assigned to it was a glob
(because it treated it as C<untie *$scalar>, which unties a handle).
Assignment to a glob slot (such as C<*$glob = \@some_array>) would simply
assign C<\@some_array> to C<$glob>.
To fix this, the C<*{}> operator (including its C<*foo> and C<*$foo> forms)
has been modified to make a new immutable glob if its operand is a glob
copy. This allows operators that make a distinction between globs and
scalars to be modified to treat only immutable globs as globs. (C<tie>,
C<tied> and C<untie> have been left as they are for compatibility's sake,
but will warn. See L</Deprecations>.)
This causes an incompatible change in code that assigns a glob to the
return value of C<*{}> when that operator was passed a glob copy. Take the
following code, for instance:
$glob = *foo;
*$glob = *bar;
The C<*$glob> on the second line returns a new immutable glob. That new
glob is made an alias to C<*bar>. Then it is discarded. So the second
assignment has no effect.
See L<http://rt.perl.org/rt3/Public/Bug/Display.html?id=77810> for
more detail.
=head3 Magic variables outside the main package
In previous versions of Perl, magic variables like C<$!>, C<%SIG>, etc. would
"leak" into other packages. So C<%foo::SIG> could be used to access signals,
C<${"foo::!"}> (with strict mode off) to access C's C<errno>, etc.
This was a bug, or an "unintentional" feature, which caused various ill effects,
such as signal handlers being wiped when modules were loaded, etc.
This has been fixed (or the feature has been removed, depending on how you see
it).
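As a rough illustration, assuming Perl 5.14 or later, a same-named hash in another package is now an ordinary, unrelated hash. The package name C<foo> and the variable C<$leaked> are hypothetical:

```perl
use strict;
use warnings;
no warnings 'once';    # %foo::SIG is mentioned only once on purpose

# Install a handler through the real, magical %SIG in package main.
local $SIG{INT} = 'IGNORE';

# %foo::SIG is a plain hash, so it does not see the handler.
my $leaked = exists $foo::SIG{INT} ? 1 : 0;

print "main handler: $SIG{INT}\n";
print "visible through %foo::SIG: ", ( $leaked ? "yes" : "no" ), "\n";
```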
=head3 local($_) strips all magic from $_
C<local()> on scalar variables gives them a new value but keeps all
their magic intact. This has proven problematic for the default
scalar variable C<$_>, where L<perlsub> recommends that any subroutine
that assigns to C<$_> should first localize it. Localizing would throw an
exception if C<$_> was aliased to a read-only variable, and could in general have
various unintentional side-effects.
Therefore, as an exception to the general rule, C<local($_)> not
only assigns a new value to C<$_>, but also removes all existing magic from
it.
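A short sketch of the pattern this change is meant to make safe, assuming Perl 5.14 or later. The helper name C<normalize> and the sample strings are hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper that follows the perlsub advice:
# localize $_ before assigning to it.
sub normalize {
    local $_ = shift;    # fresh value; existing magic on $_ is not inherited
    s/^\s+//;
    s/\s+$//;
    return lc;
}

my @out;

# Inside this loop $_ is aliased to read-only string constants.
for ( "  Foo ", "BAR" ) {
    push @out, normalize($_);
}

print join( ",", @out ), "\n";
```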
=head3 Parsing of package and variable names
Parsing the names of packages and package variables has changed:
multiple adjacent pairs of colons, as in C<foo::::bar>, are now all
treated as package separators.
Regardless of this change, the exact parsing of package separators has
never been guaranteed and is subject to change in future Perl versions.
=head2 Changes to Syntax or to Perl Operators
=head3 C<given> return values
C<given> blocks now return the last evaluated
expression, or an empty list if the block was exited by C<break>. Thus you
can now write:
    my $type = do {
      given ($num) {
        break     when undef;
        "integer" when /^[+-]?[0-9]+$/;
        "float"   when /^[+-]?[0-9]+(?:\.[0-9]+)?$/;
        "unknown";
      }
    };
See L<perlsyn/Return value> for details.
=head3 Change in parsing of certain prototypes
Functions declared with the following prototypes now behave correctly as unary
functions:
    *
    \$ \% \@ \* \&
    \[...]
    ;$ ;*
    ;\$ ;\% etc.
    ;\[...]
Due to this bug fix [perl #75904], functions
using the C<(*)>, C<(;$)> and C<(;*)> prototypes
are parsed with higher precedence than before. So
in the following example:
    sub foo(;$);
    foo $a < $b;
the second line is now parsed correctly as C<< foo($a) < $b >>, rather than
C<< foo($a < $b) >>. This happens when one of these operators is used in
an unparenthesised argument:
    < > <= >= lt gt le ge
    == != <=> eq ne cmp ~~
    &
    | ^
    &&
    || //
    .. ...
    ?:
    = += -= *= etc.
    , =>
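The precedence change can be observed with a small sketch, assuming Perl 5.14 or later. The function C<twice> and the variable names are hypothetical:

```perl
use strict;
use warnings;

# Hypothetical function with an optional-scalar prototype.
sub twice(;$) { return 2 * ( $_[0] // 0 ) }

# Parsed as (twice 5) < 8, that is 10 < 8, which is false.
my $unparenthesised = twice 5 < 8;

# Explicit parentheses give the old grouping: twice(1), that is 2.
my $old_grouping = twice( 5 < 8 );

print "unparenthesised: ", ( $unparenthesised ? "true" : "false" ), "\n";
print "old grouping:    $old_grouping\n";
```

Code that relied on the old grouping can keep its behaviour by adding the parentheses explicitly.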
=head3 Smart-matching against array slices
Previously, the following code resulted in a successful match:
    my @a = qw(a y0 z);
    my @b = qw(a x0 z);
    @a[0 .. $#b] ~~ @b;
This odd behaviour has now been fixed [perl #77468].
=head3 Negation treats strings differently from before
The unary negation operator, C<->, now treats strings that look like numbers
as numbers [perl #57706].
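A minimal sketch of the difference, assuming Perl 5.14 or later; the variable names are hypothetical. In earlier releases, negating a string with a leading minus sign only swapped the sign character, so C<"-10"> is understood to have become the string C<"+10">:

```perl
use strict;
use warnings;

my $numeric_string = "-10";
my $negated        = -$numeric_string;    # numeric negation of -10

my $word         = "foo";
my $negated_word = -$word;                # not a number: still string negation

print "$negated\n";
print "$negated_word\n";
```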
=head3 Negative zero
Negative zero (-0.0), when converted to a string, now becomes "0" on all
platforms. It used to become "-0" on some, but "0" on others.
If you still need to determine whether a zero is negative, use
C<sprintf("%g", $zero) =~ /^-/> or the L<Data::Float> module on CPAN.
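The C<sprintf> test can be wrapped in a small helper. This is a sketch only: the name C<is_negative_zero> is hypothetical, and the way the negative zero is built assumes a platform with IEEE 754 doubles:

```perl
use strict;
use warnings;

# Hypothetical helper around the sprintf("%g") test suggested above.
sub is_negative_zero {
    my ($n) = @_;
    return ( $n == 0 && sprintf( "%g", $n ) =~ /^-/ ) ? 1 : 0;
}

# Build a negative zero from its big-endian IEEE 754 bit pattern
# (assumption: the platform's C double is an IEEE 754 binary64).
my $neg_zero = unpack "d>", pack "H*", "8000000000000000";
my $pos_zero = 0.0;

print "neg_zero: ", is_negative_zero($neg_zero), "\n";
print "pos_zero: ", is_negative_zero($pos_zero), "\n";
```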
=head3 C<:=> is now a syntax error
Previously C<my $pi := 4> was exactly equivalent to C<my $pi : = 4>,
with the C<:> being treated as the start of an attribute list, ending before
the C<=>. The use of C<:=> to mean C<: => was deprecated in 5.12.0, and is
now a syntax error. This allows future use of C<:=> as a new token.
Outside the core's tests for it, we find no Perl 5 code on CPAN
using this construction, so we believe that this change will have
little impact on real-world codebases.
If it is absolutely necessary to have empty attribute lists (for example,
because of a code generator), simply avoid the error by adding a space before
the C<=>.
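The two spellings can be compared with string C<eval>, which keeps the compile-time error from aborting the surrounding program. This is a sketch assuming Perl 5.14 or later; the variable names are hypothetical:

```perl
use strict;
use warnings;

# Empty attribute list followed by an assignment: still accepted.
my $spaced = eval q{ my $pi : = 4; $pi };

# The fused ":=" form fails to compile, so eval returns undef and sets $@.
my $fused = eval q{ my $pi := 4; $pi };
my $error = $@;

print "spaced form: ", ( defined $spaced ? $spaced : "undef" ), "\n";
print "fused form:  ", ( defined $fused  ? $fused  : "undef" ), "\n";
print "error: $error";
```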
=head3 Change in the parsing of identifiers
Characters outside the Unicode "XIDStart" set are no longer allowed at the
beginning of an identifier. This means that certain accents and marks
that normally follow an alphabetic character may no longer be the first
character of an identifier.
=head2 Threads and Processes
=head3 Directory handles not copied to threads
On systems other than Windows that do not have
a C<fchdir> function, newly-created threads no
longer inherit directory handles from their parent threads. Such programs
would usually have crashed anyway [perl #75154].
=head3 C<close> on shared pipes
To avoid deadlocks, the C<close> function no longer waits for the
child process to exit if the underlying file descriptor is still
in use by another thread. It returns true in such cases.
=head3 fork() emulation will not wait for signalled children
On Windows, parent processes would not terminate until all forked
children had terminated first. However, C<kill("KILL", ...)> is
inherently unstable on pseudo-processes, and C<kill("TERM", ...)>
might not get delivered if the child is blocked in a system call.
To avoid the deadlock and still provide a safe mechanism to terminate
the hosting process, Perl no longer waits for children that
have been sent a SIGTERM signal. It is up to the parent process to
call C<waitpid()> for these children if child-cleanup processing must be
allowed to finish. However, it is then also the responsibility of the
parent to avoid the deadlock by making sure the child process
can't be blocked on I/O.
See L<perlfork> for more information about the fork() emulation on
Windows.
=head2 Configuration
=head3 Naming fixes in Policy_sh.SH may invalidate Policy.sh
Several long-standing typos and naming confusions in F<Policy_sh.SH> have
been fixed, standardizing on the variable names used in F<config.sh>.
This will change the behaviour of F<Policy.sh> if you happen to have been
accidentally relying on its incorrect behaviour.
=head3 Perl source code is read in text mode on Windows
Perl scripts used to be read in binary mode on Windows for the benefit
of the L<ByteLoader> module (which is no longer part of core Perl). This
had the side-effect of breaking various operations on the C<DATA> filehandle,
including seek()/tell(), and even simply reading from C<DATA> after filehandles
have been flushed by a call to system(), backticks, fork() etc.
The default build options for Windows have been changed so that Perl source
code is now read in text mode. L<ByteLoader> will (hopefully) be updated on
CPAN to automatically handle this situation [perl #28106].
=head1 Deprecations
See also L</Deprecated C APIs>.
=head2 Omitting a space between a regular expression and subsequent word
Omitting the space between a regular expression operator or
its modifiers and the following word is deprecated. For
example, C<< m/foo/sand $bar >> is still parsed
as C<< m/foo/s and $bar >> for now, but issues a warning.
=head2 C<\cI<X>>
The backslash-c construct was designed as a way of specifying
non-printable characters, but there were no restrictions (on ASCII
platforms) on what the character following the C<c> could be. Now,
a deprecation warning is raised if that character isn't an ASCII character.
Also, a deprecation warning is raised for C<"\c{"> (which is the same
as simply saying C<";">).
=head2 C<"\b{"> and C<"\B{">
In regular expressions, a literal C<"{"> immediately following a C<"\b">
or a C<"\B"> (not in a bracketed character class) is now deprecated
to allow for its future use by Perl itself.
=head2 Perl 4-era .pl libraries
Perl bundles a handful of library files that predate Perl 5.
This bundling is now deprecated for most of these files, which are now
available from CPAN. The affected files now warn when run, if they were
installed as part of the core.
This is a mandatory warning, not obeying B<-X> or lexical warning bits.
The warning is modelled on that supplied by F<deprecate.pm> for
deprecated-in-core F<.pm> libraries. It points to the specific CPAN
distribution that contains the F<.pl> libraries. The CPAN versions, of
course, do not generate the warning.
=head2 List assignment to C<$[>
Assignment to C<$[> was deprecated and started to give warnings in
Perl version 5.12.0. This version of Perl (5.14) now also emits a warning
when assigning to C<$[> in list context. This fixes an oversight in 5.12.0.
=head2 Use of qw(...) as parentheses
Historically the parser fooled itself into thinking that C<qw(...)> literals
were always enclosed in parentheses, and as a result you could sometimes omit
parentheses around them:
    for $x qw(a b c) { ... }
The parser no longer lies to itself in this way. Wrap the list literal in
parentheses like this:
    for $x (qw(a b c)) { ... }
This is being deprecated because the parentheses in C<for $i (1,2,3) { ... }>
are not part of expression syntax. They are part of the statement
syntax, with the C<for> statement wanting literal parentheses.
The synthetic parentheses that a C<qw> expression acquired were only
intended to be treated as part of expression syntax.
Note that this does not change the behaviour of cases like:
    use POSIX qw(setlocale localeconv);
    our @EXPORT = qw(foo bar baz);
where parentheses were never required around the expression.
=head2 C<\N{BELL}>
Use of C<\N{BELL}> is deprecated, because Unicode is using that name for a
different character.
See L</Unicode Version 6.0 is now supported (mostly)> for more
explanation.
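Code that needs the U+0007 control character can name it without relying on C<BELL>. The sketch below assumes Perl 5.14 or later, where C<ALERT> is accepted as a name for U+0007; the variable names are hypothetical:

```perl
use strict;
use warnings;
use charnames ':full';

# Three spellings of the U+0007 control character that avoid \N{BELL}.
my $by_name   = "\N{ALERT}";
my $by_code   = "\N{U+0007}";
my $by_escape = "\a";

printf "%d %d %d\n", ord($by_name), ord($by_code), ord($by_escape);
```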
=head2 C<?PATTERN?>
C<?PATTERN?> (without the initial C<m>) has been deprecated and now produces
a warning. This is to allow future use of C<?> in new operators.
The match-once functionality is still available as C<m?PATTERN?>.
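A small sketch of the C<m?PATTERN?> form, which matches only once until C<reset()> is called. The sample data and variable names are hypothetical:

```perl
use strict;
use warnings;

my @lines = ( "alpha one", "beta", "alpha two" );
my @hits;

for my $line (@lines) {
    # m?...? succeeds at most once between calls to reset().
    push @hits, $line if $line =~ m?alpha?;
}

print "@hits\n";
```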
=head2 Tie functions on scalars holding typeglobs