.rn '' }`
''' $RCSfile: perl.man,v $$Revision: 4.0.1.6 $$Date: 92/06/08 15:07:29 $
'''
''' $Log: perl.man,v $
''' Revision 4.0.1.6 92/06/08 15:07:29 lwall
''' patch20: documented that numbers may contain underline
''' patch20: clarified that DATA may only be read from main script
''' patch20: relaxed requirement for semicolon at the end of a block
''' patch20: added ... as variant on ..
''' patch20: documented need for 1; at the end of a required file
''' patch20: extended bracket-style quotes to two-arg operators: s()() and tr()()
''' patch20: paragraph mode now skips extra newlines automatically
''' patch20: documented PERLLIB and PERLDB
''' patch20: documented limit on size of regexp
'''
''' Revision 4.0.1.5 91/11/11 16:42:00 lwall
''' patch19: added little-endian pack/unpack options
'''
''' Revision 4.0.1.4 91/11/05 18:11:05 lwall
''' patch11: added sort {} LIST
''' patch11: added eval {}
''' patch11: documented meaning of scalar(%foo)
''' patch11: sprintf() now supports any length of s field
'''
''' Revision 4.0.1.3 91/06/10 01:26:02 lwall
''' patch10: documented some newer features in addenda
'''
''' Revision 4.0.1.2 91/06/07 11:41:23 lwall
''' patch4: added global modifier for pattern matches
''' patch4: default top-of-form format is now FILEHANDLE_TOP
''' patch4: added $^P variable to control calling of perldb routines
''' patch4: added $^F variable to specify maximum system fd, default 2
''' patch4: changed old $^P to $^X
'''
''' Revision 4.0.1.1 91/04/11 17:50:44 lwall
''' patch1: fixed some typos
'''
''' Revision 4.0 91/03/20 01:38:08 lwall
''' 4.0 baseline.
'''
'''
.de Sh
.br
.ne 5
.PP
\fB\\$1\fR
.PP
..
.de Sp
.if t .sp .5v
.if n .sp
..
.de Ip
.br
.ie \\n(.$>=3 .ne \\$3
.el .ne 3
.IP "\\$1" \\$2
..
'''
''' Set up \*(-- to give an unbreakable dash;
''' string Tr holds user defined translation string.
''' Bell System Logo is used as a dummy character.
'''
.tr \(*W-|\(bv\*(Tr
.ie n \{\
.ds -- \(*W-
.if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
.if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
.ds L" ""
.ds R" ""
.ds L' '
.ds R' '
'br\}
.el\{\
.ds -- \(em\|
.tr \*(Tr
.ds L" ``
.ds R" ''
.ds L' `
.ds R' '
'br\}
.TH PERL 1 "\*(RP"
.UC
.SH NAME
perl \- Practical Extraction and Report Language
.SH SYNOPSIS
.B perl
[options] filename args
.SH DESCRIPTION
.I Perl
is an interpreted language optimized for scanning arbitrary text files,
extracting information from those text files, and printing reports based
on that information.
It's also a good language for many system management tasks.
The language is intended to be practical (easy to use, efficient, complete)
rather than beautiful (tiny, elegant, minimal).
It combines (in the author's opinion, anyway) some of the best features of C,
\fIsed\fR, \fIawk\fR, and \fIsh\fR,
so people familiar with those languages should have little difficulty with it.
(Language historians will also note some vestiges of \fIcsh\fR, Pascal, and
even BASIC-PLUS.)
Expression syntax corresponds quite closely to C expression syntax.
Unlike most Unix utilities,
.I perl
does not arbitrarily limit the size of your data\*(--if you've got
the memory,
.I perl
can slurp in your whole file as a single string.
Recursion is of unlimited depth.
And the hash tables used by associative arrays grow as necessary to prevent
degraded performance.
.I Perl
uses sophisticated pattern matching techniques to scan large amounts of
data very quickly.
Although optimized for scanning text,
.I perl
can also deal with binary data, and can make dbm files look like associative
arrays (where dbm is available).
Setuid
.I perl
scripts are safer than C programs
through a dataflow tracing mechanism which prevents many stupid security holes.
If you have a problem that would ordinarily use \fIsed\fR
or \fIawk\fR or \fIsh\fR, but it
exceeds their capabilities or must run a little faster,
and you don't want to write the silly thing in C, then
.I perl
may be for you.
There are also translators to turn your
.I sed
and
.I awk
scripts into
.I perl
scripts.
OK, enough hype.
.PP
Upon startup,
.I perl
looks for your script in one of the following places:
.Ip 1. 4 2
Specified line by line via
.B \-e
switches on the command line.
.Ip 2. 4 2
Contained in the file specified by the first filename on the command line.
(Note that systems supporting the #! notation invoke interpreters this way.)
.Ip 3. 4 2
Passed in implicitly via standard input.
This only works if there are no filename arguments\*(--to pass
arguments to a
.I stdin
script you must explicitly specify a \- for the script name.
.PP
After locating your script,
.I perl
compiles it to an internal form.
If the script is syntactically correct, it is executed.
.Sh "Options"
Note: on first reading this section may not make much sense to you. It's here
at the front for easy reference.
.PP
A single-character option may be combined with the following option, if any.
This is particularly useful when invoking a script using the #! construct which
only allows one argument. Example:
.nf
.ne 2
#!/usr/bin/perl \-spi.bak # same as \-s \-p \-i.bak
.\|.\|.
.fi
Options include:
.TP 5
.BI \-0 digits
specifies the record separator ($/) as an octal number.
If there are no digits, the null character is the separator.
Other switches may precede or follow the digits.
For example, if you have a version of
.I find
which can print filenames terminated by the null character, you can say this:
.nf
find . \-name '*.bak' \-print0 | perl \-n0e unlink
.fi
The special value 00 will cause Perl to slurp files in paragraph mode.
The value 0777 will cause Perl to slurp files whole since there is no
legal character with that value.
.TP 5
.B \-a
turns on autosplit mode when used with a
.B \-n
or
.BR \-p .
An implicit split command to the @F array
is done as the first thing inside the implicit while loop produced by
the
.B \-n
or
.BR \-p .
.nf
perl \-ane \'print pop(@F), "\en";\'
is equivalent to
while (<>) {
@F = split(\' \');
print pop(@F), "\en";
}
.fi
.TP 5
.B \-c
causes
.I perl
to check the syntax of the script and then exit without executing it.
.TP 5
.BI \-d
runs the script under the perl debugger.
See the section on Debugging.
.TP 5
.BI \-D number
sets debugging flags.
To watch how it executes your script, use
.BR \-D14 .
(This only works if debugging is compiled into your
.IR perl .)
Another nice value is \-D1024, which lists your compiled syntax tree.
And \-D512 displays compiled regular expressions.
.TP 5
.BI \-e " commandline"
may be used to enter one line of script.
Multiple
.B \-e
commands may be given to build up a multi-line script.
If
.B \-e
is given,
.I perl
will not look for a script filename in the argument list.
.TP 5
.BI \-i extension
specifies that files processed by the <> construct are to be edited
in-place.
It does this by renaming the input file, opening the output file by the
same name, and selecting that output file as the default for print statements.
The extension, if supplied, is added to the name of the
old file to make a backup copy.
If no extension is supplied, no backup is made.
Saying \*(L"perl \-p \-i.bak \-e "s/foo/bar/;" .\|.\|. \*(R" is the same as using
the script:
.nf
.ne 2
#!/usr/bin/perl \-pi.bak
s/foo/bar/;
which is equivalent to
.ne 14
#!/usr/bin/perl
while (<>) {
if ($ARGV ne $oldargv) {
rename($ARGV, $ARGV . \'.bak\');
open(ARGVOUT, ">$ARGV");
select(ARGVOUT);
$oldargv = $ARGV;
}
s/foo/bar/;
}
continue {
print; # this prints to original filename
}
select(STDOUT);
.fi
except that the
.B \-i
form doesn't need to compare $ARGV to $oldargv to know when
the filename has changed.
It does, however, use ARGVOUT for the selected filehandle.
Note that
.I STDOUT
is restored as the default output filehandle after the loop.
.Sp
You can use eof to locate the end of each input file, in case you want
to append to each file, or reset line numbering (see example under eof).
.TP 5
.BI \-I directory
may be used in conjunction with
.B \-P
to tell the C preprocessor where to look for include files.
By default /usr/include and /usr/lib/perl are searched.
.TP 5
.BI \-l octnum
enables automatic line-ending processing. It has two effects:
first, it automatically chops the line terminator when used with
.B \-n
or
.BR \-p ,
and second, it assigns $\e to have the value of
.I octnum
so that any print statements will have that line terminator added back on. If
.I octnum
is omitted, sets $\e to the current value of $/.
For instance, to trim lines to 80 columns:
.nf
perl -lpe \'substr($_, 80) = ""\'
.fi
Note that the assignment $\e = $/ is done when the switch is processed,
so the input record separator can be different than the output record
separator if the
.B \-l
switch is followed by a
.B \-0
switch:
.nf
gnufind / -print0 | perl -ln0e 'print "found $_" if -p'
.fi
This sets $\e to newline and then sets $/ to the null character.
.TP 5
.B \-n
causes
.I perl
to assume the following loop around your script, which makes it iterate
over filename arguments somewhat like \*(L"sed \-n\*(R" or \fIawk\fR:
.nf
.ne 3
while (<>) {
.\|.\|. # your script goes here
}
.fi
Note that the lines are not printed by default.
See
.B \-p
to have lines printed.
Here is an efficient way to delete all files older than a week:
.nf
find . \-mtime +7 \-print | perl \-nle \'unlink;\'
.fi
This is faster than using the \-exec switch of find because you don't have to
start a process on every filename found.
.TP 5
.B \-p
causes
.I perl
to assume the following loop around your script, which makes it iterate
over filename arguments somewhat like \fIsed\fR:
.nf
.ne 5
while (<>) {
.\|.\|. # your script goes here
} continue {
print;
}
.fi
Note that the lines are printed automatically.
To suppress printing use the
.B \-n
switch.
A
.B \-p
overrides a
.B \-n
switch.
.TP 5
.B \-P
causes your script to be run through the C preprocessor before
compilation by
.IR perl .
(Since both comments and cpp directives begin with the # character,
you should avoid starting comments with any words recognized
by the C preprocessor such as \*(L"if\*(R", \*(L"else\*(R" or \*(L"define\*(R".)
.TP 5
.B \-s
enables some rudimentary switch parsing for switches on the command line
after the script name but before any filename arguments (or before a \-\|\-).
Any switch found there is removed from @ARGV and sets the corresponding variable in the
.I perl
script.
The following script prints \*(L"true\*(R" if and only if the script is
invoked with a \-xyz switch.
.nf
.ne 2
#!/usr/bin/perl \-s
if ($xyz) { print "true\en"; }
.fi
.TP 5
.B \-S
makes
.I perl
use the PATH environment variable to search for the script
(unless the name of the script starts with a slash).
Typically this is used to emulate #! startup on machines that don't
support #!, in the following manner:
.nf
#!/usr/bin/perl
eval "exec /usr/bin/perl \-S $0 $*"
if $running_under_some_shell;
.fi
The system ignores the first line and feeds the script to /bin/sh,
which proceeds to try to execute the
.I perl
script as a shell script.
The shell executes the second line as a normal shell command, and thus
starts up the
.I perl
interpreter.
On some systems $0 doesn't always contain the full pathname,
so the
.B \-S
tells
.I perl
to search for the script if necessary.
After
.I perl
locates the script, it parses the lines and ignores them because
the variable $running_under_some_shell is never true.
A better construct than $* would be ${1+"$@"}, which handles embedded spaces
and such in the filenames, but doesn't work if the script is being interpreted
by csh.
In order to start up sh rather than csh, some systems may have to replace the
#! line with a line containing just
a colon, which will be politely ignored by perl.
Other systems can't control that, and need a totally devious construct that
will work under any of csh, sh or perl, such as the following:
.nf
.ne 3
eval '(exit $?0)' && eval 'exec /usr/bin/perl -S $0 ${1+"$@"}'
& eval 'exec /usr/bin/perl -S $0 $argv:q'
if 0;
.fi
.TP 5
.B \-u
causes
.I perl
to dump core after compiling your script.
You can then take this core dump and turn it into an executable file
by using the undump program (not supplied).
This speeds startup at the expense of some disk space (which you can
minimize by stripping the executable).
(Still, a "hello world" executable comes out to about 200K on my machine.)
If you are going to run your executable as a set-id program then you
should probably compile it using taintperl rather than normal perl.
If you want to execute a portion of your script before dumping, use the
dump operator instead.
Note: undump is platform specific and may not be available
for a particular port of perl.
.TP 5
.B \-U
allows
.I perl
to do unsafe operations.
Currently the only \*(L"unsafe\*(R" operations are the unlinking of directories while
running as superuser, and running setuid programs with fatal taint checks
turned into warnings.
.TP 5
.B \-v
prints the version and patchlevel of your
.I perl
executable.
.TP 5
.B \-w
prints warnings about identifiers that are mentioned only once, and scalar
variables that are used before being set.
Also warns about redefined subroutines, and references to undefined
filehandles or filehandles opened readonly that you are attempting to
write on.
Also warns you if you use == on values that don't look like numbers, and if
your subroutines recurse more than 100 deep.
.TP 5
.BI \-x directory
tells
.I perl
that the script is embedded in a message.
Leading garbage will be discarded until the first line that starts
with #! and contains the string "perl".
Any meaningful switches on that line will be applied (but only one
group of switches, as with normal #! processing).
If a directory name is specified, Perl will switch to that directory
before running the script.
The
.B \-x
switch only controls the disposal of leading garbage.
The script must be terminated with _\|_END_\|_ if there is trailing garbage
to be ignored (the script can process any or all of the trailing garbage
via the DATA filehandle if desired).
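.PP
As a rough sketch of how the switches above combine (the file name
access.log is hypothetical, and its lines are assumed to hold
whitespace-separated fields), the following should print the first
field of every line:
.nf
	perl \-lane \'print $F[0];\' access.log
.fi
Here \-n supplies the implicit loop, \-a splits each line into @F,
and \-l chops the line terminator on input and adds it back on output.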
.Sh "Data Types and Objects"
.PP
.I Perl
has three data types: scalars, arrays of scalars, and
associative arrays of scalars.
Normal arrays are indexed by number, and associative arrays by string.
.PP
The interpretation of operations and values in perl sometimes
depends on the requirements
of the context around the operation or value.
There are three major contexts: string, numeric and array.
Certain operations return array values
in contexts wanting an array, and scalar values otherwise.
(If this is true of an operation it will be mentioned in the documentation
for that operation.)
Operations which return scalars don't care whether the context is looking
for a string or a number, but
scalar variables and values are interpreted as strings or numbers
as appropriate to the context.
A scalar is interpreted as TRUE in the boolean sense if it is not the null
string or 0.
Booleans returned by operators are 1 for true and 0 or \'\' (the null
string) for false.
.PP
There are actually two varieties of null string: defined and undefined.
Undefined null strings are returned when there is no real value for something,
such as when there was an error, or at end of file, or when you refer
to an uninitialized variable or element of an array.
An undefined null string may become defined the first time you access it, but
prior to that you can use the defined() operator to determine whether the
value is defined or not.
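.PP
A small sketch of the distinction, using a hypothetical variable $name
that is assumed not to be assigned anywhere else in the script:
.nf
	print "no value yet\en" unless defined($name);
	$name = \'\';	# the null string, but now defined
	print "defined but null\en" if defined($name) && $name eq \'\';
.fi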
.PP
References to scalar variables always begin with \*(L'$\*(R', even when referring
to a scalar that is part of an array.
Thus:
.nf
.ne 3
$days \h'|2i'# a simple scalar variable
$days[28] \h'|2i'# 29th element of array @days
$days{\'Feb\'}\h'|2i'# one value from an associative array
$#days \h'|2i'# last index of array @days
but entire arrays or array slices are denoted by \*(L'@\*(R':
@days \h'|2i'# ($days[0], $days[1],\|.\|.\|. $days[n])
@days[3,4,5]\h'|2i'# same as @days[3.\|.5]
@days{'a','c'}\h'|2i'# same as ($days{'a'},$days{'c'})
and entire associative arrays are denoted by \*(L'%\*(R':
%days \h'|2i'# (key1, val1, key2, val2 .\|.\|.)
.fi
.PP
Any of these eight constructs may serve as an lvalue,
that is, may be assigned to.
(It also turns out that an assignment is itself an lvalue in
certain contexts\*(--see examples under s, tr and chop.)
Assignment to a scalar evaluates the righthand side in a scalar context,
while assignment to an array or array slice evaluates the righthand side
in an array context.
.PP
You may find the length of array @days by evaluating
\*(L"$#days\*(R", as in
.IR csh .
(Actually, it's not the length of the array, it's the subscript of the last element, since there is (ordinarily) a 0th element.)
Assigning to $#days changes the length of the array.
Shortening an array by this method does not actually destroy any values.
Lengthening an array that was previously shortened recovers the values that
were in those elements.
You can also gain some measure of efficiency by preextending an array that
is going to get big.
(You can also extend an array by assigning to an element that is off the
end of the array.
This differs from assigning to $#whatever in that intervening values
are set to null rather than recovered.)
You can truncate an array down to nothing by assigning the null list () to
it.
The following are exactly equivalent:
.nf
@whatever = ();
$#whatever = $[ \- 1;
.fi
.PP
If you evaluate an array in a scalar context, it returns the length of
the array.
The following is always true:
.nf
scalar(@whatever) == $#whatever \- $[ + 1;
.fi
If you evaluate an associative array in a scalar context, it returns
a value which is true if and only if the array contains any elements.
(If there are any elements, the value returned is a string consisting
of the number of used buckets and the number of allocated buckets, separated
by a slash.)
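.PP
For illustration (the names @queue and %seen are invented for this
sketch, and $[ is assumed to have its default value of 0):
.nf
	@queue = (\'a\', \'b\', \'c\');
	$count = @queue;	# scalar context: $count gets 3
	$last = $#queue;	# subscript of the last element: 2
	%seen = (\'a\', 1);
	print "seen something\en" if %seen;	# true, %seen has an element
.fi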
.PP
Multi-dimensional arrays are not directly supported, but see the discussion
of the $; variable later for a means of emulating multiple subscripts with
an associative array.
You could also write a subroutine to turn multiple subscripts into a single
subscript.
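.PP
Both approaches might look like the following sketch.
The names %cell, @grid, $ncols and idx are invented for the example,
and the subroutine assumes a fixed number of columns per row.
.nf
	$cell{2,3} = \'x\';	# the key is really join($;, 2, 3)
	$ncols = 10;	# assumed row width
	sub idx { local($row, $col) = @_; $row * $ncols + $col; }
	$grid[&idx(2, 3)] = \'x\';	# row 2, column 3 becomes subscript 23
.fi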
.PP
Every data type has its own namespace.
You can, without fear of conflict, use the same name for a scalar variable,
an array, an associative array, a filehandle, a subroutine name, and/or
a label.
Since variable and array references always start with \*(L'$\*(R', \*(L'@\*(R',
or \*(L'%\*(R', the \*(L"reserved\*(R" words aren't in fact reserved
with respect to variable names.
(They ARE reserved with respect to labels and filehandles, however, which
don't have an initial special character.
Hint: you could say open(LOG,\'logfile\') rather than open(log,\'logfile\').
Using uppercase filehandles also improves readability and protects you
from conflict with future reserved words.)
Case IS significant\*(--\*(L"FOO\*(R", \*(L"Foo\*(R" and \*(L"foo\*(R" are all
different names.
Names which start with a letter may also contain digits and underscores.
Names which do not start with a letter are limited to one character,
e.g. \*(L"$%\*(R" or \*(L"$$\*(R".
(Most of the one character names have a predefined significance to
.IR perl .
More later.)
.PP
Numeric literals are specified in any of the usual floating point or
integer formats:
.nf
.ne 6
12345
12345.67
.23E-10
0xffff # hex
0377 # octal
4_294_967_296
.fi
String literals are delimited by either single or double quotes.
They work much like shell quotes:
double-quoted string literals are subject to backslash and variable
substitution; single-quoted strings are not (except for \e\' and \e\e).
The usual backslash rules apply for making characters such as newline, tab,
etc., as well as some more exotic forms:
.nf
\et tab
\en newline
\er return
\ef form feed
\eb backspace
\ea alarm (bell)
\ee escape
\e033 octal char
\ex1b hex char
\ec[ control char
\el lowercase next char
\eu uppercase next char
\eL lowercase till \eE
\eU uppercase till \eE
\eE end case modification
.fi
You can also embed newlines directly in your strings, i.e. they can end on
a different line than they begin.
This is nice, but if you forget your trailing quote, the error will not be
reported until
.I perl
finds another line containing the quote character, which
may be much further on in the script.
Variable substitution inside strings is limited to scalar variables, normal
array values, and array slices.
(In other words, identifiers beginning with $ or @, followed by an optional
bracketed expression as a subscript.)
The following code segment prints out \*(L"The price is $100.\*(R"
.nf
.ne 2
$Price = \'$100\';\h'|3.5i'# not interpreted
print "The price is $Price.\e\|n";\h'|3.5i'# interpreted
.fi
Note that you can put curly brackets around the identifier to delimit it
from following alphanumerics.
Also note that a single quoted string must be separated from a preceding
word by a space, since single quote is a valid character in an identifier
(see Packages).
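.PP
For instance, in this sketch (the variable $fruit is made up) the
curly brackets keep perl from looking for a variable named $fruits:
.nf
	$fruit = \'apple\';
	print "Two ${fruit}s, please.\en";
.fi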
.PP
Two special literals are _\|_LINE_\|_ and _\|_FILE_\|_, which represent the current
line number and filename at that point in your program.
They may only be used as separate tokens; they will not be interpolated
into strings.
In addition, the token _\|_END_\|_ may be used to indicate the logical end of the
script before the actual end of file.
Any following text is ignored, but may be read via the DATA filehandle.
(The DATA filehandle may read data only from the main script, but not from
any required file or evaluated string.)
The two control characters ^D and ^Z are synonyms for _\|_END_\|_.
.PP
A word that doesn't have any other interpretation in the grammar will be
treated as if it had single quotes around it.
For this purpose, a word consists only of alphanumeric characters and underline,
and must start with an alphabetic character.
As with filehandles and labels, a bare word that consists entirely of
lowercase letters risks conflict with future reserved words, and if you
use the
.B \-w
switch, Perl will warn you about any such words.
.PP
Array values are interpolated into double-quoted strings by joining all the
elements of the array with the delimiter specified in the $" variable,
space by default.
(Since in versions of perl prior to 3.0 the @ character was not a metacharacter
in double-quoted strings, the interpolation of @array, $array[EXPR],
@array[LIST], $array{EXPR}, or @array{LIST} only happens if array is
referenced elsewhere in the program or is predefined.)
The following are equivalent:
.nf
.ne 4
$temp = join($",@ARGV);
system "echo $temp";
system "echo @ARGV";
.fi
Within search patterns (which also undergo double-quotish substitution)
there is a bad ambiguity: Is /$foo[bar]/ to be
interpreted as /${foo}[bar]/ (where [bar] is a character class for the
regular expression) or as /${foo[bar]}/ (where [bar] is the subscript to
array @foo)?
If @foo doesn't otherwise exist, then it's obviously a character class.
If @foo exists, perl takes a good guess about [bar], and is almost always right.
If it does guess wrong, or if you're just plain paranoid,
you can force the correct interpretation with curly brackets as above.
.PP
A line-oriented form of quoting is based on the shell here-is syntax.
Following a << you specify a string to terminate the quoted material, and all lines
following the current line down to the terminating string are the value
of the item.
The terminating string may be either an identifier (a word), or some
quoted text.
If quoted, the type of quotes you use determines the treatment of the text,
just as in regular quoting.
An unquoted identifier works like double quotes.
There must be no space between the << and the identifier.
(If you put a space it will be treated as a null identifier, which is
valid, and matches the first blank line\*(--see Merry Christmas example below.)
The terminating string must appear by itself (unquoted and with no surrounding
whitespace) on the terminating line.
.nf
print <<EOF; # same as above
The price is $Price.
EOF
print <<"EOF"; # same as above
The price is $Price.
EOF
print << x 10; # null identifier is delimiter
Merry Christmas!
print <<`EOC`; # execute commands
echo hi there
echo lo there
EOC
print <<foo, <<bar; # you can stack them
I said foo.
foo
I said bar.
bar
.fi
Array literals are denoted by separating individual values by commas, and
enclosing the list in parentheses:
.nf
(LIST)
.fi
In a context not requiring an array value, the value of the array literal
is the value of the final element, as in the C comma operator.
For example,
.nf
.ne 4
@foo = (\'cc\', \'\-E\', $bar);
assigns the entire array value to array foo, but
$foo = (\'cc\', \'\-E\', $bar);
.fi
assigns the value of variable bar to variable foo.
Note that the value of an actual array in a scalar context is the length
of the array; the following assigns to $foo the value 3:
.nf
.ne 2
@foo = (\'cc\', \'\-E\', $bar);
$foo = @foo; # $foo gets 3
.fi
You may have an optional comma before the closing parenthesis of an
array literal, so that you can say:
.nf
@foo = (
1,
2,
3,
);
.fi
When a LIST is evaluated, each element of the list is evaluated in
an array context, and the resulting array value is interpolated into LIST
just as if each individual element were a member of LIST. Thus arrays
lose their identity in a LIST\*(--the list
(@foo,@bar,&SomeSub)
contains all the elements of @foo followed by all the elements of @bar,
followed by all the elements returned by the subroutine named SomeSub.
.PP
A list value may also be subscripted like a normal array.
Examples:
.nf
$time = (stat($file))[8]; # stat returns array value
$digit = ('a','b','c','d','e','f')[$digit-10];
return (pop(@foo),pop(@foo))[0];
.fi
.PP
Array lists may be assigned to if and only if each element of the list
is an lvalue:
.nf
($a, $b, $c) = (1, 2, 3);
($map{\'red\'}, $map{\'blue\'}, $map{\'green\'}) = (0x00f, 0x0f0, 0xf00);
The final element may be an array or an associative array:
($a, $b, @rest) = split;
local($a, $b, %rest) = @_;
.fi
You can actually put an array anywhere in the list, but the first array
in the list will soak up all the values, and anything after it will get
a null value.
This may be useful in a local().
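.PP
A sketch of the soaking-up behavior, with made-up variable names:
.nf
	($first, @middle, $last) = (1, 2, 3, 4);
	# $first is 1, @middle is (2, 3, 4), $last is left with a null value
.fi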
.PP
An associative array literal contains pairs of values to be interpreted
as a key and a value:
.nf
.ne 2
# same as map assignment above
%map = ('red',0x00f,'blue',0x0f0,'green',0xf00);
.fi
Array assignment in a scalar context returns the number of elements
produced by the expression on the right side of the assignment:
.nf
$x = (($foo,$bar) = (3,2,1)); # set $x to 3, not 2
.fi
.PP
There are several other pseudo-literals that you should know about.
If a string is enclosed by backticks (grave accents), it first undergoes
variable substitution just like a double quoted string.
It is then interpreted as a command, and the output of that command
is the value of the pseudo-literal, as in a shell.
In a scalar context, a single string consisting of all the output is
returned.
In an array context, an array of values is returned, one for each line
of output.
(You can set $/ to use a different line terminator.)
The command is executed each time the pseudo-literal is evaluated.
The status value of the command is returned in $? (see Predefined Names
for the interpretation of $?).
Unlike in \f2csh\f1, no translation is done on the return
data\*(--newlines remain newlines.
Unlike in any of the shells, single quotes do not hide variable names
in the command from interpretation.
To pass a $ through to the shell you need to hide it with a backslash.
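.PP
The points above can be sketched as follows.
The commands date, ls and echo are assumed to exist on the system,
and the variable names are invented for the example.
.nf
	$now = `date`;	# scalar context: all output as one string
	@files = `ls`;	# array context: one element per line
	print "ls failed\en" if $?;	# status of the command just run
	$home = `echo \e$HOME`;	# backslash passes the $ on to the shell
.fi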
.PP
Evaluating a filehandle in angle brackets yields the next line
from that file (newline included, so it's never false until EOF, at
which time an undefined value is returned).
Ordinarily you must assign that value to a variable,
but there is one situation where an automatic assignment happens.
If (and only if) the input symbol is the only thing inside the conditional of a
.I while
loop, the value is
automatically assigned to the variable \*(L"$_\*(R".
(This may seem like an odd thing to you, but you'll use the construct
in almost every
.I perl
script you write.)
Anyway, the following lines are equivalent to each other:
.nf
.ne 5
while ($_ = <STDIN>) { print; }
while (<STDIN>) { print; }
for (\|;\|<STDIN>;\|) { print; }
print while $_ = <STDIN>;
print while <STDIN>;
.fi
The filehandles
.IR STDIN ,
.I STDOUT
and
.I STDERR
are predefined.
(The filehandles
.IR stdin ,
.I stdout
and
.I stderr
will also work except in packages, where they would be interpreted as
local identifiers rather than global.)
Additional filehandles may be created with the
.I open
function.
.PP
If a <FILEHANDLE> is used in a context that is looking for an array, an array
consisting of all the input lines is returned, one line per array element.
It's easy to make a LARGE data space this way, so use with care.
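.PP
A minimal sketch of both contexts, reading from standard input
(the variable names are arbitrary):
.nf
	$first = <STDIN>;	# scalar context: one line
	@rest = <STDIN>;	# array context: all remaining lines
	print "read ", scalar(@rest) + 1, " lines\en" if defined($first);
.fi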
.PP
The null filehandle <> is special and can be used to emulate the behavior of
\fIsed\fR and \fIawk\fR.
Input from <> comes either from standard input, or from each file listed on
the command line.
Here's how it works: the first time <> is evaluated, the ARGV array is checked,
and if it is null, $ARGV[0] is set to \'-\', which when opened gives you standard
input.
The ARGV array is then processed as a list of filenames.
The loop
.nf
.ne 3
while (<>) {
.\|.\|. # code for each line
}
.ne 10
is equivalent to the following Perl-like pseudo code:
unshift(@ARGV, \'\-\') \|if \|$#ARGV < $[;
while ($ARGV = shift) {
open(ARGV, $ARGV);
while (<ARGV>) {
.\|.\|. # code for each line
}
}
.fi
except that it isn't as cumbersome to say, and will actually work.
It really does shift array ARGV and put the current filename into
variable ARGV.
It also uses filehandle ARGV internally\*(--<> is just a synonym for
<ARGV>, which is magical.
(The pseudo code above doesn't work because it treats <ARGV> as non-magical.)
.PP
You can modify @ARGV before the first <> as long as the array ends up
containing the list of filenames you really want.
Line numbers ($.) continue as if the input was one big happy file.
(But see example under eof for how to reset line numbers on each file.)
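.PP
As a sketch, this one-line loop tags every input line with the
current filename and the running line number:
.nf
	while (<>) { print $ARGV, \':\', $., \':\', $_; }
.fi
When input comes from standard input, $ARGV holds \'\-\'.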
.PP
.ne 5
If you want to set @ARGV to your own list of files, go right ahead.
If you want to pass switches into your script, you can
put a loop on the front like this:
.nf
.ne 10
while ($_ = $ARGV[0], /\|^\-/\|) {
shift;
last if /\|^\-\|\-$\|/\|;
/\|^\-D\|(.*\|)/ \|&& \|($debug = $1);
/\|^\-v\|/ \|&& \|$verbose++;
.\|.\|. # other switches
}
while (<>) {
.\|.\|. # code for each line
}