100644 521 lines (366 sloc) 14.266 kB
35b909c @rurban add pddtypes.pod - executive summary for const and types
authored
=head1 SUMMARY

perl design draft - types and const

=head2 Perl already has a type system

E.g. C<my Dog $wuff;> stores the "Dog" stash name in C<comppad_names>.

Only for lexicals (which is good and strict-safe). Almost nobody uses it.
There are ~5 typed modules on CPAN, Net::DNS the most prominent.
5.10 broke all old type modules. Moose started afresh, but with a naive
and different approach.

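The typed lexical syntax can be tried on a stock perl. The following is only a
sketch: it uses the core C<fields> pragma as the consumer of the stored type,
and the class C<Dog>, its field C<name> and the misspelled key C<nmae> are
made up for illustration.

```perl
use strict;
use warnings;

package Dog {
    use fields qw(name);            # declares the allowed hash keys at compile time

    sub new {
        my Dog $self = shift;       # typed lexical: "Dog" is stored with the pad name
        $self = fields::new($self) unless ref $self;
        $self->{name} = shift;
        return $self;
    }
}

package main;

my Dog $wuff = Dog->new('Rex');
print $wuff->{name}, "\n";

# A misspelled field on a typed lexical is rejected while compiling,
# so the string eval fails before any of its code runs.
my $compiled = eval q{ my Dog $d = Dog->new('Rex'); my $n = $d->{nmae}; 1 };
print "typo accepted: ", ($compiled ? "yes" : "no"), "\n";
```

The check only fires because C<$d> is declared with a type; an untyped
C<my $d> would defer the error to run-time.
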
The biggest performance win would be the ability to mark packages and
their @ISA as B<readonly>, i.e. B<const>, or to type object instances,
so that method calls can be optimized at compile time. Further wins: hash
function optimizations (I<perfect> hashes), natively typed arrays and hashes,
more constant folding and maybe shared strings.

    const package MyBase 0.01 {
      our @ISA = ();
      sub new { my $class = shift; bless { @_ }, $class }
    }
    const package MyChild 0.01 {
      our const @ISA = ('MyBase');
    }

    my $obj = MyChild->new;
    => MyBase::new()

See L<perltypes/"Compile-time type optimizations">

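What a const @ISA would buy can be approximated by hand today. This is a
sketch under assumptions, not the proposed implementation:
C<Internals::SvREADONLY> is a core but internal helper that stands in for
C<our const @ISA>, and C<can> stands in for the method resolution an optimizer
would do once at compile time.

```perl
use strict;
use warnings;

package MyBase  { sub new { my $class = shift; return bless {@_}, $class } }
package MyChild { our @ISA = ('MyBase') }

package main;

# Freeze the inheritance list (stand-in for "our const @ISA").
Internals::SvREADONLY(@MyChild::ISA, 1);

# Resolve the method once, as a compile-time optimizer could.
my $new = MyChild->can('new');
my $obj = $new->('MyChild', name => 'x');

print ref($obj), "\n";
print $new == \&MyBase::new ? "resolved to MyBase::new\n" : "resolved elsewhere\n";

# Later attempts to change @ISA now fail at run-time.
my $changed = eval { push @MyChild::ISA, 'Other'; 1 };
print "ISA changed: ", ($changed ? "yes" : "no"), "\n";
```

With a frozen @ISA the cached code reference cannot become stale through
inheritance changes, which is the precondition for the optimization.
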
Static compilers such as L<B::C> and L<B::CC> can optimize storage and
run-time performance further. Most of the previously dynamically
allocated data can be made static (no parsing time, instant startup
time), some data can use COW.

The management win would be to actually enable strict typing rules
and to catch coding mistakes at compile-time.
See L<perltypes/"Compile-time type checks">

=head2 MOP with 5.18

With the new MOP we can mark classes as C<closed> or immutable
(= optimizable), and C<type> function arguments (optimizable method
calls + optional type checks).

    package My;

    class MyBase (is_closed => 1) { sub new { my $class = shift; bless {@_}, $class } }
    class MyDog (is_closed => 1, extends => MyBase) {
      #sub disallowed!
      method bark (Dog $dog) { print "bark ",ref $dog; }
    }

=head2 const

d1c07ab @rurban pddtypes.pod: more in =head2 const
authored
const is added as a type qualifier to lexical declarations and to packages
with the feature B<'const'>, which was added with v5.18.

Declaration of C<my const> variables and packages allows compile-time
checks, shared strings, lexical scope instead of C<use constant>,
familiar syntax, and efficient optimizations, also for extensions
such as L<p5-mop>, L<coretypes> or L<B::C>.
const packages allow compile-time optimizations of method calls.

    use v5.18;
    my const $a = 0;
    my const @a = (0..9);
    my const %h = ('ok'  => '1',
                   'bad' => '2');
    const package foo {
      ...
    }

Why not as attribute C<my $var :const>?

1. This was my first proposal. I<blogs.perl Feb 2011>

2. This attribute must also be special cased in core as my const.
The hard part is optree support for const pads, not the keyword.

3. Third party modules cannot use this attribute, as there is no
CHECK time hook for attributes yet, only at run-time. But at
run-time it is too late.

4. Internally both implementations look bad, so I'll try both.
It will be easier to use for C<class>, as class declarations
are already overloaded with new syntax: extends, with, is,
is_closed, metaclass, DEMOLISH, BUILD, FINALIZE.
class NAME (is_closed => 1) {} currently defines an
immutable, constant class.

5. my const looks better and more familiar

    my const($a, int $b) = (0,1);
    my (const $a, int $b) = (0,1);
  vs:
    my ($a:const, int $b:const) = (0,1);

=head2 p5 does NOT declare a type system

p5 only allows storing types in C<comppad_names> and does some
compile-time optimizations with const and methods. It only declares
const.

p5 does not declare a type system by itself. One must use an extension
which declares and handles its types. p5-mop is such a meta type system.

L<coretypes> declares the native core types int, double and string for
IV, NV and PV, for scalars, arrays and hashes. Not for functions yet,
as function parameters are handled by extensions. coretypes is
backwards compatible.

There is no p5 super object, such as C<class> or C<object>. Maybe the
mop needs one, but the three coretypes do not inherit. There is no
C<< $var->print >>, C<< @a->reverse >> and such planned.
This can be done by mixins. coretypes will be slim, p5-mop will be fat.
But hopefully optimizable (i.e. compile-time) in the general case.

PS: Several people at YAPC expressed their wish to make immutable classes
the new default. Then there must be a syntax to allow run-time changes
(i.e. non-constant classes).

    class NAME (is_closed => 0) {}
    class NAME :mutable {}

or such.

=head2 Acceptable type upgrades with const and coretypes

const variables are not purely constant; they differ from strictly
typed variables.
const variables may be upgraded to their string or numeric representation.
They may be numified and/or stringified; strictly typed variables may not.

    my const $a = 1;
    my const $s = "1";
    my int $ti = 0;
    my string $ts = "1";

    print "my $a";    # valid, upgraded from IV to PVIV
    print "my $ti";   # invalid, compile-time type violation error,
                      # the int IV cannot be stringified
    # You have to use:
    sprintf "my %d",$ti; # or
    print "my ",$ti;

    $g = $s + 1;      # valid, upgraded from PV to PVIV
    $g = $ts + 1;     # invalid numify, compile-time type violation error
    $g = 0+$ts;       # invalid numify, compile-time type violation error

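The const half of these rules matches how read-only scalars already behave.
A minimal sketch, assuming C<Internals::SvREADONLY> (a core but internal
helper) as a stand-in for C<my const>; the variable names are illustrative,
and the strictly typed C<int>/C<string> cases have no equivalent in current
perl and are not shown.

```perl
use strict;
use warnings;

# Stand-ins for "my const $c = 1" and "my const $s = '1'".
my $c = 1;   Internals::SvREADONLY($c, 1);
my $s = "1"; Internals::SvREADONLY($s, 1);

my $text = "my $c";      # stringifying a read-only integer is allowed
my $g    = $s + 1;       # numifying a read-only string is allowed
print "$text $g\n";

# Only assignment to the variable itself is refused.
my $assigned = eval { $c = 2; 1 };
print "assignment allowed: ", ($assigned ? "yes" : "no"), "\n";
```
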
=head2 const and magic @_

How about function arguments and return value constness?

Return values are always copied without keeping constness from within a function.

    # valid
    perl -MReadonly -e'sub ret { Readonly my $i => 1; $i} my $x=ret();$x+=1'

Arguments handled by reference keep constness:

    # invalid
    perl -MReadonly -e'sub get { Readonly $_[0]; } my $x=1; get($x); $x+=1;'
    => Modification of a read-only value attempted

Arguments copied by value from @_ lose constness:

    # valid
    perl -MReadonly -e'sub get { my $x=shift; Readonly $x; } my $x=1; get($x); $x+=1;'
    perl -MReadonly -e'sub get { my $x=shift; $x+=1; } Readonly my $x=>1; get($x);'

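The same three cases as one script that needs no CPAN module. This is a
sketch: C<Internals::SvREADONLY> replaces C<Readonly>, and the sub names
C<ret>, C<freeze> and C<bump> are made up.

```perl
use strict;
use warnings;

# 1. A returned value is a copy and can be modified.
sub ret { my $i = 1; Internals::SvREADONLY($i, 1); return $i }
my $x = ret();
$x += 1;

# 2. $_[0] aliases the caller's variable, so the flag sticks to it.
sub freeze { Internals::SvREADONLY($_[0], 1); return }
my $y = 1;
freeze($y);
my $y_changed = eval { $y += 1; 1 };

# 3. shift copies the argument; the copy is writable.
sub bump { my $v = shift; $v += 1; return $v }
my $z = 1;
Internals::SvREADONLY($z, 1);
my $bumped = bump($z);

print "x=$x y_changed=", ($y_changed ? 1 : 0), " bumped=$bumped z=$z\n";
```
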
=head2 Declare function signatures and types (target 5.20)

perlsub has this to say:
"Some folks would prefer full alphanumeric prototypes. Alphanumerics have been
intentionally left out of prototypes for the express purpose of someday in the
future adding named, formal parameters. The current mechanism's main goal is to let
module writers provide better diagnostics for module users. Larry feels the
notation quite understandable to Perl programmers, and that it will not intrude
greatly upon the meat of the module, nor make it harder to read. The line noise is
visually encapsulated into a small pill that's easy to swallow."

We want to optionally declare function parameter names and types and
the return type. There is no need to come up with new keywords like
fun, just separate sub prototypes from sub parameters. The simple rule
is: If there is a whitespace or alphanumeric sequence in the prototype,
it's not a prototype. The general rule, esp. for single parameters: If
there is any non-prototype character, it's a parameter declaration.

Prototypes change the parser bindings, named function signatures
avoid manual @_ extraction, and function types and const declarations
catch type errors earlier and help with compile-time
optimizations.

    sub fun ($arg1) {}

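For comparison, this is what "changes the parser bindings" means for an
existing prototype. A small sketch with made-up sub names: the C<(\@)>
prototype makes the parser pass the array by reference, while the
unprototyped sub receives the flattened list.

```perl
use strict;
use warnings;

# The (\@) prototype makes the parser pass a reference to the array
# instead of flattening it into @_.
sub count_of (\@) { my $aref = shift; return scalar @$aref }

# Without a prototype the same call flattens the array.
sub first_arg { return $_[0] }

my @list = (10, 20, 30);
print count_of(@list), "\n";    # number of elements
print first_arg(@list), "\n";   # first element
```
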
Function parameters have optional names; if they are named, use the names
inside the function. @_ is then not used externally, only internally.

    sub adder ($arg1, $arg2) { $arg1 + $arg2 }

You are able to declare types of function parameters and return values.

    int sub adder (const int $arg1, const int $arg2) { $arg1 + $arg2 }

You can also use types only, without names. Note that the C<;> semicolon here
denotes the 2nd argument as optional.

    int sub adder (const int; const int) { shift + shift }

With names it is better to use the C<=> syntax to declare an optional
parameter together with its default value.
Default values can be constants or variables, but not function calls.

    int sub adder (const int $arg1, const int $arg2=0) { $arg1 + $arg2 }
    adder(1);
    adder(1,1);

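The untyped part of this syntax can be tried with the C<signatures> feature
that shipped as experimental in perl 5.20. This is a sketch of that existing
feature, not of the typed syntax proposed here; it assumes perl 5.20 or later.

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

# Named parameters, the second one optional with a default value.
sub adder ($arg1, $arg2 = 0) { return $arg1 + $arg2 }

print adder(1), "\n";
print adder(1, 1), "\n";

# The argument count is checked at run-time.
my $ok = eval { adder(); 1 };
print "adder() without arguments accepted: ", ($ok ? "yes" : "no"), "\n";
```
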
Arguments are copied by default. To pass by reference, as with $_[0],
which changes the passed value, use the ref syntax C<\$name>.

    int sub adder (int \$arg1, const int $arg2=0) { $arg1 += $arg2 }
    my $i=0;
    adder($i,1);
    adder($i,1);
    print $i;
    => 2

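In current perl the same effect needs the @_ alias directly. A minimal
sketch of what the C<\$arg1> declaration would replace:

```perl
use strict;
use warnings;

# $_[0] is an alias for the caller's variable, so += changes $i itself.
sub adder { $_[0] += $_[1] // 0; return $_[0] }

my $i = 0;
adder($i, 1);
adder($i, 1);
print "$i\n";   # prints 2
```
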
See also L<Method::Signatures>.

=head3 Return type declarations

The parameters were straightforward, but this is now hairy, as there
are many competing syntax variants currently used.

At first: return types lose constness.

    const int
    sub ret_const {                # => returns const int
      Readonly my $foo => 1;
      my const $myarg = shift;

      $_[0] = 2;                   # run-time error byref
      $foo                         # return a copy of a const
    }

    my const $arg = 1;
    my $foo = ret_const($arg);
    $foo += 1;

Pointy Variant

    sub function (int $i -> int) {}

Hashy Variant

    sub function (int $i => int) {}

Old Variant (use typesafety)

    sub function (int; int $i) {}

Modern C-like variant

    int sub function (int $i) {}

42b0094 @rurban pddtypes.pod: add section "Changes & existing bugs"
authored
=head2 Changes & existing bugs

=head3 Can't declare subroutine entry in "my"

Multiple B<my> declarations with types are not parsed correctly.

    perl -e'$i::x; my (j $a, i $b)=(1,2);'
    => Can't declare subroutine entry in "my" at -e line 1, near ")="

    perl -e'$i::x; my (i $a)=(1);'
    => Can't declare subroutine entry in "my" at -e line 1, near ")="

This is a parser problem.
A C<my (TYPE EXPR, ...)> declaration should not be mixed up with a function call,
i.e. subroutine entry. C<my> is a reserved keyword and cannot be the name of
a subroutine.

The error should be the same as with

    perl -e'my j $a;'
    => No such class j at -e line 1, near "my j"

and

    perl -e'$i::x; my i $a;'

parses correctly.

=head1 Private implementation notes

Please ignore

=head2 optree traversal

Generally, optree traversal to do optimizations in perl5 is forward only.
You do not know the previous op, even kids do not know their parents.
Some ops have special pointers backwards though, such as some UNOPs and BINOPs.
So to do a simple tree-optimization, such as

    my const $i = 0;

    const (IV 0)
    padsv [$a]/CONST
    sassign *

which is represented in the optree as

    5 <2> sassign vKS*/2 ->6
    3    <$> const(IV 0) s ->4
    4    <0> padsv[$a:1,2] sRM*/LVINTRO,CONSTINIT ->5

you need to step forward to sassign to see that the 2nd argument
padsv is constant, so either in the case of

    $i = 1; # $i being a const pad

    9 <2> sassign vKS/2 ->a
    7    <$> const(IV 1) s ->8
    8    <0> padsv[$a:1,2] sRM* ->9

which is illegal and throws a compiler error.

Or in the case above as

    my const $i = 0;

which is a valid initialization, and allows overwriting the constant $i pad.

1.

One would like to do constant folding directly at sassign, to check if the rhs of
the expression evaluates to a constant.

    my $i = 1 + ($a << 2);

constant if $a is const, otherwise not.

If so, the rhs can be shortened to CONST. No need for various compiler passes
through the whole optree. I<"localize optimizations">

    my $i;
    my const $a = 2;
    $i = 1 + ($a << 2);
    =>
    const(IV 2)
    padsv [$a] /CONST
    sassign /CONSTINIT
    const(IV 9)      <=== optimized 1 + (2 << 2)
    padsv [$i]
    sassign

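Folding of literal operands already happens and can be inspected with
L<B::Concise>. This is a sketch that only shows the existing behaviour for
literals; folding through a const lexical as described above is the proposal
and is not demonstrated. The anonymous sub is just a container for the
expression.

```perl
use strict;
use warnings;
use B::Concise ();

# Render the optree of the sub in execution order into a buffer.
my $walker = B::Concise::compile('-exec', sub { my $i; $i = 1 + (2 << 2); return $i });
B::Concise::walk_output(\my $buf);
$walker->();
print $buf;

# The literal expression shows up as a single const op holding 9.
# The bracket style depends on whether perl was built with threads.
print "folded to 9: ", ($buf =~ /const[\[(]IV 9[\])]/ ? "yes" : "no"), "\n";
```
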
2.

Or get rid of a const padsv initialization by sassign altogether
if the rhs is already parsed to a constant scalar.

    perl -DT -e'my const $a=0;'

    0:LEX_NORMAL/XSTATE "\n;"
    <== MY(ival=1)

    1:LEX_NORMAL/XTERM "$a=0;\n"
    <== '$'

    1:LEX_NORMAL/XOPERATOR "=0;\n"
    Pending identifier '$a'
    <== PRIVATEREF(opval=op_padany)

    1:LEX_NORMAL/XOPERATOR "=0;\n"
    <== ASSIGNOP(ival=op_null)

    1:LEX_NORMAL/XTERM "0;\n"
    Saw number in ";\n"
    <== THING(opval=op_const) IV(0)

    1:LEX_NORMAL/XOPERATOR ";\n"
    <== ';'

    1:LEX_NORMAL/XSTATE "\n"
    <== ';'

    1:LEX_NORMAL/XSTATE ""
    Tokener got EOF
    <== EOF

parsed as:

    MY $a(padany) ASSIGNOP THING const(IV 0);

compiled to:

    const(IV 0)
    padsv[$a]/CONST
    sassign /CONSTINIT

Since padsv already knows that it will be assigned a CONST, and that it is const,
it should store the value and the READONLY flag and omit the CONST and SASSIGN ops altogether.

=head2 CONST

It would be nice to compile my const $i=1 to
CONST(IV=1) instead of PADSV($i) with $i SVf_READONLY
to have faster access for the optimizer.

But the compiler needs to find lexical pads in the scope
upwards, and I'm not sure if a CONST->op_sv assumption
pointing to a pad is a good idea. A lexical is still a
lexical.
Every PAD*V($i) with $i SVf_READONLY should be marked as
op_private = OPpPAD_CONST when the PAD is created (looked up)
to be easier and more reliably detectable by the compiler,
and visible via B::Concise/Deparse.

store my const $padsv as OP_CONST with ->op_sv pointing to the pad?
at all, convert early or later?
how about pad_findlex and lexical scoping rules then?

    { my const $i;
      sub x {$i+20}
    }

seems to be safe to convert early, and not waste a padsv.
not for padav and padhv.

How to pad_findlex a const $i if optimized to OP_CONST?
How about dynamic scope at run-time?
How about late binding CvLATE: ANON and PVFM. (delayed creation of the pad)
intro_my?

if so:

    ck: safe to strip off padsv and sassign
    CONST IV=1

Nope. Better add const PADSV to the optimizers.
CONST wants its sv as sv.

--

    my const $i=1; # simple scalar case

    Lexer:
      MY(ival=2)
      PRIVATEREF padany
      ASSIGNOP
      Initialize my const $i
      const IV 1
    op:
      CONST IV=1
      PADSV OPpPAD_CONST+OPpPAD_CONSTINIT, targ -> READONLY
      SASSIGN OPf_SPECIAL (for const init temp. overwrite)

--

    my const ($i)=(1); # list case

    Lexer:
      MY(ival=2)
      PRIVATEREF padany
      ASSIGNOP
      Initialize my const $i
      const IV 1
    op:
      CONST IV=1
      PADSV OPpPAD_CONST+OPpPAD_CONSTINIT, targ -> READONLY
      SASSIGN OPf_SPECIAL (for const init temp. overwrite)

--

    my const @a=(1); # PADAV

    Lexer:
      MY(ival=2)
      PRIVATEREF padany
      ASSIGNOP
      Initialize my const @a
      const IV 1
    op:
      pushmark
      CONST IV=1
      pushmark
      PADAV OPpPAD_CONST+OPpPAD_CONSTINIT, targ -> READONLY
      AASSIGN OPf_SPECIAL

--

    my const $i;
    ...

should warn: Uninitialized const $i at -e, line 1

constant folding:

    my const $i=1; $x=$i+20;

    const 1
    padsv $i
    sassign
    =>
    padsv $i      const 21
    const 20
    add
    gvsv          gvsv
    sassign       sassign