35b909c @rurban add pddtypes.pod - executive summary for const and types
authored
1 =head1 SUMMARY
2
3 perl design draft - types and const
4
5 =head2 Perl already has a type system
6
7 E.g. C<my Dog $wuff>; stores the "Dog" stashname in comppad_names.
8
9 Only for lexicals (which is good and strict-safe). Almost nobody uses it.
10 There are ~5 typed modules on CPAN, Net::DNS the most prominent.
11 5.10 broke all old type modules. Moose started afresh, but naive and
12 different.
13
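As a minimal sketch of the existing typed-lexical syntax: current Perl already accepts a class name between C<my> and the variable, provided the package is known at compile time. The C<Dog> class and its C<bark> method are made up for illustration only.

```perl
use strict;
use warnings;

# Hypothetical class, only here so that "Dog" is a known package name.
package Dog {
    sub new  { my $class = shift; return bless {}, $class }
    sub bark { return "wuff" }
}

package main;

# The type name is recorded with the lexical at compile time.
# Plain perl does not enforce it at run time.
my Dog $wuff = Dog->new;
print $wuff->bark, "\n";
```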
The biggest performance win would be the ability to mark packages and
their @ISA as B<readonly>, i.e. B<const>, or to type object instances,
so that method calls can be resolved and optimized at compile-time.
Further wins: hash function optimizations (I<perfect> hashes), natively
typed arrays and hashes, more constant folding and maybe shared strings.
19
    const package MyBase 0.01 {
      our @ISA = ();
      sub new { my $class = shift; bless { @_ }, $class }
    }
    const package MyChild 0.01 {
      our const @ISA = ('MyBase');
    }

    my $obj = MyChild->new;
    => MyBase::new()
30
31 See L<perltypes/"Compile-time type optimizations">
32
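To illustrate what such a compile-time method resolution would compute, here is a run-time sketch in today's Perl. It resolves the method once with C<can> and then calls the resulting code reference directly. This only mimics the effect and is not the proposed optimization itself. The package names follow the example above.

```perl
use strict;
use warnings;

package MyBase  { sub new { my $class = shift; return bless { @_ }, $class } }
package MyChild { our @ISA = ('MyBase'); }

package main;

# Walk @ISA once and remember which sub "new" resolves to.
# With a const @ISA the compiler could do this lookup itself.
my $resolved = MyChild->can('new');
my $is_base  = $resolved == \&MyBase::new ? 1 : 0;

# Call the resolved sub directly, skipping further method lookup.
my $obj = MyChild->$resolved();
print ref($obj), " built by MyBase::new: $is_base\n";
```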
Static compilers such as L<B::C> and L<B::CC> can optimize storage and
run-time performance further. Most of the previously dynamically
allocated data can be made static (no parsing time, instant startup
time), and some data can use COW.
37
38 The management win would be to actually enable strict typing rules
39 and to catch coding mistakes at compile-time.
40 See L<perltypes/"Compile-time type checks">
41
42 =head2 MOP with 5.18
43
44 With the new MOP we can mark classes as C<closed> or immutable
45 (= optimizable), and C<type> function arguments (optimizable method
46 calls + optional type checks).
47
    package My;

    class MyBase (is_closed => 1) { sub new { my $class = shift; bless {@_}, $class } }
    class MyDog (is_closed => 1, extends => MyBase) {
      #sub disallowed!
      method bark (Dog $dog) { print "bark ",ref $dog; }
    }
55
56 =head2 const
57
Declaring C<my const> variables and packages allows compile-time
checks, shared strings, lexical scope instead of C<use constant>,
familiar syntax, and efficient optimizations, also for extensions
such as L<p5-mop>, L<coretypes> or L<B::C>.
62
    use coretypes;
    my const int @a = (0..9);
    my const string %h
      = ('ok' => '1',
         'bad' => '2');
68
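For comparison, a sketch of what is available today: C<use constant> creates a package-scoped sub, while a lexical can only be made read-only at run time. The sketch uses the core-internal C<Internals::SvREADONLY> as a stand-in for C<my const>; this is an assumption for illustration, not the proposed implementation.

```perl
use strict;
use warnings;

# Package-scoped: "use constant" installs a sub named main::LIMIT.
use constant LIMIT => 10;

my $err;
{
    # Lexically scoped and read-only, roughly what "my const $limit = 10" asks for.
    my $limit = 10;
    Internals::SvREADONLY($limit, 1);

    # A write now dies at run time; the proposal wants this caught at compile time.
    $err = eval { $limit = 11; 1 } ? '' : $@;
}

print "LIMIT is still visible here: ", LIMIT, "\n";
print "write to the read-only lexical failed: $err";
```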
69 Why not as attribute C<my $var :const>?
70
71 1. This was my first proposal. I<blogs.perl Feb 2011>
72
73 2. This attribute must also be special cased in core as my const.
74 The hard part is optree support for const pads, not the keyword.
75
76 3. Third party modules cannot use this attribute, as there is no
77 CHECK time hook for attributes yet, only at run-time. But at
78 run-time it is too late.
79
4. Both internal implementations look bad; I'll try both.
It will be easier to use for C<class>, as class declarations
are already overloaded with new syntax: extends, with, is,
is_closed, metaclass, DEMOLISH, BUILD, FINALIZE.
C<< class NAME (is_closed => 1) {} >> currently defines an
immutable, constant class.
86
87 5. my const looks better and familiar
88
    my const($a, int $b) = (0,1);
    my (const $a, int $b) = (0,1);
  vs:
    my ($a:const, int $b:const) = (0,1);
93
94 =head2 p5 does NOT declare a type system
95
96 p5 only allows storing types in C<comppad_names> and does some
97 compile-time optimizations with const and methods. It only declares
98 const.
99
100 p5 does not declare a type system by itself. One must use an extension
101 which declares and handles its types. p5-mop is such a meta type system.
102
L<coretypes> declares the native core types int, double and string for
IV, NV and PV, for scalars, arrays and hashes. Not for functions yet,
as function parameters are handled by extensions. coretypes is
backwards compatible.

There is no p5 super object, such as C<class> or C<object>. Maybe the
mop needs one, but the three coretypes do not inherit. No
C<< $var->print >>, C<< @a->reverse >> and such are planned.
This can be done by mixins. coretypes will be slim, p5-mop will be fat,
but hopefully optimizable (i.e. at compile-time) in the general case.
113
PS: Several people at YAPC expressed their wish to make immutable
classes the new default. Then there must be a syntax to allow run-time
changes (i.e. non-constant classes):

    class NAME (is_closed => 0) {}
    class NAME :mutable {}

or such.
120
121
122 =head2 Acceptable type upgrades with const and coretypes
123
const variables are not purely constant; they differ from strictly
typed variables.
A const variable may be upgraded to its string or numeric representation,
i.e. it may be numified and/or stringified. Strictly typed variables may not.
128
129
    my const $a = 1;
    my const $s = "1";
    my int $ti = 0;
    my string $ts = "1";

    print "my $a";          # valid, upgraded from IV to PVIV
    print "my $ti";         # invalid, compile-time type violation error,
                            # the int IV cannot be stringified
    # You have to use:
    sprintf "my %d",$ti;    # or
    print "my ",$ti;

    $g = $s + 1;            # valid, upgraded from PV to PVIV
    $g = $ts + 1;           # invalid numify, compile-time type violation error
    $g = 0+$ts;             # invalid numify, compile-time type violation error
145
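The upgrades mentioned in the comments above can be observed in current Perl with the core C<B> module. This is a small sketch; the helper C<sv_class> is a made-up name, and the exact class names printed may depend on the perl version.

```perl
use strict;
use warnings;
use B qw(svref_2object);

# Name of the B class for the internal SV type of a scalar
# (for example B::IV, B::PV or B::PVIV).
sub sv_class { my $ref = shift; return ref svref_2object($ref) }

my $n = 1;      # starts as an integer
my $s = "1";    # starts as a string
my ($n_before, $s_before) = (sv_class(\$n), sv_class(\$s));

my $text = "my $n";    # stringifies $n, the string is cached in the scalar
my $sum  = $s + 1;     # numifies $s, the integer is cached in the scalar

my ($n_after, $s_after) = (sv_class(\$n), sv_class(\$s));
print "\$n: $n_before -> $n_after\n";
print "\$s: $s_before -> $s_after\n";
```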
146 =head2 const and magic @_
147
148 How about function arguments and return value constness?
149
Return values are always copied out of a function and do not keep constness.

    # valid
    perl -MReadonly -e'sub ret { Readonly my $i => 1; $i} my $x=ret();$x+=1'

Arguments handled by reference keep constness:

    # invalid
    perl -MReadonly -e'sub get { Readonly $_[0]; } my $x=1; get($x); $x+=1;'
    => Modification of a read-only value attempted

Arguments copied by value from @_ lose constness:

    # valid
    perl -MReadonly -e'sub get { my $x=shift; Readonly $x; } my $x=1; get($x); $x+=1;'
    perl -MReadonly -e'sub get { my $x=shift; $x+=1; } Readonly my $x=>1; get($x);'
166
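The same three cases can be sketched as one script without the CPAN module. It uses the core-internal C<Internals::SvREADONLY> in place of C<Readonly>, which is a substitution made here for illustration; the sub names are made up.

```perl
use strict;
use warnings;

# Return value: the caller gets a copy, the read-only flag stays behind.
sub ret_ro { my $i = 1; Internals::SvREADONLY($i, 1); return $i }
my $copy = ret_ro();
$copy += 1;

# $_[0] aliases the caller's variable, so the flag lands on $x itself.
sub lock_alias { Internals::SvREADONLY($_[0], 1); return }
my $x = 1;
lock_alias($x);
my $alias_err = eval { $x += 1; 1 } ? '' : $@;

# shift copies the argument; only the copy becomes read-only.
sub lock_copy { my $y = shift; Internals::SvREADONLY($y, 1); return }
my $z = 1;
lock_copy($z);
$z += 1;

print "copy=$copy z=$z\n";
print "alias case: $alias_err";
```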
167 =head2 Declare function signatures and types (target 5.20)
168
perlsub has this to say:
170 "Some folks would prefer full alphanumeric prototypes. Alphanumerics have been
171 intentionally left out of prototypes for the express purpose of someday in the
172 future adding named, formal parameters. The current mechanism's main goal is to let
173 module writers provide better diagnostics for module users. Larry feels the
174 notation quite understandable to Perl programmers, and that it will not intrude
175 greatly upon the meat of the module, nor make it harder to read. The line noise is
176 visually encapsulated into a small pill that's easy to swallow."
177
We want to optionally declare function parameter names and types and
the return type. There is no need to come up with new keywords like
C<fun>, just separate sub prototypes from sub parameters. The simple rule
is: if there is whitespace or an alphanumeric sequence in the prototype,
it is not a prototype. The general rule, esp. for single parameters: if
there is any non-prototype character, it is a parameter declaration.
185
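The rule above can be sketched as a small check on the text between the parentheses. The helper name C<looks_like_prototype> and the exact set of prototype characters are assumptions made for this illustration, not part of the proposal.

```perl
use strict;
use warnings;

# Returns 1 if the string consists only of prototype characters
# ($ @ % & * ; \ [ ] + _), otherwise 0. Any whitespace or alphanumeric
# character therefore marks the string as a parameter declaration.
sub looks_like_prototype {
    my $spec = shift;
    return $spec =~ /\A[\$\@%&*;\\\[\]+_]*\z/ ? 1 : 0;
}

for my $spec ('$$', '\@;$', '$arg1, $arg2', 'const int; const int') {
    printf "%-22s %s\n", "($spec)",
        looks_like_prototype($spec) ? 'prototype' : 'parameter declaration';
}
```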
Prototypes change the parser bindings, named function signatures
avoid manual @_ extraction, and function types and const declarations
catch type errors earlier and help with compile-time
optimizations.
190
    sub fun ($arg1) {}

Function parameters have optional names; if named, they are used inside the function as such.
@_ is then not used externally, only internally.

    sub adder ($arg1, $arg2) { $arg1 + $arg2 }
197
You are able to declare types of function parameters and return values.

    int sub adder (const int $arg1, const int $arg2) { $arg1 + $arg2 }

You can also use types only, without names. Note that the C<;> semicolon here
denotes the 2nd argument as optional.

    int sub adder (const int; const int) { shift() + shift() }

With names it is better to declare optional parameters with the C<=> syntax
and a literal default value. Default values can be constants or variables, but not
function calls.

    int sub adder (const int $arg1, const int $arg2=0) { $arg1 + $arg2 }
    adder(1);
    adder(1,1);

Arguments are copied by default. To pass by reference, as with $_[0],
so that the function changes the passed value, use the ref syntax C<\$name>:

    int sub adder (int \$arg1, const int $arg2=0) { $arg1 += $arg2 }
    my $i=0;
    adder($i,1);
    adder($i,1);
    print $i;
    => 2
226
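For comparison, the untyped part of this can be sketched with the C<signatures> feature that later perls ship (assumed here: perl 5.20 or newer). Types, C<const> and the C<\$name> by-reference syntax are not part of that feature, so the by-reference variant below falls back to the C<$_[0]> alias. The name C<add_to> is made up.

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

# Named parameters with an optional second argument and a default value.
sub adder ($arg1, $arg2 = 0) { return $arg1 + $arg2 }

# By-reference behaviour without a signature: $_[0] aliases the caller's variable.
sub add_to { $_[0] += ($_[1] // 0); return $_[0] }

my $i = 0;
add_to($i, 1);
add_to($i, 1);
print adder(1), " ", adder(1, 1), " $i\n";
```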
See also L<Method::Signatures>.
228
229 =head3 Return type declarations
230
The parameters were straightforward, but return types are hairy, as there
are many competing syntax variants currently in use.

First of all: return values lose constness.

    const int
    sub ret_const {             # => returns const int
      Readonly my $foo => 1;
      my const $myarg = shift;

      $_[0] = 2;                # run-time error byref
      $foo                      # return a copy of a const
    }

    my const $arg = 1;
    my $foo = ret_const($arg);
    $foo += 1;

Pointy variant

    sub function (int $i -> int) {}

Hashy variant

    sub function (int $i => int) {}

Old variant (use typesafety)

    sub function (int; int $i) {}

Modern C-like variant

    int sub function (int $i) {}
257
283 =head1 Private implementation notes
284
285 Please ignore
286
287 =head2 optree traversal
288
Generally, optree traversal for optimizations in perl5 is forward only.
An op does not know the previous op; even kids do not know their parents.
Some ops have special backward pointers though, such as some UNOPs and BINOPs.
So to do a simple tree-optimization, such as

    my const $i = 0;

    const (IV 0)
    padsv [$a]/CONST
    sassign *

which is represented in the optree as

    5  <2> sassign vKS*/2 ->6
    3     <$> const(IV 0) s ->4
    4     <0> padsv[$a:1,2] sRM*/LVINTRO,CONSTINIT ->5

you need to step forward to sassign to see that the 2nd argument
padsv is constant, so either in the case of

    $i = 1; # $i being a const pad

    9  <2> sassign vKS/2 ->a
    7     <$> const(IV 1) s ->8
    8     <0> padsv[$a:1,2] sRM* ->9

which is illegal and throws a compiler error.

Or in the case above as

    my const $i = 0;

which is a valid initialization, and allows overwriting the constant $i pad.
319
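A forward-only walk in execution order can be sketched with the core C<B> module by following C<next> pointers from the start op of a sub. The helper name C<exec_order> is made up, and the op names printed depend on the perl version.

```perl
use strict;
use warnings;
use B qw(svref_2object);

# Collect op names in execution order by following op_next from the
# start op until the NULL op is reached. There is no step backwards.
sub exec_order {
    my $code = shift;
    my @names;
    for (my $op = svref_2object($code)->START; $$op; $op = $op->next) {
        push @names, $op->name;
    }
    return @names;
}

my @ops = exec_order(sub { my $i = 0; });
print join(" -> ", @ops), "\n";
```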
320 1.
321
322 One would like to do constant folding directly at sassign, to check if the rhs of
323 the expression evaluates to a constant.
324
    my $i = 1 + ($a << 2);

This is constant if $a is const, otherwise not.
328
329 If so, the rhs can be shortened to CONST. No need for various compiler passes
330 through the whole optree. I<"localize optimizations">
331
    my $i;
    my const $a = 2;
    $i = 1 + ($a << 2);
  =>
    const(IV 2)
    padsv [$a] /CONST
    sassign /CONSTINIT
    const(IV 9)         <=== optimized 1 + (2 << 2)
    padsv [$i]
    sassign
342
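What current Perl folds can be checked with the core C<B::Deparse> module. In this sketch the all-literal expression comes back folded, while the expression with an ordinary lexical operand is left alone, which is the gap a const pad would close. The variable names are only illustrative.

```perl
use strict;
use warnings;
use B::Deparse;

my $deparse = B::Deparse->new;

# All operands are literals: the compiler folds the expression.
my $literal = $deparse->coderef2text(sub { my $i = 1 + (2 << 2); });

# One operand is an ordinary lexical: nothing is folded.
my $lexical = $deparse->coderef2text(sub { my $n = 2; my $i = 1 + ($n << 2); });

print "literal operands:\n$literal\n";
print "lexical operand:\n$lexical\n";
```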
343 2.
344
Or get rid of the const padsv initialization by sassign altogether
if the rhs is already parsed to a constant scalar.
347
348 perl -DT -e'my const $a=0;'
349
350 0:LEX_NORMAL/XSTATE "\n;"
351 <== MY(ival=1)
352
353 1:LEX_NORMAL/XTERM "$a=0;\n"
354 <== '$'
355
356 1:LEX_NORMAL/XOPERATOR "=0;\n"
357 Pending identifier '$a'
358 <== PRIVATEREF(opval=op_padany)
359
360 1:LEX_NORMAL/XOPERATOR "=0;\n"
361 <== ASSIGNOP(ival=op_null)
362
363 1:LEX_NORMAL/XTERM "0;\n"
364 Saw number in ";\n"
365 <== THING(opval=op_const) IV(0)
366
367 1:LEX_NORMAL/XOPERATOR ";\n"
368 <== ';'
369
370 1:LEX_NORMAL/XSTATE "\n"
371 <== ';'
372
373 1:LEX_NORMAL/XSTATE ""
374 Tokener got EOF
375 <== EOF
376
parsed as:

    MY $a(padany) ASSIGNOP THING const(IV 0);

compiled to:

    const(IV 0)
    padsv[$a]/CONST
    sassign /CONSTINIT

Since padsv already knows that it will be assigned a CONST, and that it is const,
it should store the value and the READONLY flag and omit the CONST and SASSIGN ops altogether.
386
387 =head2 CONST
388
It would be nice to compile C<my const $i=1> to
CONST(IV=1) instead of PADSV($i) with $i SVf_READONLY,
to have faster access for the optimizer.

But the compiler needs to find lexical pads in the scope
upwards, and I'm not sure if a CONST->op_sv assumption
pointing to a pad is a good idea. A lexical is still a
lexical.
Every PAD*V($i) with $i SVf_READONLY should be marked as
op_private = OPpPAD_CONST when the PAD is created (looked up),
to be more easily and reliably detectable by the compiler,
and visible via B::Concise/Deparse.
401
402 store my const $padsv as OP_CONST with ->op_sv pointing to the pad?
403 at all, convert early or later?
404 how about pad_findlex and lexical scoping rules then?
405
406 { my const $i;
407 sub x {$i+20}
408 }
409 seems to be safe to convert early, and not waste a padsv.
410 not for padav and padhv.
411
412 How to pad_findlex a const $i if optimized to OP_CONST?
413 How about dynamic scope at run-time?
414 How about late binding CvLATE: ANON and PVFM. (delayed creation of the pad)
415 intro_my?
416
417 if so:
ck: safe to strip off padsv and sassign
419 CONST IV=1
420
421 Nope. Better add const PADSV to the optimizers.
422 CONST wants its sv as sv.
423
424 --
425
426 my const $i=1; # simple scalar case
427
428 Lexer:
429 MY(ival=2)
430 PRIVATEREF padany
431 ASSIGNOP
432 Initialize my const $i
433 const IV 1
434 op:
435 CONST IV=1
436 PADSV OPpPAD_CONST+OPpPAD_CONSTINIT, targ -> READONLY
437 SASSIGN OPf_SPECIAL (for const init temp. overwrite)
438
439 --
440
441 my const ($i)=(1); # list case
442
443 Lexer:
444 MY(ival=2)
445 PRIVATEREF padany
446 ASSIGNOP
447 Initialize my const $i
448 const IV 1
449 op:
450 CONST IV=1
451 PADSV OPpPAD_CONST+OPpPAD_CONSTINIT, targ -> READONLY
452 SASSIGN OPf_SPECIAL (for const init temp. overwrite)
453
454 --
455
456 my const @a=(1); # PADAV
457
458 Lexer:
459 MY(ival=2)
460 PRIVATEREF padany
461 ASSIGNOP
462 Initialize my const @a
463 const IV 1
464 op:
465 pushmark
466 CONST IV=1
467 pushmark
468 PADAV OPpPAD_CONST+OPpPAD_CONSTINIT, targ -> READONLY
469 AASSIGN OPf_SPECIAL
470
471 --
472
473 my const $i;
474 ...
475 should warn: Uninitialized const $i at -e, line 1
476
constant folding:

    my const $i=1; $x=$i+20;

    const 1
    padsv $i
    sassign
  =>
    padsv $i        const 21
    const 20
    add
    gvsv            gvsv
    sassign         sassign