Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP
Newer
Older
100644 231 lines (154 sloc) 10.681 kB
e41f011 @rurban 0.05_02: added YAPC::US 2011 slides
authored
1 Now that I have ["almost"](http://code.google.com/p/perl-compiler/source/detail?r=912) fixed the remaining compiler bugs, I wanted to use the existing framework to enable the possible speedup by using types, esp. low-level internal types. C-style integers and double scalars are used internally in [**B::CC**](http://search.cpan.org/dist/B-C/lib/B/CC.pm) instead of full Perl `IV`'s / `NV`'s if declared as such, and thus greatly improves execution speed. Esp. for inlinable arithmetic and comparison blocks.
2
3 Only at the end of such a block (and only when really needed) the calculated C vars are written back to the perl pad. So I needed better [**Opcodes**](http://search.cpan.org/dist/Opcodes) flags to define which ops read or write pads, and more possible optimization hints. Most of these flags should be added to core later, when the time will come to speed up not only the `B::CC` compiler, but also the internal perl compiler.
4
5 But how to declare types?
6 =========================
7
8 There are two attractive possibilities, and the existing hack. There's also the possibility to use [**Ctypes**](http://gitorious.org/perl-ctypes) to define stricter types (`c_int`), but this would be misleading. Ctypes are for external vars only, not for mix of internal and externals (compiled / uncompiled).
9
10 First the hack:
11
12 1. name magic
13 ------------------
14
15 Every lexical variable suffixed by "_i", "_d" and an optional "r" (register) suffix are declared as int, double and register (hot, short lived).
16 This disturbs the namespace and should be possible to disable by a cmdline switch.
17
18 **-fno-name-magic** is the new switch since B-C-1.30 to disable the type specification with the old magic variable names. With perl-5.16 `-fname-magic` will not be selected by default anymore, you'll have to specify it explictily.
19
20 2. type classes
21 ------------------
22
23 my int $i; my double $d;
24 my string %hash; my int @array;
25
26 Easy readable, but you need to define the `int` and `double` packages in your code, so that perl can parse it.
27
28 Luckily there exists a types module already which does compile-time type checking and exactly defines those classes. [**use types;**](http://search.cpan.org/dist/types/) by Artur Bergman from 2002.
29 Well, float so far, but I will rename it to double.
30
31 Compile-time type checking slows down execution in pure perl, but will lead to dramatic performance gains if compiled.
32 And the good thing: It can also be used to increase pure perl speed by using similar optimizations from B::CC in core also. [i_opt](https://github.com/rurban/perl/tree/i_opt), special CORE flags and faster access for arrays and hashes is the idea.
33
34 3. type attributes
35 ---------------------
36
37 Type classes are not the full story, we also need type attributes, such as **:readonly** for more massive speedups, and **:unsigned** to specify UV's instead of IV's, and widening the type checks and possible types for stringified vars e.g. such as
38
39 my int $x : string; # PVIV
40
41 Using attributes needs to pollute the package namespace with some magic function names. This is a lame design but I think I found the least polluting one (which is certainly not using `Attribute::Handler`).
42
43 There is `Readonly`, but this does no performance, only type safety.
44 `:const` will lead to internal optimizations as done in B::CC, such as not writing back from C to perl, and using special datastructures (perfect hashes, faster and smaller arrays, simplier loops, ...).
45
46 use types (the module as planned)
47 ===========================
48
49 This pragma does compile-time type checking.
50
51 It also permits internal compiler optimizations, and adds type attribute definitions.
52
53 SYNOPSIS
54 -------------
55
56 my double $foo = "string"; #compile time error
57 sub foo (int $foo) { my ($foo) = @_ };
58 foo("hi"); #compile time Type mismatch error
59
60 my int $int;
61 sub foo { my double $foo; return $foo }
62 $int = $foo; # compile time Type mismatch error
63
64 my int @array = (0..10); # optimized internal representation
65 $array[2] = 2; # no SV, just the raw int at slot 2.
66
67 my int @array :const = (0..10); # even more optimized internal representation
68 $array[2] = 2; # int @array is readonly
69 $array[2] = '2'; # compile time Type mismatch error
70
71 my string %hash : readonly = (foo => any, bar => any); # optimized gperf representation
72 print $hash{foo}; # faster lookup
73 $hash{new} = 1; # compile time Type mismatch error
74
75 DESCRIPTION
76 ------------------
77
78 This pragma uses the optimize module to analyze the optree and
79 turns on compile-time type checking.
80
81 It is also the base for optimizing compiler passes, for perl CORE (planned)
82 and for `B::CC` compiled code (done).
83
84 Currently we support SCALAR lexicals with the type classes int, double, number, string and user defined classes.
85
86 The implicit casting rules are as follows:
87
88 int < > number
89 int > double
90 double < > number
91 number > string
92
93
94 Normal type casting is allowed both up and down the inheritance tree,
95 so in theory user defined classes should work already, requires one
96 to do use base or set `@ISA` at compile-time in a BEGIN block.
97
98 Implemented are only `SCALAR` types yet, not `ARRAY` nor `HASH`.
99
100 ATTRIBUTES
101 -----------------
102
103 Planned type attributes are **:int**, **:double**, **:string**,
104 **:unsigned**, **:const** and **:readonly**, eventually also **:register** and **:temporary** (as used in the current compiler)
105
106 The attributes are perl attributes, and `int|double|string` are either
107 classes or hint attributes for more allowed types.
108 If defined as class, compile-time type checks are performed, if as attribute not. Only compiler optimizations are used then.
109
110 my int $i :double; # declares a NV with SVf_IOK.
111 my $i:int:double; # same but without type-check
112 my int $i; # declares an IV.
113 my $i:int; # same but without type-check
114 my int $i :string; # declares a PVIV. An int with allowed stringify and read-as-string.
115
116 my int @array :unsigned = (0..4);
117 # Will be used as c var in faster arithmetic and cmp.
118 # Will use no SV value slot, just the direct value.
119
120 my string %hash : readonly = (foo => any, bar => any);
121 # declare string keys only
122 # and may be generated as read-only perfect hash.
123
124 **:unsigned** is valid for int only and declares an UV. This helps e.g. `i_opt` with bit manip.
125
126 **:const** and **:readonly** throw a compile-time error on write access and may
127 optimize the internal structure of the variable.
128 E.g. In `CORE` hashes may be optimized to perfect hashes for faster lookup.
129 Array values for int and double can be optimized (faster accesss, less memory).
130 In `B::CC` we don't need to write back the variable to perl (lexical write_back).
131
132 **:register** denotes optionally a short and hot life-time. for loop
133 variables are automatically detected as such.
134
135 **:temporary** are usually generated internally with nameless lexicals.
136 They are more aggressivly destroyed and ignored.
137
138 STATUS
139 -----------
140
141 `types` has a lot of yet failing dependencies, but I'm working on it.
142 Non-core support for `B::Generate` is the worst.
143
144 OK (classes only):
145
146 my int $i;
147 my double $d;
148
149 NOT YET OK (attributes and non-scalars):
150
151 my int $i :register;
152 my $i :int;
153 my $const :const :int;
154 my int $uv :unsigned;
155
156 my int @array;
157 my string %hash;
158 my int $arrayref = [];
159 my int $hashref = {};
160
161 Return values
162 -----------------
163
164 Return values are implicitly figured out by the subroutine, this includes
165 both falling of the end or by expliticly calling return, if two return
166 values of the same sub differ you will get an error message.
167
168 Arguments
169 -------------
170
171 Arguments are declared with prototype syntax, they can either be named
172 or just typed, if typed only the calling convertions are checked, if
173 named then that named lexical will get that type without the need
174 for expliticty typing it, thus allowing list assignment from @_.
175
176 Thanks Artur and Malcolm who designed this years ago.
177 New will be only type optimizations and attributes, the API is stable since years.
178 Just unused.
179 Time to speed up perl 5.16.
180
181 [https://github.com/rurban/types/](https://github.com/rurban/types/)
182
183
184 Implementation ideas
185 ================
186
187 Perfect hashes
188 -------------------
189
190 Modern hash functions can nowadays be generated at run-time, leading to perfect hash functions or even minimal perfect hash functions, even for billions of keys.
191
192 Hashes declared as **readonly**, such as `my %hash : ro = (key => value, ...);`
193 or as readonly hashref can be *pre-studied*.
194 Which means a dynamic hashing function is generated at run-time using the preferred [BDZ](http://cmph.sourceforge.net/bdz.html) or also called RAM hash function, as implemented under the LGPL in the [cmph library](http://cmph.sourceforge.net/).
195
196 Hashes which are generated at run-time benefit from a new op which creates the
197 perfect hash function at run-time in RAM also via BDZ.
198 We will use the existing function **`study %hash;`** or **`study $hashref;`**.
199
200 Such perfect hashes can only be generated for uniform key types of course, strings usually. We can decide later if to use the simplier perfect hash, or a stronger and better minimal perfect hash, maybe we will decide this on the hash size.
201
202 Space and time costs for the generation and final storage of a BDZ function are linear, O(n) bits for the generation step, 1bit per key for the generated hash function,
203 lookup is O(1) (*no collisions*), see the [detailed analysis](http://cmph.sourceforge.net/papers/thesis.pdf).
204
205 Of course for the compiler with readonly hashes we can think of using **gperf**,
206 which needs much longer and cannot be used at run-time, but generates better and smaller hash functions.
207 Heck we don't even use good hashes for generated XS constants yet, only `Win32::GUI` does.
208
209
210 Typed arrays
211 -----------------
212
213 A typed array such as `my int @array;` uses uniform types for the values not the typed keys as for perfect hashes. An array key is always an int.
214 We can use the existing `AvARRAY` buffer, just with direct access to the value slot.
215 We need no SV indirection, which saves time and space.
216 We will need a new `AvTYPED` flag and either restrict to IOK, NOK and POK, or add
217 a xpvav slot for a broader type declaration.
218
219 The implementation is straightforward.
220
221 Links
222 -------
223 [http://perl.plover.com/classes/typing/](http://perl.plover.com/classes/typing/)
224
225 [http://search.cpan/dist/types](http://search.cpan/dist/types)
226
227 [http://search.cpan/dist/typesafety](http://search.cpan/dist/typesafety)
228
229 [http://search.cpan.org/dist/Moose/lib/Moose/Manual/Types.pod](http://search.cpan.org/dist/Moose/lib/Moose/Manual/Types.pod)
230
Something went wrong with that request. Please try again.