= TITLE Performance

= SUBTITLE Measuring and improving speed and ram usage at run-time or compile-time

B < Make sure you're not wasting time on the wrong code > : start by identifying your
L < "critical 3%"|https://en.wikiquote.org/wiki/Donald_Knuth > by profiling your code (explained on this page).
This page describes some tools for identifying problem code and some ways to improve it; it's not exhaustive.

If you decide to talk about a performance problem, please first prepare a one-liner and/or public gist
that illustrates the problem.
Also, let folk know whether you are using Perl 6 in your $dayjob or exploring it for fun,
and think about the minimum speed increase (or ram reduction, or whatever) you want or need.
What if it took a month for folk to help you achieve that? A year?

= head1 Identifying the problem

= head2 C < now - INIT now >

Expressions of the form C < now - INIT now > , where C < INIT > is a
L < phase in the running of a Perl 6 program|/language/phasers > , provide a great idiom for timing code snippets.

Using the C < m: your code goes here > L < #perl6 channel evalbot|http://doc.perl6.org/language/glossary#camelia >
you can write lines like:

    m: say now - INIT now
    rakudo-moar 8bd7ee: OUTPUT « 0.0018558 »

The C < now > to the left of C < INIT > runs 0.0018558 seconds I < later > than the C < now > to the right of the C < INIT >
because the latter occurs during L < the INIT phase|/language/phasers#INIT > .
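
The same approach works locally. To time one specific piece of code, you can also use C < now > directly
(a minimal sketch; the calculation being timed is just illustrative):

    my $start = now;
    my $sum = [+] 1 .. 1_000_000;
    say "calculated $sum in { now - $start } seconds";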

= head2 C < prof-m: your code goes here >

Entering C < prof-m: your code goes here > in the L < #perl6 channel|http://doc.perl6.org/language/glossary#IRC >
invokes an evalbot that runs a Perl 6 compiler with a C < --profile > option.
The evalbot's output includes a link to L < profile info|https://en.wikipedia.org/wiki/Profiling_(computer_programming) > :

    yournick prof-m: say 'hello world'
    camelia prof-m 273e89: OUTPUT « hello world... »
    .. Prof: http://p.p6c.org/20f9e25

To learn how to interpret the profile info, ask questions on channel.

= head2 Profiling locally

When using the L < MoarVM|http://moarvm.org > backend, the L < Rakudo|http://rakudo.org > compiler's C < --profile >
command line option writes profile information as an HTML file.
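
For example, to profile a whole script locally (the script name here is just illustrative):

    # writes an HTML report you can open in a browser:
    perl6 --profile myscript.p6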

To learn how to interpret the profile info, use the C < prof-m: your code goes here > evalbot (explained
above) and ask questions on channel.

= head2 Profiling compilation

The Rakudo compiler's C < --profile-compile > option profiles the time and memory used to compile code.
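
For example (again, the script name is just illustrative):

    # profiles compilation of the script rather than its execution:
    perl6 --profile-compile myscript.p6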

= head2 Benchmarks

Use L < perl6-bench|https://github.com/japhb/perl6-bench > to graph the speed of a range of benchmarks.

If you run perl6-bench for multiple compilers (typically versions of Perl 5, Perl 6, or NQP),
the results for each are visually overlaid on the same graphs for quick and easy comparison.

= head1 Improving code

This bears repeating: B < make sure you're not wasting time on the wrong code > . Start by identifying your
L < "critical 3%"|https://en.wikiquote.org/wiki/Donald_Knuth > via profiling, as discussed in several sections above.

= head2 Line by line

A quick and fun way to try to improve code line by line is to collaborate with others using the
L < #perl6|http://doc.perl6.org/language/glossary#IRC > evalbot L < camelia|http://doc.perl6.org/language/glossary#camelia > .

= head2 Routine by routine

With multidispatch you can drop in new variants of routines "alongside" existing ones:

    # existing code generically matches a two-arg foo call:
    multi sub foo (Any $a, Any $b) { ... }

    # new variant takes over for a foo("quux", 42) call:
    multi sub foo ("quux", Int $b) { ... }

The call overhead of having multiple C < foo > definitions is generally insignificant (though see the discussion
of C < where > below), so if your new definition handles its particular case more quickly/leanly than the
previously existing set of definitions then you probably just made your code that much faster/leaner for that case.

= head2 Type-checks and call resolution

Most L < C < where > clauses|/type/Signature#Type_Constraints > -- and thus most
L < subtypes|http://design.perl6.org/S12.html#Types_and_Subtypes > -- force dynamic (run-time)
type checking and call resolution. This is slower, or at least later, than doing them at compile-time.
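
For example (a minimal sketch; the subset and routine names are just illustrative):

    subset Even of Int where * %% 2;   # a subtype; its where check runs at run-time

    sub half (Even $n) { $n div 2 }    # every call pays for the where check
    sub halve (Int $n) { $n div 2 }    # plain nominal type; cheaper to check

    say half(10);   # 5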

Method calls are generally resolved as late as possible, so dynamically, at run-time,
whereas sub calls are resolvable statically, at compile-time.

= head2 Choosing better algorithms

Improving L < algorithmic efficiency|https://en.wikipedia.org/wiki/Algorithmic_efficiency > is
one of the most reliable techniques for making large performance improvements regardless of language or compiler.

A classic example is L < Boyer-Moore|https://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string_search_algorithm >
string searching. To find a small string in a large string, instead of scanning forward one character at a time, the
Boyer-Moore algorithm starts by comparing the I < last > character of the small string against the correspondingly
positioned character in the large string. For most strings the Boyer-Moore algorithm is close to N times faster, where
N is the length of the small string.
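
To see the idea in code, here is a minimal sketch of the closely related Boyer-Moore-Horspool variant
(bad-character rule only); it's for exposition, and in real code you would normally just use the built-in C < index > :

    sub horspool-index (Str $haystack, Str $needle) {
        my @h = $haystack.comb;
        my @n = $needle.comb;
        my $m = @n.elems;
        return -1 if $m == 0 || $m > @h.elems;

        # how far we may shift when the haystack character aligned with the
        # end of the needle is a given character (rightmost occurrence wins):
        my %shift = @n[^($m - 1)].kv.map(-> $i, $c { $c => $m - 1 - $i });

        my $pos = 0;
        while $pos <= @h.elems - $m {
            my $i = $m - 1;
            $i-- while $i >= 0 && @h[$pos + $i] eq @n[$i];
            return $pos if $i < 0;                    # full match at $pos
            $pos += %shift{@h[$pos + $m - 1]} // $m;  # skip ahead
        }
        return -1;
    }

    say horspool-index("hello, world", "world");  # 7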

= head2 Changing sequential/blocking code to parallel/non-blocking

This is a very important class of algorithmic improvement.

See the slides for
L < Parallelism, Concurrency, and Asynchrony in Perl 6|http://jnthn.net/papers/2015-yapcasia-concurrency.pdf#page=17 >
and/or L < the matching video|https://www.youtube.com/watch?v=JpqnNCx7wVY&list=PLRuESFRW2Fa77XObvk7-BYVFwobZHdXdK&index=8 > .
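
As a small taste of what the language gives you (a minimal sketch; the C < fetch > routine, URLs, and timings
are illustrative):

    # a stand-in for some blocking operation (e.g. an HTTP fetch):
    sub fetch ($url) { sleep 1; "page content for $url" }
    my @urls = 'http://example.com/' X~ <a b c d>;

    # sequential / blocking: roughly 4 seconds, one fetch after another
    my @pages-seq = @urls.map({ fetch($_) });

    # concurrent / non-blocking: roughly 1 second; start them all, then await the results
    my @pages-par = await @urls.map(-> $url { start fetch($url) });

    # data parallelism for a CPU-bound map (.race doesn't preserve order; use .hyper if order matters):
    my @squares = (1..100_000).race.map(* ** 2);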

= head2 Using native types

You can make some code run faster and/or use less ram by adding native types to your code:
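
For example (a minimal sketch; the loop is purely illustrative):

    # Int is a boxed object type:
    my Int $total = 0;
    $total += $_ for ^1_000_000;

    # int is a native machine integer; for hot loops like this it's often faster and leaner:
    my int $native-total = 0;
    $native-total += $_ for ^1_000_000;

    say $native-total;   # 499999500000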

XXX Start of section to be written

XXX End of section to be written

= head2 Using existing high performance code

Is there an existing high (enough) performance implementation of what you're trying to speed up / slim down?

There are a lot of C libs out there.
L < NativeCall|/language/nativecall > makes it easy to create wrappers for C libs such as
L < Gumbo|https://github.com/Skarsnik/perl6-gumbo > (there's experimental support for C++ libs too).
(Data marshalling and call handling are somewhat poorly optimized at the time of writing this, but for many
applications that won't matter.)
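
To give a flavor of what a NativeCall binding looks like, here is a minimal sketch that binds a libc function
(it assumes a POSIX system where C < getpid > returns a 32-bit integer):

    use NativeCall;

    # bind the C library's getpid() and call it like an ordinary sub:
    sub getpid(--> int32) is native(Str) { * }

    say getpid();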

Perl 5's compiler can be treated as a C lib. Mix in Perl 6 types, the L < MOP|/language/mop > , and some hairy
programming that someone else has done for you, and the upshot is that you can conveniently
L < use Perl 5 modules in Perl 6|http://stackoverflow.com/a/27206428/1077672 > .
141
- You can call Perl 5 functions and methods as if they were written in Perl 6;
142
- pass integers, strings, arrays, hashes, code references, file handles and objects back-and-forth;
143
- even subclass Perl 5 classes in Perl 6.
144
- In principle you can take advantage of the performance of mature Perl 5 solutions without even knowing any Perl 5.
145
142
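
For example, with the ecosystem module Inline::Perl5 installed (plus the Perl 5 module you want to use),
loading and calling a Perl 5 module can look like this (the specific module chosen here is just an illustration):

    # pull in a Perl 5 core module via Inline::Perl5:
    use Text::Wrap:from<Perl5>;

    say wrap('', '    ', 'a long piece of text that will be wrapped to the default width of 76 columns');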

More generally, Perl 6 is designed to smoothly interop with any other language and there are a number
of L < modules aimed at providing convenient use of libs from other langs|http://modules.perl6.org/#q=inline > .

= head2 Speeding up Rakudo itself

The focus to date (Feb 2016) regarding the compiler has been correctness, not speed, but that's expected to
change somewhat this year and beyond. You can talk to compiler devs on the freenode IRC channels #perl6 and
#moarvm about what to expect. Or you can contribute yourself:

= item Rakudo is largely written in Perl 6. So if you can write Perl 6, then you can hack on the compiler,
including optimizing any of the large body of existing high-level code that impacts the speed of your code
(and everyone else's).

= item Most of the rest of the compiler is written in a small language called
L < NQP|https://github.com/perl6/nqp > that's basically a subset of Perl 6.
If you can write Perl 6, you can fairly easily learn to use and improve the mid-level NQP code too,
at least from a pure language point of view.
Start with the L < NQP and internals course|http://edumentab.github.io/rakudo-and-nqp-internals-course/ > .

= item Finally, if low-level C hacking is your idea of fun, check out L < MoarVM|http://moarvm.org > .

= head2 Still need more?

There are many other things to consider:
improving L < data alignment|https://en.wikipedia.org/wiki/Data_structure_alignment > ,
L < data granularity|https://en.wikipedia.org/wiki/Granularity#Data_granularity > ,
L < data compression|https://en.wikipedia.org/wiki/Data_compression > , and
L < locality of reference|https://en.wikipedia.org/wiki/Locality_of_reference > , to name a few.
If you think some topic needs more coverage on this page, please submit a PR or tell someone your idea.
Thanks. :)

B < Tried everything? Frustrated? > Please consider talking to someone in the community about your use-case
before giving up or concluding that the answer to
L < Is Perl 6 fast enough for me?|http://doc.perl6.org/language/faq#Is_Perl_6_fast_enough_for_me? > is "No".

= end pod