4
4
5
5
=SUBTITLE Measuring and improving run-time or compile-time performance
6
6
7
- This page is about anything to do with L<computer
7
+ This page is about L<computer
8
8
performance|https://en.wikipedia.org/wiki/Computer_performance> in the context
9
9
of Perl 6.
10
10
11
- =head1 First, clarify the problem
11
+ =head1 First, profile your code
12
12
13
13
B<Make sure you're not wasting time on the wrong code>: start by identifying
14
14
your L<"critical 3%"|https://en.wikiquote.org/wiki/Donald_Knuth> by "profiling"
15
- as explained below.
15
+ your code's performance. The rest of this document shows you how to do that.
16
16
17
17
=head2 Time with C<now - INIT now>
18
18
@@ -34,15 +34,15 @@ phase|/language/phasers#INIT>.
34
34
35
35
=head2 Profile locally
36
36
37
- When using the L<MoarVM|http://moarvm.org> backend the
37
+ When using the L<MoarVM|http://moarvm.org> backend, the
38
38
L<Rakudo|http://rakudo.org> compiler's C<--profile> command line option writes
39
- profile information as an HTML file. However, if the profile is too big it can
40
- be slow to open in a browser. In that case, if you use the C<--profile-
41
- filename=file.extension> option with an extension of C<.json>, you can use the
42
- L<Qt viewer|https://github.com/tadzik/p6profiler-qt> on the resulting JSON file.
39
+ the profile data to an HTML file. If the profile data is too big, it could take a
40
+ long time for a browser to open the file. In that case, output to a file with a
41
+ C<.json> extension, then open the file with the
42
+ L<Qt viewer|https://github.com/tadzik/p6profiler-qt>.
43
43
44
- Another option (especially useful for profiles too big even for the Qt viewer)
45
- is to use an extension of C<.sql>. This will write the profile data as a series
44
+ To deal with even larger profiles, output to a file with a C<.sql> extension.
45
+ This will write the profile data as a series
46
46
of SQL statements, suitable for opening in SQLite.
47
47
48
48
=begin code :skip-test
@@ -75,40 +75,41 @@ of SQL statements, suitable for opening in SQLite.
75
75
=end code
76
76
77
77
To learn how to interpret the profile info, use the C<prof-m: your code goes
78
- here> evalbot (explained above) and ask questions on channel.
78
+ here> evalbot (explained above) and ask questions on the channel.
79
79
80
80
=head2 Profile compiling
81
81
82
- The Rakudo compiler's C<--profile-compile> option profiles the time and memory
83
- used to compile code.
82
+ If you want to profile the time and memory it takes to compile your code, use
83
+ Rakudo's C<--profile-compile> option.
84
84
85
85
=head2 Create or view benchmarks
86
86
87
87
Use L<perl6-bench|https://github.com/japhb/perl6-bench>.
88
88
89
- If you run perl6-bench for multiple compilers (typically versions of Perl 5,
90
- Perl 6, or NQP) then results for each are visually overlaid on the same graphs
89
+ If you run perl6-bench for multiple compilers (typically, versions of Perl 5,
90
+ Perl 6, or NQP), results for each are visually overlaid on the same graphs,
91
91
to provide for quick and easy comparison.
92
92
93
93
=head2 Share problems
94
94
95
- Once you've used the above techniques to pinpoint code and performance that
96
- really matters you're in a good place to share problems, one at a time:
95
+ Once you've used the above techniques to identify the code to improve,
96
+ you can then begin to address (and share) the problem with others:
97
97
98
- =item For each problem you see, distill it down to a one-liner or short public
99
- gist of code that either already includes performance numbers or is small enough
98
+ =item For each problem, distill it down to a one-liner or the
99
+ gist and either provide performance numbers or make the snippet small enough
100
100
that it can be profiled using C<prof-m: your code or gist URL goes here>.
101
101
102
102
=item Think about the minimum speed increase (or RAM reduction or whatever) you
103
- need/want. What if it took a month for folk to help you achieve that? A year?
103
+ need/want, and think about the cost associated with achieving that goal. What's
104
+ the improvement worth in terms of people's time and energy?
104
105
105
- =item Let folk know if your Perl 6 use-case is in a production setting or just for fun.
106
+ =item Let others know if your Perl 6 use-case is in a production setting or just for fun.
106
107
107
108
=head1 Solve problems
108
109
109
110
This bears repeating: B<make sure you're not wasting time on the wrong code>.
110
- Start by identifying the L<"critical
111
- 3%"|https://en.wikiquote.org/wiki/Donald_Knuth> of your code.
111
+ Start by identifying the L<"critical 3%"|https://en.wikiquote.org/wiki/Donald_Knuth>
112
+ of your code.
112
113
113
114
=head2 Line by line
114
115
@@ -118,7 +119,7 @@ L<camelia|/language/glossary#camelia>.
118
119
119
120
=head2 Routine by routine
120
121
121
- With multidispatch you can drop in new variants of routines "alongside" existing
122
+ With multidispatch, you can drop in new variants of routines "alongside" existing
122
123
ones:
123
124
124
125
# existing code generically matches a two arg foo call:
@@ -129,9 +130,9 @@ ones:
129
130
130
131
The call overhead of having multiple C<foo> definitions is generally
131
132
insignificant (though see discussion of C<where> below), so if your new
132
- definition handles its particular case more quickly/leanly than the previously
133
- existing set of definitions then you probably just made your code that much
134
- faster/leaner for that case.
133
+ definition handles its particular case more efficiently than the previously
134
+ existing set of definitions, then you probably just made your code that much
135
+ more efficient for that case.
135
136
136
137
=head2 Speed up type-checks and call resolution
137
138
@@ -140,14 +141,13 @@ L<subsets|https://design.perl6.org/S12.html#Types_and_Subtypes> – force dynami
140
141
(run-time) type checking and call resolution for any call it I<might> match.
141
142
This is slower, or at least later, than compile-time.
142
143
143
- Method calls are generally resolved as late as possible, so dynamically, at run-
144
- time, whereas sub calls are generally resolvable statically, at compile-time.
144
+ Method calls are generally resolved as late as possible (dynamically at run-time),
145
+ whereas sub calls are generally resolved statically at compile-time.
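
As a rough sketch of the cost of a C<where> clause, the two routines below do the
same work, but only the first carries a run-time constraint. The names
C<with-where> and C<with-type> are hypothetical and exist only for this example.
Timings depend on your machine and Rakudo version, so no numbers are shown:

=begin code
# Hypothetical routines, for illustration only.
sub with-where(Int $n where * >= 0) { $n + 1 }  # `where` runs code on every call
sub with-type(Int $n)               { $n + 1 }  # nominal type check only

my $start = now;
with-where($_) for ^100_000;
say "where clause: { now - $start } seconds";

$start = now;
with-type($_) for ^100_000;
say "nominal type: { now - $start } seconds";
=end code

If the constrained version turns out to be part of your "critical 3%", consider
whether a plain type constraint would be enough for that routine.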
145
146
146
147
=head2 Choose better algorithms
147
148
148
- One of the most reliable techniques for making large performance improvements
149
- regardless of language or compiler is to pick an algorithm better suited to your
150
- needs.
149
+ One of the most reliable techniques for making large performance improvements,
150
+ regardless of language or compiler, is to pick a more appropriate algorithm.
151
151
152
152
A classic example is
153
153
L<Boyer-Moore|https://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string_search_algorithm>.
@@ -157,14 +157,14 @@ the second characters, or, if they don't match, compare the first character of
157
157
the small string with the second character in the large string, and so on. In
158
158
contrast, the Boyer-Moore algorithm starts by comparing the *last* character of
159
159
the small string with the correspondingly positioned character in the large
160
- string. For most strings the Boyer-Moore algorithm is close to N times faster
160
+ string. For most strings, the Boyer-Moore algorithm is close to N times faster
161
161
algorithmically, where N is the length of the small string.
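
To make the "compare the last character first" idea concrete, here is a sketch of
the simpler Boyer-Moore-Horspool variant, not the full Boyer-Moore algorithm. The
routine name C<horspool-index> is made up for this example:

=begin code
# Hypothetical helper: returns the index of the first match, or Nil.
sub horspool-index(Str $haystack, Str $needle) {
    my @h = $haystack.comb;
    my @n = $needle.comb;
    my $m = @n.elems;
    return 0 if $m == 0;

    # Shift table: distance from a character's last occurrence
    # (ignoring the needle's final position) to the end of the needle.
    my %shift;
    %shift{ @n[$_] } = $m - 1 - $_ for ^($m - 1);

    my $pos = 0;
    while $pos + $m <= @h.elems {
        # Compare from the last character of the needle backwards.
        my $i = $m - 1;
        $i-- while $i >= 0 && @h[$pos + $i] eq @n[$i];
        return $pos if $i < 0;

        # Skip ahead based on the haystack character under the needle's end;
        # characters that never occur in the needle skip its whole length.
        $pos += %shift{ @h[$pos + $m - 1] } // $m;
    }
    return Nil;
}

say horspool-index("here is a simple example", "example");  # 17
=end code

The speed-up comes from the last line of the loop: on a mismatch, the search
window can often jump forward by the full length of the small string.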
162
162
163
163
The next couple of sections discuss two broad categories for algorithmic
164
164
improvement that are especially easy to accomplish in Perl 6. For more on this
165
165
general topic, read the Wikipedia page on L<algorithmic
166
166
efficiency|https://en.wikipedia.org/wiki/Algorithmic_efficiency>, especially the
167
- See also section near the end.
167
+ 'See also' section near the end.
168
168
169
169
=head3 Change sequential/blocking code to parallel/non-blocking
170
170
@@ -176,68 +176,62 @@ and/or L<the matching video|https://www.youtube.com/watch?v=JpqnNCx7wVY&list=PLR
176
176
177
177
=head2 Use existing high performance code
178
178
179
- Is there an existing high (enough) performance implementation of what you're
180
- trying to speed up / slim down?
179
+ There are plenty of high performance C libraries that you can use within Perl 6 and
180
+ L<NativeCall|/language/nativecall> makes it easy to create wrappers for them. There's
181
+ experimental support for C++ libraries, too.
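
A minimal sketch of such a wrapper, binding C<strlen> from the C standard library.
It assumes a typical Unix-like system where the C library is already loaded into
the running process, which is why no library name is passed to C<is native>:

=begin code
use NativeCall;

# Bind C's strlen. With no library name given, the symbol is looked up
# in the already-loaded process (an assumption; adjust for your platform).
sub strlen(Str --> size_t) is native { * }

say strlen("Perl 6");  # 6
=end code

For a library that is not already loaded, pass its name to the trait, as in
C<is native('somelib')>, where C<somelib> is a placeholder.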
181
182
182
- There are a lot of C libs out there. L<NativeCall|/language/nativecall> makes it
183
- easy to create wrappers for C libs (there's experimental support for C++ libs
184
- too) such as L<Gumbo|https://github.com/Skarsnik/perl6-gumbo>. (Data marshalling
185
- and call handling is somewhat poorly optimized at the time of writing this but
186
- for many applications that won't matter.)
183
+ If you want to L<use Perl 5 modules in Perl 6|http://stackoverflow.com/a/27206428/1077672>,
184
+ mix in Perl 6 types and the L<Meta-Object Protocol|/language/mop>.
187
185
188
- Perl 5's compiler can be treated as a C lib. Mix in Perl 6 types, the
189
- L<MOP|/language/mop>, and some hairy programming that someone else has done for
190
- you, and the upshot is that you can conveniently
191
- L<use Perl 5 modules in Perl 6|http://stackoverflow.com/a/27206428/1077672>.
192
-
193
- More generally, Perl 6 is designed for smooth interop with other languages and
194
- there are a number of L<modules aimed at providing convenient use of libs from
186
+ More generally, Perl 6 is designed to smoothly interoperate with other languages and
187
+ there are a number of L<modules aimed at facilitating the use of libs from
195
188
other langs|https://modules.perl6.org/#q=inline>.
196
189
197
190
=head2 Make the Rakudo compiler generate faster code
198
191
199
- The focus to date (Feb 2016) regarding the compiler has been correctness, not
200
- how fast it generates code or, more importantly, how fast or lean the code it
201
- generates runs. But that's expected to change somewhat this year and beyond . You
192
+ To date, the focus for the compiler has been correctness, not
193
+ how fast it generates code or how fast or lean the code it
194
+ generates runs. But that's expected to change eventually. You
202
195
can talk to compiler devs on the freenode IRC channels #perl6 and #moarvm about
203
- what to expect. Better still you can contribute yourself:
196
+ what to expect. Better still, you can contribute yourself:
204
197
205
198
=item Rakudo is largely written in Perl 6. So if you can write Perl 6, then you
206
199
can hack on the compiler, including optimizing any of the large body of existing
207
200
high-level code that impacts the speed of your code (and everyone else's).
208
201
209
202
=item Most of the rest of the compiler is written in a small language called
210
203
L<NQP|https://github.com/perl6/nqp> that's basically a subset of Perl 6. If you
211
- can write Perl 6 you can fairly easily learn to use and improve the mid-level
212
- NQP code too, at least from a pure language point of view. To dig in to NQP and
213
- Rakudo's guts, start with L<NQP and internals course|http://edumentab.github.io/rakudo-and-nqp-internals-course/>.
204
+ can write Perl 6, you can fairly easily learn to use and improve the mid-level
205
+ NQP code too, at least from a pure language point of view. To dig into NQP and
206
+ Rakudo's guts, start with
207
+ L<NQP and internals course|http://edumentab.github.io/rakudo-and-nqp-internals-course/>.
214
208
215
209
=item If low-level C hacking is your idea of fun, check out
216
210
L<MoarVM|http://moarvm.org> and visit the freenode IRC channel #moarvm
217
211
(L<logs|https://irclog.perlgeek.de/moarvm/>).
218
212
219
213
=head2 Still need more ideas?
220
214
221
- There are endless performance topics.
222
-
223
215
Some known current Rakudo performance weaknesses not yet covered in this page
224
- include use of gather/take, use of junctions, regexes, and string handling in
216
+ include the use of gather/take, junctions, regexes and string handling in
225
217
general.
226
218
227
- If you think some topic needs more coverage on this page please submit a PR or
219
+ If you think some topic needs more coverage on this page, please submit a PR or
228
220
tell someone your idea. Thanks. :)
229
221
230
222
=head1 Not getting the results you need/want?
231
223
232
224
If you've tried everything on this page to no avail, please consider discussing
233
- things with a compiler dev on #perl6 so we can learn from your use-case and what
225
+ things with a compiler dev on #perl6, so we can learn from your use-case and what
234
226
you've found out about it so far.
235
227
236
- Once you know one of the main devs knows of your plight, allow enough time for
237
- an informed response (a few days or weeks depending on the exact nature of your
228
+ Once a dev knows of your plight, allow enough time for
229
+ an informed response (a few days or weeks, depending on the exact nature of your
238
230
problem and potential solutions).
239
231
240
- If I<that> hasn't worked out either, please consider filing an issue discussing your experience at
241
- L<our user experience repo|https://github.com/perl6/user-experience/issues> before moving on. Thanks. :)
232
+ If I<that> hasn't worked out, please consider filing an issue about your experience at
233
+ L<our user experience repo|https://github.com/perl6/user-experience/issues> before moving on.
234
+
235
+ Thanks. :)
242
236
243
237
=end pod