forked from matthiasl/Erlang-FAQ
/
how_do_i.xml
executable file
·646 lines (548 loc) · 20.3 KB
/
how_do_i.xml
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE chapter SYSTEM "chapter.dtd">
<chapter>
<header>
<title>How do I...</title>
<prepared>Matthias Lang</prepared>
<docno></docno>
<date>2007-09-12</date>
<rev>1.0</rev>
</header>
<p>
This section is intended for the impatient. Most of these
questions would resolved by working through the (on-line)
Erlang manuals, but sometimes we just want a quick answer...
</p><p>
Keep in mind that the program fragments are intended to
illustrate an idea, not serve as re-useable, robust modules!
</p>
<section><title>...compare numbers?</title>
<p>
The operators for comparing numbers are <c> >, >=, <,
=<, == and =/= </c>. Some of these are a little different
to C for historical reasons. Here are some examples:
</p>
<pre>
Eshell V4.9.1 (abort with ^G)
1> 13 > 2.
true
2> 18.2 >= 19.
false
3> 3 == 3.
true
4> 4 =/= 4.
false
5> 3 = 4.
** exited: {{badmatch,4},[{erl_eval,expr,3}]} **
</pre>
<p>
The last example is a (failed) pattern match rather than a
comparison.
</p>
</section>
<section><title>...represent a text-string?</title>
<p>
As a list of characters. You can write
</p>
<pre>A = "hello world".</pre>
<p> which is exactly the same as writing</p>
<pre>A = [104,101,108,108,111,32,119,111,114,108,100].
</pre>
<p>and also the same as writing</p>
<pre>A = [$h,$e,$l,$l,$o,$ ,$w,$o,$r,$l,$d].</pre>
<p>
Each character consumes 8 bytes of memory on a 32 bit
machine (a 32 bit integer and a 32 bit pointer) and twice
as much on 64 bit machines. Access to the nth. element takes
O(n) time. Access to the first element takes O(1) time, as
does prepending a character.
</p><p>
The bit-syntax, which is described in <url
href="http://www.erlang.org/doc/reference_manual/expressions.html#6.16">
the manual</url> provides an alternative, but somewhat
limited, way to represent strings compactly, for instance:
</p>
<pre><![CDATA[
1> A = <<"this string consumes one byte per character">>.
<<116,104,105,115,32,115,116,114,...>>
2> size(A).
43
3> <<_:8/binary,Q:3/binary,R/binary>> = A.
<<116,104,105,115,32,115,116,114,...>>
4> Q.
<<105,110,103>>
5> binary_to_list(Q).
"ing"
]]></pre>
<p>
This style of pattern-matching manipulation of strings
provides O(1) equivalents of some string operations which are
O(N) when using the list representation. Some other operations
which are O(1) in the list representation become O(N) when
using the bit-syntax.
</p><p>
There are general ways to
<seealso marker="academic#string-performance">improve
string performance.</seealso>
</p></section>
<section><title>...convert a string to lower case?</title>
<pre>
1> string:to_lower("Five Boxing WIZARDS").
"five boxing wizards"
2> string:to_upper("jump QuickLY").
"JUMP QUICKLY"
</pre>
<p>
If you have an older version of Erlang (prior to R11B-5), you can
find these two functions in the <c>httpd_util</c> module.
</p>
</section>
<section>
<marker id="howto-string-to-term"/>
<title>...convert a text string representation of a term to a term?</title>
<pre>
1> {ok,Tokens,_} = erl_scan:string("[{foo,bar},x,3].").
{ok,[{'[',1}, {'{',1}, {atom,1,foo}, {',',1}, {atom,1,bar},...
2> {ok,Term} = erl_parse:parse_term(Tokens).
{ok,[{foo,bar},x,3]}
</pre>
<p>
See also <c>file:consult/1</c>
</p></section>
<section><title>...use unicode/UTF-8?</title>
<p>
Representing strings as lists of integers also has advantages,
one of which is that unicode (and any other character encodings)
is automatically supported in the sense that you can pass around
lists of unicode characters.
</p><p>
In practice, supporting unicode means a bit more, for instance
being able to read unicode from a file and being able to sort
unicode strings.
</p><p>
There is some undocumented code in the OTP XMERL application
(the xmerl_ucs module) to help with some of that.
</p><p>
There's also <url href="http://12monkeys.co.uk/starling/">alpha
code</url> to help with basic operations over
unicode strings: comparison, concatention, substring extraction,
substring search and case conversion.
</p></section>
<section><title>...destructively update data, like in C</title>
<p>
Not being able to do this is considered a feature of Erlang. The
<url href="http://www.erlang.org/download/erlang-book-part1.pdf">
Erlang book</url> explains this in chapter 1.
<!-- REVISIT: I should read Joe's book and find a relevant passage there instead -->
</p><p>
Having said that, there are common methods to achieve
effects similar to destructive update: store the data
in a process (use messages to manipulate it), store the data
in ETS or use a data structure designed
for relatively cheap updates (the <c>dict</c> library
module is one common solution).
</p></section>
<section>
<marker id="noshell"/>
<title>...run an Erlang program directly from the unix shell?</title>
<p>
As of Erlang/OTP R11B-4, you can run Erlang code without
compiling by using Escript. The <url href="http://www.erlang.org/doc/man/escript.html">manual page</url> has a
complete example showing how to do it.
</p><p>
Escript is intended for short, "script-like" programs. A more
general way to do it, which also works for older Erlang releases,
is to start the Erlang VM without a shell and pass more switches
to the Erlang virtual machine. Here's hello world again:
</p>
<code>
-module(hello).
-export([hello_world/0]).
hello_world() ->
io:fwrite("hello, world\\n").
</code>
<p>
Save this as <c>hello.erl</c>, compile it and
run it directly from the unix (or msdos) command line:
</p>
<pre>
matthias >erl -compile hello
matthias >erl -noshell -s hello hello_world -s init stop
hello, world
</pre>
</section>
<section><title>...communicate with non-Erlang programs?</title>
<p>
Erlang has several mechanisms for communicating with programs
written in other languages, each with different tradeoffs. The
Erlang documentation includes a
<url href=
"http://www.erlang.org/doc/tutorial/part_frame.html">
guide to interfacing with other languages</url> which describes
the most common mechanisms:
</p>
<list>
<item><p>Distributed Erlang</p></item>
<item><p>Ports and linked-in drivers</p></item>
<item><p>EI (C interface to Erlang, replaces the
mostly deprecated erl_interface)</p></item>
<item><p>Jinterface (Java interface to Erlang)</p></item>
<item><p>IC (IDL compiler, used with C or Java)</p></item>
<item><p>General TCP or UDP protocols, including ASN.1</p></item>
</list>
<p>
Some of the above methods are easier to use than others. Loosely
coupled approaches, such as using a custom protocol over a TCP socket,
are easier to debug than the more tightly coupled options such as
linked-in drivers. The Erlang-questions mailing list archives
contain many desperate cries for help with memory corruption problems
caused by using Erl Interface and linked-in drivers incorrectly...
</p><p>
There are a number of user-supported methods and tools, including
</p>
<list>
<item><p><url href="http://epi.sourceforge.net/">A set of C++ classes</url></p></item>
<item><p><url href="http://www.lysator.liu.se/~tab/erlang/py_interface/">A Python implementation of an Erlang hidden node</url></p></item>
<item><p><url href="http://www.snookles.com/erlang/edtk/">
EDTK, a port-driver toolkit,</url> roughly equivalent
to Java's SWIG.</p></item>
<item><p><url href="http://dryverl.objectweb.org/">Dryverl,
an Erlang-to-C binding</url>.</p></item>
</list>
</section>
<section><title>...write a unix pipe program in Erlang?</title>
<p>
Lots of useful utilities in unix can be chained together
by piping one program's output to another program's input.
Here's an example which rot-13 encodes standard input
and sends it to standard output:
</p>
<code><![CDATA[
-module(rot13).
-export([rot13/0]).
rot13() ->
case io:get_chars('', 8192) of
eof -> init:stop();
Text ->
Rotated = [rot13(C) || C <- Text],
io:put_chars(Rotated),
rot13()
end.
rot13(C) when C >= $a, C =< $z -> $a + (C - $a + 13) rem 26;
rot13(C) when C >= $A, C =< $Z -> $A + (C - $A + 13) rem 26;
rot13(C) -> C.
]]></code>
<p>
After compiling this, you can run it from the command line:
</p>
<pre>
matthias> cat ~/.bashrc | erl -noshell -s rot13 rot13 | wc
</pre>
</section>
<section><title>...communicate with a DBMS?</title>
<p>
There are two completely different ways to communicate with
a database. One way is for Erlang to act as a database for
another system. The other way is for an Erlang program to
access a database. The former is a "research" topic. The
latter is easily accomplished by going via ODBC, which allows
you to access almost any commercial DBMS. The
<url href=
"http://www.erlang.org/doc/apps/odbc/index.html">
OTP ODBC manual</url> explains how to do this.
</p></section>
<section><title>...decode binary Erlang terms</title>
<p>
Erlang terms can be converted to and from binary representations
using bifs:
</p>
<code><![CDATA[
1> term_to_binary({3, abc, "def"}).
<<131,104,3,97,3,100,0,3,97,98,99,107,0,3,100,101,102>>
2> binary_to_term(v(1)).
{3,abc,"def"}
]]></code>
<p>
If you want to decode them using C programs, take a look
at <url href="http://erlang.org/doc/man/ei.html">EI</url>.
</p></section>
<section><title>...decode Erlang crash dumps</title>
<p>
Erlang crash dumps provide information about the state of the
system when the emulator crashed. The
<url href="http://erlang.org/doc/apps/erts/crash_dump.html">manual explains</url> how to interpret them.
</p></section>
<section><title>...estimate performance of an Erlang system?</title>
<p>
Mike Williams, one of Erlang's original developers, is fond
of saying
</p> <quote><p>"If you don't run experiments before
you start designing a <em>new</em> system, your
<em>entire system</em> will be an experiment!"</p></quote>
<p>
This philosophy is widespread around Erlang projects, in part
because the Erlang development environment encourages development
by prototyping. Such prototyping will also allow sensible
performance estimates to be made.
</p><p>
For those of you who want to leverage experience with C
and C++, some rough rules of thumb are:
</p>
<list>
<item><p>Code which involves mainly number crunching and
data processing will run about 10 times slower than
an equivalent C program. This includes almost all
"micro benchmarks"</p></item>
<item><p>Large systems which spent most of their time
communicating with other systems, recovering from faults and
making complex decisions run at least as fast as equivalent
C programs.</p></item>
</list>
<p>
Like in any other language or system, experienced developers
develop a sense of which operations are expensive and which
are cheap. Erlang newcomers accustomed to the relatively slow
interprocess communication facilities in other languages
tend to over-estimate the cost of creating Erlang processes and
passing messages between them.
</p></section>
<section><title>...measure performance of an Erlang system?</title>
<p>
The <em>timer</em> module measures the wall clock time
elapsed during execution of a function:
</p>
<code>
7> timer:tc(lists, reverse, ["hello world"]).
{27,"dlrow olleh"}
8> timer:tc(lists, reverse, ["hello world this is a longer string"]).
{34,"gnirts regnol a si siht dlrow olleh"}
</code>
<p>
The <c>eperf</c> library provides a way to profile
a system.
</p></section>
<section><title>...measure memory consumption in an Erlang system?</title>
<p>
Memory consumption is a bit of a tricky issue in Erlang. Usually,
you don't need to worry about it because the garbage collector
looks after memory management for you. But, when things go
wrong, there are several sources of information. Starting
from the most general:
</p><p>
Some operating systems provide detailed information
about process memory use with tools like top,
ps or the linux /proc filesystem:
</p>
<pre>
cat /proc/5898/status
VmSize: 7660 kB
VmLck: 0 kB
VmRSS: 5408 kB
VmData: 4204 kB
VmStk: 20 kB
VmExe: 576 kB
VmLib: 2032 kB
</pre>
<p>
This gives you a rock-solid upper-bound on the amount of
memory the entire Erlang system is using.
</p><p>
<c>erlang:system_info</c> reports
interesting things about some globally allocated structures in bytes:
</p>
<pre>
3> erlang:system_info(allocated_areas).
[{static,390265},
{atom_space,65544,49097},
{binary,13866},
{atom_table,30885},
{module_table,944},
{export_table,16064},
{register_table,240},
{loaded_code,1456353},
{process_desc,16560,15732},
{table_desc,1120,1008},
{link_desc,6480,5688},
{atom_desc,107520,107064},
{export_desc,95200,95080},
{module_desc,4800,4520},
{preg_desc,640,608},
{mesg_desc,960,0},
{plist_desc,0,0},
{fixed_deletion_desc,0,0}]
</pre>
<p>
Information about individual processes can be obtained
from <c>erlang:process_info/1</c> or
<c>erlang:process_info/2</c>:
</p>
<pre>
2> erlang:process_info(self(), memory).
{memory,1244}
</pre>
<p>
The shell's <c>i()</c> and the <c>pman
</c> tool also give useful overview information.
</p><p>
Don't expect the sum of the results from <c>process_info
</c> and <c>system_info</c> to add up to
the total memory use reported by the operating system. The Erlang
runtime also uses memory for other things.
</p><p>
A typical approach when you suspect you have memory problems
is
</p><p>
1. Confirm that there really is a memory problem by checking
that memory use as reported by the operating system is unexpectedly
high.
</p><p>
2. Use <c>pman</c> or the shell's <c>
i()</c> command to make sure there isn't an out-of-control
erlang process on the system. Out-of-control processes often have
enormous message queues. A common reason for Erlang processes to
get unexpectedly large is an endlessly looping function which isn't
tail recursive.
</p><p>
3. Check the amount of memory used for binaries (reported
by <c>system_info</c>). Normal data in Erlang is
put on the process heap, which is garbage collected. Large binaries,
on the other hand, are reference counted. This has two interesting
consequences. Firstly, binaries don't count towards a process'
memory use. Secondly, a lot of memory can be allocated in binaries
without causing a process' heap to grow much. If the heap doesn't
grow, it's likely that there won't be a garbage collection, which
may cause binaries to hang around longer than expected. A
strategically-placed call to <c>erlang:garbage_collect()
</c> will help.
</p><p>
4. If all of the above have failed to find the problem,
start the Erlang runtime system with the -instr switch.
</p></section>
<section><title>...estimate productivity in an Erlang project?</title>
<p>
A rough rule of thumb is that about the same number of lines
of code are produced per developer as in a C project.
A reasonably complex problem involving distribution and
fault tolerance will be roughly five times shorter in
<em>Erlang</em> than in C.
</p><p>
The traditional ways of slowing down projects, like adding
armies of consultants halfway through,
spending a year writing detailed design specifications
before any code is written, rigidly following a waterfall
model, spreading development across several countries and
holding team meetings to decide on the colour of the serviettes used
at lunch work just as well for Erlang as for other languages.
</p></section>
<section><title>...use multiple CPUs (or cores) in one server?</title>
<p>
As of OTP R12B, Erlang has SMP support on all the major
platforms, and it's enabled by default. Some platforms
(Solaris, Linux, MacOSX on PPC) had SMP support in earlier
versions.
</p><p>
Given that Erlang programs are naturally written in a concurrent
manner, most programs require no modification at all to take
advantage of SMP.
</p><p>
A more loosely coupled approach is to start multiple Erlang nodes
on the same server.
</p></section>
<section><title>...run distributed Erlang through a firewall?</title>
<p>
The simplest approach is to make an a-priori restriction to the
TCP ports distributed Erlang uses to communicate through by
setting the (undocumented) kernel variables 'inet_dist_listen_min'
and 'inet_dist_listen_max'. Example:
</p>
<pre>
application:set_env(kernel, inet_dist_listen_min, 9100).
application:set_env(kernel, inet_dist_listen_max, 9105).
</pre>
<p>
This forces Erlang to use only ports 9100--9105 for distributed
Erlang traffic.
In the above example, you would then need to configure your
firewall to pass ports 9100--9105 as well as port 4369 (for
the erlang port mapper).
</p><p>
There are other approaches, such as tunnelling the information
through SSH or writing your own distribution handler.
</p></section>
<section>
<marker id="crash-line-number"/>
<title>...find out which line my Erlang program crashed at?</title>
<p>
The run-time system's error reports tell you which function
crashed, but not the line number. Consider this module:
</p>
<pre>
-module(crash).
-export([f/0]).
f() ->
g().
g() ->
A = 4,
b = 5, % Error is on this line.
A.
</pre>
<pre>
3> crash:f().
** exited: {{badmatch,5},[{crash,g,0},{shell,exprs,6},{shell,eval_loop,3}]} **
</pre>
<p>
The crash message tells us that the crash was in the function
<c>crash:g/1</c>, and that it was a badmatch. For
programs with short functions, this is enough information for
an experienced programmer to find the problem.
</p><p>
In cases where you want more information, there are two options.
The first is to use the <url href="http://www.erlang.org/doc/apps/debugger/">erlang debugger</url> to single-step
the program.
</p><p>
Another option is to compile your code using the smart
exceptions package, available from
<url href='http://jungerl.cvs.sourceforge.net/jungerl/jungerl/lib/smart_exceptions/src/'>the jungerl</url>. Smart exceptions then provide line
number information:
</p>
<pre>
4> c(crash, [{parse_transform, smart_exceptions}]).
{ok,crash}
5> crash:f().
** exited: {{crash,g,0},{line,9},match,[5]} **
</pre>
</section>
<section>
<marker id="erlrt-and-sae"/>
<title>...distribute the Erlang programs I write to my friends/colleagues/users?</title>
<p>
Erlang programs only run on the Erlang VM, so every machine which
is going to run an Erlang program needs to have a copy of the
Erlang runtime installed.
</p><p>
Installing the entire Erlang system from erlang.org (or, perhaps,
indirectly via a packaging system such as Debian's or BSD's) is
the simplest option in many cases.
</p><p>
A more modular alternative is to install the
<url href="http://cean.process-one.net/">CEAN</url>
runtime.
</p><p>
A further alternative which has fallen into disuse is SAE:
stand-alone Erlang. SAE allows an Erlang program to be distributed
as just two files, totalling about 500k. SAE no longer works
on current Erlang releases, but for historical interest, there is
<url href="http://www.sics.se/~joe/sae.html">
a page about SAE</url> which shows you how to build a self-contained
system for the (somewhat outdated) R9B Erlang release.
</p></section>
<section>
<marker id="stderr"/>
<title>...write to standard error (stderr) on a unix system?</title>
<p>The simplest approach is to just open file descriptor two:</p>
<code>
P = open_port({fd,0,2}, [out]),
port_command(P, "this text goes to stderr\\n").
</code>
</section>
</chapter>