<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-type" content="text/html; charset=utf-8">
<meta content="An article about named function expressions in Javascript" name="description">
<meta content="named function expressions, function expression, javascript, jscript identifier leak, function names in debuggers" name="keywords">
<meta content="Juriy 'kangax' Zaytsev" name="author">
<title>Named function expressions demystified</title>
<link rel="stylesheet" href="all.css" type="text/css" media="all">
<script type="text/javascript">
var _gaq = [['_setAccount', 'UA-1128111-17'],['_trackPageview']];
(function(d) {
var script = d.createElement('script'),
head = d.getElementsByTagName('head')[0] || d.documentElement;
script.async = true;
script.src = 'http://www.google-analytics.com/ga.js';
head.insertBefore(script, head.firstChild);
})(document);
</script>
</head>
<body onload="prettyPrint();">
<div class="container">
<h1 style="margin-top: 1em;">Named function expressions demystified</h1>
<p style="text-align: right;"><em>by <a href="http://thinkweb2.com/projects/prototype/" title="">Juriy "kangax" Zaytsev</a></em></p>
<ol>
<li><a href="#introduction">Introduction</a></li>
<li><a href="#expr-vs-decl">Function expressions vs. Function declarations</a></li>
<li><a href="#function-statements">Function Statements</a></li>
<li><a href="#named-expr">Named function expressions</a></li>
<li><a href="#names-in-debuggers">Function names in debuggers</a></li>
<li><a href="#jscript-bugs">JScript bugs</a></li>
<li><a href="#jscript-memory-management">JScript memory management</a></li>
<li><a href="#tests">Tests</a></li>
<li><a href="#safari-bug">Safari bug</a></li>
<li><a href="#spidermonkey-peculiarity">SpiderMonkey peculiarity</a></li>
<li><a href="#solution">Solution</a></li>
<li><a href="#alt-solution">Alternative solution</a></li>
<li><a href="#webkit-displayName">WebKit's displayName</a></li>
<li><a href="#future-considerations">Future considerations</a></li>
<li><a href="#credits">Credits</a></li>
</ol>
<h2 id="introduction">Introduction</h2>
<p>Surprisingly, the topic of named function expressions doesn’t seem to be covered well enough on the web. This is probably why there are so many misconceptions floating around. In this article, I’ll try to summarize both the theoretical and the practical aspects of these wonderful Javascript constructs: the good, the bad and the ugly parts of them.</p>
<p>In a nutshell, named function expressions are useful for one thing only: <strong>descriptive function names in debuggers and profilers</strong>. Well, there is also the possibility of using function names for recursion, but you will soon see that this is often impractical nowadays. If you don’t care about debugging experience, you have nothing to worry about. Otherwise, read on to see some of the cross-browser glitches you would have to deal with and tips on how to work around them.</p>
<p>I’ll start with a general explanation of what function expressions are and how modern debuggers handle them. Feel free to skip to the <a href="#solution">final solution</a>, which explains how to use these constructs safely.</p>
<h2 id="expr-vs-decl">Function expressions vs. Function declarations</h2>
<p>Two of the most common ways to create a function object in ECMAScript are a <em>Function Expression</em> and a <em>Function Declaration</em>. The difference between the two is <strong>rather confusing</strong>. At least it was to me. The only thing the ECMA specs make clear is that a <em>Function Declaration</em> must always have an <em>Identifier</em> (or a function name, if you prefer), while a <em>Function Expression</em> may omit it:</p>
<blockquote>
<p>FunctionDeclaration :<br>
function Identifier ( FormalParameterList <sub>opt</sub> ){ FunctionBody }</p>
<p>FunctionExpression :<br>
function Identifier <sub>opt</sub> ( FormalParameterList <sub>opt</sub> ){ FunctionBody }</p>
</blockquote>
<p>We can see that when the identifier is omitted, that “something” can only be an expression. But what if the identifier is present? How can one tell whether it is a function declaration or a function expression, given that they look identical? It appears that ECMAScript differentiates between the two based on context. If <code>function foo(){}</code> is part of, say, an assignment expression, it is considered a function expression. If, on the other hand, <code>function foo(){}</code> is contained in a function body or in the (top level of the) program itself, it is parsed as a function declaration.</p>
<pre lang="javascript" class="prettyprint">
function foo(){}; // declaration, since it's part of a <em>Program</em>
var bar = function foo(){}; // expression, since it's part of an <em>AssignmentExpression</em>
new function bar(){}; // expression, since it's part of a <em>NewExpression</em>
(function(){
  function bar(){}; // declaration, since it's part of a <em>FunctionBody</em>
})();
</pre>
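<p>The role of context can also be checked mechanically. The following is only a sketch: <code>parses</code> is a hypothetical helper (not part of any library) that hands a source string to the <code>Function</code> constructor and reports whether it parses as a function body. It assumes an environment in which the <code>Function</code> constructor is allowed to compile strings.</p>
<pre lang="javascript" class="prettyprint">
// Hypothetical helper: returns true if `source` parses as a function body.
// The Function constructor only compiles the source here; the body is never run.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (err) {
    return false;
  }
}

var results = {
  // statement position, identifier present: a function declaration
  namedStatement: parses('function foo(){}'),
  // statement position, identifier omitted: not a valid declaration
  anonymousStatement: parses('function (){}'),
  // right-hand side of an assignment: an expression, so the name is optional
  anonymousAssignment: parses('var bar = function (){};'),
  // operand of `new`: an expression, so the name is optional
  anonymousNew: parses('new function (){};')
};
</pre>
<p>In a conforming implementation only <code>anonymousStatement</code> should come out as <code>false</code>: where a declaration is expected the identifier is mandatory, while in the two expression contexts it may be omitted.</p>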
<p>A somewhat less obvious example of a function expression is the one where the function is wrapped in parentheses: <code>(function foo(){})</code>. The reason it is an expression is again due to context: "(" and ")" constitute a grouping operator, and a grouping operator can only contain an expression.</p>
<p>To demonstrate with examples:</p>
<pre lang="javascript" class="prettyprint">
function foo(){}; // function declaration
(function foo(){}); // function expression: due to grouping operator
try {
  // a SyntaxError is raised at parse time, so `eval` is used here to make it catchable
  eval('(var x = 5)'); // grouping operator can only contain an expression, not a statement (which `var` is)
} catch(err) {
  // SyntaxError
}
</pre>
<p>You might also recall that when evaluating JSON with <code>eval</code>, the string is usually wrapped in parentheses: <code>eval('(' + json + ')')</code>. This is of course done for the same reason: the grouping operator, which is what the parentheses are, forces the JSON braces to be parsed as an expression rather than as a block:</p>
<pre lang="javascript" class="prettyprint">
try {
  // `eval` is used so that the parse-time SyntaxError can be caught
  eval('{ "x": 5 }'); // "{" and "}" are parsed as a block
} catch(err) {
  // SyntaxError
}
({ "x": 5 }); // grouping operator forces "{" and "}" to be parsed as object literal
</pre>
<p>There’s a subtle difference in behavior of declarations and expressions.</p>
<p>First of all, function declarations are parsed and evaluated before any other expressions are. Even if a declaration is positioned last in the source, it will be evaluated <strong>before any other expressions</strong> contained in the scope. The following example demonstrates how the <code>fn</code> function is already defined by the time <code>alert</code> is executed, even though it is declared right after it:</p>
<pre lang="javascript" class="prettyprint">
alert(fn());
function fn() {
  return 'Hello world!';
}
</pre>
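<p>For contrast, here is a small sketch of how a function expression assigned to a variable behaves in the same situation. The names <code>hoistingDemo</code>, <code>declared</code> and <code>expressed</code> are made up for this illustration. The variable itself is known from the start of the scope, but the function is only assigned once execution reaches that line:</p>
<pre lang="javascript" class="prettyprint">
function hoistingDemo() {
  // Both identifiers are inspected before their definitions appear in the source
  var before = {
    declared: typeof declared,   // the declaration is already evaluated
    expressed: typeof expressed  // the variable exists, but nothing is assigned yet
  };

  function declared() {
    return 'declaration';
  }
  var expressed = function () {
    return 'expression';
  };

  return { before: before, after: typeof expressed };
}

var demo = hoistingDemo();
demo.before.declared;  // "function"
demo.before.expressed; // "undefined"
demo.after;            // "function"
</pre>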
<p>Another important trait of function declarations is that declaring them conditionally is non-standardized and varies across different environments. You should never rely on functions being declared conditionally; use function expressions instead.</p>
<pre lang="javascript" class="prettyprint">
// <strong>Never do this!</strong>
// Some browsers will declare `foo` as the one returning 'first',
// while others will declare it as the one returning 'second'
if (true) {
  function foo() {
    return 'first';
  }
}
else {
  function foo() {
    return 'second';
  }
}
foo();
// Instead, use function expressions:
var foo;
if (true) {
  foo = function() {
    return 'first';
  };
}
else {
  foo = function() {
    return 'second';
  };
}
foo();
</pre>
<p>If you're curious about actual production rules of function declarations, read on. Otherwise, feel free to skip the following excerpt.</p>
<div id="function-declarations-in-blocks" class="excerpt">
<p>
<em>FunctionDeclaration</em>s are only allowed to appear in <em>Program</em> or <em>FunctionBody</em>.
Syntactically, they <strong>cannot appear in <em>Block</em></strong> (<code>{ ... }</code>), such as that of <code>if</code>,
<code>while</code> or <code>for</code> statements. This is because <em>Block</em>s can only contain <em>Statement</em>s,
not <em>SourceElement</em>s, which <em>FunctionDeclaration</em> is.
If we look at production rules carefully, we can see that the only way <em>Expression</em> is allowed within <em>Block</em>
is when it is part of <em>ExpressionStatement</em>. However, <em>ExpressionStatement</em> is explicitly defined
<strong>to not begin with "function" keyword</strong>, and this is exactly what makes <em>FunctionExpression</em>
invalid as part of a <em>Statement</em> or <em>Block</em> (note that <em>Block</em> is merely a list of <em>Statement</em>s).
</p>
<p>
Because of these restrictions, whenever a function appears in a block (such as in the previous example) it should actually be
<strong>considered a syntax error</strong>, not a function declaration or expression. The problem is that none of the
implementations I've seen parse these functions per the rules. They interpret them in proprietary ways instead.
</p>
</div>
<p>It's worth mentioning that, as per the specification, implementations are allowed to introduce <strong>syntax extensions</strong> (see section 16), yet still be fully conforming. This is exactly what happens in so many clients these days. Some of them interpret function declarations in blocks as any other function declarations, simply hoisting them to the top of the enclosing scope; others introduce different semantics and follow slightly more complex rules.</p>
<h2 id="function-statements">Function statements</h2>
<div>
<p>
One such syntax extension to ECMAScript is <strong>Function Statements</strong>,
currently implemented in Gecko-based browsers (tested in Firefox 1-3.7a1pre on Mac OS X).
Somehow, this extension doesn't seem to be widely known, for better or worse (<a href="https://developer.mozilla.org/En/Core_JavaScript_1.5_Reference:Functions#Conditionally_defining_a_function">MDC mentions them</a>, but very briefly).
Please remember that we are discussing it here only for learning purposes and to satisfy our curiosity;
unless you're writing scripts for a specific Gecko-based environment,
<strong>I do not recommend relying on this extension</strong>.
</p>
<p>
So, here are some of the traits of these non-standard constructs:
</p>
<ol>
<li>Function statements are allowed anywhere plain <em>Statement</em>s are allowed. This, of course, includes <em>Block</em>s:
<pre lang="javascript" class="prettyprint">
if (true) {
  function f(){ }
}
else {
  function f(){ }
}
</pre>
</li>
<li>Function statements are interpreted as any other statements, including conditional execution:
<pre lang="javascript" class="prettyprint">
if (true) {
  function foo(){ return 1; }
}
else {
  function foo(){ return 2; }
}
foo(); // 1
// Note that other clients interpret `foo` as function declaration here,
// overwriting first `foo` with the second one, and producing "2", not "1" as a result
</pre>
</li>
<li>
Function statements are NOT declared during variable instantiation. They are declared at run time, just like function expressions. However, once declared, a function statement's identifier <strong>becomes available to the entire scope</strong> of the function. This identifier availability is what makes function statements different from function expressions (you will see the exact behavior of named function expressions in the next chapter).
<pre lang="javascript" class="prettyprint">
// at this point, `foo` is not yet declared
typeof foo; // "undefined"
if (true) {
  // once block is entered, `foo` becomes declared and available to the entire scope
  function foo(){ return 1; }
}
else {
  // this block is never entered, and `foo` is never redeclared
  function foo(){ return 2; }
}
typeof foo; // "function"
</pre>
Generally, we can emulate the function statement behavior from the previous example with this standards-compliant (and unfortunately, more verbose) code:
<pre lang="javascript" class="prettyprint">
var foo;
if (true) {
  foo = function foo(){ return 1; };
}
else {
  foo = function foo() { return 2; };
}
</pre>
</li>
<li>
The string representation of function statements is similar to that of function declarations or named function expressions (and includes the identifier, "foo" in this example):
<pre lang="javascript" class="prettyprint">
if (true) {
  function foo(){ return 1; }
}
String(foo); // function foo() { return 1; }
</pre>
</li>
<li>
Finally, what appears to be a bug in earlier Gecko-based implementations (present in Firefox 3 and earlier) is the way function statements overwrite function declarations. Earlier versions somehow failed to overwrite function declarations with function statements:
<pre lang="javascript" class="prettyprint">
// function declaration
function foo(){ return 1; }
if (true) {
  // overwriting with function statement
  function foo(){ return 2; }
}
foo(); // 1 in FF 3 and earlier, 2 in FF 3.5 and later
// however, this doesn't happen when overwriting function expression
var foo = function(){ return 1; };
if (true) {
  function foo(){ return 2; }
}
foo(); // 2 in all versions
</pre>
</li>
</ol>
<p id="function-statements-in-safari">
Note that older versions of Safari (at least 1.2.3, 2.0 - 2.0.4 and 3.0.4, and possibly earlier versions too) implement function statements <strong>identically to SpiderMonkey</strong>.
All examples from this chapter, except the last "bug" one, produce the same results in those versions of Safari as they do in, say, Firefox. Another browser that seems to
follow the same semantics is the Blackberry one (at least the 8230, 9000 and 9530 models). This diversity in behavior demonstrates once again what a bad idea it is to rely on these extensions.
</p>
</div>
<h2 id="named-expr">Named function expressions</h2>
<p>Function expressions can actually be seen quite often. A common pattern in web development is to “fork” function definitions based on some kind of a feature test, allowing for the best performance. Since such forking usually happens in the same scope, it is almost always necessary to use function expressions. After all, as we know by now, function declarations should not be executed conditionally:</p>
<pre lang="javascript" class="prettyprint">
// `contains` is part of "APE Javascript library" (http://dhtmlkitchen.com/ape/) by Garrett Smith
var contains = (function() {
  var docEl = document.documentElement;
  if (typeof docEl.compareDocumentPosition != 'undefined') {
    return function(el, b) {
      return (el.compareDocumentPosition(b) & 16) !== 0;
    };
  }
  else if (typeof docEl.contains != 'undefined') {
    return function(el, b) {
      return el !== b && el.contains(b);
    };
  }
  return function(el, b) {
    if (el === b) return false;
    while (el != b && (b = b.parentNode) != null);
    return el === b;
  };
})();
</pre>
<p>Quite obviously, when a function expression has a name (technically — <em>Identifier</em>), it is called a <strong>named function expression</strong>. What you’ve seen in the very first example — <code>var bar = function foo(){};</code> — was exactly that — a named function expression with <code>foo</code> being a function name. An important detail to remember is that this name is <strong>only available in the scope of a newly-defined function</strong>; specs mandate that an identifier should not be available to an enclosing scope:</p>
<pre lang="javascript" class="prettyprint">
var f = function foo(){
  return typeof foo; // "foo" is available in this inner scope
};
// `foo` is never visible "outside"
typeof foo; // "undefined"
f(); // "function"
</pre>
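<p>One consequence of this scoping, relevant to the recursion use mentioned in the introduction, is that the inner name keeps pointing at the function even when the outer variable is reassigned. The following is a minimal sketch that assumes a conforming implementation; the names <code>factorial</code>, <code>fact</code> and <code>saved</code> are made up for illustration:</p>
<pre lang="javascript" class="prettyprint">
var factorial = function fact(n) {
  // `fact` is resolved in the function's own scope, not in the enclosing one
  return n > 1 ? n * fact(n - 1) : 1;
};

var saved = factorial;
factorial = null; // the outer variable no longer points at the function

var result = saved(5);       // 120: recursion still works through the inner name
var outerType = typeof fact; // "undefined": the name is not visible out here
</pre>
<p>The JScript bugs described later in this article are the reason this style of recursion is hard to recommend in practice.</p>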
<p>So what’s so special about these named function expressions? Why would we want to give them names at all? </p>
<p>It appears that named functions make for a much more pleasant debugging experience. When debugging an application, having a call stack with descriptive items makes a huge difference.</p>
<h2 id="names-in-debuggers">Function names in debuggers</h2>
<p>When a function has a corresponding identifier, debuggers show that identifier as the function name when inspecting the call stack. Some debuggers (e.g. Firebug) helpfully show names of even anonymous functions, making them identical to the names of the variables that the functions are assigned to. Unfortunately, these debuggers usually rely on simple parsing rules; such extraction is usually quite fragile and often produces false results. </p>
<p>Let’s look at a simple example:</p>
<pre lang="javascript" class="prettyprint">
function foo(){
  return bar();
}
function bar(){
  return baz();
}
function baz(){
  debugger;
}
foo();
// Here, we used function declarations when defining all 3 functions
// When debugger stops at the `debugger` statement,
// the call stack (in Firebug) looks quite descriptive:
//   baz
//   bar
//   foo
//   expr_test.html()
</pre>
<p>We can see that <code>foo</code> called <code>bar</code>, which in turn called <code>baz</code> (and that <code>foo</code> itself was called from the global scope of the <code>expr_test.html</code> document). What’s really nice is that Firebug manages to parse the “name” of a function even when an anonymous expression is used:</p>
<pre lang="javascript" class="prettyprint">
function foo(){
  return bar();
}
var bar = function(){
  return baz();
};
function baz(){
  debugger;
}
foo();
// Call stack
//   baz
//   bar()
//   foo
//   expr_test.html()
</pre>
<p>What’s not very nice, though, is that if a function expression gets any more complex (which, in real life, it almost always is) all of the debugger’s efforts turn out to be pretty useless; we end up with a shiny question mark in place of a function name:</p>
<pre lang="javascript" class="prettyprint">
function foo(){
  return bar();
}
var bar = (function(){
  if (window.addEventListener) {
    return function(){
      return baz();
    };
  }
  else if (window.attachEvent) {
    return function() {
      return baz();
    };
  }
})();
function baz(){
  debugger;
}
foo();
// Call stack
//   baz
//   (?)()
//   foo
//   expr_test.html()
</pre>
<p>Another confusion appears when a function is being assigned to more than one variable:</p>
<pre lang="javascript" class="prettyprint">
function foo(){
  return baz();
}
var bar = function(){
  debugger;
};
var baz = bar;
bar = function() {
  alert('spoofed');
};
foo();
// Call stack:
//   bar()
//   foo
//   expr_test.html()
</pre>
<p>You can see the call stack showing that <code>foo</code> invoked <code>bar</code>. Clearly, that’s not what has happened. The confusion arises because <code>baz</code> still references the function originally assigned to <code>bar</code>, while <code>bar</code> itself was later pointed at another function, the one alerting “spoofed”. As you can see, such parsing, while great in simple cases, is often useless in any non-trivial script. </p>
<p>What it all boils down to is the fact that named <strong>function expressions are the only way to get a truly robust stack inspection</strong>. Let’s rewrite our previous example with named functions in mind. Notice how both of the functions returned from the self-executing wrapper are named <code>bar</code>:</p>
<pre lang="javascript" class="prettyprint">
function foo(){
  return bar();
}
var bar = (function(){
  if (window.addEventListener) {
    return function bar(){
      return baz();
    };
  }
  else if (window.attachEvent) {
    return function bar() {
      return baz();
    };
  }
})();
function baz(){
  debugger;
}
foo();
// And, once again, we have a descriptive call stack!
//   baz
//   bar
//   foo
//   expr_test.html()
</pre>
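<p>The same effect can be observed without a debugger in engines that expose the non-standard <code>stack</code> property on error objects. This is only a sketch under that assumption; the function names (<code>outermost</code>, <code>namedMiddle</code>, <code>innermost</code>) are made up for illustration, and the exact format of the trace differs between engines:</p>
<pre lang="javascript" class="prettyprint">
function outermost() {
  return middle();
}
var middle = (function () {
  // The returned function is named, so the name travels with the function
  // object no matter which variable ends up referencing it
  return function namedMiddle() {
    return innermost();
  };
})();
function innermost() {
  // `stack` is non-standard; fall back to an empty string where it is missing
  return String(new Error('trace').stack || '');
}

var trace = outermost();
// true when the engine reports the function by its own name
var hasName = trace.indexOf('namedMiddle') !== -1;
</pre>
<p>Where <code>stack</code> is available, the frame for the returned function should be labelled <code>namedMiddle</code> rather than with the name of the variable it was assigned to.</p>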
<p>Before we start dancing happily to celebrate this holy grail finding, I’d like to bring our beloved JScript into the picture.</p>
<h2 id="jscript-bugs">JScript bugs</h2>
<p>Unfortunately, JScript (i.e. Internet Explorer’s ECMAScript implementation) seriously messed up named function expressions. JScript is responsible for named function expressions <strong>being recommended against</strong> by many people these days. It's also quite sad that even <strong>the latest version of JScript, 5.8, used in Internet Explorer 8, still exhibits every single quirk described below</strong>.</p>
<p>Let’s look at what exactly is wrong with its broken implementation. Understanding all of its issues will allow us to work around them safely. Note that I broke these discrepancies into a few examples, for clarity, even though all of them are most likely a consequence of one major bug.</p>
<h3 id="example_1_function_expression_identifier_leaks_into_an_enclosing_scope">Example #1: Function expression identifier leaks into an enclosing scope</h3>
<pre lang="javascript" class="prettyprint">
var f = function g(){};
typeof g; // "function"
</pre>
<p>Remember how I mentioned that the identifier of a named function expression is <strong>not available in an enclosing scope</strong>? Well, JScript doesn’t agree with the specs on this one: <code>g</code> in the above example resolves to a function object. This is the most widely observed discrepancy. It’s dangerous in that it inadvertently pollutes the enclosing scope, a scope that might as well be the global one, with an extra identifier. Such pollution can, of course, be a source of hard-to-track bugs.</p>
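<p>If a script needs to know whether it is running in an environment with this leak, the behavior can be probed directly. The following is a hypothetical feature test, not something taken from a library; the names <code>NFE_IDENTIFIER_LEAKS</code> and <code>leakProbe</code> are made up for illustration:</p>
<pre lang="javascript" class="prettyprint">
// Hypothetical feature test: does a named function expression leak its
// identifier into the enclosing scope?
var NFE_IDENTIFIER_LEAKS = (function () {
  var f = function leakProbe() {};
  // In a conforming implementation `leakProbe` is not declared in this scope,
  // so `typeof` yields "undefined"; with the leak it yields "function"
  return typeof leakProbe === 'function';
})();
</pre>
<p>The probe runs inside its own function scope, so even where the identifier does leak, it does not end up in the global scope.</p>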
<h3 id="example_2_named_function_expression_is_treated_as_both_function_declaration_and_function_expression">Example #2: Named function expression is treated as BOTH — function declaration AND function expression</h3>
<pre lang="javascript" class="prettyprint">
typeof g; // "function"
var f = function g(){};
</pre>
<p>As I explained before, function declarations are parsed before any other expressions in a particular execution context. The above example demonstrates how <strong>JScript actually treats named function expressions as function declarations</strong>. You can see that it parses <code>g</code> before an “actual declaration” takes place. </p>
<p>This brings us to the next example:</p>
<h3 id="example_3_named_function_expression_creates_two_distinct_function_objects">Example #3: Named function expression creates TWO DISTINCT function objects!</h3>
<pre lang="javascript" class="prettyprint">
var f = function g(){};
f === g; // false
f.expando = 'foo';
g.expando; // undefined
</pre>
<p>This is where things are getting interesting. Or rather, completely nuts. Here we are seeing the dangers of having to deal with two distinct objects: augmenting one of them obviously does not modify the other one. This could be quite troublesome if you decided to employ, say, a caching mechanism and store something in a property of <code>f</code>, then tried accessing it as a property of <code>g</code>, thinking that it is the same object you’re working with.</p>
<p>Let’s look at something a bit more complex.</p>
<h3 id="example_4_function_declarations_are_parsed_sequentially_and_are_not_affected_by_conditional_blocks">Example #4: Function declarations are parsed sequentially and are not affected by conditional blocks</h3>
<pre lang="javascript" class="prettyprint">
var f = function g() {
  return 1;
};
if (false) {
  f = function g(){
    return 2;
  };
}
g(); // 2
</pre>
<p>An example like this could cause even harder-to-track bugs. What happens here is actually quite simple. First, <code>g</code> is parsed as a function declaration, and since declarations in JScript are independent of conditional blocks, <code>g</code> is declared as the function from the “dead” <code>if</code> branch: <code>function g(){ return 2 }</code>. Then all of the “regular” expressions are evaluated and <code>f</code> is assigned another, newly created function object. The “dead” <code>if</code> branch is never entered when evaluating expressions, so <code>f</code> keeps referencing the first function: <code>function g(){ return 1 }</code>. It should be clear by now that if you’re not careful enough and call <code>g</code> from within <code>f</code>, you’ll end up calling a completely unrelated <code>g</code> function object.</p>
<p>You might be wondering how all this mess with different function objects compares to <code>arguments.callee</code>. Does <code>callee</code> reference <code>f</code> or <code>g</code>? Let’s take a look:</p>
<pre lang="javascript" class="prettyprint">
var f = function g(){
  return [
    arguments.callee == f,
    arguments.callee == g
  ];
};
f(); // [true, false]
g(); // [false, true]
</pre>
<p>As you can see, <code>arguments.callee</code> references whatever function is being invoked. This is actually good news, as you will see later on.</p>
<p>Another interesting example of "unexpected behavior" can be observed when using named <strong>function expression in undeclared assignment</strong>, but only when function is "named" the same way as identifier it's being assigned to:</p>
<pre lang="javascript" class="prettyprint">
(function(){
  f = function f(){};
})();
</pre>
<p>As you might know, undeclared assignment (which is <strong>not recommended</strong> and is only used here for demonstration purposes) should result in the creation of a global <code>f</code> property. This is exactly what happens in conforming implementations. However, the JScript bug makes things a bit more confusing. Since a named function expression is parsed as a function declaration (see <a href="#example_2_named_function_expression_is_treated_as_both_function_declaration_and_function_expression">example #2</a>), <code>f</code> becomes declared as a local variable during the phase of variable declarations. Later on, when function execution begins, the assignment is no longer undeclared, so <code>function f(){}</code> on the right-hand side is simply assigned to this newly created <strong>local</strong> <code>f</code> variable. The global <code>f</code> is never created.</p>
<p>This demonstrates how failing to understand JScript peculiarities can lead to drastically different behavior in code.</p>
<p>Looking at JScript deficiencies, it becomes pretty clear what exactly we need to avoid. First, we need <strong>to be aware of a leaking identifier</strong> (so that it doesn’t pollute the enclosing scope). Second, we should <strong>never reference the identifier used as a function name</strong>; the troublesome identifier is <code>g</code> from the previous examples. Notice how many ambiguities could have been avoided if we were to forget about <code>g</code>’s existence. Always referencing the function via <code>f</code> or <code>arguments.callee</code> is the key here. If you use a named expression, think of that name as something that’s only being used for debugging purposes. And finally, a bonus point is to <strong>always clean up the extraneous function</strong> created erroneously during NFE declaration.</p>
<p>I think the last point needs a bit of explanation:</p>
<h2 id="jscript-memory-management">JScript memory management</h2>
<p>Being familiar with JScript discrepancies, we can now see a potential problem with memory consumption when using these buggy constructs. Let’s look at a simple example:</p>
<pre lang="javascript" class="prettyprint">
var f = (function(){
  if (true) {
    return function g(){};
  }
  return function g(){};
})();
</pre>
<p>We know that the function returned from within this anonymous invocation, the one that has the <code>g</code> identifier, is assigned to the outer <code>f</code>. We also know that named function expressions produce a superfluous function object, and that this object is not the same as the returned function. The memory issue here is caused by this extraneous <code>g</code> function being literally “trapped” in the closure of the returned function. This happens because the inner function is declared in the same scope as that pesky <code>g</code> one. Unless we <strong>explicitly break the reference to the <code>g</code> function</strong>, it will keep consuming memory.</p>
<pre lang="javascript" class="prettyprint">
var f = (function(){
  var f, g;
  if (true) {
    f = function g(){};
  }
  else {
    f = function g(){};
  }
  // null `g`, so that it doesn't reference extraneous function any longer
  g = null;
  return f;
})();
</pre>
<p>Note that we explicitly declare <code>g</code> as well, so that the <code>g = null</code> assignment wouldn’t create a global <code>g</code> variable in conforming clients (i.e. non-JScript ones). By <code>null</code>ing the reference to <code>g</code>, we allow the garbage collector to wipe off the implicitly created function object that <code>g</code> refers to.</p>
<p>When taking care of JScript NFE memory leak, I decided to run a simple series of tests to confirm that <code>null</code>ing <code>g</code> actually does free memory.</p>
<h2 id="tests">Tests</h2>
<p>The test was simple. It would create 10000 functions via named function expressions and store them in an array. I would then wait for about a minute and check how high the memory consumption was. After that I would null out the reference and repeat the procedure. Here’s the test case I used:</p>
<pre lang="javascript" class="prettyprint">
function createFn(){
return (function(){
var f;
if (true) {
f = function F(){
return 'standard';
}
}
else if (false) {
f = function F(){
return 'alternative';
}
}
else {
f = function F(){
return 'fallback';
}
}
// var F = null;
return f;
})();
}
var arr = [ ];
for (var i=0; i<10000; i++) {
arr[i] = createFn();
}
</pre>
<p>Results as seen in Process Explorer on Windows XP SP2 were:</p>
<pre lang="javascript" class="prettyprint">
IE6:
without `null`: 7.6K -> 20.3K
with `null`: 7.6K -> 18K
IE7:
without `null`: 14K -> 29.7K
with `null`: 14K -> 27K
</pre>
<p>The results somewhat confirmed my assumptions: explicitly nulling the superfluous reference did free memory, but the difference in consumption was relatively insignificant. For 10000 function objects, there would be a ~3MB difference. This is definitely something that should be kept in mind when designing large-scale applications, applications that will run for a long time, or applications that run on devices with limited memory (such as mobile devices). For any small script, the difference probably doesn’t matter.</p>
<p>You might think that it’s all finally over, but we are not quite there yet :) There’s a tiny little detail that I’d like to mention, and that detail is Safari 2.x.</p>
<h2 id="safari-bug">Safari bug</h2>
<p>
An even less widely known bug with NFE is present in older versions of Safari, namely the Safari 2.x series. I’ve seen some <a href="http://meyerweb.com/eric/thoughts/2005/07/11/safari-syntaxerror/">claims on the web</a> that Safari 2.x does not support NFE at all. This is not true. Safari does support it, but has bugs in its implementation, which you will see shortly.</p>
<p>When encountering a named function expression in certain contexts, Safari 2.x fails to parse the program entirely. It doesn’t throw any errors (such as <code>SyntaxError</code> ones). It simply bails out:</p>
<pre lang="javascript" class="prettyprint">
(function f(){})(); // <== NFE
alert(1); // this line is never reached, since previous expression fails the entire program
</pre>
<p>After fiddling with various test cases, I came to the conclusion that Safari 2.x <strong>fails to parse named function expressions if those are not part of assignment expressions</strong>. Some examples of assignment expressions are:</p>
<pre lang="javascript" class="prettyprint">
// part of variable declaration
var f = 1;
// part of simple assignment
f = 2, g = 3;
// part of return statement
(function(){
return (f = 2);
})();
</pre>
<p>This means that putting a named function expression into an assignment makes Safari “happy”:</p>
<pre lang="javascript" class="prettyprint">
(function f(){}); // fails
var f = function f(){}; // works
(function(){
return function f(){}; // fails
})();
(function(){
return (f = function f(){}); // works
})();
setTimeout(function f(){ }, 100); // fails
Person.prototype = {
say: function say() { ... } // fails
}
Person.prototype.say = function say(){ ... }; // works
</pre>
<p>It also means that we can’t use such a common pattern as returning a named function expression without an assignment:</p>
<pre lang="javascript" class="prettyprint">
// Instead of this non-Safari-2x-compatible syntax:
(function(){
if (featureTest) {
return function f(){};
}
return function f(){};
})();
// we should use this slightly more verbose alternative:
(function(){
var f;
if (featureTest) {
f = function f(){};
}
else {
f = function f(){};
}
return f;
})();
// or another variation of it:
(function(){
var f;
if (featureTest) {
return (f = function f(){});
}
return (f = function f(){});
})();
/*
Unfortunately, by doing so, we introduce an extra reference to a function
which gets trapped in a closure of returning function. To prevent extra memory usage,
we can assign all named function expressions to one single variable.
*/
var __temp;
(function(){
if (featureTest) {
return (__temp = function f(){});
}
return (__temp = function f(){});
})();
...
(function(){
if (featureTest2) {
return (__temp = function g(){});
}
return (__temp = function g(){});
})();
/*
Note that subsequent assignments destroy previous references,
preventing any excessive memory usage.
*/
</pre>
<p>If Safari 2.x compatibility is important, we need to make sure <strong>“incompatible” constructs do not even appear in the source</strong>. This is of course quite irritating, but it is definitely possible to achieve, especially when you know the root of the problem.</p>
<p>It’s also worth mentioning that declaring a function as an NFE in Safari 2.x exhibits another minor glitch, where the function representation does not contain the function identifier:</p>
<pre lang="javascript" class="prettyprint">
var f = function g(){};
// Notice how function representation is lacking `g` identifier
String(f); // function () { }
</pre>
<p>This is not really a big deal. As I have already mentioned, function decompilation is something that <a href="http://thinkweb2.com/projects/prototype/those-tricky-functions/">should not be relied upon</a> anyway.</p>
<h2 id="spidermonkey-peculiarity">SpiderMonkey peculiarity</h2>
<p>
We know that the identifier of a named function expression is only available in the local scope of that function.
But how does this "magic" scoping actually happen? It appears to be very simple.
When a named function expression is evaluated, a <strong>special object is created</strong>.
The sole purpose of that object is to hold a property whose name corresponds to the function identifier and whose value corresponds to the function itself.
That object is then injected into the front of the current scope chain, and this "augmented" scope chain is then used to initialize the function.
</p>
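<p>The observable effect of this mechanism can be checked with a short sketch (<code>probe</code> and <code>innerName</code> are hypothetical names used only here). In a conforming implementation the identifier resolves inside the function but not in the enclosing scope; JScript, as discussed earlier, leaks it:</p>
<pre lang="javascript" class="prettyprint">
var probe = function innerName() {
  // `innerName` is found on the special object at the front of the scope chain
  return typeof innerName;
};
var seenInside = probe();           // 'function'
var seenOutside = typeof innerName; // 'undefined' in conforming implementations
</pre>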
<p>
The interesting part here, however, is the way ECMA-262 defines this "special" object, the one that holds the function identifier.
The spec says that the object is created <strong>"as if by expression new Object()"</strong> which, when interpreted literally,
makes this object an instance of the built-in <code>Object</code> constructor. However, only one implementation, SpiderMonkey,
followed this specification requirement literally. In SpiderMonkey, it is possible to interfere with function local variables by augmenting <code>Object.prototype</code>:
</p>
<pre lang="javascript" class="prettyprint">
Object.prototype.x = 'outer';
(function(){
var x = 'inner';
/*
`foo` function here has a special object in its scope chain — to hold an identifier. That object is practically a —
`{ foo: <function object> }`. When `x` is being resolved through the scope chain, it is first searched for in
`foo`'s local context. When not found, it is searched in the next object from the scope chain. That object turns out
to be the one that holds identifier — { foo: <function object> } and since it inherits from `Object.prototype`,
`x` is found right here, and is the one that's `Object.prototype.x` (with value of 'outer'). Outer function's scope
(with x === 'inner') is never even reached.
*/
(function foo(){
alert(x); // alerts `outer`
})();
})();
</pre>
<p>
Note that later versions of SpiderMonkey actually <strong>changed this behavior</strong>,
as it was probably considered a security hole. The "special" object no longer inherits from <code>Object.prototype</code>.
You can, however, still see it in Firefox 3 and earlier.
</p>
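<p>For comparison, here is a sketch of the same check written to return a value instead of alerting (the <code>resolved</code> variable is introduced here for illustration). It assumes an implementation where the identifier holder does not inherit from <code>Object.prototype</code>, such as engines following ECMAScript 5; there, the lookup continues to the outer function’s scope:</p>
<pre lang="javascript" class="prettyprint">
Object.prototype.x = 'outer';
var resolved = (function(){
  var x = 'inner';
  return (function foo(){
    // `x` is not a local of `foo`, and the object holding `foo` has no `x`,
    // so the outer function's `x` is found next
    return x;
  })();
})();
// remove the temporary property so that it doesn't affect other code
delete Object.prototype.x;
resolved; // 'inner' where the special object does not inherit from Object.prototype
</pre>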
<p id="activation-object-in-blackberry-browser">
Another environment implementing an internal object as an instance of the global <code>Object</code> is the <strong>Blackberry browser</strong>.
Only this time, it's the <em>Activation Object</em> that inherits from <code>Object.prototype</code>. Note that the specification actually
doesn't require the <em>Activation Object</em> to be created "as if by expression new Object()" (as is the case with the NFE's identifier holder object).
It states that the <em>Activation Object</em> is merely a specification mechanism.
</p>
<p>So, let's see what happens in Blackberry browser:</p>
<pre lang="javascript" class="prettyprint">
Object.prototype.x = 'outer';
(function(){
var x = 'inner';
(function(){
/*
When `x` is being resolved against scope chain, this local function's Activation Object is searched first.
There's no `x` in it, of course. However, since Activation Object inherits from `Object.prototype`, it is
`Object.prototype` that's being searched for `x` next. `Object.prototype.x` does in fact exist and so `x`
resolves to its value — 'outer'. As in the previous example, outer function's scope (Activation Object)
with its own x === 'inner' is never even reached.
*/
alert(x); // alerts 'outer'
})();
})();
</pre>
<p>
This might look bizarre, but what's really disturbing is that there's even more chance of conflict with
already existing <code>Object.prototype</code> members:
</p>
<pre lang="javascript" class="prettyprint">
(function(){
var constructor = function(){ return 1; };
(function(){
constructor(); // evaluates to an object `{ }`, not `1`
constructor === Object.prototype.constructor; // true
toString === Object.prototype.toString; // true
// etc.
})();
})();
</pre>
<p>
The solution to this Blackberry discrepancy is obvious: avoid giving variables the names of <code>Object.prototype</code> properties,
such as <code>toString</code>, <code>valueOf</code>, <code>hasOwnProperty</code>, and so on.
</p>
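<p>One way to catch such names ahead of time is a small check like the following sketch. The <code>shadowsObjectPrototype</code> helper is hypothetical, not part of any library; it simply tests whether a candidate variable name is also reachable through <code>Object.prototype</code>:</p>
<pre lang="javascript" class="prettyprint">
function shadowsObjectPrototype(name) {
  // `in` also sees non-enumerable built-ins such as `toString`
  return name in Object.prototype;
}
shadowsObjectPrototype('toString');    // true
shadowsObjectPrototype('constructor'); // true
shadowsObjectPrototype('cache');       // false, unless some script added it
</pre>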
<h2 id="solution">JScript solution</h2>
<pre lang="javascript" class="prettyprint">
var fn = (function(){
// declare a variable to assign function object to
var f;
// conditionally create a named function
// and assign its reference to `f`
if (true) {
f = function F(){ };
}
else if (false) {
f = function F(){ };
}
else {
f = function F(){ };
}
// Assign `null` to a variable corresponding to a function name
// This marks the function object (referred to by that identifier)
// available for garbage collection
var F = null;
// return a conditionally defined function
return f;
})();
</pre>
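<p>The <code>var F = null</code> line only touches the variable in the enclosing scope. Assuming a conforming implementation, the function’s own name binding is a separate one, so the returned function is unaffected; the sketch below (with a made-up <code>cleaned</code> variable) illustrates this. In JScript the name would presumably resolve to the nulled variable instead, which is one more reason never to reference it:</p>
<pre lang="javascript" class="prettyprint">
var cleaned = (function(){
  var f = function F(){
    // resolves to the NFE's own binding, which sits closer in the scope chain
    // than the enclosing `var F`
    return typeof F;
  };
  // the same cleanup as above
  var F = null;
  return f;
})();
cleaned(); // 'function' in conforming implementations
</pre>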
<p>Finally, here’s how we would apply this “technique” in real life, when writing something like a cross-browser <code>addEvent</code> function:</p>
<pre lang="javascript" class="prettyprint">
// 1) enclose declaration with a separate scope
var addEvent = (function(){
var docEl = document.documentElement;
// 2) declare a variable to assign function to
var fn;
if (docEl.addEventListener) {
// 3) make sure to give function a descriptive identifier
fn = function addEvent(element, eventName, callback) {
element.addEventListener(eventName, callback, false);
};
}
else if (docEl.attachEvent) {
fn = function addEvent(element, eventName, callback) {
element.attachEvent('on' + eventName, callback);
};
}
else {
fn = function addEvent(element, eventName, callback) {
element['on' + eventName] = callback;
};
}
// 4) clean up `addEvent` function created by JScript
// make sure to either prepend assignment with `var`,
// or declare `addEvent` at the top of the function
var addEvent = null;
// 5) finally return function referenced by `fn`
return fn;
})();
</pre>
<h2 id="alt-solution">Alternative solution</h2>
<p>It’s worth mentioning that there are alternative ways of having descriptive names in call stacks, ways that don’t require named function expressions. First of all, it is often possible to define a function via a declaration rather than via an expression. This option is only viable when you don’t need to create more than one function:</p>
<pre lang="javascript" class="prettyprint">
var hasClassName = (function(){
// define some private variables
var cache = { };
// use function declaration
function hasClassName(element, className) {
var _className = '(?:^|\\s+)' + className + '(?:\\s+|$)';
var re = cache[_className] || (cache[_className] = new RegExp(_className));
return re.test(element.className);
}
// return function
return hasClassName;
})();
</pre>
<p>This obviously wouldn’t work when forking function definitions. Nevertheless, there’s an interesting pattern that I first saw used by <a href="http://tobielangel.com/">Tobie Langel</a>. It works by <strong>defining all functions upfront using function declarations, but giving them slightly different identifiers</strong>:</p>
<pre lang="javascript" class="prettyprint">
var addEvent = (function(){
var docEl = document.documentElement;
function addEventListener(){
/* ... */
}
function attachEvent(){
/* ... */
}
function addEventAsProperty(){
/* ... */
}
if (typeof docEl.addEventListener != 'undefined') {
return addEventListener;
}
else if (typeof docEl.attachEvent != 'undefined') {
return attachEvent;
}
return addEventAsProperty;
})();
</pre>
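<p>To see the pattern run outside a browser, here is a sketch that takes <code>docEl</code> as a parameter, so that plain objects can stand in for DOM elements. The <code>chooseAddEvent</code> wrapper and the stub objects are assumptions made for this illustration; the function bodies are borrowed from the earlier <code>addEvent</code> example. The <code>name</code> property (standardized in ES2015) shows which implementation was picked, much like a debugger would:</p>
<pre lang="javascript" class="prettyprint">
function chooseAddEvent(docEl) {
  function addEventListener(element, eventName, callback) {
    element.addEventListener(eventName, callback, false);
  }
  function attachEvent(element, eventName, callback) {
    element.attachEvent('on' + eventName, callback);
  }
  function addEventAsProperty(element, eventName, callback) {
    element['on' + eventName] = callback;
  }
  if (typeof docEl.addEventListener != 'undefined') {
    return addEventListener;
  }
  else if (typeof docEl.attachEvent != 'undefined') {
    return attachEvent;
  }
  return addEventAsProperty;
}
// stubs standing in for `document.documentElement`
var viaAttach = chooseAddEvent({ attachEvent: function(){} });
var viaProperty = chooseAddEvent({ });
viaAttach.name;   // 'attachEvent'
viaProperty.name; // 'addEventAsProperty'
// the fallback simply assigns the handler as a property
var el = { };
var handler = function(){};
viaProperty(el, 'click', handler);
el.onclick === handler; // true
</pre>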
<p>While it’s an elegant approach, it has its own drawbacks. First, by using different identifiers, you lose naming consistency. Whether that’s a good or a bad thing is not very clear. Some might prefer to have identical names, while others wouldn’t mind varying ones; after all, different names can often “speak” about the implementation used. For example, seeing “attachEvent” in a debugger would let you know that it is an <code>attachEvent</code>-based implementation of <code>addEvent</code>. On the other hand, an implementation-related name might not be meaningful at all. If you’re providing an API and name “inner” functions in such a way, the user of the API could easily get lost in all of these implementation details.</p>
<p>A solution to this problem might be to employ a different naming convention. Just be careful not to introduce extra verbosity. Some alternatives that come to mind are:</p>
<pre lang="javascript" class="prettyprint">
`addEvent`, `altAddEvent` and `fallbackAddEvent`
// or
`addEvent`, `addEvent2`, `addEvent3`
// or
`addEvent_addEventListener`, `addEvent_attachEvent`, `addEvent_asProperty`
</pre>
<p>Another minor issue with this pattern is increased memory consumption. By defining all of the function variations upfront, you implicitly create N-1 unused functions. As you can see, if <code>attachEvent</code> is found in <code>document.documentElement</code>, then neither <code>addEventListener</code> nor <code>addEventAsProperty</code> is ever really used. Yet they already consume memory, and that memory is never deallocated for the same reason as with JScript’s buggy named expressions: both functions are “trapped” in the closure of the returned one.</p>
<p>This increased consumption is of course hardly an issue. If a library such as Prototype.js were to use this pattern, there would be no more than 100-200 extra function objects created. As long as functions are not created in such a way repeatedly (at runtime) but only once (at load time), you probably shouldn’t worry about it.</p>