= Nginx Variables (04) =

- Even if a Nginx variable is hooked with "get handler", it can opt-in to
- use the value container as cache, so that when a variable is read multiple
- times, "get handler" is executed only once.Here is an example:
+ == Value Containers for Caching & ngx_map ==
+
+ Some Nginx variables choose to use their value containers as a data cache when
+ the "get handler" is configured. In this setting, the "get handler" is run only
+ once, i.e., the first time the variable is read, which reduces overhead when
+ the variable is read multiple times during its lifetime. Let's look at an
+ example.

:nginx
map $args $foo {
@@ -17,141 +21,155 @@ times, "get handler" is executed only once.Here is an example:
set $orig_foo $foo;
set $args debug;

- echo "orginal foo: $orig_foo";
+ echo "original foo: $orig_foo";
echo "foo: $foo";
}
}

- Module L<ngx_map> and its command L<ngx_map/map> is new, let me explain.
- command L<ngx_map/map> in Nginx defines the mapping in between two Nginx
- variables. Back to our example, command L<ngx_map/map> defines the mapping
- from builtin variable L<ngx_core/$args> to user variable C<$foo>, in other
- words, the value of C<$foo> is decided by the value of L<ngx_core/$args>
- with the given mapping.
-
- What exactly our mapping is defined as ?
+ Here we use the L<ngx_map/map> directive from the standard module L<ngx_map>
+ for the first time, which deserves some introduction. The word C<map> here
+ means mapping or correspondence. For example, functions in mathematics are a
+ kind of "mapping". Nginx's L<ngx_map/map> directive defines such a "mapping"
+ relationship between two Nginx variables, or in other words, a "functional
+ relationship". Back to this example, we use the L<ngx_map/map> directive to
+ define the "mapping" relationship between user variable C<$foo> and built-in
+ variable L<ngx_core/$args>. In the mathematical notation C<y = f(x)>,
+ our C<$args> variable is effectively the "independent variable", C<x>, while
+ C<$foo> is the "dependent variable", C<y>. That is, the value of C<$foo>
+ depends on the value of L<ngx_core/$args>, or rather, we I<map> the value of
+ L<ngx_core/$args> onto the C<$foo> variable (in some way).
+
+ Now let's look at the exact mapping rule defined by the L<ngx_map/map>
+ directive in this example.

:nginx
map $args $foo {
default 0;
debug 1;
}

- C<default>, found in the first line within curly bracket, defines the
- default mapping rule. It means if no other rules can be applied, mapping
- executes the default one, which assigns variable C<$foo> with value C<0>.
- The second line in the curly bracket defines another rule, which assigns
- variable C<$foo> with value C<1> when builtin variable L<ngx_core/$args>
- equals to string C<debug>. Therefore, variable C<$foo> is either C<0> or
- C<1>,
- up to whether L<ngx_core/$args> equals to string C<debug>.
-
- It's cleared enough. Back to our C<location /test>, we saved the value
- of
- C<$foo> to another user variable C<$orig_foo> and forcefully overwrite
- the
- value of L<ngx_core/$args> as C<debug>. At last, we print both C<$orig_foo>
- and C<$foo> using L<ngx_echo/echo>.
-
- When L<ngx_core/$args> is forcefully overwritten as C<debug>, we might
- have
- thought C<$foo> has the value C<1> according to our L<ngx_map/map> mappings,
- but testing defeats us:
+ The first line within the curly braces is a special rule condition, that is,
+ this condition holds if and only if all the other conditions fail. When this
+ "default" condition holds, the "dependent variable" C<$foo> is assigned the
+ value C<0>. The second line within the curly braces means that the "dependent
+ variable" C<$foo> is assigned the value C<1> if the "independent variable"
+ C<$args> matches the string value C<debug>. Combining these two lines, we
+ obtain the following complete mapping rule: if the value of L<ngx_core/$args>
+ is C<debug>, variable C<$foo> gets the value C<1>; otherwise C<$foo> gets the
+ value C<0>. So essentially, this is a conditional assignment to the variable
+ C<$foo>.
+
+ Now that we understand what the L<ngx_map/map> directive does, let's look at
+ the definition of C<location /test>. We first save the value of C<$foo> into
+ another user variable C<$orig_foo>, then overwrite the value of
+ L<ngx_core/$args> with C<debug>, and finally output the values of C<$orig_foo>
+ and C<$foo>, respectively.
+
+ Intuitively, after we overwrite the value of L<ngx_core/$args> with C<debug>,
+ the value of C<$foo> should automatically get adjusted to C<1> according to the
+ mapping rule defined earlier, regardless of the original value of C<$foo>. But
+ the test result says otherwise.

:bash
$ curl 'http://localhost:8080/test'
original foo: 0
foo: 0

- As expected, C<$orig_foo> is C<0>, since the request has no URL parameters
- and
- L<ngx_core/$args> is empty, our default mapping rule is effective, and
- C<$foo>
- gets its value C<0>.
-
- But the second output appears confusing, as L<ngx_core/args> is already
- overwritten
- as C<debug>, our mapping rule should have assigned variable C<$foo> with
- value C<1>,
- what's wrong?
-
- The reason is simple, when variable C<$foo> is needed the first time, its
- calculated
- value from the mapping algorithm is cached, as being said, Nginx module
- can opt-in to
- use value container as cache for the outcome of its "get handler". Apparently,
- L<ngx_map>
- caches the outcome to avoid further expensive calculation, so that Nginx
- can use the cached
- result for that variable in the subsequent handling for free.
-
- To verify this, we request again with an URL parameter C<debug>:
+ The first output line indicates that the value of C<$orig_foo> is C<0>, which
+ is exactly what we expected: the original request carries no URL query
+ string, so the initial value of L<ngx_core/$args> is empty, leading to the C<0>
+ initial value of C<$foo>, according to the "default" condition in our mapping
+ rule.
+
+ But surprisingly, the second output line indicates that the final value of
+ C<$foo> is still C<0>, even after we overwrite L<ngx_core/$args> with the value
+ C<debug>. This seemingly violates our mapping rule because when
+ L<ngx_core/$args> takes the value C<debug>, the value of C<$foo> should really
+ be C<1>. So what is happening here?
+
+ Actually the reason is pretty simple: the first time variable C<$foo> is
+ read, its value computed by L<ngx_map>'s "get handler" is
+ cached in its value container. We already learned earlier that Nginx modules
+ may choose to use the value container of a variable they create as
+ a data cache for its "get handler". Obviously, the L<ngx_map> module considers
+ the mapping computation between variables expensive enough and caches the result
+ automatically, so that the next time the same variable is read within the
+ lifetime of the current request, Nginx can just return the cached result
+ without invoking the "get handler" again.
+
+ To verify this further, we can try specifying the URL query string as C<debug>
+ in the original request.

:bash
$ curl 'http://localhost:8080/test?debug'
original foo: 1
foo: 1

- Granted, the value of C<$orig_foo> becomes C<1>. Since builtin variable
- L<ngx_core/$args>
- equals C<debug>, according to the mapping rule, variable C<$foo> is calculated
- as C<1>, and
- the calculation result is cached and remains as C<1> no matter how L<ngx_core/$args>
- will
- be modified subsequently.
-
- Command L<ngx_map/map> is really more than what it looks, the command actually
- hooks a
- "get handler" for user variables, and exposes the script interface so that
- exact devalue
- logic can be easily modified by user themselves. The price of doing this,
- is to restrict
- the logic be the mapping from one variable to another. Meanwhile, let's
- recall what we've
- learnt back in L<vartut/ (03)>, even if a variable is devalued by a "get
- handler", it does
- not necessarily uses a value container as cache, such as the L<$arg_XXX>
- variables.
-
- Just like module L<ngx_map>, another builtin module L<ngx_geo> uses cache
- for variables.
-
- We should have noticed that command L<ngx_map/map> is written in front
- of C<server>
- directive, i.e. the mappings are defined directly within C<http>. Is it
- possible to
- write it within a C<location> directive since it is used only in C<location
- /test> in
- our example, the answer is no !
-
- People who have just learnt Nginx, would argue this global configuration
- of
- mappings by L<ngx_map/map>, is likely to be inefficient since request to
- every C<location>
- will cause the mapping be repeatedly calculated. Have no worry and let us
- review,
- command L<ngx_map/map> actually defines a "get handler" for a user variable,
- the
- get handler is only executed when the variable needs to be devalued (if
- cache is used, the
- handler is executed once for all), therefore, for those requests to certain
- C<location>
- which has not used the variable, no calculation will be triggered.
-
- The technique, which only calculates till the needed moment, is called
- "lazy evaluation" in
- computing. "Lazy evaluation", contrary to "eager evaluation", is not natively
- supported by
- most programming languages, a classic one who does is Haskell. In the mini
- language of Nginx,
- "eager evaluation" is far more common, such as following statement using
- L<ngx_rewrite/set>:
+ It can be seen that the value of C<$orig_foo> becomes C<1>, complying with our
+ mapping rule. Subsequent reads of C<$foo> always yield the same cached
+ result, C<1>, regardless of the new value of L<ngx_core/$args> later on.
+
+ The L<ngx_map/map> directive is actually a unique example, because it not only
+ registers a "get handler" for the user variable, but also allows the user to
+ define the computing rule in the "get handler" directly in the Nginx
+ configuration file. Of course, the rule that can be defined here is limited to
+ simple mapping relations with another variable. Meanwhile, it must be made
+ clear that not all the variables using a "get handler" will cache the result.
+ For instance, we have already seen earlier that the L<$arg_XXX> variable does
+ not use its value container at all.
+
+ Similar to the L<ngx_map> module, the standard module L<ngx_geo> that we
+ encountered earlier also enables value caching for the variables created by its
+ L<ngx_geo/geo> directive.
+
+ === A Side Note for Use Contexts of Directives ===
+
+ In the previous example, we should also note that the L<ngx_map/map> directive
+ is put outside the C<server> configuration block, that is, it is defined
+ directly within the outermost C<http> configuration block. Some readers may be
+ curious about this setting, since we only use it in C<location /test> after
+ all. If we try putting the L<ngx_map/map> statement within the C<location>
+ block, however, we will get the following error while starting Nginx:
+
+ [emerg] "map" directive is not allowed here in ...
+
+ So it is explicitly prohibited. In fact, the L<ngx_map/map> directive is only
+ allowed in the C<http>
+ block. Every configuration directive has a pre-defined set of use contexts in
+ the configuration file. When in doubt, always refer to the corresponding
+ documentation for the exact use contexts of a particular directive.
+
+ == Lazy Evaluation of Variable Values ==
+
+ Many Nginx newcomers may worry that the use of the L<ngx_map/map> directive
+ within the global scope (i.e., the C<http> block) will lead to unnecessary
+ variable value computation and assignment for all the C<location>s in all the
+ virtual servers even if only one C<location> block actually uses it.
+ Fortunately, this is I<not> what is happening here. We have already learned how
+ the L<ngx_map/map>
+ directive works. It is the "get handler" (registered by the L<ngx_map> module)
+ that performs the value computation and related assignment. And the "get
+ handler" will not run at all
+ unless the corresponding user variable is actually being read. Therefore, for
+ those requests that never access that variable, there cannot be any (useless)
+ computation involved.
+
+ The technique that postpones the value computation until the point where the
+ value is actually needed is called "lazy evaluation" in the computing world.
+ Programming languages natively offering "lazy evaluation" are not very
+ common though. The most famous example is the Haskell programming language,
+ where lazy evaluation is the default semantics. In contrast with "lazy
+ evaluation", it is much more common to see "eager evaluation". We are lucky
+ to see examples of lazy evaluation here in the L<ngx_map> module, but
+ the "eager evaluation" semantics is also much more common in the Nginx
+ world. Consider the following L<ngx_rewrite/set> statement that could hardly
+ be simpler:

:nginx
set $b "$a,$a";

- When variable C<$b> is declared by command L<ngx_rewrite/set>, the value
- of C<$b> is computed right away, the calculation won't be delayed
- till
- variable C<$b> needs to be devalued .
+ When running the L<ngx_rewrite/set> directive, Nginx eagerly
+ computes and assigns the new value for the variable C<$b> without postponing this to
+ the point when C<$b> is actually read later on. Similarly, the
+ L<ngx_set_misc/set_unescape_uri> directive also evaluates eagerly.
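
The caching and lazy evaluation behavior described in the added text can be modeled outside Nginx. The following is only a minimal Python sketch under stated assumptions, not Nginx's actual implementation: the names C<Request>, C<get_foo>, C<handler_calls> and C<handle_test> are hypothetical, and the mapping uses a plain exact string comparison as a simplification of the C<map $args $foo { default 0; debug 1; }> rule.

```python
class Request:
    """Toy model of per-request Nginx variables (hypothetical, not Nginx's API)."""

    def __init__(self, args=""):
        self.args = args        # plays the role of $args
        self._cache = {}        # value containers used as a per-request cache
        self.handler_calls = 0  # how many times the "get handler" actually ran

    def _map_foo(self):
        # Simplified stand-in for: map $args $foo { default 0; debug 1; }
        self.handler_calls += 1
        return "1" if self.args == "debug" else "0"

    def get_foo(self):
        # Lazy: the handler runs on the first read only; later reads hit the cache.
        if "foo" not in self._cache:
            self._cache["foo"] = self._map_foo()
        return self._cache["foo"]


def handle_test(req):
    """Mimics the directives in location /test, in order."""
    orig_foo = req.get_foo()  # set $orig_foo $foo;  (eager: the value is copied now)
    req.args = "debug"        # set $args debug;
    return [f"original foo: {orig_foo}", f"foo: {req.get_foo()}"]


if __name__ == "__main__":
    for query in ("", "debug"):
        print("\n".join(handle_test(Request(query))))
```

In this model, a request that never calls C<get_foo> leaves C<handler_calls> at zero, which corresponds to the point that a C<map> defined in the C<http> block costs nothing for locations that never read the variable.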