This commit was manufactured by cvs2svn to create tag 'APACHE_1_2b7'.

git-svn-id: https://svn.apache.org/repos/asf/httpd/httpd/tags/APACHE_1_2b7@77651 13f79535-47bb-0310-9956-ffa450edef68
commit 3da8b1dd1d4d9f88e5d323f83b5726631ff1b90c 1 parent 484e0d2
No Author authored
BIN  docs/docroot/apache_pb.gif
100 docs/manual/bind.html.en
... ... @@ -1,100 +0,0 @@
1   -<html><head>
2   -<title>Setting which addresses and ports Apache uses</title>
3   -</head><body>
4   -
5   -<!--#include virtual="header.html" -->
6   -<h1>Setting which addresses and ports Apache uses</h1>
7   -
8   -<hr>
9   -
10   -When Apache starts, it binds to some port and address on the
11   -local machine and waits for incoming requests. By default, it
12   -listens to all addresses on the machine, and to the port
13   -as specified by the <tt>Port</tt> directive in the server configuration.
14   -However, it can be told to listen to more than one port, or to listen
15   -to only selected addresses, or a combination. This is often combined
16   -with the Virtual Host feature which determines how Apache
17   -responds to different IP addresses, hostnames and ports.<p>
18   -
19   -There are two directives used to restrict or specify which addresses
20   -and ports Apache listens to.
21   -
22   -<ul>
23   -<li><a href="#bindaddress">BindAddress</a> is used to restrict the server to listening to
24   - a single address, and can be used to permit multiple Apache servers
25   - on the same machine listening to different IP addresses.
26   -<li><a href="#listen">Listen</a> can be used to make a single Apache server listen
27   - to more than one address and/or port.
28   -</ul>
29   -
30   -<h3><a name="bindaddress">BindAddress</a></h3>
31   -<strong>Syntax:</strong> BindAddress <em>[ * | IP-address | hostname ]</em><br>
32   -<strong>Default:</strong> <code>BindAddress *</code><br>
33   -<strong>Context:</strong> server config<br>
34   -<strong>Status:</strong> Core<p>
35   -
36   -Makes the server listen to just the specified address. If the argument
37   -is *, the server listens to all addresses. The port listened to
38   -is set with the <tt>Port</tt> directive. Only one BindAddress
39   -should be used.
40   -
41   -<h3><a name="listen">Listen</a></h3>
42   -<strong>Syntax:</strong> Listen <em>[ port | IP-address:port ]</em><br>
43   -<strong>Default:</strong> <code>none</code><br>
44   -<strong>Context:</strong> server config<br>
45   -<strong>Status:</strong> Core<p>
46   -
47   -<tt>Listen</tt> can be used instead of <tt>BindAddress</tt> and
48   -<tt>Port</tt>. It tells the server to accept incoming requests on the
49   -specified port or address-and-port combination. If the first format is
50   -used, with a port number only, the server listens to the given port on
51   -all interfaces, instead of the port given by the <tt>Port</tt>
52   -directive. If an IP address is given as well as a port, the server
53   -will listen on the given port and interface. <p> Multiple Listen
54   -directives may be used to specify a number of addresses and ports to
55   -listen to. The server will respond to requests from any of the listed
56   -addresses and ports.<p>
57   -
58   -For example, to make the server accept connections on both port
59   -80 and port 8000, use:
60   -<pre>
61   - Listen 80
62   - Listen 8000
63   -</pre>
64   -
65   -To make the server accept connections on two specified
66   -interfaces and port numbers, use
67   -<pre>
68   - Listen 192.170.2.1:80
69   - Listen 192.170.2.5:8000
70   -</pre>
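The two argument forms accepted by Listen can be illustrated with a small parser. This is only a sketch under stated assumptions: `parse_listen` is a hypothetical helper, not part of Apache, and it represents "all interfaces" as `None`.

```python
from typing import Optional, Tuple


def parse_listen(argument: str) -> Tuple[Optional[str], int]:
    """Split a Listen argument into (address, port).

    Hypothetical helper for illustration only. A bare port means
    "all interfaces", which is represented here as None.
    """
    address, sep, port = argument.strip().rpartition(":")
    if not sep:
        # Port-only form, e.g. "8000".
        return None, int(port)
    # Address-and-port form, e.g. "192.170.2.5:8000".
    return address, int(port)


if __name__ == "__main__":
    for arg in ("80", "192.170.2.5:8000"):
        print(parse_listen(arg))
```

Splitting from the right keeps the port as the last colon-separated field, which is enough for the two forms described in the Syntax line.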
71   -
72   -<h2>How this works with Virtual Hosts</h2>
73   -
74   -BindAddress and Listen do not implement Virtual Hosts. They tell the
75   -main server what addresses and ports to listen to. If no
76   -&lt;VirtualHost&gt; directives are used, the server will behave the
77   -same for all accepted requests. However, &lt;VirtualHost&gt; can be
78   -used to specify a different behavior for one or more of the addresses
79   -and ports. To implement a VirtualHost, the server must first be told
80   -to listen to the address and port to be used. Then a
81   -&lt;VirtualHost&gt; section should be created for a specified address
82   -and port to set the behavior of this virtual host. Note that if the
83   -&lt;VirtualHost&gt; is set for an address and port that the server is
84   -not listening to, it cannot be accessed.
85   -
86   -<h2>See also</h2>
87   -
88   -See also the documentation on
89   -<a href="virtual-host.html">Virtual Hosts</a>,
90   -<a href="host.html">Non-IP virtual hosts</a>,
91   -<a href="mod/core.html#bindaddress">BindAddress directive</a>,
92   -<a href="mod/core.html#port">Port directive</a>
93   -and
94   -<a href="mod/core.html#virtualhost">&lt;VirtualHost&gt; section</a>.
95   -</ul>
96   -
97   -<!--#include virtual="footer.html" -->
98   -</BODY>
99   -</HTML>
100   -
84 docs/manual/cgi_path.html.en
... ... @@ -1,84 +0,0 @@
1   -<html><head>
2   -<title>PATH_INFO Changes in the CGI Environment</title>
3   -</head><body>
4   -
5   -<!--#include virtual="header.html" -->
6   -<h1>PATH_INFO Changes in the CGI Environment</h1>
7   -
8   -<hr>
9   -
10   -<h2><a name="over">Overview</a></h2>
11   -
12   -<p>As implemented in Apache 1.1.1 and earlier versions, the method
13   -Apache used to create PATH_INFO in the CGI environment was
14   -counterintuitive, and could result in crashes in certain cases. In
15   -Apache 1.2 and beyond, this behavior has changed. Although this
16   -results in some compatibility problems with certain legacy CGI
17   -applications, the Apache 1.2 behavior is still compatible with the
18   -CGI/1.1 specification, and CGI scripts can be easily modified (<a
19   -href="#compat">see below</a>).
20   -
21   -<h2><a name="prob">The Problem</a></h2>
22   -
23   -<p>Apache 1.1.1 and earlier implemented the PATH_INFO and SCRIPT_NAME
24   -environment variables by looking at the filename, not the URL. While
25   -this resulted in the correct values in many cases, when the filesystem
26   -path was overloaded to contain path information, it could result in
27   -errant behavior. For example, if the following appeared in a config
28   -file:
29   -<pre>
30   - Alias /cgi-ralph /usr/local/httpd/cgi-bin/user.cgi/ralph
31   -</pre>
32   -<p>In this case, <code>user.cgi</code> is the CGI script, the "/ralph"
33   -is information to be passed onto the CGI. If this configuration was in
34   -place, and a request came for "<code>/cgi-ralph/script/</code>", the
35   -code would set PATH_INFO to "<code>/ralph/script</code>", and
36   -SCRIPT_NAME to "<code>/cgi-</code>". Obviously, the latter is
37   -incorrect. In certain cases, this could even cause the server to
38   -crash.</p>
39   -
40   -<h2><a name="solution">The Solution</a></h2>
41   -
42   -<p>Apache 1.2 and later now determine SCRIPT_NAME and PATH_INFO by
43   -looking directly at the URL, and determining how much of the URL is
44   -client-modifiable, and setting PATH_INFO to it. To use the above
45   -example, PATH_INFO would be set to "<code>/script</code>", and
46   -SCRIPT_NAME to "<code>/cgi-ralph</code>". This makes sense and results
47   -in no server behavior problems. It also permits the script to be
48   -guaranteed that
49   -"<code>http://$SERVER_NAME:$SERVER_PORT$SCRIPT_NAME$PATH_INFO</code>"
50   -will always be an accessible URL that points to the current script,
51   -something which was not necessarily true with previous versions of
52   -Apache.
53   -
54   -<p>However, the "<code>/ralph</code>"
55   -information from the <code>Alias</code> directive is lost. This is
56   -unfortunate, but we feel that using the filesystem to pass along this
57   -sort of information is not a recommended method, and a script making
58   -use of it "deserves" not to work. Apache 1.2b3 and later, however, do
59   -provide <a href="#compat">a workaround.</a>
60   -
61   -<h2><a name="compat">Compatibility with Previous Servers</a></h2>
62   -
63   -<p>It may be necessary for a script that was designed for earlier
64   -versions of Apache or other servers to need the information that the
65   -old PATH_INFO variable provided. For this purpose, Apache 1.2 (1.2b3
66   -and later) sets an additional variable, FILEPATH_INFO. This
67   -environment variable contains the value that PATH_INFO would have had
68   -with Apache 1.1.1.</p>
69   -
70   -<p>A script that wishes to work with both Apache 1.2 and earlier
71   -versions can simply test for the existence of FILEPATH_INFO, and use
72   -it if available. Otherwise, it can use PATH_INFO. For example, in
73   -Perl, one might use:
74   -<pre>
75   - $path_info = $ENV{'FILEPATH_INFO'} || $ENV{'PATH_INFO'};
76   -</pre>
77   -
78   -<p>By doing this, a script can work with all servers supporting the
79   -CGI/1.1 specification, including all versions of Apache.</p>
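The same fallback can be written in other CGI languages. The following is a minimal Python sketch of the Perl one-liner above; the function name `effective_path_info` is an assumption made for illustration, and the environment is passed in as a mapping so the logic can be exercised outside a server.

```python
import os


def effective_path_info(environ=os.environ) -> str:
    """Prefer FILEPATH_INFO (the old-style value) and fall back to PATH_INFO.

    Like the Perl `||` idiom, an unset or empty FILEPATH_INFO falls
    through to PATH_INFO.
    """
    return environ.get("FILEPATH_INFO") or environ.get("PATH_INFO", "")


if __name__ == "__main__":
    # Values taken from the /cgi-ralph example in this document.
    print(effective_path_info({"FILEPATH_INFO": "/ralph/script",
                               "PATH_INFO": "/script"}))
```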
80   -
81   -<!--#include virtual="footer.html" -->
82   -</BODY>
83   -</HTML>
84   -
419 docs/manual/content-negotiation.html.en
... ... @@ -1,419 +0,0 @@
1   -<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
2   -<HTML>
3   -<HEAD>
4   -<TITLE>Apache Content Negotiation</TITLE>
5   -</HEAD>
6   -
7   -<BODY>
8   -<!--#include virtual="header.html" -->
9   -<h1>Content Negotiation</h1>
10   -
11   -Apache's support for content negotiation has been updated to meet the
12   -HTTP/1.1 specification. It can choose the best representation of a
13   -resource based on the browser-supplied preferences for media type,
14   -languages, character set and encoding. It also implements a
15   -couple of features to give more intelligent handling of requests from
16   -browsers which send incomplete negotiation information. <p>
17   -
18   -Content negotiation is provided by the
19   -<a href="mod/mod_negotiation.html">mod_negotiation</a> module,
20   -which is compiled in by default.
21   -
22   -<hr>
23   -
24   -<h2>About Content Negotiation</h2>
25   -
26   -A resource may be available in several different representations. For
27   -example, it might be available in different languages or different
28   -media types, or a combination. One way of selecting the most
29   -appropriate choice is to give the user an index page, and let them
30   -select. However it is often possible for the server to choose
31   -automatically. This works because browsers can send as part of each
32   -request information about what representations they prefer. For
33   -example, a browser could indicate that it would like to see
34   -information in French, if possible, else English will do. Browsers
35   -indicate their preferences by headers in the request. To request only
36   -French representations, the browser would send
37   -
38   -<pre>
39   - Accept-Language: fr
40   -</pre>
41   -
42   -Note that this preference will only be applied when there is a choice
43   -of representations and they vary by language.
44   -<p>
45   -
46   -As an example of a more complex request, this browser has been
47   -configured to accept French and English, but prefer French, and to
48   -accept various media types, preferring HTML over plain text or other
49   -text types, and preferring GIF or jpeg over other media types, but also
50   -allowing any other media type as a last resort:
51   -
52   -<pre>
53   - Accept-Language: fr; q=1.0, en; q=0.5
54   - Accept: text/html; q=1.0, text/*; q=0.8, image/gif; q=0.6,
55   - image/jpeg; q=0.6, image/*; q=0.5, */*; q=0.1
56   -</pre>
57   -
58   -Apache 1.2 supports 'server driven' content negotiation, as defined in
59   -the HTTP/1.1 specification. It fully supports the Accept,
60   -Accept-Language, Accept-Charset and Accept-Encoding request headers.
61   -<p>
62   -
63   -The terms used in content negotiation are: a <b>resource</b> is an
64   -item which can be requested of a server, which might be selected as
65   -the result of a content negotiation algorithm. If a resource is
66   -available in several formats, these are called <b>representations</b>
67   -or <b>variants</b>. The ways in which the variants for a particular
68   -resource vary are called the <b>dimensions</b> of negotiation.
69   -
70   -<h2>Negotiation in Apache</h2>
71   -
72   -In order to negotiate a resource, the server needs to be given
73   -information about each of the variants. This is done in one of two
74   -ways:
75   -
76   -<ul>
77   - <li> Using a type map (i.e., a <code>*.var</code> file) which
78   - names the files containing the variants explicitly
79   - <li> Or using a 'MultiViews' search, where the server does an implicit
80   - filename pattern match, and chooses from among the results.
81   -</ul>
82   -
83   -<h3>Using a type-map file</h3>
84   -
85   -A type map is a document which is associated with the handler
86   -named <code>type-map</code> (or, for backwards-compatibility with
87   -older Apache configurations, the mime type
88   -<code>application/x-type-map</code>). Note that to use this feature,
89   -you've got to have a <code>SetHandler</code> some place which defines a
90   -file suffix as <code>type-map</code>; this is best done with a
91   -<pre>
92   -
93   - AddHandler type-map var
94   -
95   -</pre>
96   -in <code>srm.conf</code>. See comments in the sample config files for
97   -details. <p>
98   -
99   -Type map files have an entry for each available variant; these entries
100   -consist of contiguous RFC822-format header lines. Entries for
101   -different variants are separated by blank lines. Blank lines are
102   -illegal within an entry. It is conventional to begin a map file with
103   -an entry for the combined entity as a whole (although this
104   -is not required, and if present will be ignored). An example
105   -map file is:
106   -<pre>
107   -
108   - URI: foo
109   -
110   - URI: foo.en.html
111   - Content-type: text/html
112   - Content-language: en
113   -
114   - URI: foo.fr.de.html
115   - Content-type: text/html; charset=iso-8859-2
116   - Content-language: fr, de
117   -</pre>
118   -
119   -If the variants have different source qualities, that may be indicated
120   -by the "qs" parameter to the media type, as in this picture (available
121   -as jpeg, gif, or ASCII-art):
122   -<pre>
123   - URI: foo
124   -
125   - URI: foo.jpeg
126   - Content-type: image/jpeg; qs=0.8
127   -
128   - URI: foo.gif
129   - Content-type: image/gif; qs=0.5
130   -
131   - URI: foo.txt
132   - Content-type: text/plain; qs=0.01
133   -
134   -</pre>
135   -<p>
136   -
137   -qs values can vary between 0.000 and 1.000. Note that any variant with
138   -a qs value of 0.000 will never be chosen. Variants with no 'qs'
139   -parameter value are given a qs factor of 1.0. <p>
140   -
141   -The full list of headers recognized is:
142   -
143   -<dl>
144   - <dt> <code>URI:</code>
145   - <dd> uri of the file containing the variant (of the given media
146   - type, encoded with the given content encoding). These are
147   - interpreted as URLs relative to the map file; they must be on
148   - the same server (!), and they must refer to files to which the
149   - client would be granted access if they were to be requested
150   - directly.
151   - <dt> <code>Content-type:</code>
152   - <dd> media type --- charset, level and "qs" parameters may be given. These
153   - are often referred to as MIME types; typical media types are
154   - <code>image/gif</code>, <code>text/plain</code>, or
155   - <code>text/html;&nbsp;level=3</code>.
156   - <dt> <code>Content-language:</code>
157   - <dd> The languages of the variant, specified as an internet standard
158   - language code (e.g., <code>en</code> for English,
159   - <code>ko</code> for Korean, etc.).
160   - <dt> <code>Content-encoding:</code>
161   - <dd> If the file is compressed, or otherwise encoded, rather than
162   - containing the actual raw data, this says how that was done.
163   - For compressed files (the only case where this generally comes
164   - up), content encoding should be
165   - <code>x-compress</code>, or <code>x-gzip</code>, as appropriate.
166   - <dt> <code>Content-length:</code>
167   - <dd> The size of the file. Clients can ask to receive a given media
168   - type only if the variant isn't too big; specifying a content
169   - length in the map allows the server to compare against these
170   - thresholds without checking the actual file.
171   -</dl>
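The type-map layout described above (entries of contiguous header lines, separated by blank lines) can be sketched as a small parser. This is an illustration only, not Apache's implementation: `parse_type_map` is a hypothetical name, header names are lower-cased for convenience, and continuation lines are not handled.

```python
def parse_type_map(text: str) -> list:
    """Return one dict of lower-cased header names per type-map entry."""
    entries, current = [], {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current entry (if one was started).
            if current:
                entries.append(current)
                current = {}
            continue
        name, _, value = line.partition(":")
        current[name.strip().lower()] = value.strip()
    if current:
        # The final entry need not be followed by a blank line.
        entries.append(current)
    return entries


if __name__ == "__main__":
    sample = (
        "URI: foo\n"
        "\n"
        "URI: foo.en.html\n"
        "Content-type: text/html\n"
        "Content-language: en\n"
    )
    for entry in parse_type_map(sample):
        print(entry)
```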
172   -
173   -<h3>Multiviews</h3>
174   -
175   -This is a per-directory option, meaning it can be set with an
176   -<code>Options</code> directive within a <code>&lt;Directory&gt;</code>,
177   -<code>&lt;Location&gt;</code> or <code>&lt;Files&gt;</code>
178   -section in <code>access.conf</code>, or (if <code>AllowOverride</code>
179   -is properly set) in <code>.htaccess</code> files. Note that
180   -<code>Options All</code> does not set <code>MultiViews</code>; you
181   -have to ask for it by name. (Fixing this is a one-line change to
182   -<code>http_core.h</code>).
183   -
184   -<p>
185   -
186   -The effect of <code>MultiViews</code> is as follows: if the server
187   -receives a request for <code>/some/dir/foo</code>, if
188   -<code>/some/dir</code> has <code>MultiViews</code> enabled, and
189   -<code>/some/dir/foo</code> does <em>not</em> exist, then the server reads the
190   -directory looking for files named foo.*, and effectively fakes up a
191   -type map which names all those files, assigning them the same media
192   -types and content-encodings it would have if the client had asked for
193   -one of them by name. It then chooses the best match to the client's
194   -requirements, and forwards it along.
195   -
196   -<p>
197   -
198   -This applies to searches for the file named by the
199   -<code>DirectoryIndex</code> directive, if the server is trying to
200   -index a directory; if the configuration files specify
201   -<pre>
202   -
203   - DirectoryIndex index
204   -
205   -</pre> then the server will arbitrate between <code>index.html</code>
206   -and <code>index.html3</code> if both are present. If neither are
207   -present, and <code>index.cgi</code> is there, the server will run it.
208   -
209   -<p>
210   -
211   -If one of the files found when reading the directory is a CGI script,
212   -it's not obvious what should happen. The code gives that case
213   -special treatment --- if the request was a POST, or a GET with
214   -QUERY_ARGS or PATH_INFO, the script is given an extremely high quality
215   -rating, and generally invoked; otherwise it is given an extremely low
216   -quality rating, which generally causes one of the other views (if any)
217   -to be retrieved.
218   -
219   -<h2>The Negotiation Algorithm</h2>
220   -
221   -After Apache has obtained a list of the variants for a given resource,
222   -either from a type-map file or from the filenames in the directory, it
223   -applies an algorithm to decide on the 'best' variant to return, if
224   -any. To do this it calculates a quality value for each variant in each
225   -of the dimensions of variance. It is not necessary to know any of the
226   -details of how negotiation actually takes place in order to use Apache's
227   -content negotiation features. However the rest of this document
228   -explains in detail the algorithm used for those interested. <p>
229   -
230   -In some circumstances, Apache can 'fiddle' the quality factor of a
231   -particular dimension to achieve a better result. The ways Apache can
232   -fiddle quality factors are explained in more detail below.
233   -
234   -<h3>Dimensions of Negotiation</h3>
235   -
236   -<table>
237   -<tr><th>Dimension
238   -<th>Notes
239   -<tr><td>Media Type
240   -<td>Browser indicates preferences on Accept: header. Each item
241   -can have an associated quality factor. Variant description can also
242   -have a quality factor.
243   -<tr><td>Language
244   -<td>Browser indicates preferences on Accept-Language: header. Each
245   -item
246   -can have a quality factor. Variants can be associated with none, one
247   -or more languages.
248   -<tr><td>Encoding
249   -<td>Browser indicates preference with Accept-Encoding: header.
250   -<tr><td>Charset
251   -<td>Browser indicates preference with Accept-Charset: header. Variants
252   -can indicate a charset as a parameter of the media type.
253   -</table>
254   -
255   -<h3>Apache Negotiation Algorithm</h3>
256   -
257   -Apache uses an algorithm to select the 'best' variant (if any) to
258   -return to the browser. This algorithm is not configurable. It operates
259   -like this:
260   -<p>
261   -
262   -<ol>
263   -<li>
264   -Firstly, for each dimension of the negotiation, the appropriate
265   -Accept header is checked and a quality assigned to each
266   -variant. If the Accept header for any dimension means that this
267   -variant is not acceptable, eliminate it. If no variants remain, go
268   -to step 4.
269   -
270   -<li>Select the 'best' variant by a process of elimination. Each of
271   -the following tests is applied in order. Any variants not selected at
272   -each stage are eliminated. After each test, if only one variant
273   -remains, it is selected as the best match. If more than one variant
274   -remains, move onto the next test.
275   -
276   -<ol>
277   -<li>Multiply the quality factor from the Accept header with the
278   - quality-of-source factor for this variant's media type, and select
279   - the variants with the highest value
280   -
281   -<li>Select the variants with the highest language quality factor
282   -
283   -<li>Select the variants with the best language match, using either the
284   - order of languages on the <code>LanguagePriority</code> directive (if present),
285   - else the order of languages on the Accept-Language header.
286   -
287   -<li>Select the variants with the highest 'level' media parameter
288   - (used to give the version of text/html media types).
289   -
290   -<li>Select only unencoded variants, if there is a mix of encoded
291   - and non-encoded variants. If either all variants are encoded
292   - or all variants are not encoded, select all.
293   -
294   -<li>Select only variants with acceptable charset media parameters,
295   - as given on the Accept-Charset header line. Charset ISO-8859-1
296   - is always acceptable. Variants not associated with a particular
297   - charset are assumed to be in ISO-8859-1.
298   -
299   -<li>Select the variants with the smallest content length
300   -
301   -<li>Select the first variant of those remaining (this will be either the
302   -first listed in the type-map file, or the first read from the directory)
303   -and go to step 3.
304   -
305   -</ol>
306   -
307   -<li>The algorithm has now selected one 'best' variant, so return
308   - it as the response. The HTTP response header Vary is set to indicate the
309   - dimensions of negotiation (browsers and caches can use this
310   - information when caching the resource). End.
311   -
312   -<li>To get here means no variant was selected (because none are acceptable
313   - to the browser). Return a 406 status (meaning "No acceptable representation")
314   - with a response body consisting of an HTML document listing the
315   - available variants. Also set the HTTP Vary header to indicate the
316   - dimensions of variance.
317   -
318   -</ol>
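The first elimination test in step 2 (multiply the Accept quality by the variant's quality-of-source factor and keep the highest) can be sketched as follows. This is a simplified illustration, not Apache's code: `best_by_media_quality` is a hypothetical name, media types are matched exactly (no wildcards), and the later tests (language, level, encoding, charset, length) are omitted.

```python
def best_by_media_quality(variants, accept_q):
    """Keep the variants with the highest q * qs product.

    variants: list of (name, media_type, qs) tuples.
    accept_q: dict mapping a media type to the q value from Accept.
    Media types missing from accept_q are treated as unacceptable.
    """
    scored = [(accept_q.get(media_type, 0.0) * qs, name)
              for name, media_type, qs in variants]
    # Variants with a zero product are eliminated outright.
    acceptable = [(score, name) for score, name in scored if score > 0]
    if not acceptable:
        return []
    top = max(score for score, _ in acceptable)
    return [name for score, name in acceptable if score == top]


if __name__ == "__main__":
    # qs values from the jpeg/gif/ASCII-art type map earlier in this document.
    variants = [
        ("foo.jpeg", "image/jpeg", 0.8),
        ("foo.gif", "image/gif", 0.5),
        ("foo.txt", "text/plain", 0.01),
    ]
    print(best_by_media_quality(variants, {"image/jpeg": 0.6, "image/gif": 0.6}))
```

If more than one name comes back, the remaining tests in the list above would be needed to break the tie; an empty result corresponds to the 406 case in step 4.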
319   -<h2><a name="better">Fiddling with Quality Values</a></h2>
320   -
321   -Apache sometimes changes the quality values from what would be
322   -expected by a strict interpretation of the algorithm above. This is to
323   -get a better result from the algorithm for browsers which do not send
324   -full or accurate information. Some of the most popular browsers send
325   -Accept header information which would otherwise result in the
326   -selection of the wrong variant in many cases. If a browser
327   -sends full and correct information these fiddles will not
328   -be applied.
329   -<p>
330   -
331   -<h3>Media Types and Wildcards</h3>
332   -
333   -The Accept: request header indicates preferences for media types. It
334   -can also include 'wildcard' media types, such as "image/*" or "*/*"
335   -where the * matches any string. So a request including:
336   -<pre>
337   - Accept: image/*, */*
338   -</pre>
339   -
340   -would indicate that any type starting "image/" is acceptable,
341   -as is any other type (so the first "image/*" is redundant). Some
342   -browsers routinely send wildcards in addition to explicit types they
343   -can handle. For example:
344   -<pre>
345   - Accept: text/html, text/plain, image/gif, image/jpeg, */*
346   -</pre>
347   -
348   -The intention of this is to indicate that the explicitly
349   -listed types are preferred, but if a different representation is
350   -available, that is ok too. However under the basic algorithm, as given
351   -above, the */* wildcard has exactly equal preference to all the other
352   -types, so they are not being preferred. The browser should really have
353   -sent a request with a lower quality (preference) value for */*, such
354   -as:
355   -<pre>
356   - Accept: text/html, text/plain, image/gif, image/jpeg, */*; q=0.01
357   -</pre>
358   -
359   -The explicit types have no quality factor, so they default to a
360   -preference of 1.0 (the highest). The wildcard */* is given
361   -a low preference of 0.01, so other types will only be returned if
362   -no variant matches an explicitly listed type.
363   -<p>
364   -
365   -If the Accept: header contains <i>no</i> q factors at all, Apache sets
366   -the q value of "*/*", if present, to 0.01 to emulate the desired
367   -behaviour. It also sets the q value of wildcards of the format
368   -"type/*" to 0.02 (so these are preferred over matches against
369   -"*/*"). If any media type on the Accept: header contains a q factor,
370   -these special values are <i>not</i> applied, so requests from browsers
371   -which send the correct information to start with work as expected.
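The wildcard adjustment described in this section can be sketched as a small Accept parser. This is an illustration under stated assumptions, not Apache's implementation: `parse_accept` is a hypothetical name, only the `q` parameter is interpreted, and other media type parameters are ignored.

```python
def parse_accept(header: str) -> dict:
    """Parse an Accept header into {media_range: q}.

    If no item carries an explicit q value, "*/*" is lowered to 0.01
    and "type/*" wildcards to 0.02, as described in the text above.
    """
    ranges = {}
    saw_q = False
    for item in header.split(","):
        parts = [part.strip() for part in item.split(";")]
        media_range = parts[0]
        if not media_range:
            continue
        q = 1.0  # Items without a q parameter default to 1.0.
        for param in parts[1:]:
            key, _, value = param.partition("=")
            if key.strip().lower() == "q":
                q = float(value)
                saw_q = True
        ranges[media_range] = q
    if not saw_q:
        for media_range in ranges:
            if media_range == "*/*":
                ranges[media_range] = 0.01
            elif media_range.endswith("/*"):
                ranges[media_range] = 0.02
    return ranges


if __name__ == "__main__":
    print(parse_accept("text/html, text/plain, image/gif, image/jpeg, */*"))
    print(parse_accept("text/html, */*; q=0.5"))
```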
372   -
373   -<h3>Variants with no Language</h3>
374   -
375   -If some of the variants for a particular resource have a language
376   -attribute, and some do not, those variants with no language
377   -are given a very low language quality factor of 0.001.<p>
378   -
379   -The reason for setting this language quality factor for
380   -variants with no language to a very low value is to allow
381   -for a default variant which can be supplied if none of the
382   -other variants match the browser's language preferences.
383   -
384   -For example, consider the situation with three variants:
385   -
386   -<ul>
387   -<li>foo.en.html, language en
388   -<li>foo.fr.html, language fr
389   -<li>foo.html, no language
390   -</ul>
391   -
392   -The meaning of a variant with no language is that it is
393   -always acceptable to the browser. If the request Accept-Language
394   -header includes either en or fr (or both) one of foo.en.html
395   -or foo.fr.html will be returned. If the browser does not list
396   -either en or fr as acceptable, foo.html will be returned instead.
397   -
398   -<h2>Note on Caching</h2>
399   -
400   -When a cache stores a document, it associates it with the request URL.
401   -The next time that URL is requested, the cache can use the stored
402   -document, provided it is still within date. But if the resource is
403   -subject to content negotiation at the server, this would result in
404   -only the first requested variant being cached, and subsequent cache
405   -hits could return the wrong response. To prevent this,
406   -Apache normally marks all responses that are returned after content negotiation
407   -as non-cacheable by HTTP/1.0 clients. Apache also supports the HTTP/1.1
408   -protocol features to allow caching of negotiated responses. <P>
409   -
410   -For requests which come from an HTTP/1.0 compliant client (either a
411   -browser or a cache), the directive <tt>CacheNegotiatedDocs</tt> can be
412   -used to allow caching of responses which were subject to negotiation.
413   -This directive can be given in the server config or virtual host, and
414   -takes no arguments. It has no effect on requests from HTTP/1.1
415   -clients.
416   -
417   -<!--#include virtual="footer.html" -->
418   -</BODY>
419   -</HTML>
141 docs/manual/custom-error.html.en
... ... @@ -1,141 +0,0 @@
1   -<HTML>
2   -<HEAD>
3   -<TITLE>Custom error responses</TITLE>
4   -</HEAD>
5   -
6   -<BODY>
7   -<!--#include virtual="header.html" -->
8   -<H1>Custom error responses</H1>
9   -
10   -<DL>
11   -
12   -<DT>Purpose
13   -
14   - <DD>Additional functionality. Allows webmasters to configure the response of
15   - Apache to some error or problem.
16   -
17   - <P>Customizable responses can be defined to be activated in the
18   - event of a server detected error or problem.
19   -
20   - <P>e.g. if a script crashes and produces a "500 Server Error"
21   - response, then this response can be replaced with either some
22   - friendlier text or by a redirection to another URL (local or
23   - external).
24   -
25   - <P>
26   -
27   -<DT>Old behavior
28   -
29   - <DD>NCSA httpd 1.3 would return some boring old error/problem message
30   - which would often be meaningless to the user, and would provide no
31   - means of logging the symptoms which caused it.<BR>
32   -
33   - <P>
34   -
35   -<DT>New behavior
36   -
37   - <DD>The server can be asked to:
38   - <OL>
39   - <LI>Display some other text, instead of the NCSA hard coded messages, or
40   - <LI>redirect to a local URL, or
41   - <LI>redirect to an external URL.
42   - </OL>
43   -
44   - <P>Redirecting to another URL can be useful, but only if some information
45   - can be passed which can then be used to explain and/or log the error/problem
46   - more clearly.
47   -
48   - <P>To achieve this, Apache will define new CGI-like environment
49   - variables, e.g.
50   -
51   - <blockquote><code>
52   -REDIRECT_HTTP_ACCEPT=*/*, image/gif, image/x-xbitmap, image/jpeg <br>
53   -REDIRECT_HTTP_USER_AGENT=Mozilla/1.1b2 (X11; I; HP-UX A.09.05 9000/712) <br>
54   -REDIRECT_PATH=.:/bin:/usr/local/bin:/etc <br>
55   -REDIRECT_QUERY_STRING= <br>
56   -REDIRECT_REMOTE_ADDR=121.345.78.123 <br>
57   -REDIRECT_REMOTE_HOST=ooh.ahhh.com <br>
58   -REDIRECT_SERVER_NAME=crash.bang.edu <br>
59   -REDIRECT_SERVER_PORT=80 <br>
60   -REDIRECT_SERVER_SOFTWARE=Apache/0.8.15 <br>
61   -REDIRECT_URL=/cgi-bin/buggy.pl <br>
62   - </code></blockquote>
63   -
64   - <P>note the <code>REDIRECT_</code> prefix.
65   -
66   - <P>At least <code>REDIRECT_URL</code> and <code>REDIRECT_QUERY_STRING</code> will
67   - be passed to the new URL (assuming it's a cgi-script or a cgi-include). The
68   - other variables will exist only if they existed prior to the error/problem.<p>
69   -
70   -<DT>Configuration
71   -
72   - <DD> Use of "ErrorDocument" is enabled for .htaccess files when the
73   - <A HREF="mod/core.html#allowoverride">"FileInfo" override</A> is allowed.
74   -
75   - <P>Here are some examples...
76   -
77   - <blockquote><code>
78   -ErrorDocument 500 /cgi-bin/crash-recover <br>
79   -ErrorDocument 500 "Sorry, our script crashed. Oh dear<br>
80   -ErrorDocument 500 http://xxx/ <br>
81   -ErrorDocument 404 /Lame_excuses/not_found.html <br>
82   -ErrorDocument 401 /Subscription/how_to_subscribe.html
83   - </code></blockquote>
84   -
85   - <P>The syntax is,
86   -
87   - <P><code><A HREF="mod/core.html#errordocument">ErrorDocument</A></code>
88   -&lt;3-digit-code&gt; action
89   -
90   - <P>where the action can be,
91   -
92   - <OL>
93   - <LI>Text to be displayed. Prefix the text with a quote (&quot;). Whatever
94   - follows the quote is displayed. <em>Note: the (&quot;) prefix isn't
95   - displayed.</em>
96   -
97   - <LI>An external URL to redirect to.
98   -
99   - <LI>A local URL to redirect to.
100   -
101   - </OL>
102   -</DL>
103   -
104   -<P><HR><P>
105   -
106   -<h2>Custom error responses and redirects</H2>
107   -
108   -<DL>
109   -
110   -<DT>Purpose
111   -
112   - <DD>Apache's behavior for redirected URLs has been modified so that additional
113   - environment variables are available to a script/server-include.<p>
114   -
115   -<DT>Old behavior
116   -
117   - <DD>Standard CGI vars were made available to a script which has been
118   - redirected to. No indication of where the redirection came from was provided.
119   -
120   - <p>
121   -
122   -<DT>New behavior
123   - <DD>
124   -
125   -A new batch of environment variables will be initialized for use by a
126   -script which has been redirected to. Each new variable will have the
127   -prefix <code>REDIRECT_</code>. <code>REDIRECT_</code> environment
128   -variables are created from the CGI environment variables which existed
129   -prior to the redirect; they are renamed with a <code>REDIRECT_</code>
130   -prefix, e.g. <code>HTTP_USER_AGENT</code> becomes
131   -<code>REDIRECT_HTTP_USER_AGENT</code>. In addition to these new
132   -variables, Apache will define <code>REDIRECT_URL</code> and
133   -<code>REDIRECT_STATUS</code> to help the script trace its origin.
134   -Both the original URL and the URL being redirected to can be logged in
135   -the access log.
136   -
137   -</DL>
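As a rough illustration of how a redirected-to script might use these variables, here is a minimal C sketch. The function names <code>format_redirect_note</code> and <code>redirect_note_from_env</code> are hypothetical, invented for this example; only the variable names <code>REDIRECT_URL</code> and <code>REDIRECT_STATUS</code> come from the text above. It assumes the variables may be missing or empty and falls back to a placeholder in that case.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: format a one-line note saying where an internal
 * redirect came from. A NULL or empty value is reported as "(unknown)". */
int format_redirect_note(const char *url, const char *status,
                         char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "redirected from %s (status %s)",
                    (url != NULL && strlen(url) > 0) ? url : "(unknown)",
                    (status != NULL && strlen(status) > 0) ? status : "(unknown)");
}

/* Hypothetical wrapper: read the two variables from the CGI environment. */
int redirect_note_from_env(char *buf, size_t len)
{
    return format_redirect_note(getenv("REDIRECT_URL"),
                                getenv("REDIRECT_STATUS"), buf, len);
}
```

A CGI error script could call <code>redirect_note_from_env</code> and write the result into its own log or into the page it returns.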
138   -
139   -<!--#include virtual="footer.html" -->
140   -</BODY>
141   -</HTML>
986 docs/manual/developer/API.html
... ... @@ -1,986 +0,0 @@
1   -<html><head>
2   -<title>Apache API notes</title>
3   -</head>
4   -<body>
5   -<!--#include virtual="header.html" -->
6   -<h1>Apache API notes</h1>
7   -
8   -These are some notes on the Apache API and the data structures you
9   -have to deal with, etc. They are not yet nearly complete, but
10   -hopefully, they will help you get your bearings. Keep in mind that
11   -the API is still subject to change as we gain experience with it.
12   -(See the TODO file for what <em>might</em> be coming). However,
13   -it will be easy to adapt modules to any changes that are made.
14   -(We have more modules to adapt than you do).
15   -<p>
16   -
17   -A few notes on general pedagogical style here. In the interest of
18   -conciseness, all structure declarations here are incomplete --- the
19   -real ones have more slots that I'm not telling you about. For the
20   -most part, these are reserved to one component of the server core or
21   -another, and should be altered by modules with caution. However, in
22   -some cases, they really are things I just haven't gotten around to
23   -yet. Welcome to the bleeding edge.<p>
24   -
25   -Finally, here's an outline, to give you some bare idea of what's
26   -coming up, and in what order:
27   -
28   -<ul>
29   -<li> <a href="#basics">Basic concepts.</a>
30   -<menu>
31   - <li> <a href="#HMR">Handlers, Modules, and Requests</a>
32   - <li> <a href="#moduletour">A brief tour of a module</a>
33   -</menu>
34   -<li> <a href="#handlers">How handlers work</a>
35   -<menu>
36   - <li> <a href="#req_tour">A brief tour of the <code>request_rec</code></a>
37   - <li> <a href="#req_orig">Where request_rec structures come from</a>
38   - <li> <a href="#req_return">Handling requests, declining, and returning error codes</a>
39   - <li> <a href="#resp_handlers">Special considerations for response handlers</a>
40   - <li> <a href="#auth_handlers">Special considerations for authentication handlers</a>
41   - <li> <a href="#log_handlers">Special considerations for logging handlers</a>
42   -</menu>
43   -<li> <a href="#pools">Resource allocation and resource pools</a>
44   -<li> <a href="#config">Configuration, commands and the like</a>
45   -<menu>
46   - <li> <a href="#per-dir">Per-directory configuration structures</a>
47   - <li> <a href="#commands">Command handling</a>
48   - <li> <a href="#servconf">Side notes --- per-server configuration, virtual servers, etc.</a>
49   -</menu>
50   -</ul>
51   -
52   -<h2><a name="basics">Basic concepts.</a></h2>
53   -
54   -We begin with an overview of the basic concepts behind the
55   -API, and how they are manifested in the code.
56   -
57   -<h3><a name="HMR">Handlers, Modules, and Requests</a></h3>
58   -
59   -Apache breaks down request handling into a series of steps, more or
60   -less the same way the Netscape server API does (although this API has
61   -a few more stages than NetSite does, as hooks for stuff I thought
62   -might be useful in the future). These are:
63   -
64   -<ul>
65   - <li> URI -&gt; Filename translation
66   - <li> Auth ID checking [is the user who they say they are?]
67   - <li> Auth access checking [is the user authorized <em>here</em>?]
68   - <li> Access checking other than auth
69   - <li> Determining MIME type of the object requested
70   - <li> `Fixups' --- there aren't any of these yet, but the phase is
71   - intended as a hook for possible extensions like
72   - <code>SetEnv</code>, which don't really fit well elsewhere.
73   - <li> Actually sending a response back to the client.
74   - <li> Logging the request
75   -</ul>
76   -
77   -These phases are handled by looking at each of a succession of
78   -<em>modules</em>, looking to see if each of them has a handler for the
79   -phase, and attempting to invoke it if so. The handler can typically do
80   -one of three things:
81   -
82   -<ul>
83   - <li> <em>Handle</em> the request, and indicate that it has done so
84   - by returning the magic constant <code>OK</code>.
85   - <li> <em>Decline</em> to handle the request, by returning the magic
86   - integer constant <code>DECLINED</code>. In this case, the
87   - server behaves in all respects as if the handler simply hadn't
88   - been there.
89   - <li> Signal an error, by returning one of the HTTP error codes.
90   - This terminates normal handling of the request, although an
91   - ErrorDocument may be invoked to try to mop up, and it will be
92   - logged in any case.
93   -</ul>
94   -
95   -Most phases are terminated by the first module that handles them;
96   -however, for logging, `fixups', and non-access authentication
97   -checking, all handlers always run (barring an error). Also, the
98   -response phase is unique in that modules may declare multiple handlers
99   -for it, via a dispatch table keyed on the MIME type of the requested
100   -object. Modules may declare a response-phase handler which can handle
101   -<em>any</em> request, by giving it the key <code>*/*</code> (i.e., a
102   -wildcard MIME type specification). However, wildcard handlers are
103   -only invoked if the server has already tried and failed to find a more
104   -specific response handler for the MIME type of the requested object
105   -(either none existed, or they all declined).<p>
106   -
107   -The handlers themselves are functions of one argument (a
108   -<code>request_rec</code> structure, vide infra), which return an
109   -integer, as above.<p>
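To make the dispatch rule concrete, here is a small sketch of how a phase that stops at the first module to handle it could be driven. This is not the server's actual code: <code>run_first_match_phase</code>, the one-field <code>request_rec</code>, and the three sample handlers are all stand-ins invented for the example, and the numeric values given to <code>OK</code> and <code>DECLINED</code> are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values, for illustration only. */
#define OK        0
#define DECLINED -1

/* Stand-in for the real structure, which has many more fields. */
typedef struct request_rec { const char *uri; } request_rec;
typedef int (*phase_handler)(request_rec *);

/* Three trivial handlers, one per possible outcome. */
int decline_all(request_rec *r) { (void)r; return DECLINED; }
int handle_all(request_rec *r)  { (void)r; return OK; }
int not_found(request_rec *r)   { (void)r; return 404; }

/* Try each module's handler for one phase, in order. The first result
 * other than DECLINED (OK or an HTTP error code) ends the phase. If every
 * handler declines, or no module has one, the result is DECLINED. */
int run_first_match_phase(phase_handler *handlers, size_t n, request_rec *r)
{
    size_t i;
    assert(r != NULL);
    for (i = 0; i < n; i++) {
        int rv;
        if (handlers[i] == NULL)
            continue;               /* this module has no handler here */
        rv = handlers[i](r);
        if (rv != DECLINED)
            return rv;
    }
    return DECLINED;
}
```

The phases in which all handlers always run (logging, fixups) would need a different loop, one that continues past <code>OK</code> and stops only on an error.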
110   -
111   -<h3><a name="moduletour">A brief tour of a module</a></h3>
112   -
113   -At this point, we need to explain the structure of a module. Our
114   -candidate will be one of the messier ones, the CGI module --- this
115   -handles both CGI scripts and the <code>ScriptAlias</code> config file
116   -command. It's actually a great deal more complicated than most
117   -modules, but if we're going to have only one example, it might as well
118   -be the one with its fingers in every place.<p>
119   -
120   -Let's begin with handlers. In order to handle the CGI scripts, the
121   -module declares a response handler for them. Because of
122   -<code>ScriptAlias</code>, it also has handlers for the name
123   -translation phase (to recognise <code>ScriptAlias</code>ed URIs) and the
124   -type-checking phase (any <code>ScriptAlias</code>ed request is typed
125   -as a CGI script).<p>
126   -
127   -The module needs to maintain some per (virtual)
128   -server information, namely, the <code>ScriptAlias</code>es in effect;
129   -the module structure therefore contains pointers to a function which
130   -builds these structures, and to another which combines two of them (in
131   -case the main server and a virtual server both have
132   -<code>ScriptAlias</code>es declared).<p>
133   -
134   -Finally, this module contains code to handle the
135   -<code>ScriptAlias</code> command itself. This particular module only
136   -declares one command, but there could be more, so modules have
137   -<em>command tables</em> which declare their commands, and describe
138   -where they are permitted, and how they are to be invoked. <p>
139   -
140   -A final note on the declared types of the arguments of some of these
141   -commands: a <code>pool</code> is a pointer to a <em>resource pool</em>
142   -structure; these are used by the server to keep track of the memory
143   -which has been allocated, files opened, etc., either to service a
144   -particular request, or to handle the process of configuring itself.
145   -That way, when the request is over (or, for the configuration pool,
146   -when the server is restarting), the memory can be freed, and the files
147   -closed, <i>en masse</i>, without anyone having to write explicit code to
148   -track them all down and dispose of them. Also, a
149   -<code>cmd_parms</code> structure contains various information about
150   -the config file being read, and other status information, which is
151   -sometimes of use to the function which processes a config-file command
152   -(such as <code>ScriptAlias</code>).<p>
153   -
154   -With no further ado, the module itself:
155   -
156   -<pre>
157   -/* Declarations of handlers. */
158   -
159   -int translate_scriptalias (request_rec *);
160   -int type_scriptalias (request_rec *);
161   -int cgi_handler (request_rec *);
162   -
163   -/* Subsidiary dispatch table for response-phase handlers, by MIME type */
164   -
165   -handler_rec cgi_handlers[] = {
166   -{ "application/x-httpd-cgi", cgi_handler },
167   -{ NULL }
168   -};
169   -
170   -/* Declarations of routines to manipulate the module's configuration
171   - * info. Note that these are returned, and passed in, as void *'s;
172   - * the server core keeps track of them, but it doesn't, and can't,
173   - * know their internal structure.
174   - */
175   -
176   -void *make_cgi_server_config (pool *);
177   -void *merge_cgi_server_config (pool *, void *, void *);
178   -
179   -/* Declarations of routines to handle config-file commands */
180   -
181   -extern char *script_alias(cmd_parms *, void *per_dir_config, char *fake,
182   - char *real);
183   -
184   -command_rec cgi_cmds[] = {
185   -{ "ScriptAlias", script_alias, NULL, RSRC_CONF, TAKE2,
186   - "a fakename and a realname"},
187   -{ NULL }
188   -};
189   -
190   -module cgi_module = {
191   - STANDARD_MODULE_STUFF,
192   - NULL, /* initializer */
193   - NULL, /* dir config creator */
194   - NULL, /* dir merger --- default is to override */
195   - make_cgi_server_config, /* server config */
196   - merge_cgi_server_config, /* merge server config */
197   - cgi_cmds, /* command table */
198   - cgi_handlers, /* handlers */
199   - translate_scriptalias, /* filename translation */
200   - NULL, /* check_user_id */
201   - NULL, /* check auth */
202   - NULL, /* check access */
203   - type_scriptalias, /* type_checker */
204   - NULL, /* fixups */
205   - NULL /* logger */
206   -};
207   -</pre>
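The resource pool idea mentioned before the listing (allocate freely, then release everything in one go) can be sketched in a few lines. This is a toy, not the server's allocator: <code>toy_pool</code>, <code>toy_palloc</code> and <code>toy_clear_pool</code> are names made up for the example, and the sketch tracks memory only, not open files.

```c
#include <assert.h>
#include <stdlib.h>

/* Header placed in front of every allocation. The union members exist
 * only to keep the memory that follows the header suitably aligned. */
typedef union toy_block {
    union toy_block *next;
    long double      align_ld;
    long long        align_ll;
} toy_block;

typedef struct toy_pool {
    toy_block *head;   /* most recent allocation, or NULL */
    size_t     live;   /* number of allocations not yet released */
} toy_pool;

/* Allocate n bytes and remember the block in the pool. */
void *toy_palloc(toy_pool *p, size_t n)
{
    toy_block *b;
    assert(p != NULL);
    b = malloc(sizeof(toy_block) + n);
    if (b == NULL)
        return NULL;
    b->next = p->head;
    p->head = b;
    p->live++;
    return b + 1;      /* usable memory starts just after the header */
}

/* Release everything the pool has handed out, en masse. */
void toy_clear_pool(toy_pool *p)
{
    assert(p != NULL);
    while (p->head != NULL) {
        toy_block *next = p->head->next;
        free(p->head);
        p->head = next;
    }
    p->live = 0;
}
```

The point of the design is that callers never free individual allocations; whoever owns the pool clears it when the request (or configuration pass) is over.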
208   -
209   -<h2><a name="handlers">How handlers work</a></h2>
210   -
211   -The sole argument to handlers is a <code>request_rec</code> structure.
212   -This structure describes a particular request which has been made to
213   -the server, on behalf of a client. In most cases, each connection to
214   -the client generates only one <code>request_rec</code> structure.<p>
215   -
216   -<h3><a name="req_tour">A brief tour of the <code>request_rec</code></a></h3>
217   -
218   -The <code>request_rec</code> contains pointers to a resource pool
219   -which will be cleared when the server is finished handling the
220   -request; to structures containing per-server and per-connection
221   -information, and most importantly, information on the request itself.<p>
222   -
223   -The most important such information is a small set of character
224   -strings describing attributes of the object being requested, including
225   -its URI, filename, content-type and content-encoding (these being filled
226   -in by the translation and type-check handlers which handle the
227   -request, respectively). <p>
228   -
229   -Other commonly used data items are tables giving the MIME headers on
230   -the client's original request, MIME headers to be sent back with the
231   -response (which modules can add to at will), and environment variables
232   -for any subprocesses which are spawned off in the course of servicing
233   -the request. These tables are manipulated using the
234   -<code>table_get</code> and <code>table_set</code> routines. <p>
235   -
236   -Finally, there are pointers to two data structures which, in turn,
237   -point to per-module configuration structures. Specifically, these
238   -hold pointers to the data structures which the module has built to
239   -describe the way it has been configured to operate in a given
240   -directory (via <code>.htaccess</code> files or
241   -<code>&lt;Directory&gt;</code> sections), and for private data it has
242   -built in the course of servicing the request (so modules' handlers for
243   -one phase can pass `notes' to their handlers for other phases). There
244   -is another such configuration vector in the <code>server_rec</code>
245   -data structure pointed to by the <code>request_rec</code>, which
246   -contains per (virtual) server configuration data.<p>
247   -
248   -Here is an abridged declaration, giving the fields most commonly used:<p>
249   -
250   -<pre>
251   -struct request_rec {
252   -
253   - pool *pool;
254   - conn_rec *connection;
255   - server_rec *server;
256   -
257   - /* What object is being requested */
258   -
259   - char *uri;
260   - char *filename;
261   - char *path_info;
262   - char *args; /* QUERY_ARGS, if any */
263   - struct stat finfo; /* Set by server core;
264   - * st_mode set to zero if no such file */
265   -
266   - char *content_type;
267   - char *content_encoding;
268   -
269   - /* MIME header environments, in and out. Also, an array containing
270   - * environment variables to be passed to subprocesses, so people can
271   - * write modules to add to that environment.
272   - *
273   - * The difference between headers_out and err_headers_out is that
274   - * the latter are printed even on error, and persist across internal
275   - * redirects (so the headers printed for ErrorDocument handlers will
276   - * have them).
277   - */
278   -
279   - table *headers_in;
280   - table *headers_out;
281   - table *err_headers_out;
282   - table *subprocess_env;
283   -
284   - /* Info about the request itself... */
285   -
286   - int header_only; /* HEAD request, as opposed to GET */
287   - char *protocol; /* Protocol, as given to us, or HTTP/0.9 */
288   - char *method; /* GET, HEAD, POST, etc. */
289   - int method_number; /* M_GET, M_POST, etc. */
290   -
291   - /* Info for logging */
292   -
293   - char *the_request;
294   - int bytes_sent;
295   -
296   - /* A flag which modules can set, to indicate that the data being
297   - * returned is volatile, and clients should be told not to cache it.
298   - */
299   -
300   - int no_cache;
301   -
302   - /* Various other config info which may change with .htaccess files
303   - * These are config vectors, with one void* pointer for each module
304   - * (the thing pointed to being the module's business).
305   - */
306   -
307   - void *per_dir_config; /* Options set in config files, etc. */
308   - void *request_config; /* Notes on *this* request */
309   -
310   -};
311   -
312   -</pre>
313   -
314   -<h3><a name="req_orig">Where request_rec structures come from</a></h3>
315   -
316   -Most <code>request_rec</code> structures are built by reading an HTTP
317   -request from a client, and filling in the fields. However, there are
318   -a few exceptions:
319   -
320   -<ul>
321   - <li> If the request is to an imagemap, a type map (i.e., a
322   - <code>*.var</code> file), or a CGI script which returned a
323   - local `Location:', then the resource which the user requested
324   - is going to be ultimately located by some URI other than what
325   - the client originally supplied. In this case, the server does
326   - an <em>internal redirect</em>, constructing a new
327   - <code>request_rec</code> for the new URI, and processing it
328   - almost exactly as if the client had requested the new URI
329   - directly. <p>
330   -
331   - <li> If some handler signaled an error, and an
332   - <code>ErrorDocument</code> is in scope, the same internal
333   - redirect machinery comes into play.<p>
334   -
335   - <li> Finally, a handler occasionally needs to investigate `what
336   - would happen if' some other request were run. For instance,
337   - the directory indexing module needs to know what MIME type
338   - would be assigned to a request for each directory entry, in
339   - order to figure out what icon to use.<p>
340   -
341   - Such handlers can construct a <em>sub-request</em>, using the
342   - functions <code>sub_req_lookup_file</code> and
343   - <code>sub_req_lookup_uri</code>; this constructs a new
344   - <code>request_rec</code> structure and processes it as you
345   - would expect, up to but not including the point of actually
346   - sending a response. (These functions skip over the access
347   - checks if the sub-request is for a file in the same directory
348   - as the original request).<p>
349   -
350   - (Server-side includes work by building sub-requests and then
351   - actually invoking the response handler for them, via the
352   - function <code>run_sub_request</code>).
353   -</ul>
354   -
355   -<h3><a name="req_return">Handling requests, declining, and returning error codes</a></h3>
356   -
357   -As discussed above, each handler, when invoked to handle a particular
358   -<code>request_rec</code>, has to return an <code>int</code> to
359   -indicate what happened. That can either be
360   -
361   -<ul>
362   - <li> OK --- the request was handled successfully. This may or may
363   - not terminate the phase.
364   - <li> DECLINED --- no erroneous condition exists, but the module
365   - declines to handle the phase; the server tries to find another.
366   - <li> an HTTP error code, which aborts handling of the request.
367   -</ul>
368   -
369   -Note that if the error code returned is <code>REDIRECT</code>, then
370   -the module should put a <code>Location</code> in the request's
371   -<code>headers_out</code>, to indicate where the client should be
372   -redirected <em>to</em>. <p>
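As an illustration of the <code>REDIRECT</code> case, here is a sketch of a handler that sends clients of one retired URI elsewhere. Everything structural in it is a stand-in: the single-entry <code>table</code>, the two-field <code>request_rec</code>, the simplified <code>table_set</code> and <code>table_get</code>, the handler name <code>old_docs_redirect</code>, the example URIs, and the numeric values of the constants are all assumptions made so the fragment stands alone.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed values, for illustration only. */
#define REDIRECT 302
#define DECLINED  -1

/* Toy table holding a single key/value pair; the real one holds many. */
typedef struct { const char *key; const char *val; } table;

/* Stand-in for the real structure, which has many more fields. */
typedef struct request_rec {
    const char *uri;
    table      *headers_out;
} request_rec;

void table_set(table *t, const char *key, const char *val)
{
    t->key = key;
    t->val = val;
}

const char *table_get(const table *t, const char *key)
{
    return (t->key != NULL && strcmp(t->key, key) == 0) ? t->val : NULL;
}

/* Hypothetical handler: redirect requests for /old-docs/ and decline
 * everything else, leaving other requests to other modules. */
int old_docs_redirect(request_rec *r)
{
    assert(r != NULL && r->uri != NULL && r->headers_out != NULL);
    if (strcmp(r->uri, "/old-docs/") != 0)
        return DECLINED;
    /* The Location header tells the client where to go instead. */
    table_set(r->headers_out, "Location", "http://www.example.com/docs/");
    return REDIRECT;
}
```

Note that the handler sets the header before returning the code; the code alone does not tell the client where to go.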
373   -
374   -<h3><a name="resp_handlers">Special considerations for response handlers</a></h3>
375   -
376   -Handlers for most phases do their work by simply setting a few fields
377   -in the <code>request_rec</code> structure (or, in the case of access
378   -checkers, simply by returning the correct error code). However,
379   -response handlers have to actually send a response back to the client. <p>
380   -
381   -They should begin by sending an HTTP response header, using the
382   -function <code>send_http_header</code>. (You don't have to do
383   -anything special to skip sending the header for HTTP/0.9 requests; the
384   -function figures out on its own that it shouldn't do anything). If
385   -the request is marked <code>header_only</code>, that's all they should
386   -do; they should return after that, without attempting any further
387   -output. <p>
388   -
389   -Otherwise, they should produce a response body which answers the
390   -client as appropriate. The primitives for this are <code>rputc</code>
391   -and <code>rprintf</code>, for internally generated output, and
392   -<code>send_fd</code>, to copy the contents of some <code>FILE *</code>
393   -straight to the client. <p>
394   -
395   -At this point, you should more or less understand the following piece
396   -of code, which is the handler which handles <code>GET</code> requests
397   -which have no more specific handler; it also shows how conditional
398   -<code>GET</code>s can be handled, if it's desirable to do so in a
399   -particular response handler --- <code>set_last_modified</code> checks
400   -against the <code>If-modified-since</code> value supplied by the
401   -client, if any, and returns an appropriate code (which will, if
402   -nonzero, be <code>USE_LOCAL_COPY</code>). No similar considerations apply for
403   -<code>set_content_length</code>, but it returns an error code for
404   -symmetry.<p>
405