Network Working Group D. Robinson
Request for Comments: 3875 K. Coar
Category: Informational The Apache Software Foundation
October 2004
The Common Gateway Interface (CGI) Version 1.1
Status of this Memo
This memo provides information for the Internet community. It does
not specify an Internet standard of any kind. Distribution of this
memo is unlimited.
Copyright Notice
Copyright (C) The Internet Society (2004).
IESG Note
This document is not a candidate for any level of Internet Standard.
The IETF disclaims any knowledge of the fitness of this document for
any purpose, and in particular notes that it has not had IETF review
for such things as security, congestion control or inappropriate
interaction with deployed protocols. The RFC Editor has chosen to
publish this document at its discretion. Readers of this document
should exercise caution in evaluating its value for implementation
and deployment.
Abstract
The Common Gateway Interface (CGI) is a simple interface for running
external programs, software or gateways under an information server
in a platform-independent manner. Currently, the supported
information servers are HTTP servers.
The interface has been in use by the World-Wide Web (WWW) since 1993.
This specification defines the 'current practice' parameters of the
'CGI/1.1' interface developed and documented at the U.S. National
Centre for Supercomputing Applications. This document also defines
the use of the CGI/1.1 interface on UNIX(R) and other, similar
systems.
Robinson & Coar Informational [Page 1]
RFC 3875 CGI Version 1.1 October 2004
Table of Contents
1. Introduction
1.1. Purpose
1.2. Requirements
1.3. Specifications
1.4. Terminology
2. Notational Conventions and Generic Grammar
2.1. Augmented BNF
2.2. Basic Rules
2.3. URL Encoding
3. Invoking the Script
3.1. Server Responsibilities
3.2. Script Selection
3.3. The Script-URI
3.4. Execution
4. The CGI Request
4.1. Request Meta-Variables
4.1.1. AUTH_TYPE
4.1.2. CONTENT_LENGTH
4.1.3. CONTENT_TYPE
4.1.4. GATEWAY_INTERFACE
4.1.5. PATH_INFO
4.1.6. PATH_TRANSLATED
4.1.7. QUERY_STRING
4.1.8. REMOTE_ADDR
4.1.9. REMOTE_HOST
4.1.10. REMOTE_IDENT
4.1.11. REMOTE_USER
4.1.12. REQUEST_METHOD
4.1.13. SCRIPT_NAME
4.1.14. SERVER_NAME
4.1.15. SERVER_PORT
4.1.16. SERVER_PROTOCOL
4.1.17. SERVER_SOFTWARE
4.1.18. Protocol-Specific Meta-Variables
4.2. Request Message-Body
4.3. Request Methods
4.3.1. GET
4.3.2. POST
4.3.3. HEAD
4.3.4. Protocol-Specific Methods
4.4. The Script Command Line
5. NPH Scripts
5.1. Identification
5.2. NPH Response
6. CGI Response
6.1. Response Handling
6.2. Response Types
6.2.1. Document Response
6.2.2. Local Redirect Response
6.2.3. Client Redirect Response
6.2.4. Client Redirect Response with Document
6.3. Response Header Fields
6.3.1. Content-Type
6.3.2. Location
6.3.3. Status
6.3.4. Protocol-Specific Header Fields
6.3.5. Extension Header Fields
6.4. Response Message-Body
7. System Specifications
7.1. AmigaDOS
7.2. UNIX
7.3. EBCDIC/POSIX
8. Implementation
8.1. Recommendations for Servers
8.2. Recommendations for Scripts
9. Security Considerations
9.1. Safe Methods
9.2. Header Fields Containing Sensitive Information
9.3. Data Privacy
9.4. Information Security Model
9.5. Script Interference with the Server
9.6. Data Length and Buffering Considerations
9.7. Stateless Processing
9.8. Relative Paths
9.9. Non-parsed Header Output
10. Acknowledgements
11. References
11.1. Normative References
11.2. Informative References
12. Authors' Addresses
13. Full Copyright Statement
1. Introduction
1.1. Purpose
The Common Gateway Interface (CGI) [22] allows an HTTP [1], [4]
server and a CGI script to share responsibility for responding to
client requests. The client request comprises a Uniform Resource
Identifier (URI) [11], a request method and various ancillary
information about the request provided by the transport protocol.
The CGI defines the abstract parameters, known as meta-variables,
which describe a client's request. Together with a concrete
programmer interface this specifies a platform-independent interface
between the script and the HTTP server.
The server is responsible for managing connection, data transfer,
transport and network issues related to the client request, whereas
the CGI script handles the application issues, such as data access
and document processing.
1.2. Requirements
The key words 'MUST', 'MUST NOT', 'REQUIRED', 'SHALL', 'SHALL NOT',
'SHOULD', 'SHOULD NOT', 'RECOMMENDED', 'MAY' and 'OPTIONAL' in this
document are to be interpreted as described in BCP 14, RFC 2119 [3].
An implementation is not compliant if it fails to satisfy one or more
of the 'must' requirements for the protocols it implements. An
implementation that satisfies all of the 'must' and all of the
'should' requirements for its features is said to be 'unconditionally
compliant'; one that satisfies all of the 'must' requirements but not
all of the 'should' requirements for its features is said to be
'conditionally compliant'.
1.3. Specifications
Not all of the functions and features of the CGI are defined in the
main part of this specification. The following phrases are used to
describe the features that are not specified:
'system-defined'
The feature may differ between systems, but must be the same for
different implementations using the same system. A system will
usually identify a class of operating systems. Some systems are
defined in section 7 of this document. New systems may be defined
by new specifications without revision of this document.
'implementation-defined'
The behaviour of the feature may vary from implementation to
implementation; a particular implementation must document its
behaviour.
1.4. Terminology
This specification uses many terms defined in the HTTP/1.1
specification [4]; however, the following terms are used here in a
sense which may not accord with their definitions in that document,
or with their common meaning.
'meta-variable'
A named parameter which carries information from the server to the
script. It is not necessarily a variable in the operating
system's environment, although that is the most common
implementation.
'script'
The software that is invoked by the server according to this
interface. It need not be a standalone program, but could be a
dynamically-loaded or shared library, or even a subroutine in the
server. It might be a set of statements interpreted at run-time,
as the term 'script' is frequently understood, but that is not a
requirement and within the context of this specification the term
has the broader definition stated.
'server'
The application program that invokes the script in order to
service requests from the client.
2. Notational Conventions and Generic Grammar
2.1. Augmented BNF
All of the mechanisms specified in this document are described in
both prose and an augmented Backus-Naur Form (BNF) similar to that
used by RFC 822 [13]. Unless stated otherwise, the elements are
case-sensitive. This augmented BNF contains the following
constructs:
name = definition
The name of a rule and its definition are separated by the equals
character ('='). Whitespace is only significant in that
continuation lines of a definition are indented.
"literal"
Double quotation marks (") surround literal text, except for a
literal quotation mark, which is surrounded by angle-brackets ('<'
and '>').
rule1 | rule2
Alternative rules are separated by a vertical bar ('|').
(rule1 rule2 rule3)
Elements enclosed in parentheses are treated as a single element.
*rule
A rule preceded by an asterisk ('*') may have zero or more
occurrences. The full form is 'n*m rule' indicating at least n
and at most m occurrences of the rule. n and m are optional
decimal values with default values of 0 and infinity respectively.
[rule]
An element enclosed in square brackets ('[' and ']') is optional,
and is equivalent to '*1 rule'.
N rule
A rule preceded by a decimal number represents exactly N
occurrences of the rule. It is equivalent to 'N*N rule'.
2.2. Basic Rules
This specification uses a BNF-like grammar defined in terms of
characters. Unlike many specifications which define the bytes
allowed by a protocol, here each literal in the grammar corresponds
to the character it represents. How these characters are represented
in terms of bits and bytes within a system is either system-defined
or specified in the particular context. The single exception is the
rule 'OCTET', defined below.
The following rules are used throughout this specification to
describe basic parsing constructs.
alpha         = lowalpha | hialpha
lowalpha      = "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" |
                "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p" |
                "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" |
                "y" | "z"
hialpha       = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" |
                "I" | "J" | "K" | "L" | "M" | "N" | "O" | "P" |
                "Q" | "R" | "S" | "T" | "U" | "V" | "W" | "X" |
                "Y" | "Z"
digit         = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" |
                "8" | "9"
alphanum      = alpha | digit
OCTET         = <any 8-bit byte>
CHAR          = alpha | digit | separator | "!" | "#" | "$" |
                "%" | "&" | "'" | "*" | "+" | "-" | "." | "`" |
                "^" | "_" | "{" | "|" | "}" | "~" | CTL
CTL           = <any control character>
SP            = <space character>
HT            = <horizontal tab character>
NL            = <newline>
LWSP          = SP | HT | NL
separator     = "(" | ")" | "<" | ">" | "@" | "," | ";" | ":" |
                "\" | <"> | "/" | "[" | "]" | "?" | "=" | "{" |
                "}" | SP | HT
token         = 1*<any CHAR except CTLs or separators>
quoted-string = <"> *qdtext <">
qdtext        = <any CHAR except <"> and CTLs but including LWSP>
TEXT          = <any printable character>
Note that newline (NL) need not be a single control character, but
can be a sequence of control characters. A system MAY define TEXT to
be a larger set of characters than <any CHAR excluding CTLs but
including LWSP>.
2.3. URL Encoding
Some variables and constructs used here are described as being
'URL-encoded'. This encoding is described in section 2 of RFC 2396
[2]. In a URL-encoded string an escape sequence consists of a
percent character ("%") followed by two hexadecimal digits, where the
two hexadecimal digits form an octet. An escape sequence represents
the graphic character that has the octet as its code within the
US-ASCII [9] coded character set, if it exists. Currently there is
no provision within the URI syntax to identify which character set
non-ASCII codes represent, so CGI handles this issue on an ad-hoc
basis.
Note that some unsafe (reserved) characters may have different
semantics when encoded. The definition of which characters are
unsafe depends on the context; see section 2 of RFC 2396 [2], updated
by RFC 2732 [7], for an authoritative treatment. These reserved
characters are generally used to provide syntactic structure to the
character string, for example as field separators. In all cases, the
string is first processed with regard to any reserved characters
present, and then the resulting data can be URL-decoded by replacing
"%" escape sequences by their character values.
To encode a character string, all reserved and forbidden characters
are replaced by the corresponding "%" escape sequences. The string
can then be used in assembling a URI. The reserved characters will
vary from context to context, but will always be drawn from this set:
reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" | "$" |
           "," | "[" | "]"
The last two characters were added by RFC 2732 [7]. In any
particular context, a sub-set of these characters will be reserved;
the other characters from this set MUST NOT be encoded when a string
is URL-encoded in that context. Other basic rules used to describe
URI syntax are:
hex        = digit | "A" | "B" | "C" | "D" | "E" | "F" | "a" | "b"
             | "c" | "d" | "e" | "f"
escaped    = "%" hex hex
unreserved = alpha | digit | mark
mark       = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"
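The two directions of URL encoding described in this section can be sketched as follows. This is only an illustration, not part of the interface: the helper names url_encode and split_then_decode are hypothetical, Python's urllib.parse is used for the escaping itself, and non-ASCII characters are assumed to be encoded as UTF-8 (the specification leaves this to ad-hoc handling).

```python
from urllib.parse import quote, unquote

# The full reserved set and the "mark" characters from the grammar above.
RESERVED = ";/?:@&=+$,[]"
MARK = "-_.!~*'()"


def url_encode(text, reserved_here):
    """Escape the characters reserved in this context, plus forbidden ones.

    Members of RESERVED that are not reserved in this context are kept
    literal, since they MUST NOT be encoded. (hypothetical helper)
    """
    keep = "".join(c for c in RESERVED if c not in reserved_here) + MARK
    # Assumption: non-ASCII characters are escaped as UTF-8 octets.
    return quote(text, safe=keep)


def split_then_decode(query):
    """Split on the literal "&" and "=" first, then URL-decode each piece.

    Decoding before splitting would turn an escaped "%26" into a field
    separator and change the meaning of the string. (hypothetical helper)
    """
    pairs = []
    for field in query.split("&"):
        name, _, value = field.partition("=")
        pairs.append((unquote(name), unquote(value)))
    return pairs
```

With ";", "=" and "?" reserved, url_encode leaves "/" untouched and escapes only the ";" in a path such as /this.is.the.path;info. Python emits upper-case hexadecimal digits, which the hex rule above permits.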
3. Invoking the Script
3.1. Server Responsibilities
The server acts as an application gateway. It receives the request
from the client, selects a CGI script to handle the request, converts
the client request to a CGI request, executes the script and converts
the CGI response into a response for the client. When processing the
client request, it is responsible for implementing any protocol or
transport level authentication and security. The server MAY also
function in a 'non-transparent' manner, modifying the request or
response in order to provide some additional service, such as media
type transformation or protocol reduction.
The server MUST perform translations and protocol conversions on the
client request data required by this specification. Furthermore, the
server retains its responsibility to the client to conform to the
relevant network protocol even if the CGI script fails to conform to
this specification.
If the server is applying authentication to the request, then it MUST
NOT execute the script unless the request passes all defined access
controls.
3.2. Script Selection
The server determines which CGI script is to be executed based on a
generic-form URI supplied by the client. This URI includes a
hierarchical path with components separated by "/". For any
particular request, the server will identify all or a leading part of
this path with an individual script, thus placing the script at a
particular point in the path hierarchy. The remainder of the path,
if any, is a resource or sub-resource identifier to be interpreted by
the script.
Information about this split of the path is available to the script
in the meta-variables, described below. Support for non-hierarchical
URI schemes is outside the scope of this specification.
3.3. The Script-URI
The mapping from client request URI to choice of script is defined by
the particular server implementation and its configuration. The
server may allow the script to be identified with a set of several
different URI path hierarchies, and therefore is permitted to replace
the URI by other members of this set during processing and generation
of the meta-variables. The server
1. MAY preserve the URI in the particular client request; or
2. it MAY select a canonical URI from the set of possible values
for each script; or
3. it can implement any other selection of URI from the set.
From the meta-variables thus generated, a URI, the 'Script-URI', can
be constructed. This MUST have the property that if the client had
accessed this URI instead, then the script would have been executed
with the same values for the SCRIPT_NAME, PATH_INFO and QUERY_STRING
meta-variables. The Script-URI has the structure of a generic URI as
defined in section 3 of RFC 2396 [2], with the exception that object
parameters and fragment identifiers are not permitted. The various
components of the Script-URI are defined by some of the
meta-variables (see below):
script-URI = <scheme> "://" <server-name> ":" <server-port>
             <script-path> <extra-path> "?" <query-string>
where <scheme> is found from SERVER_PROTOCOL, and <server-name>,
<server-port> and <query-string> are the values of the respective
meta-variables. The SCRIPT_NAME and PATH_INFO values, URL-encoded
with ";", "=" and "?" reserved, give <script-path> and <extra-path>.
See section 4.1.5 for more information about the PATH_INFO
meta-variable.
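As a rough sketch of the construction above, a script could rebuild its Script-URI from a mapping of meta-variables. The function name script_uri and the dictionary argument are hypothetical. Deriving the scheme by lower-casing the protocol name in SERVER_PROTOCOL, and dropping the "?" when QUERY_STRING is empty, are assumptions of this example rather than requirements of the specification.

```python
from urllib.parse import quote

# Characters left literal when encoding the path: the reserved set minus
# ";", "=" and "?", plus the "mark" characters. Letters, digits and
# "_.-~" are never escaped by quote().
PATH_SAFE = "/:@&+$,[]!*'()"


def script_uri(env):
    """Build the Script-URI from a mapping of meta-variables (hypothetical)."""
    # Assumption: "HTTP/1.1" gives the scheme "http".
    scheme = env["SERVER_PROTOCOL"].split("/", 1)[0].lower()
    # SCRIPT_NAME and PATH_INFO are not URL-encoded, so encode them here.
    path = quote(env.get("SCRIPT_NAME", "") + env.get("PATH_INFO", ""),
                 safe=PATH_SAFE)
    uri = f"{scheme}://{env['SERVER_NAME']}:{env['SERVER_PORT']}{path}"
    # QUERY_STRING is already URL-encoded and is appended unchanged.
    query = env.get("QUERY_STRING", "")
    if query:
        uri += "?" + query
    return uri
```

Because a missing QUERY_STRING and an empty one are indistinguishable to the script, omitting the trailing "?" loses no information.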
The scheme and the protocol are not identical as the scheme
identifies the access method in addition to the application protocol.
For example, a resource accessed using Transport Layer Security (TLS)
[14] would have a request URI with a scheme of https when using the
HTTP protocol [19]. CGI/1.1 provides no generic means for the script
to reconstruct this, and therefore the Script-URI as defined includes
the base protocol used. However, a script MAY make use of
scheme-specific meta-variables to better deduce the URI scheme.
Note that this definition also allows URIs to be constructed which
would invoke the script with any permitted values for the path-info
or query-string, by modifying the appropriate components.
3.4. Execution
The script is invoked in a system-defined manner. Unless specified
otherwise, the file containing the script will be invoked as an
executable program. The server prepares the CGI request as described
in section 4; this comprises the request meta-variables (immediately
available to the script on execution) and request message data. The
request data need not be immediately available to the script; the
script can be executed before all this data has been received by the
server from the client. The response from the script is returned to
the server as described in sections 5 and 6.
In the event of an error condition, the server can interrupt or
terminate script execution at any time and without warning. That
could occur, for example, in the event of a transport failure between
the server and the client; so the script SHOULD be prepared to handle
abnormal termination.
4. The CGI Request
Information about a request comes from two different sources: the
request meta-variables and any associated message-body.
4.1. Request Meta-Variables
Meta-variables contain data about the request passed from the server
to the script, and are accessed by the script in a system-defined
manner. Meta-variables are identified by case-insensitive names;
there cannot be two different variables whose names differ in case
only. Here they are shown using a canonical representation of
capitals plus underscore ("_"). A particular system can define a
different representation.
meta-variable-name = "AUTH_TYPE" | "CONTENT_LENGTH" |
                     "CONTENT_TYPE" | "GATEWAY_INTERFACE" |
                     "PATH_INFO" | "PATH_TRANSLATED" |
                     "QUERY_STRING" | "REMOTE_ADDR" |
                     "REMOTE_HOST" | "REMOTE_IDENT" |
                     "REMOTE_USER" | "REQUEST_METHOD" |
                     "SCRIPT_NAME" | "SERVER_NAME" |
                     "SERVER_PORT" | "SERVER_PROTOCOL" |
                     "SERVER_SOFTWARE" | scheme |
                     protocol-var-name | extension-var-name
protocol-var-name  = ( protocol | scheme ) "_" var-name
scheme             = alpha *( alpha | digit | "+" | "-" | "." )
var-name           = token
extension-var-name = token
Meta-variables with the same name as a scheme, and names beginning
with the name of a protocol or scheme (e.g., HTTP_ACCEPT) are also
defined. The number and meaning of these variables may change
independently of this specification. (See also section 4.1.18.)
The server MAY set additional implementation-defined extension meta-
variables, whose names SHOULD be prefixed with "X_".
This specification does not distinguish between zero-length (NULL)
values and missing values. For example, a script cannot distinguish
between the two requests http://host/script and http://host/script?
as in both cases the QUERY_STRING meta-variable would be NULL.
meta-variable-value = "" | 1*<TEXT, CHAR or tokens of value>
An optional meta-variable may be omitted (left unset) if its value is
NULL. Meta-variable values MUST be considered case-sensitive except
as noted otherwise. The representation of the characters in the
meta-variables is system-defined; the server MUST convert values to
that representation.
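The equivalence of NULL and missing values can be illustrated with a small accessor. This sketch assumes the common case in which meta-variables are passed as operating system environment variables using the canonical upper-case names. The function name meta_variable is hypothetical.

```python
import os


def meta_variable(name, environ=None):
    """Return a meta-variable's value, treating unset and NULL alike.

    Assumption: meta-variables live in the process environment under
    their canonical upper-case names. (hypothetical helper)
    """
    env = os.environ if environ is None else environ
    # Names are case-insensitive, so normalise to the canonical form.
    return env.get(name.upper(), "")
```

A script written this way sees the same empty string for http://host/script and http://host/script? when it asks for QUERY_STRING.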
4.1.1. AUTH_TYPE
The AUTH_TYPE variable identifies any mechanism used by the server to
authenticate the user. It contains a case-insensitive value defined
by the client protocol or server implementation.
For HTTP, if the client request required authentication for external
access, then the server MUST set the value of this variable from the
'auth-scheme' token in the request Authorization header field.
AUTH_TYPE = "" | auth-scheme
auth-scheme = "Basic" | "Digest" | extension-auth
extension-auth = token
HTTP access authentication schemes are described in RFC 2617 [5].
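For illustration only, a server might derive AUTH_TYPE from the Authorization header field along these lines, and a script might compare the value without regard to case. Both function names are hypothetical, and the header value is assumed to have the form of an auth-scheme token followed by whitespace and credentials.

```python
def auth_type_from_header(authorization):
    """Return the auth-scheme token of an Authorization field value.

    Returns "" when the field is empty. (hypothetical helper)
    """
    parts = authorization.split(None, 1)
    return parts[0] if parts else ""


def is_scheme(auth_type, scheme):
    """Compare AUTH_TYPE with a scheme name, ignoring case."""
    return auth_type.lower() == scheme.lower()
```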
4.1.2. CONTENT_LENGTH
The CONTENT_LENGTH variable contains the size of the message-body
attached to the request, if any, as a decimal number of octets. If no
data is attached, then the value is NULL (or unset).
CONTENT_LENGTH = "" | 1*digit
The server MUST set this meta-variable if and only if the request is
accompanied by a message-body entity. The CONTENT_LENGTH value must
reflect the length of the message-body after the server has removed
any transfer-codings or content-codings.
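A script would typically use CONTENT_LENGTH to decide how many octets of message-body to read. The following is a minimal sketch, assuming the body arrives on a binary stream such as standard input (the usual arrangement on UNIX systems). The function name read_request_body is hypothetical.

```python
import io


def read_request_body(environ, stream):
    """Read exactly CONTENT_LENGTH octets of message-body from stream.

    An unset or NULL CONTENT_LENGTH means there is no body.
    (hypothetical helper)
    """
    raw = environ.get("CONTENT_LENGTH", "")
    if raw == "":
        return b""
    # The grammar allows only decimal digits.
    if not (raw.isascii() and raw.isdigit()):
        raise ValueError("CONTENT_LENGTH is not a decimal number: %r" % raw)
    # Read no more than the declared length, even if more data follows.
    return stream.read(int(raw))
```

In a real script the stream argument would usually be sys.stdin.buffer. An in-memory io.BytesIO object behaves the same way for testing.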
4.1.3. CONTENT_TYPE
If the request includes a message-body, the CONTENT_TYPE variable is
set to the Internet Media Type [6] of the message-body.
CONTENT_TYPE = "" | media-type
media-type = type "/" subtype *( ";" parameter )
type = token
subtype = token
parameter = attribute "=" value
attribute = token
value = token | quoted-string
The type, subtype and parameter attribute names are not
case-sensitive. Parameter values may be case sensitive. Media types
and their use in HTTP are described in section 3.7 of the HTTP/1.1
specification [4].
There is no default value for this variable. If and only if it is
unset, then the script MAY attempt to determine the media type from
the data received. If the type remains unknown, then the script MAY
choose to assume a type of application/octet-stream or it may reject
the request with an error (as described in section 6.3.3).
Each media-type defines a set of optional and mandatory parameters.
This may include a charset parameter with a case-insensitive value
defining the coded character set for the message-body. If the
charset parameter is omitted, then the default value should be
derived according to whichever of the following rules is the first to
apply:
1. There MAY be a system-defined default charset for some
media-types.
2. The default for media-types of type "text" is ISO-8859-1 [4].
3. Any default defined in the media-type specification.
4. The default is US-ASCII.
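The order of these rules can be sketched as a lookup function. This is an illustrative reading, not a reference implementation: the function name request_charset and the two optional dictionaries (keyed by lower-case media type) are hypothetical, and the parsing is delegated to Python's email.message.Message.

```python
from email.message import Message


def request_charset(content_type, system_defaults=None,
                    media_type_defaults=None):
    """Return the charset for a CONTENT_TYPE value, in lower case.

    system_defaults and media_type_defaults are hypothetical mappings
    from media type to charset, standing in for rules 1 and 3.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    explicit = msg.get_content_charset()
    if explicit:
        return explicit                      # charset parameter present
    media_type = msg.get_content_type()
    if system_defaults and media_type in system_defaults:
        return system_defaults[media_type]   # rule 1: system default
    if msg.get_content_maintype() == "text":
        return "iso-8859-1"                  # rule 2: type "text"
    if media_type_defaults and media_type in media_type_defaults:
        return media_type_defaults[media_type]  # rule 3: media-type default
    return "us-ascii"                        # rule 4: final fallback
```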
The server MUST set this meta-variable if an HTTP Content-Type field
is present in the client request header. If the server receives a
request with an attached entity but no Content-Type header field, it
MAY attempt to determine the correct content type, otherwise it
should omit this meta-variable.
4.1.4. GATEWAY_INTERFACE
The GATEWAY_INTERFACE variable MUST be set to the dialect of CGI
being used by the server to communicate with the script. Syntax:
GATEWAY_INTERFACE = "CGI" "/" 1*digit "." 1*digit
Note that the major and minor numbers are treated as separate
integers and hence each may be incremented higher than a single
digit. Thus CGI/2.4 is a lower version than CGI/2.13 which in turn
is lower than CGI/12.3. Leading zeros MUST be ignored by the script
and MUST NOT be generated by the server.
This document defines the 1.1 version of the CGI interface.
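Because the major and minor numbers are separate integers, versions should be compared numerically rather than as strings. A minimal sketch, in which the function name parse_gateway_interface is hypothetical:

```python
import re

# "CGI" "/" 1*digit "." 1*digit, restricted to ASCII digits.
_VERSION = re.compile(r"CGI/([0-9]+)\.([0-9]+)")


def parse_gateway_interface(value):
    """Return (major, minor) as integers for a GATEWAY_INTERFACE value."""
    match = _VERSION.fullmatch(value)
    if match is None:
        raise ValueError("not a CGI version: %r" % value)
    # int() discards leading zeros, as the script is required to do.
    return int(match.group(1)), int(match.group(2))
```

Tuples compare element by element, so the result for CGI/2.4 sorts below that for CGI/2.13, which in turn sorts below CGI/12.3.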
4.1.5. PATH_INFO
The PATH_INFO variable specifies a path to be interpreted by the CGI
script. It identifies the resource or sub-resource to be returned by
the CGI script, and is derived from the portion of the URI path
hierarchy following the part that identifies the script itself.
Unlike a URI path, the PATH_INFO is not URL-encoded, and cannot
contain path-segment parameters. A PATH_INFO of "/" represents a
single void path segment.
PATH_INFO = "" | ( "/" path )
path = lsegment *( "/" lsegment )
lsegment = *lchar
lchar = <any TEXT or CTL except "/">
The value is considered case-sensitive and the server MUST preserve
the case of the path as presented in the request URI. The server MAY
impose restrictions and limitations on what values it permits for
PATH_INFO, and MAY reject the request with an error if it encounters
any values considered objectionable. That MAY include any requests
that would result in an encoded "/" being decoded into PATH_INFO, as
this might represent a loss of information to the script. Similarly,
treatment of non US-ASCII characters in the path is system-defined.
URL-encoded, the PATH_INFO string forms the extra-path component of
the Script-URI (see section 3.3) which follows the SCRIPT_NAME part
of that path.
4.1.6. PATH_TRANSLATED
The PATH_TRANSLATED variable is derived by taking the PATH_INFO
value, parsing it as a local URI in its own right, and performing any
virtual-to-physical translation appropriate to map it onto the
server's document repository structure. The set of characters
permitted in the result is system-defined.
PATH_TRANSLATED = *<any character>
This is the file location that would be accessed by a request for
   <scheme> "://" <server-name> ":" <server-port> <extra-path>
where <scheme> is the scheme for the original client request and
<extra-path> is a URL-encoded version of PATH_INFO, with ";", "=" and
"?" reserved. For example, a request such as the following:
   http://somehost.com/cgi-bin/somescript/this%2eis%2ethe%2epath%3binfo
would result in a PATH_INFO value of
   /this.is.the.path;info
An internal URI is constructed from the scheme, server location and
the URL-encoded PATH_INFO:
   http://somehost.com/this.is.the.path%3binfo
This would then be translated to a location in the server's document
repository, perhaps a filesystem path something like this:
   /usr/local/www/htdocs/this.is.the.path;info
The value of PATH_TRANSLATED is the result of the translation.
The value is derived in this way irrespective of whether it maps to a
valid repository location. The server MUST preserve the case of the
extra-path segment unless the underlying repository supports case-
insensitive names. If the repository is only case-aware, case-
preserving, or case-blind with regard to document names, the server
is not required to preserve the case of the original segment through
the translation.
The translation algorithm the server uses to derive PATH_TRANSLATED
is implementation-defined; CGI scripts which use this variable may
suffer limited portability.
The server SHOULD set this meta-variable if the request URI includes
a path-info component. If PATH_INFO is NULL, then the
PATH_TRANSLATED variable MUST be set to NULL (or unset).
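The two steps of the derivation (re-encoding PATH_INFO, then mapping the result onto the repository) can be sketched as follows. This is a non-normative Python illustration only: the function name translate_path is hypothetical, and the translation step shown here, appending to a fixed document root, is an assumed stand-in for whatever implementation-defined mapping a real server uses. It does not guard against ".." segments.

```python
from urllib.parse import quote, unquote


def translate_path(path_info, document_root="/usr/local/www/htdocs"):
    """Return (extra_path, path_translated) for a decoded PATH_INFO value."""
    if not path_info:
        # No path-info component: PATH_TRANSLATED is left unset.
        return "", None
    # Step 1: URL-encode PATH_INFO; only "/" is left as-is, so ";", "="
    # and "?" are escaped. quote() emits upper-case hex ("%3B").
    extra_path = quote(path_info, safe="/")
    # Step 2: hypothetical translation: decode and append to the root.
    path_translated = document_root.rstrip("/") + unquote(extra_path)
    return extra_path, path_translated


print(translate_path("/this.is.the.path;info"))
```

The upper-case "%3B" produced here is equivalent to the "%3b" used in the example above; hexadecimal digits in an escape are not case sensitive.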
4.1.7. QUERY_STRING
The QUERY_STRING variable contains a URL-encoded search or parameter
string; it provides information to the CGI script to affect or refine
the document to be returned by the script.
The URL syntax for a search string is described in section 3 of RFC
2396 [2]. The QUERY_STRING value is case-sensitive.
   QUERY_STRING = query-string
   query-string = *uric
   uric         = reserved | unreserved | escaped
When parsing and decoding the query string, the details of the
parsing, reserved characters and support for non US-ASCII characters
depend on the context. For example, form submission from an HTML
document [18] uses application/x-www-form-urlencoded encoding, in
which the characters "+", "&" and "=" are reserved, and the ISO
8859-1 encoding may be used for non US-ASCII characters.
The QUERY_STRING value provides the query-string part of the
Script-URI (see section 3.3).
The server MUST set this variable; if the Script-URI does not include
a query component, the QUERY_STRING MUST be defined as an empty
string ("").
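For the form-submission case mentioned above, a script written in Python could decode the value with the standard library, as in this non-normative sketch. The helper name query_fields is hypothetical, and the default of UTF-8 is an assumption; pass "iso-8859-1" where that encoding is in use.

```python
from urllib.parse import parse_qs


def query_fields(environ, encoding="utf-8"):
    """Decode QUERY_STRING as application/x-www-form-urlencoded data."""
    # The server MUST set the variable, but default to "" defensively.
    query = environ.get("QUERY_STRING", "")
    # parse_qs() treats "&" and "=" as separators and "+" as a space.
    return parse_qs(query, keep_blank_values=True, encoding=encoding)


print(query_fields({"QUERY_STRING": "q=a+b&lang=en&lang=fr"}))
```

Each field maps to a list because a name may occur more than once. A query string in some other format would need its own parser.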
4.1.8. REMOTE_ADDR
The REMOTE_ADDR variable MUST be set to the network address of the
client sending the request to the server.
   REMOTE_ADDR  = hostnumber
   hostnumber   = ipv4-address | ipv6-address
   ipv4-address = 1*3digit "." 1*3digit "." 1*3digit "." 1*3digit
   ipv6-address = hexpart [ ":" ipv4-address ]
   hexpart      = hexseq | ( [ hexseq ] "::" [ hexseq ] )
   hexseq       = 1*4hex *( ":" 1*4hex )
The format of an IPv6 address is described in RFC 3513 [15].
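A script that wants to check or classify the address could rely on Python's ipaddress module, as in this non-normative sketch. The helper name remote_addr_version is hypothetical. Note that the module is stricter than the grammar above: for example, "999.1.1.1" matches the ipv4-address rule but is rejected here.

```python
import ipaddress


def remote_addr_version(value):
    """Return 4 or 6 for a valid REMOTE_ADDR value, else raise ValueError."""
    return ipaddress.ip_address(value).version


for addr in ("192.0.2.1", "2001:db8::1", "::ffff:192.0.2.1"):
    print(addr, remote_addr_version(addr))
```

The last address uses the form covered by the optional ipv4-address tail of the ipv6-address rule.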
4.1.9. REMOTE_HOST
The REMOTE_HOST variable contains the fully qualified domain name of
the client sending the request to the server, if available, otherwise
NULL. Fully qualified domain names take the form described in
section 3.5 of RFC 1034 [17] and section 2.1 of RFC 1123 [12].
Domain names are not case sensitive.
   REMOTE_HOST   = "" | hostname | hostnumber
   hostname      = *( domainlabel "." ) toplabel [ "." ]
   domainlabel   = alphanum [ *alphahypdigit alphanum ]
   toplabel      = alpha [ *alphahypdigit alphanum ]
   alphahypdigit = alphanum | "-"
The server SHOULD set this variable. If the hostname is not
available for performance reasons or otherwise, the server MAY
substitute the REMOTE_ADDR value.
4.1.10. REMOTE_IDENT
The REMOTE_IDENT variable MAY be used to provide identity information
reported about the connection by an RFC 1413 [20] request to the
remote agent, if available. The server may choose not to support
this feature, or not to request the data for efficiency reasons, or
not to return available identity data.
   REMOTE_IDENT = *TEXT
The data returned may be used for authentication purposes, but the
level of trust reposed in it should be minimal.
4.1.11. REMOTE_USER
The REMOTE_USER variable provides a user identification string
supplied by the client as part of user authentication.
   REMOTE_USER = *TEXT
If the client request required HTTP Authentication [5] (e.g., the
AUTH_TYPE meta-variable is set to "Basic" or "Digest"), then the
value of the REMOTE_USER meta-variable MUST be set to the user-ID
supplied.
4.1.12. REQUEST_METHOD
The REQUEST_METHOD meta-variable MUST be set to the method which
should be used by the script to process the request, as described in
section 4.3.
   REQUEST_METHOD   = method
   method           = "GET" | "POST" | "HEAD" | extension-method
   extension-method = "PUT" | "DELETE" | token
The method is case sensitive. The HTTP methods are described in
section 5.1.1 of the HTTP/1.0 specification [1] and section 5.1.1 of
the HTTP/1.1 specification [4].
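Because the method is case sensitive, a script should compare it exactly rather than normalising case first. A non-normative Python sketch, in which select_handler, handle_get and the handlers table are all hypothetical names:

```python
def select_handler(environ, handlers):
    """Return the handler registered for REQUEST_METHOD, or None."""
    # Exact, case-sensitive lookup: "get" does not match "GET".
    return handlers.get(environ.get("REQUEST_METHOD", ""))


def handle_get(environ):
    return "hello"


handlers = {"GET": handle_get}
handler = select_handler({"REQUEST_METHOD": "GET"}, handlers)
print(handler({}) if handler else "method not supported")
```

What the script does when no handler matches is left to the caller in this sketch.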
4.1.13. SCRIPT_NAME
The SCRIPT_NAME variable MUST be set to a URI path (not URL-encoded)
which could identify the CGI script (rather than the script's
output). The syntax is the same as for PATH_INFO (section 4.1.5).
   SCRIPT_NAME = "" | ( "/" path )
The leading "/" is not part of the path. It is optional if the path
is NULL; however, the variable MUST still be set in that case.
The SCRIPT_NAME string forms some leading part of the path component
of the Script-URI derived in some implementation-defined manner. No
PATH_INFO segment (see section 4.1.5) is included in the SCRIPT_NAME
value.
4.1.14. SERVER_NAME
The SERVER_NAME variable MUST be set to the name of the server host
to which the client request is directed. It is a case-insensitive
hostname or network address. It forms the host part of the
Script-URI.
   SERVER_NAME = server-name
   server-name = hostname | ipv4-address | ( "[" ipv6-address "]" )
A deployed server can have more than one possible value for this
variable, where several HTTP virtual hosts share the same IP address.
In that case, the server would use the contents of the request's Host
header field to select the correct virtual host.
4.1.15. SERVER_PORT
The SERVER_PORT variable MUST be set to the TCP/IP port number on
which this request is received from the client. This value is used
in the port part of the Script-URI.
   SERVER_PORT = server-port
   server-port = 1*digit
Note that this variable MUST be set, even if the port is the default
port for the scheme and could otherwise be omitted from a URI.
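Taken together, SERVER_NAME, SERVER_PORT, SCRIPT_NAME, PATH_INFO and QUERY_STRING supply the host, port, path and query parts of the Script-URI. The following non-normative Python sketch reassembles one. The function name script_uri is hypothetical, and the scheme is passed in as a parameter because none of the variables described here carries it.

```python
from urllib.parse import quote


def script_uri(environ, scheme="http"):
    """Rebuild a Script-URI from CGI meta-variables (scheme is assumed)."""
    # SCRIPT_NAME and PATH_INFO are not URL-encoded, so encode them again.
    path = quote(environ.get("SCRIPT_NAME", ""), safe="/")
    path += quote(environ.get("PATH_INFO", ""), safe="/")
    # SERVER_NAME already carries the "[" "]" around an IPv6 address.
    uri = f"{scheme}://{environ['SERVER_NAME']}:{environ['SERVER_PORT']}{path}"
    # QUERY_STRING is already URL-encoded and is appended unchanged.
    query = environ.get("QUERY_STRING", "")
    if query:
        uri += "?" + query
    return uri


print(script_uri({
    "SERVER_NAME": "somehost.com",
    "SERVER_PORT": "80",
    "SCRIPT_NAME": "/cgi-bin/somescript",
    "PATH_INFO": "/this.is.the.path;info",
    "QUERY_STRING": "a=1",
}))
```

The port is always written out, which is possible because SERVER_PORT is set even for the default port. The result may differ from the URI the client sent in escaping and letter case of the escapes.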
4.1.16. SERVER_PROTOCOL
The SERVER_PROTOCOL variable MUST be set to the name and version of
the application protocol used for this CGI request. This MAY differ
from the protocol version used by the server in its communication
with the client.
   SERVER_PROTOCOL   = HTTP-Version | "INCLUDED" | extension-version
   HTTP-Version      = "HTTP" "/" 1*digit "." 1*digit
   extension-version = protocol [ "/" 1*digit "." 1*digit ]
   protocol          = token
Here, 'protocol' defines the syntax of some of the information
passing between the server and the script (the 'protocol-specific'
features). It is not case sensitive and is usually presented in
upper case. The protocol is not the same as the scheme part of the
Script-URI, which defines the overall access mechanism used by the
client to communicate with the server. For example, a request that
reaches the script with a protocol of "HTTP" may have used an "https"
scheme.
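A script could split the value according to the grammar above as in this non-normative Python sketch. The function name parse_server_protocol is hypothetical. The protocol name is folded to upper case because it is not case sensitive, and the version is returned as a pair of integers, or None when the value carries no version (as with "INCLUDED").

```python
def parse_server_protocol(value):
    """Return (protocol, version) where version is (major, minor) or None."""
    name, sep, version = value.partition("/")
    name = name.upper()  # the protocol name is not case sensitive
    if not sep:
        return name, None
    major, _, minor = version.partition(".")
    # int() raises ValueError if either part is missing or not numeric.
    return name, (int(major), int(minor))


for raw in ("HTTP/1.1", "http/1.0", "INCLUDED"):
    print(raw, parse_server_protocol(raw))
```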
A well-known value for SERVER_PROTOCOL which the server MAY use is
"INCLUDED", which signals that the current document is being included
as part of a composite document, rather than being the direct target
of the client request. The script should treat this as an HTTP/1.0