;; [3] See also [[符号化語]].
- Network Working Group
- Request for Comments: 2047
- Obsoletes: 1521, 1522, 1590
- Category: Standards Track
- K. Moore
- University of Tennessee
- November 1996
* MIME (Multipurpose Internet Mail Extensions) Part Three: Message Header Extensions for Non-ASCII Text [INS[MIME (多目的 Internet メイル拡張) 第3部: 非 ASCII 文用メッセージ頭拡張]]
** Status of this Memo [INS[このメモの位置付け]]
> This document specifies an Internet standards track protocol for the
Internet community, and requests discussion and suggestions for
improvements. Please refer to the current edition of the "Internet
Official Protocol Standards" (STD 1) for the standardization state
and status of this protocol. Distribution of this memo is unlimited.
** Abstract [INS[概要]]
> STD 11, RFC 822, defines a message representation protocol specifying
considerable detail about US-ASCII message headers, and leaves the
message content, or message body, as flat US-ASCII text. This set of
documents, collectively called the Multipurpose Internet Mail
Extensions, or MIME, redefines the format of messages to allow for
= (1) textual message bodies in character sets other than US-ASCII,
= (2) an extensible set of different formats for non-textual message
bodies,
= (3) multi-part message bodies, and
= (4) textual header information in character sets other than US-ASCII.
> These documents are based on earlier work documented in RFC 934, STD
11, and RFC 1049, but extend and revise them. Because RFC 822 said
so little about message bodies, these documents are largely
orthogonal to (rather than a revision of) RFC 822.
> This particular document is the third document in the series. It
describes extensions to RFC 822 to allow non-US-ASCII text data in
Internet mail header fields.
> Other documents in this series include:
[PRE[
+ RFC 2045, which specifies the various headers used to describe
the structure of MIME messages.
]PRE]
[PRE[
+ RFC 2046, which defines the general structure of the MIME media
typing system and defines an initial set of media types,
]PRE]
[PRE[
+ RFC 2048, which specifies various IANA registration procedures
for MIME-related facilities, and
]PRE]
[PRE[
+ RFC 2049, which describes MIME conformance criteria and
provides some illustrative examples of MIME message formats,
acknowledgements, and the bibliography.
]PRE]
> These documents are revisions of RFCs 1521, 1522, and 1590, which
themselves were revisions of RFCs 1341 and 1342. An appendix in RFC
2049 describes differences and changes from previous versions.
** 1. Introduction [INS[はじめに]]
[PRE[
RFC 2045 describes a mechanism for denoting textual body parts which
are coded in various character sets, as well as methods for encoding
such body parts as sequences of printable US-ASCII characters. This
memo describes similar techniques to allow the encoding of non-ASCII
text in various portions of a RFC 822 [2] message header, in a manner
which is unlikely to confuse existing message handling software.
]PRE]
[PRE[
Like the encoding techniques described in RFC 2045, the techniques
outlined here were designed to allow the use of non-ASCII characters
in message headers in a way which is unlikely to be disturbed by the
quirks of existing Internet mail handling programs. In particular,
some mail relaying programs are known to (a) delete some message
header fields while retaining others, (b) rearrange the order of
addresses in To or Cc fields, (c) rearrange the (vertical) order of
header fields, and/or (d) "wrap" message headers at different places
than those in the original message. In addition, some mail reading
programs are known to have difficulty correctly parsing message
headers which, while legal according to RFC 822, make use of
backslash-quoting to "hide" special characters such as "<", ",", or
":", or which exploit other infrequently-used features of that
specification.
]PRE]
[PRE[
While it is unfortunate that these programs do not correctly
interpret RFC 822 headers, to "break" these programs would cause
severe operational problems for the Internet mail system. The
extensions described in this memo therefore do not rely on little-
used features of RFC 822.
]PRE]
[PRE[
Instead, certain sequences of "ordinary" printable ASCII characters
(known as "encoded-words") are reserved for use as encoded data. The
syntax of encoded-words is such that they are unlikely to
"accidentally" appear as normal text in message headers.
Furthermore, the characters used in encoded-words are restricted to
those which do not have special meanings in the context in which the
encoded-word appears.
]PRE]
[PRE[
Generally, an "encoded-word" is a sequence of printable ASCII
characters that begins with "=?", ends with "?=", and has two "?"s in
between. It specifies a character set and an encoding method, and
also includes the original text encoded as graphic ASCII characters,
according to the rules for that encoding method.
]PRE]
[PRE[
A mail composer that implements this specification will provide a
means of inputting non-ASCII text in header fields, but will
translate these fields (or appropriate portions of these fields) into
encoded-words before inserting them into the message header.
]PRE]
[PRE[
A mail reader that implements this specification will recognize
encoded-words when they appear in certain portions of the message
header. Instead of displaying the encoded-word "as is", it will
reverse the encoding and display the original text in the designated
character set.
]PRE]
*** NOTES [INS[注意]]
[PRE[
This memo relies heavily on notation and terms defined in RFC 822 and
RFC 2045. In particular, the syntax for the ABNF used in this memo
is defined in RFC 822, and many of the terminal or nonterminal
symbols from RFC 822 are used in the grammar for the header
extensions defined here. Among the symbols defined in RFC 822 and
referenced in this memo are: 'addr-spec', 'atom', 'CHAR', 'comment',
'CTLs', 'ctext', 'linear-white-space', 'phrase', 'quoted-pair',
'quoted-string', 'SPACE', and 'word'. Successful implementation of
this protocol extension requires careful attention to the RFC 822
definitions of these terms.
]PRE]
[PRE[
When the term "ASCII" appears in this memo, it refers to the "7-Bit
American Standard Code for Information Interchange", ANSI X3.4-1986.
The MIME charset name for this character set is "US-ASCII". When not
specifically referring to the MIME charset name, this document uses
the term "ASCII", both for brevity and for consistency with RFC 822.
However, implementors are warned that the character set name must be
spelled "US-ASCII" in MIME message and body part headers.
]PRE]
[PRE[
This memo specifies a protocol for the representation of non-ASCII
text in message headers. It specifically DOES NOT define any
translation between "8-bit headers" and pure ASCII headers, nor is
any such translation assumed to be possible.
]PRE]
** 2. Syntax of encoded-words [INS[encoded-word (符号化語) の構文]]
> An 'encoded-word' is defined by the following ABNF grammar. The
notation of RFC 822 is used, with the exception that white space
characters MUST NOT appear between components of an 'encoded-word'.
「encoded-word」は次の [[ABNF]] 文法で定義します。
RFC 822 の記法を使いますが、[[空白間隔]]文字が「encoded-word」
の部品間に現れては'''いけません'''。
[PRE[
encoded-word = "=?" charset "?" encoding "?" encoded-text "?="
]PRE]
[PRE[
charset = token ; see section 3
]PRE]
[PRE[
encoding = token ; see section 4
]PRE]
[PRE[
token = 1*<Any CHAR except SPACE, CTLs, and especials>
]PRE]
[PRE[
especials = "(" / ")" / "<" / ">" / "@" / "," / ";" / ":" / "\" /
<"> / "/" / "[" / "]" / "?" / "." / "="
]PRE]
[PRE[
encoded-text = 1*<Any printable ASCII character other than "?"
or SPACE>
; (but see "Use of encoded-words in message
; headers", section 5)
]PRE]
> Both 'encoding' and 'charset' names are case-independent. Thus the
charset name "ISO-8859-1" is equivalent to "iso-8859-1", and the
encoding named "Q" may be spelled either "Q" or "q".
「encoding」と「charset」の名前はどちらも大文字・小文字を
区別しません。ですから charset 名「ISO-8859-1」は
「iso-8859-1」と同じで、符号化方式名「Q」は「Q」とも「q」とも
綴ることが出来ます。
> An 'encoded-word' may not be more than 75 characters long, including
'charset', 'encoding', 'encoded-text', and delimiters. If it is
desirable to encode more text than will fit in an 'encoded-word' of
75 characters, multiple 'encoded-word's (separated by CRLF SPACE) may
be used.
「encoded-word」は「charset」, 「encoding」, 「encoded-text」
と区切りを含めて75文字長より長くなってはなりません。
「encoded-word」で75文字を超える文字を符号化したい時は、
複数の「encoded-word」(CRLF SPACE で区切る) を使うことが出来ます。
> While there is no limit to the length of a multiple-line header
field, each line of a header field that contains one or more
'encoded-word's is limited to 76 characters.
複数行にわたる頭領域の長さに制限はありませんが、1つ以上の「encoded-word」
を含む頭領域の各行は76文字に制限します。
> The length restrictions are included both to ease interoperability
through internetwork mail gateways, and to impose a limit on the
amount of lookahead a header parser must employ (while looking for a
final ?= delimiter) before it can decide whether a token is an
"encoded-word" or something else.
長さを制限するのは、ネットワーク間メイル関門での相互運用を容易にし、
頭解析器が (最後の ?= 区切りを探して) token が「encoded-word」
か他の何かかを決めるまでに必要な先読みの量を制限するためです。
> IMPORTANT: 'encoded-word's are designed to be recognized as 'atom's
by an RFC 822 parser. As a consequence, unencoded white space
characters (such as SPACE and HTAB) are FORBIDDEN within an
'encoded-word'. For example, the character sequence
重要: 「encoded-word」は RFC 822 解析器には「atom」と
みなされるように設計されています。ですから、符号化されていない
空白間隔文字 (SPACE や HTAB) は「encoded-word」では''禁止''します。
例えば、文字列
[PRE[
=?iso-8859-1?q?this is some text?=
]PRE]
> would be parsed as four 'atom's, rather than as a single 'atom' (by
an RFC 822 parser) or 'encoded-word' (by a parser which understands
'encoded-words'). The correct way to encode the string "this is some
text" is to encode the SPACE characters as well, e.g.
は (RFC 822 解析器には) 1つの「atom」や (「encoded-word」を理解する解析器には)
「encoded-word」ではなく、4つの「atom」と解析されます。
文字列「this is some text」を符号化する正しい方法は、
SPACE 文字を同様に符号化して次のようにするものです。
[PRE[
=?iso-8859-1?q?this=20is=20some=20text?=
]PRE]
> The characters which may appear in 'encoded-text' are further
restricted by the rules in section 5.
「encoded-text」に現れる文字は第5章の規則で更に制限します。
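The grammar and the 75-character limit of this section can be sketched as a small recognizer. This is only an illustration: the name `parse_encoded_word` is hypothetical, and the backslash is counted among the 'especials' on the assumption that the list is meant to mirror the 'tspecials' of RFC 2045 plus ".".

```python
import re

# Characters excluded from 'token'. Including the backslash is an assumption
# (mirroring 'tspecials' in RFC 2045); the other characters are listed above.
ESPECIALS = set('()<>@,;:"/[]?.=\\')

# 'token' = printable ASCII (33..126) minus the especials.
TOKEN_CHARS = "".join(chr(c) for c in range(33, 127) if chr(c) not in ESPECIALS)
TOKEN = "[" + re.escape(TOKEN_CHARS) + "]+"

# 'encoded-text' = printable ASCII except "?" and SPACE.
ENCODED_TEXT = r"[!->@-~]+"

# No white space is allowed between the components.
ENCODED_WORD = re.compile(
    r"=\?(" + TOKEN + r")\?(" + TOKEN + r")\?(" + ENCODED_TEXT + r")\?="
)

MAX_LENGTH = 75  # includes charset, encoding, encoded-text and delimiters


def parse_encoded_word(candidate):
    """Return (charset, encoding, encoded_text), or None if not an encoded-word."""
    if len(candidate) > MAX_LENGTH:
        return None
    match = ENCODED_WORD.fullmatch(candidate)
    if match is None:
        return None
    charset, encoding, encoded_text = match.groups()
    # Both names are case-independent, so normalize them.
    return charset.lower(), encoding.upper(), encoded_text
```

With this sketch, the string with unencoded SPACE characters is rejected as a whole, while the `=20` form is accepted as one 'encoded-word'.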
** 3. Character sets [INS[文字集合]]
> The 'charset' portion of an 'encoded-word' specifies the character
set associated with the unencoded text. A 'charset' can be any of
the character set names allowed in an MIME "charset" parameter of a
"text/plain" body part, or any character set name registered with
IANA for use with the MIME text/plain content-type.
> Some character sets use code-switching techniques to switch between
"ASCII mode" and other modes. If unencoded text in an 'encoded-word'
contains a sequence which causes the charset interpreter to switch
out of ASCII mode, it MUST contain additional control codes such that
ASCII mode is again selected at the end of the 'encoded-word'. (This
rule applies separately to each 'encoded-word', including adjacent
'encoded-word's within a single header field.)
幾つかの文字集合は「ASCII 状態」と他の状態を切り替えるのに
符号切り替え技術を使います。「encoded-word」中の符号化されていない文が、
charset 解釈器を ASCII 状態以外へ切り替えさせる列を含む場合、
「encoded-word」の終わりで再び ASCII 状態が選択されるよう制御符号
を加えて入れなければ''なりません''。 (この規則は各「encoded-word」
にそれぞれ適用されます。単一頭領域内の隣接した「encoded-word」も含みます。)
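As an illustration of the code-switching rule, the sketch below uses ISO-2022-JP, a charset chosen here only as an example of a code-switching character set. It relies on Python's `iso-2022-jp` codec, which appends the escape sequence selecting ASCII at the end of each separately encoded string; the helper name `encode_chunk` is hypothetical.

```python
import base64

ESC_TO_JIS = b"\x1b$B"    # escape sequence switching to JIS X 0208
ESC_TO_ASCII = b"\x1b(B"  # escape sequence switching back to ASCII


def encode_chunk(text, charset="iso-2022-jp"):
    """Encode one chunk of text on its own and wrap it as a "B" encoded-word."""
    # Encoding each chunk separately makes the codec end it in ASCII mode.
    raw = text.encode(charset)
    payload = base64.b64encode(raw).decode("ascii")
    return raw, "=?" + charset.upper() + "?B?" + payload + "?="


raw, word = encode_chunk("日本語")
print(raw.startswith(ESC_TO_JIS), raw.endswith(ESC_TO_ASCII))
print(word)
```

Because every chunk is encoded independently, adjacent 'encoded-word's each start and end in ASCII mode.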
> When there is a possibility of using more than one character set to
represent the text in an 'encoded-word', and in the absence of
private agreements between sender and recipients of a message, it is
recommended that members of the ISO-8859-* series be used in
preference to other character sets.
「encoded-word」中の文を表現する文字集合を複数使用できるときは、
メッセージの送信者と受信者の間の私的な合意が無い限り、
ISO-8859-* 系列のどれかを他の文字集合に優先して使用することを推奨します。
** 4. Encodings [INS[符号化]]
[PRE[
Initially, the legal values for "encoding" are "Q" and "B". These
encodings are described below. The "Q" encoding is recommended for
use when most of the characters to be encoded are in the ASCII
character set; otherwise, the "B" encoding should be used.
Nevertheless, a mail reader which claims to recognize 'encoded-word's
MUST be able to accept either encoding for any character set which it
supports.
]PRE]
[PRE[
Only a subset of the printable ASCII characters may be used in
'encoded-text'. Space and tab characters are not allowed, so that
the beginning and end of an 'encoded-word' are obvious. The "?"
character is used within an 'encoded-word' to separate the various
portions of the 'encoded-word' from one another, and thus cannot
appear in the 'encoded-text' portion. Other characters are also
illegal in certain contexts. For example, an 'encoded-word' in a
'phrase' preceding an address in a From header field may not contain
any of the "specials" defined in RFC 822. Finally, certain other
characters are disallowed in some contexts, to ensure reliability for
messages that pass through internetwork mail gateways.
]PRE]
[PRE[
The "B" encoding automatically meets these requirements. The "Q"
encoding allows a wide range of printable characters to be used in
non-critical locations in the message header (e.g., Subject), with
fewer characters available for use in other locations.
]PRE]
*** 4.1. The "B" encoding [INS["B" 符号化方式]]
> The "B" encoding is identical to the "BASE64" encoding defined by RFC 2045.
「B」符号化方式は RFC 2045 で定義された "BASE64"
符号化方式と同じです。
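A minimal sketch of the "B" encoding using Python's standard `base64` module follows. The helper names `b_encode` and `b_decode` are hypothetical; the sample text is the first 'encoded-word' of the Subject example in section 8.

```python
import base64


def b_encode(octets):
    """Return the 'encoded-text' for the given octets."""
    return base64.b64encode(octets).decode("ascii")


def b_decode(encoded_text):
    """Return the octets; characters outside the base64 alphabet are rejected."""
    return base64.b64decode(encoded_text, validate=True)


octets = "If you can read this yo".encode("iso-8859-1")
encoded = b_encode(octets)  # 23 octets become 32 characters, one "=" of padding
print("=?ISO-8859-1?B?" + encoded + "?=")
```

The result is always a multiple of 4 characters long, which is what keeps each 'encoded-word' self-contained.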
*** 4.2. The "Q" encoding [INS["Q" 符号化方式]]
> The "Q" encoding is similar to the "Quoted-Printable" content-
transfer-encoding defined in RFC 2045. It is designed to allow text
containing mostly ASCII characters to be decipherable on an ASCII
terminal without decoding.
[PRE[
(1) Any 8-bit value may be represented by a "=" followed by two
hexadecimal digits. For example, if the character set in use
were ISO-8859-1, the "=" character would thus be encoded as
"=3D", and a SPACE by "=20". (Upper case should be used for
hexadecimal digits "A" through "F".)
]PRE]
(2) The 8-bit hexadecimal value 20 (e.g., ISO-8859-1 SPACE) may be
represented as "_" (underscore, ASCII 95.). (This character may
not pass through some internetwork mail gateways, but its use
will greatly enhance readability of "Q" encoded data with mail
readers that do not support this encoding.) Note that the "_"
always represents hexadecimal 20, even if the SPACE character
occupies a different code position in the character set in use.
(2) 8ビットの16進値で 20 (ISO-8859-1 では SPACE (間隔)) は
「_」 (下線, ASCII 95) で表現できます。 (この文字は
幾つかのネットワーク間メイル関門を通過できないかもしれませんが、
この符号化法に対応していないメイル読者の「Q」符号化データ可読性を
非常に高めることになります。) なお、「_」は SPACE 文字が使われる
文字集合において他の符号位置にある場合であっても常に16進20を表現します。
(3) 8-bit values which correspond to printable ASCII characters other
than "=", "?", and "_" (underscore), MAY be represented as those
characters. (But see section 5 for restrictions.) In
particular, SPACE and TAB MUST NOT be represented as themselves
within encoded words.
(3) 「=」, 「?」, 「_」(下線) を除く印字可能な ASCII 文字に対応する
8ビット値はその文字で表現しても''構いません''。 (但し第5章の
制限も参照して下さい。) 特に、 SPACE と TAB は符号化語内で
それ自身として表現しては''なりません''。
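Rules (1) to (3) can be sketched as a pair of functions working on octets. This is an illustration rather than a reference implementation: the names `q_encode` and `q_decode` are hypothetical, and the set of characters left unencoded is deliberately the conservative one that section 5 permits inside a 'phrase'.

```python
import re

# Characters emitted as themselves. "=" and "_" are left out because they
# carry special meaning in the "Q" encoding; everything else is hex-encoded.
SAFE = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789!*+-/"
)

HEX_PAIR = re.compile(r"[0-9A-Fa-f]{2}")


def q_encode(octets):
    """Encode octets as "Q" 'encoded-text'."""
    out = []
    for byte in octets:
        ch = chr(byte)
        if ch in SAFE:
            out.append(ch)              # rule (3): printable ASCII as itself
        elif byte == 0x20:
            out.append("_")             # rule (2): hexadecimal 20 as "_"
        else:
            out.append("=%02X" % byte)  # rule (1): "=" plus two upper-case hex digits
    return "".join(out)


def q_decode(encoded_text):
    """Decode "Q" 'encoded-text' back into octets."""
    out = bytearray()
    i = 0
    while i < len(encoded_text):
        ch = encoded_text[i]
        if ch == "_":
            out.append(0x20)
            i += 1
        elif ch == "=":
            pair = encoded_text[i + 1:i + 3]
            if not HEX_PAIR.fullmatch(pair):
                raise ValueError("'=' must be followed by two hex digits")
            out.append(int(pair, 16))
            i += 3
        else:
            out.append(ord(ch))
            i += 1
    return bytes(out)
```

For example, `q_encode("André".encode("iso-8859-1"))` gives `Andr=E9`, and `q_decode("this=20is=20some=20text")` gives the octets of "this is some text".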
** 5. Use of encoded-words in message headers [INS[encoded-word のメッセージ頭内での使用]]
> An 'encoded-word' may appear in a message header or body part header
according to the following rules:
> (1) An 'encoded-word' may replace a 'text' token (as defined by RFC 822)
in any Subject or Comments header field, any extension message
header field, or any MIME body part field for which the field body
is defined as '*text'. An 'encoded-word' may also appear in any
user-defined ("X-") message or body part header field.
(1) 「encoded-word」は Subject (主題), Comments (注釈)
頭欄や欄本体が「*text」と定義された拡張メッセージ頭欄, MIME
本体部分欄の「text」 token (RFC 822 で定義) を置き換えることが出来ます。
「encoded-word」は利用者定義 (「X-」) メッセージ/本体部分頭領域
に出現することも出来ます。
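[INS[
Translator's note: this is not to be read as "any extension message header
field, or any MIME body part field ''for which the field body is defined as
'*text'''", with the qualifier applying only to the MIME body part field.
See <mid:200210141709.g9EH9c015818@astro.cs.utk.edu> on [[ietf-822]].
]INS]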
> Ordinary ASCII text and 'encoded-word's may appear together in the
same header field. However, an 'encoded-word' that appears in a
header field defined as '*text' MUST be separated from any adjacent
'encoded-word' or 'text' by 'linear-white-space'.
普通の ASCII 文と「encoded-word」は同じ頭欄に同時に出現出来ます。
しかし、「*text」と定義された頭欄内に出現する「encoded-word」
は他のどの隣接する「encoded-word」または「text」とも
「linear-white-space」で区切らなければ''なりません''。
> (2) An 'encoded-word' may appear within a 'comment' delimited by "(" and
")", i.e., wherever a 'ctext' is allowed. More precisely, the RFC
822 ABNF definition for 'comment' is amended as follows:
(2) 「encoded-word」は「(」と「)」で区切られた「comment」(注釈)内、
つまり「ctext」が認められる場所で出現出来ます。より正確に言うと、
「comment」の RFC 822 ABNF 定義は次のように改訂されます。
[PRE[
comment = "(" *(ctext / quoted-pair / comment / encoded-word) ")"
]PRE]
A "Q"-encoded 'encoded-word' which appears in a 'comment' MUST NOT
contain the characters "(", ")" or "\". An
'encoded-word' that appears in a 'comment' MUST be separated from
any adjacent 'encoded-word' or 'ctext' by 'linear-white-space'.
「comment」中の「Q」符号化された「encoded-word」では
文字「(」, 「)」, 「\」は使っては''いけません''。
「comment」中に現れる「encoded-word」は他のどの「encoded-word」
や「ctext」とも「linear-white-space」で区切らなければ''なりません''。
;; Translator's note: the "\" is missing here in the original RFC text.
It is important to note that 'comment's are only recognized inside
"structured" field bodies. In fields whose bodies are defined as
'*text', "(" and ")" are treated as ordinary characters rather than
comment delimiters, and rule (1) of this section applies. (See RFC
822, sections 3.1.2 and 3.1.3)
「comment」は「構造化」領域本体でのみ認識されることに
注意するのが重要です。本体が「*text」と定義された領域では
「(」や「)」は注釈区切りではなく普通の文字として扱われるので、
この節の規則(1)が適用されます。 (RFC 822 第3.1.2節, 第3.1.3節参照)
(3) As a replacement for a 'word' entity within a 'phrase', for example,
one that precedes an address in a From, To, or Cc header. The ABNF
definition for 'phrase' from RFC 822 thus becomes:
(3) 「phrase」中の「word」実体を置き換えます。例えば
From (送信元), To (送信先), Cc (同報) 頭中のアドレスの前に
来るものです。ですから RFC 822 の「phrase」の ABNF 定義は次のようになります。
[PRE[
phrase = 1*( encoded-word / word )
]PRE]
In this case the set of characters that may be used in a "Q"-encoded
'encoded-word' is restricted to: <upper and lower case ASCII
letters, decimal digits, "!", "*", "+", "-", "/", "=", and "_"
(underscore, ASCII 95.)>. An 'encoded-word' that appears within a
'phrase' MUST be separated from any adjacent 'word', 'text' or
'special' by 'linear-white-space'.
この場合に「Q」符号化された「encoded-word」で使用可能な文字の集合は
<ASCII の大文字・小文字, 10進数字, 「!」, 「*」, 「+」, 「-」,
「/」, 「=」, 「_」(下線, ASCII 95)> に制限されます。
「phrase」中に現れる「encoded-word」は他のどの隣接する「word」,
「text」, 「special」とも「linear-white-space」で区切らなければ
''なりません''。
These are the ONLY locations where an 'encoded-word' may appear. In particular:
これ''だけ''が「encoded-word」が出現できる場所です。
特に、
- An 'encoded-word' MUST NOT appear in any portion of an 'addr-spec'.
- An 'encoded-word' MUST NOT appear within a 'quoted-string'.
- An 'encoded-word' MUST NOT be used in a Received header field.
- An 'encoded-word' MUST NOT be used in parameter of a MIME Content-Type or Content-Disposition field, or in any structured field body except within a 'comment' or 'phrase'.
- 「encoded-word」は「addr-spec」のどこにも出現しては''いけません''。
- 「encoded-word」は「quoted-string」中に出現しては''いけません''。
- 「encoded-word」は Received (受信) 頭領域中で使用しては''いけません''。
- 「encoded-word」は MIME Content-Type (内容型), Content-Disposition 領域の parameter 中や「comment」や「phrase」内を除くどの構造化領域本体でも使用しては''いけません''。
[PRE[
The 'encoded-text' in an 'encoded-word' must be self-contained;
'encoded-text' MUST NOT be continued from one 'encoded-word' to
another. This implies that the 'encoded-text' portion of a "B"
'encoded-word' will be a multiple of 4 characters long; for a "Q"
'encoded-word', any "=" character that appears in the 'encoded-text'
portion will be followed by two hexadecimal characters.
]PRE]
[PRE[
Each 'encoded-word' MUST encode an integral number of octets. The
'encoded-text' in each 'encoded-word' must be well-formed according
to the encoding specified; the 'encoded-text' may not be continued in
the next 'encoded-word'. (For example, "=?charset?Q?=?=
=?charset?Q?AB?=" would be illegal, because the two hex digits "AB"
must follow the "=" in the same 'encoded-word'.)
]PRE]
[PRE[
Each 'encoded-word' MUST represent an integral number of characters.
A multi-octet character may not be split across adjacent 'encoded-
word's.
]PRE]
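The self-containment rules above can be sketched as follows: the text is split character by character, so that no multi-octet character is divided between two 'encoded-word's, and each chunk is encoded on its own. The name `split_into_b_words` and the choice of UTF-8 as the charset are assumptions made for the example.

```python
import base64

MAX_LENGTH = 75  # limit for one encoded-word, from section 2


def split_into_b_words(text, charset="utf-8"):
    """Split text into "B" encoded-words, each encoding whole characters."""
    prefix = "=?" + charset + "?B?"
    suffix = "?="

    def wrap(chunk):
        # Each chunk is encoded independently, so the base64 payload is
        # complete (padded) within its own encoded-word.
        payload = base64.b64encode(chunk.encode(charset)).decode("ascii")
        return prefix + payload + suffix

    words = []
    chunk = ""
    for ch in text:
        # Start a new encoded-word when adding this character would
        # push the current one past the length limit.
        if chunk and len(wrap(chunk + ch)) > MAX_LENGTH:
            words.append(wrap(chunk))
            chunk = ch
        else:
            chunk += ch
    if chunk:
        words.append(wrap(chunk))
    return words


words = split_into_b_words("\u3042" * 40)
print("\r\n ".join(words))  # adjacent encoded-words separated by CRLF SPACE
```

Every resulting word can be decoded without looking at its neighbours, which is the property the rules require.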
[PRE[
Only printable and white space character data should be encoded using
this scheme. However, since these encoding schemes allow the
encoding of arbitrary octet values, mail readers that implement this
decoding should also ensure that display of the decoded data on the
recipient's terminal will not cause unwanted side-effects.
]PRE]
[PRE[
Use of these methods to encode non-textual data (e.g., pictures or
sounds) is not defined by this memo. Use of 'encoded-word's to
represent strings of purely ASCII characters is allowed, but
discouraged. In rare cases it may be necessary to encode ordinary
text that looks like an 'encoded-word'.
]PRE]
** 6. Support of 'encoded-word's by mail readers [INS[「encoded-word」のメイル読者による対応]]
*** 6.1. Recognition of 'encoded-word's in message headers [INS[メッセージ頭中の「encoded-word」の認識]]
[PRE[
A mail reader must parse the message and body part headers according
to the rules in RFC 822 to correctly recognize 'encoded-word's.
]PRE]
[PRE[
'encoded-word's are to be recognized as follows:
]PRE]
[PRE[
(1) Any message or body part header field defined as '*text', or any
user-defined header field, should be parsed as follows: Beginning
at the start of the field-body and immediately following each
occurrence of 'linear-white-space', each sequence of up to 75
printable characters (not containing any 'linear-white-space')
should be examined to see if it is an 'encoded-word' according to
the syntax rules in section 2. Any other sequence of printable
characters should be treated as ordinary ASCII text.
]PRE]
(1) 「*text」と定義されたメッセージ/本体部分頭領域, または利用者定義の頭領域
では次のように解析します: 領域本体の最初、および各「linear-white-space」
の直後から始まる、75字までの印字可能文字
(「linear-white-space」を含まない) の各列を第2章の構文規則に従った
「encoded-word」か調べます。他の印字可能な文字の列は普通の
ASCII 文として扱います。
[PRE[
(2) Any header field not defined as '*text' should be parsed
according to the syntax rules for that header field. However,
any 'word' that appears within a 'phrase' should be treated as an
'encoded-word' if it meets the syntax rules in section 2.
Otherwise it should be treated as an ordinary 'word'.
]PRE]
(2) 「*text」と定義されていない頭領域はその頭領域の構文規則に
従って解析します。しかし、「phrase」中に現れる「word」
は第2章の構文規則に合致すれば「encoded-word」として扱います。
そうでなければ普通の「word」として扱います。
[PRE[
(3) Within a 'comment', any sequence of up to 75 printable characters
(not containing 'linear-white-space'), that meets the syntax
rules in section 2, should be treated as an 'encoded-word'.
Otherwise it should be treated as normal comment text.
]PRE]
(3) 「comment」中では、75字までの印字可能な文字 (「linear-white-space」
を含まない) で第2章の構文規則に合致するものを
「encoded-word」として扱います。そうでなければ普通の comment 文として
扱います。
[PRE[
(4) A MIME-Version header field is NOT required to be present for
'encoded-word's to be interpreted according to this
specification. One reason for this is that the mail reader is
not expected to parse the entire message header before displaying
lines that may contain 'encoded-word's.
]PRE]
(4) MIME-Version 頭領域は「encoded-word」をこの仕様書に従って
解釈するのにある必要は''ありません''。この理由の一つは
メイル読者がメッセージ頭全体を「encoded-word」を含む行を表示する前に
解析するとは限らないからです。
*** 6.2. Display of 'encoded-word's [INS[「encoded-word」の表示]]
Any 'encoded-word's so recognized are decoded, and if possible, the
resulting unencoded text is displayed in the original character set.
認識された「encoded-word」は復号し、可能ならその結果の
符号化されていない文を元の文字集合で表示します。
NOTE: Decoding and display of encoded-words occurs *after* a
structured field body is parsed into tokens. It is therefore
possible to hide 'special' characters in encoded-words which, when
displayed, will be indistinguishable from 'special' characters in the
surrounding text. For this and other reasons, it is NOT generally
possible to translate a message header containing 'encoded-word's to
an unencoded form which can be parsed by an RFC 822 mail reader.
参考: encoded-word の復号と表示は構造化領域本体が token に
解析された''後''に行います。したがって、 encoded-word 中に、表示されると
周りの文中の「special」文字と区別がつかなくなる「special」文字を
隠すことが可能です。この理由や他の理由により、
「encoded-word」を含むメッセージ頭を RFC 822 メイル読者が
解析できる符号化されていない形式に変換することは
通常出来''ない''のです。
[PRE[
When displaying a particular header field that contains multiple
'encoded-word's, any 'linear-white-space' that separates a pair of
adjacent 'encoded-word's is ignored. (This is to allow the use of
multiple 'encoded-word's to represent long strings of unencoded text,
without having to separate 'encoded-word's where spaces occur in the
unencoded text.)
]PRE]
複数の「encoded-word」を含む頭領域を表示する場合、
隣接する「encoded-word」を区切る「linear-white-space」は無視します。
(これは、符号化されていない文中の空白の位置で「encoded-word」を
区切らなくても、複数の「encoded-word」で長い文字列を
表現できるようにするためです。)
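The recognition rule for '*text' fields in 6.1 (1), together with the rule above about ignoring white space between adjacent 'encoded-word's, can be sketched as below. The function names are hypothetical and the pattern is a simplified form of the section 2 grammar. Tokens that cannot be decoded are shown as they appear, which is one of the options this section allows.

```python
import base64
import re

# Simplified pattern: three "?"-separated parts with no white space.
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([^?\s]+)\?([^?\s]+)\?=")
WHITESPACE = re.compile(r"([ \t\r\n]+)")


def q_decode(text):
    """Lenient "Q" decoder: "_" is hexadecimal 20, "=XX" is one octet."""
    text = text.replace("_", " ")
    text = re.sub(r"=([0-9A-Fa-f]{2})", lambda m: chr(int(m.group(1), 16)), text)
    return text.encode("latin-1")


def decode_word(token):
    """Return the decoded text, or None if token is not a usable encoded-word."""
    match = ENCODED_WORD.fullmatch(token)
    if match is None or len(token) > 75:
        return None
    charset, encoding, payload = match.groups()
    try:
        if encoding.upper() == "B":
            octets = base64.b64decode(payload, validate=True)
        elif encoding.upper() == "Q":
            octets = q_decode(payload)
        else:
            return None  # unknown encoding: leave the token as it appears
        return octets.decode(charset)
    except (ValueError, LookupError):
        return None      # malformed payload or unsupported charset


def decode_unstructured(field_body):
    """Decode a '*text' field body for display."""
    pieces = []
    gap = ""               # white space seen since the previous token
    after_encoded = False  # was the previous token an encoded-word?
    for part in WHITESPACE.split(field_body):
        if not part:
            continue
        if part.isspace():
            gap = part
            continue
        decoded = decode_word(part)
        if decoded is None:
            pieces.append(gap + part)
            after_encoded = False
        else:
            # White space between two adjacent encoded-words is dropped.
            pieces.append(decoded if after_encoded else gap + decoded)
            after_encoded = True
        gap = ""
    return "".join(pieces) + gap
```

Applied to the folded Subject field shown in section 8, this sketch yields "If you can read this you understand the example." with no space inserted at the fold.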
In the event other encodings are defined in the future, and the mail
reader does not support the encoding used, it may either (a) display
the 'encoded-word' as ordinary text, or (b) substitute an appropriate
message indicating that the text could not be decoded.
将来他の符号化方式が定義され、メイル読者が対応していない
符号化方式が使われていたときは、 (a) 「encoded-word」を普通の文として
表示するか、 (b) 文を復号できなかったことを示す適切なメッセージで
置き換えるかのどちらかを採っても構いません。
If the mail reader does not support the character set used, it may
(a) display the 'encoded-word' as ordinary text (i.e., as it appears
in the header), (b) make a "best effort" to display using such
characters as are available, or (c) substitute an appropriate message
indicating that the decoded text could not be displayed.
メイル読者が対応していない文字集合が使われていた場合、
(a) 「encoded-word」を普通の文として (頭中に出てくるままに) 表示する,
(b) 利用可能な文字を使って表示するよう「最大の努力」を払う,
(c) 復号した文を表示できなかったことを示す適切なメッセージで
置き換える、のいずれかを採っても構いません。
If the character set being used employs code-switching techniques,
display of the encoded text implicitly begins in "ASCII mode". In
addition, the mail reader must ensure that the output device is once
again in "ASCII mode" after the 'encoded-word' is displayed.
符号切り替え技術を用いる文字集合が使われている場合、
符号化文は暗黙の内に「ASCII 状態」から始まります。
加えて、メイル読者は「encoded-word」が表示された後、出力機器が再び
「ASCII 状態」になっていることを保証しなければなりません。
*** 6.3. Mail reader handling of incorrectly formed 'encoded-word's [INS[不正形式の「encoded-word」のメイル読者の取り扱い]]
It is possible that an 'encoded-word' that is legal according to the
syntax defined in section 2, is incorrectly formed according to the
rules for the encoding being used. For example:
[PRE[
(1) An 'encoded-word' which contains characters which are not legal
for a particular encoding (for example, a "-" in the "B"
encoding, or a SPACE or HTAB in either the "B" or "Q" encoding),
is incorrectly formed.
]PRE]
(1) その符号化方式で妥当でない文字
(例えば、「B」符号化方式の「-」
や「B」・「Q」いずれの符号化方式においても SPACE (間隔) や HTAB (水平タブ))
を含む「encoded-word」は不正確に形成されています。
[PRE[
(2) Any 'encoded-word' which encodes a non-integral number of
characters or octets is incorrectly formed.
]PRE]
(2) 非整数個の文字やオクテットを符号化した「encoded-word」は不正確に
形成されています。
[PRE[
A mail reader need not attempt to display the text associated with an
'encoded-word' that is incorrectly formed. However, a mail reader
MUST NOT prevent the display or handling of a message because an
'encoded-word' is incorrectly formed.
]PRE]
メイル読者は不正確に形成された「encoded-word」に関連付けられた文を表示
しようとする必要はありません。しかし、メイル読者は「encoded-word」が
不正確に形成されているからといってメッセージの表示や取り扱いを
やめては''いけません''。
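One way to meet this requirement is to treat a decoding failure as a reason to show the token unchanged instead of raising an error. The sketch below does this for the "B" encoding only; the function name `display_b_word` and its three-argument shape are assumptions made for the example.

```python
import base64
import binascii


def display_b_word(token, charset, encoded_text):
    """Return the text to show for a "B" encoded-word, without ever raising."""
    try:
        # validate=True rejects characters such as "-" that are not
        # part of the base64 alphabet.
        octets = base64.b64decode(encoded_text, validate=True)
        return octets.decode(charset)
    except (binascii.Error, UnicodeDecodeError, LookupError):
        # Incorrectly formed, or an unknown charset: show the token as it
        # appears so the rest of the message can still be displayed.
        return token


print(display_b_word("=?US-ASCII?B?SGk=?=", "US-ASCII", "SGk="))
print(display_b_word("=?US-ASCII?B?SG-k?=", "US-ASCII", "SG-k"))
```

The first call prints the decoded text and the second prints the token unchanged, because "-" is not legal in the "B" encoding.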
** 7. Conformance [INS[適合性]]
[PRE[
A mail composing program claiming compliance with this specification
MUST ensure that any string of non-white-space printable ASCII
characters within a '*text' or '*ctext' that begins with "=?" and
ends with "?=" be a valid 'encoded-word'. ("begins" means: at the
start of the field-body, immediately following 'linear-white-space',
or immediately following a "(" for an 'encoded-word' within '*ctext';
"ends" means: at the end of the field-body, immediately preceding
'linear-white-space', or immediately preceding a ")" for an
'encoded-word' within '*ctext'.) In addition, any 'word' within a
'phrase' that begins with "=?" and ends with "?=" must be a valid
'encoded-word'.
]PRE]
この仕様に適合すると主張するメイル作成プログラムは
「*text」や「*ctext」中の「=?」で始まり「?=」で終わる
非空白間隔印字可能 ASCII 文字から成る文字列がすべて妥当な「encoded-word」
であることを保証しなければ''なりません''。 (「始まる」の意味: 領域本体の最初、
「linear-white-space」の直後、または
「*ctext」中の「encoded-word」の場合は「(」の直後。
「終わる」の意味: 領域本体の終わり、「linear-white-space」の直前、
または「*ctext」中の「encoded-word」の場合は「)」の直前。)
加えて、「phrase」中の「word」で「=?」で始まり「?=」で終わるものは全て
妥当な「encoded-word」でなければなりません。
[PRE[
A mail reading program claiming compliance with this specification
must be able to distinguish 'encoded-word's from 'text', 'ctext', or
'word's, according to the rules in section 6, anytime they appear in
appropriate places in message headers. It must support both the "B"
and "Q" encodings for any character set which it supports. The
program must be able to display the unencoded text if the character
set is "US-ASCII". For the ISO-8859-* character sets, the mail
reading program must at least be able to display the characters which
are also in the ASCII set.
]PRE]
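A mail reading program that claims compliance with this specification
must be able to distinguish 'encoded-word's from 'text', 'ctext', and
'word's, according to the rules in section 6, whenever they appear in
appropriate places in message headers. It must support both the "B"
and "Q" encodings for every character set that it supports. The
program must be able to display the unencoded text if the character
set is "US-ASCII". For the ISO-8859-* character sets, the mail reading
program must at least be able to display the characters that are also
in the ASCII set.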
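As an illustration of the reader-side requirement to support both the "B" and "Q" encodings, here is a hedged Python sketch of a decoder for a single 'encoded-word'. The names `decode_q` and `decode_encoded_word` are hypothetical, the charset lookup relies on Python's own codec registry, and anything that cannot be decoded is returned unchanged rather than raising an error.

```python
import base64
import re

ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([BbQq])\?([^?\s]+)\?=")


def decode_q(text: str) -> bytes:
    """Decode "Q" encoded-text: "_" is SPACE, "=XX" is the octet 0xXX."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] == "_":
            out.append(0x20)
            i += 1
        elif text[i] == "=":
            out += bytes.fromhex(text[i + 1:i + 3])
            i += 3
        else:
            out.append(ord(text[i]))
            i += 1
    return bytes(out)


def decode_encoded_word(word: str) -> str:
    """Return the decoded text, or `word` itself if it cannot be decoded."""
    m = ENCODED_WORD.fullmatch(word)
    if m is None:
        return word
    charset, encoding, text = m.groups()
    try:
        if encoding.upper() == "B":
            raw = base64.b64decode(text, validate=True)
        else:
            raw = decode_q(text)
        return raw.decode(charset)
    except (ValueError, LookupError):
        # Bad base64 or hex, undecodable octets, or an unknown charset:
        # fall back to the raw form so the message can still be shown.
        return word
```

For example, `decode_encoded_word("=?ISO-8859-1?Q?Andr=E9?=")` yields "André", while a word naming a charset unknown to the program comes back untouched.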
** 8. Examples
The following are examples of message headers containing 'encoded-word's:
[PRE[
From: =?US-ASCII?Q?Keith_Moore?= <moore@cs.utk.edu>
To: =?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= <keld@dkuug.dk>
CC: =?ISO-8859-1?Q?Andr=E9?= Pirard <PIRARD@vm1.ulg.ac.be>
Subject: =?ISO-8859-1?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?=
=?ISO-8859-2?B?dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==?=
]PRE]
[PRE[
Note: In the first 'encoded-word' of the Subject field above, the
last "=" at the end of the 'encoded-text' is necessary because each
'encoded-word' must be self-contained (the "=" character completes a
group of 4 base64 characters representing 2 octets). An additional
octet could have been encoded in the first 'encoded-word' (so that
the encoded-word would contain an exact multiple of 3 encoded
octets), except that the second 'encoded-word' uses a different
'charset' than the first one.
]PRE]
Note: In the first 'encoded-word' of the Subject field above, the
final "=" of the 'encoded-text' is necessary because each
'encoded-word' must be self-contained (the "=" character completes a
group of 4 base64 characters representing 2 octets). ...
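The padding arithmetic in this note can be verified directly. The sketch below uses Python's standard `base64` module on the two 'encoded-text' strings from the Subject field; the variable names are illustrative only.

```python
import base64

# The two encoded-text parts of the Subject field in the example above.
first = "SWYgeW91IGNhbiByZWFkIHRoaXMgeW8="
second = "dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg=="

part1 = base64.b64decode(first, validate=True)
part2 = base64.b64decode(second, validate=True)

# 23 octets = 7 complete groups of 3 octets, plus 2 left over.
# The leftover 2 octets become 3 base64 characters and one "=" pad,
# so the first encoded-text is 7 * 4 + 4 = 32 characters long.
full_groups, leftover = divmod(len(part1), 3)

# Each encoded-word is decoded with its own charset, then concatenated.
subject = part1.decode("iso-8859-1") + part2.decode("iso-8859-2")
```

Here `len(part1)` is 23, `leftover` is 2, and `subject` is "If you can read this you understand the example.". Because each word carries its own padding, either one can be decoded without looking at the other.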
[PRE[
From: =?ISO-8859-1?Q?Olle_J=E4rnefors?= <ojarnef@admin.kth.se>
To: ietf-822@dimacs.rutgers.edu, ojarnef@admin.kth.se
Subject: Time for ISO 10646?
]PRE]
[PRE[
To: Dave Crocker <dcrocker@mordor.stanford.edu>
Cc: ietf-822@dimacs.rutgers.edu, paf@comsol.se
From: =?ISO-8859-1?Q?Patrik_F=E4ltstr=F6m?= <paf@nada.kth.se>
Subject: Re: RFC-HDR care and feeding
]PRE]
[PRE[
From: Nathaniel Borenstein <nsb@thumper.bellcore.com>
(=?iso-8859-8?b?7eXs+SDv4SDp7Oj08A==?=)
To: Greg Vaudreuil <gvaudre@NRI.Reston.VA.US>, Ned Freed
<ned@innosoft.com>, Keith Moore <moore@cs.utk.edu>
Subject: Test of new header generator
MIME-Version: 1.0
Content-type: text/plain; charset=ISO-8859-1
]PRE]
[PRE[
The following examples illustrate how text containing 'encoded-word's
which appear in a structured field body. The rules are slightly
different for fields defined as '*text' because "(" and ")" are not
recognized as 'comment' delimiters. [Section 5, paragraph (1)].
]PRE]
[PRE[
In each of the following examples, if the same sequence were to occur
in a '*text' field, the "displayed as" form would NOT be treated as
encoded words, but be identical to the "encoded form". This is
because each of the encoded-words in the following examples is
adjacent to a "(" or ")" character.
]PRE]
[PRE[
encoded form                                displayed as
---------------------------------------------------------------------
(=?ISO-8859-1?Q?a?=)                        (a)
]PRE]
[PRE[
(=?ISO-8859-1?Q?a?= b)                      (a b)
]PRE]
[PRE[
Within a 'comment', white space MUST appear between an
'encoded-word' and surrounding text. [Section 5,
paragraph (2)]. However, white space is not needed between
the initial "(" that begins the 'comment', and the
'encoded-word'.
]PRE]
[PRE[
(=?ISO-8859-1?Q?a?= =?ISO-8859-1?Q?b?=)     (ab)
]PRE]
[PRE[
White space between adjacent 'encoded-word's is not
displayed.
]PRE]
[PRE[
(=?ISO-8859-1?Q?a?=  =?ISO-8859-1?Q?b?=)    (ab)
]PRE]
[PRE[
Even multiple SPACEs between 'encoded-word's are ignored
for the purpose of display.
]PRE]
[PRE[
(=?ISO-8859-1?Q?a?=                         (ab)
    =?ISO-8859-1?Q?b?=)
]PRE]
[PRE[
Any amount of linear-space-white between 'encoded-word's,
even if it includes a CRLF followed by one or more SPACEs,
is ignored for the purposes of display.
]PRE]
[PRE[
(=?ISO-8859-1?Q?a_b?=)                      (a b)
]PRE]
[PRE[
In order to cause a SPACE to be displayed within a portion
of encoded text, the SPACE MUST be encoded as part of the
'encoded-word'.
]PRE]
[PRE[
(=?ISO-8859-1?Q?a?= =?ISO-8859-2?Q?_b?=)    (a b)
]PRE]
[PRE[
In order to cause a SPACE to be displayed between two strings
of encoded text, the SPACE MAY be encoded as part of one of
the 'encoded-word's.
]PRE]
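The display rules illustrated by the examples in this section can be sketched in Python as follows. This is an assumption-laden illustration for a 'comment' in a structured field, not a full header parser: `display` and `decode_word` are hypothetical names, the regular expression does not enforce the "begins"/"ends" delimiter rules of section 7, and malformed words are not handled.

```python
import base64
import re

# The capturing group makes re.split() return the encoded-words themselves
# at the odd indices of the result, with the text between them at even ones.
ENCODED_WORD = re.compile(r"(=\?[^?\s]+\?[BbQq]\?[^?\s]+\?=)")


def decode_word(word: str) -> str:
    """Decode one well-formed encoded-word (no error handling in this sketch)."""
    _, charset, encoding, text, _ = word.split("?")
    if encoding.upper() == "B":
        raw = base64.b64decode(text)
    else:
        # "Q": "_" stands for SPACE, "=XX" for the octet with hex value XX.
        raw = re.sub(
            rb"=([0-9A-Fa-f]{2})",
            lambda m: bytes.fromhex(m.group(1).decode("ascii")),
            text.replace("_", " ").encode("ascii"),
        )
    return raw.decode(charset)


def display(comment: str) -> str:
    """Render a comment, dropping white space between adjacent encoded-words."""
    parts = ENCODED_WORD.split(comment)
    out = []
    for i, part in enumerate(parts):
        if i % 2 == 1:
            out.append(decode_word(part))
        elif 0 < i < len(parts) - 1 and part.strip(" \t\r\n") == "":
            # Only linear white space (possibly folded over CRLF) separates
            # two encoded-words, so it is not displayed.
            continue
        else:
            out.append(part)
    return "".join(out)
```

With these definitions, `display("(=?ISO-8859-1?Q?a?= =?ISO-8859-1?Q?b?=)")` gives "(ab)", while `display("(=?ISO-8859-1?Q?a?= b)")` keeps the space and gives "(a b)", because the white space there separates an 'encoded-word' from ordinary text.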
** 9. References
[PRE[
[RFC 822] Crocker, D., "Standard for the Format of ARPA Internet Text
Messages", STD 11, RFC 822, UDEL, August 1982.
]PRE]
[PRE[
[RFC 2049] Borenstein, N., and N. Freed, "Multipurpose Internet Mail
Extensions (MIME) Part Five: Conformance Criteria and Examples",
RFC 2049, November 1996.
]PRE]
[PRE[
[RFC 2045] Borenstein, N., and N. Freed, "Multipurpose Internet Mail
Extensions (MIME) Part One: Format of Internet Message Bodies",
RFC 2045, November 1996.
]PRE]
[PRE[
[RFC 2046] Borenstein N., and N. Freed, "Multipurpose Internet Mail
Extensions (MIME) Part Two: Media Types", RFC 2046,
November 1996.
]PRE]
[PRE[
[RFC 2048] Freed, N., Klensin, J., and J. Postel, "Multipurpose
Internet Mail Extensions (MIME) Part Four: Registration
Procedures", RFC 2048, November 1996.
]PRE]
** 10. Security Considerations
[PRE[
Security issues are not discussed in this memo.
]PRE]
** 11. Acknowledgements
[PRE[
The author wishes to thank Nathaniel Borenstein, Issac Chan, Lutz
Donnerhacke, Paul Eggert, Ned Freed, Andreas M. Kirchwitz, Olle
Jarnefors, Mike Rosin, Yutaka Sato, Bart Schaefer, and Kazuhiko
Yamamoto, for their helpful advice, insightful comments, and
illuminating questions in response to earlier versions of this
specification.
]PRE]
** 12. Author's Address