/*
Package utf8n implements functions and constants to support normalizing text encoded in UTF-8.
This package is similar to the Go standard library's "unicode/utf8" package,
except that it normalizes line endings into ‘line separator’ and ‘paragraph separator’ characters.

Specifically, it transforms:

	CR LF ⇒ LS
	LF    ⇒ LS
	CR    ⇒ LS
	NEL   ⇒ LS

Then, (conceptually) after doing that, it transforms:

	LS LS ⇒ PS

The meanings of LF, CR, NEL, LS, and PS are:

	LF  = “line feed”           = U+000A = '\u000A' = '\n'
	CR  = “carriage return”     = U+000D = '\u000D' = '\r'
	NEL = “next line”           = U+0085 = '\u0085'
	LS  = “line separator”      = U+2028 = '\u2028'
	PS  = “paragraph separator” = U+2029 = '\u2029'

The result of these transformations is that:

№1: a line break or a paragraph break is always represented by a single rune, and

№2: a line break is always represented by the same rune (LS), and a paragraph break is always represented by the same rune (PS).
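
The two transformation passes described above can be sketched as a standalone program.
This is only an illustration under stated assumptions, not this package's API:
the function name normalize is hypothetical, the input is assumed to be valid UTF-8,
and a run of three LS is assumed to collapse left to right into PS followed by LS.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	lf  = '\u000A' // line feed
	cr  = '\u000D' // carriage return
	nel = '\u0085' // next line
	ls  = '\u2028' // line separator
	ps  = '\u2029' // paragraph separator
)

// normalize is a hypothetical helper (not part of utf8n).
// It assumes s is valid UTF-8.
func normalize(s string) string {
	runes := []rune(s)

	// Pass 1: map CR LF, LF, CR, and NEL to a single LS.
	first := make([]rune, 0, len(runes))
	for i := 0; i < len(runes); i++ {
		switch runes[i] {
		case cr:
			// Treat CR LF as one line ending by skipping the LF.
			if i+1 < len(runes) && runes[i+1] == lf {
				i++
			}
			first = append(first, ls)
		case lf, nel:
			first = append(first, ls)
		default:
			first = append(first, runes[i])
		}
	}

	// Pass 2: collapse each adjacent LS LS pair into one PS,
	// scanning left to right (an assumption of this sketch).
	var b strings.Builder
	for i := 0; i < len(first); i++ {
		if first[i] == ls && i+1 < len(first) && first[i+1] == ls {
			b.WriteRune(ps)
			i++
			continue
		}
		b.WriteRune(first[i])
	}
	return b.String()
}

func main() {
	// %+q escapes non-ASCII runes, so LS and PS are visible in the output.
	fmt.Printf("%+q\n", normalize("a\r\nb\n\nc\u0085d"))
	// prints "a\u2028b\u2029c\u2028d"
}
```

In this sketch, the CR LF after "a" becomes one LS, the two LF after "b" become one PS, and the NEL after "c" becomes one LS.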
*/
package utf8n