Preserve Unicode characters (#679)
Fixes dhall-lang/dhall-lang#267

According to the standard, Unicode characters up to `0x10FFFF` do not
require escaping.  See:

https://github.com/dhall-lang/dhall-lang/blob/33cab24f8edd81b167942847a3281306204f0109/standard/dhall.abnf#L192

... so we can preserve them when pretty-printing Dhall expressions.

Note that the current code still does not comply with the standard for Unicode
characters beyond `0x10FFFF`, but I'll defer fixing that to a subsequent
change.
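
As an illustration of the rule described above, here is a minimal standalone sketch. It is not the project's code: `escapeString` and `pad4` are hypothetical names, the sketch works on `String` rather than `Text`, and the lowercase hex digits produced by `showHex` are an assumption about the `\u` output format. It checks the escaped characters first and then passes every other character from `0x20` to `0x10FFFF` through unchanged, which should match the effect of the guards in this change.

```haskell
module Main where

import Data.Char (ord)
import Numeric (showHex)

-- | Hypothetical String-based analogue of the patched escaping rule.
-- Characters with a dedicated escape are handled first; everything else
-- from 0x20 up to 0x10FFFF is preserved as-is.
escapeString :: String -> String
escapeString = concatMap adapt
  where
    adapt c
      | c == '"'  = "\\\""
      | c == '$'  = "\\$"
      | c == '\\' = "\\\\"
      | c == '\b' = "\\b"
      | c == '\f' = "\\f"
      | c == '\n' = "\\n"
      | c == '\r' = "\\r"
      | c == '\t' = "\\t"
      | '\x20' <= c && c <= '\x10FFFF' = [c]
      -- Only control characters below 0x20 reach this branch.
      | otherwise = "\\u" ++ pad4 (showHex (ord c) "")

    -- Left-pad a hex string to four digits (assumed output format).
    pad4 s = replicate (4 - length s) '0' ++ s

main :: IO ()
main = do
  -- U+03BB (lambda) and U+2192 (arrow) are preserved, not escaped.
  print (escapeString "\x3BB\x2192" == "\x3BB\x2192")
  -- Characters with special meaning in Dhall text literals are still escaped.
  print (escapeString "$x" == "\\$x")
  -- Control characters without a short escape fall back to \uXXXX.
  print (escapeString "\x01" == "\\u0001")
```

Each `print` line compares the escaped result with the expected string, so the program avoids depending on the terminal's encoding.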
Gabriella439 committed Nov 12, 2018
1 parent 5db1051 commit b3968f6
Showing 4 changed files with 49 additions and 13 deletions.
26 changes: 13 additions & 13 deletions dhall/src/Dhall/Pretty/Internal.hs
@@ -910,22 +910,22 @@ escapeText :: Text -> Text
 escapeText text = Text.concatMap adapt text
   where
     adapt c
-        | '\x20' <= c && c <= '\x21' = Text.singleton c
+        | '\x20' <= c && c <= '\x21'     = Text.singleton c
         -- '\x22' == '"'
-        | '\x23' == c                = Text.singleton c
+        | '\x23' == c                    = Text.singleton c
         -- '\x24' == '$'
-        | '\x25' <= c && c <= '\x5B' = Text.singleton c
+        | '\x25' <= c && c <= '\x5B'     = Text.singleton c
         -- '\x5C' == '\\'
-        | '\x5D' <= c && c <= '\x7F' = Text.singleton c
-        | c == '"'                   = "\\\""
-        | c == '$'                   = "\\$"
-        | c == '\\'                  = "\\\\"
-        | c == '\b'                  = "\\b"
-        | c == '\f'                  = "\\f"
-        | c == '\n'                  = "\\n"
-        | c == '\r'                  = "\\r"
-        | c == '\t'                  = "\\t"
-        | otherwise                  = "\\u" <> showDigits (Data.Char.ord c)
+        | '\x5D' <= c && c <= '\x10FFFF' = Text.singleton c
+        | c == '"'                       = "\\\""
+        | c == '$'                       = "\\$"
+        | c == '\\'                      = "\\\\"
+        | c == '\b'                      = "\\b"
+        | c == '\f'                      = "\\f"
+        | c == '\n'                      = "\\n"
+        | c == '\r'                      = "\\r"
+        | c == '\t'                      = "\\t"
+        | otherwise                      = "\\u" <> showDigits (Data.Char.ord c)
 
     showDigits r0 = Text.pack (map showDigit [q1, q2, q3, r3])
       where
4 changes: 4 additions & 0 deletions dhall/tests/Format.hs
@@ -76,6 +76,10 @@ formatTests =
             ASCII
             "be able to format with ASCII characters"
             "ascii"
+        , should
+            Unicode
+            "preserve Unicode characters"
+            "unicode"
         ]
 
 should :: CharacterSet -> Text -> Text -> TestTree
16 changes: 16 additions & 0 deletions dhall/tests/format/unicodeA.dhall
@@ -0,0 +1,16 @@
+λ(isActive : Bool)
+{ barLeftEnd =
+[ "" ] : Optional Text
+, barRightEnd =
+[ "" ] : Optional Text
+, separator =
+[ "" ] : Optional Text
+, alignment =
+< ToTheLeft = {=} | ToTheRight : {} | Centered : {} >
+: ./Alignment.dhall
+, barWidth =
+[] : Optional Natural
+, barSegments =
+[ "index", "command", "path", "title" ]
+}
+: ./Bar.dhall
16 changes: 16 additions & 0 deletions dhall/tests/format/unicodeB.dhall
@@ -0,0 +1,16 @@
+λ(isActive : Bool)
+{ barLeftEnd =
+[ "" ] : Optional Text
+, barRightEnd =
+[ "" ] : Optional Text
+, separator =
+[ "" ] : Optional Text
+, alignment =
+< ToTheLeft = {=} | ToTheRight : {} | Centered : {} >
+: ./Alignment.dhall
+, barWidth =
+[] : Optional Natural
+, barSegments =
+[ "index", "command", "path", "title" ]
+}
+: ./Bar.dhall
