Fix embarrassing thinko in encodeStringUtf8
TODO: we need more unit-tests to catch this; also I noticed
that `decodeStringUtf8` (which was previously named `fromUTF8BSImpl`)
appears to have been broken for quite some time.
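A minimal sketch of the kind of unit test the TODO asks for. The encoder below is copied from the post-commit side of this diff; the `vectors` table, `failures`, and `main` are hypothetical names introduced only for illustration and are not part of Cabal's test suite. The expected byte sequences are the standard UTF-8 encodings of U+0041, U+00E9, U+20AC and U+1F600, one per encoding length.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- Copied from the post-commit side of the diff.
encodeStringUtf8 :: String -> [Word8]
encodeStringUtf8 []     = []
encodeStringUtf8 (c:cs)
  | c <= '\x07F' = w8
                 : encodeStringUtf8 cs
  | c <= '\x7FF' = (0xC0 .|. w8ShiftR 6)
                 : (0x80 .|. (w8 .&. 0x3F))
                 : encodeStringUtf8 cs
  | c <= '\xFFFF'= (0xE0 .|. w8ShiftR 12)
                 : (0x80 .|. (w8ShiftR 6 .&. 0x3F))
                 : (0x80 .|. (w8 .&. 0x3F))
                 : encodeStringUtf8 cs
  | otherwise    = (0xf0 .|. w8ShiftR 18)
                 : (0x80 .|. (w8ShiftR 12 .&. 0x3F))
                 : (0x80 .|. (w8ShiftR 6 .&. 0x3F))
                 : (0x80 .|. (w8 .&. 0x3F))
                 : encodeStringUtf8 cs
  where
    w8 = fromIntegral (ord c) :: Word8
    w8ShiftR :: Int -> Word8
    w8ShiftR = fromIntegral . shiftR (ord c)

-- Hypothetical test vectors: one code point per UTF-8 encoding length.
vectors :: [(String, [Word8])]
vectors =
  [ ("A",       [0x41])                    -- U+0041, 1 byte
  , ("\xE9",    [0xC3, 0xA9])              -- U+00E9, 2 bytes
  , ("\x20AC",  [0xE2, 0x82, 0xAC])        -- U+20AC, 3 bytes
  , ("\x1F600", [0xF0, 0x9F, 0x98, 0x80])  -- U+1F600, 4 bytes
  ]

-- Every vector whose encoding differs from the expected bytes.
failures :: [(String, [Word8])]
failures = [ v | v@(s, expected) <- vectors, encodeStringUtf8 s /= expected ]

main :: IO ()
main
  | null failures = putStrLn "all vectors pass"
  | otherwise     = mapM_ print failures
```

Only the one-byte vector would pass against the pre-commit code, so even a small table like this would have caught the regression.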
hvr committed Oct 4, 2016
1 parent e14da30 commit a87fcd1
Showing 1 changed file with 14 additions and 11 deletions.
25 changes: 14 additions & 11 deletions Cabal/Distribution/Utils/String.hs
@@ -65,18 +65,21 @@ decodeStringUtf8 = go
 encodeStringUtf8 :: String -> [Word8]
 encodeStringUtf8 []     = []
 encodeStringUtf8 (c:cs)
-  | c <= '\x07F' = w
+  | c <= '\x07F' = w8
                  : encodeStringUtf8 cs
-  | c <= '\x7FF' = (0xC0 .|. (w `shiftR` 6))
-                 : (0x80 .|. (w .&. 0x3F))
+  | c <= '\x7FF' = (0xC0 .|.  w8ShiftR  6          )
+                 : (0x80 .|. (w8          .&. 0x3F))
                  : encodeStringUtf8 cs
-  | c <= '\xFFFF'= (0xE0 .|. (w `shiftR` 12))
-                 : (0x80 .|. ((w `shiftR` 6) .&. 0x3F))
-                 : (0x80 .|. (w .&. 0x3F))
+  | c <= '\xFFFF'= (0xE0 .|.  w8ShiftR 12          )
+                 : (0x80 .|. (w8ShiftR  6 .&. 0x3F))
+                 : (0x80 .|. (w8          .&. 0x3F))
                  : encodeStringUtf8 cs
-  | otherwise    = (0xf0 .|. (w `shiftR` 18))
-                 : (0x80 .|. ((w `shiftR` 12) .&. 0x3F))
-                 : (0x80 .|. ((w `shiftR` 6) .&. 0x3F))
-                 : (0x80 .|. (w .&. 0x3F))
+  | otherwise    = (0xf0 .|.  w8ShiftR 18          )
+                 : (0x80 .|. (w8ShiftR 12 .&. 0x3F))
+                 : (0x80 .|. (w8ShiftR  6 .&. 0x3F))
+                 : (0x80 .|. (w8          .&. 0x3F))
                  : encodeStringUtf8 cs
-  where w = fromIntegral (ord c) :: Word8
+  where
+    w8 = fromIntegral (ord c) :: Word8
+    w8ShiftR :: Int -> Word8
+    w8ShiftR = fromIntegral . shiftR (ord c)
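
The difference between the two sides of this diff comes down to the order of narrowing and shifting. The sketch below isolates the two-byte case to illustrate it; `twoByteOld` and `twoByteNew` are hypothetical helper names, not functions from Cabal, and each mirrors only the `c <= '\x7FF'` guard on the respective side of the diff.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- Two-byte case as written before the commit: `ord c` is narrowed to
-- Word8 first, so bits 8 and above are already gone when the shift runs.
twoByteOld :: Char -> [Word8]
twoByteOld c = [0xC0 .|. (w `shiftR` 6), 0x80 .|. (w .&. 0x3F)]
  where
    w = fromIntegral (ord c) :: Word8

-- Two-byte case as written after the commit: shift the full Int, then
-- narrow the result to Word8.
twoByteNew :: Char -> [Word8]
twoByteNew c = [0xC0 .|. w8ShiftR 6, 0x80 .|. (w8 .&. 0x3F)]
  where
    w8 = fromIntegral (ord c) :: Word8
    w8ShiftR :: Int -> Word8
    w8ShiftR = fromIntegral . shiftR (ord c)

main :: IO ()
main = do
  print (twoByteOld '\x3A9')  -- [194,169], i.e. C2 A9, the encoding of U+00A9
  print (twoByteNew '\x3A9')  -- [206,169], i.e. CE A9, the encoding of U+03A9
```

The continuation byte is the same on both sides because it only uses the low six bits, which survive the narrowing; only the lead byte, which needs bits 6 and above, is affected. For U+03A9 the old code therefore silently produced the valid encoding of a different character rather than invalid output.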
