Unicode value decoding bug #105

Closed
nikita-volkov opened this Issue Dec 29, 2012 · 3 comments

nikita-volkov commented Dec 29, 2012

This code

{-# LANGUAGE DeriveDataTypeable #-}
import qualified Data.ByteString.Lazy.Char8 as LBS
import qualified Data.Text.Lazy as LText
import qualified Data.Text.Lazy.Encoding as LText
import qualified Data.Aeson.Generic as GenericAeson
import Data.Generics

data A = A { a :: String } deriving (Data, Typeable, Show)

json = "{\"a\": \"Ёжик лижет мёд.\"}"

main = do
  let jsonLBS = LText.encodeUtf8 $ LText.pack json
  LBS.putStrLn jsonLBS
  let Just a = GenericAeson.decode jsonLBS :: Maybe A
  print a

outputs

{"a": "Ёжик лижет мёд."}
A {a = "\1025\1078\1080\1082 \1083\1080\1078\1077\1090 \1084\1105\1076."}

As you can see, the A value is constructed with the string still in an escaped form rather than with the decoded text.

nikita-volkov commented Dec 29, 2012

The output is the same for the TH-based version:

{-# LANGUAGE DeriveDataTypeable, TemplateHaskell #-}
import qualified Data.ByteString.Lazy.Char8 as LBS
import qualified Data.Text.Lazy as LText
import qualified Data.Text.Lazy.Encoding as LText
import qualified Data.Aeson as Aeson
import qualified Data.Aeson.TH as Aeson
import Data.Generics

data A = A { a :: String } deriving (Show)
$(Aeson.deriveJSON id ''A)

json = "{\"a\": \"Ёжик лижет мёд.\"}"

main = do
  let jsonLBS = LText.encodeUtf8 $ LText.pack json
  LBS.putStrLn jsonLBS
  let Just a = Aeson.decode jsonLBS :: Maybe A
  print a

nikita-volkov commented Dec 29, 2012

The same happens when decoding to a plain Aeson.Value, which suggests the problem lies deeper than the deriving mechanism:

import qualified Data.ByteString.Lazy.Char8 as LBS
import qualified Data.Text.Lazy as LText
import qualified Data.Text.Lazy.Encoding as LText
import qualified Data.Aeson as Aeson

json = "{\"a\": \"Ёжик лижет мёд.\"}"

main = do
  let jsonLBS = LText.encodeUtf8 $ LText.pack json
  LBS.putStrLn jsonLBS
  let Just a = Aeson.decode jsonLBS :: Maybe Aeson.Value
  print a

outputting

{"a": "Ёжик лижет мёд."}
Object fromList [("a",String "\1025\1078\1080\1082 \1083\1080\1078\1077\1090 \1084\1105\1076.")]

nikita-volkov commented Dec 29, 2012

Seems to be a false alarm. See http://stackoverflow.com/a/14084097/485115
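
For readers who do not follow the link, here is a minimal sketch of why the output above is consistent with correct decoding. It uses only base and does not involve aeson at all. The name `sample` is a hypothetical stand-in for the decoded field, and the explicit `hSetEncoding` call is an assumption made so the example does not depend on the terminal locale. `print` is defined as `putStrLn . show`, and `show` on a `String` escapes every non-ASCII character as a decimal code point, so the escapes are a property of the display, not of the value.

```haskell
import System.IO (hSetEncoding, stdout, utf8)

-- Hypothetical stand-in for the String field decoded from the JSON document.
sample :: String
sample = "Ёжик лижет мёд."

-- What `print` actually writes: the `show` rendering of the value,
-- in which non-ASCII characters appear as decimal escapes such as \1025.
shown :: String
shown = show sample

main :: IO ()
main = do
  hSetEncoding stdout utf8      -- assumption: force UTF-8 output regardless of locale
  print sample                  -- escaped rendering, as in the report
  putStrLn sample               -- the characters themselves
  print (read shown == sample)  -- the escaped form reads back to the same String
```

Replacing `print a` with `putStrLn (a value)` in the programs above, where the record field is a `String`, would show the Cyrillic text directly.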
