problem with decoding unicode #27

Open
lloyd opened this Issue Apr 20, 2011 · 3 comments

Contributor

lloyd commented Apr 20, 2011

(moved over here from yajl proper)

lloyd/yajl#20

jerzyk commented Apr 20, 2011

For completeness, here is the original ticket content:

When given unicode data, the library throws an exception:

In [1]: a = u'[{"data":"Podstawow\u0105 opiek\u0119 zdrowotn\u0105"}]'
In [2]: import yajl
In [3]: yajl.loads(a)
---------------------------------------------------------------------------
UnicodeEncodeError                        Traceback (most recent call last)
/home/cms/sobre-cms/<ipython console> in <module>()
UnicodeEncodeError: 'ascii' codec can't encode character u'\u0105' in position 19: ordinal not in range(128)

In [4]: import simplejson
In [5]: simplejson.loads(a)
Out[5]: [{u'data': u'Podstawow\u0105 opiek\u0119 zdrowotn\u0105'}]

The same input works fine with other libraries (here simplejson).

Contributor

teepark commented Apr 20, 2011

Fixed, at least for Decoder.decode(), in cafdd07.

The issue is using the z# format unit for argument parsing (see http://docs.python.org/c-api/arg.html). The docs there don't say how it encodes a unicode object into a char buffer, but it's clearly not UTF-8, so the change just makes the encoding explicit.
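
A rough Python-level illustration of the point above, not the actual C change in cafdd07. It assumes the implicit conversion behaves like an ASCII encode, which matches the traceback in the ticket. `to_parser_bytes` is a hypothetical helper, and the standard `json` module stands in for the parser so the sketch runs without yajl installed.

```python
import json

# Same payload as in the original ticket.
text = u'[{"data":"Podstawow\u0105 opiek\u0119 zdrowotn\u0105"}]'


def to_parser_bytes(value, encoding="utf-8"):
    """Hypothetical helper: produce the byte buffer a C parser would receive."""
    if isinstance(value, bytes):
        return value  # already bytes, pass through untouched
    return value.encode(encoding)


# Assumed behaviour of the implicit conversion: an ASCII encode,
# which fails on the first non-ASCII character.
try:
    to_parser_bytes(text, "ascii")
    ascii_error = None
except UnicodeEncodeError as exc:
    ascii_error = exc

# Being explicit about UTF-8 gives the parser a well-defined buffer.
utf8_bytes = to_parser_bytes(text)

# Stand-in for the real parser: decode the buffer and parse it.
decoded = json.loads(utf8_bytes.decode("utf-8"))
```

Until a fix lands, encoding the input to UTF-8 on the caller's side before handing it to the decoder may work as a stopgap, though that has not been verified here.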

Contributor

teepark commented Apr 20, 2011

aaaand that change isn't decrefing the pybuffer in the success case. wait for a formal pull request pls
