UnicodeEncodeError #8

Closed
un1t opened this issue Mar 22, 2013 · 5 comments

@un1t

un1t commented Mar 22, 2013

Hi!
Cool project. I was looking for something like this. I came across a bug.

I have some XML with Unicode characters:

<?xml version="1.0" encoding="UTF-8"?>
<page>
    <menu>
    <name>Привет мир</name>
    <items>
        <item>
            <name>Пункт 1</name>
            <url>http://example1.com</url>
        </item>
        <item>
            <name>Пункт 2</name>
            <url>http://example2.com</url>
        </item>
    </items>
    </menu>
</page>


>>> obj = untangle.parse("1.xml")
>>> obj.page.menu.name
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 46-51: ordinal not in range(128)
@drowolath

2013/3/22 Ilya Shalyapin notifications@github.com

Hi!
Cool project. I was looking for something like this. I came across a bug.

I have some xml with unicode chars:

Привет мир Пункт 1 http://example1.com Пункт 2 http://example2.com

>>> obj = untangle.parse("1.xml")
>>> obj.page.menu.name
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 46-51: ordinal not in range(128)

Hi,

I repeated your example but got no error (I'm on Python 2.7.3).
I seem to recall that with Python 2.6.6 the default encoding is set to "ascii".
Try setting your default encoding to "utf-8" first:

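import sys
reload(sys)  # Python 2 only: restores sys.setdefaultencoding, which site.py deletes at startup
sys.setdefaultencoding('utf-8')  # process-wide default for implicit str/unicode conversions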

Thomas
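
A minimal sketch of the codec behaviour behind the traceback, written for Python 3 and not taken from untangle's code. The helper name `encode_or_error` is made up for illustration; the sample string is the menu name from the XML in the report. It suggests that the failure comes from encoding Cyrillic text as ASCII, and that encoding explicitly as UTF-8 at the point of output is one alternative to changing the process-wide default.

```python
# -*- coding: utf-8 -*-
# Sketch only: demonstrates the codec behaviour, not untangle's internals.


def encode_or_error(text, codec):
    """Encode text with the given codec; return the bytes or the error's class name."""
    try:
        return text.encode(codec)
    except UnicodeEncodeError as exc:
        return type(exc).__name__


name = u"Привет мир"  # sample value from the XML in the report

# ASCII has no Cyrillic code points, so this encode fails.
ascii_result = encode_or_error(name, "ascii")

# UTF-8 can represent the text; the choice is explicit and local to this call.
utf8_result = encode_or_error(name, "utf-8")

print(ascii_result)
print(utf8_result.decode("utf-8") == name)
```

Encoding at the boundary keeps the change local, whereas `sys.setdefaultencoding` affects every implicit conversion in the process.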

@ghost ghost assigned stchris Mar 22, 2013
@stchris
Owner

stchris commented Mar 22, 2013

Thanks, @drowolath!
@un1t: which Python version are you using?

@un1t
Author

un1t commented Mar 22, 2013

Python 2.7.3 (default, Aug  1 2012, 05:14:39)

sys.getdefaultencoding() returns 'ascii', but I don't think that's the problem. lxml.etree works fine with such files.
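
For comparison, a rough sketch of reading the same document with an ElementTree-style API. It uses the standard library's `xml.etree.ElementTree` as a stand-in for `lxml.etree` (an assumption made so the example needs no third-party install), targets Python 3, and inlines the XML from the report as UTF-8 bytes instead of reading `1.xml`.

```python
# -*- coding: utf-8 -*-
import xml.etree.ElementTree as ET

# Copy of the XML from the report, encoded as UTF-8 bytes to match its declaration.
XML_BYTES = u"""<?xml version="1.0" encoding="UTF-8"?>
<page>
    <menu>
    <name>Привет мир</name>
    <items>
        <item>
            <name>Пункт 1</name>
            <url>http://example1.com</url>
        </item>
        <item>
            <name>Пункт 2</name>
            <url>http://example2.com</url>
        </item>
    </items>
    </menu>
</page>
""".encode("utf-8")

root = ET.fromstring(XML_BYTES)  # root is the <page> element

# Element text comes back as a text string; no encoding step is involved here.
menu_name = root.find("menu/name").text
item_names = [item.find("name").text for item in root.iter("item")]

print(menu_name)
print(item_names)
```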

@step21
Contributor

step21 commented Jun 9, 2014

I had the same problem, especially with German and Hebrew characters. Importing sys and setting the default encoding fixed it, though. This should really be added to the docs.

@stchris
Owner

stchris commented Jun 11, 2014

Docs updated by @step21 in #11. I hope that's good enough. Thanks, guys!

@stchris stchris closed this as completed Jun 11, 2014