Parser fails on non ASCII strings on python 2 #9

Closed
marnunez opened this issue Mar 28, 2018 · 3 comments

@marnunez

Under Python 2, parsing strings that contain non-ASCII characters raises UnicodeEncodeError.
How would you like this to be addressed?

@gatkin gatkin added the bug label Mar 28, 2018
@gatkin
Owner

gatkin commented Mar 28, 2018

Good catch, we're definitely missing tests for Unicode handling. To fix this, I think it would be best to first create a failing test for the issue, and then fix the Unicode handling when reading XML data from a file in parse_from_file(). Most likely, the best approach is to use the ElementTree.parse function to read the file data and let ElementTree handle Unicode for us.
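As a rough illustration of that idea, the sketch below hands ElementTree a file path and lets the parser pick the encoding from the XML declaration. The helper names `write_sample_xml` and `read_root` are hypothetical and exist only for this example; they are not part of the library, and the sample text is made up.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def write_sample_xml(encoding):
    """Write a small XML file in the given encoding and return its path."""
    document = (
        u'<?xml version="1.0" encoding="{0}"?>'
        u'<name>caf\u00e9 ma\u00f1ana</name>'
    ).format(encoding)
    handle, path = tempfile.mkstemp(suffix='.xml')
    with os.fdopen(handle, 'wb') as xml_file:
        xml_file.write(document.encode(encoding))
    return path


def read_root(xml_file_path):
    """Hypothetical helper: let ElementTree open and decode the file itself."""
    return ET.parse(xml_file_path).getroot()


# The same text is written in two encodings and read back without the
# caller ever naming an encoding.
utf8_path = write_sample_xml('utf-8')
latin1_path = write_sample_xml('iso-8859-1')
utf8_text = read_root(utf8_path).text
latin1_text = read_root(latin1_path).text
os.remove(utf8_path)
os.remove(latin1_path)
```

Because the file is handed to the parser as bytes, no implicit ASCII conversion happens before the XML declaration has been read.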

I can probably get to this within the next week or two and push out a new release to PyPI, but if you are able to open a pull request with the fix, that would be great as well!

@marnunez
Author

Ah, all right. So we trust ET.parse to pick up the right encoding from the XML declaration, and I assume it falls back to UTF-8 if the declaration is missing? I made a few quick modifications because I need results for a project I'm working on (the urgent never leaves time for the important, right?), so I ended up adding the encoding as an optional parameter. Even so, I would keep something like this as an option, in case of badly formatted XML.
This is my first time dealing with the infamous Unicode pain. Hopefully it's the last!
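For reference, an optional encoding override could look roughly like the sketch below. It assumes the standard library's `ET.XMLParser(encoding=...)`, which overrides whatever encoding the file declares. The name `read_root_with_encoding` is hypothetical, and this is not necessarily how the modification above, or the eventual fix, was written.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def read_root_with_encoding(xml_file_path, encoding=None):
    """Hypothetical helper, not the library's API.

    With encoding=None the parser detects the encoding from the XML
    declaration (UTF-8 when there is none). A non-None value overrides
    whatever the file declares.
    """
    parser = ET.XMLParser(encoding=encoding)
    return ET.parse(xml_file_path, parser=parser).getroot()


# A Latin-1 file with no XML declaration: the default UTF-8 assumption
# cannot decode it, so the caller has to name the encoding explicitly.
handle, path = tempfile.mkstemp(suffix='.xml')
with os.fdopen(handle, 'wb') as xml_file:
    xml_file.write(u'<name>caf\u00e9</name>'.encode('iso-8859-1'))

try:
    read_root_with_encoding(path)
    default_failed = False
except ET.ParseError:
    default_failed = True

override_text = read_root_with_encoding(path, encoding='iso-8859-1').text
os.remove(path)
```

Leaving the default at `None` keeps automatic detection for well-formed files, so the override only matters when the declaration is missing or wrong.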

@gatkin
Owner

gatkin commented Jun 2, 2018

Resolved in #10

@gatkin gatkin closed this as completed Jun 2, 2018