Python - string representation of objects should be unicodes, not strs #2277

Closed
jerjou opened this issue Oct 21, 2016 · 3 comments
@jerjou

jerjou commented Oct 21, 2016

I have a script that makes gRPC calls that return text, potentially in a language other than English. In the code, I print a list of objects, each of which has a transcript attribute holding the text as a unicode string. However, if the text contains non-ASCII characters (for example, double-byte ones), they are printed as escape sequences. Printing the transcript attribute directly prints the expected characters.

I suspect this is because str() of a protoc-generated response object returns a str rather than a unicode object. Given that in Python 3 all strings are unicode by default, it might be prudent for the string representations of gRPC-generated objects to be unicode objects as well.
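To make the symptom concrete, here is a rough stdlib-only sketch of what "printed out as escape sequences" looks like. It is not protobuf's own code: `octal_escape` is a hypothetical helper that only imitates the octal escaping of non-ASCII UTF-8 bytes, and the sample text is made up for illustration.

```python
def octal_escape(text):
    """Hypothetical helper: octal-escape each non-ASCII UTF-8 byte of text.

    This only imitates the escaped output; it ignores quotes, backslashes
    and control characters, which a real serializer would also handle.
    """
    pieces = []
    for byte in text.encode("utf-8"):
        if byte < 0x80:
            pieces.append(chr(byte))          # plain ASCII passes through
        else:
            pieces.append("\\%03o" % byte)    # e.g. 0xE4 becomes \344
    return "".join(pieces)


transcript = "你好"                  # made-up sample value
escaped = octal_escape(transcript)

print(escaped)      # one backslash-octal escape per UTF-8 byte
print(transcript)   # the characters themselves
```

Each of the two characters occupies three bytes in UTF-8, so the escaped form contains six escapes and is no longer readable as text.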

@xfxyjwf
Contributor

xfxyjwf commented Oct 22, 2016

I'm pretty sure that's working as intended: by default we print all non-ASCII characters as escape sequences. To print non-ASCII characters without escaping, you can use the text_format utility:

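from google.protobuf import text_format

# msg is any protobuf message instance, e.g. the gRPC response.
# The default behavior: non-ASCII characters are escaped.
print(text_format.MessageToString(msg))
# The behavior you want: non-ASCII characters are printed as-is.
print(text_format.MessageToString(msg, as_utf8=True))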
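A self-contained variant of the snippet above, as a minimal sketch: it assumes the protobuf Python package is installed and uses the bundled `wrappers_pb2.StringValue` well-known type as a stand-in for the actual response message, which is not shown in this thread. The sample text is made up. `as_utf8` is passed explicitly in both calls so the sketch does not rely on the library default.

```python
from google.protobuf import text_format, wrappers_pb2

# StringValue ships with protobuf; it stands in for the real response
# message here (an assumption, since the original message type is unknown).
msg = wrappers_pb2.StringValue(value="你好")

# Escaped form: non-ASCII bytes appear as backslash-octal sequences.
escaped = text_format.MessageToString(msg, as_utf8=False)

# Readable form: non-ASCII characters are kept as-is.
readable = text_format.MessageToString(msg, as_utf8=True)

print(escaped)
print(readable)
```

Both calls return the same message in text format; only the treatment of non-ASCII characters differs, so switching `as_utf8` does not affect ASCII-only output.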

@jerjou
Author

jerjou commented Oct 24, 2016

Wouldn't it be better if it were unicode, though? Wouldn't that be more consistent with Python 3, which defaults to unicode, as well as being friendlier to non-English users?

@anandolee anandolee self-assigned this Mar 8, 2017
@anandolee
Contributor

It's not a good idea to change the default behavior, since that would break many existing projects. I think Feng's suggestion is good enough for your case.

Closing this for cleanup. Feel free to reopen if it is still an issue for you.
