Don't treat bytes objects as json-safe #769

Merged
merged 1 commit into from Sep 6, 2011

Conversation

Projects
None yet
3 participants
@minrk
Member

minrk commented Sep 6, 2011

json_clean passed bytes objects through as safe, which is incorrect. This decodes them with defaultencoding().

Should close #767

@ellisonbg

This comment has been minimized.

Show comment
Hide comment
@ellisonbg

ellisonbg Sep 6, 2011

Member

This looks fine, I will merge.

Member

ellisonbg commented Sep 6, 2011

This looks fine, I will merge.

ellisonbg added a commit that referenced this pull request Sep 6, 2011

Merge pull request #769 from minrk/jsonclean
Don't treat bytes objects as json-safe

@ellisonbg ellisonbg merged commit 14867db into ipython:master Sep 6, 2011

@takluyver

This comment has been minimized.

Show comment
Hide comment
@takluyver

takluyver Sep 6, 2011

Member

Hang on, on Python 2, sys.getdefaultencoding() is ascii. So, going back to @jstenar's test case, any non-ascii characters in a docstring get mangled into the replacement character. Surely we can have a better guess at the encoding used, e.g. utf-8, or whatever sys.stdin.encoding is?

Also, after running Jörgen's test script, I notice that even with this fix, doing b? at the Qt console still crashes the kernel with a unicode error in dumping JSON.

Member

takluyver commented Sep 6, 2011

Hang on, on Python 2, sys.getdefaultencoding() is ascii. So, going back to @jstenar's test case, any non-ascii characters in a docstring get mangled into the replacement character. Surely we can have a better guess at the encoding used, e.g. utf-8, or whatever sys.stdin.encoding is?

Also, after running Jörgen's test script, I notice that even with this fix, doing b? at the Qt console still crashes the kernel with a unicode error in dumping JSON.

@minrk

This comment has been minimized.

Show comment
Hide comment
@minrk

minrk Sep 6, 2011

Member

Yes, it should do the same stdin.encoding guess we do elsewhere, though that will still not help in the many situations where stdin.encoding is None for the subprocess.

Member

minrk commented Sep 6, 2011

Yes, it should do the same stdin.encoding guess we do elsewhere, though that will still not help in the many situations where stdin.encoding is None for the subprocess.

@takluyver

This comment has been minimized.

Show comment
Hide comment
@takluyver

takluyver Sep 6, 2011

Member

Is there some better way to get the system code page on Windows? Or should we guess UTF-8, because most docstrings will probably be in saved Python code, which I think is mostly UTF-8 encoded. Then again, most good code should be using unicode strings if it needs non-ascii characters.

Member

takluyver commented Sep 6, 2011

Is there some better way to get the system code page on Windows? Or should we guess UTF-8, because most docstrings will probably be in saved Python code, which I think is mostly UTF-8 encoded. Then again, most good code should be using unicode strings if it needs non-ascii characters.

@minrk

This comment has been minimized.

Show comment
Hide comment
@minrk

minrk Sep 6, 2011

Member

reopened as #770

Member

minrk commented Sep 6, 2011

reopened as #770

@minrk

This comment has been minimized.

Show comment
Hide comment
@minrk

minrk Sep 6, 2011

Member

We should probably centralize our guessed encoding, so we don't have these sys.stdin.encoding or sys.getdefaultencoding() lines all over the place. That would also make it less painful if/when we find better ways to guess.

Member

minrk commented Sep 6, 2011

We should probably centralize our guessed encoding, so we don't have these sys.stdin.encoding or sys.getdefaultencoding() lines all over the place. That would also make it less painful if/when we find better ways to guess.

@takluyver

This comment has been minimized.

Show comment
Hide comment
@takluyver

takluyver Sep 6, 2011

Member

Note that in some places we also use sys.getfilesystemencoding().

Member

takluyver commented Sep 6, 2011

Note that in some places we also use sys.getfilesystemencoding().

@minrk

This comment has been minimized.

Show comment
Hide comment
@minrk

minrk Sep 6, 2011

Member

Yes, but only for filenames (hopefully).

Member

minrk commented Sep 6, 2011

Yes, but only for filenames (hopefully).

@minrk minrk referenced this pull request Apr 9, 2014

Merged

#769 (reopened) #770

mattvonrocketstein pushed a commit to mattvonrocketstein/ipython that referenced this pull request Nov 3, 2014

Merge pull request #769 from minrk/jsonclean
Don't treat bytes objects as json-safe
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment