Don't treat bytes objects as json-safe #769

Merged: 1 commit, Sep 6, 2011

@minrk
IPython member

json_clean passed bytes objects through as JSON-safe, which is incorrect. This PR decodes them using sys.getdefaultencoding().

Should close #767
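
For readers following along, here is a minimal sketch of the kind of change described above. It is not the actual IPython code: `json_clean_sketch` is a hypothetical name, and the use of the "replace" error handler is an assumption based on the later remark about replacement characters.

```python
import json
import sys


def json_clean_sketch(obj):
    """Hypothetical, simplified stand-in for json_clean (not IPython's code)."""
    if isinstance(obj, bytes):
        # bytes are not JSON-safe: decode them instead of passing them through.
        # Assumption: undecodable bytes become U+FFFD rather than raising.
        return obj.decode(sys.getdefaultencoding(), "replace")
    if isinstance(obj, (list, tuple, set)):
        return [json_clean_sketch(item) for item in obj]
    if isinstance(obj, dict):
        cleaned = {}
        for key, value in obj.items():
            if isinstance(key, bytes):
                key = key.decode(sys.getdefaultencoding(), "replace")
            cleaned[key] = json_clean_sketch(value)
        return cleaned
    return obj


# The cleaned structure can be serialized, whereas raw bytes cannot.
payload = json.dumps(json_clean_sketch({"doc": [b"x", 1]}))
```

Without the bytes branch, `json.dumps` raises a `TypeError` on Python 3 for the same input.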

@ellisonbg
IPython member

This looks fine, I will merge.

@ellisonbg ellisonbg merged commit 14867db into ipython:master Sep 6, 2011
@takluyver
IPython member

Hang on, on Python 2, sys.getdefaultencoding() is ascii. So, going back to @jstenar's test case, any non-ascii characters in a docstring get mangled into the replacement character. Surely we can have a better guess at the encoding used, e.g. utf-8, or whatever sys.stdin.encoding is?

Also, after running Jörgen's test script, I notice that even with this fix, doing b? at the Qt console still crashes the kernel with a unicode error in dumping JSON.
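
The mangling described above can be reproduced in a few lines. This is only an illustration: it assumes the docstring bytes are UTF-8 encoded and that decoding uses the "replace" error handler.

```python
# Bytes as they might appear in a UTF-8 encoded source file.
raw = "J\u00f6rgen".encode("utf-8")

# Decoding as ASCII: each non-ASCII byte becomes U+FFFD, the replacement character.
mangled = raw.decode("ascii", "replace")

# Decoding with the encoding the bytes were actually written in keeps the text intact.
intact = raw.decode("utf-8")

print(repr(mangled))
print(repr(intact))
```

Because "ö" occupies two bytes in UTF-8, the ASCII decode yields two replacement characters for that single letter.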

@minrk
IPython member

Yes, it should do the same stdin.encoding guess we do elsewhere, though that will still not help in the many situations where stdin.encoding is None for the subprocess.

@takluyver
IPython member

Is there some better way to get the system code page on Windows? Or should we guess UTF-8, since most docstrings will probably come from saved Python code, which I think is mostly UTF-8 encoded? Then again, most good code should be using unicode strings if it needs non-ascii characters.

@minrk
IPython member

Reopened as #770.

@minrk
IPython member

We should probably centralize our guessed encoding, so we don't have these sys.stdin.encoding or sys.getdefaultencoding() lines all over the place. That would also make it less painful if/when we find better ways to guess.
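
One possible shape for such a centralized helper is sketched below. The name `guess_encoding`, the order of the candidates, and the use of `locale.getpreferredencoding()` are all assumptions made for illustration, not something decided in this thread.

```python
import codecs
import locale
import sys


def guess_encoding(fallback="utf-8"):
    """Hypothetical single place to guess a text encoding."""
    candidates = (
        # Often None when the process is a subprocess without a terminal.
        getattr(sys.stdin, "encoding", None),
        # Assumption: the locale's preferred encoding is a reasonable second guess.
        locale.getpreferredencoding(False),
        sys.getdefaultencoding(),
    )
    for name in candidates:
        if not name:
            continue
        try:
            codecs.lookup(name)  # skip names Python has no codec for
        except LookupError:
            continue
        return name
    return fallback


DEFAULT_ENCODING = guess_encoding()
```

Call sites would then decode with `DEFAULT_ENCODING` instead of repeating their own checks, so a better guessing strategy only has to be changed in one function.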

@takluyver
IPython member

Note that in some places we also use sys.getfilesystemencoding().

@minrk minrk referenced this pull request Apr 9, 2014
Merged

#769 (reopened) #770
