Don't treat bytes objects as json-safe #769

Merged
merged 1 commit into from

3 participants

@minrk
IPython member

json_clean treated bytes objects as JSON-safe and passed them through unchanged, which is incorrect. This change decodes them with sys.getdefaultencoding() instead.

Should close #767
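To illustrate why bytes cannot be treated as JSON-safe, here is a minimal sketch. It assumes Python 3, where `json.dumps` rejects bytes with a `TypeError`. The name `clean_bytes` is hypothetical: it is a simplified stand-in for the bytes branch this PR adds, not the real `json_clean`.

```python
import json
import sys


def clean_bytes(obj):
    """Hypothetical stand-in for the bytes branch added to json_clean."""
    if isinstance(obj, bytes):
        # Decode with the interpreter default; undecodable bytes become U+FFFD.
        return obj.decode(sys.getdefaultencoding(), 'replace')
    return obj


# On Python 3, json.dumps refuses raw bytes outright.
try:
    json.dumps({'doc': b'some docstring'})
    raw_bytes_accepted = True
except TypeError:
    raw_bytes_accepted = False

# After decoding, the same payload serializes normally.
encoded = json.dumps({'doc': clean_bytes(b'some docstring')})
```

Non-bytes values pass through `clean_bytes` untouched, so it can be applied to any leaf value before serialization.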

@ellisonbg
IPython member

This looks fine, I will merge.

@ellisonbg ellisonbg merged commit 14867db into ipython:master
@takluyver
IPython member

Hang on, on Python 2, sys.getdefaultencoding() is ascii. So, going back to @jstenar's test case, any non-ascii characters in a docstring get mangled into the replacement character. Surely we can have a better guess at the encoding used, e.g. utf-8, or whatever sys.stdin.encoding is?

Also, after running Jörgen's test script, I notice that even with this fix, doing b? at the Qt console still crashes the kernel with a unicode error in dumping JSON.
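The mangling described above can be reproduced directly. This sketch decodes the UTF-8 bytes of a non-ASCII name twice: once as ASCII (the Python 2 default mentioned in the comment) and once as UTF-8. The sample string is only an illustration, not the original test case.

```python
# UTF-8 bytes for the name "Jörgen" (o-umlaut is two bytes in UTF-8).
raw = u'J\xf6rgen'.encode('utf-8')

# Decoding as ASCII with 'replace' turns each non-ASCII byte into U+FFFD.
as_ascii = raw.decode('ascii', 'replace')

# Decoding with the encoding the bytes were written in round-trips cleanly.
as_utf8 = raw.decode('utf-8', 'replace')
```

The `'replace'` error handler keeps the decode from raising, but the original character is lost, which is why a better encoding guess matters.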

@minrk
IPython member

Yes, it should do the same stdin.encoding guess we do elsewhere, though that will still not help in the many situations where stdin.encoding is None for the subprocess.

@takluyver
IPython member

Is there some better way to get the system code page on Windows? Or should we guess UTF-8, since most docstrings will probably come from saved Python code, which I think is mostly UTF-8 encoded? Then again, most good code should be using unicode strings if it needs non-ascii characters.
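One possible shape for the "guess UTF-8" idea, as a sketch only: try a strict UTF-8 decode first, then fall back to the locale's preferred encoding, which on Windows is typically the ANSI code page. The function name `decode_guess` and its `fallback` parameter are assumptions made for illustration; nothing in this thread says IPython does it this way.

```python
import locale


def decode_guess(data, fallback=None):
    """Decode bytes as UTF-8 if valid, else with a fallback encoding.

    Hypothetical helper: `fallback` defaults to the locale's preferred
    encoding, as reported by locale.getpreferredencoding().
    """
    try:
        return data.decode('utf-8')
    except UnicodeDecodeError:
        if fallback is None:
            fallback = locale.getpreferredencoding(False)
        # 'replace' keeps this from raising if the fallback is also wrong.
        return data.decode(fallback, 'replace')


utf8_text = decode_guess(u'J\xf6rgen'.encode('utf-8'))
latin1_text = decode_guess(b'J\xf6rgen', fallback='latin-1')
```

Trying UTF-8 strictly first is reasonably safe because text in other 8-bit encodings rarely happens to be valid UTF-8, though this remains a heuristic.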

@minrk
IPython member

reopened as #770

@minrk
IPython member

We should probably centralize our guessed encoding, so we don't have these sys.stdin.encoding or sys.getdefaultencoding() lines all over the place. That would also make it less painful if/when we find better ways to guess.
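A centralized guess could look roughly like the sketch below. The helper name `guess_encoding` is hypothetical and only combines the two calls named in this thread: it prefers the stream's `encoding` attribute and falls back to `sys.getdefaultencoding()` when that attribute is missing or `None`, as can happen in a subprocess.

```python
import sys


def guess_encoding(stream=None):
    """Return a best-guess text encoding (hypothetical central helper)."""
    if stream is None:
        stream = sys.stdin
    # The stream itself may be None, or its encoding may be None or absent.
    encoding = getattr(stream, 'encoding', None)
    return encoding or sys.getdefaultencoding()


class FakeStream(object):
    """Minimal stand-in for a stream, used only to exercise the helper."""

    def __init__(self, encoding):
        self.encoding = encoding


with_encoding = guess_encoding(FakeStream('latin-1'))
without_encoding = guess_encoding(FakeStream(None))
```

With a single function like this, callers would no longer repeat the fallback logic, and a better guess could later be swapped in at one place.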

@takluyver
IPython member

Note that in some places we also use sys.getfilesystemencoding().

@minrk referenced this pull request in #770, "#769 (reopened)" (Merged)

Commits on Sep 6, 2011
  1. @minrk
Showing with 6 additions and 2 deletions.
  1. +6 −2 IPython/utils/jsonutil.py
8 IPython/utils/jsonutil.py
@@ -12,6 +12,7 @@
 #-----------------------------------------------------------------------------
 # stdlib
 import re
+import sys
 import types
 from datetime import datetime
@@ -121,14 +122,17 @@ def json_clean(obj):
     """
     # types that are 'atomic' and ok in json as-is. bool doesn't need to be
     # listed explicitly because bools pass as int instances
-    atomic_ok = (basestring, int, float, types.NoneType)
+    atomic_ok = (unicode, int, float, types.NoneType)
     # containers that we need to convert into lists
     container_to_list = (tuple, set, types.GeneratorType)
     if isinstance(obj, atomic_ok):
         return obj
-
+
+    if isinstance(obj, bytes):
+        return obj.decode(sys.getdefaultencoding(), 'replace')
+
     if isinstance(obj, container_to_list) or (
         hasattr(obj, '__iter__') and hasattr(obj, 'next')):
         obj = list(obj)