Don't treat bytes objects as json-safe #769



json_clean passed bytes objects through as JSON-safe, which is incorrect. This change decodes them with sys.getdefaultencoding() instead.

Should close #767
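As a rough illustration of the change described above, here is a minimal sketch written for Python 3, where the json module rejects bytes outright (the original code targets Python 2, where the failure mode differs). The name decode_if_bytes is hypothetical and is not part of IPython.

```python
import json
import sys


def decode_if_bytes(obj):
    """Hypothetical helper: decode bytes so the value is JSON-safe."""
    if isinstance(obj, bytes):
        # 'replace' swaps undecodable bytes for U+FFFD instead of raising
        return obj.decode(sys.getdefaultencoding(), "replace")
    return obj


# Raw bytes are rejected by the json module on Python 3
try:
    json.dumps({"doc": b"a docstring"})
    raw_bytes_ok = True
except TypeError:
    raw_bytes_ok = False

# After decoding, the same payload serializes cleanly
encoded = json.dumps({"doc": decode_if_bytes(b"a docstring")})
```

Non-bytes values pass through decode_if_bytes unchanged, so it can sit in front of the existing atomic checks.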


This looks fine, I will merge.

@ellisonbg ellisonbg merged commit 14867db into ipython:master

Hang on, on Python 2, sys.getdefaultencoding() is ascii. So, going back to @jstenar's test case, any non-ascii characters in a docstring get mangled into the replacement character. Surely we can make a better guess at the encoding used, e.g. utf-8, or whatever sys.stdin.encoding is?
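The mangling is easy to reproduce. The sketch below hard-codes "ascii" to stand in for Python 2's sys.getdefaultencoding(), and assumes the docstring bytes were UTF-8 encoded; the sample string is only an illustration.

```python
# Bytes as they might appear in a UTF-8 encoded source file
data = "Jörgen".encode("utf-8")

# Decoding as ascii with 'replace': each non-ascii byte becomes U+FFFD.
# 'ö' is two bytes in UTF-8, so it turns into two replacement characters.
mangled = data.decode("ascii", "replace")

# Decoding with the encoding the bytes were actually written in
recovered = data.decode("utf-8", "replace")
```

With the ascii guess the original character is unrecoverable, while the utf-8 guess round-trips it.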

Also, after running Jörgen's test script, I notice that even with this fix, doing b? at the Qt console still crashes the kernel with a unicode error in dumping JSON.


Yes, it should do the same stdin.encoding guess we do elsewhere, though that will still not help in the many situations where stdin.encoding is None for the subprocess.


Is there some better way to get the system code page on Windows? Or should we guess UTF-8, because most docstrings will probably come from saved Python code, which I think is mostly UTF-8 encoded? Then again, most good code should be using unicode strings if it needs non-ascii characters.


reopened as #770


We should probably centralize our guessed encoding, so we don't have these sys.stdin.encoding or sys.getdefaultencoding() lines all over the place. That would also make it less painful if/when we find better ways to guess.


Note that in some places we also use sys.getfilesystemencoding().
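One possible shape for a centralized guess is sketched below. The function name guess_encoding and the order of the fallbacks are assumptions for illustration, not an existing IPython API: it tries the stream's encoding first, then the locale's preferred encoding, then a fixed fallback.

```python
import locale
import sys


def guess_encoding(stream=None, fallback="utf-8"):
    """Hypothetical single place to guess a text encoding."""
    if stream is None:
        stream = sys.stdin
    # stream.encoding may be missing or None, e.g. in a subprocess
    enc = getattr(stream, "encoding", None)
    if not enc:
        # Ask the locale without changing the process-wide locale settings
        enc = locale.getpreferredencoding(False)
    return enc or fallback
```

Callers would then use guess_encoding() wherever sys.stdin.encoding or sys.getdefaultencoding() is consulted directly today, so a better guess only has to be implemented once.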

@minrk minrk referenced this pull request

#769 (reopened) #770

Commits on Sep 6, 2011
  1. @minrk
Showing with 6 additions and 2 deletions.
  1. +6 −2 IPython/utils/
8 IPython/utils/
@@ -12,6 +12,7 @@
 # stdlib
 import re
+import sys
 import types
 from datetime import datetime
@@ -121,14 +122,17 @@ def json_clean(obj):
     # types that are 'atomic' and ok in json as-is. bool doesn't need to be
     # listed explicitly because bools pass as int instances
-    atomic_ok = (basestring, int, float, types.NoneType)
+    atomic_ok = (unicode, int, float, types.NoneType)
     # containers that we need to convert into lists
     container_to_list = (tuple, set, types.GeneratorType)
     if isinstance(obj, atomic_ok):
         return obj
+    if isinstance(obj, bytes):
+        return obj.decode(sys.getdefaultencoding(), 'replace')
     if isinstance(obj, container_to_list) or (
         hasattr(obj, '__iter__') and hasattr(obj, 'next')):
         obj = list(obj)