Don't treat bytes objects as json-safe #769

Merged
merged 1 commit into from

3 participants

@minrk
Owner

json_clean passed bytes objects through as JSON-safe, which is incorrect. This PR decodes them with sys.getdefaultencoding().

Should close #767
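A minimal sketch of why bytes cannot be treated as JSON-safe, written for Python 3 (the patch itself targets Python 2, where the failure shows up as a unicode error instead). The helper name `clean_bytes` is hypothetical and not part of IPython.

```python
import json
import sys

def clean_bytes(obj):
    """Hypothetical helper: decode bytes so the result is JSON-safe."""
    if isinstance(obj, bytes):
        # Same call as the patch: undecodable bytes become U+FFFD
        return obj.decode(sys.getdefaultencoding(), 'replace')
    return obj

# On Python 3, json.dumps refuses raw bytes with a TypeError
try:
    json.dumps(b"hello")
    raw_bytes_ok = True
except TypeError:
    raw_bytes_ok = False

# After decoding, the value serializes like any other string
encoded = json.dumps(clean_bytes(b"hello"))
```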

@ellisonbg
Owner

This looks fine, I will merge.

@ellisonbg ellisonbg merged commit 14867db into from
@takluyver
Owner

Hang on, on Python 2, sys.getdefaultencoding() is ascii. So, going back to @jstenar's test case, any non-ascii characters in a docstring get mangled into the replacement character. Surely we can have a better guess at the encoding used, e.g. utf-8, or whatever sys.stdin.encoding is?
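The mangling described above can be reproduced without Python 2 by naming the ASCII codec explicitly. This is only an illustration; the sample string is an assumption, not @jstenar's actual test case.

```python
# UTF-8 encoded bytes for a name containing one non-ASCII character
raw = u"J\u00f6rgen".encode("utf-8")

# Decoding as ASCII with 'replace' turns each non-ASCII byte into U+FFFD.
# The single character takes two bytes in UTF-8, so two replacements appear.
as_ascii = raw.decode("ascii", "replace")

# Decoding with the encoding the bytes were written in round-trips cleanly
as_utf8 = raw.decode("utf-8", "replace")
```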

Also, after running Jörgen's test script, I notice that even with this fix, doing b? at the Qt console still crashes the kernel with a unicode error in dumping JSON.

@minrk
Owner

Yes, it should do the same stdin.encoding guess we do elsewhere, though that will still not help in the many situations where stdin.encoding is None for the subprocess.
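A small sketch of guarding against a missing or None `encoding` attribute. The function name `stream_encoding` and the `"utf-8"` fallback are assumptions for illustration; the thread has not settled on a fallback.

```python
import sys

def stream_encoding(stream, fallback="utf-8"):
    """Hypothetical helper: return stream.encoding, or fallback if it is missing or None."""
    return getattr(stream, "encoding", None) or fallback

class NoEncodingStream(object):
    """Stand-in for a subprocess stdin whose encoding is None."""
    encoding = None

# In a real kernel this would be called with sys.stdin
guess = stream_encoding(sys.stdin)
```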

@takluyver
Owner

Is there some better way to get the system code page on Windows? Or should we guess UTF-8, because most docstrings will probably be in saved Python code, which I think is mostly UTF-8 encoded? Then again, most good code should be using unicode strings if it needs non-ascii characters.

@minrk
Owner

reopened as #770

@minrk
Owner

We should probably centralize our guessed encoding, so we don't have these sys.stdin.encoding or sys.getdefaultencoding() lines all over the place. That would also make it less painful if/when we find better ways to guess.

@takluyver
Owner

Note that in some places we also use sys.getfilesystemencoding().
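One way the centralization could look, as a sketch only. The names `guess_default_encoding`, `DEFAULT_ENCODING`, and `to_text`, and the order of the candidates, are assumptions rather than anything IPython ships. `sys.getfilesystemencoding()` is left out here on the assumption that it describes file names rather than console text.

```python
import locale
import sys

def guess_default_encoding():
    """Hypothetical central guess: the first non-empty candidate wins."""
    candidates = (
        getattr(sys.stdin, "encoding", None),  # may be None in a subprocess
        locale.getpreferredencoding(False),    # locale / system code page
        sys.getdefaultencoding(),              # ascii on Python 2
    )
    for enc in candidates:
        if enc:
            return enc
    return "ascii"

# Computed once at import time; callers use this instead of querying sys directly
DEFAULT_ENCODING = guess_default_encoding()

def to_text(data, encoding=None):
    """Decode bytes with the shared guess; pass text through unchanged."""
    if isinstance(data, bytes):
        return data.decode(encoding or DEFAULT_ENCODING, "replace")
    return data
```

With a single definition like this, a better guessing strategy only has to be changed in one place.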

@minrk
Owner
@ellisonbg ellisonbg referenced this pull request from a commit
Commit has since been removed from the repository and is no longer available.
@minrk minrk referenced this pull request
Merged

#769 (reopened) #770

Commits on Sep 6, 2011
  1. @minrk
Showing with 6 additions and 2 deletions.
  1. +6 −2 IPython/utils/jsonutil.py
8 IPython/utils/jsonutil.py
@@ -12,6 +12,7 @@
 #-----------------------------------------------------------------------------
 # stdlib
 import re
+import sys
 import types
 from datetime import datetime
 
@@ -121,14 +122,17 @@ def json_clean(obj):
     """
     # types that are 'atomic' and ok in json as-is. bool doesn't need to be
     # listed explicitly because bools pass as int instances
-    atomic_ok = (basestring, int, float, types.NoneType)
+    atomic_ok = (unicode, int, float, types.NoneType)
 
     # containers that we need to convert into lists
     container_to_list = (tuple, set, types.GeneratorType)
 
     if isinstance(obj, atomic_ok):
         return obj
-
+
+    if isinstance(obj, bytes):
+        return obj.decode(sys.getdefaultencoding(), 'replace')
+
     if isinstance(obj, container_to_list) or (
         hasattr(obj, '__iter__') and hasattr(obj, 'next')):
         obj = list(obj)
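For readers who want to try the patched branches, here is a simplified Python 3 stand-in. It is not the real `json_clean`: the name `json_clean_sketch` is hypothetical, `str` and `type(None)` replace Python 2's `unicode` and `types.NoneType`, and the list, dict, and `repr` handling is an assumption about the part of the function the diff does not show.

```python
import sys
import types

def json_clean_sketch(obj):
    """Simplified stand-in for the patched branches of json_clean."""
    # bytes is deliberately absent from this tuple, mirroring the patch
    atomic_ok = (str, int, float, type(None))
    container_to_list = (tuple, set, types.GeneratorType)

    if isinstance(obj, atomic_ok):
        return obj

    if isinstance(obj, bytes):
        # Same decode call as the patch; bad bytes become U+FFFD
        return obj.decode(sys.getdefaultencoding(), 'replace')

    if isinstance(obj, container_to_list):
        obj = list(obj)

    # Assumed for this sketch: recurse into lists and dicts
    if isinstance(obj, list):
        return [json_clean_sketch(x) for x in obj]
    if isinstance(obj, dict):
        return {json_clean_sketch(k): json_clean_sketch(v)
                for k, v in obj.items()}

    # Assumed fallback for anything else
    return repr(obj)
```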