Don't treat bytes objects as json-safe #769

merged 1 commit into from over 2 years ago

3 participants

Min RK Brian E. Granger Thomas Kluyver
Min RK

json_clean passed bytes objects through as JSON-safe, which is incorrect. This change decodes them with sys.getdefaultencoding() instead.

Should close #767

Brian E. Granger

This looks fine, I will merge.

Brian E. Granger ellisonbg merged commit 14867db into from September 06, 2011
Brian E. Granger ellisonbg closed this September 06, 2011
Thomas Kluyver

Hang on, on Python 2, sys.getdefaultencoding() is ascii. So, going back to @jstenar's test case, any non-ascii characters in a docstring get mangled into the replacement character. Surely we can have a better guess at the encoding used, e.g. utf-8, or whatever sys.stdin.encoding is?
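To make the mangling concrete, here is a small sketch. It is written for Python 3 (the thread itself is about Python 2) and uses a made-up UTF-8 byte string rather than the original test case. Decoding with ascii and 'replace' turns every non-ASCII byte into U+FFFD, while decoding with the encoding the bytes were actually written in recovers the text.

```python
# Hypothetical docstring bytes, UTF-8 encoded; not taken from the original test case.
raw = "J\u00f6rgen".encode("utf-8")

# What the merged fix does when the default encoding is ascii:
# each byte outside the ASCII range is replaced by U+FFFD.
mangled = raw.decode("ascii", "replace")

# A better guess at the encoding recovers the original text.
recovered = raw.decode("utf-8", "replace")

print(repr(mangled))
print(repr(recovered))
```

The two-byte UTF-8 sequence for "ö" yields two replacement characters, so the damage grows with every non-ASCII character in the docstring.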

Also, after running Jörgen's test script, I notice that even with this fix, doing b? at the Qt console still crashes the kernel with a unicode error in dumping JSON.

Min RK

Yes, it should do the same stdin.encoding guess we do elsewhere, though that will still not help in the many situations where stdin.encoding is None for the subprocess.

Thomas Kluyver

Is there some better way to get the system code page on Windows? Or should we guess UTF-8, because most docstrings will probably be in saved Python code, which I think is mostly UTF-8 encoded. Then again, most good code should be using unicode strings if it needs non-ascii characters.

Min RK

reopened as #770

Min RK

We should probably centralize our guessed encoding, so we don't have these sys.stdin.encoding or sys.getdefaultencoding() lines all over the place. That would also make it less painful if/when we find better ways to guess.
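A minimal sketch of what such a centralized guess could look like, written for Python 3. The names guess_encoding and decode_bytes are hypothetical, and the fallback order (stdin encoding, then the locale's preferred encoding, then the interpreter default) is an assumption for illustration, not something the thread settles on.

```python
import locale
import sys


def guess_encoding():
    """Return a best-guess text encoding name; never returns None.

    Hypothetical helper: the name and the fallback order are assumptions.
    """
    # sys.stdin may be None or carry no encoding (e.g. in a subprocess).
    enc = getattr(sys.stdin, "encoding", None)
    if not enc:
        try:
            # Locale's preferred encoding, without changing the locale.
            enc = locale.getpreferredencoding(False)
        except Exception:
            enc = None
    # Last resort: the interpreter-wide default.
    return enc or sys.getdefaultencoding()


def decode_bytes(data, encoding=None):
    """Decode bytes with the guessed encoding, replacing undecodable bytes."""
    return data.decode(encoding or guess_encoding(), "replace")
```

With one function owning the guess, call sites would use decode_bytes(obj) instead of repeating the sys.stdin.encoding or sys.getdefaultencoding() logic, and a better heuristic would only need to change in one place.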

Thomas Kluyver

Note that in some places we also use sys.getfilesystemencoding().

Min RK
Brian E. Granger ellisonbg referenced this pull request from a commit January 10, 2012
Commit has since been removed from the repository and is no longer available.
Min RK minrk referenced this pull request April 09, 2014

#769 (reopened) #770


Showing 1 unique commit by 1 author.

Sep 06, 2011
Min RK Don't treat bytes objects as json-safe
closes gh-767

Showing 1 changed file with 6 additions and 2 deletions.

  1. 8  IPython/utils/
8  IPython/utils/
@@ -12,6 +12,7 @@

 # stdlib
 import re
+import sys
 import types
 from datetime import datetime

@@ -121,14 +122,17 @@ def json_clean(obj):

     # types that are 'atomic' and ok in json as-is.  bool doesn't need to be
     # listed explicitly because bools pass as int instances
-    atomic_ok = (basestring, int, float, types.NoneType)
+    atomic_ok = (unicode, int, float, types.NoneType)

     # containers that we need to convert into lists
     container_to_list = (tuple, set, types.GeneratorType)

     if isinstance(obj, atomic_ok):
         return obj
+    if isinstance(obj, bytes):
+        return obj.decode(sys.getdefaultencoding(), 'replace')
     if isinstance(obj, container_to_list) or (
         hasattr(obj, '__iter__') and hasattr(obj, 'next')):
         obj = list(obj)
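The effect of the change can be sketched in a reduced form for Python 3, where str plays the role of unicode. The function clean_for_json below is a hypothetical stand-in that covers only atoms, bytes and simple sequences; it is not the real json_clean.

```python
import json
import sys


def clean_for_json(obj):
    """Reduced, hypothetical stand-in for json_clean."""
    # Text, numbers and None are safe as-is (bools pass as int instances).
    if isinstance(obj, (str, int, float, type(None))):
        return obj
    # Bytes are not JSON-safe: decode them instead of passing them through.
    if isinstance(obj, bytes):
        return obj.decode(sys.getdefaultencoding(), "replace")
    # Containers are converted to lists, cleaning each element.
    if isinstance(obj, (list, tuple, set)):
        return [clean_for_json(item) for item in obj]
    raise TypeError("unsupported type in this sketch: %r" % type(obj))


# On Python 3, json.dumps rejects raw bytes outright.
try:
    json.dumps(b"abc")
    raw_bytes_ok = True
except TypeError:
    raw_bytes_ok = False

cleaned = json.dumps(clean_for_json((b"abc", 1, None)))
print(raw_bytes_ok, cleaned)
```

Note that on Python 3 sys.getdefaultencoding() is utf-8, so this sketch does not reproduce the ascii mangling discussed in the thread for Python 2.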
