Don't treat bytes objects as json-safe #769

Merged
merged 1 commit into from over 2 years ago

3 participants

Min RK Brian E. Granger Thomas Kluyver
Min RK
Owner

json_clean passed bytes objects through as JSON-safe, which is incorrect. This change decodes them with sys.getdefaultencoding() instead.

Should close #767

Brian E. Granger
Owner

This looks fine, I will merge.

Brian E. Granger ellisonbg merged commit 14867db into from September 06, 2011
Brian E. Granger ellisonbg closed this September 06, 2011
Thomas Kluyver
Collaborator

Hang on, on Python 2, sys.getdefaultencoding() is ascii. So, going back to @jstenar's test case, any non-ascii characters in a docstring get mangled into the replacement character. Surely we can have a better guess at the encoding used, e.g. utf-8, or whatever sys.stdin.encoding is?

Also, after running Jörgen's test script, I notice that even with this fix, doing b? at the Qt console still crashes the kernel with a unicode error in dumping JSON.
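The mangling described above can be reproduced with a minimal sketch. It is written for Python 3, so the `ascii` codec is named explicitly to stand in for what `sys.getdefaultencoding()` returns on Python 2; the sample string is an assumption, not the original test case.

```python
# Illustration only: bytes as they would appear in a UTF-8 encoded source file.
raw = "Jörgen".encode("utf-8")

# Decoding with ASCII and errors='replace' swaps every non-ASCII byte
# for U+FFFD (the replacement character).
mangled = raw.decode("ascii", "replace")

# Decoding with the encoding the bytes were actually written in keeps the text.
recovered = raw.decode("utf-8", "replace")

print(repr(mangled))
print(repr(recovered))
```

Either call avoids an exception thanks to `'replace'`, but only the second preserves the non-ASCII character.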

Min RK
Owner

Yes, it should do the same stdin.encoding guess we do elsewhere, though that will still not help in the many situations where stdin.encoding is None for the subprocess.
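A rough sketch of that kind of guess, with a fallback for the case where the stream reports no encoding. The helper name `guess_stream_encoding` and the UTF-8 fallback are assumptions made for illustration, not existing IPython code.

```python
import sys


def guess_stream_encoding(stream=None, fallback="utf-8"):
    """Hypothetical helper: return the stream's encoding, or a fallback.

    Covers the subprocess case where the stream is missing or its
    ``encoding`` attribute is None.
    """
    if stream is None:
        stream = sys.stdin
    # getattr also copes with sys.stdin itself being None.
    enc = getattr(stream, "encoding", None)
    return enc or fallback


print(guess_stream_encoding())
```

Whether the fallback should be UTF-8 or something platform-specific is exactly the open question in this thread.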

Thomas Kluyver
Collaborator

Is there some better way to get the system code page on Windows? Or should we guess UTF-8, since most docstrings will probably come from saved Python code, which I think is mostly UTF-8 encoded? Then again, most good code should be using unicode strings if it needs non-ascii characters.

Min RK
Owner

reopened as #770

Min RK
Owner

We should probably centralize our guessed encoding, so we don't have these sys.stdin.encoding or sys.getdefaultencoding() lines all over the place. That would also make it less painful if/when we find better ways to guess.
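One possible shape for such a central place, as a sketch only. The names `guess_default_encoding`, `DEFAULT_ENCODING` and `to_text` are hypothetical, and the order of the guesses (stdin, then locale, then the interpreter default) is an assumption rather than a settled decision.

```python
import locale
import sys


def guess_default_encoding():
    """Hypothetical single source of truth for the guessed encoding."""
    # First guess: whatever stdin reports (may be None in a subprocess).
    enc = getattr(sys.stdin, "encoding", None)
    if not enc:
        try:
            # Second guess: the user's locale preference.
            enc = locale.getpreferredencoding(False)
        except Exception:
            enc = None
    # Last resort: the interpreter default.
    return enc or sys.getdefaultencoding()


DEFAULT_ENCODING = guess_default_encoding()


def to_text(value, encoding=None):
    """Decode bytes with the central guess; pass other values through."""
    if isinstance(value, bytes):
        return value.decode(encoding or DEFAULT_ENCODING, "replace")
    return value
```

Callers would then import `DEFAULT_ENCODING` instead of calling `sys.stdin.encoding` or `sys.getdefaultencoding()` directly, so a better guess only has to be implemented once.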

Thomas Kluyver
Collaborator

Note that in some places we also use sys.getfilesystemencoding().

Brian E. Granger ellisonbg referenced this pull request from a commit January 10, 2012
Commit has since been removed from the repository and is no longer available.
Min RK minrk referenced this pull request April 09, 2014
Merged

#769 (reopened) #770


Showing 1 unique commit by 1 author.

Sep 06, 2011
Min RK Don't treat bytes objects as json-safe
closes gh-767
6cc41ec

Showing 1 changed file with 6 additions and 2 deletions.

IPython/utils/jsonutil.py (8 lines changed)

@@ -12,6 +12,7 @@
 #-----------------------------------------------------------------------------
 # stdlib
 import re
+import sys
 import types
 from datetime import datetime
 
@@ -121,14 +122,17 @@ def json_clean(obj):
     """
     # types that are 'atomic' and ok in json as-is.  bool doesn't need to be
     # listed explicitly because bools pass as int instances
-    atomic_ok = (basestring, int, float, types.NoneType)
+    atomic_ok = (unicode, int, float, types.NoneType)
     
     # containers that we need to convert into lists
     container_to_list = (tuple, set, types.GeneratorType)
     
     if isinstance(obj, atomic_ok):
         return obj
-
+    
+    if isinstance(obj, bytes):
+        return obj.decode(sys.getdefaultencoding(), 'replace')
+    
     if isinstance(obj, container_to_list) or (
         hasattr(obj, '__iter__') and hasattr(obj, 'next')):
         obj = list(obj)
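
For readers on Python 3, the effect of the patched branch can be approximated with the simplified sketch below. It is not the real `json_clean`: the function name `json_clean_sketch` is hypothetical, `str` stands in for Python 2's `unicode`, and the list, dict and `repr` handling is a reduced assumption about the rest of the function.

```python
import json
import sys


def json_clean_sketch(obj):
    """Simplified illustration of decoding bytes before JSON serialization."""
    # Text, numbers and None are JSON-safe as they are; bytes are not.
    atomic_ok = (str, int, float, type(None))
    container_to_list = (tuple, set)

    if isinstance(obj, atomic_ok):
        return obj

    # The branch added by this commit: decode bytes instead of passing them on.
    if isinstance(obj, bytes):
        return obj.decode(sys.getdefaultencoding(), "replace")

    if isinstance(obj, container_to_list):
        obj = list(obj)

    if isinstance(obj, list):
        return [json_clean_sketch(item) for item in obj]

    if isinstance(obj, dict):
        # Simplification: keys are stringified with str().
        return {str(k): json_clean_sketch(v) for k, v in obj.items()}

    # Anything else falls back to its repr in this sketch.
    return repr(obj)


print(json.dumps(json_clean_sketch({"doc": b"abc", "n": (1, 2)})))
```

Without the bytes branch, `json.dumps` on Python 3 raises `TypeError` for the `b"abc"` value.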