I use pandas, and since a recent update (sorry, I don't know what the old version was, but I'm now on IPython 3.0.0 and pandas 0.15.2) I need to set
    import sys
    reload(sys)
    sys.setdefaultencoding('utf-8')
in order to view the HTML version of a DataFrame; otherwise I get a UnicodeDecodeError.
Unfortunately, this workaround has the side effect that I no longer see output from print statements and %%time.
Per rkern's request, here is specifically the pandas content that breaks in my case:
    pd.DataFrame({'x':[u'water, 38.71 mg/L @ 25 \xb0C (est), water, 14.1 mg/L @ 25 \xb0C (exp)']})

    ---------------------------------------------------------------------------
    UnicodeDecodeError                        Traceback (most recent call last)
    <ipython-input-29-eaddc9d8e339> in <module>()
    ----> 1 pd.DataFrame({'x':[u'water, 38.71 mg/L @ 25 \xb0C (est), water, 14.1 mg/L @ 25 \xb0C (exp)']})

    /Users/nathan/.virtualenvs/A1/lib/python2.7/site-packages/IPython/core/displayhook.pyc in __call__(self, result)
        236             self.write_format_data(format_dict, md_dict)
        237             self.log_output(format_dict)
    --> 238         self.finish_displayhook()
        239
        240     def cull_cache(self):

    /Users/nathan/.virtualenvs/A1/lib/python2.7/site-packages/IPython/kernel/zmq/displayhook.pyc in finish_displayhook(self)
         70         sys.stderr.flush()
         71         if self.msg['content']['data']:
    ---> 72             self.session.send(self.pub_socket, self.msg, ident=self.topic)
         73         self.msg = None
         74

    /Users/nathan/.virtualenvs/A1/lib/python2.7/site-packages/IPython/kernel/zmq/session.pyc in send(self, stream, msg_or_type, content, parent, ident, buffers, track, header, metadata)
        647         if self.adapt_version:
        648             msg = adapt(msg, self.adapt_version)
    --> 649         to_send = self.serialize(msg, ident)
        650         to_send.extend(buffers)
        651         longest = max([ len(s) for s in to_send ])

    /Users/nathan/.virtualenvs/A1/lib/python2.7/site-packages/IPython/kernel/zmq/session.pyc in serialize(self, msg, ident)
        551             content = self.none
        552         elif isinstance(content, dict):
    --> 553             content = self.pack(content)
        554         elif isinstance(content, bytes):
        555             # content is already packed, as in a relayed message

    /Users/nathan/.virtualenvs/A1/lib/python2.7/site-packages/IPython/kernel/zmq/session.pyc in <lambda>(obj)
         83 # disallow nan, because it's not actually valid JSON
         84 json_packer = lambda obj: jsonapi.dumps(obj, default=date_default,
    ---> 85     ensure_ascii=False, allow_nan=False,
         86 )
         87 json_unpacker = lambda s: jsonapi.loads(s)

    /Users/nathan/.virtualenvs/A1/lib/python2.7/site-packages/zmq/utils/jsonapi.pyc in dumps(o, **kwargs)
         38         kwargs['separators'] = (',', ':')
         39
    ---> 40     s = jsonmod.dumps(o, **kwargs)
         41
         42     if isinstance(s, unicode):

    /usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.pyc in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, encoding, default, sort_keys, **kw)
        248         check_circular=check_circular, allow_nan=allow_nan, indent=indent,
        249         separators=separators, encoding=encoding, default=default,
    --> 250         sort_keys=sort_keys, **kw).encode(obj)
        251
        252

    /usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/encoder.pyc in encode(self, o)
        208         if not isinstance(chunks, (list, tuple)):
        209             chunks = list(chunks)
    --> 210         return ''.join(chunks)
        211
        212     def iterencode(self, o, _one_shot=False):

    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 81: ordinal not in range(128)
Add the workaround:
    import sys
    reload(sys)
    sys.setdefaultencoding('utf-8')
    pd.DataFrame({'x':[u'water, 38.71 mg/L @ 25 \xb0C (est), water, 14.1 mg/L @ 25 \xb0C (exp)']})
and I get the table with the text as you would expect.
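For context, the byte 0xc2 in the error message is consistent with the degree sign (\xb0) having been UTF-8 encoded into a byte string and then decoded as ASCII, which Python 2 does implicitly when byte strings and unicode strings are joined. The sketch below is only an illustration of that mechanism, not the code path IPython takes. It is written so that it also runs on Python 3, where the decode has to be made explicit, and the variable names are my own.

```python
# Illustrative only: trigger the same kind of error explicitly.
degree = u'\xb0'                   # DEGREE SIGN, as in the DataFrame text
encoded = degree.encode('utf-8')   # UTF-8 form is two bytes: 0xc2 0xb0

try:
    # Python 2 performs this decode implicitly when str and unicode are mixed.
    encoded.decode('ascii')
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

print(repr(encoded))
print(message)
```

The position reported here differs from the one in the traceback, because this sketch decodes only the two bytes of the degree sign rather than the full serialized message.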
Please provide an example of how the HTML version of a dataframe gives you a UnicodeDecodeError. That's the bug to fix. The bug may be in Pandas.
sys.setdefaultencoding() should never be used, so the fact that setting it breaks other stuff is not something to worry about.
I think we're already tracking the issue with displaying DataFrames as #6799.
I agree with @rkern that sys.setdefaultencoding() should be expected to break stuff.
And it looks like Min already fixed that, so non-ascii data frames should work in 3.1.
Or you can upgrade to Python 3, where unicode generally isn't a problem.
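A rough sketch of why Python 3 sidesteps this class of error: str is always Unicode text there, so json.dumps with ensure_ascii=False (the option visible in the json_packer line of the traceback) returns text without any implicit ASCII decode. The payload dict below is a hypothetical stand-in for a display message, not IPython's actual message format.

```python
import json

# Hypothetical stand-in for the HTML representation of one table cell.
text = 'water, 38.71 mg/L @ 25 \xb0C (est)'
payload = {'text/html': '<td>' + text + '</td>'}

# Same keyword arguments as the json_packer shown in the traceback.
serialized = json.dumps(payload, ensure_ascii=False, allow_nan=False)

# Encoding to bytes is an explicit step on Python 3, never an implicit one.
wire_bytes = serialized.encode('utf-8')
roundtrip = json.loads(wire_bytes.decode('utf-8'))

print(serialized)
```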
Thanks for clearing that up!