#769 (reopened) #770

Merged
merged 3 commits into from Sep 12, 2011

Projects

None yet

3 participants

@minrk
Member
minrk commented Sep 6, 2011

reopens #769 with further fixes

Ensures all replies from ipkernel are clean for json (not just oinfo), and guesses stdin.encoding before using sys.getdefaultencoding in json_clean.

@minrk
Member
minrk commented Sep 7, 2011

@takluyver, does this seem more reasonable for the json_clean?

It now performs the same check as elsewhere, which still fails to get an answer in plenty of situations, and cleans more than just oinfo messages.

@minrk
Member
minrk commented Sep 7, 2011

@takluyver perhaps we should have an IPython.zmq.defaultencoding that starts out as just sys.stdin.encoding or sys.getdefaultencoding(), but people can change its value. Since the kernel can be started without there ever being a terminal associated with it (e.g. as a GUI script), it makes sense for there to be somewhere to store an encoding that should be used to interpret bytes. I think there just isn't a reliable way for us to always get the right answer, and when that's the case, it makes sense to let advanced users make the choice.

@minrk
Member
minrk commented Sep 7, 2011

It seems like locale.getpreferredencoding() is a less conservative choice for fallback than sys.getdefaultencoding(). Perhaps we should use that.

@minrk
Member
minrk commented Sep 7, 2011

updated with IPython.utils.text.getdefaultencoding(), which also fixes the issue described in #775 (at least on my OSX machine).

@ivanov
Member
ivanov commented Sep 7, 2011

confirming that issue described in #775 is fixed by this PL

@minrk
Member
minrk commented Sep 8, 2011

Confirmed as closing #768

minrk added some commits Sep 6, 2011
@minrk minrk ensure replies from ipkernel are clean for JSON 5a3b97e
@minrk minrk add text.getdefaultencoding() for central default encoding guess
This is a central location for the many places we call sys.stdin.encoding or sys.getdefaultencoding(), which
now adds locale.getpreferredencoding(False) after stdin.encoding,
which should be a better guess when stdin.encoding is None.
6392ceb
@minrk minrk json_clean zmqshell replies
closes gh-535
bc4e206
@fperez
Member
fperez commented Sep 12, 2011

Just a note, Min: this one doesn't seem to solve the weird errors on %debug in the console I was mentioning today. To reproduce those:

  1. start a notebook, type anything in a cell that would cause an exception (but from code executed in the cell, not by using %run).
  2. open a qt console to the notebook's kernel
  3. type %debug in the console

At 3, I see little triangle junk characters in the traceback printout. If the traceback was generated from %run, there's no problem.

@minrk
Member
minrk commented Sep 12, 2011

@fperez, too bad it's not this. Can you get info on the characters that are being printed? I should note that I can't actually reproduce what you describe by following your instructions (with a 1/0 error).

@fperez
Member
fperez commented Sep 12, 2011

Weird, here's a screenshot of what I get: http://imgur.com/RWa10

I've tried it with a few different fonts in the Qt widget and I get the same thing, so it doesn't seem to be font-related... Any ideas? I can hop on irc if you want...

@minrk
Member
minrk commented Sep 12, 2011

@fperez

I got to my Ubuntu machine today, and I can reproduce what you describe without ever invoking a notebook. Just a single qtconsole, raise an error and invoke %debug, and I see the weird triangles. It happend with every invocation of %debug in the qtconsole, no need for multiple clients. And it's reproducible all the way back to 0.11 release, so it has nothing to do with recent unicode fixes like I thought.

This is using PyQt4 (Ubuntu 10.04 LTS, qt4/pyqt4 from apt: PyQt4 4.7.2, Qt 4.6.2).

As we discussed on IRC, this only affects PyQt, and not PySide, and the reason it appears new is that your PySide is 1.0.0, and PR #725 made the minimum PySide version 1.0.3, effectively switching your default from PySide to PyQt.

@fperez
Member
fperez commented Sep 12, 2011

Yup, you're right, I see it too. At least it's good to know it's an old problem we simply hadn't noticed and not something we broke recently... I'll open an issue and ping Evan about it. Thanks for the extra info!

@minrk
Member
minrk commented Sep 12, 2011

I added debug statements to the frontend, and I can see the difference between the normal traceback sent, and the one that's drawn wrong: null characters. Each triangle corresponds to a (\x00) char that is somehow added to the color-code only in the source, and only from ipdb.

Raising the error with 222/0, the line in the traceback is:
\x1b[0;32m----> 1\x1b[0;31m \x1b[0;36m222\x1b[0m\x1b[0;34m/\x1b[0m\x1b[0;36m0\x1b[0m\x1b[0;34m\x1b[0m\x1b[0m

Whereas the same line, colored by ipdb is:
\x1b[0;32m -1 \x1b[0;31m\x1b[0;36m2\x1b[0m\x1b[0;31m\x00\x1b[0m\x1b[0;36m2\x1b[0m\x1b[0;31m\x00\x1b[0m\x1b[0;36m2\x1b[0m\x1b[0;31m\x00\x1b[0m\x1b[0;34m/\x1b[0m\x1b[0;31m\x00\x1b[0m\x1b[0;36m0\x1b[0m\x1b[0;31m\x00\x1b[0m\x1b[0;34m\x1b[0m\x1b[0m

It would appear that PySide just ignores the \x00 null chars, whereas PyQt draws them as triangles. You can see this by simply doing print '\x00'. You will see a triangle with pyqt, and nothing in every other context I can find.

So this really seems like a bug in ipdb, just one that doesn't actually matter anywhere but in a pyqt-console.

@minrk
Member
minrk commented Sep 12, 2011

I should clarify, not a bug in ipdb, but rather a bug in pycolorize, which doesn't like unicode input. See the output of:

In [20]: from IPython.utils import PyColorize
    ...: p = PyColorize.Parser()

In [21]: p.format('5', 'str')
Out[21]: '\x1b[0;36m5\x1b[0m\x1b[0;34m\x1b[0m\x1b[0m\n'

In [22]: p.format(u'5', 'str')
Out[22]: '\x1b[0;36m5\x1b[0m\x1b[0;31m\x00\x1b[0m\x1b[0;34m\x1b[0m\x1b[0m\n'
@fperez
Member
fperez commented Sep 12, 2011

Nailed, thanks! Now at least we know where the problem is coming from...

I'll have a go at it now.

@minrk
Member
minrk commented Sep 12, 2011

Ah, easy fix: use StringIO instead of cStringIO. StringIO is unicode aware, but cStringIO is not. Thomas discovered this, and has fixed it in some other parts of the code, if I recall correctly. Should we just find/replace all cStringIOs (there are still a few)?

@fperez
Member
fperez commented Sep 12, 2011

Hey,

On Mon, Sep 12, 2011 at 11:52 AM, Min RK
reply@reply.github.com
wrote:

Ah, easy fix: use StringIO instead of cStringIO.  StringIO is unicode aware, but cStringIO is not.  Thomas discovered this, and has fixed it in some other parts of the code, if I recall correctly.  Should we just find/replace all cStringIOs (there are still a few)?

Yes!!! We'd been using the C version simply because that used to be
the advice ages ago, for performance reasons. But lack of unicode
support is definitely a killer, so by all means. If you're on it, go
ahead and do it! I can test/review right away.

thanks

@fperez
Member
fperez commented Sep 12, 2011

Back to actually discussing this PR, sorry for hijacking things with the other bug...

This looks good and the right thing to do. Min, thanks for the work! I'll merge now.

@fperez fperez merged commit ede7936 into ipython:master Sep 12, 2011
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment