#769 (reopened) #770

Merged
merged 3 commits into from Sep 12, 2011

Conversation

Projects
None yet
3 participants
@minrk
Member

minrk commented Sep 6, 2011

reopens #769 with further fixes

Ensures all replies from ipkernel are clean for json (not just oinfo), and guesses stdin.encoding before using sys.getdefaultencoding in json_clean.

@minrk

This comment has been minimized.

Show comment
Hide comment
@minrk

minrk Sep 7, 2011

Member

@takluyver, does this seem more reasonable for the json_clean?

It now performs the same check as elsewhere, which still fails to get an answer in plenty of situations, and cleans more than just oinfo messages.

Member

minrk commented Sep 7, 2011

@takluyver, does this seem more reasonable for the json_clean?

It now performs the same check as elsewhere, which still fails to get an answer in plenty of situations, and cleans more than just oinfo messages.

@minrk

This comment has been minimized.

Show comment
Hide comment
@minrk

minrk Sep 7, 2011

Member

@takluyver perhaps we should have an IPython.zmq.defaultencoding that starts out as just sys.stdin.encoding or sys.getdefaultencoding(), but people can change its value. Since the kernel can be started without there ever being a terminal associated with it (e.g. as a GUI script), it makes sense for there to be somewhere to store an encoding that should be used to interpret bytes. I think there just isn't a reliable way for us to always get the right answer, and when that's the case, it makes sense to let advanced users make the choice.

Member

minrk commented Sep 7, 2011

@takluyver perhaps we should have an IPython.zmq.defaultencoding that starts out as just sys.stdin.encoding or sys.getdefaultencoding(), but people can change its value. Since the kernel can be started without there ever being a terminal associated with it (e.g. as a GUI script), it makes sense for there to be somewhere to store an encoding that should be used to interpret bytes. I think there just isn't a reliable way for us to always get the right answer, and when that's the case, it makes sense to let advanced users make the choice.

@minrk

This comment has been minimized.

Show comment
Hide comment
@minrk

minrk Sep 7, 2011

Member

It seems like locale.getpreferredencoding() is a less conservative choice for fallback than sys.getdefaultencoding(). Perhaps we should use that.

Member

minrk commented Sep 7, 2011

It seems like locale.getpreferredencoding() is a less conservative choice for fallback than sys.getdefaultencoding(). Perhaps we should use that.

@minrk

This comment has been minimized.

Show comment
Hide comment
@minrk

minrk Sep 7, 2011

Member

updated with IPython.utils.text.getdefaultencoding(), which also fixes the issue described in #775 (at least on my OSX machine).

Member

minrk commented Sep 7, 2011

updated with IPython.utils.text.getdefaultencoding(), which also fixes the issue described in #775 (at least on my OSX machine).

@ivanov

This comment has been minimized.

Show comment
Hide comment
@ivanov

ivanov Sep 7, 2011

Member

confirming that issue described in #775 is fixed by this PL

Member

ivanov commented Sep 7, 2011

confirming that issue described in #775 is fixed by this PL

@minrk

This comment has been minimized.

Show comment
Hide comment
@minrk

minrk Sep 8, 2011

Member

Confirmed as closing #768

Member

minrk commented Sep 8, 2011

Confirmed as closing #768

minrk added some commits Sep 6, 2011

add text.getdefaultencoding() for central default encoding guess
This is a central location for the many places we call sys.stdin.encoding or sys.getdefaultencoding(), which
now adds locale.getpreferredencoding(False) after stdin.encoding,
which should be a better guess when stdin.encoding is None.
@fperez

This comment has been minimized.

Show comment
Hide comment
@fperez

fperez Sep 12, 2011

Member

Just a note, Min: this one doesn't seem to solve the weird errors on %debug in the console I was mentioning today. To reproduce those:

  1. start a notebook, type anything in a cell that would cause an exception (but from code executed in the cell, not by using %run).
  2. open a qt console to the notebook's kernel
  3. type %debug in the console

At 3, I see little triangle junk characters in the traceback printout. If the traceback was generated from %run, there's no problem.

Member

fperez commented Sep 12, 2011

Just a note, Min: this one doesn't seem to solve the weird errors on %debug in the console I was mentioning today. To reproduce those:

  1. start a notebook, type anything in a cell that would cause an exception (but from code executed in the cell, not by using %run).
  2. open a qt console to the notebook's kernel
  3. type %debug in the console

At 3, I see little triangle junk characters in the traceback printout. If the traceback was generated from %run, there's no problem.

@minrk

This comment has been minimized.

Show comment
Hide comment
@minrk

minrk Sep 12, 2011

Member

@fperez, too bad it's not this. Can you get info on the characters that are being printed? I should note that I can't actually reproduce what you describe by following your instructions (with a 1/0 error).

Member

minrk commented Sep 12, 2011

@fperez, too bad it's not this. Can you get info on the characters that are being printed? I should note that I can't actually reproduce what you describe by following your instructions (with a 1/0 error).

@fperez

This comment has been minimized.

Show comment
Hide comment
@fperez

fperez Sep 12, 2011

Member

Weird, here's a screenshot of what I get: http://imgur.com/RWa10

I've tried it with a few different fonts in the Qt widget and I get the same thing, so it doesn't seem to be font-related... Any ideas? I can hop on irc if you want...

Member

fperez commented Sep 12, 2011

Weird, here's a screenshot of what I get: http://imgur.com/RWa10

I've tried it with a few different fonts in the Qt widget and I get the same thing, so it doesn't seem to be font-related... Any ideas? I can hop on irc if you want...

@minrk

This comment has been minimized.

Show comment
Hide comment
@minrk

minrk Sep 12, 2011

Member

@fperez

I got to my Ubuntu machine today, and I can reproduce what you describe without ever invoking a notebook. Just a single qtconsole, raise an error and invoke %debug, and I see the weird triangles. It happend with every invocation of %debug in the qtconsole, no need for multiple clients. And it's reproducible all the way back to 0.11 release, so it has nothing to do with recent unicode fixes like I thought.

This is using PyQt4 (Ubuntu 10.04 LTS, qt4/pyqt4 from apt: PyQt4 4.7.2, Qt 4.6.2).

As we discussed on IRC, this only affects PyQt, and not PySide, and the reason it appears new is that your PySide is 1.0.0, and PR #725 made the minimum PySide version 1.0.3, effectively switching your default from PySide to PyQt.

Member

minrk commented Sep 12, 2011

@fperez

I got to my Ubuntu machine today, and I can reproduce what you describe without ever invoking a notebook. Just a single qtconsole, raise an error and invoke %debug, and I see the weird triangles. It happend with every invocation of %debug in the qtconsole, no need for multiple clients. And it's reproducible all the way back to 0.11 release, so it has nothing to do with recent unicode fixes like I thought.

This is using PyQt4 (Ubuntu 10.04 LTS, qt4/pyqt4 from apt: PyQt4 4.7.2, Qt 4.6.2).

As we discussed on IRC, this only affects PyQt, and not PySide, and the reason it appears new is that your PySide is 1.0.0, and PR #725 made the minimum PySide version 1.0.3, effectively switching your default from PySide to PyQt.

@fperez

This comment has been minimized.

Show comment
Hide comment
@fperez

fperez Sep 12, 2011

Member

Yup, you're right, I see it too. At least it's good to know it's an old problem we simply hadn't noticed and not something we broke recently... I'll open an issue and ping Evan about it. Thanks for the extra info!

Member

fperez commented Sep 12, 2011

Yup, you're right, I see it too. At least it's good to know it's an old problem we simply hadn't noticed and not something we broke recently... I'll open an issue and ping Evan about it. Thanks for the extra info!

@minrk

This comment has been minimized.

Show comment
Hide comment
@minrk

minrk Sep 12, 2011

Member

I added debug statements to the frontend, and I can see the difference between the normal traceback sent, and the one that's drawn wrong: null characters. Each triangle corresponds to a (\x00) char that is somehow added to the color-code only in the source, and only from ipdb.

Raising the error with 222/0, the line in the traceback is:
\x1b[0;32m----> 1\x1b[0;31m \x1b[0;36m222\x1b[0m\x1b[0;34m/\x1b[0m\x1b[0;36m0\x1b[0m\x1b[0;34m\x1b[0m\x1b[0m

Whereas the same line, colored by ipdb is:
\x1b[0;32m -1 \x1b[0;31m\x1b[0;36m2\x1b[0m\x1b[0;31m\x00\x1b[0m\x1b[0;36m2\x1b[0m\x1b[0;31m\x00\x1b[0m\x1b[0;36m2\x1b[0m\x1b[0;31m\x00\x1b[0m\x1b[0;34m/\x1b[0m\x1b[0;31m\x00\x1b[0m\x1b[0;36m0\x1b[0m\x1b[0;31m\x00\x1b[0m\x1b[0;34m\x1b[0m\x1b[0m

It would appear that PySide just ignores the \x00 null chars, whereas PyQt draws them as triangles. You can see this by simply doing print '\x00'. You will see a triangle with pyqt, and nothing in every other context I can find.

So this really seems like a bug in ipdb, just one that doesn't actually matter anywhere but in a pyqt-console.

Member

minrk commented Sep 12, 2011

I added debug statements to the frontend, and I can see the difference between the normal traceback sent, and the one that's drawn wrong: null characters. Each triangle corresponds to a (\x00) char that is somehow added to the color-code only in the source, and only from ipdb.

Raising the error with 222/0, the line in the traceback is:
\x1b[0;32m----> 1\x1b[0;31m \x1b[0;36m222\x1b[0m\x1b[0;34m/\x1b[0m\x1b[0;36m0\x1b[0m\x1b[0;34m\x1b[0m\x1b[0m

Whereas the same line, colored by ipdb is:
\x1b[0;32m -1 \x1b[0;31m\x1b[0;36m2\x1b[0m\x1b[0;31m\x00\x1b[0m\x1b[0;36m2\x1b[0m\x1b[0;31m\x00\x1b[0m\x1b[0;36m2\x1b[0m\x1b[0;31m\x00\x1b[0m\x1b[0;34m/\x1b[0m\x1b[0;31m\x00\x1b[0m\x1b[0;36m0\x1b[0m\x1b[0;31m\x00\x1b[0m\x1b[0;34m\x1b[0m\x1b[0m

It would appear that PySide just ignores the \x00 null chars, whereas PyQt draws them as triangles. You can see this by simply doing print '\x00'. You will see a triangle with pyqt, and nothing in every other context I can find.

So this really seems like a bug in ipdb, just one that doesn't actually matter anywhere but in a pyqt-console.

@minrk

This comment has been minimized.

Show comment
Hide comment
@minrk

minrk Sep 12, 2011

Member

I should clarify, not a bug in ipdb, but rather a bug in pycolorize, which doesn't like unicode input. See the output of:

In [20]: from IPython.utils import PyColorize
    ...: p = PyColorize.Parser()

In [21]: p.format('5', 'str')
Out[21]: '\x1b[0;36m5\x1b[0m\x1b[0;34m\x1b[0m\x1b[0m\n'

In [22]: p.format(u'5', 'str')
Out[22]: '\x1b[0;36m5\x1b[0m\x1b[0;31m\x00\x1b[0m\x1b[0;34m\x1b[0m\x1b[0m\n'
Member

minrk commented Sep 12, 2011

I should clarify, not a bug in ipdb, but rather a bug in pycolorize, which doesn't like unicode input. See the output of:

In [20]: from IPython.utils import PyColorize
    ...: p = PyColorize.Parser()

In [21]: p.format('5', 'str')
Out[21]: '\x1b[0;36m5\x1b[0m\x1b[0;34m\x1b[0m\x1b[0m\n'

In [22]: p.format(u'5', 'str')
Out[22]: '\x1b[0;36m5\x1b[0m\x1b[0;31m\x00\x1b[0m\x1b[0;34m\x1b[0m\x1b[0m\n'
@fperez

This comment has been minimized.

Show comment
Hide comment
@fperez

fperez Sep 12, 2011

Member

Nailed, thanks! Now at least we know where the problem is coming from...

I'll have a go at it now.

Member

fperez commented Sep 12, 2011

Nailed, thanks! Now at least we know where the problem is coming from...

I'll have a go at it now.

@minrk

This comment has been minimized.

Show comment
Hide comment
@minrk

minrk Sep 12, 2011

Member

Ah, easy fix: use StringIO instead of cStringIO. StringIO is unicode aware, but cStringIO is not. Thomas discovered this, and has fixed it in some other parts of the code, if I recall correctly. Should we just find/replace all cStringIOs (there are still a few)?

Member

minrk commented Sep 12, 2011

Ah, easy fix: use StringIO instead of cStringIO. StringIO is unicode aware, but cStringIO is not. Thomas discovered this, and has fixed it in some other parts of the code, if I recall correctly. Should we just find/replace all cStringIOs (there are still a few)?

@fperez

This comment has been minimized.

Show comment
Hide comment
@fperez

fperez Sep 12, 2011

Member

Hey,

On Mon, Sep 12, 2011 at 11:52 AM, Min RK
reply@reply.github.com
wrote:

Ah, easy fix: use StringIO instead of cStringIO.  StringIO is unicode aware, but cStringIO is not.  Thomas discovered this, and has fixed it in some other parts of the code, if I recall correctly.  Should we just find/replace all cStringIOs (there are still a few)?

Yes!!! We'd been using the C version simply because that used to be
the advice ages ago, for performance reasons. But lack of unicode
support is definitely a killer, so by all means. If you're on it, go
ahead and do it! I can test/review right away.

thanks

Member

fperez commented Sep 12, 2011

Hey,

On Mon, Sep 12, 2011 at 11:52 AM, Min RK
reply@reply.github.com
wrote:

Ah, easy fix: use StringIO instead of cStringIO.  StringIO is unicode aware, but cStringIO is not.  Thomas discovered this, and has fixed it in some other parts of the code, if I recall correctly.  Should we just find/replace all cStringIOs (there are still a few)?

Yes!!! We'd been using the C version simply because that used to be
the advice ages ago, for performance reasons. But lack of unicode
support is definitely a killer, so by all means. If you're on it, go
ahead and do it! I can test/review right away.

thanks

@fperez

This comment has been minimized.

Show comment
Hide comment
@fperez

fperez Sep 12, 2011

Member

Back to actually discussing this PR, sorry for hijacking things with the other bug...

This looks good and the right thing to do. Min, thanks for the work! I'll merge now.

Member

fperez commented Sep 12, 2011

Back to actually discussing this PR, sorry for hijacking things with the other bug...

This looks good and the right thing to do. Min, thanks for the work! I'll merge now.

fperez added a commit that referenced this pull request Sep 12, 2011

Merge pull request #770 from minrk/jsonclean
Ensures all replies from ipkernel are clean for json (not just oinfo), and guesses stdin.encoding before using sys.getdefaultencoding in json_clean.

@fperez fperez merged commit ede7936 into ipython:master Sep 12, 2011

mattvonrocketstein pushed a commit to mattvonrocketstein/ipython that referenced this pull request Nov 3, 2014

Merge pull request #770 from minrk/jsonclean
Ensures all replies from ipkernel are clean for json (not just oinfo), and guesses stdin.encoding before using sys.getdefaultencoding in json_clean.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment