Codepage of output from scripts and shell commands is not handled properly by qtconsole #768

Closed
jstenar opened this Issue September 06, 2011 · 14 comments

Jörgen Stenarson
Collaborator

On my machine, when running ls in a qtconsole, any non-ASCII characters in the output are garbage (a diamond-shaped question mark).

I have a test script at https://gist.github.com/1198529 that can be used to illustrate the problem.

In a regular ipython terminal I get the correct result for:

In [1]: %run run-encoding.py cp1252
Test data åäö

But, as expected, I get incorrect results for:

In [2]: %run run-encoding.py cp850
Test data †„”

In [3]: %run run-encoding.py utf-8
Test data åäö

However, when running in qtconsole I get incorrect results in all three cases.
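The terminal results above can be reproduced without the gist. The following is a minimal sketch, assuming the test script writes the text "åäö" as raw bytes in the named encoding and that the terminal interprets those bytes as cp1252. The helper name `shown_in_cp1252_terminal` is hypothetical and is not taken from the gist.

```python
# Sketch: encode the sample text in a given encoding, then decode the raw
# bytes as cp1252, which is what a cp1252 terminal effectively does.
TEXT = "åäö"


def shown_in_cp1252_terminal(encoding, text=TEXT):
    """Return what a cp1252 terminal would display for `text` written as
    bytes in `encoding` (hypothetical helper, for illustration only)."""
    raw = text.encode(encoding)
    return raw.decode("cp1252", "replace")


# Results for the three encodings used in the report.
as_cp1252 = shown_in_cp1252_terminal("cp1252")
as_cp850 = shown_in_cp1252_terminal("cp850")
as_utf8 = shown_in_cp1252_terminal("utf-8")
```

Under these assumptions, `as_cp1252` is the original text, `as_cp850` is `†„”`, and `as_utf8` is `Ã¥Ã¤Ã¶`, matching the three terminal outputs quoted above.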

/Jörgen

Min RK
Owner

The basic reason is that the 'encoding' associated with the qtconsole is sys.getdefaultencoding(), so just like you get the wrong answer in everything but cp1252 in your Windows terminal, you get the wrong answer in everything but the default encoding (generally ascii) in the qtconsole. The question marks are the result of s.decode(sys.getdefaultencoding(), 'replace').

The general idea is that if you are printing unicode, you should be printing unicode objects, which will behave correctly, not bytes objects, which have discarded the character meaning of their contents.
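A small sketch of the effect described here, written for Python 3. It assumes an ASCII default, which is what sys.getdefaultencoding() generally returned on Python 2 at the time; on Python 3 that call returns "utf-8", so the encoding is passed explicitly. The function name `decode_like_console` is hypothetical.

```python
# Sketch only: "ascii" stands in for the Python 2 value of
# sys.getdefaultencoding(); on Python 3 that call returns "utf-8".
ASSUMED_DEFAULT = "ascii"


def decode_like_console(raw, encoding=ASSUMED_DEFAULT):
    """Decode bytes, substituting U+FFFD for every undecodable byte."""
    return raw.decode(encoding, "replace")


utf8_bytes = "åäö".encode("utf-8")      # six bytes, two per character
cp1252_bytes = "åäö".encode("cp1252")   # three bytes, one per character

utf8_garbled = decode_like_console(utf8_bytes)      # six U+FFFD characters
cp1252_garbled = decode_like_console(cp1252_bytes)  # three U+FFFD characters

# Decoding with the encoding the bytes were actually written in recovers
# the text, which is why printing unicode objects avoids the problem.
recovered = decode_like_console(utf8_bytes, "utf-8")
```

Each non-ASCII byte becomes one replacement character, which a Qt widget typically renders as the question-mark diamond.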

Jörgen Stenarson
Collaborator
Min RK
Owner

I was mistaken, we actually start with sys.stdin.encoding, and fall back to getdefaultencoding, but sys.stdin.encoding is often None for subprocesses like the kernel.

In any case, I think if we give the OutStream (what we replace sys.stdout with) object a configurable encoding attr, much of this should be helped, and would be configurable.
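To illustrate the proposal, here is a rough sketch of a stdout replacement with a configurable encoding attribute. `SketchOutStream` is a hypothetical stand-in, not IPython's actual OutStream; the real class publishes over the kernel's messaging layer, whereas this one simply collects text in memory.

```python
import io
import sys


class SketchOutStream:
    """Hypothetical stdout replacement with a settable `encoding`."""

    def __init__(self, encoding=None):
        # Assumed default: fall back to the interpreter default when no
        # encoding is configured.
        self.encoding = encoding or sys.getdefaultencoding()
        self._buffer = io.StringIO()

    def write(self, data):
        # Bytes carry no character meaning, so interpret them using the
        # configured encoding; text passes through untouched.
        if isinstance(data, bytes):
            data = data.decode(self.encoding, "replace")
        self._buffer.write(data)

    def getvalue(self):
        return self._buffer.getvalue()


stream = SketchOutStream(encoding="cp1252")
stream.write("åäö".encode("cp1252"))  # bytes, decoded with stream.encoding
stream.write(" ok")                   # text, written as-is
```

With the encoding set to match the bytes being written, `stream.getvalue()` holds the intended text; changing `stream.encoding` changes how later bytes are interpreted.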

Thomas Kluyver
Collaborator

It's not entirely clear what the 'correct' encoding is, because we're not limited by the terminal code page. If you do print "åäö", should we assume that to be in the encoding a terminal would force you to use, or UTF-8, or something else?

For external processes, I think we should decode the bytes as we read them from the other process, and assume that it's using the system code page. I thought we already did this, but I guess it must be going wrong somewhere.
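The approach for external processes might look like the sketch below: read raw bytes from the child and decode them once, at the boundary. It assumes locale.getpreferredencoding() is an acceptable proxy for the system code page; `run_and_decode` is a hypothetical helper, not an IPython function.

```python
import locale
import subprocess
import sys


def run_and_decode(cmd, encoding=None):
    """Run `cmd`, capture stdout as bytes, and decode it.

    When `encoding` is None, the locale's preferred encoding is assumed
    to match what the child process emits.
    """
    encoding = encoding or locale.getpreferredencoding()
    raw = subprocess.run(cmd, stdout=subprocess.PIPE, check=True).stdout
    return raw.decode(encoding, "replace")


# A child process that emits the cp1252 bytes for the sample text.
child = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(bytes([0xE5, 0xE4, 0xF6]))",
]
decoded = run_and_decode(child, encoding="cp1252")
```

Passing the encoding explicitly, as in the usage above, covers the case where the child is known to use something other than the system default.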

Min RK
Owner

We use sys.stdin.encoding, which can be (and often is for subprocesses) None. If we give the OutStream object an encoding with the same default behavior it currently has, it should improve the situation, allowing users to set it when stdin encoding doesn't tell us anything.

Min RK
Owner

@jstenar, can you check if the code in PR #770 makes the behavior more reasonable for you? It adds checking the locale for encoding information, so if you change the locale, it will change the default interpretation of bytes objects.

Jörgen Stenarson
Collaborator
Fernando Perez
Owner

I've just merged #770 which supposedly helped with this, but on linux I still see problems. On the terminal I get:

In [4]: %run run-encoding.py utf-8
Test data åäö

but on the qtconsole I see the little question-mark-diamonds:

In [1]: %run run-encoding.py utf-8
Test data ������

So it seems we still have issues, no?

Min RK
Owner

Arg, I switched getpreferredencoding() to getpreferredencoding(False), since I thought it was safer. Turns out the opposite makes the most sense, and fixes this particular case.
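The fallback order discussed in this thread (stream encoding, then locale, then interpreter default) can be sketched as follows. This is an illustration under assumptions, not the code from PR #770, and `guess_stream_encoding` is a hypothetical name. Note that locale.getpreferredencoding() with its default argument may call setlocale, while passing False skips that step; how much the two differ depends on the platform and Python version.

```python
import locale
import sys


def guess_stream_encoding(stream=None):
    """Pick an encoding for interpreting bytes written to a console.

    Order: the stream's own encoding, then the locale's preferred
    encoding, then the interpreter default.
    """
    stream = sys.stdin if stream is None else stream
    # The encoding attribute may be missing or None, e.g. in a subprocess.
    encoding = getattr(stream, "encoding", None)
    if not encoding:
        # Default argument: consult the user's locale settings.
        encoding = locale.getpreferredencoding()
    return encoding or sys.getdefaultencoding()


class NoEncodingStream:
    """Stand-in for a kernel subprocess stdin with no encoding."""
    encoding = None


class Cp850Stream:
    """Stand-in for a stream that declares its encoding."""
    encoding = "cp850"


from_locale = guess_stream_encoding(NoEncodingStream())
from_stream = guess_stream_encoding(Cp850Stream())
```

Here `from_stream` is "cp850", while `from_locale` depends on the machine's locale settings.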

Fernando Perez
Owner

@minrk, since #770 is already merged, do you want to just make that change in master? We can then retest this...

Min RK
Owner

Sure, change pushed.

Fernando Perez
Owner

OK, with Min's fix, master does work for me now both at the terminal and the qtconsole. I should note that only utf-8 shows the output correctly; cp1252 still shows the diamonds on Linux. But I imagine that's correct on a Linux box...

So now that this has been merged, should we close the original issue? @jstenar?

Jörgen Stenarson
Collaborator
Min RK
Owner

closed by PR #770

Min RK minrk closed this September 12, 2011