unicode bug - encoding input #25
Comments
[ LP comment 1 by: Murkt, on 2009-03-08 20:07:25.828411+00:00 ] This line in trunk: http://bazaar.launchpad.net/~ipython-dev/ipython/trunk/annotate/head%3A/IPython//iplib.py#L2031 |
[ LP comment 2 by: Murkt, on 2009-03-08 20:54:18.976119+00:00 ] This bug was already noticed in 2006: http://lists.ipython.scipy.org/pipermail/ipython-dev/2006-August/002305.html |
[ LP comment 3 by: Sergey Kishchenko, on 2009-03-09 11:57:58.714729+00:00 ] I confirm this bug. Attached patch fixed the issue for me |
[ LP comment 4 by: Laurent Dufrechou, on 2009-03-17 20:20:24.848465+00:00 ] also related : |
[ LP comment 5 by: Fernando Perez, on 2009-03-17 20:24:00.727454+00:00 ] That's indeed a bug, but the patch is removing a line that was put in there explicitly for some reason. So what I'd like to have, before committing this, is a set of tests in a file named test_unicode.py, that encapsulates all of the recent unicode work. Unfortunately a lot of these unicode fixes have been made in a completely ad-hoc manner, as people report problems, but we don't have a centralized list of cases to check against. His may be a reasonable fix, for all I know, but I'm afraid that if we apply it we'll get back 10 old bugs again. I don't know, maybe not, but there's simply no way to be sure. I'm one of the most ignorant of our bunch in unicode issues, blissfully living in the stupidity of the ascii world. It would be great if one of us who knows more about this stuff could at least write a set of simple unicode tests that catch many of the recently reported encoding problems. Jorgen, Ville, any chance you guys could take this up at some point? You know about it a lot more than I do... |
[ LP comment 6 by: Jörgen Stenarson, on 2009-03-17 20:28:38.963967+00:00 ] The proposed patch does not work for me on win32, with or without pyreadline. sys.stdin.encoding == "cp1252"
Standard python:
c:\python>python
IPython from trunk:
c:\python>ipython
IPython 0.9.1 -- An enhanced Interactive Python.
In [1]: "åäö"
In [2]: u"åäö"
In [3]:
IPython with proposed change:
c:\python>ipython
IPython 0.9.1 -- An enhanced Interactive Python.
In [1]: "åäö"
In [2]: u"åäö"
In [3]: |
[ LP comment 7 by: Rodrigo Senra, on 2009-03-24 03:46:20.603219+00:00 ] This bug is still alive and kicking. It shows up whenever there is a unicode string in the source. A simple: x = u"ação" with the offending line becomes: x = u'a\xc3\xa7\xc3\xa3o' Notice that the encoding is done in place, and the u"" is kept after the encoding. This is wrong. |
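The mangling described in this comment can be reproduced outside IPython. What follows is a minimal Python 3 sketch, not IPython's code: it assumes that Python 2 read the non-ASCII bytes inside a `u""` literal of an undeclared byte-string source as Latin-1, and all variable names are illustrative.

```python
# Simulates the effect of encoding the source line before it is compiled.
text = "ação"

# Step 1: the offending line encodes the whole source to UTF-8 bytes.
utf8_bytes = text.encode("utf-8")

# Step 2 (assumption): each byte inside the u"" literal is then read as one
# Latin-1 character, so every non-ASCII character turns into two characters.
mangled = utf8_bytes.decode("latin-1")

print(ascii(mangled))            # escaped form of the mangled string
print(len(text), len(mangled))   # the mangled string is longer than the input
```

The escaped form printed by the sketch is the same `a\xc3\xa7\xc3\xa3o` sequence quoted in the comment, which is why the result looks like UTF-8 bytes stored inside a unicode string.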
[ LP comment 8 by: INADA Naoki, on 2009-04-12 00:25:48.950763+00:00 ] This is another patch that handles encoded byte string literals and unicode literals correctly.
|
[ LP comment 9 by: Fernando Perez, on 2009-04-14 07:20:50+00:00 ] Can anyone provide a set of tests that we can actually run As I said earlier, it's quite possible that the various proposed fixes Sorry to seem like a curmudgeon: I really appreciate people |
[ LP comment 10 by: Brian Granger, on 2009-04-14 17:38:38+00:00 ] Definitely, I don't like playing whack-a-mole blind. These types of Brian On Tue, Apr 14, 2009 at 12:20 AM, Fernando Perez fperez.net@gmail.com wrote:
Brian E. Granger, Ph.D. |
[ LP comment 11 by: Jörgen Stenarson, on 2009-04-14 18:16:27+00:00 ] Fernando Perez skrev:
/Jörgen |
[ LP comment 12 by: Fernando Perez, on 2009-04-14 21:58:00+00:00 ] On Tue, Apr 14, 2009 at 11:16 AM, Jörgen Stenarson
Well, even if we have a special file we need to re-run by hand, that
we'll never get anywhere reliable on these unicode problems. We can %run test_unicode ourselves for the full visual verification. Cheers, f |
[ LP comment 13 by: gdamjan, on 2009-05-02 01:04:38.272380+00:00 ] I can confirm this bug and the solution given. Now obviously the bug is in the input handling of ipython .. how do you make test cases for that?? |
[ LP comment 14 by: Andy Mikhailenko, on 2009-05-14 20:52:29.106176+00:00 ] Confirming. "UTF-8" in all cases, IPython prints screwed up "unicode" strings and this renders the program almost unusable. Does anyone have ideas about how to test this? I guess IPython developers know more about the immense innards of the package than the bug reporters do, so users could expect at least some guidelines for writing tests, couldn't they? Maybe we should allow the bug-related behaviour to be tuned in user settings until the bug is finally fixed? This may also help with testing. |
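One possible shape for the tests requested in this thread is a small table of (text, terminal encoding) pairs that are pushed through the input path as raw bytes. The sketch below is Python 3 and does not exercise IPython at all: `evaluate_input` is a hypothetical stand-in for the shell's input handling, and the cases are only the strings quoted in the comments above.

```python
def evaluate_input(raw, stdin_encoding):
    """Hypothetical stand-in for the shell's input path: decode once, then evaluate."""
    return eval(raw.decode(stdin_encoding))


# (text typed by the user, terminal encoding) pairs taken from the reports above.
CASES = [
    ("åäö", "cp1252"),
    ("åäö", "utf-8"),
    ("ação", "utf-8"),
]


def test_string_literal_round_trip():
    for text, encoding in CASES:
        # Bytes as the terminal would deliver them for a quoted literal.
        raw = ('"' + text + '"').encode(encoding)
        assert evaluate_input(raw, encoding) == text


def test_wrong_encoding_is_detected():
    # Decoding UTF-8 bytes as cp1252 must not silently return the original text.
    raw = '"åäö"'.encode("utf-8")
    assert evaluate_input(raw, "cp1252") != "åäö"
```

Functions named `test_*` like these could live in the proposed test_unicode.py and be collected by a test runner. A real suite would replace `evaluate_input` with whatever entry point the shell actually exposes.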
[ LP comment 15 by: pawciobiel, on 2009-09-04 00:24:55.923328+00:00 ] Confirming. core/iplib.py Apart from the above, shouldn't the input be decoded if it's not unicode? cheers, |
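The "decode it if it's not unicode" idea from this comment can be sketched as a small helper. This is a Python 3 illustration with a hypothetical name, `ensure_text`, not a function from IPython; in Python 3 the byte-string type is `bytes`, where Python 2 code would check for `str`.

```python
def ensure_text(source, encoding):
    """Return text, decoding only when the input is still a byte string."""
    if isinstance(source, bytes):
        return source.decode(encoding)
    return source  # already text: leave it untouched


# Bytes as a cp1252 terminal would deliver them, and text that needs no work.
from_terminal = ensure_text(b"\xe5\xe4\xf6", "cp1252")
already_text = ensure_text("åäö", "cp1252")
print(from_terminal == already_text)
```

The point of the type check is that the conversion runs exactly once, so input that is already text is never decoded or encoded a second time.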
[ LP comment 16 by: INADA Naoki, on 2009-10-08 03:52:22.535452+00:00 ] I managed to fix this bug on the Python side: http://bugs.python.org/issue5911 |
[ LP comment 17 by: t0ster, on 2010-04-26 16:45:21.798105+00:00 ] The patches worked for me; removing 'source=source.encode(self.stdin_encoding)' helped on Mac OS X 10.6. Thanks |
On Launchpad, Thorsten Glaser wrote: I didn’t use the patch from LP: #290677 due to It may or may not touch all places needed and not break anything unrelated, It at least fixes the two Trac things for me. His patch is available here: It's unfortunately too late in the release cycle for 0.10.1 to properly test this, but if more testing shows it to be stable, we can push a 0.10.2 with this as a fix. |
Brian saw this on Python 2.6, Mac OS X 10.5:
======================================================================
ERROR: test_unicode (IPython.core.tests.test_inputsplitter.InputSplitterTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/bgranger/Documents/Computation/IPython/code/ipython/IPython/core/tests/test_inputsplitter.py", line 353, in test_unicode
    self.isp.push("u'\xc3\xa9'")
  File "/Users/bgranger/Documents/Computation/IPython/code/ipython/IPython/core/inputsplitter.py", line 374, in push
    self._store(lines)
  File "/Users/bgranger/Documents/Computation/IPython/code/ipython/IPython/core/inputsplitter.py", line 607, in _store
    setattr(self, store, self._set_source(buffer))
  File "/Users/bgranger/Documents/Computation/IPython/code/ipython/IPython/core/inputsplitter.py", line 610, in _set_source
    return ''.join(buffer).encode(self.encoding)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 2: ordinal not in range(128)
----------------------------------------------------------------------
Ran 270 tests in 1.974s
I don't see it on linux, but let's make sure it's fixed across all platforms after reworking the unicode machinery. |
Notes Robert Kern on list: The code is just wrong (at least on Python 2) since it calls .encode() on a byte string, not a unicode string. You've never decoded it. |
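Robert Kern's point can be illustrated with a short Python 3 sketch. Python 3 has no `bytes.encode()`, so the hypothetical helper `py2_style_encode` spells out the implicit step that Python 2 performed when `.encode()` was called on a byte string: a decode with the default codec (ASCII), followed by the requested encode.

```python
def py2_style_encode(raw, encoding):
    """Mimic Python 2's str.encode(): an implicit ASCII decode happens first."""
    return raw.decode("ascii").encode(encoding)


# The UTF-8 bytes behind the "u'\xc3\xa9'" input used in the failing test.
source = "u'é'".encode("utf-8")

error_message = None
try:
    py2_style_encode(source, "utf-8")
except UnicodeDecodeError as exc:
    # The failure comes from the hidden decode, not from the encode itself.
    error_message = str(exc)
    print(error_message)

# Decoding explicitly with the real input encoding avoids the implicit step.
decoded = source.decode("utf-8")
```

The error is raised by the hidden ASCII decode at the first non-ASCII byte, which matches the `UnicodeDecodeError` in the traceback even though the code only asked for an encode.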
These unicode issues should now be fixed. Please reopen if you can still replicate them. |
I still get this issue, Python 2.7, IPython 0.10.2, Mac OS X 10.6.7, python readline 6.1.0
vs Python:
|
The bug has been fixed in master (soon to be 0.11). It will not be fixed in 0.10. |
Original Launchpad bug 339642: https://bugs.launchpad.net/ipython/+bug/339642
Reported by: vsevolod-solovyov (Murkt).
Default Python shell:
IPython 0.9.1:
sys.stdin.encoding is 'UTF-8'.
How to fix: remove the line №2022 from IPython/iplib.py (for 0.9.1 release). Here it is:
--- a/iplib.py
+++ b/iplib.py
@@ -2019,7 +2019,6 @@
# this allows execution of indented pasted code. It is tempting
# to add '\n' at the end of source to run commands like ' a=1'
# directly, but this fails for more complicated scenarios
I didn't find any newly introduced bugs in a quick check.
Additionally, I checked ipython-wx and ipythonx; the latter doesn't have this bug.