Windows: regression in cd magic handling of paths #1009

Closed
jdmarch opened this Issue Nov 18, 2011 · 22 comments

Projects

None yet

4 participants

@jdmarch
Contributor
jdmarch commented Nov 18, 2011

A. In 0.10.2, the user could copy a path from elsewhere, and paste it after cd magic, backslashes and all:

cd c:\Users\myname

In 0.11, this fails:
SyntaxError: (unicode error) 'rawunicodeescape' codec can't decode bytes in position 5-6: truncated \uXXXX...

No form of quoting (including raw) seems to help.

For manual path entry, forward slashes or double backslashes work, but for copy/paste this is a regression.

B. In 0.10, tab completion of quoted paths with spaces worked a bit (Tabbing immediately after a space, it failed, but then tabbing after a subsequent nonspace it worked again.) AFAICT in 0.12 there is no way to get tab completion of paths with spaces (quoted or not) working following a space character.

C. In 0.10, %cd could recognize paths which included spaces, quoted or unquoted. Not so in 0.12, which requires quotes. Arguable whether this is a troublesome regression or a logical improvement.

@takluyver
Member

Can you show the traceback that goes with that SyntaxError?

@takluyver
Member

(High priority for now because the pasting of paths would be particularly good to have)

@jdmarch
Contributor
jdmarch commented Nov 18, 2011

Not much more than that:

File "<ipython-input-3-4d4e46418f5a>", line 1
SyntaxError: (unicode error) 'rawunicodeescape' codec can't decode bytes 
in position 5-6: truncated \uXXXX (<ipython-input-3-4d4e46418f5a>, line 1)
@minrk
Member
minrk commented Nov 18, 2011

If a Windows person wants to put some time into this, that would be great. I stared at it for a while prior to 0.11, but Windows paths were just too much of a mess.

@takluyver
Member

The input gets translated OK, but when we try to parse it to an AST, Python
sees it as a raw unicode literal: u'get_ipython().magic(ur"cd c:\\Users\\myname")\n' . Unicode escapes are still processed in raw
unicode literals, so I think it will fail on any folder beginning with u or
U, such as Users.

In fact, this will probably affect any magic command with the substring \u
or \U, but cd is the most common case. Perhaps we should replace \ with ,
and not construct a raw string literal. i.e. use u"cd c:\\Users\\myname"
rather than ur"cd c:\Users\myname".

@fperez
Member
fperez commented Nov 18, 2011

@takluyver, good point. In fact, if it's really this that we are producing:

`u'get_ipython().magic(ur"cd c:\\Users\\myname")\n'`

then we're simply doing it wrong; it looks like we're both escaping and making it a raw literal, and that's the bug. By escaping it out we duplicate the , but since it's raw, the second one gets interpreted as a unicode escape instead of a backslash. We should either escape OR make it raw, but not both.

@takluyver
Member

It's somewhat confusing - because the code is itself in a string, those are
actually single slashes, repr-ed as double slashes. So we're making it raw,
but not escaping.

I think we assumed at some point that making it raw removed the need to
escape. Unfortunately, that doesn't hold for unicode escapes in raw unicode
strings.

@fperez
Member
fperez commented Nov 18, 2011

Ah, got it. OK, then in that case it seems like we should escape \ instead of making it raw. Do we also need to escape ' ' (space)?

@takluyver
Member

Not at the same level, I don't think. We may need to inside cd, I'm not
sure.

@fperez
Member
fperez commented Nov 18, 2011

paths with spaces in them are pure evil, but people in windows use them all the time (and XP made it mandatory, with it's boneheaded 'c:\documents and settings' or whatever that directory was named...)

@minrk
Member
minrk commented Nov 18, 2011

It's honestly pathetic that spaces cause as much trouble as they do. It should be considered a huge failing of command-line environments in general, if support for perfectly logical names on the filesystem is ever anything more than trivial.

FWIW Windows 7 (or possibly Vista) changed the default path, so now it's C:\Users\username

@fperez
Member
fperez commented Nov 18, 2011

Well, it's the clash with the years-old convenience of using space as an argument separator. Since the spacebar is the easiest key to hit, cmd line environments made the choice of using spaces into the default, universal argument separator (instead of commas like programming languages use). But that meant having to escape spaces all the time. And it means that string splitting logic must be very careful because it must avoid splitting when escapes are present. That's why it's so easy to get code that works with spaces at one level, but once that code calls something else things break, because the escaping at one level is lost at the next...

In any case, we should make it as easy as possible for people to cd, that's for sure :)

@minrk
Member
minrk commented Nov 18, 2011

Yes, and this is an inherent advantage of GUIs over command-line interfaces. This is all part of why Windows is so horrible as a CLI environment - they really have no interest in it, and if they don't care about CLIs, spaces don't cause problems, so they are free to use them. That of course causes a great deal of pain for those of us who do want a sensible CLI on Windows.

@fperez
Member
fperez commented Nov 19, 2011

Agreed. Funnily enough, as much as I despise spaces in filenames, I find myself naturally gravitating towards them when I use a GUI for file operations; it's just a natural thing to want...

In any case, it seems like (trying to get back on topic :) for this bug we need to at least escape \ and test what happens with ' '. Shouldn't be too hard... This seems like a nasty regression, but I'm worried that if we don't draw the line somewhere we'll never get 0.12 out so I'm reluctant to make it critical. If anyone can provide a PR for this soon, that would go a big ways towards getting it fixed :)

@takluyver
Member

I'll look into it.

@takluyver
Member

The issue is in IPython.utils.text.make_quoted_expr. I think it's fixable, but I suddenly wondered: is make_quoted_expr(s) actually doing anything that we couldn't achieve just by using repr? Is there any case where eval(repr(s)) != s (with a string s)?

@fperez
Member
fperez commented Nov 20, 2011

@takluyver, that's a very good point! Looking at that code, which is very old and ugly, I'm not sure we actually need it. If you want to take a stab at just replacing that function for now with a bare return repr(s) and seeing how things go, it would be great. The more of those home-brewed weird/hackish little utilities we get rid of, the better off we'll be.

@takluyver
Member

@jdmarch: Can you test with PR #1019 ? (I don't have a Windows system handy). It probably doesn't fix issues with tab completion and spaces, but I think cd C:\Users should be working, at least.

@jdmarch
Contributor
jdmarch commented Nov 20, 2011

@takluyver: Thank you! PR #1019 fixes regressions A and C. Specifically, cd now recognizes all paths, quoted or unquoted, with or without spaces, using any mix of path separators (forward slashes, backslashes, or double backslashes), specifically including c:\Users.

Tab completion only works when using forward slashes, and even then only on paths which do not include spaces (quoted or unquoted). (Tested on Windows only.)

@takluyver
Member

Great, that's the key thing I wanted to fix. The completion issues will be trickier, so we might leave them until after 0.12.

@takluyver
Member

I'm closing this - I'll open a new issue for the tab completion, with priority medium.

@takluyver takluyver closed this Nov 20, 2011
@fperez
Member
fperez commented Nov 20, 2011

Perfect, thanks a lot @takluyver for the great and quick work! We'll track the rest of the problem in #1021. @jdmarch, thanks for the report and testing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment