Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ipdir unicode #393

Closed
wants to merge 3 commits into from
Closed

ipdir unicode #393

wants to merge 3 commits into from

Conversation

takluyver
Copy link
Member

This builds on Min's pull request #362. Bug #392 revealed that, in some circumstances, ipdir can already be unicode before we try to decode it. That can result in a confusing error as it attempts to re-encode it and decode again.

I've simply added a check whether ipdir is unicode or bytes before doing the conversion. Michał says that this fixes the issue (#392) for him.

@minrk
Copy link
Member

minrk commented Apr 19, 2011

Thanks, I think the check should actually be done on most of the functions though, not just the one reporting the bug. Probably every time we call decode, it should be inside an if not isinstance(unicode): block

@takluyver
Copy link
Member Author

As far as I know, all the things we're getting from environment variables and __file__ are bytestrings. It's only where the functions call each other, and get unicode back, that it gets mixed up. Of course, we could just do isinstance checks on all of these - it's not like we're calling them all that often.

@minrk
Copy link
Member

minrk commented Apr 23, 2011

Do you want to go through, and just make sure that everywhere that we call '.decode()' in utils.path, we never do it on a unicode object? These aren't called frequently, so the possibility of wasting cycles double checking is not as bad as the possibility of calling decode on a unicode object, which will always fail if the unicode object actually contains any unicode.

Maybe just add:

def _maybe_decode(s, enc=sys.defaultencoding()):
    if isinstance(s, unicode):
        return s
    else:
        return s.decode(enc)

And replace all calls to decode() with that?

@takluyver
Copy link
Member Author

My belief from reading the code is that all those functions in utils.path
will end up decoding bytes, except for the case we found in get_ipython_dir.
But I can't test them on a wide range of systems, so I'll implement the
thing to try to decode them when I'm back at my own computer.

On 23 April 2011 22:35, minrk <
reply@reply.github.com>wrote:

Do you want to go through, and just make sure that everywhere that we call
'.decode()' in utils.path, we never do it on a unicode object? These aren't
called frequently, so the possibility of wasting cycles double checking is
not as bad as the possibility of calling decode on a unicode object, which
will always fail if the unicode object actually contains any unicode.

Maybe just add:

def _maybe_decode(s, enc=sys.defaultencoding()):
   if isinstance(s, unicode):
       return s
   else:
       return s.decode(enc)

And replace all calls to decode() with that?

Reply to this email directly or view it on GitHub:
#393 (comment)

@takluyver
Copy link
Member Author

OK, I've added the helper function, so all those functions should work whether they end up trying to use bytes or unicode.

@minrk
Copy link
Member

minrk commented Apr 29, 2011

Looks great. I'll go ahead and merge.

@minrk minrk closed this in 3fc6efe Apr 29, 2011
mattvonrocketstein pushed a commit to mattvonrocketstein/ipython that referenced this pull request Nov 3, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants