Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix pip when using Py2 on Windows with non-ASCII hostname or username. #3970

Closed
wants to merge 8 commits into from
3 changes: 3 additions & 0 deletions CHANGES.txt
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,9 @@

* Added pip completion support for fish shell.

* Fix problems on Windows on Python 2 when username or hostname contains
non-ASCII characters (:issue:`3463`, :pull:`3970`).

* Use git fetch --tags to fetch tags in addition to everything else that
is normally fetched; this is necessary in case a git requirement url
points to a tag or commit that is not on a branch (:pull:`3791`)
Expand Down
24 changes: 24 additions & 0 deletions pip/utils/appdirs.py
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@
import sys

from pip.compat import WINDOWS, expanduser
from pip._vendor.six import PY2, text_type


def user_cache_dir(appname):
Expand Down Expand Up @@ -35,6 +36,11 @@ def user_cache_dir(appname):
# Get the base path
path = os.path.normpath(_get_win_folder("CSIDL_LOCAL_APPDATA"))

# When using Python 2, return paths as bytes on Windows like we do on
# other operating systems. See helper function docs for more details.
if PY2 and isinstance(path, text_type):
path = _win_path_to_bytes(path)

# Add our app name and Cache directory to it
path = os.path.join(path, appname, "Cache")
elif sys.platform == "darwin":
Expand Down Expand Up @@ -222,3 +228,21 @@ def _get_win_folder_with_ctypes(csidl_name):
_get_win_folder = _get_win_folder_with_ctypes
except ImportError:
_get_win_folder = _get_win_folder_from_registry


def _win_path_to_bytes(path):
"""Encode Windows paths to bytes. Only used on Python 2.

Motivation is to be consistent with other operating systems where paths
are also returned as bytes. This avoids problems mixing bytes and Unicode
elsewhere in the codebase. For more details and discussion see
<https://github.com/pypa/pip/issues/3463>.

If encoding using ASCII and MBCS fails, return the original Unicode path.
"""
for encoding in ('ASCII', 'MBCS'):
try:
return path.encode(encoding)
except (UnicodeEncodeError, LookupError):
pass
return path
8 changes: 8 additions & 0 deletions tests/unit/test_utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@
from pip.utils.encoding import auto_decode
from pip.utils.hashes import Hashes, MissingHashes
from pip.utils.glibc import check_glibc_version
from pip.utils.appdirs import user_cache_dir
from pip._vendor.six import BytesIO


Expand Down Expand Up @@ -524,3 +525,10 @@ def test_manylinux1_check_glibc_version(self):
else:
# Didn't find the warning we were expecting
assert False


class Test_user_cache_dir(object):

def test_return_path_as_str(self):
"""Test that path is bytes on Python 2 and Unicode on Python 3."""
assert isinstance(user_cache_dir('test'), str)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would go with something like:

@pytest.mark.skipif("sys.platform != 'win32'")
@patch('pip.utils.appdirs._get_win_folder', _get_win_folder)
def test_your_test(self, _get_win_folder):
    _get_win_folder.return_value = u"läppäri"
    assert isinstance(user_cache_dir('test'), str)
    # Test against regression #3463
    from pip import create_main_parser
    create_main_parser().print_help()  # This should not crash

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking something similar but didn't want to add so complex test when I cannot run tests locally.