pipenv doesn't work with locale encoding different than UTF-8 on Linux #3131

Closed
vstinner opened this issue Oct 30, 2018 · 8 comments

@vstinner commented Oct 30, 2018

Hi,

I wanted to try pipenv, but it doesn't work with the fr_FR locale.

Versions:

  • pipenv 2018.10.13
  • Python 3.6.6
  • Fedora 28

$ python3 -m venv env
$ env/bin/python -m pip install pipenv
$ LANG=fr_FR env/bin/pipenv install
Traceback (most recent call last):
  File "env/bin/pipenv", line 11, in <module>
    sys.exit(cli())
  File "/home/vstinner/env/lib/python3.6/site-packages/pipenv/vendor/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/home/vstinner/env/lib/python3.6/site-packages/pipenv/vendor/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/vstinner/env/lib/python3.6/site-packages/pipenv/vendor/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/vstinner/env/lib/python3.6/site-packages/pipenv/vendor/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/vstinner/env/lib/python3.6/site-packages/pipenv/vendor/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/vstinner/env/lib/python3.6/site-packages/pipenv/vendor/click/decorators.py", line 64, in new_func
    return ctx.invoke(f, obj, *args, **kwargs)
  File "/home/vstinner/env/lib/python3.6/site-packages/pipenv/vendor/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/vstinner/env/lib/python3.6/site-packages/pipenv/vendor/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/home/vstinner/env/lib64/python3.6/site-packages/pipenv/cli/command.py", line 249, in install
    editable_packages=state.installstate.editables,
  File "/home/vstinner/env/lib64/python3.6/site-packages/pipenv/core.py", line 1975, in do_install
    skip_lock=skip_lock,
  File "/home/vstinner/env/lib64/python3.6/site-packages/pipenv/core.py", line 1282, in do_init
    pypi_mirror=pypi_mirror,
  File "/home/vstinner/env/lib64/python3.6/site-packages/pipenv/core.py", line 708, in do_install_dependencies
    bold=True,
  File "/home/vstinner/env/lib/python3.6/site-packages/pipenv/vendor/click/utils.py", line 260, in echo
    file.write(message)
UnicodeEncodeError: 'latin-1' codec can't encode character '\u2026' in position 59: ordinal not in range(256)

It seems like pipenv likes non-ASCII characters for fancy output.
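
The failure can be reproduced without pipenv. This is a minimal sketch, assuming a Latin-1 output stream (the codec named in the error) and a made-up status message; only the trailing U+2026 character is taken from the traceback.

```python
import io

# Hypothetical status message; only the trailing U+2026 (horizontal
# ellipsis) comes from the traceback above.
message = "Installing dependencies\u2026"

# Simulate a terminal stream whose encoding is Latin-1.
stream = io.TextIOWrapper(io.BytesIO(), encoding="latin-1")

try:
    stream.write(message)
    stream.flush()
    failed = None
except UnicodeEncodeError as exc:
    # exc.start indexes the first character the codec could not encode.
    failed = exc.object[exc.start]

print(repr(failed))
```

Latin-1 covers only code points 0 to 255, so any character above that range fails in the same way.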

@serhiy-storchaka commented Oct 30, 2018

Emojis are not the only cause. Pseudographics used in a progress bar will cause the same kind of errors.

@vstoykov commented Oct 30, 2018

If you want a French locale (or any other locale) in 2018 you should be using a UTF-8-aware locale like fr_FR.UTF-8. No one should use a locale that is not UTF-8 aware. Even when you want the C locale you should be using C.UTF-8.

@techalchemy (Member) commented Oct 30, 2018

This is unfortunate but you have to talk to Kenneth about it :( We really ought to just ignore them if they can’t be printed, but click also demands utf8 locale settings.

In the interim you can set PIPENV_HIDE_EMOJIS=1 to make things play nicely. Let me know if that helps, and thanks for taking the time to contribute.

I have to admit I’m curious to see what exactly you’re working on :)

@vstinner (Author) commented Oct 30, 2018

> If you want a French locale (or any other locale) in 2018 you should be using a UTF-8-aware locale like fr_FR.UTF-8.

Well, it's just an example of a locale. Other locales use ASCII, Shift JIS, or something else rather than UTF-8.

> This is unfortunate but you have to talk to Kenneth about it :( We really ought to just ignore them if they can’t be printed, but click also demands utf8 locale settings.

Oh.

@techalchemy (Member) commented Oct 30, 2018

@vstinner I did just rewrite some of our output encoding to use translation maps and native encodings, so I’m wondering how I can solve this most helpfully. You certainly would know more on the subject so if you have any insight I’d be curious. Currently we basically use locale.getpreferredencoding() with some small hacks to avoid setting it on Linux.

To write output we essentially use this approach:

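import locale

import six
import vistir.misc

# Simplified: the real value comes from locale.getpreferredencoding()
# plus the small hacks mentioned above.
DEFAULT_ENCODING = locale.getpreferredencoding()

# Code points with ASCII stand-ins: U+2026 (ellipsis), U+2013 (en dash).
UNICODE_TO_ASCII_TRANSLATION_MAP = {
    8230: u"...",
    8211: u"-"
}


def decode_output(output):
    # Leave non-string values untouched.
    if not isinstance(output, six.string_types):
        return output
    try:
        output = output.encode(DEFAULT_ENCODING)
    except (AttributeError, UnicodeDecodeError, UnicodeEncodeError):
        # Python 3 raises UnicodeEncodeError here; Python 2 byte strings
        # raise UnicodeDecodeError. Swap known characters and retry.
        if six.PY2:
            output = unicode.translate(vistir.misc.to_text(output),
                                       UNICODE_TO_ASCII_TRANSLATION_MAP)
        else:
            output = output.translate(UNICODE_TO_ASCII_TRANSLATION_MAP)
        # Still raises if a character outside the map remains.
        output = output.encode(DEFAULT_ENCODING)
    output = output.decode(DEFAULT_ENCODING)
    return output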

I’m wondering if I shouldn’t just use an "ignore errors" re-encoding approach there.
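
For illustration, an "ignore errors" variant might look like the sketch below. It is Python 3 only, the function name `to_encodable_text` is hypothetical, and it is not the code pipenv ended up shipping. It translates the two mapped characters first, then drops anything the target encoding still cannot represent.

```python
# The same two fallbacks as the map above: ellipsis and en dash.
TRANSLATION_MAP = {0x2026: "...", 0x2013: "-"}


def to_encodable_text(text, encoding):
    """Return `text` reduced to what `encoding` can represent."""
    # First swap known characters for readable ASCII stand-ins.
    text = text.translate(TRANSLATION_MAP)
    # Then silently drop whatever the target encoding still cannot encode.
    return text.encode(encoding, errors="ignore").decode(encoding)


print(to_encodable_text("Installing\u2026 \u2714 done", "latin-1"))
```

The trade-off is that unmapped characters disappear without a trace, so output can lose information, but writing to the terminal no longer raises.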

@hroncok (Contributor) commented Oct 30, 2018

If the locale is C, it will be coerced to C.utf-8 by Python.

If the locale is not C and not utf-8 based, something is wrong (at least on Linux) and maybe the user just needs to be told (i.e. do what click does: abort).
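
A rough sketch of such a check, using only the standard library. The helper names `is_utf8` and `check_locale_encoding` and the message wording are assumptions for illustration, not click's or pipenv's actual code.

```python
import codecs
import locale


def is_utf8(encoding_name):
    """True if the name refers to UTF-8 under any registered spelling."""
    try:
        # codecs.lookup normalizes aliases such as "UTF8" or "utf_8".
        return codecs.lookup(encoding_name).name == "utf-8"
    except LookupError:
        return False


def check_locale_encoding():
    """Return an error message if the locale encoding is not UTF-8, else None."""
    encoding = locale.getpreferredencoding()
    if is_utf8(encoding):
        return None
    return (
        "Locale encoding is {}; a UTF-8 locale such as "
        "fr_FR.UTF-8 is expected.".format(encoding)
    )


problem = check_locale_encoding()
if problem is not None:
    print(problem)
```

A caller could print the message and exit instead of failing later in the middle of writing output.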

@vstoykov commented Oct 30, 2018

@techalchemy, your code for fail-safe decoding/encoding of text reminds me of a converter I wrote in the past, which uses Python's codecs.register_error. You can look at my convert-encoding.py.

The only downside of this approach is that the error handler is registered globally, so if several modules use the technique, their handler names can collide unless they are chosen carefully.
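
A minimal sketch of the codecs.register_error technique follows. It is not the contents of convert-encoding.py; the handler name `example.ascii_fallback` and the fallback table are hypothetical. Prefixing the name with a namespace is one way to reduce the collision risk.

```python
import codecs

# Hypothetical fallback table: ellipsis and en dash.
FALLBACKS = {"\u2026": "...", "\u2013": "-"}


def ascii_fallback(exc):
    """Error handler: substitute known characters, use '?' for the rest."""
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    # The slice exc.start:exc.end covers the unencodable characters.
    bad = exc.object[exc.start:exc.end]
    replacement = "".join(FALLBACKS.get(ch, "?") for ch in bad)
    # Return the replacement text and the position where encoding resumes.
    return replacement, exc.end


# The registry is global to the interpreter, hence the namespaced name.
codecs.register_error("example.ascii_fallback", ascii_fallback)

encoded = "Installing dependencies\u2026 \u2714".encode(
    "latin-1", "example.ascii_fallback"
)
print(encoded)
```

Once registered, the handler name can be passed as the `errors` argument wherever text is encoded, including when opening a file or wrapping a stream.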

techalchemy added a commit that referenced this issue Oct 30, 2018
- Drops any unmapped non-ascii characters on non-utf8 systems
- Fixes #3131

Signed-off-by: Dan Ryan <dan@danryan.co>
techalchemy self-assigned this Oct 30, 2018
techalchemy added this to To do in Better user experience via automation Oct 30, 2018

@vstinner (Author) commented Oct 31, 2018

Thank you ;-)
