Compile with '--enable-unicode=ucs4' by default? #257

Closed
KenMacD opened this Issue Oct 21, 2014 · 21 comments

@KenMacD
KenMacD commented Oct 21, 2014

It seems system Python builds use this option (and a few others I haven't checked). It would be nice if pyenv builds matched common distro builds by default, to reduce surprises.

Sharing libraries between the system Python and a pyenv install is more difficult when the above option isn't specified. Some third-party tools will also not work, for example rubypython-0.6.3, which fails with this error:

ImportError: ~/.pyenv/versions/.../lib/python2.7/lib-dynload/operator.so: undefined symbol: _PyUnicodeUCS2_AsDefaultEncodedString (RubyPython::PythonError)
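
For readers hitting the same error, here is a minimal sketch of how to tell which kind of build an interpreter is. It relies only on `sys.maxunicode`, the same check used later in this thread; the helper names `unicode_build` and `symbol_prefix` are made up for illustration. On CPython 2.x the Unicode C API symbols carry a `UCS2` or `UCS4` infix, which is why an extension compiled against one build width fails to load on the other.

```python
import sys

NARROW_MAX = 0xFFFF    # sys.maxunicode on a narrow (UCS2) CPython 2.x build
WIDE_MAX = 0x10FFFF    # sys.maxunicode on a wide (UCS4) build, and on Python 3.3+


def unicode_build(maxunicode=None):
    """Return 'ucs2' or 'ucs4' for a given sys.maxunicode value.

    Defaults to the running interpreter when no value is passed.
    """
    if maxunicode is None:
        maxunicode = sys.maxunicode
    return "ucs2" if maxunicode == NARROW_MAX else "ucs4"


def symbol_prefix(build):
    """Prefix of the Unicode C API symbols on CPython 2.x for this build width.

    Only meaningful for Python 2.x; Python 3.3+ has a single Unicode ABI.
    """
    return "PyUnicodeUCS2_" if build == "ucs2" else "PyUnicodeUCS4_"


if __name__ == "__main__":
    build = unicode_build()
    print(build, symbol_prefix(build))
```

An `undefined symbol` error naming `PyUnicodeUCS2` therefore suggests that a module built for a narrow interpreter is being loaded into a wide one.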
@yyuu yyuu added the wontfix label Oct 25, 2014
@yyuu
Owner
yyuu commented Oct 25, 2014

Basically, pyenv builds Pythons with default options. If you want to build Python with specific configuration options, please specify them via the environment variables PYTHON_CONFIGURE_OPTS or PYTHON_MAKE_OPTS. Please see also the README.md of python-build.

@yyuu yyuu closed this Oct 25, 2014
@ncoghlan
ncoghlan commented Feb 6, 2016

We recently approved an upstream PEP defining a naming scheme and build environment for prebuilt wheel files for Linux systems, which currently only has wide Unicode builds in the build environment: https://www.python.org/dev/peps/pep-0513/

While that shouldn't be a problem for pyenv Python 3.x builds (since the wide/narrow build distinction was removed back in 3.3), the Python 2.x builds are going to hit this problem: most of the Linux wheels uploaded to PyPI aren't going to be compatible with narrow builds of Python, so pyenv users would still need to compile from source even if a wheel targeting wide builds is available.

(I was directed here from a distutils-sig thread discussing the Unicode build settings for common pre-built binaries on Linux)
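
The wheel compatibility problem described above can be illustrated with a rough sketch. It assumes the usual wheel filename layout, in which the last three dash-separated fields are the Python tag, ABI tag and platform tag, and the CPython 2.7 ABI tags `cp27m` (narrow) and `cp27mu` (wide). The function names and the example filename are hypothetical, and this is not how pip itself decides compatibility.

```python
def parse_wheel_tags(filename):
    """Split a wheel filename into (python_tag, abi_tag, platform_tag)."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: %r" % filename)
    stem = filename[:-len(".whl")]
    # The last three dash-separated fields are the compatibility tags.
    python_tag, abi_tag, platform_tag = stem.split("-")[-3:]
    return python_tag, abi_tag, platform_tag


def py27_abi_tag(wide):
    """ABI tag of a CPython 2.7 build: 'u' is appended for wide Unicode."""
    return "cp27mu" if wide else "cp27m"


def wheel_matches_build(filename, wide):
    """Rough check: does this wheel's ABI tag match a CPython 2.7 build?"""
    python_tag, abi_tag, _platform_tag = parse_wheel_tags(filename)
    if abi_tag == "none":
        # No ABI dependency declared, so the Unicode width is not checked.
        return True
    return python_tag == "cp27" and abi_tag == py27_abi_tag(wide)


if __name__ == "__main__":
    # Hypothetical filename, for illustration only.
    wheel = "example_pkg-1.0-cp27-cp27mu-manylinux1_x86_64.whl"
    print(wheel_matches_build(wheel, wide=True))
    print(wheel_matches_build(wheel, wide=False))
```

Under these assumptions, a `cp27mu` wheel is accepted only by a wide build, so a narrow pyenv 2.7 would fall back to compiling from source.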

@thedrow
Contributor
thedrow commented Feb 7, 2016

@yyuu I think we should reopen.

@yyuu yyuu reopened this Feb 8, 2016
@ncoghlan
ncoghlan commented Feb 8, 2016

Thanks for being willing to reconsider this. For the record, we're still discussing the possibility of simply adding a Python 2.7 narrow build to the reference build environment, as switching from narrow builds to wide builds would pose a binary extension module compatibility problem for any redistributor making the switch: https://mail.python.org/pipermail/distutils-sig/2016-February/028284.html

@yyuu yyuu removed the wontfix label Feb 8, 2016
@yyuu
Owner
yyuu commented Feb 15, 2016

I opened PR #542 for the fix for this. Can anyone try it?

@yyuu yyuu closed this in 90e6e30 Feb 17, 2016
@ncoghlan

Thank you!

@3noch
3noch commented Mar 4, 2016

This change is a big deal. Anyone who installed a version of Python prior to this change may suddenly start getting divergent UCS encodings among their installs. We need to communicate to users that they need to reinstall all their Python versions after upgrading to this version of pyenv. Alternatively, pyenv could start distinguishing versions based on both version number and UCS encoding setting.

@joaoponceleao

Just a note. This change is having some consequences with pretty popular modules like PIL: python-pillow/Pillow#1753

@konklone
Contributor

> We need to communicate to users that they need to reinstall all their Python versions after upgrading to this version of pyenv.

Yes -- this bit me and it took a long while of baffled googling before I ended up here and figured out why.

@konklone
Contributor

> Anyone who installed a version of Python prior to this change may suddenly start getting divergent UCS encodings among their installs. We need to communicate to users that they need to reinstall all their Python versions after upgrading to this version of pyenv.

FWIW, I wasn't able to escape from the undefined symbol: PyUnicodeUCS2_FromString rabbit hole with these instructions. I had to use the advice found here:

$ pyenv uninstall 2.7.11
$ PYTHON_CONFIGURE_OPTS="--enable-unicode=ucs2" pyenv install 2.7.11

That solved it for me for the times I have to switch into Python 2 (in my case, when using fabric).

@ncoghlan

Further communication of the reason for the change is definitely desirable here, as folks forcing their Python install back to narrow Unicode builds are going to end up in the situation where their installation experience is worse, since they won't be able to use prebuilt wide unicode wheel files published to PyPI.

A preferable approach would be to use pip to reinstall all the existing packages in the environment:

$ pip freeze | pip install --ignore-installed --no-use-wheel -r /dev/stdin

(The "--no-use-wheel" is needed as Python 2.7 wheels built with versions of pip prior to 8.0.0 didn't have their ABI dependencies encoded correctly, and hence would still show up as compatible)

@3noch
3noch commented Mar 14, 2016

@konklone Sorry if I misled you. That's actually what I had to do as well. My comment was regarding the fact that people will start having different UCS encodings among their Python versions installed by pyenv. The only way to make them all consistent is to reinstall any versions that were installed prior to this update.
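
To find out whether the installs have diverged, each interpreter can be probed in turn. The following is a sketch only: it assumes the versions live under `$PYENV_ROOT/versions/<name>/bin/python` (falling back to `~/.pyenv`), and the helper names are invented for this example.

```python
import glob
import os
import subprocess
import sys

# Same one-liner check used elsewhere in this thread.
PROBE = "import sys; print('ucs2' if sys.maxunicode == 65535 else 'ucs4')"


def probe_unicode_build(python_path):
    """Run the probe in the given interpreter and return 'ucs2' or 'ucs4'."""
    out = subprocess.check_output([python_path, "-c", PROBE])
    return out.decode("ascii").strip()


def pyenv_interpreters(pyenv_root=None):
    """List bin/python for every version directory under the pyenv root.

    The directory layout is an assumption; adjust it if yours differs.
    """
    root = pyenv_root or os.environ.get(
        "PYENV_ROOT", os.path.expanduser("~/.pyenv"))
    return sorted(glob.glob(os.path.join(root, "versions", "*", "bin", "python")))


if __name__ == "__main__":
    # Fall back to the running interpreter if no pyenv versions are found.
    for path in pyenv_interpreters() or [sys.executable]:
        print(path, probe_unicode_build(path))
```

Any Python 2.x version that reports a different width from the others is a candidate for reinstalling.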

@jayvdb jayvdb added a commit to jayvdb/cpython-builder that referenced this issue Jul 12, 2016
@jayvdb jayvdb Fix PYTHON_CONFIGURE_OPTS
PYTHON_CONFIGURE_OPTS was misspelt as PYTHON_COFIGURE_OPTS.

Building with --enable-unicode=ucs4 is now the default
since yyuu/pyenv#257 in February 2016.
dfe93b2
@jayvdb jayvdb referenced this issue in travis-ci/cpython-builder Jul 12, 2016
Merged

Fix PYTHON_CONFIGURE_OPTS #14

@mmerickel

I've now run into an issue where Homebrew builds python2 packages against the system Python, which is apparently a narrow build on OS X. Homebrew-compiled packages are therefore incompatible with pyenv unless pyenv is built with ucs2. Is there any recommendation here? The package in question comes from Homebrew because it's fairly difficult to compile on its own.

@matthew-brett

Just to say that the change to UCS4 on OSX is a pretty big deal, because this makes pyenv the only - to my knowledge - UCS4 Python build on OSX. Therefore very few people are building wheels that work with pyenv, and so using pyenv leads to the kind of problem mentioned in the issue above - where the user is surprised to find that wheel installs are not working.

@ncoghlan
ncoghlan commented Oct 2, 2016

Regarding my own comments above: they're specific to Linux, where the system Python configuration is determined by distro policy rather than CPython's default build settings, and distros long ago opted for ucs4 as the default (before the question became irrelevant in Python 3.3+). The manylinux1 specification then inherited that convention.

For Mac OS X, the upstream CPython defaults (i.e. a narrow ucs2/UTF-16 build) would be the more portable choice at the pyenv level.

@matthew-brett

Confirming that all the OSX Python variants that I know of are UCS2 builds:

$ # System Python
$ /usr/bin/python -c "import sys; print('ucs2' if sys.maxunicode == 65535 else 'ucs4')"
ucs2
$ # Python.org Python
$ /Library/Frameworks/Python.framework/Versions/2.7/bin/python -c "import sys; print('ucs2' if sys.maxunicode == 65535 else 'ucs4')"
ucs2
$ # Homebrew Python
$ /usr/local/Cellar/python/2.7.12/bin/python -c "import sys; print('ucs2' if sys.maxunicode == 65535 else 'ucs4')"
ucs2
$ # Macports Python
$ /opt/local/bin/python2.7 -c "import sys; print('ucs2' if sys.maxunicode == 65535 else 'ucs4')"
ucs2
$ # Anaconda Python
$ /Users/bnaul/anaconda/bin/python -c "import sys; print('ucs2' if sys.maxunicode == 65535 else 'ucs4')"
ucs2

I think you did the switch to UCS4 to be more compatible with standard Linux distributions. For the same reason, here's a plea to switch back to UCS2 by default on OSX. Otherwise pyenv users on OSX are going to have many more problems installing standard packages.

@yyuu
Owner
yyuu commented Oct 3, 2016

The encoding configuration is done at https://github.com/yyuu/pyenv/blob/v1.0.2/plugins/python-build/bin/python-build#L1922-L1925

I can tweak the lines to stop configuring UCS4 on OS X. Although, I'm not sure how it should be on other platforms like BSDs.

@matthew-brett

How about tweaking that line for OSX only for now? I don't know what BSD's defaults are for Python builds either, but I guess it would be reasonable to make changes for BSD later, when more information comes in?

@matthew-brett

It appears this issue just came up again for Python / Pillow (see link above).

@matthew-brett

FreeBSD 10 via pkg install python:

# python -c "import sys; print('ucs2' if sys.maxunicode == 65535 else 'ucs4')"
ucs4

OpenBSD 6.0 via pkg_add python:

# python2.7 -c "import sys; print('ucs2' if sys.maxunicode == 65535 else 'ucs4')"
ucs2

So maybe UCS4 for FreeBSD, UCS2 for OpenBSD.
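
The per-platform proposal can be summarised as a small decision table. This is a sketch of the idea only, not python-build's actual logic (which is a shell script): the table simply records what was reported in this thread, the function name is hypothetical, and it assumes a plain `X.Y.Z` CPython version string.

```python
import platform

# Defaults as reported in this thread; not an exhaustive or authoritative list.
WIDE_BY_DEFAULT = {
    "Linux": True,     # distros ship wide (ucs4) builds
    "Darwin": False,   # all OS X Pythons checked above were ucs2
    "FreeBSD": True,   # FreeBSD 10 package reported ucs4
    "OpenBSD": False,  # OpenBSD 6.0 package reported ucs2
}


def unicode_configure_opt(system=None, python_version="2.7.12"):
    """Return '--enable-unicode=ucs4' where a wide 2.x build seems expected.

    Returns None when the CPython default should be left alone: on platforms
    that use narrow builds, on unknown platforms, and for anything other
    than Python 2.x.
    """
    system = system or platform.system()
    major = int(python_version.split(".")[0])
    if major != 2:
        return None
    if WIDE_BY_DEFAULT.get(system, False):
        return "--enable-unicode=ucs4"
    return None


if __name__ == "__main__":
    print(unicode_configure_opt())
```

Returning `None` for unknown platforms keeps the upstream CPython default there until more information comes in.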

@yyuu
Owner
yyuu commented Oct 5, 2016

I've opened #726 to stop configuring --enable-unicode=ucs4 on OS X. Please give it a try; I will merge it if it works.
