This repository has been archived by the owner on Apr 9, 2021. It is now read-only.

ENH:Create universal wheel on PyPi for pyicu #39

Closed
wants to merge 1 commit into from

Conversation

linwoodc3

If this is too instructive, forgive me. Just wanted to require minimal effort from original author since this is me asking you to add something to your repository.

Summary of Merge Request

@ovalhub, this merge request creates a “Universal Wheel” to resolve an issue on macOS where pip install pyicu does not fully install pyicu, which causes an error with polyglot, a multilingual NLP library created by @aboSamoor.

Details of Request

I request that @ovalhub accept this merge request and run the following commands to update the PyICU packaging on PyPI. This will add a universal wheel to the File section.

The problem encountered and fix are discussed in detail here.

What's the Change?

Just created a simple setup.cfg file as described in the Universal Wheels section of the Python Packaging User Guide. The exact content of the simple setup.cfg file is here.
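
For readers who cannot follow the link, here is a minimal sketch of what such a file usually contains. The linked file itself is not reproduced in this thread, so the content below is an assumption based on the Universal Wheels section of the Packaging User Guide; the helper name is_universal is hypothetical and only illustrates how the flag reads.

```python
import configparser

# Assumed content of setup.cfg (not copied from the linked file): the
# [bdist_wheel] universal flag described in the Packaging User Guide.
SETUP_CFG = """\
[bdist_wheel]
universal = 1
"""


def is_universal(cfg_text):
    """Return True if the setup.cfg text requests a universal wheel."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    # fallback=False covers files that have no [bdist_wheel] section at all.
    return parser.getboolean("bdist_wheel", "universal", fallback=False)


print(is_universal(SETUP_CFG))
```

With this flag set, python setup.py bdist_wheel behaves as if --universal had been passed on the command line.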

How to "update" your Packaging

After this file is in the master pyicu repository, you run the following commands, as described in the Packaging User Guide:

$ cd pyicu
$ python setup.py bdist_wheel --universal  # creates the universal wheel
$ python setup.py bdist_wheel  # creates a platform-specific wheel, just in case

Now, the last step is just uploading the distribution to PyPI. If you use twine (recommended approach, see here for security discussion):

$ twine upload dist/*

Or the setuptools alternative (not recommended, see here for security discussion):

$ python setup.py sdist bdist_wheel upload

This should solve one of the problems discussed in the polyglot issue.

…eel. Should pip install of pyicu, and fix the polyglot python 2.7 error on MacOSX.
@ovalhub
Owner

ovalhub commented Dec 11, 2016 via email

@linwoodc3
Author

linwoodc3 commented Dec 11, 2016

No problem @ovalhub. Thanks for considering this. I'll explain as best as I can (not a Comp Sci person). I figured this out by troubleshooting and just reading tons.

What does a "wheel" accomplish?

It resolved the install issues I had with a project that uses pyicu as a dependency, and it will likely resolve other issues. It also aligns your pyicu project with the new PEP distribution standard (not so new, since the standard was approved in Feb 2013), which uses the wheel binary package format because it frees installers from having to know about the build system, saves time by amortizing compile time over many installations, and removes the need to install a build system in the target environment.

I created two workarounds for pyicu to successfully install on Python 2.7 and Python 3.5 in the Anaconda ecosystem, but I think this small merge would solve the problem (you would add a universal wheel to your PyPi distribution).

You may ask, but who cares about Python wheels? Why does pyicu need it?

The webpage pythonwheels describes this benefit best, and it relates directly to pyicu since it's a C extension, as you said above. This is text pasted from the webpage (note the bullet I bolded below):

What are wheels?

Wheels are the new standard of python distribution and are intended to replace eggs. Support is offered in pip >= 1.4 and setuptools >= 0.8.

Advantages of wheels

  • Faster installation for pure python and native C extension packages.
  • Avoids arbitrary code execution for installation. (Avoids setup.py)
  • Installation of a C extension does not require a compiler on Windows or OS X.
  • Allows better caching for testing and continuous integration.
  • Creates .pyc files as part of installation to ensure they match the python interpreter used.
  • More consistent installs across platforms and machines.

I believe adding this file will fix the install problem for the scientific community folks who use anaconda. Moreover, it will help the folks who want to use polyglot, which depends on pyicu.

I created two workarounds to install pyicu on a macOS machine running Anaconda and using Python 2.7 or Python 3.5.

Recreating my problem

To recreate the problem I experienced (I'm on a Mac), I think you may need to have Anaconda installed. It's a scientific computing distribution.

Here are the steps for me:

pip install polyglot
python
from polyglot.text import Text, Word

That always gives me an error like this:
Library not loaded: libicui18n.54.dylib which comes from something called docs, line 23:

Here is a screen capture of the traceback:
screen shot 2016-12-11 at 6 58 13 pm

Hope this explains it!

@ovalhub
Owner

ovalhub commented Dec 12, 2016 via email

@linwoodc3
Author

@ovalhub, yes, I did tests with 58.1, 54.1.0, and 54.1.1. Only 54.1.1 works. Weird, but that's what I found through trial and error.

this gets me icu 54.1.1:

conda install -c ccordoba12 icu=54.1

which is from here: https://anaconda.org/ccordoba12/icu

None of these work (which is why it was confusing..lots of trial and error):

conda install icu # installs 54.1.0, doesn't work
conda install -c conda-forge icu=58.1 # doesn't work
conda install -c anaconda icu=57.1  # doesn't work

Even tried Homebrew, but Homebrew only installs 58.1:
http://brewformulas.org/Icu4c

brew install icu4c

All gave me that Library not loaded: libicui18n.54.dylib except for the icu 54.1.1 install.

But, even with that, when I tried to run polyglot, which uses pyicu, I got that same error. Only using easy_install worked. But then, I cloned and made the universal wheel, and pip install of the whl file worked perfectly.

@linwoodc3
Author

linwoodc3 commented Dec 12, 2016

I think the main problem is pip install; installing with easy_install works, while installing with pip doesn't. To be clear, I don't get an error from pip install pyicu itself, but when I try to import something from pyicu, it gives an error. When I install pyicu with easy_install, I don't get that error.
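
Because the failure shows up at import time rather than at install time, a small import check after installing can make the difference visible. This is only a sketch: the function name check_icu_import is hypothetical, and it assumes PyICU is importable as icu and exposes an ICU_VERSION attribute, which is read defensively in case it is absent.

```python
import importlib


def check_icu_import(module_name="icu"):
    """Try to import the module and report the outcome without raising."""
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        # A missing shared library (e.g. "Library not loaded: ...dylib")
        # surfaces here as an ImportError, even though installation succeeded.
        return {"ok": False, "error": str(exc)}
    # ICU_VERSION is assumed to exist on PyICU; None is returned otherwise.
    return {"ok": True, "icu_version": getattr(module, "ICU_VERSION", None)}


print(check_icu_import())
```

Running this once after pip install pyicu and once after easy_install pyicu would show which of the two installs actually loads.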

Another Option (simpler)

When I use pip to download your tar file, you have the setup.cfg. I believe adding a simple line as you see in this pic would do the trick:

screen shot 2016-12-11 at 7 29 14 pm

Then, you would run these commands, and redistribute to PyPI.

$ cd pyicu
$ python setup.py bdist_wheel --universal  # creates the universal wheel
$ python setup.py bdist_wheel  # creates a platform-specific wheel, just in case
$ twine upload dist/*
# or
$ python setup.py sdist bdist_wheel upload

I tested, and the whl file that appears in the dist/ folder successfully installs pyicu for me and I don't get the error anymore.

Let me know if I can assist anymore. Thanks again for considering.

@ovalhub
Owner

ovalhub commented Dec 12, 2016 via email

@linwoodc3
Author

I'm sorry @ovalhub, I missed the earlier comments. I went back, read them, and understand your point. The combinatorics make this undoable, as the size would blow up and things change too much. This comment from the numba maintainers made it hit home.

Yes, I think polyglot is locked to 54.1.1 (no other way to explain all the errors); I'll look into how to create the binary fork you mentioned (have to figure out how to do it).

Thanks for considering; Learned a bit from you too!!!

I have the workarounds in Gists so hopefully people will come up on those when they run into errors.

I'll link in case a Google search gets them to this page:

Feel free to close; or do I close??

@ovalhub
Owner

ovalhub commented Dec 12, 2016 via email

@asottile

fwiw, a "universal" wheel is inappropriate for packages which have binary dependencies (such as this one) as universal is intended for pure python packages
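
To illustrate the distinction, here is a rough sketch of how the tags in a wheel filename tell a pure-Python wheel from a binary one. The filenames are hypothetical examples, not real PyICU releases, and the helper names are made up for this sketch; the filename layout (name, version, optional build tag, Python tag, ABI tag, platform tag) follows the wheel format specification.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its name, version, and tag parts."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: %s" % filename)
    parts = filename[: -len(".whl")].split("-")
    # Five parts normally, six when an optional build tag is present.
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel filename: %s" % filename)
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def is_pure_python(filename):
    """A wheel with no ABI and no platform constraint holds no compiled code."""
    tags = parse_wheel_filename(filename)
    return tags["abi"] == "none" and tags["platform"] == "any"


# Hypothetical filenames, for illustration only.
print(is_pure_python("example_pkg-1.0-py2.py3-none-any.whl"))
print(is_pure_python("example_pkg-1.0-cp27-cp27m-macosx_10_6_x86_64.whl"))
```

A universal wheel carries the py2.py3-none-any tags, which claim that the contents run on any interpreter and platform. That claim cannot hold for a compiled extension linked against a specific ICU build.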

@ghost

ghost commented Jan 28, 2019

I cannot distribute a PyICU binary on PyPI as it wouldn't work for any people that aren't running the same combination compiler, OS, ICU, Python that I used to build it.

I believe it is possible to provide prebuilt binaries for the people who can use them and still provide source distributions for people who can't. That would be one wheel for each target platform and version. There is even a third-party PyPI package that seems to be doing this for linux versions. For example the coverage.py project offers many different wheels, and pip install coverage will automatically select the right one for the user's platform. It is also common for open-source projects to take advantage of free continuous-integration server time to build artifacts on multiple platforms, such as on Travis, Appveyor, and CircleCI.
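
As a rough sketch of that selection step: an installer compares the tags in each wheel filename against the tags its own platform supports, in order of preference, and falls back to the source distribution when nothing matches. The code below is a simplified model under that assumption, not pip's actual implementation; the filenames, tag lists, and function names are hypothetical.

```python
def wheel_tags(filename):
    """Expand a wheel filename into its set of (python, abi, platform) tags."""
    python_tag, abi_tag, platform_tag = filename[: -len(".whl")].split("-")[-3:]
    # Compressed tags such as "py2.py3" stand for several alternatives.
    return {
        (py, abi, plat)
        for py in python_tag.split(".")
        for abi in abi_tag.split(".")
        for plat in platform_tag.split(".")
    }


def pick_wheel(filenames, supported_tags):
    """Return the wheel matching the most-preferred supported tag, or None."""
    for tag in supported_tags:
        for filename in filenames:
            if tag in wheel_tags(filename):
                return filename
    return None  # no compatible wheel: build from the source distribution


# Hypothetical release files and supported tags, for illustration only.
files = [
    "example_pkg-1.0-cp37-cp37m-win_amd64.whl",
    "example_pkg-1.0-cp37-cp37m-manylinux1_x86_64.whl",
    "example_pkg-1.0-cp37-cp37m-macosx_10_9_x86_64.whl",
]
linux_tags = [
    ("cp37", "cp37m", "manylinux1_x86_64"),
    ("cp37", "cp37m", "linux_x86_64"),
]
print(pick_wheel(files, linux_tags))
```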

As far as I know, the main thing required for getting started is just uploading the built distribution like so:

python setup.py sdist bdist_wheel 
twine upload dist/*

This would allow users on compatible platforms to pip install pyicu without having to build it themselves.

@ovalhub
Owner

ovalhub commented Jan 28, 2019 via email

@ghost

ghost commented Jan 28, 2019

May I ask what commands you currently run to produce and release a new version?

@ovalhub
Owner

ovalhub commented Jan 28, 2019 via email

@ghost

ghost commented Jan 28, 2019

Are you proposing to set this up and maintain it ?

I'm not prepared to take on that responsibility. I did want to share what I understand about what would be involved.

I believe the current recommendation from PyPA boils down to

python setup.py sdist bdist_wheel 
twine upload dist/*

and assuming the setup.py is written per the docs, I think that'll upload the source distribution and the wheel appropriate to the platform executing the command. It's also possible to produce wheels for multiple linux targets using the manylinux project.

@ovalhub
Owner

ovalhub commented Jan 28, 2019 via email

@asottile

A note on manylinux, for libraries which typically would link against a system library (for example, pyicu typically links against libicudata.so.XX, libicui18n.so.XX, libicuuc.so.XX) -- a manylinux wheel would involve vendoring those shared libraries into the wheel (auditwheel provides this functionality for you).

This ~might be undesirable as it removes the benefits of dynamic linking from a security perspective -- that is you no longer get benefit from your system package manager shipping security patches to libraries that are dynamically linked. By extension, this ~probably makes releasing pyicu necessary if a security issue happened in the libicu*.so.XX libraries which it vendors.

That said, once you have a script which can build manylinux wheels, the rest is pretty easy (usually via docker) -- here's some examples from my projects (not super sophisticated): sass/libsass-python asottile/setuptools-golang (NOTE: I've only manylinux packaged things without .so dependencies due to the reason I mentioned above)

@abitrolly

Just a note that pip already generates the wheel while installing PyICU for target Linux.

image

https://github.com/chubin/cheat.sh/runs/1409411905#step:3:75

@asottile

@abitrolly that's a wheel specific to your machine built for the pip cache and isn't manylinux compatible / distributable (it links against whatever version of libicu you have installed on your machine)
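
One way to see the difference is to look at the platform tag a local build would carry. The sketch below approximates that tag from sysconfig.get_platform() by replacing '-' and '.' with '_', which is roughly how bdist_wheel normalizes it; the exact tag can differ (notably on macOS), and the function names are hypothetical.

```python
import sysconfig


def local_platform_tag():
    """Approximate the platform tag of a wheel built on this machine."""
    # For example 'linux-x86_64' becomes 'linux_x86_64'.
    return sysconfig.get_platform().replace("-", "_").replace(".", "_")


def is_manylinux(tag):
    """manylinux tags mark wheels meant to be portable across Linux distros."""
    return tag.startswith("manylinux")


tag = local_platform_tag()
print(tag, is_manylinux(tag))
```

A plain linux_x86_64 tag only says where the wheel was built; it makes no promise that the linked libicu version exists on another machine.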
