Provide NumPy preinstalled #48

Closed
michaelklishin opened this Issue May 9, 2012 · 66 comments


@michaelklishin
Contributor

Moved from travis-ci/travis-ci#534

As discussed on the travis-ci Google Group, many Python packages depend on NumPy so it would be helpful to have a recent NumPy installed for each supported Python platform (i.e. Python 2.5, 2.6, 2.7 and 3.2 at the time of writing).

Note PyPy is a special case (and now comes with its own built-in partial NumPy emulating library, NumPyPy)

Currently it appears NumPy must be installed from source, which takes a while and wastes bandwidth and CPU.

@michaelklishin michaelklishin referenced this issue in travis-ci/travis-ci May 9, 2012
Closed

Support pre-installed numpy on Python #534

@michaelklishin
Contributor

I cannot find the link to the thread, but it all comes down to finding a way to install NumPy on multiple Python versions in a reasonable amount of time (e.g. not an hour or more of compilation). Unfortunately, the Travis team does not have the resources or the knowledge of NumPy and the Python ecosystem to do this on our own. Someone has to come up with Debian packages or an installer that will play nicely with virtualenv (on travis-ci.org, every Python has its own virtualenv, currently with system packages disabled).

@peterjc
peterjc commented May 10, 2012

So this isn't as simple as the Travis team doing the installation once manually (for each Python virtualenv) and updating the base VM?

@michaelklishin
Contributor

@peterjc VM images are sometimes updated a few times a week. Sorry, that won't fly.

@Mezzle
Contributor
Mezzle commented Sep 17, 2012

Does the Ubuntu-provided python-numpy not work with virtualenv?

@michaelklishin
Contributor

We do not use --system-site-packages by default (it was recommended by the virtualenv author). Actually now we have something that brings us a lot closer to being able to use system NumPy: there are now two sets of virtualenvs, one with SSP enabled. I will have to investigate.

@peterlandry

What about symlinking python-numpy into the virtualenv?

@michaelklishin
Contributor

Libraries that have native code cannot be symlinked and loaded across all Python versions. CPython's internal APIs change between releases.
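
To illustrate the point about native code, here is a minimal sketch (not from the thread) that prints the filename suffix the running interpreter expects for compiled extension modules. On CPython 3.2 and later this suffix usually embeds an interpreter-specific ABI tag, so a shared object built for one version is not picked up by another. On Python 2 the suffix is typically a bare `.so`, so a mismatch tends to surface only when the module is loaded. The helper name `extension_suffix` is made up for this example.

```python
import sys
import sysconfig


def extension_suffix():
    """Return the filename suffix this interpreter uses for compiled extension modules."""
    # EXT_SUFFIX is the Python 3 name; older interpreters expose the value as SO.
    return sysconfig.get_config_var("EXT_SUFFIX") or sysconfig.get_config_var("SO")


# Run this under each interpreter to compare the suffixes they expect.
print(sys.version_info[:2], extension_suffix())
```

Running it under two different interpreters and comparing the output shows whether they would even look for the same file name.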

@peterlandry

I would be willing to trade broad python version support in numpy for the ability to have my tests complete in at least 2.7. As it is, all of our tests (which rely on numpy) time out.

@peterlandry

Also, I think you can host a binary egg with pre-built numpy versions suitable for your worker boxes. I'll do some research on that myself, but it might be worth looking into.

@michaelklishin
Contributor

We cannot make specializations to our environment for just one library. If someone finds a way to install NumPy for each python in a reasonable amount of time, we will try to use it for provisioning.

@peterlandry

Since I only need to support python 2.7, and fwiw, I was able to get away with the following, if anyone else needs it:

before_install:
  - sudo apt-get install python-numpy
install:
  - ln -s /usr/lib/python2.7/dist-packages/numpy ~/virtualenv/python2.7/lib/python2.7/site-packages/
@peterjc
peterjc commented Oct 18, 2012

Thanks @peterlandry - that trick works nicely for Python 2.7 (although we need the headers too, so I used 'sudo apt-get install python-numpy python-numpy-dev').

It can also be extended to Python 3.2 as well ('sudo apt-get install python3-numpy python3-numpy-dev'). It is a bit fiddly, but perhaps the Biopython .travis.yml file will be illustrative:
biopython/biopython@a082344
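
When mixing an apt-installed NumPy with a virtualenv, it can help to confirm which copy a given interpreter would actually pick up. A minimal sketch, assuming Python 3.4 or later (for `importlib.util.find_spec`, which is newer than the interpreters discussed here); the helper name `locate` is illustrative only. It resolves symlinks, so a package linked into `site-packages` should report its real location under `dist-packages`.

```python
import importlib.util
import os


def locate(module_name):
    """Return the resolved file path a module would be imported from, or None."""
    spec = importlib.util.find_spec(module_name)
    # Missing modules give None; built-in or namespace modules have no file location.
    if spec is None or not spec.has_location:
        return None
    # realpath follows symlinks, revealing the real install directory.
    return os.path.realpath(spec.origin)


print(locate("numpy"))  # None if NumPy is not visible to this interpreter
```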

@michaelklishin
Contributor

I am going to check how long it takes to provision NumPy later today; maybe we can squeeze it into provisioning. On an 8 core machine the results are encouraging; now I need to see if it goes up about 8 times in our VMs.

@peterlandry

I definitely think people would be willing to trade broad version support for easy provisioning, if that helps.

@peterjc peterjc referenced this issue in biopython/biopython Oct 22, 2012
Closed

Add a "pdb-atom" parser to SeqIO (via PdbIO) #75

@michaelklishin michaelklishin added a commit that referenced this issue Nov 4, 2012
@michaelklishin michaelklishin Initial cut at preinstalling NumPy, references #48
Code in the tutorial works but it installed surprisingly quickly
(~ 2 minutes per virtualenv). We will see how it goes.
d552f49
@peterjc
peterjc commented Nov 22, 2012

Is this live? There is no mention of numpy on http://about.travis-ci.org/docs/user/languages/python/ and my assumption from reading d552f49 is it would be automatically preinstalled - but that doesn't seem to be the case, eg: https://travis-ci.org/peterjc/ccn/builds/3314814

Thanks!

@michaelklishin
Contributor

We install it via pip on all Pythons other than 3.x and PyPy. What else is necessary?

@michaelklishin
Contributor

Wait, what you are seeing is likely the result of not deploying new images for almost a month. Try again tomorrow, I will try to roll new ones out.

@michaelklishin
Contributor

This should be deployed. I am not sure if anything else may be necessary so won't close it yet. Please let us know how it goes.

@y-p
y-p commented Nov 23, 2012

Previously working py3 builds using numpy started consistently failing about half a day ago.
example: build.

The error is:

  File "/home/travis/virtualenv/python3.1/build/numpy/build/py3k/numpy/distutils/ccompiler.py", line 458, in CCompiler_get_version
    status, output = exec_command(version_cmd,use_tee=0)
  File "/home/travis/virtualenv/python3.1/build/numpy/build/py3k/numpy/distutils/exec_command.py", line 197, in exec_command
    if _with_python and (0 or sys.__stdout__.fileno()==-1):
ValueError: underlying buffer has been detached

I can reproduce this locally after upgrading to virtualenv 1.8.3, so I'm guessing that's the
cause on Travis' end as well.

I've opened an issue on pypa/virtualenv and found the offending commit. Hopefully
a point release is forthcoming:
https://github.com/pypa/virtualenv/issues/359
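
The `ValueError` in the traceback can be reproduced in isolation with nothing but the standard library. A minimal sketch, assuming Python 3: once a text wrapper's `detach()` has been called, any further use of the wrapper, including `fileno()`, raises this error. Whether virtualenv 1.8.3 detaches `sys.__stdout__` in exactly this way is not shown here; the snippet only demonstrates the failure mode.

```python
import io

# Build a text stream on top of an in-memory byte buffer.
wrapper = io.TextIOWrapper(io.BytesIO(), encoding="utf-8")

# Detaching hands the underlying buffer back and leaves the wrapper unusable.
wrapper.detach()

try:
    wrapper.fileno()
except ValueError as exc:
    message = str(exc)
    print(message)
```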

@y-p
y-p commented Nov 23, 2012

@michaelklishin, with regard to numpy, scipy and long compilation times, have you considered
using ccache as an alternative solution? It might be a good fit for the scenario,
and would be a more general solution than micro-managing specific packages, which I'm sure is exhausting.

@michaelklishin
Contributor

@y-p ccache probably will not work for our case: no state is preserved between builds, and you need to compile extensions separately for different Pythons, too.

@michaelklishin
Contributor

@y-p thanks for digging into pypa/virtualenv#359, if/when a new point release is out, we will try to roll it out in a couple of days max.

@michaelklishin
Contributor

As far as NumPy goes, we preinstall it via pip. Is there anything else that needs to be done? This should be deployed, can someone confirm it does reduce dependency installation time?

@y-p
y-p commented Nov 23, 2012

@michaelklishin, ccache is designed not to produce something different from what your compiler would.
I don't see how separate versions are an exception to that.

If you keep the compiled objects cache on persistent shared storage available to all VMs, you don't
need any other persistent state. You can manage what goes in to the cache yourself by compiling
packages you wish to support, and give the VMs just read-only access.

Perhaps the possible win merits at least some exploration?
In any case, Thank you for your continued work on python support in Travis.

@michaelklishin
Contributor

What I mean is that we cannot reuse caches for 2.5 to speed up compilation of NumPy [or anything else] on 3.2 because internal CPython APIs change over time.

@michaelklishin
Contributor

We do not compile things every time in each VM, we compile it once and then deploy the image. But it already takes quite a bit of time.

@y-p
y-p commented Nov 23, 2012

What I'm suggesting is that you compile what you're interested in supporting once, whenever the build environment (for the package, I mean) changes, and then whenever users install a package which invokes the C/C++ compiler, the artifacts will get pulled from the common cache.

@michaelklishin
Contributor

That's too complicated and will require modifying pip and such. We won't do that, just like we don't patch Rubies, for example. Things should be predictable and as close to what you get on Linux locally as possible.

@y-p
y-p commented Nov 23, 2012

Assuming it would work (!), you could shrink your images, support more packages, decouple
package support from vm building which would speed up the process.
I don't see why pip would need modifying at all.

In any case, it was just a well-meant suggestion, and we've hit our 3 round limit.

thanks again.

@peterlandry

The current setup has been working great for our tests that require numpy. We've been running builds since it was added to the base image with no issues. Thanks!

@peterjc
peterjc commented Nov 24, 2012

Looks good for the pre-installed NumPy on Python 2.5, 2.6 and 2.7 which is great.

However, it doesn't seem to be pre-installed for Python 3.2 yet, at least not for the two repositories I've tried. e.g.
https://travis-ci.org/peterjc/ccn/jobs/3338862

@michaelklishin
Contributor

@peterjc see the discussion above. Something in Virtualenv 1.8.3 makes it impossible to install NumPy on 3.x.

@peterjc
peterjc commented Nov 24, 2012

OK, thanks for clarifying this. A note on http://about.travis-ci.org/docs/user/languages/python/ would be really helpful.

For anyone else needing NumPy under Python 3.2 in the short term, I'm still using the hack based on the idea from Peter Landry earlier in this discussion:

before_install:
  - "if [[ $TRAVIS_PYTHON_VERSION == '3.2' ]]; then sudo apt-get install python3-numpy python3-numpy-dev 2>&1 | tail -n 2; fi"
  - "if [[ $TRAVIS_PYTHON_VERSION == '3.2' ]]; then ln -s /usr/lib/python3/dist-packages/numpy ~/virtualenv/python3.2/lib/python3.2/site-packages/; fi"

eg biopython/biopython@3086cf7

@michaelklishin
Contributor

Just to make it clear, we will preinstall NumPy on all pythons it can install on. But at the moment we had to leave 3.x and PyPy out, because pip install numpy fails on them. When a new Virtualenv release is out, we will try to support 3.x as soon as we can.

@peterjc
peterjc commented Nov 24, 2012

Great.

Note NumPy under PyPy is a special case: because it uses so much C code, NumPy will not work under PyPy (or Jython), and this is not likely to ever work. However, the PyPy team are working on a NumPy emulation layer, NumPyPy, which is bundled with PyPy (at least for now; long term it could be split out).

@y-p
y-p commented Nov 24, 2012

@michaelklishin, I believe pip tries to get 1.6.2 off pypi for py3.3 currently.
Please note this ticket.

I've successfully installed 1.7.0b2 under a py33 env using (a patched) venv 1.8.3,
by having pip directly install the tarball from sourceforge.

@dnouri
dnouri commented Nov 24, 2012

One can also create a new virtualenv with --system-site-packages and then use all globally installed packages, without the need to manually link anything. See: http://danielnouri.org/notes/2012/11/23/use-apt-get-to-install-python-dependencies-for-travis-ci/

@michaelklishin
Contributor

@dnouri or use

virtualenv:
  system_site_packages: true
@dnouri
dnouri commented Nov 24, 2012

Oh, cool, who knew.

FTR, it's

virtualenv:
  system_site_packages: true

(note the added s)

I've updated my blog post...

@y-p
y-p commented Nov 25, 2012

@michaelklishin , 1.8.4 is out.

@michaelklishin
Contributor

I can confirm that NumPy now installs on 3.2 but not on 3.3. Not sure if 3.3 is supported by stable releases or not.

@peterjc
peterjc commented Nov 26, 2012

NumPy 1.6.2 does not work on Python 3.3 and a point release is not being planned to address this,
http://projects.scipy.org/numpy/ticket/2221

NumPy 1.7 (currently in beta) does work on Python 3.3 and will officially support it.

@michaelklishin
Contributor

Unfortunately, I spoke too soon: 3.2 still fails to install NumPy:

  File "/home/travis/virtualenv/python3.2/bin/pip", line 9, in <module>
    load_entry_point('pip==1.2.1', 'console_scripts', 'pip')()
  File "/home/travis/virtualenv/python3.2/lib/python3.2/site-packages/pip-1.2.1-py3.2.egg/pip/__init__.py", line 111, in main
    return command.main(args[1:], options)
  File "/home/travis/virtualenv/python3.2/lib/python3.2/site-packages/pip-1.2.1-py3.2.egg/pip/basecommand.py", line 144, in main
    log_fp = open_logfile(log_fn, 'w')
  File "/home/travis/virtualenv/python3.2/lib/python3.2/site-packages/pip-1.2.1-py3.2.egg/pip/basecommand.py", line 173, in open_logfile
    os.makedirs(dirname)
  File "/home/travis/virtualenv/python3.2/lib/python3.2/os.py", line 152, in makedirs
    mkdir(name, mode)
OSError: [Errno 13] Permission denied: '/root/.pip'
@y-p
y-p commented Nov 26, 2012

pip install numpy works for me locally with venv 1.8.4 in a 3.2 environment.

@y-p
y-p commented Nov 27, 2012

Still getting detach errors on 3.2. So I did some tests on a vagrant box with the python::multi recipe.
By default, the encoding of the terminal is not UTF-8:

vagrant@precise32:/home/travis/virtualenv$ echo $LANG
en_US

Which means the fix introduced in virtualenv 1.8.4 is not effective by default.
See commit log here.

The workaround described worked for me, however. Using PYTHONIOENCODING=utf8 pip install numpy
rather than just pip install numpy allows numpy to be installed under 3.2.

No idea about the permission error you posted.

EDIT: Travis does use UTF-8 by default (I didn't replicate the VM properly); it is simply still at 1.8.3. Sorry.
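
For anyone wanting to see what `PYTHONIOENCODING` actually changes, here is a minimal sketch, assuming Python 3. It starts a child interpreter with the variable set and reports the encoding that child uses for `sys.stdout`. The helper name `child_stdout_encoding` is invented for this example, and the result is passed through `codecs.lookup` because the exact spelling of the encoding name varies between Python versions.

```python
import codecs
import os
import subprocess
import sys


def child_stdout_encoding(io_encoding):
    """Run a child interpreter with PYTHONIOENCODING set; return its stdout encoding."""
    # Copy the current environment and override only the variable of interest.
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    out = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
    )
    # Normalise spellings such as "utf8" and "UTF-8" to one canonical codec name.
    return codecs.lookup(out.decode("ascii").strip()).name


print(child_stdout_encoding("utf8"))
```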

@y-p
y-p commented Nov 28, 2012

@michaelklishin , any update on when 1.8.4 will go live?

@michaelklishin
Contributor

@y-p I am very busy this week and probably the next one, so it will go live as soon as someone else deploys new images. 1.8.4 did not fix NumPy installation issues for Python 3.x, though.

@y-p
y-p commented Nov 29, 2012

@michaelklishin , 1.8.4 did fix the previous problems with "detach" errors, verified on debian testing and on a stock precise32 vm. The "Permission denied" error you mentioned looks to be travis-specific.

@y-p
y-p commented Nov 29, 2012

Until things get resolved, here is a working workaround:

language: python

python:
  - 2.6
  - 2.7
  - 3.1
  - 3.2


install:
  - export PYTHONIOENCODING=UTF8 # just in case
  - 'if [ $TRAVIS_PYTHON_VERSION == "3.2" ] || [ $TRAVIS_PYTHON_VERSION == "3.1" ]; then pip install https://github.com/y-p/numpy/archive/1.6.2_with_travis_fix.tar.gz; fi'
  - 'if [ ${TRAVIS_PYTHON_VERSION:0:1} == "2" ]; then pip install numpy; fi' # should be nop if pre-installed

script:
   - nosetests

The issues have been fixed both on the numpy side (on master) and on the venv/distribute side (in 1.8.4). The above
installs numpy from a snapshot of 1.6.2 plus the cherry-picked fix from numpy/numpy master. Once 1.8.4
is live, stock 1.6.2 should work again (in a utf8 environment).

@michaelklishin
Contributor

3.1 is gone now, replaced by 3.3.

@y-p
y-p commented Nov 29, 2012

@michaelklishin, just tried again, the 3.3 environment reports python version 2.7.3. stale image?

@michaelklishin
Contributor

There were no deployments in the last 4 days or so.

@certik
certik commented Dec 4, 2012

@y-p, which numpy commit in master fixes this? I can't find it. The only one I could find is numpy/numpy@f3905dc, which starts to use "pip" instead of "setup.py install", but that commit does not fix the __stdout__ problem.

@y-p
y-p commented Dec 4, 2012

It's numpy/numpy@649c908. Also fixed independently on the virtualenv side in 1.8.4.

@certik
certik commented Dec 4, 2012

Thanks! I'll backport it into the 1.7.x branch, so that we can run tests for the next release.

@y-p
y-p commented Dec 18, 2012

bump. still waiting on virtualenv 1.8.4 deploy.

@y-p
y-p commented Dec 21, 2012

thanks. 1.8.4 is live.

@astrofrog

Thanks! Python 3.2 is now working, but I'm getting a strange problem with Python 3.1:

https://travis-ci.org/astropy/astropy/builds/3772823

Any ideas for why this is failing?

@michaelklishin
Contributor

@astrofrog 3.1 is no longer provided, replaced with 3.3.

travis-ci/travis-ci.github.com#157 (comment)

@peterjc
peterjc commented Dec 21, 2012

Python 2.5, 2.6 and 2.7 seem to be working fine.

For me neither Python 3.2 nor 3.3 seem to have numpy pre-installed:
https://travis-ci.org/biopython/biopython/builds/3774538

Note we were until recently using the system Python to get NumPy under Python 3.2,
biopython/biopython@9e3f694

@peterjc
peterjc commented Jan 17, 2013

I still don't see NumPy pre-installed under Python 3.2, here's a fairly small project showing the problem:
https://travis-ci.org/peterjc/ccn/builds/4207892
https://travis-ci.org/peterjc/ccn/builds/4207976

@y-p
y-p commented Mar 23, 2013

@michaelklishin , can you post a note to this thread when you change the major version
of preinstalled numpy? that's not likely to happen again this decade, but quietly changing
the version can hide regressions if someone didn't catch the change.

Thanks.

@michaelklishin
Contributor

We install the most recent one and don't follow NumPy development. Sorry, I think this is something not worth worrying about.

@y-p
y-p commented Mar 23, 2013

I see.

@y-p
y-p commented Mar 23, 2013

Actually I don't. travis is a CI service, and you're saying version management is not something
"worth worrying about"?

speechless.

Update: point taken, it is unreasonable. I expect other users will trip over this, but I agree the proper fix
is to be more explicit in the individual project's dependency management.

@michaelklishin
Contributor

@y-p it's up to individual projects to track ecosystem changes and enforce installation of what they need. The Travis CI environment preinstalls the most commonly used version of some libraries, either to save time or for convenience.

Asking Travis maintainers to monitor 14 ecosystems and the updates of popular libraries in them is unreasonably demanding. Plus, per your own words, a new major NumPy version is unlikely to appear any time soon.
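
One way a project can enforce this itself is to pin the version it installs (pip accepts specifiers such as `numpy==1.6.2`) and to add a cheap guard to its test suite. The sketch below shows such a guard under stated assumptions: `release_tuple` and `same_major` are illustrative helper names, and the version strings are examples taken from this thread rather than anything read from an installed package.

```python
import re


def release_tuple(version):
    """Leading numeric components of a version string, e.g. '1.7.0b2' -> (1, 7, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError("no numeric release found in %r" % (version,))
    return tuple(int(part) for part in match.group(0).split("."))


def same_major(installed, expected_major):
    """True if the installed version has the major number the project was tested with."""
    return release_tuple(installed)[0] == expected_major


# In a real test suite, `installed` would come from numpy.__version__.
print(same_major("1.7.0b2", 1))
```

A failing guard makes a silent change in the preinstalled version show up as an explicit test failure instead of an unexplained regression.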
