Install auto-sklearn in mac #155

Closed
dragonfly90 opened this issue Oct 4, 2016 · 33 comments
Labels
documentation Something to be documented enhancement A new improvement or feature

Comments

@dragonfly90

I tried to install auto-sklearn on a Mac and got this error: Failed building wheel for pyrfr. Is a Mac version available?

@akodate

akodate commented Oct 9, 2016

Are you using Python 3?

@mfeurer
Contributor

mfeurer commented Oct 10, 2016

According to PR #122, auto-sklearn can be installed on a MAC. We'll investigate how to set up travis-ci to test for MAC OS.

@dmlicht

dmlicht commented Oct 11, 2016

I'm seeing this issue while installing as well. For me, it seems to be about including the <random> header:

http://pastebin.com/bV9AwvvP

@mfeurer
Contributor

mfeurer commented Oct 12, 2016

It seems you're not using a C++11-compatible compiler; at least that's what Google suggests. Setting up a test on travis-ci is high on our todo list. Nevertheless, no one from the auto-sklearn development team is using a MAC or has experience with MAC OS, so support will still be very limited.
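
A minimal sketch of how one might check this locally, assuming Python 3 and that the compiler being probed is on the PATH. The helper name supports_cxx11_random is hypothetical and is not part of auto-sklearn or pyrfr; it only tries to compile a tiny file that includes <random> with -std=c++11.

```python
import os
import subprocess
import tempfile

# Tiny C++11 program that needs the <random> header.
PROBE_SOURCE = """
#include <random>
int main() {
    std::mt19937 gen(0);
    return static_cast<int>(gen() % 1);
}
"""


def supports_cxx11_random(compiler, extra_flags=("-std=c++11",)):
    """Return True if `compiler` can compile a file that includes <random>."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "probe.cpp")
        obj = os.path.join(tmp, "probe.o")
        with open(src, "w") as fh:
            fh.write(PROBE_SOURCE)
        try:
            result = subprocess.run(
                [compiler, *extra_flags, "-c", src, "-o", obj],
                stdout=subprocess.PIPE,
                stderr=subprocess.PIPE,
            )
        except OSError:
            # The compiler was not found or could not be executed.
            return False
        return result.returncode == 0
```

Calling supports_cxx11_random("clang++") or supports_cxx11_random("g++") before running pip would show whether the compiler itself is the problem; on a Mac, extra flags such as -stdlib=libc++ can be passed through extra_flags.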

@dragonfly90
Author

@akodate Yes, I am using Python 3.5. @dmlicht, I also hit the error about <random>.

@mfeurer mfeurer added the enhancement A new improvement or feature label Oct 17, 2016
@donaldbraman

One way to end this dependency problem is to create a docker image for folks. Are any of the devs already using docker / want to share a docker file?

@mfeurer
Contributor

mfeurer commented Oct 18, 2016

Thank you for your suggestion. So far, none of the current developers (currently @AYaro and me) are using docker or MAC OS. We're both using Ubuntu, on which it's possible to use python without Anaconda or Docker. Since I'm not familiar with docker, would a Dockerfile be part of the auto-sklearn repository or would it reside somewhere else? I would be happy to include such a file in the repo, but don't have the time to create one myself. I'll add the label help needed to this issue.

@donaldbraman

Thank you! That would be really helpful to many folks, as it would mean the only dependency any user would need is docker. Since docker can map ports and share directories, it would mean a simple one-line installation for people who want to try out auto-sklearn, and it would solve problems with odd install environments. It would also mean that people could contribute to an entire working auto-sklearn stack by making a very simple pull request on the Dockerfile. @timothyjlaurent posted a quick docker install in PR #122, so I'll see if he wants to contribute a docker file he is already using.

There are a few approaches to creating & locating docker files. Personally I prefer the method employed by, among others, the jupyter project, in which you would create a separate "docker-stacks" repository that has one or more directories, each with a Dockerfile in it.

@donaldbraman

OK, @mfeurer, if you create an empty repository called docker-stacks, then @timothyjlaurent and I can fork and make pull requests to add the appropriate files.

@mfeurer
Contributor

mfeurer commented Oct 19, 2016

Okay, I can do so. While this seems to be a clean solution, wouldn't it be better if the file lived in the auto-sklearn repository? We could then try to include it into the build system so travis-ci continuously checks whether the current code runs inside the docker image.

@mfeurer
Contributor

mfeurer commented Oct 19, 2016

A different, potential solution would be to use the Anaconda gcc compiler inside Anaconda.

Since I don't have a MAC: if one of you (@donaldbraman @dragonfly90 @dmlicht) uses Anaconda under MAC OS, could you please check whether the command conda install gcc works there and installs a gcc compiler? If yes, then the solution in #176 might apply to this issue as well.

@donaldbraman

donaldbraman commented Oct 19, 2016

That is a good point, and apologies for jumping to a proposed location.

I think that the benefit to having a separate docker repository is that it allows you to build multiple docker images with different features. If most people use auto-sklearn in the same environment, it probably would make sense to locate a single docker image in your main repository. (This works well for things like blogs that have a single desired stack.) On the other hand, if users often have different flavors of auto-sklearn (perhaps a minimal build and a full-featured build) or users often integrate auto-sklearn into other environments (jupyter, rodeo, beaker), then folks could contribute tested stacks that incorporate auto-sklearn (and your other projects) into functional and integrated data science stacks. I think starting with a single build is a good idea, but a separate repository would give you room to grow. And a simple travis build-test-all file in the new repo could test any docker builds.

So I don't have a strong argument against putting a Dockerfile in your existing repository, just a few things to think about based on my observations of how other developers have worked. And, of course, folks like @timothyjlaurent (or those over at jupyter, given the stacks they maintain) may have better advice.

@timothyjlaurent
Contributor

@mfeurer, to answer your question: conda install gcc will install a working gcc compiler on a Mac.

@mfeurer
Contributor

mfeurer commented Oct 19, 2016

@donaldbraman then I would prefer to keep things simple in the beginning and get a docker image running in the first place. If there is need for a second, different docker stack, I will of course create a new dockerstack repository.

@timothyjlaurent thanks for your answer. I think my question didn't ask for what I actually wanted to know. Basically, is it possible to install auto-sklearn on a MAC if one uses anaconda and does a conda install gcc prior to installing auto-sklearn and all dependencies?

@timothyjlaurent
Contributor

@mfeurer Thanks for clarifying. I made a new conda env and installed the requirements.txt packages (btw, the cat needs to be changed to curl in the directions; PR forthcoming), and then was able to pip install auto-sklearn.

BTW although I did conda install gcc I already had gcc in my path so it wasn't used for this. You may want to suggest that people install XCode Command Line Tools to get the requisite compilers.

@mfeurer
Contributor

mfeurer commented Oct 20, 2016

@timothyjlaurent Thanks for the PR with the fix. The issue with MAC OS is that I don't know anything about it. Thus, I would really like a solution that is as simple as installing another conda package to get things working. The other advantage of using a conda package is that the provided compiler will be compatible with the python version of anaconda. This is something that was broken with Ubuntu 16.04, where binaries compiled with the shipped gcc are no longer compatible with the Anaconda python executable (see #176). Long story short, it would be awesome if you could check whether you can install auto-sklearn if you remove gcc from your path and try the anaconda gcc package.

@timothyjlaurent
Contributor

sure thing ... I'll give it a try.

@timothyjlaurent
Contributor

@mfeurer, for some reason I had to move /usr/bin to the end of my $PATH so that the conda-installed gcc is picked up first when running which gcc.

I reinstalled everything with this command:

curl https://raw.githubusercontent.com/automl/auto-sklearn/master/requirements.txt | xargs -n 1 -L 1 pip install --no-cache-dir --force-reinstall -I --no-deps --upgrade

and then installed auto-sklearn with pip install auto-sklearn --no-cache-dir --force-reinstall -I --no-deps --upgrade

Everything went well and installed without an error.
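
For anyone who wants to see the effect of the PATH ordering without touching their shell profile, here is a small sketch under the assumption that Python 3 is available. The helper names resolve_on_path and move_to_end are hypothetical; shutil.which is the standard-library counterpart of which.

```python
import os
import shutil


def resolve_on_path(program, path_entries):
    """Return the first match for `program` when searching `path_entries` in order."""
    return shutil.which(program, path=os.pathsep.join(path_entries))


def move_to_end(path_entries, entry):
    """Return a copy of `path_entries` with `entry` moved to the end, if present."""
    kept = [p for p in path_entries if p != entry]
    if entry in path_entries:
        kept.append(entry)
    return kept


def current_path_entries():
    """Split the current PATH into its directories."""
    return os.environ.get("PATH", "").split(os.pathsep)
```

Comparing resolve_on_path("gcc", current_path_entries()) with resolve_on_path("gcc", move_to_end(current_path_entries(), "/usr/bin")) shows which gcc wins before and after the reordering.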

@mfeurer
Contributor

mfeurer commented Oct 21, 2016

@timothyjlaurent Thank you very much, this seems like a low-effort solution to the installation problem on MAC OS. Would you nevertheless still be interested in contributing a docker file?

@timothyjlaurent
Contributor

@mfeurer, Sure thing. I could use ubuntu, or python (debian based) images as a base.

I could also include a docker-compose.yml file that would mount the current directory into the container and then the user would only have to run docker-compose up and would have a running container with pwd mounted at some location in the container.

At some point, it might be interesting to see how we might integrate spark into this project, but that seems more involved and a suitable discussion for another issue.

Should I make an issue for the docker stuff or is there an existing one I should link to?

@mfeurer
Contributor

mfeurer commented Oct 24, 2016

@timothyjlaurent there is no issue for docker stuff yet, so please go ahead and create one. I do not really know how docker works, but having the current directory mounted on startup sounds like an excellent idea. Could you please describe what spark has to do with auto-sklearn?

Unrelated to docker, we're now able to run unittests on MAC OS since #179. I opened issue #180 to document how auto-sklearn can be installed in an anaconda environment.

@dragonfly90 did you try to install auto-sklearn inside an anaconda environment or outside? If outside, please use anaconda. If inside, please do

conda install gcc
pip uninstall pyrfr auto-sklearn
curl https://raw.githubusercontent.com/automl/auto-sklearn/master/requirements.txt | xargs -n 1 -L 1 pip install
pip install pyrfr auto-sklearn --no-cache-dir
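
The curl | xargs line above runs one pip install per requirement, in file order. A rough Python equivalent, as a sketch only: the function names are hypothetical, the parser assumes simple one-specifier-per-line requirements (no URLs containing #), and nothing is installed unless dry_run is set to False.

```python
import subprocess
import sys


def parse_requirements(text):
    """Return requirement specifiers, skipping blank lines and # comments."""
    specs = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            specs.append(line)
    return specs


def install_one_by_one(specs, dry_run=True):
    """Build (and optionally run) one `pip install` command per requirement."""
    commands = [[sys.executable, "-m", "pip", "install", spec] for spec in specs]
    if not dry_run:
        for cmd in commands:
            # Stop at the first failing package, like a failing shell pipeline step.
            subprocess.check_call(cmd)
    return commands
```

With dry_run left at True the function only returns the commands it would run, which makes it easy to inspect the order before installing anything.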

@dragonfly90
Author

dragonfly90 commented Oct 26, 2016

@mfeurer, I tried your command and still got the following. Should I change clang to gcc?

#warning "Using deprecated NumPy API, disable it by " \
   ^
  pyrfr/regression.cpp:463:10: fatal error: 'random' file not found
  #include <random>
           ^
  1 warning and 1 error generated.
  error: command '/usr/bin/clang' failed with exit status 1

  ----------------------------------------
  Failed building wheel for pyrfr
  Running setup.py clean for pyrfr

@dragonfly90
Author

@mfeurer, setting CC works:

conda install gcc
pip uninstall pyrfr auto-sklearn
curl https://raw.githubusercontent.com/automl/auto-sklearn/master/requirements.txt | xargs -n 1 -L 1 pip install
CC=/Users/username/anaconda/bin/gcc pip install pyrfr auto-sklearn --no-cache-dir
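
The same idea can be scripted: set CC only for the pip subprocess instead of for the whole shell. This is a sketch with hypothetical helper names; the compiler path is the placeholder from the command above and has to be adjusted to the local Anaconda install.

```python
import os
import subprocess
import sys


def env_with_compiler(cc_path, base_env=None):
    """Return a copy of the environment with CC pointing at `cc_path`."""
    env = dict(os.environ if base_env is None else base_env)
    env["CC"] = cc_path
    return env


def pip_install_command(packages):
    """Build the pip command line used for the install (no caching)."""
    return [sys.executable, "-m", "pip", "install", "--no-cache-dir", *packages]


def run_pip_install(packages, cc_path):
    """Run pip with CC overridden; raises CalledProcessError if the build fails."""
    subprocess.check_call(pip_install_command(packages), env=env_with_compiler(cc_path))
```

For example, run_pip_install(["pyrfr", "auto-sklearn"], "/Users/username/anaconda/bin/gcc") mirrors the last shell line; the parent shell's environment stays unchanged.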

@mfeurer
Contributor

mfeurer commented Oct 31, 2016

I'm surprised by the additional requirement to set the CC variable. It wasn't necessary in the travis-ci environment. If you do which gcc, does it point to the anaconda gcc or the system gcc?

@dragonfly90
Author

@mfeurer it points to anaconda gcc

@namankumar

I've successfully installed on Mac OS X 10.12.1 without the need for gcc, which is a redundant (and huge, size-wise) thing to install on a mac given that Apple has adapted Clang specifically for macs.

You need 3 things:

  1. Download a Fortran compiler from hpc.sourceforge.net (I cannot recall right now why this is needed; I think for Cython).
  2. Edit the PyRFR setup.py so CC points to clang++, which supports C++11 headers.
  3. Add a flag to PyRFR for clang++ so the minimum Mac OS X version is 10.9; otherwise C++11 isn't picked up. By default, the package will compile with the same flags that your Python installation was compiled with. The Mac OS X version at the time Python was compiled was probably from before C++11 became pervasive.
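
To see which build settings a given Python installation would hand to extension builds, the standard-library sysconfig module can be queried. A minimal sketch (the function name build_time_settings is hypothetical; variables that are not defined on a platform come back as None):

```python
import sysconfig


def build_time_settings():
    """Return the compiler-related settings recorded when Python was built."""
    names = ("CC", "CXX", "CFLAGS", "MACOSX_DEPLOYMENT_TARGET")
    return {name: sysconfig.get_config_var(name) for name in names}


def print_build_time_settings():
    """Print one setting per line for quick inspection."""
    for name, value in build_time_settings().items():
        print("{} = {}".format(name, value))
```

On a Mac, a MACOSX_DEPLOYMENT_TARGET below 10.9 in this output would be consistent with the behaviour described in step 3.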

Happy to help out if anyone needs a hand getting this up and running.

Here is what my PyRFR setup.py looks like. I added extra flags/options for convenience.

from distutils.core import setup, Extension
import distutils.command.build
import numpy as np
from subprocess import call
from Cython.Build import cythonize
import os

os.environ["CC"] = "clang++"
os.environ["CFLAGS"] = ""
os.environ["LINKCC"] = ""
#os.environ["MACOSX_DEPLOYMENT_TARGET"] = "10.9"

include_dirs = ['./include', '/Users/naman/miniconda3/envs/envautosklearn/lib/python3.5/site-packages/numpy/core/include']
extra_compile_args = ['-std=c++11', '-stdlib=libc++', '-mmacosx-version-min=10.9']

extensions = cythonize(
        [
                Extension('pyrfr.regression',
                          sources=['pyrfr/regression.pyx'],
                          language="c++",
                          include_dirs=include_dirs,
                          extra_compile_args=extra_compile_args
                          ),

                Extension('pyrfr.regression32',
                          sources=['pyrfr/regression32.pyx'],
                          language="c++",
                          include_dirs=include_dirs,
                          extra_compile_args=extra_compile_args
                          )
        ])
setup(
        name='pyrfr',
        version='0.2.1',
        author='Stefan Falkner',
        author_email='sfalkner@cs.uni-freiburg.de',
        license='Use as you wish. No guarantees whatsoever.',
        classifiers=['Development Status :: 3 - Alpha'],
        packages=['pyrfr'],
        ext_modules=extensions
)
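
The setup.py above hard-codes one user's NumPy include path and always passes the Mac-specific flags. A hedged sketch of how those two pieces could be made portable, assuming NumPy is installed in the build environment; the helper names are hypothetical and this is not the upstream pyrfr setup.py:

```python
import sys


def cxx11_compile_args(platform=sys.platform):
    """Return C++11 flags, adding the clang/libc++ flags only on Mac OS X."""
    args = ["-std=c++11"]
    if platform == "darwin":
        # Same Mac-specific flags as in the setup.py above.
        args += ["-stdlib=libc++", "-mmacosx-version-min=10.9"]
    return args


def numpy_include_dir():
    """Ask NumPy for its header directory instead of hard-coding a path."""
    import numpy  # imported lazily so the flag helper works without NumPy
    return numpy.get_include()
```

In a setup.py these would be used as include_dirs=['./include', numpy_include_dir()] and extra_compile_args=cxx11_compile_args().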

@mfeurer mfeurer added the documentation Something to be documented label May 10, 2017
@mfeurer
Contributor

mfeurer commented May 16, 2017

The new release of auto-sklearn has a brief explanation why we can't support MAC OSX: http://automl.github.io/auto-sklearn/stable/installation.html#mac-osx

We'd still be happy about any contribution which allows us to support OSX without any additional overhead.

I'm closing this for now because of inactivity.

@mfeurer mfeurer closed this as completed May 16, 2017
@ghost

ghost commented May 17, 2017

Maybe helpful for someone also trying to install auto-sklearn on a Mac...

Following the #122 (comment) by @timothyjlaurent and the Dockerfile by @donaldbraman, I managed to get auto-sklearn to work on my Mac via Docker, with some changes to the Dockerfile though:

FROM ubuntu
RUN apt-get update
RUN apt-get install -yq build-essential swig
RUN apt-get install -yq python3-pip
RUN pip3 install numpy six cython
RUN pip3 install pyrfr==0.2
RUN pip3 install auto-sklearn
RUN pip3 install jupyter

CMD ["bash"]

The resulting Docker image is about 1.23 GB. I used the above Dockerfile to build an image on a Debian host without problem.

@timothyjlaurent @donaldbraman Does it look to you like the Dockerfile can be simplified further?
@mfeurer Would it be worthwhile to have the Docker build continuously tested, so we know when it fails?
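
One cheap thing a continuous build could run inside the image is an import smoke test. A sketch, assuming the packages install under the module names autosklearn and pyrfr; the helper names are hypothetical:

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def smoke_test(names=("autosklearn", "pyrfr")):
    """Exit with a non-zero status if any expected module is missing."""
    missing = missing_modules(names)
    if missing:
        sys.exit("missing modules: " + ", ".join(missing))
    print("all modules found")
```

Saved as a script that calls smoke_test(), this could be executed with python3 inside the container; a non-zero exit status would then mark the build as failed.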

@mfeurer
Contributor

mfeurer commented May 17, 2017

If possible it would be great to have this Dockerfile continuously tested. I just don't know anything about Docker, so you'd have to create a PR to auto-sklearn which adds a dockerfile including test integration in travis-ci.

@ghost

ghost commented May 17, 2017

@mfeurer Thanks for the quick reply.

I have tried two options so far: Docker's automated builds, and Travis CI. In short, both can be used to continuously test if the Docker image can be built properly from the Dockerfile, but with the former the image is also available from Docker Hub.

With Docker's automated builds, it can be set up such that every push to Github will trigger a build on Docker Hub. The Docker build status can be added to the README.md as well. Users could then pull the image with, say, docker pull automl/auto-sklearn and then launch a Jupyter notebook to play with auto-sklearn. That would work on Mac/Linux and probably Windows too.

With Travis CI, to test the build the .travis.yml seems to need a sudo: required (you'll find the file in the linked Github repo below), which doesn't seem to play nicely with auto-sklearn's Travis setup... Anyway with Travis the Docker image is not built.

To test things out, I created this Github repo with the Dockerfile. I then created a Docker hub repo, and linked the two. I followed https://docs.docker.com/docker-hub/builds/ to set up automated builds (which took a few minutes). In the README.md, I added a docker build status badge, which reflects the most recent build status.

I've forked auto-sklearn and added a Dockerfile. Do you think it's okay to go with the automated builds approach? If so, I am ready to do a PR. I could also help with setting up a repo on Docker Hub.

@mfeurer
Contributor

mfeurer commented May 18, 2017

Thanks @felixleungsc for the detailed explanation. The dockerhub looks like the right environment to build such docker images and test them (I don't want to reactivate the sudo-builds on travis-ci). In order to use these, I need to create a new repository for the automl organization, right? This can then be linked to dockerhub, as well as the original auto-sklearn repository. Then, each commit to the original auto-sklearn repository will trigger a build on docker-hub, right? In order to do so, you would create a PR to auto-sklearn and to a repo that I need to create. Did I understand everything correctly?

@ghost

ghost commented May 19, 2017

Yes, except you don't need to create a new (Github) repo. We would only need one PR against the development branch of auto-sklearn, adding the Dockerfile. Then over at Docker Hub, you would need to "Create Organization" for automl, then create a Docker repo for auto-sklearn via "Create Automated Builds", which would allow you to link to the Github auto-sklearn repo (the resulting Docker repo would look something like https://hub.docker.com/r/automl/auto-sklearn/). There in settings you can link it to the Github repo. Once linked, yes, each Github commit will trigger a build on Docker Hub.

I will create a PR shortly then. Thanks for making auto-sklearn by the way. Awesome stuff!

@ksboy

ksboy commented Apr 11, 2019

@mfeurer, set CC works

conda install gcc
pip uninstall pyrfr auto-sklearn
curl https://raw.githubusercontent.com/automl/auto-sklearn/master/requirements.txt | xargs -n 1 -L 1 pip install
CC=/Users/username/anaconda/bin/gcc pip install pyrfr auto-sklearn --no-cache-dir

I tried the commands below, and they worked:

conda install clang_osx-64
conda install clangxx_osx-64


8 participants