host on AWS, use multiple GPUs #8
It would have been cool to use Google Cloud, but they don't seem to want to give any of us access.

Ok, here we go. I have a script that can be run on a fresh instance:

# spin up a g2.2xlarge with ubuntu 14.04
# before starting, scp the tarball for cudnn (cudnn-7.5-linux-x64-v5.0-rc.tgz) to /tmp
sudo add-apt-repository ppa:ubuntugis/ubuntugis-testing -y
sudo apt update
export LANGUAGE="en_US.UTF-8"
export LANG="en_US.UTF-8"
export LC_ALL="en_US.UTF-8"
sudo locale-gen "en_US.UTF-8"
sudo dpkg-reconfigure locales
# blacklist nouveau gpu driver (in favor of CUDA)
echo -e "blacklist nouveau\nblacklist lbm-nouveau\noptions nouveau modeset=0\nalias nouveau off\nalias lbm-nouveau off\n" | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
echo options nouveau modeset=0 | sudo tee -a /etc/modprobe.d/nouveau-kms.conf
sudo update-initramfs -u
# apt prerequisites
sudo apt install -y build-essential git swig default-jdk zip zlib1g-dev libbz2-dev python2.7 python2.7-dev cmake python-pip mercurial libffi-dev libssl-dev libxml2-dev libxslt1-dev libpq-dev libmysqlclient-dev libcurl4-openssl-dev libjpeg-dev libpng12-dev gfortran libblas-dev liblapack-dev libatlas-dev libquadmath0 libfreetype6-dev pkg-config libshp-dev libsqlite3-dev libgd2-xpm-dev libexpat1-dev libgeos-dev libgeos++-dev libxml2-dev libsparsehash-dev libv8-dev libicu-dev libgdal1-dev libprotobuf-dev protobuf-compiler devscripts debhelper fakeroot doxygen libboost-dev libboost-all-dev gdal-bin linux-image-extra-virtual linux-source
# cuda
cd /tmp
wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1404/x86_64/cuda-repo-ubuntu1404_7.5-18_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1404_7.5-18_amd64.deb
sudo apt update
sudo apt install -y cuda
sudo apt install -y linux-headers-$(uname -r)
sudo reboot now # <<<<<< reboot!
sudo modprobe nvidia # should return no errors
# cuDNN - assumes you already have the tarball in /tmp
cd /tmp
tar -xzf cudnn-7.5-linux-x64-v5.0-rc.tgz
sudo cp /tmp/cuda/lib64/* /usr/local/cuda/lib64
sudo cp /tmp/cuda/include/* /usr/local/cuda/include
# virtualenv
sudo pip install --upgrade pip
sudo pip install virtualenv
cd ~
virtualenv venv
source venv/bin/activate
# python prerequisites
pip install https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.8.0-cp27-none-linux_x86_64.whl
pip install gdal --global-option=build_ext --global-option="-I/usr/include/gdal/"
git clone --branch v2.6.1 https://github.com/osmcode/libosmium.git /tmp/libosmium
pip install --global-option=build_ext --global-option="-I/tmp/libosmium/include" git+https://github.com/osmcode/pyosmium@v2.6.0

At the end of all this, you can do the following and observe TensorFlow using the GPU:

$ export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64"
$ export CUDA_HOME=/usr/local/cuda
$ source venv/bin/activate
$ python
Python 2.7.6 (default, Jun 22 2015, 17:58:13)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcurand.so locally
>>>

I created an AMI with the above script. You can spin up the AMI and run the following to clone and run DeepOSM:

# global vars that need to be set
export AWS_ACCESS_KEY_ID=***
export AWS_SECRET_ACCESS_KEY=***
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64"
export CUDA_HOME=/usr/local/cuda
source ~/venv/bin/activate
# make a /data and /data/cache directory on the SSD for DeepOSM to use
sudo mkdir -p /mnt/data/cache
sudo ln -s /mnt/data /data
sudo chmod -R 777 /mnt/data
export GEO_DATA_DIR=/data
# DeepOSM
git clone https://github.com/trailbehind/DeepOSM.git /tmp/DeepOSM
cd /tmp/DeepOSM
ln -s /tmp/DeepOSM/s3config-default /home/ubuntu/.s3cfg
pip install -r requirements_gpu.txt
export PYTHONPATH=`pwd`
# now you can run DeepOSM scripts!
python bin/create_training_data.py
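Before running the scripts above end-to-end, it can help to confirm that the dynamic loader actually resolves the CUDA libraries TensorFlow opens at import time. This is a standalone sketch (not part of the original setup), using the library names from the import log above:

```python
import ctypes

# Shared libraries TensorFlow 0.8 dlopens at import time (names taken from
# the import log above). ctypes.CDLL goes through dlopen(), so it honors
# the LD_LIBRARY_PATH export used in the setup scripts.
def check_cuda_libs():
    libs = ["libcublas.so", "libcudnn.so", "libcufft.so",
            "libcuda.so.1", "libcurand.so"]
    status = {}
    for name in libs:
        try:
            ctypes.CDLL(name)
            status[name] = "ok"
        except OSError:
            status[name] = "NOT FOUND"
    return status

if __name__ == "__main__":
    for name, result in sorted(check_cuda_libs().items()):
        print("%-14s %s" % (name, result))
```

If any line reports NOT FOUND, re-check the LD_LIBRARY_PATH export and the cuDNN copy steps before importing TensorFlow.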
Now, a couple of questions for you all:

Then I could compare the performance and experience to my Linux box.

@silberman likes Jupyter notebooks a lot too - I think he sees us providing
It seems like a good production solution could be:
Apps 1 & 2 provide an API to app 3, which publishes data to S3 for app 4 to imbibe into its own Django Postgres? My guess is this production solution will start to be more of a requirement at scale - it will be more convenient to do more than one state if we set up something like this, and to provide more flexible analysis. We can go ahead and do deeposm.org/delaware, but then we may have to get this done.
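The publish/ingest handoff above can be pinned down with a sketch. Everything here is hypothetical - function names, the bucket key layout, and a plain dict standing in for S3 - it only illustrates who produces and consumes what:

```python
import json

# Hypothetical sketch of the proposed production flow: the analysis side
# (app 3) publishes per-state results to S3, and the deeposm.org web app
# (app 4) ingests them into its own database. A dict stands in for S3,
# and a list stands in for the Django/Postgres tables; names are invented.
FAKE_S3 = {}

def publish_analysis(state, findings):
    """App 3: publish analysis results for one state to S3 as JSON."""
    key = "deeposm-results/%s.json" % state
    FAKE_S3[key] = json.dumps({"state": state, "findings": findings})
    return key

def ingest_into_webapp(key, database):
    """App 4: pull a result file from S3 and load it into the web app's
    own database (modeled here as an in-memory list)."""
    record = json.loads(FAKE_S3[key])
    database.append(record)
    return record

database = []
key = publish_analysis("delaware", [{"way_id": 123, "error": "missing road"}])
ingest_into_webapp(key, database)
print(database[0]["state"])  # -> delaware
```

The point of the decoupling is that adding a second state is just one more object in the bucket - the web app never has to talk to the training machines directly.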
Here's a FOSS repo doing TensorFlow on AWS using Docker. This is a familiar stack.
Google Cloud also released their GPU offering recently.