Windows installation

THIS IS OUTDATED DOCUMENTATION. See: https://github.com/Theano/Theano/wiki/Converting-to-the-new-gpu-back-end(gpuarray)

Table of Contents

Overview
Some Installation Debugging Tips
Install MinGW (and upgrade gcc)
Install and Test PyCuda
Install and Test Theano
Other Resources

Overview

I've broken this guide into two steps: (1) installing PyCuda and its dependencies and (2) installing Theano. After (1) is done, (2) is relatively trivial. Testing (1) also gives a "baseline" test that CUDA is working.

Some Installation Debugging Tips

# under MinGW or Cygwin:
% file /path/to/exe               # determine architecture of exe
% echo $PATH | sed "s|:|\\n|g"    # display elements of $PATH one line at a time (no -i: sed reads stdin here)
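
If no Unix-like shell is at hand yet, the PATH listing can be approximated from Python. This is only a minimal sketch and not part of the original setup; the helper name path_entries is hypothetical.

```python
import os


def path_entries(path_value=None, sep=os.pathsep):
    """Split a PATH-style string into its non-empty elements."""
    if path_value is None:
        # Fall back to the PATH of the current process.
        path_value = os.environ.get("PATH", "")
    return [entry for entry in path_value.split(sep) if entry]


if __name__ == "__main__":
    # One PATH element per line, like the sed command above.
    for entry in path_entries():
        print(entry)
```

On Windows os.pathsep is ';', while under MinGW or Cygwin the shell's own $PATH uses ':', which is why the separator is a parameter.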

Install MinGW (and upgrade gcc)

This is needed for the "Install Theano" steps, but I have it before PyCuda because having a Unix-like shell is really useful for some debugging and editing tasks. Similarly, I recommend having Cygwin around, but I'm not going to lay out those instructions. Basic installation of MinGW is covered at http://www.mingw.org/wiki/Getting_Started.

I had to upgrade my gcc from gcc-4.5.x to gcc-4.7.x. You might not need to do that if the current MinGW installer pulls the latest version for you. I believe you need 4.7.x or greater to avoid some compilation errors. If you do need to upgrade, you can do this after starting MinGW (from http://mingw-users.1079350.n2.nabble.com/MinGW-GCC-4-6-1-released-td6795171.html):

# at the MinGW prompt
% mingw-get update
% mingw-get upgrade gcc
# this is also needed or it will not find cc1plus.exe
% mingw-get upgrade g++
# we also need Fortran for building BLAS
% mingw-get install gcc-fortran
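
To confirm that the upgrade took effect, the gcc version can be compared against the 4.7 minimum mentioned above. The following is a rough sketch, assuming gcc is on the PATH and that `gcc -dumpversion` prints a dotted version number; the function names are hypothetical.

```python
import re
import subprocess


def parse_gcc_version(text):
    """Return the leading dotted version in `text` as a tuple of ints."""
    match = re.match(r"\s*(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError("no version number found in %r" % (text,))
    return tuple(int(part) for part in match.group(1).split("."))


def gcc_is_new_enough(version_text, minimum=(4, 7)):
    """True if the reported version is at least `minimum`."""
    return parse_gcc_version(version_text) >= minimum


if __name__ == "__main__":
    try:
        raw = subprocess.check_output(["gcc", "-dumpversion"])
    except (OSError, subprocess.CalledProcessError):
        print("gcc could not be run; is it on the PATH?")
    else:
        reported = raw.decode("ascii", "replace").strip()
        print(reported, "ok" if gcc_is_new_enough(reported) else "too old")
```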

Install and Test PyCuda

Ok, this page describes how I got PyCuda working on Windows. Theano can integrate with PyCuda, but that integration may or may not be necessary. PyCuda can also be used in conjunction with numpy to get GPU computing without using Theano. There are trade-offs with that approach (less library complexity by removing Theano from the software stack; more implementation difficulty in numerics and manual GPU tasking).

While there is an external wiki page on installing PyCuda (http://wiki.tiker.net/PyCuda/Installation/Windows), it turns out that the directions for installing PyOpenCL (http://wiki.tiker.net/PyOpenCL/Installation/Windows) are closer to the mark (mainly because they are more recently updated, I think). The important pieces are listed here.

These instructions build a 32-bit PyCuda library. There is a small change (in siteconf.py, below, change Win32 to x64) to build for 64-bit, but while this seems to make progress as far as PyCuda goes, it fails for Theano (as of JK's comments on 2012-07-31).
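
Since the 32-bit versus 64-bit choice has to match the Python interpreter doing the build, it can help to check the interpreter's bitness first. A small sketch, with hypothetical helper names, that maps the bitness to the two lib subdirectory names mentioned above:

```python
import struct


def interpreter_bits():
    """Pointer size of the running Python interpreter, in bits."""
    return struct.calcsize("P") * 8


def cuda_lib_subdir(bits):
    """Map a bitness to the lib subdirectory name used in siteconf.py."""
    return {32: "Win32", 64: "x64"}[bits]


if __name__ == "__main__":
    bits = interpreter_bits()
    print("Python is %d-bit; matching lib subdirectory: %s"
          % (bits, cuda_lib_subdir(bits)))
```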

Anyway, here's the overview of steps and then details.

1. Download and install Python, NumPy, NVIDIA CUDA support, Visual Studio (2010), and git.
2. Fix Python's distutils.
3. Download the PyCuda source.
4. Configure, build and install the PyCuda source.
5. Test the PyCuda installation.

And on to the details:

1. Most of this should be "old hat". The packages needed for PyCuda are the ones named in the overview above: Python, NumPy, NVIDIA CUDA support, Visual Studio (2010), and git.

2. Now, open /cygdrive/c/Python27/lib/distutils/msvc9compiler.py (/cygdrive/c is the same as c:\; if you installed Python somewhere else, locate it and find the corresponding file). After line 641 (which reads ld_args.append ('/IMPLIB:' + implib_file)), add a new line which reads ld_args.append('/MANIFEST'). The new line should be at the same indentation level as line 641.

Note that the following diff runs from the new (patched) version to the old version, so the line you want to add is the one marked with -.

$ diff -u msvc9compiler.py msvc9compiler.py~
--- msvc9compiler.py    2012-07-20 12:09:22.322208600 -0400
+++ msvc9compiler.py~   2011-03-08 08:46:42.000000000 -0500
@@ -639,7 +639,6 @@
                     build_temp,
                     self.library_filename(dll_name))
                 ld_args.append ('/IMPLIB:' + implib_file)
-                ld_args.append('/MANIFEST')

             # Embedded manifests are recommended - see MSDN article titled
             # "How to: Embed a Manifest Inside a C/C++ Application"
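
The same edit can be made by a script instead of by hand. This is a minimal sketch under the assumption that the file contains the /IMPLIB line exactly as quoted above; the function name add_manifest_flag is hypothetical, and it works on the file's text so it can be tried without touching the real file.

```python
ANCHOR = "ld_args.append ('/IMPLIB:'"      # start of the line quoted in step 2
NEW_LINE = "ld_args.append('/MANIFEST')"   # the line to add after it


def add_manifest_flag(source):
    """Return `source` with NEW_LINE inserted after the /IMPLIB line.

    The text is returned unchanged if it already contains NEW_LINE.
    """
    if NEW_LINE in source:
        return source
    out = []
    for line in source.splitlines(True):
        if line.lstrip().startswith(ANCHOR):
            body = line.rstrip("\r\n")
            eol = line[len(body):] or "\n"   # reuse the file's line ending
            indent = body[:len(body) - len(body.lstrip())]
            out.append(body + eol)
            out.append(indent + NEW_LINE + eol)
        else:
            out.append(line)
    return "".join(out)
```

To apply it, read msvc9compiler.py, pass its contents through the function, and write the result back after keeping a backup copy of the original.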

3. Download the file below, open a MinGW or Cygwin prompt, and extract it with tar xzf:

http://pypi.python.org/packages/source/p/pycuda/pycuda-2012.1.tar.gz#md5=b67c4fce6c258834339073f2537fa84f
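
The md5 value in the URL fragment can be used to check the download before extracting it. A minimal sketch, assuming the tarball was saved under its original file name in the current directory; the function names are hypothetical.

```python
import hashlib
import tarfile

# md5 taken from the fragment of the download URL above
EXPECTED_MD5 = "b67c4fce6c258834339073f2537fa84f"


def md5_of_file(path, chunk_size=1 << 20):
    """Hex md5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_and_extract(archive_path, expected_md5, dest_dir="."):
    """Extract a .tar.gz archive only if its md5 matches."""
    actual = md5_of_file(archive_path)
    if actual != expected_md5.lower():
        raise ValueError("md5 mismatch: expected %s, got %s"
                         % (expected_md5, actual))
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(dest_dir)
```

Calling verify_and_extract("pycuda-2012.1.tar.gz", EXPECTED_MD5) would then stand in for the tar xzf step.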

4. Open a Visual Studio 2010 command prompt (Start -> All Programs -> Microsoft Visual Studio 2010 -> Visual Studio Tools -> Visual Studio Command Prompt (2010))

Cruise into the PyCuda directory (i.e., where you extracted the tarball).
Execute 'python configure.py'.
Edit the newly created ./siteconf.py to read:

BOOST_INC_DIR = []
BOOST_LIB_DIR = []
BOOST_COMPILER = 'gcc43'
USE_SHIPPED_BOOST = True
BOOST_PYTHON_LIBNAME = ['boost_python']
BOOST_THREAD_LIBNAME = ['boost_thread']
CUDA_TRACE = False
CUDA_ROOT = 'C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v4.2'
CUDA_ENABLE_GL = False
CUDA_ENABLE_CURAND = True
CUDADRV_LIB_DIR = ['${CUDA_ROOT}/lib/Win32']
CUDADRV_LIBNAME = ['cuda']
CUDART_LIB_DIR = ['${CUDA_ROOT}/lib/Win32']
CUDART_LIBNAME = ['cudart']
CURAND_LIB_DIR = ['${CUDA_ROOT}/lib/Win32']
CURAND_LIBNAME = ['curand']
CXXFLAGS = ['/EHsc']
LDFLAGS = ['/FORCE']
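
Before building, it may be worth checking that the library directories named in siteconf.py exist on disk. The sketch below assumes the ${CUDA_ROOT} placeholders are expanded by plain textual substitution, which is an assumption about PyCuda's build scripts rather than something stated here; the helper names are hypothetical.

```python
import os
from string import Template


def expand_lib_dirs(cuda_root, lib_dirs):
    """Substitute ${CUDA_ROOT} in each siteconf-style directory entry."""
    return [Template(entry).substitute(CUDA_ROOT=cuda_root)
            for entry in lib_dirs]


def missing_dirs(paths):
    """Return the subset of `paths` that are not existing directories."""
    return [path for path in paths if not os.path.isdir(path)]


if __name__ == "__main__":
    root = "C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v4.2"
    dirs = expand_lib_dirs(root, ["${CUDA_ROOT}/lib/Win32"])
    for path in missing_dirs(dirs):
        print("missing:", path)
```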

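Execute the following commands at the VS2010 command prompt. The first line is there because Python 2.7's distutils looks for the Visual Studio 2008 tools through VS90COMNTOOLS, so that variable is pointed at the VS2010 tools: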

set VS90COMNTOOLS=%VS100COMNTOOLS%
python setup.py build
python setup.py install

5. Execute the following sample code to verify everything works:

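# from: http://documen.tician.de/pycuda/tutorial.html
import pycuda.gpuarray as gpuarray
import pycuda.driver as cuda
import pycuda.autoinit  # initializes the CUDA context as a side effect of the import
import numpy
# copy a random 4x4 float32 matrix to the GPU
a_gpu = gpuarray.to_gpu(numpy.random.randn(4,4).astype(numpy.float32))
# double it on the GPU, then copy the result back to the host
a_doubled = (2*a_gpu).get()
print(a_doubled)
print(a_gpu)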

Install and Test Theano

(Note: I'm currently skipping the OpenBLAS step.)

Open up a git-bash shell and execute:

# from git-bash shell
% git clone https://github.com/Theano/Theano.git

(I believe the following is fixed in HEAD as of 2012-08-01.) Now we have to make a slight code edit to theano/sandbox/cuda/nvcc_compiler.py:

diff --git a/theano/sandbox/cuda/nvcc_compiler.py b/theano/sandbox/cuda/nvcc_com
index 4a59da8..f4ed4e8 100644
--- a/theano/sandbox/cuda/nvcc_compiler.py
+++ b/theano/sandbox/cuda/nvcc_compiler.py
@@ -201,7 +201,7 @@ class NVCC_compiler(object):
         preargs2 = [pa for pa in preargs
                     if pa not in preargs1]  # other arguments

-        cmd = [nvcc_path, '-shared', '-g'] + preargs1
+        cmd = [nvcc_path, '-shared', '-g', '-m32'] + preargs1
         if config.nvcc.compiler_bindir:
             cmd.extend(['--compiler-bindir', config.nvcc.compiler_bindir])
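
To make the effect of the patch concrete, here is a standalone sketch of how the nvcc command list comes out with and without -m32. It only mirrors the lines shown in the diff and is not Theano's actual code; the function name build_nvcc_cmd and its force_32bit parameter are hypothetical.

```python
def build_nvcc_cmd(nvcc_path, preargs1, compiler_bindir=None,
                   force_32bit=True):
    """Assemble an nvcc command list in the shape shown in the diff."""
    cmd = [nvcc_path, "-shared", "-g"]
    if force_32bit:
        cmd.append("-m32")      # the flag added by the patch
    cmd += list(preargs1)
    if compiler_bindir:
        cmd.extend(["--compiler-bindir", compiler_bindir])
    return cmd


if __name__ == "__main__":
    print(build_nvcc_cmd("nvcc", ["-O3"], compiler_bindir="C:\\VC\\bin"))
```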

Now, in your home directory (e.g., C:\Users\<you>\), create a file called .theanorc. Be careful using Notepad: it can cause problems with Unix-style line breaks (either get Cygwin and use emacs, use Wordpad, or perhaps use a code editor from Visual Studio, Eclipse, or dev-c++). Make sure it contains:

$ more .theanorc
[global]
device = gpu

[nvcc]
compiler_bindir=C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin
# flags=-m32 # we have this hard coded for now

[blas]
ldflags =
# ldflags = -lopenblas # placeholder for openblas support
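
One way to sidestep the editor and line-break question is to write the file from Python in binary mode, so that only Unix-style line breaks end up in it. This is a sketch rather than a required step; the function names are hypothetical and the commented-out lines of the file above are left out.

```python
THEANORC_TEXT = "\n".join([
    "[global]",
    "device = gpu",
    "",
    "[nvcc]",
    r"compiler_bindir=C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin",
    "",
    "[blas]",
    "ldflags =",
    "",
])


def write_theanorc(path, text=THEANORC_TEXT):
    """Write the config with LF line endings, whatever the platform."""
    with open(path, "wb") as handle:
        handle.write(text.replace("\r\n", "\n").encode("ascii"))


def has_crlf(path):
    """True if the file contains Windows-style (CRLF) line breaks."""
    with open(path, "rb") as handle:
        return b"\r\n" in handle.read()
```

For the real file, the path would be something like os.path.join(os.path.expanduser("~"), ".theanorc"); has_crlf can also be pointed at a file that was written with an editor.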

Now to test. Start up a MinGW command prompt and then the Python interpreter (i.e., in MinGW type 'python'). Then verify that the following gives no errors when executed.

import theano

And lastly, attempt to run a small Theano-based Python script. Verify that it works as expected.

Other Resources

Here's the theano-users thread where I tried to hash out many of these problems. The thread has a number of links to other materials I used in solving hurdles along the way:

https://groups.google.com/forum/?fromgroups#!msg/theano-users/pAu1rgUryC4/PlIhBRWDCaoJ
