
Converting to the new GPU back-end (gpuarray)


MILA will stop developing Theano.

This page describes how to use the new GPU back-end instead of the old one.

Installation:

  • We strongly recommend using conda/Anaconda to install Theano and pygpu, especially on Windows.
  • Both are available through conda: conda install theano pygpu
  • RECOMMENDED: you can install the latest beta, release candidate, or release like this: conda install -c mila-udem -c mila-udem/label/pre theano pygpu
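
After installing, a quick way to confirm that both packages import and that Theano picked up the new back-end is a short Python session. This is a minimal sketch; the device name cuda0 is an assumption, so adjust it to your setup:

```python
# Sanity check after installation (run with THEANO_FLAGS=device=cuda0).
# 'cuda0' is an assumed device name; use the one that matches your machine.
import pygpu   # the array library behind the new back-end
import theano

print(theano.__version__)
print(theano.config.device)  # new back-end devices are named 'cuda*', old ones 'gpu*'
```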

Windows cleanup:

  • Remove any previous install of gcc/mingw.
  • Remove Visual C++ for Python (or any MSVC that you installed for Theano).
  • Remove previous installs of Theano and Python.

Note that we only support a clean install with conda/anaconda on Windows. You are welcome to try another configuration, but we won't help you make it work.

Code changes:

  • If you use conv3d2d or dnn_conv3d, replace it with the new 3d abstract conv: theano.tensor.nnet.conv3d() (see the combined sketch after this list).
  • If you use dnn_conv2d, replace it with the 2d abstract conv: theano.tensor.nnet.conv2d().
  • If you use dnn_pool, replace it with theano.tensor.signal.pool.pool_2d or the new pool_3d interface (dnn_pool was only really needed for 3d pooling).
  • If you use dnn_batch_normalization_train() or dnn_batch_normalization_test(), use theano.tensor.nnet.bn.batch_normalization_{train,test} instead.
  • grep for "import theano.sandbox.cuda" in your files.
    • If you find such imports, they will need to be converted. In many cases, you can stop using a GPU-specific interface and use the CPU interface instead; this makes your code work with both the CPU and GPU back-ends.
    • All convolutions are now available on the CPU.
    • All pooling operations are now available on the CPU.
    • For anything else, check whether a CPU interface exists; otherwise, you can probably change theano.sandbox.cuda to theano.gpuarray.
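
The sketch below gathers the back-end-agnostic replacements named above in one place. It is illustrative only: the tensor shapes, variable names, and the axes='spatial' choice for batch normalization are assumptions, not requirements.

```python
# A minimal sketch of the abstract (back-end-agnostic) replacements.
# All shapes and names below are illustrative assumptions.
import theano.tensor as T
from theano.tensor.nnet import conv2d, conv3d
from theano.tensor.signal.pool import pool_2d, pool_3d
from theano.tensor.nnet.bn import batch_normalization_train

images = T.tensor4('images')         # (batch, channels, rows, cols)
kern2d = T.tensor4('kern2d')         # (out_channels, in_channels, rows, cols)
conv_out = conv2d(images, kern2d)    # instead of dnn_conv2d

volumes = T.tensor5('volumes')       # (batch, channels, depth, rows, cols)
kern3d = T.tensor5('kern3d')
conv3_out = conv3d(volumes, kern3d)  # instead of conv3d2d / dnn_conv3d

pooled2d = pool_2d(conv_out, ws=(2, 2), ignore_border=True)      # instead of dnn_pool
pooled3d = pool_3d(conv3_out, ws=(2, 2, 2), ignore_border=True)

# Batch normalization: gamma/beta must broadcast over the normalized axes.
gamma = T.vector('gamma').dimshuffle('x', 0, 'x', 'x')
beta = T.vector('beta').dimshuffle('x', 0, 'x', 'x')
out, mean, invstd = batch_normalization_train(conv_out, gamma, beta, axes='spatial')
```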

Config changes:

  • The following Theano config keys/sections have no effect on the new back-end and should be removed:
    • nvcc.*
    • cuda.root
    • lib.cnmem (replaced by gpuarray.preallocate). Important: the default changed to be faster, but it causes more memory fragmentation. To keep the speed and remove the fragmentation, use the flag gpuarray.preallocate=1 (or any value greater than 0; see the doc). To get the old Theano default back, use the flag gpuarray.preallocate=-1. An example follows this list.
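
Theano flags must be in effect before the GPU is initialized, so set them in the environment (or in ~/.theanorc) before importing Theano. A minimal sketch, assuming the device cuda0:

```python
# Hypothetical sketch: set the preallocation flag before importing Theano,
# since the GPU is initialized at import time when device=cuda*.
import os
os.environ["THEANO_FLAGS"] = "device=cuda0,gpuarray.preallocate=1"

import theano  # starts with GPU memory preallocated, per the flag above
```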

Safety checks:

  • Check that your model still trains and runs at the same speed (we don't expect problems, but it is better to be safe!). A quick way to confirm the new back-end is actually used is sketched below.
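
One simple check, following the usual "is Theano using the GPU?" recipe, is to compile a trivial function and look for Gpu ops in the optimized graph (a sketch; run it with device=cuda* in your flags):

```python
# Sketch: compile a trivial function and inspect the optimized graph.
import theano
import theano.tensor as T

x = T.matrix('x')
f = theano.function([x], T.exp(x))
print(f.maker.fgraph.toposort())
# With the new back-end active, you should see GpuElemwise (plus
# GpuFromHost / HostFromGpu transfer) nodes in the printed graph.
```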

What to expect?

  • Possibly a small runtime speed-up (0-10%).
  • Possibly a runtime slowdown if you use one of the ops not yet ported (more than 98% have been ported).
  • A compilation speed-up.
  • Support for multiple dtypes, including float16, for many ops.
  • A cuDNN RNN wrapper (it is not used automatically; you need to call it yourself).
  • float16 storage (computation is done in float32 for now, so it works even on non-Pascal GPUs; a sketch follows). See https://github.com/Theano/Theano/issues/2908 for the exact status.
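
As an illustration of float16 storage, parameters can be held in float16 while the computation is carried out in float32 internally. This is a minimal sketch; the shapes and the layer are assumptions for illustration only:

```python
# Sketch of float16 storage: parameters live in float16,
# while computation is still performed in float32 internally for now.
import numpy as np
import theano
import theano.tensor as T

W = theano.shared(np.random.randn(784, 256).astype('float16'), name='W')
x = T.matrix('x', dtype='float16')
h = T.nnet.sigmoid(T.dot(x, W))  # hypothetical layer, for illustration only
f = theano.function([x], h)
```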