Update doc with instructions for using new gpu backend
slefrancois committed May 11, 2016
1 parent 319382b commit bd54467
Showing 10 changed files with 343 additions and 573 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -37,3 +37,4 @@ Theano.suo
.ipynb_checkpoints
.pydevproject
.ropeproject
core
4 changes: 2 additions & 2 deletions doc/extending/extending_theano.txt
@@ -681,8 +681,8 @@ For instance, to verify the Rop method of the DoubleOp, you can use this:
Testing GPU Ops
^^^^^^^^^^^^^^^

Ops to be executed on the GPU should inherit from the
``theano.sandbox.cuda.GpuOp`` and not ``theano.Op``. This allows
When using the old GPU backend, Ops to be executed on the GPU should inherit
from ``theano.sandbox.cuda.GpuOp`` and not ``theano.Op``. This allows
Theano to distinguish them. Currently, we use this to test if the
NVIDIA driver works correctly with our sum reduction code on the GPU.
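
With the old backend enabled, a minimal sketch of this distinction might look
like the following (the shared variable and the expression are purely
illustrative):

.. code-block:: python

import numpy
import theano
import theano.tensor as T
from theano.sandbox.cuda import GpuOp  # old backend only

x = theano.shared(numpy.ones(10, dtype='float32'))
f = theano.function([], T.exp(x))

# Ops scheduled for the GPU are instances of GpuOp, which is how
# Theano tells them apart from CPU ops.
nodes = f.maker.fgraph.toposort()
gpu_nodes = [node for node in nodes if isinstance(node.op, GpuOp)]
print('%d of %d ops will run on the GPU' % (len(gpu_nodes), len(nodes)))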

14 changes: 8 additions & 6 deletions doc/install.txt
@@ -375,7 +375,7 @@ If ``theano-nose`` is not found by your shell, you will need to add

If you want GPU-related tests to run on a specific GPU device, and not
the default one, you should use :attr:`~config.init_gpu_device`.
For instance: ``THEANO_FLAGS=device=cpu,init_gpu_device=gpu1``.
For instance: ``THEANO_FLAGS=device=cpu,init_gpu_device=cuda1``.

See :ref:`libdoc_config` for more information on how to change these
configuration options.
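
Because ``THEANO_FLAGS`` is read when ``theano`` is first imported, it must be
set beforehand. A minimal sketch (the device names are illustrative):

.. code-block:: python

import os
# Flags must be set before theano is imported for the first time.
os.environ['THEANO_FLAGS'] = 'device=cpu,init_gpu_device=cuda1'

import theano
print(theano.config.device)           # cpu
print(theano.config.init_gpu_device)  # cuda1
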
@@ -508,25 +508,25 @@ Any one of them is enough.

:ref:`Ubuntu instructions <install_ubuntu_gpu>`.


Next, install `libgpuarray <http://deeplearning.net/software/libgpuarray/installation.html>`_.

Once that is done, the only thing left is to change the ``device`` option to name the GPU device in your
computer, and set the default floating point computations to float32.
For example: ``THEANO_FLAGS='cuda.root=/path/to/cuda/root,device=gpu,floatX=float32'``.
For example: ``THEANO_FLAGS='cuda.root=/path/to/cuda/root,device=cuda,floatX=float32'``.
You can also set these options in the .theanorc file's ``[global]`` section:

.. code-block:: cfg

[global]
device = gpu
device = cuda
floatX = float32

Note that:

* If your computer has multiple GPUs and you use 'device=gpu', the driver
* If your computer has multiple GPUs and you use 'device=cuda', the driver
selects the one to use (usually gpu0).
* You can use the program nvidia-smi to change this policy.
* You can choose one specific GPU by specifying 'device=gpuX', with X the
* You can choose one specific GPU by specifying 'device=cudaX', with X the
corresponding GPU index (0, 1, 2, ...)
* By default, when ``device`` indicates preference for GPU computations,
Theano will fall back to the CPU if there is a problem with the GPU.
@@ -794,6 +794,8 @@ setup CUDA, but be aware of the following caveats:
toggle your GPU on, which can be done with
`gfxCardStatus <http://codykrieger.com/gfxCardStatus>`__.

Next, install `libgpuarray <http://deeplearning.net/software/libgpuarray/installation.html>`_.

Once your setup is complete, head to :ref:`using_gpu` to find how to verify
everything is working properly.
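
As a quick sanity check, a small script along these lines can confirm that the
GPU is actually picked up (the vector length and iteration count are
arbitrary):

.. code-block:: python

from theano import function, config, shared, tensor
import numpy
import time

vlen = 10 * 30 * 768  # 10 x #cores x # threads per core
iters = 1000

rng = numpy.random.RandomState(22)
x = shared(numpy.asarray(rng.rand(vlen), config.floatX))
f = function([], tensor.exp(x))
t0 = time.time()
for i in range(iters):
    r = f()
t1 = time.time()
print("Looping %d times took %f seconds" % (iters, t1 - t0))
# If any Elemwise op in the compiled graph is not a GPU op, the
# computation stayed on the CPU.
if numpy.any([isinstance(node.op, tensor.Elemwise) and
              ('Gpu' not in type(node.op).__name__)
              for node in f.maker.fgraph.toposort()]):
    print('Used the cpu')
else:
    print('Used the gpu')

Run it with ``THEANO_FLAGS=device=cuda,floatX=float32`` and check that it
reports the GPU.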

22 changes: 11 additions & 11 deletions doc/install_ubuntu.txt
@@ -43,7 +43,7 @@ For Ubuntu 11.10 through 14.04:

sudo apt-get install python-numpy python-scipy python-dev python-pip python-nose g++ libopenblas-dev git
sudo pip install Theano

On 14.04, this will install Python 2 by default. If you want to use Python 3:

.. code-block:: bash
@@ -104,30 +104,30 @@ For Ubuntu 11.04:
The development version of Theano supports Python 3.3 and
probably supports Python 3.2, but we do not test on it.


Bleeding Edge Installs
----------------------

If you would like, instead, to install the bleeding edge Theano (from github)
such that you can edit and contribute to Theano, replace the `pip install Theano`
command with:

.. code-block:: bash

git clone git://github.com/Theano/Theano.git
cd Theano
python setup.py develop --user
cd ..

VirtualEnv
----------

If you would like to install Theano in a VirtualEnv, you will want to pass the
`--system-site-packages` flag when creating the VirtualEnv so that it will pick up
the system-provided `Numpy` and `SciPy`.

.. code-block:: bash

virtualenv --system-site-packages -p python2.7 theano-env
source theano-env/bin/activate
pip install Theano
@@ -208,7 +208,7 @@ Updating Bleeding Edge Installs
Change to the Theano directory and run:

.. code-block:: bash

git pull


@@ -303,7 +303,7 @@ Test GPU configuration

.. code-block:: bash

THEANO_FLAGS=floatX=float32,device=gpu python /usr/lib/python2.*/site-packages/theano/misc/check_blas.py
THEANO_FLAGS=floatX=float32,device=cuda python /usr/lib/python2.*/site-packages/theano/misc/check_blas.py
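
If you prefer a self-contained check, a rough sketch along the same lines is
shown below (the matrix size and iteration count are arbitrary):

.. code-block:: python

import time
import numpy
import theano
import theano.tensor as T

# Time a large float32 matrix product; with device=cuda and a working
# GPU setup this should be much faster than the CPU run.
A = theano.shared(numpy.random.rand(2000, 2000).astype('float32'))
B = theano.shared(numpy.random.rand(2000, 2000).astype('float32'))
f = theano.function([], T.dot(A, B))

t0 = time.time()
for i in range(10):
    f()
print("10 products of 2000x2000 matrices took %f seconds" % (time.time() - t0))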

.. note::

12 changes: 7 additions & 5 deletions doc/install_windows.txt
@@ -423,16 +423,16 @@ Create a test file containing:
print("NP time: %f[s], theano time: %f[s] (times should be close when run on CPU!)" %(
      np_end-np_start, t_end-t_start))
print("Result difference: %f" % (np.abs(AB-tAB).max(), ))

.. testoutput::
:hide:
:options: +ELLIPSIS

NP time: ...[s], theano time: ...[s] (times should be close when run on CPU!)
Result difference: ...

.. code-block:: none

NP time: 1.480863[s], theano time: 1.475381[s] (times should be close when run on CPU!)
Result difference: 0.000000
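
A minimal sketch of such a NumPy-versus-Theano timing script (the matrix size
is illustrative) could look like this:

.. code-block:: python

import time
import numpy as np
import theano
import theano.tensor as T

N = 2000
A = np.random.rand(N, N).astype(theano.config.floatX)
B = np.random.rand(N, N).astype(theano.config.floatX)

# Plain NumPy matrix product.
np_start = time.time()
AB = A.dot(B)
np_end = time.time()

# The same product compiled with Theano.
x, y = T.matrices('x', 'y')
f = theano.function([x, y], T.dot(x, y))
t_start = time.time()
tAB = f(A, B)
t_end = time.time()

print("NP time: %f[s], theano time: %f[s] (times should be close when run on CPU!)" % (
      np_end - np_start, t_end - t_start))
print("Result difference: %f" % (np.abs(AB - tAB).max(), ))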

@@ -445,6 +445,8 @@ routine for matrix multiplication)
Configure Theano for GPU use
############################

Install `libgpuarray <http://deeplearning.net/software/libgpuarray/installation.html>`_ if you have not already done so.

Theano can be configured with a ``.theanorc`` text file (or
``.theanorc.txt``, whichever is easier for you to create under
Windows). It should be placed in the directory pointed to by the
@@ -457,7 +459,7 @@ To use the GPU please write the following configuration file:
.. code-block:: cfg

[global]
device = gpu
device = cuda
floatX = float32

[nvcc]
@@ -498,7 +500,7 @@ within an MSYS shell if you installed Nose manually as described above.
Compiling a faster BLAS
~~~~~~~~~~~~~~~~~~~~~~~

If you installed Python through WinPython or EPD, Theano will automatically
link with the MKL library, so you should not need to compile your own BLAS.
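
To see which BLAS Theano will link against, you can inspect the
``blas.ldflags`` configuration value; a quick sketch:

.. code-block:: python

import theano
# An empty value means no BLAS library was found and Theano falls back
# to a slower default implementation.
print(theano.config.blas.ldflags)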

.. note::
2 changes: 1 addition & 1 deletion doc/optimizations.txt
@@ -32,6 +32,7 @@ Optimization FAST_RUN FAST_COMPILE
========================================================= ========= ============ =============
:term:`merge` x x
:term:`constant folding<constant folding>` x x
:term:`GPU transfer` x x
:term:`shape promotion<shape promotion>` x
:term:`fill cut<fill cut>` x
:term:`inc_subtensor srlz.<inc_subtensor serialization>` x
@@ -52,7 +53,6 @@ Optimization FAST_RUN FAST_COMPILE
:term:`inplace_elemwise` x
:term:`inplace_random` x
:term:`elemwise fusion` x
:term:`GPU transfer` x
:term:`local_log_softmax` x x
:term:`local_remove_all_assert`
========================================================= ========= ============ =============
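
To confirm which mode moves the computation to the GPU, one can compile a
small function under each mode and look for GPU ops in the resulting graph;
a rough sketch (assuming ``device=cuda`` and ``floatX=float32``):

.. code-block:: python

import numpy
import theano
import theano.tensor as T

x = theano.shared(numpy.ones((100, 100), dtype='float32'))
for mode in ('FAST_RUN', 'FAST_COMPILE'):
    f = theano.function([], T.exp(x), mode=mode)
    on_gpu = any('Gpu' in type(node.op).__name__
                 for node in f.maker.fgraph.toposort())
    print('%s: %s' % (mode, 'GPU ops present' if on_gpu else 'no GPU ops'))
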
47 changes: 0 additions & 47 deletions doc/tutorial/aliasing.txt
@@ -261,52 +261,6 @@ combination of ``return_internal_type=True`` and ``borrow=True`` arguments to
hints that give more flexibility to the compilation and optimization of the
graph.

For GPU graphs, this borrowing can have a major speed impact. See the following code:

.. code-block:: python

from theano import function, config, shared, sandbox, tensor, Out
import numpy
import time

vlen = 10 * 30 * 768 # 10 x # cores x # threads per core
iters = 1000

rng = numpy.random.RandomState(22)
x = shared(numpy.asarray(rng.rand(vlen), config.floatX))
f1 = function([], sandbox.cuda.basic_ops.gpu_from_host(tensor.exp(x)))
f2 = function([],
              Out(sandbox.cuda.basic_ops.gpu_from_host(tensor.exp(x)),
                  borrow=True))
t0 = time.time()
for i in range(iters):
    r = f1()
t1 = time.time()
no_borrow = t1 - t0
t0 = time.time()
for i in range(iters):
    r = f2()
t1 = time.time()
print(
    "Looping %s times took %s seconds without borrow "
    "and %s seconds with borrow" % (iters, no_borrow, (t1 - t0))
)
if numpy.any([isinstance(x.op, tensor.Elemwise) and
              ('Gpu' not in type(x.op).__name__)
              for x in f1.maker.fgraph.toposort()]):
    print('Used the cpu')
else:
    print('Used the gpu')

Which produces this output:

.. code-block:: none

$ THEANO_FLAGS=device=gpu0,floatX=float32 python test1.py
Using gpu device 0: GeForce GTX 275
Looping 1000 times took 0.368273973465 seconds without borrow and 0.0240728855133 seconds with borrow.
Used the gpu

*Take home message:*

When an input *x* to a function is not needed after the function
@@ -317,4 +271,3 @@ requirement. When a return value *y* is large (in terms of memory
footprint), and you only need to read from it once, right away when
it's returned, then consider marking it with an ``Out(y,
borrow=True)``.
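
A minimal sketch of the ``Out(y, borrow=True)`` pattern (the expression is
purely illustrative):

.. code-block:: python

import numpy
from theano import function, shared, tensor, Out

x = shared(numpy.ones(1000, dtype='float32'))
# Marking the output as borrowed lets Theano reuse internal storage for
# it, so read the result right away and do not keep it across calls.
f = function([], Out(tensor.exp(x), borrow=True))
r = f()
print(r.sum())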
