clnn

OpenCL backend for Torch nn neural networks library.

Installation

Please see distro-cl for installation instructions.

What works

Parameterized Modules

  • nn.Linear
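
For example, here is a minimal sketch of running nn.Linear on the OpenCL device (the sizes are arbitrary). As in cunn with :cuda(), modules and tensors are moved to the GPU with :cl():

```lua
require 'nn'
require 'clnn'  -- pulls in cltorch as a dependency

-- move the module to the OpenCL device with :cl()
local linear = nn.Linear(4, 2):cl()

-- convert the input to a torch.ClTensor the same way
local input = torch.Tensor(3, 4):uniform():cl()
local output = linear:forward(input)
print(output)
```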

Basic Tensor methods

These mostly 'just work', since they are based on underlying tensor methods already implemented in cltorch. Tested with:

  • nn.Narrow
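
For instance, a minimal sketch of nn.Narrow on ClTensors (sizes arbitrary):

```lua
require 'nn'
require 'clnn'

-- nn.Narrow(dimension, offset, length): keep columns 2..4
local narrow = nn.Narrow(2, 2, 3):cl()
local input = torch.Tensor(5, 6):uniform():cl()
local output = narrow:forward(input)  -- result is 5 x 3
```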

Miscellaneous modules

  • nn.Identity
  • nn.Dropout

Convolution layers

  • nn.SpatialConvolutionMM
  • nn.SpatialMaxPooling (including ceil mode)
  • nn.SpatialAveragePooling
  • nn.TemporalConvolution2 This layer is specific to clnn, though it also runs on CPU and CUDA, not just OpenCL. It is API-compatible with nn.TemporalConvolution, and faster than it on both CUDA and OpenCL (see the sketch below).
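
A minimal, illustrative sketch of nn.TemporalConvolution2, assuming the same constructor arguments as nn.TemporalConvolution (inputFrameSize, outputFrameSize, kW[, dW]), since the two are API-compatible:

```lua
require 'nn'
require 'clnn'

-- inputFrameSize=16, outputFrameSize=32, kernel width kW=3
local conv = nn.TemporalConvolution2(16, 32, 3):cl()

-- input: nFrames x inputFrameSize
local input = torch.Tensor(10, 16):uniform():cl()
local output = conv:forward(input)  -- (10 - 3 + 1) x 32, ie 8 x 32
```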

Transfer function layers

  • nn.Tanh
  • nn.Sigmoid
  • nn.ReLU
  • nn.ELU
  • nn.Exp
  • nn.Sqrt
  • nn.Square
  • nn.Abs
  • nn.LogSigmoid
  • nn.HardTanh
  • nn.LogSoftMax
  • nn.SoftMax (including spatial mode)

Table layers

These 'just work', since they are based on underlying torch operations, which are already implemented in cltorch. Tested with:

  • nn.CMulTable
  • nn.CAddTable
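
For example, a minimal sketch of nn.CAddTable on ClTensors (sizes arbitrary); table layers take a Lua table of tensors as input:

```lua
require 'nn'
require 'clnn'

local add = nn.CAddTable():cl()
local a = torch.Tensor(3, 4):uniform():cl()
local b = torch.Tensor(3, 4):uniform():cl()
local sum = add:forward({a, b})  -- elementwise a + b
```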

Criterions

  • nn.MSECriterion
  • nn.ClassNLLCriterion
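
A minimal sketch of a criterion on the GPU (sizes and targets are arbitrary). nn.ClassNLLCriterion expects log-probabilities, such as the output of nn.LogSoftMax; here the targets are assumed to be passed as a ClTensor of class indices:

```lua
require 'nn'
require 'clnn'

local crit = nn.ClassNLLCriterion():cl()

-- log-probabilities for a batch of 5 samples over 3 classes
local logprobs = nn.LogSoftMax():cl()
  :forward(torch.Tensor(5, 3):uniform():cl())
local targets = torch.Tensor({1, 3, 2, 2, 1}):cl()

local loss = crit:forward(logprobs, targets)
local gradInput = crit:backward(logprobs, targets)
```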

Containers

Containers 'just work', since they simply call standard operations on the contained modules. Tested with:

  • nn.Sequential
  • nngraph

Trainers

In theory, trainers 'just work', since they simply call standard torch methods on the network. The following are good first choices:

  • nn.StochasticGradient
  • optim.lbfgs
  • optim.adam
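
As an illustrative sketch (the dimensions, hyperparameters, and toy dataset are made up), training a small nn.Sequential classifier with nn.StochasticGradient might look like this; the scalar targets rely on clnn's scalar-target support in ClassNLLCriterion (see Recent changes below):

```lua
require 'nn'
require 'clnn'

-- a small classifier, moved to the OpenCL device in one :cl() call
local model = nn.Sequential()
  :add(nn.Linear(10, 20))
  :add(nn.Tanh())
  :add(nn.Linear(20, 2))
  :add(nn.LogSoftMax())
  :cl()
local criterion = nn.ClassNLLCriterion():cl()

-- nn.StochasticGradient expects a dataset exposing :size() and
-- dataset[i] returning {input, target}
local dataset = {}
function dataset:size() return 100 end
for i = 1, dataset:size() do
  dataset[i] = {torch.Tensor(10):uniform():cl(), torch.random(1, 2)}
end

local trainer = nn.StochasticGradient(model, criterion)
trainer.learningRate = 0.01
trainer.maxIteration = 5
trainer:train(dataset)
```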

Timings

Soumith benchmark layers

Please see https://github.com/soumith/convnet-benchmarks#imagenet-winners-benchmarking

  • On a Titan X, OpenCL torch is about 3 times slower than CUDA torch
    • e.g. for VGG, cutorch takes 1100ms and cltorch takes 3400ms

Example networks

Porting guidelines

Porting guidelines for project maintainers are available here: porting-guidelines.md.

Recent changes

  • 2nd May:
    • Re-applied:
      • 26th March:
        • added TemporalConvolution2: same API and usage as TemporalConvolution, but faster on GPUs
  • 1st May:
    • Re-applied:
      • 10th March:
        • @pawni (Nick Pawlowski) added SpatialUpSamplingNearest. Thank you Nick
      • 20th February:
        • @gloine (Jaehyung Lee) added support for non-batched input to ClassNLLCriterion. Thank you Jaehyung
  • 30th April:
    • rolled back to the state as of 21st February, prior to extensive THNN changes in upstream Torch
    • additionally, the installation procedure now uses a specific Torch distro, for stability
  • 1st Feb:
    • merged/ported THNN phase 3. If you hit any weird build issues, please update both nn and clnn.
  • 2nd January, 2016:
    • merged/ported the THNN architecture across, along with the implementation of Abs, so the unit tests pass again now
  • 15th December:
  • 29th November:
    • added ELU
  • 25th September:
  • 23rd September:
    • ported the latest cunn implementation of SpatialMaxPooling across, i.e. approximately Sergey's deterministic max-pooling PR
      • this includes :ceil() implementation
  • 22nd September:
    • added non-batch implementation of LogSoftMax (previously only handled batched input)
    • added SoftMax, for both batched and non-batched
  • 20th September:
    • added non-batch implementation for SpatialMaxPooling (previously only handled batched input), for contiguous pools

Older changes