An autograd library with mir (CPU) and CUDA backends for dynamic neural networks in D.


  • dynamic computation graph, like Chainer or PyTorch
  • statically typed tensor Variable(T, size_t dim, alias Storage), unlike NumPy's dynamically typed arrays
  • CPU (mir) and CUDA (cuBLAS/cuDNN) backends
  • extensible: users can define their own autograd functions
  • LDC2 (CPU/CUDA) and DMD (CPU only) support
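
To make the "statically typed tensor" and "user-defined autograd function" points concrete, here is a minimal, self-contained sketch in plain D. It does not use grain's API: `Tensor`, `ReLU`, `forward`, and `backward` below are hypothetical stand-ins written for illustration only, assuming a function object that stores what it needs in `forward` and reuses it in `backward`. See source/grain/functions/ for the real definitions.

```d
/// Hypothetical stand-in for a statically typed tensor (NOT grain's Variable):
/// the element type T and the rank `dim` are compile-time parameters.
struct Tensor(T, size_t dim)
{
    T[] data;           // flat host storage
    size_t[dim] shape;  // fixed-length shape; its length is checked at compile time
}

/// Hypothetical user-defined function: y = max(x, 0) with a hand-written backward.
struct ReLU(T, size_t dim)
{
    bool[] mask; // which inputs were positive during forward

    Tensor!(T, dim) forward(Tensor!(T, dim) x)
    {
        mask = new bool[x.data.length];
        auto y = Tensor!(T, dim)(new T[x.data.length], x.shape);
        foreach (i, v; x.data)
        {
            mask[i] = v > 0;
            y.data[i] = mask[i] ? v : 0;
        }
        return y;
    }

    /// Passes the upstream gradient through only where the input was positive.
    Tensor!(T, dim) backward(Tensor!(T, dim) gy)
    {
        auto gx = Tensor!(T, dim)(new T[gy.data.length], gy.shape);
        foreach (i, g; gy.data)
            gx.data[i] = mask[i] ? g : 0;
        return gx;
    }
}

void main()
{
    size_t[2] shape = [2, 2];
    auto x = Tensor!(float, 2)([-1.0f, 2.0f, -3.0f, 4.0f], shape);

    ReLU!(float, 2) relu;
    auto y = relu.forward(x);

    // Upstream gradient of ones, same shape as y.
    auto gy = Tensor!(float, 2)([1.0f, 1.0f, 1.0f, 1.0f], shape);
    auto gx = relu.backward(gy);

    assert(y.data == [0.0f, 2.0f, 0.0f, 4.0f]);
    assert(gx.data == [0.0f, 1.0f, 0.0f, 1.0f]);

    // relu.forward(Tensor!(float, 1)(...)) would be rejected at compile time,
    // because the rank is part of the type.
}
```

Because the rank and element type are template parameters, a mismatch between a function and its input is reported by the compiler rather than at run time, which is the contrast with NumPy drawn in the list above.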

how to run MNIST

$ dub --config=example-mnist -b=cuda-release # with cuda
$ dub --config=example-mnist -b=release      # without cuda

The output looks like the following (training may take several seconds without CUDA):

Running ./grain-example-mnist
loading data/train-images-idx3-ubyte.gz
loading data/train-labels-idx1-ubyte.gz
loading data/t10k-images-idx3-ubyte.gz
loading data/t10k-labels-idx1-ubyte.gz
train loss: 0.538635, acc: 0.864311
test loss: 0.299959, acc: 0.915264
train loss: 0.277901, acc: 0.920858
test loss: 0.241783, acc: 0.930589
train loss: 0.229879, acc: 0.934999
test loss: 0.206087, acc: 0.939704
train loss: 0.198716, acc: 0.943937
test loss: 0.181938, acc: 0.945613
train loss: 0.175066, acc: 0.950957
test loss: 0.163919, acc: 0.951022

how to test

$ curl -fsS | bash -s ldc-1.9.0
$ source ~/dlang/ldc-1.9.0/activate
$ dub test -b=cuda-unittest # with cuda
$ dub test                  # without cuda

I have tested with



TODO items, sorted by my priority (highest first):

  • practical examples (MNIST, CIFAR10, WordLM). see example/
    • dub --config=example-mnist
    • (wip) dub --config=example-char-rnn
  • more autograd functions. see source/grain/functions/ TODO
  • multi-GPU support
  • cuRAND wrappers
  • statically linked kernel module instead of a PTX string
  • dmd support
  • double backward (implement Function.backward with Chain)