CuArray compatibility and speed on Knet tutorial/examples
Knet/tutorial
- 15.quickstart
- 23.learning
- 30.lin
- 40.mlp
- 50.cnn
- 60.rnn
- 70.imdb
- 80.charlm
- 90.s2s
Knet/examples
- cifar10-cnn: CUDA 25% slower (7 vs 9 secs/epoch on dy03)
- dcgan-mnist: CUDA 50% slower (24 i/s vs 17 i/s on dy03)
- DeepLearningFrameworks/Knet_CNN: CUDA 5% slower (15 vs 15.8 secs/epoch on dy03)
- DeepLearningFrameworks/Knet_RNN
- DeepLearningFrameworks/ResNet50-Knet: needs pool mode=2, CUDA 25% slower (7.7 vs 9.7 secs on dy03)
- dynet-benchmark/treenn
- dynet-benchmark/rnnlm-batch: CUDA 10% slower (35 vs 38 i/s on dy03)
- dynet-benchmark/bilstm-tagger
- dynet-benchmark/bilstm-tagger-withchar
- fashion-mnist
- housing-linreg
- julia-tutorial: not a real example
- lenet: update! interface changed
- mnist-mlp
- optimizers
- reinforcement-learning/dp: does not use KnetArray
- reinforcement-learning/dqn
- reinforcement-learning/pg
- resnet: mode=2 is not supported for CPU pool. Knet 50% faster; missing pooling options, inconsistencies with CuArrays (FluxML/NNlib.jl#218)
- rnnlm: (26 vs 27 secs/epoch on dy03)
- rnn-tutorial: this is the same as 90.s2s
- synthetic-linreg
- variational-autoencoder: binary_cross_entropy gives error ("Trouble with the @. macro", JuliaGPU/CUDA.jl#346)
- vgg: CuArray works but 50% slower.
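Both resnet entries above hit the same pooling gap: `mode=2` (average pooling that excludes padding from the divisor) is only available on GPU. A minimal plain-Julia sketch (not the Knet API) of why the two average modes disagree on a padded window:

```julia
# Why average-pool mode=1 and mode=2 differ (plain-Julia sketch, not Knet's
# pool). Take a 2x2 window at a corner with one row/column of zero padding,
# so only a single input value x11 falls inside the window.
x11 = 4.0f0
window = Float32[0 0; 0 x11]   # the zeros are padding, not data
valid = 1                      # number of in-bounds entries in the window

mode1 = sum(window) / length(window)  # include padding in the divisor -> 1.0
mode2 = sum(window) / valid           # exclude padding -> 4.0
```

With no padding the two modes agree; the CPU fallback therefore only matters for windows that overlap the border.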
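The variational-autoencoder failure is a broadcast-fusion problem: the `@.`-fused form of binary cross entropy triggers JuliaGPU/CUDA.jl#346 on CuArrays. A hedged workaround sketch (plain arrays; `bce` is an illustrative name, not a Knet export) that writes the dotted calls explicitly instead of fusing with `@.`:

```julia
# Binary cross entropy with explicit dot-broadcasts instead of the fused
# `@.` macro form that hit the CuArray error (sketch; `bce` is an
# illustrative name, not part of Knet).
bce(ŷ, y) = -sum(y .* log.(ŷ) .+ (1 .- y) .* log.(1 .- ŷ)) / length(y)

bce([0.9, 0.1], [1.0, 0.0])   # ≈ 0.105, i.e. -log(0.9)
```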
Other:
- 2014-Sutskever: sequence to sequence rnn
- 2015-Luong: s2s rnn with attention
- 2017-Vaswani: s2s transformer. Knet is 50% faster.
- test/karray.jl: test for CuArray