
Add faster version of autograd MLPG and linalg utilities #21

Merged
r9y9 merged 4 commits into master from faster-mlpg on Aug 21, 2017

Conversation

r9y9 (Owner) commented Aug 20, 2017

New function `UnitVarianceMLPG` can run on GPU/CPU. Fixes #4.

The package now requires Cython.
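
For context, here is a minimal usage sketch of the new function, assuming the API described in the nnmnkwii docs of this era (`nnmnkwii.functions` imported as `F`, `nnmnkwii.autograd` as `AF`, and `F.unit_variance_mlpg_matrix` to precompute the MLPG matrix); the window set, `T`, and `static_dim` values are illustrative only:

```python
import numpy as np
import torch
from torch.autograd import Variable

from nnmnkwii import functions as F   # assumed module paths at the time of this PR
from nnmnkwii import autograd as AF

# Standard static/delta/delta-delta window set (illustrative values).
windows = [
    (0, 0, np.array([1.0])),
    (1, 1, np.array([-0.5, 0.0, 0.5])),
    (1, 1, np.array([1.0, -2.0, 1.0])),
]
T, static_dim = 100, 24

# Precompute the MLPG matrix once; valid for unit-variance (i.e. normalized) features.
R = torch.from_numpy(F.unit_variance_mlpg_matrix(windows, T))  # (T, num_windows * T)

# Network output: means of static + dynamic features over time.
means = Variable(torch.rand(T, static_dim * len(windows)), requires_grad=True)

# Differentiable parameter generation; move R and means to .cuda() to run on GPU.
y = AF.unit_variance_mlpg(R, means)   # (T, static_dim)
y.sum().backward()                    # gradients flow back to `means`
```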

codecov-io commented Aug 20, 2017

Codecov Report

Merging #21 into master will increase coverage by 1.68%.
The diff coverage is 81.81%.


@@            Coverage Diff             @@
##           master      #21      +/-   ##
==========================================
+ Coverage   61.33%   63.02%   +1.68%     
==========================================
  Files          24       25       +1     
  Lines        1107     1190      +83     
==========================================
+ Hits          679      750      +71     
- Misses        428      440      +12
Impacted Files                       Coverage Δ
nnmnkwii/autograd/__init__.py        100% <100%> (ø) ⬆️
nnmnkwii/functions/_impl/mlpg.py     80.23% <100%> (+6.38%) ⬆️
nnmnkwii/util/linalg.py              100% <100%> (ø)
nnmnkwii/functions/__init__.py       100% <100%> (ø) ⬆️
nnmnkwii/autograd/_impl/mlpg.py      70.51% <68.75%> (+2.86%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6d5514e...d6d0e48.

Add variance expand for autograd MLPG

Fixes
r9y9 (Owner, Author) commented Aug 20, 2017

Added a benchmark script. Compared to the existing AF.mlpg, AF.unit_variance_mlpg is up to ~50x faster on CPU, and CUDA gives a further ~3x speedup on top of that. It can be slower when batch_size=1 (see the first CUDA case below). A rough sketch of the timing setup is included after the output.

python perf/autograd_mlpg_perf.py
MLPG vs UnitVarianceMLPG (static_dim, T, batch_size, use_cuda) = ((24, 500, 1, False)):
UnitVarianceMLPG, 2.752440 times faster. Elapsed times 0.082350 / 0.029919

MLPG vs UnitVarianceMLPG (static_dim, T, batch_size, use_cuda) = ((24, 500, 5, False)):
UnitVarianceMLPG, 10.013608 times faster. Elapsed times 0.422480 / 0.042191

MLPG vs UnitVarianceMLPG (static_dim, T, batch_size, use_cuda) = ((24, 500, 10, False)):
UnitVarianceMLPG, 14.762426 times faster. Elapsed times 0.802287 / 0.054347

MLPG vs UnitVarianceMLPG (static_dim, T, batch_size, use_cuda) = ((24, 1000, 1, False)):
UnitVarianceMLPG, 7.798280 times faster. Elapsed times 0.643957 / 0.082577

MLPG vs UnitVarianceMLPG (static_dim, T, batch_size, use_cuda) = ((24, 1000, 5, False)):
UnitVarianceMLPG, 27.737057 times faster. Elapsed times 2.996556 / 0.108034

MLPG vs UnitVarianceMLPG (static_dim, T, batch_size, use_cuda) = ((24, 1000, 10, False)):
UnitVarianceMLPG, 35.902094 times faster. Elapsed times 5.902599 / 0.164408

MLPG vs UnitVarianceMLPG (static_dim, T, batch_size, use_cuda) = ((59, 500, 1, False)):
UnitVarianceMLPG, 8.795823 times faster. Elapsed times 0.229524 / 0.026095

MLPG vs UnitVarianceMLPG (static_dim, T, batch_size, use_cuda) = ((59, 500, 5, False)):
UnitVarianceMLPG, 12.272720 times faster. Elapsed times 0.936106 / 0.076275

MLPG vs UnitVarianceMLPG (static_dim, T, batch_size, use_cuda) = ((59, 500, 10, False)):
UnitVarianceMLPG, 14.478066 times faster. Elapsed times 1.832451 / 0.126567

MLPG vs UnitVarianceMLPG (static_dim, T, batch_size, use_cuda) = ((59, 1000, 1, False)):
UnitVarianceMLPG, 14.802162 times faster. Elapsed times 1.446762 / 0.097740

MLPG vs UnitVarianceMLPG (static_dim, T, batch_size, use_cuda) = ((59, 1000, 5, False)):
UnitVarianceMLPG, 39.862342 times faster. Elapsed times 7.303081 / 0.183208

MLPG vs UnitVarianceMLPG (static_dim, T, batch_size, use_cuda) = ((59, 1000, 10, False)):
UnitVarianceMLPG, 46.693724 times faster. Elapsed times 14.615768 / 0.313014

MLPG vs UnitVarianceMLPG (static_dim, T, batch_size, use_cuda) = ((24, 500, 1, True)):
UnitVarianceMLPG, 0.079740 times slower. Elapsed times 0.094709 / 1.187726

MLPG vs UnitVarianceMLPG (static_dim, T, batch_size, use_cuda) = ((24, 500, 5, True)):
UnitVarianceMLPG, 16.662151 times faster. Elapsed times 0.409699 / 0.024589

MLPG vs UnitVarianceMLPG (static_dim, T, batch_size, use_cuda) = ((24, 500, 10, True)):
UnitVarianceMLPG, 32.788087 times faster. Elapsed times 0.793705 / 0.024207

MLPG vs UnitVarianceMLPG (static_dim, T, batch_size, use_cuda) = ((24, 1000, 1, True)):
UnitVarianceMLPG, 7.651012 times faster. Elapsed times 0.603635 / 0.078896

MLPG vs UnitVarianceMLPG (static_dim, T, batch_size, use_cuda) = ((24, 1000, 5, True)):
UnitVarianceMLPG, 34.876959 times faster. Elapsed times 2.994071 / 0.085847

MLPG vs UnitVarianceMLPG (static_dim, T, batch_size, use_cuda) = ((24, 1000, 10, True)):
UnitVarianceMLPG, 70.851952 times faster. Elapsed times 5.915354 / 0.083489

MLPG vs UnitVarianceMLPG (static_dim, T, batch_size, use_cuda) = ((59, 500, 1, True)):
UnitVarianceMLPG, 12.092856 times faster. Elapsed times 0.206201 / 0.017051

MLPG vs UnitVarianceMLPG (static_dim, T, batch_size, use_cuda) = ((59, 500, 5, True)):
UnitVarianceMLPG, 47.144604 times faster. Elapsed times 0.929347 / 0.019713

MLPG vs UnitVarianceMLPG (static_dim, T, batch_size, use_cuda) = ((59, 500, 10, True)):
UnitVarianceMLPG, 63.658279 times faster. Elapsed times 1.775017 / 0.027884

MLPG vs UnitVarianceMLPG (static_dim, T, batch_size, use_cuda) = ((59, 1000, 1, True)):
UnitVarianceMLPG, 18.569600 times faster. Elapsed times 1.464993 / 0.078892

MLPG vs UnitVarianceMLPG (static_dim, T, batch_size, use_cuda) = ((59, 1000, 5, True)):
UnitVarianceMLPG, 86.069254 times faster. Elapsed times 7.209471 / 0.083764

MLPG vs UnitVarianceMLPG (static_dim, T, batch_size, use_cuda) = ((59, 1000, 10, True)):
UnitVarianceMLPG, 149.658690 times faster. Elapsed times 15.029760 / 0.100427
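
For reference, here is a rough sketch of how such a comparison can be timed. This is not the actual perf/autograd_mlpg_perf.py script; the helper name benchmark_case is purely illustrative, it only covers the batch_size=1 case, and the module paths and argument orders are assumed from the nnmnkwii docs of this era:

```python
import time

import numpy as np
import torch
from torch.autograd import Variable

from nnmnkwii import functions as F   # assumed module paths, as above
from nnmnkwii import autograd as AF

windows = [
    (0, 0, np.array([1.0])),
    (1, 1, np.array([-0.5, 0.0, 0.5])),
    (1, 1, np.array([1.0, -2.0, 1.0])),
]


def benchmark_case(static_dim, T, use_cuda, n_trials=5):
    """Time forward+backward of AF.mlpg vs AF.unit_variance_mlpg (batch_size=1)."""
    D = static_dim * len(windows)
    means = torch.rand(T, D)
    variances = torch.ones(T, D)   # unit variances, as the new function assumes
    R = torch.from_numpy(F.unit_variance_mlpg_matrix(windows, T))
    if use_cuda:
        means, R = means.cuda(), R.cuda()

    # Generic MLPG: CPU-only reference path.
    since = time.time()
    for _ in range(n_trials):
        m = Variable(means.cpu(), requires_grad=True)
        AF.mlpg(m, variances, windows).sum().backward()
    elapsed_mlpg = time.time() - since

    # Unit-variance MLPG: essentially a matrix product, so it also runs on GPU.
    since = time.time()
    for _ in range(n_trials):
        m = Variable(means, requires_grad=True)
        AF.unit_variance_mlpg(R, m).sum().backward()
    elapsed_uv = time.time() - since

    print("({}, {}, {}): {:.2f}x, elapsed {:.6f} / {:.6f}".format(
        static_dim, T, use_cuda, elapsed_mlpg / elapsed_uv, elapsed_mlpg, elapsed_uv))


cuda_options = [False, True] if torch.cuda.is_available() else [False]
for use_cuda in cuda_options:
    for static_dim, T in [(24, 500), (24, 1000), (59, 500), (59, 1000)]:
        benchmark_case(static_dim, T, use_cuda)
```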

@r9y9 r9y9 merged commit 1ac385b into master Aug 21, 2017
@r9y9 r9y9 deleted the faster-mlpg branch August 21, 2017 15:08