Matmul code and speed improvements #1483

Merged · 3 commits merged into master on Oct 25, 2018

Conversation

@pmichel31415 (Collaborator) commented Oct 24, 2018

This builds on #1355 and rewrites the matmul and affine transform nodes to reduce code duplication (all of the matmul/matmul+transpose logic now lives in matrix-multiply.h only).

This should also enable cuBLAS batched matmul in the backward pass, which will hopefully make things like dot-product attention faster.
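
For readers unfamiliar with the term, here is a minimal sketch of what "batched matmul" means at the cuBLAS level: many independent products dispatched in one call instead of a loop of per-matrix GEMMs. This is illustrative only, not the PR's actual code; the function name, memory layout, and stride assumptions are mine.

```cpp
#include <cublas_v2.h>

// Sketch: C[i] = A[i] * B[i] for i = 0..batch-1, with all matrices stored
// contiguously in column-major order (each A[i] is m x k, each B[i] is k x n).
// One strided-batched GEMM call replaces a loop of cublasSgemm calls.
void batched_matmul(cublasHandle_t handle,
                    const float* A, const float* B, float* C,
                    int m, int n, int k, int batch) {
  const float alpha = 1.f, beta = 0.f;
  cublasSgemmStridedBatched(
      handle, CUBLAS_OP_N, CUBLAS_OP_N,
      m, n, k,
      &alpha,
      A, /*lda=*/m, /*strideA=*/(long long)m * k,
      B, /*ldb=*/k, /*strideB=*/(long long)k * n,
      &beta,
      C, /*ldc=*/m, /*strideC=*/(long long)m * n,
      batch);
}
```

Launching one kernel for the whole batch rather than one per example is the main reason attention-style workloads benefit; the speedups reported below are consistent with removing a per-example GEMM loop in the backward pass.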

TODO:

  • Confirm the speed gain
  • Figure out why there is an "assert false" in matrix-multiply.h (cf. "Added batched matrix multiplies on CUDA" #1355)
  • Add matmul + transpose as an operation (add the option to transpose in the matmul node) <- maybe for a future PR, but this should help further with attention (both memory- and speed-wise); see the sketch after this list
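
To make the last TODO item concrete, here is a hedged sketch of why a fused matmul + transpose helps: BLAS can apply the transpose inside GEMM via an op flag, so the transposed operand never has to be materialized. The wrapper below is illustrative only, not the proposed DyNet API.

```cpp
#include <cublas_v2.h>

// Sketch: C = A^T * B without ever materializing A^T. A is stored as a
// k x m column-major matrix; CUBLAS_OP_T tells GEMM to read it transposed,
// so no extra buffer or copy kernel is needed for the transpose.
void matmul_transpose_lhs(cublasHandle_t handle,
                          const float* A, const float* B, float* C,
                          int m, int n, int k) {
  const float alpha = 1.f, beta = 0.f;
  cublasSgemm(handle, CUBLAS_OP_T, CUBLAS_OP_N,
              m, n, k,
              &alpha,
              A, /*lda=*/k,  // leading dimension of A as stored (k x m)
              B, /*ldb=*/k,  // B is k x n
              &beta,
              C, /*ldc=*/m); // C is m x n
}
```

In dot-product attention the score matrix involves a product with one transposed operand, so exposing a transpose flag on the matmul node would save both the intermediate transpose buffer and the extra kernel launch.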

@FilippoC (Contributor) commented Oct 24, 2018

FYI, I just tried this with my implementation of "A Decomposable Attention Model for Natural Language Inference" (I removed the line `DYNET_ARG_CHECK(false, "MatrixMultiplyTranspAcc");`). My implementation uses a static graph and batches of size 64.

  • Before: ~188 seconds per epoch
  • Now: ~105 seconds per epoch

This is really nice, because a colleague's PyTorch implementation of the exact same network takes ~108 seconds per epoch.

Having a matmul + transpose operation would be awesome!

@pmichel31415 (Collaborator, Author)

Good to hear! I've also seen a speedup in my implementation of the Transformer (not as big as yours, but it still went from 23 to 17 minutes per epoch).

I think this one should be merged now, and matmul+transpose can be done in another PR.

@pmichel31415 pmichel31415 changed the title [WIP] Matmul code (and speed?) improvements Matmul code and speed improvements Oct 24, 2018
@neubig neubig merged commit 946200c into master Oct 25, 2018