
Minibatch type family #21

Closed
HuwCampbell opened this issue Jan 30, 2017 · 5 comments

@HuwCampbell (Owner)

A single matrix-matrix multiplication is significantly faster than the equivalent sequence of matrix-vector ones, so minibatching over layers which do this is definitely worthwhile.

With improvements to how LSTM layers update, I think we could get a 20x speed increase for decent-sized layers.

To do this, my thought is to have an injective type family for minibatches, and allow either runForwards or runBatchForwards (or both) to be written for each layer. The default for each would either lift a single example into a batch, or run many examples in parallel sparks.

The biggest question is how to efficiently store the tapes (what should Tapes be? This is easier if I don't change the tape type, but I still think I should).

type family MiniBatch (n :: Nat) (s :: Shape) = (b :: Shape) | b -> n s where
  MiniBatch n ('D1 x)     = 'D2 x n
  MiniBatch n ('D2 x y)   = 'D3 x y n
  MiniBatch n ('D3 x y z) = 'D4 x y z n   -- or alternatively: Vec n ('D3 x y z)

class UpdateLayer x => Layer x (i :: Shape) (o :: Shape) where
  ...
  runBatchForwards  :: x -> S (MiniBatch n i)
                    -> (Tapes n x i o, S (MiniBatch n o))
  runBatchBackwards :: x -> Tapes n x i o
                    -> S (MiniBatch n o) -> S (MiniBatch n i)
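The "running many in parallel sparks" default mentioned above could look roughly like this. This is a sketch only, assuming Tapes n x i o is represented as a plain list of per-example tapes, and using parMap from Control.Parallel.Strategies; it is not anything in grenade as it stands:

```haskell
import Control.Parallel.Strategies (parMap, rseq)

-- Hypothetical default: no batched kernel at all, just evaluate each
-- example of the batch in its own spark and collect the tapes.
defaultBatchForwards :: Layer x i o => x -> [S i] -> ([Tape x i o], [S o])
defaultBatchForwards layer = unzip . parMap rseq (runForwards layer)
```

A layer with a genuinely batched kernel (e.g. a fully connected layer doing one matrix-matrix multiply) would override this default; everything else would fall back to per-example sparks.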
@HuwCampbell (Owner, Author)

I don't think this is the best way to go. Minibatching can be done more simply with a concrete type.
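Such a concrete type is not spelled out in the thread; as a sketch of the idea (the Batch name is an assumption, and `Vector n` stands in for any length-indexed vector, e.g. from vector-sized), it might look like:

```haskell
-- Sketch only: a batch as a concrete wrapper holding n shapes, rather
-- than a MiniBatch type family that rewrites the Shape itself.
newtype Batch (n :: Nat) (s :: Shape) = Batch (Vector n (S s))

-- Batched operations then take and return Batch values directly,
-- with no injectivity constraints to satisfy.
batchForwards :: Layer x i o
              => x -> Batch n i -> (Vector n (Tape x i o), Batch n o)
```

This sidesteps the injective-type-family machinery entirely: the batch size lives in an ordinary type parameter, and the tapes are just a vector of per-example tapes.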

@silky commented Nov 26, 2018

Did you end up doing anything with batches?

It's not totally clear to me how it should work.

@silky commented Nov 27, 2018

I've done this:

type CppnNetMain batches
  = Network '[ Reshape
             , FullyConnected (batches*TotalDim) (batches*Hidden), Tanh
             , FullyConnected (batches*Hidden)   (batches*OutDim), Logit
             ]
            '[ 'D2 batches TotalDim , 'D1 (batches*TotalDim)
             , 'D1 (batches*Hidden) , 'D1 (batches*Hidden)
             , 'D1 (batches*OutDim) , 'D1 (batches*OutDim)
             ]

but I don't think it's a good idea, because it seems slower?! (for larger batches)

-- edit: to be fair, I think the slowdown is actually due to the concatenation I'm doing after the forward pass (edit again: actually, I'm not so sure ...)

@HuwCampbell (Owner, Author)

So the main benefit you'll get from minibatching is that one matrix-matrix multiplication is much faster than many matrix-vector ones (one per example).

Unfortunately, just lengthening the vectors won't help, and indeed, what you've got there isn't actually minibatching at all: the examples are now non-linearly connected to one another through the fully connected layer (whose weight matrix is now n² bigger).

For convolutional nets, where it's already matrix-matrix multiplications under the covers, minibatching will probably buy you quite a bit less benefit.
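The matrix-matrix point can be illustrated in plain Haskell (lists as matrices, stdlib only; in practice the speedup comes from BLAS executing one large matrix-matrix call instead of n matrix-vector calls). Stacking the batch examples as the columns of one matrix and multiplying once gives exactly the same results as the per-example matrix-vector products:

```haskell
import Data.List (transpose)

type Vec = [Double]
type Mat = [[Double]]  -- row-major: a list of rows

-- One matrix-vector product: one dot product per row of w.
matVec :: Mat -> Vec -> Vec
matVec w x = map (sum . zipWith (*) x) w

-- Matrix-matrix product: w (h x d) times a (d x n) matrix
-- whose n columns are the batch examples.
matMat :: Mat -> Mat -> Mat
matMat w a = [ [ sum (zipWith (*) row col) | col <- transpose a ] | row <- w ]

-- Lay the examples out as columns, multiply once, split back out.
batchForward :: Mat -> [Vec] -> [Vec]
batchForward w xs = transpose (matMat w (transpose xs))
```

For example, with a 3x2 weight matrix `w` and three 2-dimensional inputs `xs`, `batchForward w xs == map (matVec w) xs` holds: the single batched multiply produces column-for-column the same outputs. Contrast this with the approach above, where the examples share one (n times larger) weight matrix and are no longer independent.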

@silky commented Nov 27, 2018

So what should I do if I want to get some kind of batching behaviour here? Can I use fully-connected layers on 2-D things? No, right?
