
Minibatch type family #21

Closed
HuwCampbell opened this issue Jan 30, 2017 · 5 comments

@HuwCampbell (Owner)

A single matrix-matrix multiplication is significantly faster than the equivalent sequence of matrix-vector ones, so minibatching over layers which do this is definitely worthwhile.

With improvements to how LSTM layers update, I think we could get a 20x speed increase for decent-sized layers.

To do this, my thought is to have an injective type family for minibatches, and allow either runForwards or runBatchForwards (or both) to be written for each layer. The default for each would either lift a single example into a batch, or run many examples in parallel sparks.

The biggest question is how to efficiently store the tapes (what should Tapes be? This is easier if I don't change the tape type, but I still think I should).

type family MiniBatch (n :: Nat) (s :: Shape) = (b :: Shape) | b -> n s where
  MiniBatch n ('D1 x)     = 'D2 x n
  MiniBatch n ('D2 x y)   = 'D3 x y n
  MiniBatch n ('D3 x y z) = 'D4 x y z n   -- or alternatively: Vec n ('D3 x y z)

class UpdateLayer x => Layer x (i :: Shape) (o :: Shape) where
  ...
  runBatchForwards  :: x -> S (MiniBatch n i)
                    -> (Tapes n x i o, S (MiniBatch n o))
  runBatchBackwards :: x -> Tapes n x i o
                    -> S (MiniBatch n o) -> S (MiniBatch n i)
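The "running many in parallel sparks" default mentioned above could look roughly like this. This is a sketch only, assuming Tapes n x i o is represented as a plain list of per-example tapes, and using parMap from Control.Parallel.Strategies; it is not anything in grenade as it stands:

```haskell
import Control.Parallel.Strategies (parMap, rseq)

-- Hypothetical default: no batched kernel at all, just evaluate each
-- example of the batch in its own spark and collect the tapes.
defaultBatchForwards :: Layer x i o => x -> [S i] -> ([Tape x i o], [S o])
defaultBatchForwards layer = unzip . parMap rseq (runForwards layer)
```

A layer with a genuinely batched kernel (e.g. a fully connected layer doing one matrix-matrix multiply) would override this default; everything else would fall back to per-example sparks.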
@HuwCampbell (Owner, Author)

I don't think this is the best way to go. Minibatching can be done more simply with a concrete type.
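Such a concrete type is not spelled out in the thread; as a sketch of the idea (the Batch name is an assumption, and `Vector n` stands in for any length-indexed vector, e.g. from vector-sized), it might look like:

```haskell
-- Sketch only: a batch as a concrete wrapper holding n shapes, rather
-- than a MiniBatch type family that rewrites the Shape itself.
newtype Batch (n :: Nat) (s :: Shape) = Batch (Vector n (S s))

-- Batched operations then take and return Batch values directly,
-- with no injectivity constraints to satisfy.
batchForwards :: Layer x i o
              => x -> Batch n i -> (Vector n (Tape x i o), Batch n o)
```

This sidesteps the injective-type-family machinery entirely: the batch size lives in an ordinary type parameter, and the tapes are just a vector of per-example tapes.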

@silky commented Nov 26, 2018

Did you end up doing anything with batches?

It's not totally clear to me how it should work.

@silky commented Nov 27, 2018

I've done this:

type CppnNetMain batches
  = Network '[ Reshape
             , FullyConnected (batches*TotalDim) (batches*Hidden), Tanh
             , FullyConnected (batches*Hidden)   (batches*OutDim), Logit
             ]
            '[ 'D2 batches TotalDim , 'D1 (batches*TotalDim)
             , 'D1 (batches*Hidden) , 'D1 (batches*Hidden)
             , 'D1 (batches*OutDim) , 'D1 (batches*OutDim)
             ]

but I don't think it's a good idea, because it seems slower?! (for larger batches)

-- edit: to be fair, I think the slowdown is actually due to the concatenation I'm doing after the forward pass (edit again: actually, I'm not so sure ...)

@HuwCampbell (Owner, Author)

So the main benefit you'll get from minibatching is that one matrix-matrix multiplication is much faster than many matrix-vector ones (one per example).

Unfortunately, just lengthening the vectors won't help, and indeed, what you've got there isn't actually minibatching at all: the examples are now non-linearly connected to one another through the fully connected layer (whose weight matrix is now n² bigger).

For convolutional nets, where it's already matrix-matrix multiplications under the covers, minibatching will probably buy you quite a bit less benefit.
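The matrix-matrix point can be illustrated in plain Haskell (lists as matrices, stdlib only; in practice the speedup comes from BLAS executing one large matrix-matrix call instead of n matrix-vector calls). Stacking the batch examples as the columns of one matrix and multiplying once gives exactly the same results as the per-example matrix-vector products:

```haskell
import Data.List (transpose)

type Vec = [Double]
type Mat = [[Double]]  -- row-major: a list of rows

-- One matrix-vector product: one dot product per row of w.
matVec :: Mat -> Vec -> Vec
matVec w x = map (sum . zipWith (*) x) w

-- Matrix-matrix product: w (h x d) times a (d x n) matrix
-- whose n columns are the batch examples.
matMat :: Mat -> Mat -> Mat
matMat w a = [ [ sum (zipWith (*) row col) | col <- transpose a ] | row <- w ]

-- Lay the examples out as columns, multiply once, split back out.
batchForward :: Mat -> [Vec] -> [Vec]
batchForward w xs = transpose (matMat w (transpose xs))
```

For example, with a 3x2 weight matrix `w` and three 2-dimensional inputs `xs`, `batchForward w xs == map (matVec w) xs` holds: the single batched multiply produces column-for-column the same outputs. Contrast this with the approach above, where the examples share one (n times larger) weight matrix and are no longer independent.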

@silky commented Nov 27, 2018

So what should I do if I want to get some kind of batching behaviour here? Can I use fully-connected layers on 2-D things? No, right?
