
Usage example of RNN with large inputs #12

Open

adicirstei opened this issue Sep 15, 2016 · 5 comments

Comments
@adicirstei

Hi,

I would like to use large inputs for training a phrase-generating model. I tried to adapt the example on the website, but of course I get an OutOfMemoryException.

What would be a good approach to this kind of task?

Thanks,
Adrian

@zgrkpnr

zgrkpnr commented Sep 15, 2016

@adicirstei I can offer you my solution, but it requires you to modify the source code.

In Optimize.fs, find the line whist <- [w] @ whist and comment it out or delete it.
This line accumulates the entire weight history, which gets very large if your model is large. After that, it works great.
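
For reference, this is roughly the spot; the surrounding context is reconstructed, not verbatim from Optimize.fs:

```fsharp
// Inside the optimization loop in Optimize.fs (context reconstructed):
// whist prepends a copy of the full weight vector on every iteration,
// so memory grows as O(iterations * parameters).

// whist <- [w] @ whist   // comment this out to stop collecting the history
```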

@adicirstei
Author

Thanks @zgrkpnr!

I'll give it a try. Hope I'll be able to compile it.

@smoothdeveloper
Contributor

@zgrkpnr do you remember what the size of that list was when you got the OutOfMemory?

Looking at the code quickly, it feels like this should actually be a ResizeArray (initialized with a capacity matching iters); that would also make the last step (where the list is currently reversed and converted to an array) faster. A sketch is below.

This change could save roughly 4 bytes per element.
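
Something like this; the name iters and the weight type DV (DiffSharp's vector type, as used elsewhere in Hype) are assumptions, not the verbatim code:

```fsharp
// Hypothetical ResizeArray variant of the history buffer.
// Pre-sizing with the iteration count avoids repeated reallocation,
// and appending in order removes the final List.rev pass.
let whist = ResizeArray<DV>(iters)   // ResizeArray = System.Collections.Generic.List<'T>

// per iteration, instead of whist <- [w] @ whist:
whist.Add(w)

// at the end, if an array is needed:
let whistArray = whist.ToArray()
```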

@zgrkpnr

zgrkpnr commented Sep 19, 2016

@smoothdeveloper Let's say our model has 900,000 parameters to optimize (which is quite normal for deep structures). 100 minibatches and 10 epochs for each brings us to 100 × 10 × 900,000 single-precision floating-point numbers of 4 bytes each. Roughly 3 GB. I may be wrong, of course.
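
To make the arithmetic concrete (same illustrative numbers as above):

```fsharp
// Back-of-the-envelope estimate for the accumulated weight history.
let parameters    = 900_000L   // weights in the model
let minibatches   = 100L       // minibatches per epoch
let epochs        = 10L
let bytesPerFloat = 4L         // single precision

let totalBytes = parameters * minibatches * epochs * bytesPerFloat
printfn "History size: %.2f GB" (float totalBytes / 1e9)   // prints ≈ 3.60 GB
```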

@adicirstei If the solution works, let me know.

cgravill added a commit that referenced this issue Jun 20, 2019
We might want to switch to Resize array as suggested on #12
@cgravill
Collaborator

Perhaps we should add an option to control collecting history?

Hype/src/Hype/Optimize.fs

Lines 370 to 384 in 366daa7

module Params =
    let Default = {Epochs = 100
                   LearningRate = LearningRate.DefaultRMSProp
                   Momentum = NoMomentum
                   Loss = L2Loss
                   Regularization = Regularization.DefaultL2Reg
                   GradientClipping = NoClip
                   Method = GD
                   Batch = Full
                   EarlyStopping = NoEarly
                   ImprovementThreshold = D 0.995f
                   Silent = false
                   ReturnBest = true
                   ValidationInterval = 10
                   LoggingFunction = fun _ _ _ -> ()}

It could default to on but allow people with large models to disable it without having to recompile.
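
Something like the following, as a sketch; the field name CollectHistory is hypothetical, not part of the current Params record:

```fsharp
// Hypothetical sketch: a CollectHistory flag on the Params record.
// (Field and usage names are assumptions, not the current Hype API.)
type ParamsSketch = { CollectHistory : bool (* ... existing fields ... *) }

let defaultParams = { CollectHistory = true }   // history stays on by default

// and in the optimization loop:
// if par.CollectHistory then whist <- [w] @ whist
```

That keeps history collection on by default while letting large models turn it off without recompiling.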

Does anyone have a real optimization use case they could share? We could potentially speed it up as well.
