
Plans for Queues/Data Loaders for async loading of data to the GPU? #5986

Open

botev opened this issue May 28, 2017 · 4 comments
Comments

botev (Contributor) commented May 28, 2017

So I have had this discussion quite a few times and have made my own attempts at this, but it seems Theano was not developed to allow asynchronous access to variables outside of its own internals. As such, are there any plans for creating operators similar to TensorFlow's queues or PyTorch's data loaders to speed up data transfers between main memory and the GPU?

nouiz (Member) commented May 31, 2017

In the past, every time this was requested, I asked for a benchmark showing that the transfer was taking a significant amount of time, and I never saw one. Do you have one?

This is useful only in a specific case: the CPU->GPU transfer must be significant compared to the computation, but not so large that doing it in parallel stops being useful.
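
A quick way to check which case you are in could look like the rough sketch below (I have not run it; the stand-in model, sizes, and the use of wall-clock timing are placeholders, and GPU work is asynchronous, so treat the numbers as a ballpark only):

```python
# Compare the time of one host->GPU copy against one full call of a compiled
# Theano function.  Replace the stand-in graph with your real training step.
import time
import numpy as np
import theano
import theano.tensor as T

# stand-in model: replace with your real training function
x = T.matrix('x')
w = theano.shared(np.random.randn(4096, 4096).astype('float32'))
train_fn = theano.function([x], T.tanh(T.dot(x, w)).sum())

batch = np.random.randn(256, 4096).astype('float32')
x_shared = theano.shared(batch)              # lives on the GPU when device=cuda

t0 = time.time()
for _ in range(100):
    x_shared.set_value(batch)                # host -> device copy each iteration
copy_time = (time.time() - t0) / 100

t0 = time.time()
for _ in range(100):
    train_fn(batch)                          # copy + compute for one minibatch
step_time = (time.time() - t0) / 100

print("copy %.5fs vs. full step %.5fs per batch" % (copy_time, step_time))
```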

There is code in the new back-end that would allow that, but it requires a sync mechanism that adds overhead in other places, so it is disabled by default. If you can confirm that you are in the timing window where this would help you, I can describe in more detail how to do it. Mostly: change a Theano flag, make the Theano function take a GPU object as input, and start the async transfer outside the Theano function.

nouiz (Member) commented Jun 6, 2017

I discussed this yesterday with @lamblin and he reminded me that one flag can, in some cases, already do what you want. The flag to try is: gpuarray.single_stream=False

Explanation:
If you have a loop that calls one Theano function at each iteration, and that Theano function returns before all the computation is done, then with that flag the transfer for the next iteration can start in parallel with the computation of the previous iteration.

How does that work? In the new GPU back-end, with that flag, we use two streams: one for memory copies and one for computation. We don't enable it by default because, when you don't benefit from it, it adds overhead and slows down the computation a little bit.

When can a Theano function return while not all computation is done? This can happen when the last nodes executed in the graph don't cause a sync, mostly when they don't transfer to/from the GPU. If your Theano function has an output like the loss or the error, this is probably not a problem for the training function, as that output can be computed in the middle of the graph.

If you want to try it in your own code, don't enable the Theano profiler while doing so: it forces a sync for each node, which would defeat this feature.
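
For example, one way to set the flag is via the environment before Theano is imported (only the flag name comes from this thread; device=cuda is just the usual new back-end setup):

```python
# Set the flags before importing theano; overrides any THEANO_FLAGS already set.
import os
os.environ["THEANO_FLAGS"] = "device=cuda,floatX=float32,gpuarray.single_stream=False"
import theano  # imported after the flags on purpose
```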

If that doesn't work, you could make the Theano function take a gpundarray as input, start the transfer of two batches in Python, and pass the oldest one to the Theano function. This wasn't tested.
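
Roughly, it could look like the untested sketch below; the exact calls (GpuArrayType, get_context, pygpu.gpuarray.asarray) are my reading of the gpuarray back-end and pygpu APIs and may need adjusting for your install:

```python
# Untested sketch of keeping two batches in flight: declare the function input as a
# GPU array, start the host->GPU transfer of batch i+1 from Python, then call the
# function on batch i.  GpuArrayType / get_context / pygpu.gpuarray.asarray are
# assumptions about the back-end API, not something verified here.
import numpy as np
import pygpu
import theano
import theano.tensor as T
from theano.gpuarray.type import GpuArrayType, get_context

ctx = get_context(None)                        # default context (assumes device=cuda)

# The input is a GPU matrix, so the host->device copy is not done inside f.
x_gpu = GpuArrayType('float32', (False, False))('x_gpu')
w = theano.shared(np.random.randn(1024, 512).astype('float32'))
f = theano.function([x_gpu], T.tanh(T.dot(x_gpu, w)).sum())

def to_gpu(batch):
    # start the transfer of one minibatch outside the Theano function
    return pygpu.gpuarray.asarray(batch, context=ctx)

def minibatches(n=10):
    for _ in range(n):
        yield np.random.randn(256, 1024).astype('float32')

it = minibatches()
pending = to_gpu(next(it))                     # first batch already on its way
for batch in it:
    next_on_gpu = to_gpu(batch)                # kick off the next transfer...
    loss = f(pending)                          # ...while computing on the oldest batch
    pending = next_on_gpu
loss = f(pending)                              # last batch
```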

botev (Contributor, Author) commented Jun 6, 2017

I will give it a shot when I get the time.

I think I tried option 2 with the gpuarray, but I got some errors from pygpu (see #5929). Note that there I am trying to set shared variables, but I got similar issues when just passing pygpu arrays.

nouiz (Member) commented Jul 7, 2017

I have a PR that could make this easy: #6125

Mostly, if you try this PR with the Theano flag gpuarray.single_stream=False, transfer and computation for different minibatches can overlap. Note: take care to have very few operations between two calls to the Theano function with different minibatches.
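
For example, keep the hot loop to just the function calls and do the bookkeeping afterwards (a sketch; train_fn and training_batches are placeholders for your own function and data loader):

```python
# The hot loop does nothing but fetch the next batch and call the compiled
# function; averaging, printing and other bookkeeping wait until after the loop.
import numpy as np

losses = []
for x_batch, y_batch in training_batches:      # placeholder iterable of numpy arrays
    losses.append(train_fn(x_batch, y_batch))  # back-to-back calls; transfers can overlap
print("mean loss over epoch: %f" % np.mean(losses))
```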
