Buffer donation to a jit function on GPU #1273

Closed
romanngg opened this issue Aug 30, 2019 · 7 comments
Labels: enhancement (New feature or request)

Comments

@romanngg
Contributor

Below is a CNN applied iteratively to a 2 GB input. It produces a peak memory consumption of 4 × 2 GB = 8 GB.

import jax.numpy as np
import jax.random as random
from jax import lax
from jax import jit

@jit
def f(x):
  for _ in range(10):
    x = lax.conv_general_dilated(x, np.ones((3, 3, 1, 1)), (1, 1), 'SAME', 
                                 dimension_numbers=('NHWC', 'HWIO', 'NHWC'))
  return x

x = random.normal(random.PRNGKey(1), (2**19, 2**5, 2**5, 1))
# A shape of (2**20, 2**5, 2**5, 1) OOMs!
x = f(x)
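
(For reference, each such float32 array takes 2**19 × 32 × 32 × 4 bytes = 2 GiB, which is where the 2 GB figure above comes from.)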

Without JIT, the peak memory consumption is 2 × 2 GB = 4 GB, as expected.

It would be great to achieve comparable memory usage with JIT by donating the input buffer to the jit function (not sure about the exact terminology).

Thanks a lot!

@hawkinsp
Member

Buffer donation has been checked in!

@romanngg
Contributor Author

Thanks Peter, do you know how I can leverage it to reduce the memory consumption in the example above?

So far, even if I do

f = jit(f, donate_argnums=0)

I still get a peak memory of 4 × 2 GB = 8 GB, and this warning:

jax/interpreters/xla.py:660: UserWarning: Some donated buffers were not usable: f32[524288,32,32,1]{3,2,1,0}

@jekbradbury
Contributor

I believe that means there wasn't an output with the same shape that could have reused that buffer (or there were fewer such outputs than donated inputs).
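
As a hedged illustration of that matching rule (my own sketch, not part of this thread): donation should be usable when some output matches the donated input's shape and dtype, and should trigger the same warning when no output matches. The function names below are made up for the example.

from functools import partial
import jax
import jax.numpy as jnp

# The output matches the donated input's shape and dtype, so XLA can alias
# the donated buffer to the output (on backends where donation is supported).
@partial(jax.jit, donate_argnums=0)
def scale(x):
  return 2.0 * x

# No output matches the donated input's shape, so the donated buffer is
# unusable and JAX emits the "Some donated buffers were not usable" warning.
@partial(jax.jit, donate_argnums=0)
def first_row(x):
  return x[:1]

a = jnp.ones((4, 4))
b = jnp.ones((4, 4))
scale(a)      # donation can be honored
first_row(b)  # warns: no f32[4,4] output whose buffer could alias the input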

@romanngg
Contributor Author

Interesting - how come it doesn't work in this example, then? From my understanding, there is 1 input and 1 output here, both of shape and type f32[524288,32,32,1].

@tomhennigan
Member

FYI, buffer donation is only supported on TPU at the moment. The XLA team is working to support it on CPU/GPU, but that may be why we cannot use the donation here.
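
In the meantime, a possible workaround sketch (my own, assuming jax.devices()[0].platform reports 'tpu'/'gpu'/'cpu', and that f is the un-decorated convolution loop from the first comment): only request donation on TPU, so the unused-donation warning is avoided elsewhere and behavior is unchanged where donation is not supported.

import jax
from jax import jit

# Donation is only honored on TPU right now; requesting it on CPU/GPU is a
# no-op that emits the "Some donated buffers were not usable" warning, so
# only pass donate_argnums when running on a TPU backend.
if jax.devices()[0].platform == 'tpu':
  f_jitted = jit(f, donate_argnums=0)
else:
  f_jitted = jit(f)

x = f_jitted(x)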

@romanngg romanngg changed the title Buffer donation to a jit function Buffer donation to a jit function on GPU Jun 30, 2020
@romanngg
Contributor Author

I see, thanks! Could you please reopen this issue then?

@tomhennigan tomhennigan reopened this Jun 30, 2020
@hawkinsp
Member

Fixed by #3800
