
Excessive memory consumption and preparation runtime of tf.keras.backend.max in custom layer with masking #37479

Closed
padoremu opened this issue Mar 10, 2020 · 9 comments
Assignees
Labels
comp:keras Keras related issues · TF 2.1 for tracking issues in 2.1 release · type:bug Bug · type:performance Performance Issue

Comments

@padoremu

System information

  • Have I written custom code: yes
  • OS Platform and Distribution: Linux Ubuntu 18.04
  • Mobile device if the issue happens on mobile device: -
  • TensorFlow installed from: binary
  • TensorFlow version: 2.2.0-dev20200303
  • Python version: 3.6.9
  • Bazel version: -
  • GCC/Compiler version: -
  • CUDA/cuDNN version: CPU only
  • GPU model and memory: CPU only

Describe the current behavior
Memory consumption seems to be proportional to num_iterations and is therefore excessive, which most likely indicates a memory leak. The time until the first fit result appears is also extremely long: 15 seconds until the first fit call, 55 seconds until the result of the first fit is shown, while the remaining fits run through in less than a second. Apparently, the runtime is spent on memory management rather than on the actual max function evaluation.

When tf.keras.backend.max is used together with tf.stack to compute a mask in a real setup, memory consumption increases steadily until the process runs out of memory at approximately 30 GB. In contrast, without compute_mask, memory consumption does not exceed approximately 1 GB.
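For reference, the growth can also be observed programmatically. Below is a minimal sketch, assuming psutil is available (any RSS readout works equally well); the numbers above were simply read off a system monitor such as top.

import os
import psutil  # assumption: psutil is installed; not part of the original report

def rss_mb():
  # Resident set size of the current Python process in megabytes.
  return psutil.Process(os.getpid()).memory_info().rss / 1e6

print('RSS before building the model: %.0f MB' % rss_mb())
# ... build and fit the model from the reproduction code below ...
print('RSS after fit: %.0f MB' % rss_mb())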

Describe the expected behavior
I would expect memory consumption to be independent of num_iterations and therefore much lower, and I would expect the preparation runtime to be much shorter as well.

Code to reproduce the issue

import tensorflow as tf
import numpy as np

batch_size = 100
dim_input = 100
dim_output = 1
num_iterations = 100  # will consume approx. 5 GB RAM when set to 1000


class CustomMask(tf.keras.layers.Layer):
  def __init__(self):
    super(CustomMask, self).__init__()

  def compute_mask(self, inputs, mask=None):
    batch_size = inputs.shape[0]

    # One maximum per batch row, shape (batch_size,).
    batch_maxes = tf.keras.backend.max(inputs, axis=1)

    # batch_size * num_iterations calls to max; the results are discarded,
    # the calls alone trigger the memory growth described above.
    for batch in range(batch_size):
      for i in range(num_iterations):
        _ = tf.keras.backend.max(batch_maxes[batch])

    return None

  def call(self, inputs, mask=None):
    return inputs


model = tf.keras.Sequential()
model.add(tf.keras.layers.Input(batch_input_shape=(batch_size, dim_input)))
model.add(CustomMask())
model.add(tf.keras.layers.Dense(dim_output))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

training_input = np.zeros([batch_size, dim_input])
training_output = np.zeros([batch_size, dim_output])

model.fit(training_input, training_output, batch_size=batch_size)
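
For contrast, here is a minimal, purely illustrative sketch of the same reduction expressed without Python loops. This is not proposed as the fix; it only isolates the looped op creation from the vectorized case.

import tensorflow as tf

# Illustrative only: one vectorized reduction instead of
# batch_size * num_iterations individual max calls.
class VectorizedMask(tf.keras.layers.Layer):
  def compute_mask(self, inputs, mask=None):
    batch_maxes = tf.keras.backend.max(inputs, axis=1)  # shape: (batch_size,)
    overall_max = tf.keras.backend.max(batch_maxes)     # a single further op
    return None

  def call(self, inputs, mask=None):
    return inputs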

Other info / logs
If my usage of tf.keras.backend.max is wrong with regard to memory consumption and/or runtime, please let me know. I need to call it frequently within compute_mask to compute a custom mask in conjunction with tf.stack (a rough sketch of that usage follows below). However, the latter does not seem to be the problem, which is why I left it out of the stripped-down code.
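
To make the real use case a bit more concrete, here is a hypothetical sketch of the kind of mask computation meant above; the threshold, the helper name and the shapes are made up for illustration and are not the actual code.

import tensorflow as tf

# Hypothetical illustration only: a boolean mask built from per-row maxima
# and stacked back into a single tensor, roughly as described above.
def build_mask(inputs, threshold=0.5):
  rows = tf.unstack(inputs, axis=0)  # requires a static batch dimension
  row_masks = [tf.keras.backend.max(row) > threshold for row in rows]
  return tf.stack(row_masks)         # shape: (batch_size,), dtype bool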

@padoremu padoremu added the type:bug Bug label Mar 10, 2020
@gadagashwini-zz gadagashwini-zz added the comp:keras Keras related issues, TF 2.1 for tracking issues in 2.1 release and type:performance Performance Issue labels, and removed the type:bug Bug label Mar 11, 2020
@gadagashwini-zz
Contributor

@padoremu, when I tried to execute the code with num_iterations=1000, the session crashed, which shows it took more than 12 GB of RAM. Please find the gist here and confirm the issue. Thanks!

@gadagashwini-zz gadagashwini-zz added the stat:awaiting response Status - Awaiting response from author label Mar 11, 2020
@padoremu
Author

Thanks. I can confirm the issue.

@gadagashwini-zz gadagashwini-zz removed the stat:awaiting response Status - Awaiting response from author label Mar 12, 2020
@gowthamkpr gowthamkpr assigned fchollet and unassigned gowthamkpr Mar 12, 2020
@gowthamkpr gowthamkpr added the type:bug Bug label Mar 12, 2020
@padoremu
Author

I would like to kindly ask whether there is any news on this issue.

@padoremu
Author

padoremu commented Apr 9, 2020

This issue has been inactive for one month now. I would appreciate some feedback very much. Thank you.

@padoremu
Author

padoremu commented Apr 20, 2020

@gadagashwini @gowthamkpr @fchollet Is there any chance that a TensorFlower could comment on this issue? That would be very kind. Thank you.

@Saduf2019
Contributor

@padoremu
Can you please try with the latest TF version and check whether the issue still persists.

If it does, please post this issue on the keras-team/keras repo.
To learn more, refer to:
https://discuss.tensorflow.org/t/keras-project-moved-to-new-repository-in-https-github-com-keras-team-keras/1999

@Saduf2019 Saduf2019 added the stat:awaiting response Status - Awaiting response from author label Sep 19, 2021
@padoremu
Author

@Saduf2019
Thank you for following up after so long. I just tried with a fresh tf-nightly installation (2.7.0-dev20210920) and the initially posted code, and I can still reproduce the problem perfectly: the larger num_iterations is set, the more memory consumption grows. I recommend setting num_iterations = 1000 and observing the evolution with e.g. top on Linux. Memory consumption increases steadily; after 15 minutes it is already above 20 GB on my machine.

Since this issue was created one and a half years ago, as you can imagine I have had to find ways to avoid needing this kind of functionality. Please feel free to move this issue to the keras-team/keras repo. My motivation to invest more time in communicating and documenting this problem is limited, but of course I would still be interested in a solution or fix. Anybody can easily reproduce the problem with the initially posted code; that is all one needs.

Thank you.

@Saduf2019 Saduf2019 removed the stat:awaiting response Status - Awaiting response from author label Sep 21, 2021
@tensorflowbutler
Member

Hi There,

This is a stale issue. As you are using an older version of tensorflow, we are checking to see if you still need help on this issue. Please test the issue with the latest TensorFlow (TF2.7 and tf-nightly). If the issue still persists with the newer versions of TF, please feel free to open it in keras-team/keras repository by providing details about the issue and a standalone code to reproduce the issue. Thanks!

Please note that Keras development has moved to a separate Keras-team/keras repository to focus entirely on only Keras. Thanks!
