
Memory leak when calling policy function? #19

Closed
AntoineRichard opened this issue May 14, 2020 · 3 comments


AntoineRichard commented May 14, 2020

Hi,

I run into a memory leak when I call this function repeatedly:

action, state = agent.policy(obs, state, training)

Is there something that needs to be cleared? I tried calling the function after the snippet that resets the state (I assume that's what it does, but I'm not sure). This one:

if state is not None and reset.any():
  mask = tf.cast(1 - reset, self._float)[:, None]
  state = tf.nest.map_structure(lambda x: x * mask, state)
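(For readers unfamiliar with the snippet above: the mask zeroes the recurrent state for the environments whose episode was just reset and leaves the others untouched. A minimal NumPy sketch of the same idea, with made-up shapes:)

```python
import numpy as np

# Batch of 3 environments, recurrent state of size 2.
state = np.array([[1.0, 2.0],
                  [3.0, 4.0],
                  [5.0, 6.0]])

# Environments 0 and 2 were just reset.
reset = np.array([1, 0, 1])

# Same trick as in the snippet: broadcast a (batch, 1) mask over the state.
mask = (1 - reset).astype(state.dtype)[:, None]
state = state * mask

print(state)
# Rows 0 and 2 are zeroed, row 1 is kept.
```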

My guess is that it comes from this function:

def preprocess(obs, config):
  dtype = prec.global_policy().compute_dtype
  obs = obs.copy()
  with tf.device('cpu:0'):
    obs['image'] = tf.cast(obs['image'], dtype) / 255.0 - 0.5
    clip_rewards = dict(none=lambda x: x, tanh=tf.tanh)[config.clip_rewards]
    obs['reward'] = clip_rewards(obs['reward'])
  return obs

But I'm not familiar with TensorFlow 2.x and I couldn't fix it.
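For what it's worth, that transform is stateless, so it's unlikely to leak by itself. Here is the same preprocessing in plain NumPy (my rewrite, not the repository's code), just to make the math concrete:

```python
import numpy as np

def preprocess_np(obs, clip_rewards='none'):
    # Same math as the snippet above, minus TensorFlow:
    # scale images from [0, 255] to [-0.5, 0.5] and optionally squash rewards.
    obs = dict(obs)
    obs['image'] = obs['image'].astype(np.float32) / 255.0 - 0.5
    clip = {'none': lambda x: x, 'tanh': np.tanh}[clip_rewards]
    obs['reward'] = clip(obs['reward'])
    return obs

out = preprocess_np({'image': np.full((2, 2), 255, np.uint8),
                     'reward': np.array([2.0])}, clip_rewards='tanh')
print(out['image'].max())  # 0.5
```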

A quick way to reproduce the issue is to modify the Dreamer code as shown below and run htop to monitor the RAM.

def main(config):
  if config.gpu_growth:
    for gpu in tf.config.experimental.list_physical_devices('GPU'):
      tf.config.experimental.set_memory_growth(gpu, True)
  assert config.precision in (16, 32), config.precision
  if config.precision == 16:
    prec.set_policy(prec.Policy('mixed_float16'))
  config.steps = int(config.steps)
  config.logdir.mkdir(parents=True, exist_ok=True)
  print('Logdir', config.logdir)

  # Create environments.
  datadir = config.logdir / 'episodes'
  writer = tf.summary.create_file_writer(
      str(config.logdir), max_queue=1000, flush_millis=20000)
  writer.set_as_default()
  train_envs = [wrappers.Async(lambda: make_env(
      config, writer, 'train', datadir, store=True), config.parallel)
      for _ in range(config.envs)]
  test_envs = [wrappers.Async(lambda: make_env(
      config, writer, 'test', datadir, store=False), config.parallel)
      for _ in range(config.envs)]
  actspace = train_envs[0].action_space

  # Prefill dataset with random episodes.
  step = count_steps(datadir, config)
  prefill = max(0, config.prefill - step)
  print(f'Prefill dataset with {prefill} steps.')
  random_agent = lambda o, d, _: ([actspace.sample() for _ in d], None)
  tools.simulate(random_agent, train_envs, prefill / config.action_repeat)
  writer.flush()

  # Train and regularly evaluate the agent.
  step = count_steps(datadir, config)
  print(f'Simulating agent for {config.steps-step} steps.')
  agent = Dreamer(config, datadir, actspace, writer)
  if (config.logdir / 'variables.pkl').exists():
    print('Load checkpoint.')
    agent.load(config.logdir / 'variables.pkl')
  
  import os
  state = None
  training = True
  files = os.listdir(str(datadir))
  keys = ['image', 'reward']
  for n, filename in enumerate(files):
      print(n)
      episode = np.load(str(datadir) + '/' + filename)
      episode = {k: episode[k] for k in episode.keys()}
      state = None
      for t in range(500):
          obs = {k: [episode[k][t]] for k in keys}
          action, state = agent.policy(obs, state, training)
  for env in train_envs + test_envs:
    env.close()
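If you don't want to keep htop open while the loop runs, the same check can be scripted with the standard library alone (note that `ru_maxrss` is in kilobytes on Linux but bytes on macOS):

```python
import resource

def peak_rss_kb():
    # Peak resident set size of this process so far
    # (kilobytes on Linux, bytes on macOS).
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

before = peak_rss_kb()
# ... run the policy loop from above here ...
after = peak_rss_kb()
print('Peak RSS grew by', after - before, 'kB')
```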

Please note that this behavior also happens if you run the call function instead of the policy one.

This happens with both Python 2.7 and Python 3.8 on an Ubuntu 18.04 system using TensorFlow 2.1.0.

Thanks in advance,

Regards,

Antoine

@AntoineRichard AntoineRichard changed the title emory leak when calling policy function Memory leak when calling policy function May 14, 2020
AntoineRichard (Author) commented

Never mind this issue; my bad, it's TensorFlow 2.1 nonsense.

@danijar danijar changed the title Memory leak when calling policy function Memory leak when calling policy function? May 15, 2020

Cospui commented May 14, 2021

> Nevermind this issue, that's my bad it's a tensorflow 2.1 non-sense.

Hi, I'm using TensorFlow 2.4.0 and I think there is still a memory leak somewhere. How did you solve this? Thanks @AntoineRichard

AntoineRichard (Author) commented

Initially, I had modified the code to integrate it with Python 2 and ROS, and that created a real memory leak because I was not feeding NumPy arrays.
I believe that in the original code it's just the dataset being loaded into RAM, which eventually fills up; I'm not 100% sure, though. My "workaround" was simply to get more RAM. An alternative could be to add some swap. (I'm not saying it's a good solution, but it works.)
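To illustrate why the non-NumPy inputs mattered (a rough analogy, not TensorFlow's real mechanism): tf.function keeps a cache of traced graphs keyed by input signature; arrays of a fixed shape and dtype reuse one trace, while fresh Python values can trigger a new trace, and a new cache entry, on every call. A toy sketch with a hypothetical memoizer:

```python
import numpy as np

def toy_trace_cache(fn):
    """Rough stand-in for tf.function's trace cache (not the real mechanism):
    arrays are keyed by shape+dtype, plain Python values by value."""
    cache = {}
    def signature(args):
        return tuple(
            (a.shape, str(a.dtype)) if isinstance(a, np.ndarray) else ('py', a)
            for a in args)
    def wrapper(*args):
        key = signature(args)
        if key not in cache:
            cache[key] = 'trace-%d' % len(cache)  # stand-in for a new trace
        return fn(*args)
    wrapper.cache = cache
    return wrapper

@toy_trace_cache
def policy_step(obs):
    return obs

# Same shape/dtype every call: the cache stays at one entry.
for i in range(100):
    policy_step(np.zeros((1, 64, 64, 3), np.float32))
print(len(policy_step.cache))  # 1

# Fresh Python values every call: one new cache entry per call.
for i in range(100):
    policy_step(float(i))
print(len(policy_step.cache))  # 101
```

In real TensorFlow the symptom is the same: the trace cache grows without bound, which looks exactly like a memory leak in htop.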

I hope this helps.
