Skip to content

tensorflow won't release system memroy #3701

@xiaorongfan

Description

@xiaorongfan

Hi all,

I started to use tensorflow one month ago. I briefly use it to train neural machine translation models.
I find that after several times of training, there were a huge amount of memory not released. I have to reboot my machine after some numbers of training to release those memory.

I use the sample code( tensorflow/models/rnn/translate python translate.py) to train the model.

Environment info

Operating System: Ubuntu 14.04.4 LTS

tensorflow version: r.0.10

CUDA version: libcudnn.so.4.0
If installed from binary pip package, provide:

What have you tried?

Add a maximum step number to terminate training normally.

  1. replace

    with tf.Session() as sess:

with the following code:

config = tf.ConfigProto()
config.gpu_options.allow_growth=True
sess = tf.Session(config=config)

with tf.Graph().as_default(),sess:

  1. close the session after training
    sess.close()
    del sess
  2. the version of my nvidia driver is 352.79 now.
    I will update it to the latest version next week.

I have no idea what to do now.

Thanks!
Best,
xiaorongfan

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions