Please make sure that this is a bug. As per our GitHub Policy, we only address code/doc bugs, performance issues, feature requests and build/installation issues on GitHub.
System information
Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Arch Linux (kernel 5.1.12)
Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: N/A
TensorFlow version (use command below): 1.14.0-rc1
Python version: 3.7.3
Bazel version (if compiling from source): N/A
GCC/Compiler version (if compiling from source): N/A
CUDA/cuDNN version: 10.1.168
GPU model and memory: Quadro M2200, 4043 MB
You can collect some of this information using our environment capture script
You can also obtain the TensorFlow version with:
1. TF 1.0: python -c "import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)"
2. TF 2.0: python -c "import tensorflow as tf; print(tf.version.GIT_VERSION, tf.version.VERSION)"
Describe the current behavior
TensorFlow allocates more GPU memory than specified. When multiple processes share the same GPU, this can cause one of them to hit an out-of-memory exception. For example, I specified that it use no more than 50% of GPU memory, but it actually allocates ~52%, as shown in the screenshot.
Describe the expected behavior
I would expect it to allocate no more than 50% of the GPU memory, which in my case would be <= 2021.5 MB.
Code to reproduce the issue
Provide a reproducible test case that is the bare minimum necessary to generate the problem.
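A minimal sketch of the setup described above (the 0.5 fraction matches the report; the matrix multiply is just a placeholder workload to force an allocation):

```python
import tensorflow as tf  # written against the TF 1.14 API

# Ask TF to use at most 50% of GPU memory (2021.5 MB on a 4043 MB card).
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.5)
config = tf.ConfigProto(gpu_options=gpu_options)

# Placeholder workload: any op that touches the GPU will do.
a = tf.random.uniform([4096, 4096])
b = tf.random.uniform([4096, 4096])
c = tf.matmul(a, b)

with tf.Session(config=config) as sess:
    sess.run(c)
    input("Check nvidia-smi now, then press Enter to exit")
```

While the session is open, nvidia-smi shows the process using noticeably more than 50% of the card.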
Other info / logs
Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached.
Can confirm this issue. On a system with an NVIDIA 2080 Ti it behaves the following way:
With per_process_gpu_memory_fraction=0.2 it allocates 2457 MiB / 10989 MiB (as shown in nvidia-smi), which is clearly more than the expected 0.2 * 10989 = 2198 MiB.
With per_process_gpu_memory_fraction=0.1 it allocates 1357 MiB / 10989 MiB, again more than the expected 0.1 * 10989 = 1099 MiB.
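Working the numbers from those two runs, the overshoot is roughly constant (~260 MiB) in both cases, which looks like a fixed per-process overhead rather than a percentage error:

```python
# Quick arithmetic check on the two measurements above
total = 10989  # MiB reported by nvidia-smi
for frac, used in [(0.2, 2457), (0.1, 1357)]:
    expected = frac * total
    print(f"fraction={frac}: used={used} MiB, expected={expected:.0f} MiB, "
          f"overshoot={used - expected:.0f} MiB")
# fraction=0.2: overshoot ~259 MiB
# fraction=0.1: overshoot ~258 MiB
```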
Hi @zli117, I think this is expected. per_process_gpu_memory_fraction specifies the fraction of GPU memory that TF's allocator will use for the graph's input/output tensors and for temporary buffers holding intermediate results. It does not include the memory needed to initialize CUDA/cuDNN and other GPU libraries.
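If the goal is to keep the whole process under a given share of the card, a rough workaround (just a sketch; the overhead figure is an estimate based on the measurements above, not a documented value) is to subtract the expected CUDA/cuDNN overhead from the budget before computing the fraction, or to use allow_growth so TF only reserves memory as it needs it:

```python
import tensorflow as tf  # TF 1.x API

# Sketch: budget the allocator fraction so that allocator + (estimated)
# CUDA/cuDNN context stays under the overall target share of the GPU.
total_mib = 10989        # card size from nvidia-smi
overhead_mib = 300       # estimated context overhead -- an assumption, not a documented value
target_fraction = 0.2    # overall share of the GPU this process may use
allocator_fraction = max(target_fraction * total_mib - overhead_mib, 0) / total_mib
config = tf.ConfigProto(
    gpu_options=tf.GPUOptions(per_process_gpu_memory_fraction=allocator_fraction))

# Alternative: let TF grow allocations on demand instead of pre-reserving
# a fraction (note: this does not enforce a hard cap).
config_growth = tf.ConfigProto(gpu_options=tf.GPUOptions(allow_growth=True))
```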
I'm closing this, feel free to reopen if there are further questions.