
Tensorflow Still Trying to use CUDA even when Session Created with device_count={'GPU': 0} #9201

Closed
cancan101 opened this issue Apr 13, 2017 · 14 comments
Assignees
Labels
stale This label marks the issue/pr stale - to be closed automatically if no activity stat:awaiting response Status - Awaiting response from author stat:contribution welcome Status - Contributions welcome type:feature Feature requests

Comments

@cancan101
Contributor

System Information

Using the tensorflow/tensorflow:1.0.1-devel-gpu Docker image.
('v1.0.0-65-g4763edf-dirty', '1.0.1')
Host: Driver Version: 367.57, 3.13.0-57-generic

Issue

If I set the compute mode to EXCLUSIVE_PROCESS on the Nvidia device (sudo nvidia-smi -c 1), then even though I tell the Session not to use GPUs (config=tf.ConfigProto(device_count={'GPU': 0})), TensorFlow still attempts to use the GPU, which makes it impossible to create a session:

InternalErrorTraceback (most recent call last)
<ipython-input-1-cabf26c1451a> in <module>()
      1 import tensorflow as tf
      2 from tensorflow.python.framework import ops
----> 3 with tf.Session(config=tf.ConfigProto(device_count={'GPU': 0})) as sess:
      4     pass

/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.pyc in __init__(self, target, graph, config)
   1174 
   1175     """
-> 1176     super(Session, self).__init__(target, graph, config=config)
   1177     # NOTE(mrry): Create these on first `__enter__` to avoid a reference cycle.
   1178     self._default_graph_context_manager = None

/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.pyc in __init__(self, target, graph, config)
    550     try:
    551       with errors.raise_exception_on_not_ok_status() as status:
--> 552         self._session = tf_session.TF_NewDeprecatedSession(opts, status)
    553     finally:
    554       tf_session.TF_DeleteSessionOptions(opts)

/usr/lib/python2.7/contextlib.pyc in __exit__(self, type, value, traceback)
     22         if type is None:
     23             try:
---> 24                 self.gen.next()
     25             except StopIteration:
     26                 return

/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/errors_impl.pyc in raise_exception_on_not_ok_status()
    464           None, None,
    465           compat.as_text(pywrap_tensorflow.TF_Message(status)),
--> 466           pywrap_tensorflow.TF_GetCode(status))
    467   finally:
    468     pywrap_tensorflow.TF_DeleteStatus(status)

InternalError: Failed to create session.

This can be demonstrated by running:

import tensorflow as tf
from tensorflow.python.framework import ops
with tf.Session(config=tf.ConfigProto(device_count={'GPU': 0})) as sess:
    pass

when another process is using CUDA and the exclusive process mode is set.

If exclusive process mode is not set, then the session is created, but using nvidia-smi I can see that the process is using GPU RAM (and CUDA):

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0      2237    C   /usr/bin/python                                 61MiB |

The issue seems limited to TF trying to lock the CUDA device (and allocate ~61 MB of memory). Subsequent computations do happen correctly on the CPU.
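
One way to confirm that the ops really land on the CPU is to enable device placement logging (a minimal sketch along the lines of the repro above):

import tensorflow as tf

config = tf.ConfigProto(device_count={'GPU': 0}, log_device_placement=True)
with tf.Session(config=config) as sess:
    a = tf.constant([1.0, 2.0])
    print(sess.run(a * 2))  # the placement log shows /device:CPU:0 for these ops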

@jart
Contributor

jart commented Apr 14, 2017

You want either export CUDA_VISIBLE_DEVICES= or alternatively a virtualenv with non-GPU TensorFlow. See also: #2175 (comment)
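
A minimal sketch of that workaround, hiding every CUDA device before TensorFlow initializes its GPU runtime:

import os
# Safest to set this before importing TensorFlow at all.
os.environ["CUDA_VISIBLE_DEVICES"] = ""

import tensorflow as tf

# With no CUDA device visible, the session neither locks the GPU nor allocates memory on it.
with tf.Session() as sess:
    pass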

@jart jart closed this as completed Apr 14, 2017
@jart jart added the type:support Support issues label Apr 14, 2017
@cancan101
Contributor Author

cancan101 commented Apr 14, 2017

@jart: I'm not sure why the config approach I outlined doesn't work, and why the only suggestion is to set an env var. Setting the configuration as I did seems to work partially (i.e. it prevents use of the GPU for the graph) but not totally (i.e. it still locks the device). This seems to violate the "principle of least astonishment". It seems like this is either a documentation issue or an issue with how the config is used.

The environment-variable approach is not ideal because:

  1. It is weird for a process to set this value itself.
  2. Related to 1, it limits the ability to choose GPU vs. CPU on a per-Session basis.

@cancan101
Contributor Author

@jart Any thoughts on the above questions / comments?

@jart
Contributor

jart commented Apr 25, 2017

@zheng-xq Our friend @cancan101 believes it would be less astonishing for our users if tf.ConfigProto(device_count={'GPU': 0}) also implied export CUDA_VISIBLE_DEVICES="". That doesn't sound unreasonable to me. What are your opinions on this feature request?

@jart jart reopened this Apr 25, 2017
@jart jart added the type:feature Feature requests label Apr 25, 2017
@aselle aselle added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Apr 25, 2017
@Belval
Contributor

Belval commented May 5, 2017

I am experiencing the same issue with TF and I too believe tf.ConfigProto(device_count={'GPU': 0}) should imply export CUDA_VISIBLE_DEVICES="". I'd like to be able to use my GPU for specific tasks without setting up a second env.

@skye skye added stat:contribution welcome Status - Contributions welcome and removed stat:awaiting tensorflower Status - Awaiting response from tensorflower type:support Support issues labels Jun 16, 2017
@zjuxxd

zjuxxd commented Jun 29, 2017

I also have the same problem. I would be very happy if this could be supported.

@TimZaman
Contributor

Why is this issue closed?

@Belval
Contributor

Belval commented Dec 20, 2017

@TimZaman

It's not, but it really isn't a priority, since you can do this (I know it's ugly):

import os

# Hide the GPUs before the CPU-only TensorFlow code runs.
os.environ["CUDA_VISIBLE_DEVICES"] = ''

# Code that uses TensorFlow without the GPU

# Make GPU 0 visible again afterwards.
os.environ["CUDA_VISIBLE_DEVICES"] = '0'

If you want, you could also wrap the whole thing in a decorator:

import os

def cpu_only():
    def _method_wrapper(function):
        def wrap(*args, **kwargs):
            # Hide the GPUs while the wrapped function runs.
            os.environ["CUDA_VISIBLE_DEVICES"] = ''
            ret = function(*args, **kwargs)
            # Make GPU 0 visible again afterwards.
            os.environ["CUDA_VISIBLE_DEVICES"] = '0'
            return ret
        # Preserve the wrapped function's name and docstring.
        wrap.__doc__ = function.__doc__
        wrap.__name__ = function.__name__
        return wrap
    return _method_wrapper
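
For example, a hypothetical function wrapped with it (the function name is made up for illustration):

@cpu_only()
def train_on_cpu():
    import tensorflow as tf
    # GPUs are hidden while this body runs, so the session stays on the CPU.
    with tf.Session() as sess:
        pass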

Someone might work on it one day, but I wouldn't hold my breath.

@TimZaman
Contributor

@Belval hehe yeah that makes me feel like I want to take a shower.
But I love how you wrapped that turd in a beautiful decorator! 🤣 Fair enough for now, I can see how this is not top prio.

@mohantym
Contributor

Hi @cancan101! 1.x issues are no longer supported. You can use tf.device to switch between CPU and GPU in 2.x versions. Thank you!
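
A minimal TF 2.x sketch of that approach (illustrative only, not from this thread):

import tensorflow as tf

# Optionally hide the GPUs entirely so nothing is locked or allocated on them.
tf.config.set_visible_devices([], "GPU")

# Pin the computation to the CPU explicitly.
with tf.device("/CPU:0"):
    a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
    b = tf.matmul(a, a)

print(b.device)  # .../device:CPU:0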

@mohantym mohantym added the stat:awaiting response Status - Awaiting response from author label May 24, 2022
@google-ml-butler

This issue has been automatically marked as stale because it has no recent activity. It will be closed if no further activity occurs. Thank you.

@google-ml-butler google-ml-butler bot added the stale This label marks the issue/pr stale - to be closed automatically if no activity label May 31, 2022
@google-ml-butler

Closing as stale. Please reopen if you'd like to work on this further.
