CUDA cannot create more than one session #19482

Davidnet · 2018-05-22T21:27:10Z


System information

Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Xenial
TensorFlow installed from (source or binary): source
TensorFlow version (use command below): 1.8
Python version: 2.7
Bazel version (if compiling from source): 0.13
GCC/Compiler version (if compiling from source): 5.4.0
CUDA/cuDNN version: 9.0/7.0
GPU model and memory: Tegra X2
Exact command to reproduce:
import tensorflow as tf

config = tf.ConfigProto()
# I have also tried config.gpu_options.allow_growth = True instead of a fixed fraction.
config.gpu_options.per_process_gpu_memory_fraction = 0.4
sess1 = tf.Session(config=config)
sess2 = tf.Session(config=config)  # Fails: cannot create the second session
You can collect some of this information using our environment capture script:

https://github.com/tensorflow/tensorflow/tree/master/tools/tf_env_collect.sh

You can obtain the TensorFlow version with

python -c "import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)"

Describe the problem

I have a Jetson TX2 with up-to-date drivers and the latest JetPack provided by Nvidia. I have built TensorFlow from source (r1.5 and r1.8), and I am not able to create more than one session. With a single session I can execute operations and everything works, but as soon as I create a second session, TensorFlow fails to create it. I suspect this is Nvidia's fault, but the error is not very informative:

  File "object_detection.py", line 183, in detection
    with tf.Session(graph=detection_graph,config=config) as sess:
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1509, in __init__
    super(Session, self).__init__(target, graph, config=config)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 628, in __init__
    self._session = tf_session.TF_NewDeprecatedSession(opts, status)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/errors_impl.py", line 473, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InternalError: Failed to create session.

Is there any way I can get more information about the CUDA error or status, so that I can supplement my bug report?

Thanks

Source code / logs

skye · 2018-06-04T22:29:49Z

@tfboyd do we have a way to reproduce this?

@Davidnet just to clarify, do you not experience this problem with a CPU-only build?

Davidnet · 2018-06-04T23:29:01Z

I use a Jetson TX2, so it's quite difficult to use a CPU-only build (it takes a lot of time). I'm concerned because I do not know how to file a proper bug report. How do I debug programs that use CUDA and cuDNN?

tfboyd · 2018-06-05T00:49:49Z

I had not used or seen multiple sessions in a script, but indeed it happens, and there was an issue about it that was resolved a long time ago. I would test the code on a regular GPU just to rule the Jetson in or out as the issue. We do not have a Jetson TX2 or TX setup in our area. I would also wrap the session in a with tf.device(...) block just to make sure it is going where you want, but that is likely not really needed. If there was a CUDA issue I would expect to see a CUDA error, but expectation does not always match reality.

Here is the multi-session example I found while looking for issues. They call run before starting the next session but I do not see why that would matter.
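The linked example is not reproduced here. As a rough sketch of the pattern described above (pin the op to a device, call run on the first session before starting the second), something like the following could serve as a minimal test on both the Jetson and a regular GPU. It assumes the TensorFlow 1.x API; the function name `run_two_sessions` and the device string are illustrative choices, not taken from the original example.

```python
def run_two_sessions(memory_fraction=0.4):
    """Open two sessions in one process, running an op in the first
    before creating the second. Returns both results."""
    import tensorflow as tf  # imported lazily; TensorFlow 1.x API assumed

    config = tf.ConfigProto()
    config.gpu_options.per_process_gpu_memory_fraction = memory_fraction
    config.allow_soft_placement = True   # fall back to CPU if the GPU is unavailable
    config.log_device_placement = True   # log which device each op is placed on

    graph = tf.Graph()
    with graph.as_default():
        # Assumption: the GPU is visible as device 0.
        with tf.device("/device:GPU:0"):
            total = tf.add(tf.constant(1.0), tf.constant(2.0))

    sess1 = tf.Session(graph=graph, config=config)
    try:
        first = sess1.run(total)  # run before starting the next session
        sess2 = tf.Session(graph=graph, config=config)
        try:
            second = sess2.run(total)
        finally:
            sess2.close()
    finally:
        sess1.close()
    return first, second
```

If the second `tf.Session(...)` call raises here as well, the device placement log printed by the first session at least shows where the op actually ran before the failure.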

For debugging CUDA, I do not have direct knowledge. You could try tfdbg (the TensorFlow debugger) as a starting point: https://www.tensorflow.org/programmers_guide/debugger I also do not have experience with it. Huge help, right?
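Separately from the debugger, one thing that may surface more detail is TensorFlow's C++ log verbosity, which is controlled by the environment variables `TF_CPP_MIN_LOG_LEVEL` and `TF_CPP_MIN_VLOG_LEVEL` and has to be set before TensorFlow is loaded. The sketch below, using only the Python standard library, reruns a script in a child process with those variables set and returns the combined output. The helper name `run_with_verbose_tf_logs` is hypothetical, and whether the extra logging explains the session failure on this build is an assumption, not something confirmed in this thread.

```python
import os
import subprocess
import sys


def run_with_verbose_tf_logs(script_args, vlog_level=1):
    """Run `python <script_args>` with verbose TensorFlow C++ logging enabled.

    Returns (exit_code, combined_stdout_and_stderr_text).
    """
    env = dict(os.environ)
    env["TF_CPP_MIN_LOG_LEVEL"] = "0"                # do not filter INFO/WARNING/ERROR
    env["TF_CPP_MIN_VLOG_LEVEL"] = str(vlog_level)   # enable VLOG output up to this level

    proc = subprocess.Popen(
        [sys.executable] + list(script_args),
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # TensorFlow logs go to stderr; merge them in
    )
    output, _ = proc.communicate()
    return proc.returncode, output.decode("utf-8", "replace")
```

For example, `run_with_verbose_tf_logs(["object_detection.py"], vlog_level=2)` would rerun the script from the traceback above with more logging. Higher levels can be very noisy.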

That is where I would start. Finding out whether it is a Jetson-specific issue would be my first step, then going from there.

Davidnet · 2018-06-05T02:03:52Z

I'm now getting an error about CUDA runtime implicit initialization on the GPU:

2018-05-31 01:23:28.378527: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0:   N 
2018-05-31 01:23:28.378766: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4744 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
Building Graph
2018-05-31 01:23:40.001179: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2018-05-31 01:23:40.001338: E tensorflow/core/common_runtime/direct_session.cc:154] Internal: CUDA runtime implicit initialization on GPU:0 failed. Status: unknown error
Traceback (most recent call last):
  File "real_time_detection.py", line 174, in <module>
    main()

tfboyd · 2018-06-05T02:20:31Z

@Davidnet There is likely not much I can debug. I would suggest the following:

Share the code you are running. All of it so someone else could cut/paste and run it. If someone has a few minutes to look at the issue they will get a lot farther, before giving up, if they have code to run and a lot of details stated in a concise manner.
Be clear about what you tried. In this case, without the code it is hard to know why you saw no initialization message before but you do now. What did you change?
Share the full log and command-line that was run. Link to a file or just paste it in.
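One possible way to gather the last item in a paste-ready form is sketched below: it runs a command, then bundles the exact command line, exit code, basic platform details, and the complete output into a single block of text. It uses only the Python standard library; the helper name `capture_report` and the report layout are illustrative assumptions, not a TensorFlow tool.

```python
import platform
import subprocess
import sys


def capture_report(cmd):
    """Run `cmd` (a list of arguments) and return a text report containing
    the command line, exit code, platform details, and the full log."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep stdout and stderr together in one log
    )
    output, _ = proc.communicate()
    lines = [
        "Command: " + " ".join(cmd),
        "Exit code: %d" % proc.returncode,
        "Platform: " + platform.platform(),
        "Python: " + platform.python_version(),
        "---- full log ----",
        output.decode("utf-8", "replace"),
    ]
    return "\n".join(lines)
```

A call such as `print(capture_report([sys.executable, "real_time_detection.py"]))` would produce text that can be pasted into the issue or attached as a file.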

Finally, keep in mind this is not a help desk; mostly it is about giving you ideas. That is even more true for the Jetson, as we do not have those sitting around like we do GPUs or CPUs on our local machines.

Do not read this as not wanting to help. Everyone wants to help and the biggest frustration is not having enough information.

tensorflowbutler · 2018-06-19T18:43:34Z

Nagging Assignee @skye: It has been 14 days with no activity and this issue has an assignee. Please update the label and/or status accordingly.

tensorflowbutler assigned skye May 26, 2018

Davidnet closed this as completed Jun 20, 2018
