My CUDA is 11.0 but TensorFlow needs libcudart.so.10.0 #10335

Open
3 tasks
Wuxinxiaoshifu opened this issue Oct 27, 2021 · 2 comments
Labels: models:research (models that come under research directory), type:bug (Bug in the code)

@Wuxinxiaoshifu

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • I am using the latest TensorFlow Model Garden release and TensorFlow 1.15.
  • I am reporting the issue to the correct repository. (Model Garden official or research directory)
  • I checked to make sure that this issue has not already been filed.

1. The entire URL of the file you are using

https://github.com/tensorflow/models/tree/master/research/deeplab

2. Describe the bug

When I run python3 train.py, I get the following error: "tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudart.so.10.0'; dlerror: libcudart.so.10.0"
The libcudart.so installed on my machine is libcudart.so.11.0.
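
The logs in this thread show TensorFlow 1.15 asking for libcudart.so.10.0 while only the CUDA 11.0 runtime is installed. A minimal sketch for checking which runtime versions the dynamic linker can actually open on a given machine; the helper names can_load and probe_cudart are made up for illustration and are not part of TensorFlow:

```python
import ctypes


def can_load(soname):
    """Return True if the dynamic linker can open the given shared library."""
    try:
        ctypes.CDLL(soname)
        return True
    except OSError:
        # Raised when the library is missing or cannot be opened.
        return False


def probe_cudart(versions=("10.0", "11.0")):
    """Map each libcudart soname to whether it can be loaded on this machine."""
    return {f"libcudart.so.{v}": can_load(f"libcudart.so.{v}") for v in versions}


if __name__ == "__main__":
    for soname, ok in probe_cudart().items():
        print(soname, "loadable" if ok else "NOT loadable")
```

If libcudart.so.10.0 is reported as not loadable, the usual options are to install the CUDA 10.0 runtime alongside 11.0 or to move to a TensorFlow release built against CUDA 11.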

3. Steps to reproduce

Steps to reproduce the behavior.

4. Expected behavior

I want to train on my dataset. Can you fix this problem?

5. Additional context

I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2021-10-27 22:44:43.970758: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-10-27 22:44:43.971327: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce GTX 1650 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.485
pciBusID: 0000:01:00.0
2021-10-27 22:44:43.971467: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudart.so.10.0'; dlerror: libcudart.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /home/wuxin/YZH/object_detect/yolo_ws/devel/lib:/home/wuxin/JKW/local_road_ws/devel/lib:/home/wuxin/YZH/lidar_ws/devel/lib:/home/wuxin/YZH/ccm_slam_ws/devel/lib:/home/wuxin/YZH/yzh_ws/devel/lib:/home/wuxin/YZH/yzh_ws2NoCat/cartographer_ws/devel_isolated/cartographer_rviz/lib:/home/wuxin/YZH/yzh_ws2NoCat/cartographer_ws/install_isolated/lib:/home/wuxin/zed/zed_ros_ws/devel/lib:/opt/ros/melodic/lib:/usr/local/opencv/lib:/usr/local/cuda-11.0/lib64:/opt/ros/noetic/lib/x86_64-linux-gnu:/home/wuxin/px/PX4-Autopilot/build/px4_sitl_default/build_gazebo
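
The warning above lists every directory on LD_LIBRARY_PATH, and only /usr/local/cuda-11.0/lib64 looks like a CUDA directory. A rough sketch for checking which of those directories contain a given library file; find_library is a hypothetical helper, and it only inspects LD_LIBRARY_PATH, not the ldconfig cache or the default system directories that the loader also searches:

```python
import os


def find_library(filename, search_path):
    """Return the directories in a colon-separated search path that contain filename."""
    hits = []
    for directory in search_path.split(":"):
        # Skip empty entries produced by leading, trailing, or doubled colons.
        if directory and os.path.isfile(os.path.join(directory, filename)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    path = os.environ.get("LD_LIBRARY_PATH", "")
    for name in ("libcudart.so.10.0", "libcudart.so.11.0"):
        print(name, "->", find_library(name, path) or "not found on LD_LIBRARY_PATH")
```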

6. System information

  • Linux Ubuntu 18.04
  • TensorFlow installed from: pip3 install tensorflow-gpu
  • TensorFlow version: 1.15
  • Python version: 3.6.9
  • version :4.2.1
  • GCC/Compiler version (if compiling from source):
  • CUDA/cuDNN version: CUDA 11.0, cuDNN 8.0.5
  • GPU model and memory: GTX 1650 Ti, 3903 MB
@Wuxinxiaoshifu Wuxinxiaoshifu added models:research models that come under research directory type:bug Bug in the code labels Oct 27, 2021
@kumariko kumariko self-assigned this Oct 28, 2021
@Wuxinxiaoshifu
Author

python3 /home/wuxin/YZH/object_detect/deeplab/models/research/deeplab/train.py --logtostderr --training_number_of_steps=10000 --train_split="train" --model_variant="xception_65" --atrous_rates=6 --atrous_rates=12 --atrous_rates=18 --output_stride=16 --decoder_output_stride=6 --train_crop_size=481,641 --train_batch_size=2 --dataset="pascal_voc_seg" --num_clones=1 --tf_initial_checkpoint='/home/wuxin/YZH/object_detect/deeplab/datasets/deeplabv3_pascal_trainval' --train_logdir='/home/wuxin/YZH/object_detect/deeplab/datasets/model_get' --dataset_dir='/home/wuxin/YZH/object_detect/deeplab/datasets/dataset/tfrecords'
WARNING:tensorflow:From /home/wuxin/YZH/object_detect/deeplab/models/research/deeplab/core/conv2d_ws.py:40: The name tf.layers.Layer is deprecated. Please use tf.compat.v1.layers.Layer instead.

WARNING:tensorflow:
The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:

INFO:tensorflow:Training on train set
I1028 13:33:41.810364 139670364448576 train.py:290] Training on train set
WARNING:tensorflow:From /home/wuxin/.local/lib/python3.6/site-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.FixedLenFeature is deprecated. Please use tf.io.FixedLenFeature instead.

W1028 13:33:41.953724 139670364448576 module_wrapper.py:139] From /home/wuxin/.local/lib/python3.6/site-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.FixedLenFeature is deprecated. Please use tf.io.FixedLenFeature instead.

WARNING:tensorflow:From /home/wuxin/.local/lib/python3.6/site-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.parse_single_example is deprecated. Please use tf.io.parse_single_example instead.

W1028 13:33:41.954856 139670364448576 module_wrapper.py:139] From /home/wuxin/.local/lib/python3.6/site-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.parse_single_example is deprecated. Please use tf.io.parse_single_example instead.

2021-10-28 13:33:42.336265: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2021-10-28 13:33:42.376173: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-10-28 13:33:42.376664: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce GTX 1650 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.485
pciBusID: 0000:01:00.0
2021-10-28 13:33:42.376876: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2021-10-28 13:33:42.378843: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2021-10-28 13:33:42.379859: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2021-10-28 13:33:42.380096: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2021-10-28 13:33:42.382565: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2021-10-28 13:33:42.383152: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2021-10-28 13:33:42.383301: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-10-28 13:33:42.383390: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-10-28 13:33:42.383792: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-10-28 13:33:42.384121: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
WARNING:tensorflow:From /home/wuxin/.local/lib/python3.6/site-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.logging.warning is deprecated. Please use tf.compat.v1.logging.warning instead.

W1028 13:33:42.746240 139670364448576 module_wrapper.py:139] From /home/wuxin/.local/lib/python3.6/site-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.logging.warning is deprecated. Please use tf.compat.v1.logging.warning instead.

WARNING:tensorflow:From /home/wuxin/.local/lib/python3.6/site-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.

W1028 13:33:43.337642 139670364448576 module_wrapper.py:139] From /home/wuxin/.local/lib/python3.6/site-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.

WARNING:tensorflow:From /home/wuxin/.local/lib/python3.6/site-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.lin_space is deprecated. Please use tf.linspace instead.

W1028 13:33:43.338138 139670364448576 module_wrapper.py:139] From /home/wuxin/.local/lib/python3.6/site-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.lin_space is deprecated. Please use tf.linspace instead.

WARNING:tensorflow:From /home/wuxin/.local/lib/python3.6/site-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.random_shuffle is deprecated. Please use tf.random.shuffle instead.

W1028 13:33:43.338405 139670364448576 module_wrapper.py:139] From /home/wuxin/.local/lib/python3.6/site-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.random_shuffle is deprecated. Please use tf.random.shuffle instead.

WARNING:tensorflow:From /home/wuxin/.local/lib/python3.6/site-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.image.resize_bilinear is deprecated. Please use tf.compat.v1.image.resize_bilinear instead.

W1028 13:33:43.539838 139670364448576 module_wrapper.py:139] From /home/wuxin/.local/lib/python3.6/site-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.image.resize_bilinear is deprecated. Please use tf.compat.v1.image.resize_bilinear instead.

WARNING:tensorflow:From /home/wuxin/.local/lib/python3.6/site-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.reverse_v2 is deprecated. Please use tf.reverse instead.

W1028 13:33:45.765414 139670364448576 module_wrapper.py:139] From /home/wuxin/.local/lib/python3.6/site-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.reverse_v2 is deprecated. Please use tf.reverse instead.

WARNING:tensorflow:From /home/wuxin/YZH/object_detect/deeplab/models/research/deeplab/datasets/data_generator.py:339: DatasetV1.make_one_shot_iterator (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use for ... in dataset: to iterate over a dataset. If using tf.estimator, return the Dataset object directly from your input function. As a last resort, you can use tf.compat.v1.data.make_one_shot_iterator(dataset).
W1028 13:33:45.997265 139670364448576 deprecation.py:323] From /home/wuxin/YZH/object_detect/deeplab/models/research/deeplab/datasets/data_generator.py:339: DatasetV1.make_one_shot_iterator (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use for ... in dataset: to iterate over a dataset. If using tf.estimator, return the Dataset object directly from your input function. As a last resort, you can use tf.compat.v1.data.make_one_shot_iterator(dataset).
WARNING:tensorflow:From /home/wuxin/YZH/object_detect/deeplab/models/research/deeplab/core/feature_extractor.py:490: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
W1028 13:33:46.019285 139670364448576 deprecation.py:323] From /home/wuxin/YZH/object_detect/deeplab/models/research/deeplab/core/feature_extractor.py:490: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
WARNING:tensorflow:From /home/wuxin/.local/lib/python3.6/site-packages/tensorflow_core/contrib/layers/python/layers/layers.py:1057: Layer.apply (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use layer.__call__ method instead.
W1028 13:33:46.024497 139670364448576 deprecation.py:323] From /home/wuxin/.local/lib/python3.6/site-packages/tensorflow_core/contrib/layers/python/layers/layers.py:1057: Layer.apply (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use layer.__call__ method instead.
WARNING:tensorflow:From /home/wuxin/YZH/object_detect/deeplab/models/research/deeplab/core/xception.py:393: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.

W1028 13:33:46.099911 139670364448576 module_wrapper.py:139] From /home/wuxin/YZH/object_detect/deeplab/models/research/deeplab/core/xception.py:393: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.

Traceback (most recent call last):
File "/home/wuxin/YZH/object_detect/deeplab/models/research/deeplab/train.py", line 464, in <module>
tf.compat.v1.app.run()
File "/home/wuxin/.local/lib/python3.6/site-packages/tensorflow_core/python/platform/app.py", line 40, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "/home/wuxin/.local/lib/python3.6/site-packages/absl/app.py", line 299, in run
_run_main(main, args)
File "/home/wuxin/.local/lib/python3.6/site-packages/absl/app.py", line 250, in _run_main
sys.exit(main(argv))
File "/home/wuxin/YZH/object_detect/deeplab/models/research/deeplab/train.py", line 321, in main
clones = model_deploy.create_clones(config, model_fn, args=model_args)
File "/home/wuxin/YZH/object_detect/deeplab/models/research/slim/deployment/model_deploy.py", line 192, in create_clones
outputs = model_fn(*args, **kwargs)
File "/home/wuxin/YZH/object_detect/deeplab/models/research/deeplab/train.py", line 252, in _build_deeplab
'total_training_steps': FLAGS.training_number_of_steps,
File "/home/wuxin/YZH/object_detect/deeplab/models/research/deeplab/model.py", line 323, in multi_scale_logits
nas_training_hyper_parameters=nas_training_hyper_parameters)
File "/home/wuxin/YZH/object_detect/deeplab/models/research/deeplab/model.py", line 597, in _get_logits
use_bounded_activation=model_options.use_bounded_activation)
File "/home/wuxin/YZH/object_detect/deeplab/models/research/deeplab/model.py", line 709, in refine_by_decoder
feature_extractor.DECODER_END_POINTS][output_stride]
KeyError: 6
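
Note that in this second run libcudart.so.10.0 loads successfully, so the remaining failure is the KeyError. The traceback ends at a table lookup indexed by output_stride, and the command passes --decoder_output_stride=6, so the key 6 is most likely missing from that table. A minimal sketch of the failure mode, assuming a hypothetical table that only registers stride 4; the real contents live in feature_extractor.py and should be checked there:

```python
# Hypothetical stand-in for the table that model.py indexes at line 709.
# ASSUMPTION: only stride 4 is registered; the feature name below is invented.
DECODER_END_POINTS = {4: ["hypothetical/low_level_feature"]}


def decoder_end_points(output_stride, table=DECODER_END_POINTS):
    """Look up decoder feature names, failing with a descriptive message."""
    if output_stride not in table:
        raise ValueError(
            f"decoder_output_stride={output_stride} is not supported; "
            f"choose one of {sorted(table)}"
        )
    return table[output_stride]


if __name__ == "__main__":
    print(decoder_end_points(4))
    try:
        decoder_end_points(6)
    except ValueError as err:
        print(err)
```

Under that assumption, rerunning with a decoder output stride that the table does contain would avoid the KeyError.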

@kumariko kumariko assigned YknZhu and aquariusjay and unassigned kumariko Oct 28, 2021
@TJxiaominliu

Have you solved it? I have the same error.
