Skipping cancelled dequeue attempt with queue not closed #36

Open
Liang-ZX opened this issue Dec 10, 2021 · 0 comments

  1. Error log (first epoch):
    [1210 18:09:10 @param.py:158] [HyperParamSetter] At global_step=0, learning_rate is set to 0.001000
    [1210 18:09:11 @prof.py:294] [HostMemoryTracker] Free RAM in before_train() is 238.12 GB.
    [1210 18:09:11 @stac_helper.py:83] ----------------------------------------------------------------------------------------------------
    [1210 18:09:11 @stac_helper.py:84] Model save path: result/VOC2007/instances_trainval
    [1210 18:09:11 @stac_helper.py:85] ----------------------------------------------------------------------------------------------------
    [1210 18:09:11 @eval.py:313] [EvalCallback] Will evaluate every 20 epochs
    [1210 18:09:28 @base.py:273] Start Epoch 1 ...
    0%| |0/500[00:00<?,?it/s]
    2021-12-10 18:09:43.544891: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
    2021-12-10 18:10:23.596973: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
    0%| |0/500[02:46<?,?it/s]
    2021-12-10 18:12:16.766932: W tensorflow/core/kernels/queue_base.cc:277] _0_QueueInput/input_queue: Skipping cancelled enqueue attempt with queue not closed
    Traceback (most recent call last):
      File "/mnt/lustre/liangzhixuan/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1356, in _do_call
        return fn(*args)
      File "/mnt/lustre/liangzhixuan/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1341, in _run_fn
        options, feed_dict, fetch_list, target_list, run_metadata)
      File "/mnt/lustre/liangzhixuan/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1429, in _call_tf_sessionrun
        run_metadata)
    tensorflow.python.framework.errors_impl.DeadlineExceededError: Timed out waiting for notification

  2. Environment Information:


sys.platform linux
Python 3.6.9 |Anaconda, Inc.| (default, Jul 30 2019, 19:07:31) [GCC 7.3.0]
Tensorpack v0.9.8-61-g4ac2e22b-dirty
Numpy 1.16.4
TensorFlow 1.14.0/v1.14.0-rc1-22-gaf24dc91b5
TF Compiler Version 4.8.5
TF CUDA support True
TF MKL support False
TF XLA support False
Nvidia Driver /usr/lib64/libnvidia-ml.so.460.73.01
CUDA /mnt/lustre/share/cuda-10.0/lib64/libcudart.so.10.0.130
CUDNN /mnt/lustre/share/cuda-10.0/lib64/libcudnn.so.7.4.1
NCCL
CUDA_VISIBLE_DEVICES 1,2,3,4
GPU 0,1,2,3,4,5,6,7 Tesla V100-SXM2-32GB
Free RAM 344.40/376.39 GB
CPU Count 48
cv2 4.1.1
msgpack 1.0.3
python-prctl False

