ERROR LOG (first epoch)
[1210 18:09:10 @param.py:158] [HyperParamSetter] At global_step=0, learning_rate is set to 0.001000
[1210 18:09:11 @prof.py:294] [HostMemoryTracker] Free RAM in before_train() is 238.12 GB.
[1210 18:09:11 @stac_helper.py:83] ----------------------------------------------------------------------------------------------------
[1210 18:09:11 @stac_helper.py:84] Model save path: result/VOC2007/instances_trainval
[1210 18:09:11 @stac_helper.py:85] ----------------------------------------------------------------------------------------------------
[1210 18:09:11 @eval.py:313] [EvalCallback] Will evaluate every 20 epochs
[1210 18:09:28 @base.py:273] Start Epoch 1 ...
0%| |0/500[00:00<?,?it/s]
2021-12-10 18:09:43.544891: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2021-12-10 18:10:23.596973: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
0%| |0/500[02:46<?,?it/s]
2021-12-10 18:12:16.766932: W tensorflow/core/kernels/queue_base.cc:277] _0_QueueInput/input_queue: Skipping cancelled enqueue attempt with queue not closed
Traceback (most recent call last):
  File "/mnt/lustre/liangzhixuan/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1356, in _do_call
    return fn(*args)
  File "/mnt/lustre/liangzhixuan/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1341, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/mnt/lustre/liangzhixuan/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1429, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.DeadlineExceededError: Timed out waiting for notification
Environment Information:
sys.platform          linux
Python                3.6.9 |Anaconda, Inc.| (default, Jul 30 2019, 19:07:31) [GCC 7.3.0]
Tensorpack            v0.9.8-61-g4ac2e22b-dirty
Numpy                 1.16.4
TensorFlow            1.14.0/v1.14.0-rc1-22-gaf24dc91b5
TF Compiler Version   4.8.5
TF CUDA support       True
TF MKL support        False
TF XLA support        False
Nvidia Driver         /usr/lib64/libnvidia-ml.so.460.73.01
CUDA                  /mnt/lustre/share/cuda-10.0/lib64/libcudart.so.10.0.130
CUDNN                 /mnt/lustre/share/cuda-10.0/lib64/libcudnn.so.7.4.1
NCCL
CUDA_VISIBLE_DEVICES  1,2,3,4
GPU 0,1,2,3,4,5,6,7   Tesla V100-SXM2-32GB
Free RAM              344.40/376.39 GB
CPU Count             48
cv2                   4.1.1
msgpack               1.0.3
python-prctl          False