I tried to get the experiment working on an Amazon GPU cloud machine with a K520 graphics card and CUDA 8. I got quite a few warnings, but I think the problem is some CUDA function not working on the GPU. Here is some of the output:
assign pretrain model weights to conv2_1
assign pretrain model biases to conv2_1
Faster-RCNN_TF/tools/../lib/rpn_msr/proposal_target_layer_tf.py:89: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
bbox_targets[ind, start:end] = bbox_target_data[ind, 1:]
Faster-RCNN_TF/tools/../lib/rpn_msr/proposal_target_layer_tf.py:90: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
bbox_inside_weights[ind, start:end] = cfg.TRAIN.BBOX_INSIDE_WEIGHTS
cudaCheckError() failed : invalid device function
E tensorflow/stream_executor/stream.cc:272] Error recording event in stream: error recording CUDA event on stream 0x4cae120: CUDA_ERROR_DEINITIALIZED; not marking stream as bad, as the Event object may be at fault. Monitor for further errors.
E tensorflow/stream_executor/cuda/cuda_event.cc:49] Error polling for event status: failed to query event: CUDA_ERROR_DEINITIALIZED
F tensorflow/core/common_runtime/gpu/gpu_event_mgr.cc:198] Unexpected Event status: 1
E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:671] failed to record completion event; therefore, failed to create inter-stream dependency
I tensorflow/stream_executor/stream.cc:3775] stream 0x4caea80 did not memcpy device-to-host; source: 0x723f3cf00
./experiments/scripts/faster_rcnn_end2end.sh: line 57: 10679 Aborted (core dumped) python ./tools/train_net.py --device ${DEV} --device_id ${DEV_ID} --weights data/pretrain_model/VGG_imagenet.npy --imdb ${TRAIN_IMDB} --iters ${ITERS} --cfg experiments/cfgs/faster_rcnn_end2end.yml --network VGGnet_train ${EXTRA_ARGS}
Can you give me a hint as to what the problem could be?
Thanks in advance.
I found my problem :) It was a problem with TensorFlow itself. The official binaries won't work on the AWS AMI because they are built for NVIDIA compute capability 3.5, and the K520 only supports 3.0. Building TensorFlow from source with the correct settings solved this problem for me.
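For anyone hitting the same "invalid device function" error, here is a rough sketch of the compatibility rule that seems to be behind it. It assumes the usual CUDA behaviour: precompiled GPU code runs only on devices with the same major compute capability and an equal or higher minor version, while embedded PTX can be JIT-compiled for any equal or newer device. The function name `can_run` and the target lists are hypothetical and only for illustration. The K520's compute capability is 3.0, and the 3.5 target is the one mentioned above.

```python
def can_run(device_cc, sass_targets, ptx_targets=()):
    """Rough check: can a GPU with compute capability `device_cc` run a binary?

    device_cc    -- (major, minor) of the GPU, e.g. (3, 0) for a K520
    sass_targets -- capabilities the binary ships precompiled code for
    ptx_targets  -- capabilities the binary ships PTX for (JIT-compilable)
    """
    dev_major, dev_minor = device_cc
    # Precompiled code: same major version, target minor <= device minor.
    for major, minor in sass_targets:
        if major == dev_major and minor <= dev_minor:
            return True
    # PTX: can be JIT-compiled on any device at or above the PTX target.
    return any(tuple(t) <= tuple(device_cc) for t in ptx_targets)


K520 = (3, 0)

# Binary built only for 3.5 (the situation described above).
print(can_run(K520, [(3, 5)]))  # False

# Hypothetical rebuild from source that targets 3.0.
print(can_run(K520, [(3, 0)]))  # True
```

This is only a simplified model, but it matches what happened here: a build that targets 3.5 has nothing the K520 can execute, while a build that includes 3.0 does.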