Could not satisfy explicit device specification? #39

Closed
felipk101 opened this issue Oct 28, 2018 · 6 comments

@felipk101

Hi, I've tried training maskrcnn on a custom dataset, and I'm getting the following crash.
It seems to come from tensorflow. I've now installed multiple versions and tested on a Titan and a 1070, with the same result.

Could it have something to do with this?
google/prettytensor#1

Running Ubuntu 16, keras 2.2.4, tensorflow 1.4

Any ideas?

/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/keras/callbacks.py:1065: UserWarning: epsilon argument is deprecated and will be removed, use min_delta instead.
warnings.warn('epsilon argument is deprecated and '
/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/tensorflow/python/ops/gradients_impl.py:95: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
Traceback (most recent call last):
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1327, in _do_call
return fn(*args)
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1297, in _run_fn
self._extend_graph()
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1358, in _extend_graph
self._session, graph_def.SerializeToString(), status)
File "/usr/lib/python3.5/contextlib.py", line 66, in __exit__
next(self.gen)
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation 'training/Adam/gradients/filtered_detections/map/while/embedding_lookup_5_grad/Reshape_1/f_acc': Could not satisfy explicit device specification '' because the node was colocated with a group of nodes that required incompatible device '/job:localhost/replica:0/task:0/device:GPU:0'
Colocation Debug Info:
Colocation group had the following types and devices:
Gather: GPU CPU
ConcatV2: GPU CPU
StridedSlice: CPU
Cast: GPU CPU
TensorArrayGradV3: GPU CPU
Pack: GPU CPU
RefEnter: GPU CPU
Enter: GPU CPU
ExpandDims: GPU CPU
StackPop: GPU CPU
Stack: GPU CPU
TensorArrayReadV3: GPU CPU
Reshape: GPU CPU
UnsortedSegmentSum: GPU CPU
Identity: GPU CPU
TensorArrayGatherV3: GPU CPU
TensorArrayV3: GPU CPU
Unpack: GPU CPU
TensorArrayScatterV3: GPU CPU
Const: GPU CPU
TensorArrayWriteV3: GPU CPU
Shape: GPU CPU
Size: GPU CPU
StackPush: GPU CPU
[[Node: training/Adam/gradients/filtered_detections/map/while/embedding_lookup_5_grad/Reshape_1/f_acc = Stack_class=["loc:@filtered_detections/map/while/TensorArrayReadV3", "loc:@filtered_detections/map/while/strided_slice_10"], elem_type=DT_INT64, stack_name=""]]
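
One way to read the Colocation Debug Info above: every op type in the group lists both GPU and CPU except StridedSlice, which lists only CPU, while the error says the group required GPU:0. The sketch below is a hypothetical helper (not part of TensorFlow or keras-maskrcnn) that parses a few of those lines and reports which op types do not list a given device. The sample lines are copied from the log above; the function names are made up for illustration.

```python
# A few lines copied from the "Colocation Debug Info" section of the error.
COLOCATION_LINES = """\
Gather: GPU CPU
ConcatV2: GPU CPU
StridedSlice: CPU
Cast: GPU CPU
"""


def parse_colocation_group(text):
    """Map each op type to the set of device types listed for it."""
    devices_by_op = {}
    for line in text.splitlines():
        op, sep, devices = line.partition(":")
        if not sep:
            continue  # skip lines that are not of the form "OpType: DEVICES"
        devices_by_op[op.strip()] = set(devices.split())
    return devices_by_op


def ops_missing_device(devices_by_op, device="GPU"):
    """Return the op types that do not list the given device, sorted by name."""
    return sorted(op for op, devs in devices_by_op.items() if device not in devs)


group = parse_colocation_group(COLOCATION_LINES)
cpu_only = ops_missing_device(group, "GPU")
print(cpu_only)  # ['StridedSlice']
```

This only points at the op that looks like the odd one out in the group; it does not by itself explain why that op has no GPU kernel in this setup.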

@felipk101
Author

Apparently it has something to do with embedding on the GPU. After reading
ematvey/tensorflow-seq2seq-tutorials#11, I added the following to line 42 of Train.py:

config = tf.ConfigProto(allow_soft_placement=True)

Now I'm getting a different error.

Epoch 1/50
2018-10-29 19:26:53.750264: W tensorflow/core/framework/op_kernel.cc:1192] Not found: Resource __per_step_6/_tensor_arraysfiltered_detections/map/TensorArray_0/N10tensorflow11TensorArrayE does not exist.
2018-10-29 19:26:53.750489: W tensorflow/core/framework/op_kernel.cc:1192] Not found: Resource __per_step_6/_tensor_arraysfiltered_detections/map/TensorArray_0/N10tensorflow11TensorArrayE does not exist.
[[Node: training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3 = TensorArrayGradV3[_class=["loc:@filtered_detections/map/while/TensorArrayReadV3", "loc:@filtered_detections/map/while/TensorArrayReadV3/Enter"], source="training/Adam/gradients", _device="/job:localhost/replica:0/task:0/cpu:0"](training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3/Enter, training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3/Enter_1, ^training/Adam/gradients/Sub/_2469, ^_clooptraining/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayWrite/strided_slice/stack_2/_2315)]]
2018-10-29 19:26:53.750579: W tensorflow/core/framework/op_kernel.cc:1192] Not found: Resource __per_step_6/_tensor_arraysfiltered_detections/map/TensorArray_0/N10tensorflow11TensorArrayE does not exist.
[[Node: training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3 = TensorArrayGradV3[_class=["loc:@filtered_detections/map/while/TensorArrayReadV3", "loc:@filtered_detections/map/while/TensorArrayReadV3/Enter"], source="training/Adam/gradients", _device="/job:localhost/replica:0/task:0/cpu:0"](training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3/Enter, training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3/Enter_1, ^training/Adam/gradients/Sub/_2469, ^_clooptraining/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayWrite/strided_slice/stack_2/_2315)]]
2018-10-29 19:26:53.750906: W tensorflow/core/framework/op_kernel.cc:1192] Not found: Resource __per_step_6/_tensor_arraysfiltered_detections/map/TensorArray_0/N10tensorflow11TensorArrayE does not exist.
[[Node: training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3 = TensorArrayGradV3[_class=["loc:@filtered_detections/map/while/TensorArrayReadV3", "loc:@filtered_detections/map/while/TensorArrayReadV3/Enter"], source="training/Adam/gradients", _device="/job:localhost/replica:0/task:0/cpu:0"](training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3/Enter, training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3/Enter_1, ^training/Adam/gradients/Sub/_2469, ^_clooptraining/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayWrite/strided_slice/stack_2/_2315)]]
2018-10-29 19:26:53.750964: W tensorflow/core/framework/op_kernel.cc:1192] Not found: Resource __per_step_6/_tensor_arraysfiltered_detections/map/TensorArray_0/N10tensorflow11TensorArrayE does not exist.
[[Node: training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3 = TensorArrayGradV3[_class=["loc:@filtered_detections/map/while/TensorArrayReadV3", "loc:@filtered_detections/map/while/TensorArrayReadV3/Enter"], source="training/Adam/gradients", _device="/job:localhost/replica:0/task:0/cpu:0"](training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3/Enter, training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3/Enter_1, ^training/Adam/gradients/Sub/_2469, ^_clooptraining/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayWrite/strided_slice/stack_2/_2315)]]
2018-10-29 19:26:53.751008: W tensorflow/core/framework/op_kernel.cc:1192] Not found: Resource __per_step_6/_tensor_arraysfiltered_detections/map/TensorArray_0/N10tensorflow11TensorArrayE does not exist.
[[Node: training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3 = TensorArrayGradV3[_class=["loc:@filtered_detections/map/while/TensorArrayReadV3", "loc:@filtered_detections/map/while/TensorArrayReadV3/Enter"], source="training/Adam/gradients", _device="/job:localhost/replica:0/task:0/cpu:0"](training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3/Enter, training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3/Enter_1, ^training/Adam/gradients/Sub/_2469, ^_clooptraining/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayWrite/strided_slice/stack_2/_2315)]]
2018-10-29 19:26:53.751046: W tensorflow/core/framework/op_kernel.cc:1192] Not found: Resource __per_step_6/_tensor_arraysfiltered_detections/map/TensorArray_0/N10tensorflow11TensorArrayE does not exist.
[[Node: training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3 = TensorArrayGradV3[_class=["loc:@filtered_detections/map/while/TensorArrayReadV3", "loc:@filtered_detections/map/while/TensorArrayReadV3/Enter"], source="training/Adam/gradients", _device="/job:localhost/replica:0/task:0/cpu:0"](training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3/Enter, training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3/Enter_1, ^training/Adam/gradients/Sub/_2469, ^_clooptraining/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayWrite/strided_slice/stack_2/_2315)]]
2018-10-29 19:26:53.751166: W tensorflow/core/framework/op_kernel.cc:1192] Not found: Resource __per_step_6/_tensor_arraysfiltered_detections/map/TensorArray_0/N10tensorflow11TensorArrayE does not exist.
[[Node: training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3 = TensorArrayGradV3[_class=["loc:@filtered_detections/map/while/TensorArrayReadV3", "loc:@filtered_detections/map/while/TensorArrayReadV3/Enter"], source="training/Adam/gradients", _device="/job:localhost/replica:0/task:0/cpu:0"](training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3/Enter, training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3/Enter_1, ^training/Adam/gradients/Sub/_2469, ^_clooptraining/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayWrite/strided_slice/stack_2/_2315)]]
2018-10-29 19:26:53.754601: W tensorflow/core/framework/op_kernel.cc:1192] Not found: Resource __per_step_6/_tensor_arraysfiltered_detections/map/TensorArray_0/N10tensorflow11TensorArrayE does not exist.
[[Node: training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3 = TensorArrayGradV3[_class=["loc:@filtered_detections/map/while/TensorArrayReadV3", "loc:@filtered_detections/map/while/TensorArrayReadV3/Enter"], source="training/Adam/gradients", _device="/job:localhost/replica:0/task:0/cpu:0"](training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3/Enter, training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3/Enter_1, ^training/Adam/gradients/Sub/_2469, ^_clooptraining/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayWrite/strided_slice/stack_2/_2315)]]
2018-10-29 19:26:53.754641: W tensorflow/core/framework/op_kernel.cc:1192] Not found: Resource __per_step_6/_tensor_arraysfiltered_detections/map/TensorArray_0/N10tensorflow11TensorArrayE does not exist.
[[Node: training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3 = TensorArrayGradV3[_class=["loc:@filtered_detections/map/while/TensorArrayReadV3", "loc:@filtered_detections/map/while/TensorArrayReadV3/Enter"], source="training/Adam/gradients", _device="/job:localhost/replica:0/task:0/cpu:0"](training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3/Enter, training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3/Enter_1, ^training/Adam/gradients/Sub/_2469, ^_clooptraining/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayWrite/strided_slice/stack_2/_2315)]]
2018-10-29 19:26:53.754655: W tensorflow/core/framework/op_kernel.cc:1192] Not found: Resource __per_step_6/_tensor_arraysfiltered_detections/map/TensorArray_0/N10tensorflow11TensorArrayE does not exist.
[[Node: training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3 = TensorArrayGradV3[_class=["loc:@filtered_detections/map/while/TensorArrayReadV3", "loc:@filtered_detections/map/while/TensorArrayReadV3/Enter"], source="training/Adam/gradients", _device="/job:localhost/replica:0/task:0/cpu:0"](training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3/Enter, training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3/Enter_1, ^training/Adam/gradients/Sub/_2469, ^_clooptraining/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayWrite/strided_slice/stack_2/_2315)]]
Traceback (most recent call last):
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1327, in _do_call
return fn(*args)
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1306, in _run_fn
status, run_metadata)
File "/usr/lib/python3.5/contextlib.py", line 66, in __exit__
next(self.gen)
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.NotFoundError: Resource __per_step_6/_tensor_arraysfiltered_detections/map/TensorArray_0/N10tensorflow11TensorArrayE does not exist.
[[Node: training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3 = TensorArrayGradV3[_class=["loc:@filtered_detections/map/while/TensorArrayReadV3", "loc:@filtered_detections/map/while/TensorArrayReadV3/Enter"], source="training/Adam/gradients", _device="/job:localhost/replica:0/task:0/cpu:0"](training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3/Enter, training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3/Enter_1, ^training/Adam/gradients/Sub/_2469, ^_clooptraining/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayWrite/strided_slice/stack_2/_2315)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "./keras-maskrcnn/keras_maskrcnn/bin/train.py", line 314, in <module>
main()
File "./keras-maskrcnn/keras_maskrcnn/bin/train.py", line 310, in main
max_queue_size=1,
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/keras/engine/training.py", line 1418, in fit_generator
initial_epoch=initial_epoch)
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/keras/engine/training_generator.py", line 217, in fit_generator
class_weight=class_weight)
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/keras/engine/training.py", line 1217, in train_on_batch
outputs = self.train_function(ins)
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 2721, in __call__
return self._legacy_call(inputs)
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 2693, in _legacy_call
**self.session_kwargs)
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 895, in run
run_metadata_ptr)
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1124, in _run
feed_dict_tensor, options, run_metadata)
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1321, in _do_run
options, run_metadata)
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1340, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: Resource __per_step_6/_tensor_arraysfiltered_detections/map/TensorArray_0/N10tensorflow11TensorArrayE does not exist.
[[Node: training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3 = TensorArrayGradV3[_class=["loc:@filtered_detections/map/while/TensorArrayReadV3", "loc:@filtered_detections/map/while/TensorArrayReadV3/Enter"], source="training/Adam/gradients", _device="/job:localhost/replica:0/task:0/cpu:0"](training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3/Enter, training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3/Enter_1, ^training/Adam/gradients/Sub/_2469, ^_clooptraining/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayWrite/strided_slice/stack_2/_2315)]]

Caused by op 'training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3', defined at:
File "./keras-maskrcnn/keras_maskrcnn/bin/train.py", line 314, in <module>
main()
File "./keras-maskrcnn/keras_maskrcnn/bin/train.py", line 310, in main
max_queue_size=1,
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/keras/engine/training.py", line 1418, in fit_generator
initial_epoch=initial_epoch)
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/keras/engine/training_generator.py", line 40, in fit_generator
model._make_train_function()
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/keras/engine/training.py", line 509, in _make_train_function
loss=self.total_loss)
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/keras/optimizers.py", line 475, in get_updates
grads = self.get_gradients(loss, params)
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/keras/optimizers.py", line 89, in get_gradients
grads = K.gradients(loss, params)
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 2757, in gradients
return tf.gradients(loss, variables, colocate_gradients_with_ops=True)
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/tensorflow/python/ops/gradients_impl.py", line 542, in gradients
grad_scope, op, func_call, lambda: grad_fn(op, *out_grads))
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/tensorflow/python/ops/gradients_impl.py", line 348, in _MaybeCompile
return grad_fn() # Exit early
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/tensorflow/python/ops/gradients_impl.py", line 542, in <lambda>
grad_scope, op, func_call, lambda: grad_fn(op, *out_grads))
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/tensorflow/python/ops/tensor_array_grad.py", line 104, in _TensorArrayReadGrad
.grad(source=grad_source, flow=flow))
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/tensorflow/python/ops/tensor_array_ops.py", line 253, in grad
handle=self._handle, source=source, flow_in=flow, name=name)
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 2504, in _tensor_array_grad_v3
flow_in=flow_in, source=source, name=name)
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
op_def=op_def)
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2630, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1204, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

...which was originally created as op 'filtered_detections/map/while/TensorArrayReadV3', defined at:
File "./keras-maskrcnn/keras_maskrcnn/bin/train.py", line 314, in <module>
main()
File "./keras-maskrcnn/keras_maskrcnn/bin/train.py", line 288, in main
anchor_params=anchor_params
File "./keras-maskrcnn/keras_maskrcnn/bin/train.py", line 67, in create_models
anchor_params=anchor_params
File "./keras-maskrcnn/keras_maskrcnn/bin/../../keras_maskrcnn/models/resnet.py", line 30, in maskrcnn
return resnet_maskrcnn(*args, backbone=self.backbone, **kwargs)
File "./keras-maskrcnn/keras_maskrcnn/bin/../../keras_maskrcnn/models/resnet.py", line 51, in resnet_maskrcnn
model = retinanet.retinanet_mask(inputs=inputs, num_classes=num_classes, backbone_layers=resnet.outputs[1:], **kwargs)
File "./keras-maskrcnn/keras_maskrcnn/bin/../../keras_maskrcnn/models/retinanet.py", line 142, in retinanet_mask
)([boxes, classification] + other)
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/keras/engine/base_layer.py", line 457, in __call__
output = self.call(inputs, **kwargs)
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/keras_retinanet/layers/filter_detections.py", line 179, in call
parallel_iterations=self.parallel_iterations
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/keras_retinanet/backend/tensorflow_backend.py", line 35, in map_fn
return tensorflow.map_fn(*args, **kwargs)
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/tensorflow/python/ops/functional_ops.py", line 389, in map_fn
swap_memory=swap_memory)
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2775, in while_loop
result = context.BuildLoop(cond, body, loop_vars, shape_invariants)
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2604, in BuildLoop
pred, body, original_loop_vars, loop_vars, shape_invariants)
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2554, in _BuildLoop
body_result = body(*packed_vars_for_body)
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/tensorflow/python/ops/functional_ops.py", line 378, in compute
packed_values = input_pack([elem_ta.read(i) for elem_ta in elems_ta])
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/tensorflow/python/ops/functional_ops.py", line 378, in <listcomp>
packed_values = input_pack([elem_ta.read(i) for elem_ta in elems_ta])
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/tensorflow/python/util/tf_should_use.py", line 93, in fn
return method(self, *args, **kwargs)
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/tensorflow/python/ops/tensor_array_ops.py", line 280, in read
name=name)
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 2584, in _tensor_array_read_v3
name=name)
File "/media/felix/MongoDB/Train_Test/Knox/mask5/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
op_def=op_def)

NotFoundError (see above for traceback): Resource __per_step_6/_tensor_arraysfiltered_detections/map/TensorArray_0/N10tensorflow11TensorArrayE does not exist.
[[Node: training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3 = TensorArrayGradV3[_class=["loc:@filtered_detections/map/while/TensorArrayReadV3", "loc:@filtered_detections/map/while/TensorArrayReadV3/Enter"], source="training/Adam/gradients", _device="/job:localhost/replica:0/task:0/cpu:0"](training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3/Enter, training/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayGrad/TensorArrayGradV3/Enter_1, ^training/Adam/gradients/Sub/_2469, ^_clooptraining/Adam/gradients/filtered_detections/map/while/TensorArrayReadV3_grad/TensorArrayWrite/strided_slice/stack_2/_2315)]]
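
For reference, a `tf.ConfigProto` like the one in this comment only takes effect if it is passed to the session that Keras actually uses. The thread does not show how the config was wired in, so the following is only a minimal sketch assuming TensorFlow 1.x and standalone Keras with the TensorFlow backend. `SESSION_OPTIONS` and `install_session` are hypothetical names; `allow_soft_placement` and `log_device_placement` are existing `ConfigProto` fields.

```python
# Options forwarded to tf.ConfigProto (TensorFlow 1.x).
SESSION_OPTIONS = {
    # Let TensorFlow fall back to another device when the requested one
    # has no kernel for an op.
    "allow_soft_placement": True,
    # Set to True to log the device each op is placed on.
    "log_device_placement": False,
}


def install_session(options=None):
    """Create a TF 1.x session with the given options and register it with Keras.

    Assumption: TensorFlow 1.x and standalone Keras are installed. The imports
    live inside the function so the module can be loaded without them.
    """
    import tensorflow as tf
    from keras import backend as K

    if options is None:
        options = SESSION_OPTIONS
    config = tf.ConfigProto(**options)
    session = tf.Session(config=config)
    K.set_session(session)  # Keras will build and run its graph in this session
    return session
```

If used, `install_session()` would need to run before the model is created. Soft placement only changes where ops are allowed to run, so it may move the failure elsewhere rather than remove it, which is consistent with the different error reported above.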

@hgaiser
Contributor

hgaiser commented Nov 2, 2018

I've never seen this error before, but as you also mentioned, it looks like a tensorflow error to me.

Closing this for now. Feel free to reopen if you can show it is a fault of keras-maskrcnn.

@hgaiser hgaiser closed this as completed Nov 2, 2018
@felipk101
Author

@hgaiser just a note that this error occurs when using CUDA 8 and tensorflow 1.4. Once I installed CUDA 9 and cuDNN 7 it all worked. It would be useful to have a minimum requirements section in the readme.md specifying the CUDA and tensorflow requirements.
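
As a rough illustration of what a requirements check could look like, here is a minimal sketch that compares installed versions against the combination reported as working in this comment (CUDA 9, cuDNN 7). These are values from this thread, not official requirements, and another commenter reports the same error with CUDA 9 and cuDNN 7, so treat them as a starting point only. `REPORTED_WORKING`, `parse_version`, and `check_versions` are hypothetical names.

```python
# Major versions reported as working in this thread; NOT official requirements.
REPORTED_WORKING = {"cuda": (9,), "cudnn": (7,)}


def parse_version(text):
    """Turn a version string such as '9.0' or '7' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def check_versions(installed, minimum=REPORTED_WORKING):
    """Return the components whose installed version is below the minimum."""
    too_old = []
    for name, required in minimum.items():
        if parse_version(installed[name]) < required:
            too_old.append(name)
    return sorted(too_old)


print(check_versions({"cuda": "8.0", "cudnn": "6.0"}))  # ['cuda', 'cudnn']
print(check_versions({"cuda": "9.0", "cudnn": "7"}))    # []
```

The version strings have to be supplied by hand here; the sketch does not try to detect the installed CUDA or cuDNN version.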

@felipk101
Author

PS: Love the work.

@hgaiser
Contributor

hgaiser commented Nov 7, 2018

I see, thanks for letting us know what the error was. If you want you can make a PR to mention this requirement.

@DenceChen

@felipk101 I have the same error, and I have installed CUDA 9, cuDNN 7, and tensorflow-gpu 1.8.0. Do you have any suggestions?
