
out-of-bounds w.r.t. dense side with broadcasted shape issue during promise12 demo training #8

Closed
ladislav-urban opened this issue Oct 26, 2017 · 5 comments

@ladislav-urban

ladislav-urban commented Oct 26, 2017

When I run training with the default settings of the promise12 demo, the process fails with "out-of-bounds w.r.t. dense side with broadcasted shape" after a couple of hundred iterations.
The problem disappears if I switch off data augmentation:
#rotation_angle = (-10.0, 10.0)
#scaling_percentage = (-10.0, 10.0)
#random_flipping_axes= 1

Python libraries are installed as described in the requirements, with TensorFlow at version 1.3. The problem is present in the default branch of NiftyNet as of Oct 26, 2017, and was also seen a month ago.

This is the full log of the issue:
INFO:niftynet: iter 234, dice_loss=0.5082721710205078 (1.068s)
2017-10-26 10:57:30.556125: W tensorflow/core/framework/op_kernel.cc:1192] Invalid argument: Provided indices are out-of-bounds w.r.t. dense side with broadcasted shape
[[Node: worker_0/loss_function/mul = SparseDenseCwiseMul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](worker_0/loss_function/stack, worker_0/loss_function/ones_like, worker_0/loss_function/ToInt64_2, worker_0/loss_function/Softmax)]]
2017-10-26 10:57:30.556195: W tensorflow/core/framework/op_kernel.cc:1192] Invalid argument: Provided indices are out-of-bounds w.r.t. dense side with broadcasted shape
[[Node: worker_0/loss_function/mul = SparseDenseCwiseMul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](worker_0/loss_function/stack, worker_0/loss_function/ones_like, worker_0/loss_function/ToInt64_2, worker_0/loss_function/Softmax)]]
2017-10-26 10:57:30.556216: W tensorflow/core/framework/op_kernel.cc:1192] Invalid argument: Provided indices are out-of-bounds w.r.t. dense side with broadcasted shape
[[Node: worker_0/loss_function/mul = SparseDenseCwiseMul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](worker_0/loss_function/stack, worker_0/loss_function/ones_like, worker_0/loss_function/ToInt64_2, worker_0/loss_function/Softmax)]]
INFO:niftynet: Cleaning up...
INFO:niftynet: iter 235 saved: /home/ladislav/temp/promise12_model/models/model.ckpt
INFO:niftynet: stopping sampling threads
INFO:niftynet: SegmentationApplication stopped (time in second 1073.88).
Traceback (most recent call last):
File "/usr/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1327, in _do_call
return fn(*args)
File "/usr/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1306, in _run_fn
status, run_metadata)
File "/usr/lib64/python3.5/contextlib.py", line 66, in exit
next(self.gen)
File "/usr/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Provided indices are out-of-bounds w.r.t. dense side with broadcasted shape
[[Node: worker_0/loss_function/mul = SparseDenseCwiseMul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](worker_0/loss_function/stack, worker_0/loss_function/ones_like, worker_0/loss_function/ToInt64_2, worker_0/loss_function/Softmax)]]
[[Node: worker_0/gradients/AddN/_1901 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_6703_worker_0/gradients/AddN", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/ladislav/NiftyNet/net_segment.py", line 7, in
sys.exit(main())
File "/home/ladislav/NiftyNet/niftynet/init.py", line 83, in main
app_driver.run_application()
File "/home/ladislav/NiftyNet/niftynet/engine/application_driver.py", line 183, in run_application
self._training_loop(session, loop_status)
File "/home/ladislav/NiftyNet/niftynet/engine/application_driver.py", line 352, in _training_loop
graph_output = sess.run(vars_to_run)
File "/usr/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 895, in run
run_metadata_ptr)
File "/usr/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1124, in _run
feed_dict_tensor, options, run_metadata)
File "/usr/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1321, in _do_run
options, run_metadata)
File "/usr/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1340, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Provided indices are out-of-bounds w.r.t. dense side with broadcasted shape
[[Node: worker_0/loss_function/mul = SparseDenseCwiseMul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](worker_0/loss_function/stack, worker_0/loss_function/ones_like, worker_0/loss_function/ToInt64_2, worker_0/loss_function/Softmax)]]
[[Node: worker_0/gradients/AddN/_1901 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_6703_worker_0/gradients/AddN", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]

Caused by op 'worker_0/loss_function/mul', defined at:
File "/home/ladislav/NiftyNet/net_segment.py", line 7, in
sys.exit(main())
File "/home/ladislav/NiftyNet/niftynet/init.py", line 83, in main
app_driver.run_application()
File "/home/ladislav/NiftyNet/niftynet/engine/application_driver.py", line 169, in run_application
self.graph = self._create_graph(self.graph)
File "/home/ladislav/NiftyNet/niftynet/engine/application_driver.py", line 236, in _create_graph
self.gradients_collector)
File "/home/ladislav/NiftyNet/niftynet/application/segmentation_application.py", line 237, in connect_data_and_network
weight_map=data_dict.get('weight', None))
File "/home/ladislav/NiftyNet/niftynet/layer/base_layer.py", line 32, in call
return self._op(*args, **kwargs)
File "/usr/lib/python3.5/site-packages/tensorflow/python/ops/template.py", line 268, in call
result = self._call_func(args, kwargs, check_for_new_variables=False)
File "/usr/lib/python3.5/site-packages/tensorflow/python/ops/template.py", line 217, in _call_func
result = self._func(*args, **kwargs)
File "/home/ladislav/NiftyNet/niftynet/layer/loss_segmentation.py", line 61, in layer_op
pred, ground_truth, weight_map))
File "/home/ladislav/NiftyNet/niftynet/layer/loss_segmentation.py", line 343, in dice
dice_numerator = 2.0 * tf.sparse_reduce_sum(one_hot * prediction,
File "/usr/lib/python3.5/site-packages/tensorflow/python/ops/math_ops.py", line 876, in binary_op_wrapper_sparse
name=name), sp_x.dense_shape)
File "/usr/lib/python3.5/site-packages/tensorflow/python/ops/gen_sparse_ops.py", line 601, in sparse_dense_cwise_mul
dense=dense, name=name)
File "/usr/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
op_def=op_def)
File "/usr/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2630, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1204, in init
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): Provided indices are out-of-bounds w.r.t. dense side with broadcasted shape
[[Node: worker_0/loss_function/mul = SparseDenseCwiseMul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](worker_0/loss_function/stack, worker_0/loss_function/ones_like, worker_0/loss_function/ToInt64_2, worker_0/loss_function/Softmax)]]
[[Node: worker_0/gradients/AddN/_1901 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_6703_worker_0/gradients/AddN", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]

@wyli
Member

wyli commented Oct 26, 2017

Hi @ladislav-urban, many thanks for the feedback! This could happen when the largest voxel value in the discrete segmentation maps is greater than num_classes in the config file (since the voxel values are used as sparse binary map indices). Could you check whether that's the case?
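For example, something like the following would show the label values present in one of the segmentation volumes (a minimal sketch using SimpleITK; the file path is only illustrative). Every printed value should be smaller than num_classes:

import numpy as np
import SimpleITK as sitk

# Illustrative path -- point this at one of your segmentation volumes.
seg = sitk.GetArrayFromImage(sitk.ReadImage('TrainingData_Part1/Case00_segmentation.mhd'))
print(np.unique(seg))  # all values should lie in the range [0, num_classes - 1]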

@wyli added the question label Oct 26, 2017
@ladislav-urban
Author

Hi Wenqi Li,
thanks for your quick answer! I am using the default promise12 dataset. I have checked the segmentation files in the directories TrainingData_Part1 to TrainingData_Part3, i.e. the files Case00_segmentation.mhd through Case49_segmentation.mhd. They contain only the values 0 and 1, and the config file sets num_classes = 2.
Can higher voxel values be created by the data augmentation process?
Thanks a lot for your answer
Ladislav

@wyli
Member

wyli commented Oct 26, 2017

Yes, I think you're right: the interp_order parameter should be 0 for the [label] section, which means using nearest-neighbour interpolation during data augmentation. Thanks, I'll update the config files.
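In the demo config that would look roughly like this (the other [label] settings stay as they are in your config; the relevant change is interp_order):

[label]
# ... existing label source settings unchanged ...
interp_order = 0

As a rough illustration of why this matters (plain scipy here, not NiftyNet's actual augmentation code): higher-order interpolation of a binary label map produces values outside the original set {0, 1}, while nearest-neighbour interpolation keeps the labels intact.

import numpy as np
from scipy import ndimage

mask = np.zeros((20, 20), dtype=np.float32)
mask[5:15, 5:15] = 1.0                                     # binary label map, values {0, 1}

rot3 = ndimage.rotate(mask, 10.0, order=3, reshape=False)  # cubic spline, as used for images
print(rot3.min(), rot3.max())                              # overshoots below 0 and above 1

rot0 = ndimage.rotate(mask, 10.0, order=0, reshape=False)  # nearest neighbour
print(np.unique(rot0))                                     # still only 0.0 and 1.0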

@ladislav-urban
Author

Hi Wenqi Li,
thanks a lot for the advice! The training is now working without errors.

@qianjiangcn

I have been running into the same issue.
InvalidArgumentError (see above for traceback): Provided indices are out-of-bounds w.r.t. dense side with broadcasted shape
[[{{node worker_0/loss_function/map/while/mul}} = SparseDenseCwiseMul[T=DT_FLOAT, _class=["loc:@worke...tackPushV2"], _device="/job:localhost/replica:0/task:0/device:CPU:0"](worker_0/loss_function/map/while/SparseReshape, worker_0/loss_function/map/while/ones_like, worker_0/loss_function/map/while/SparseReshape:1, worker_0/loss_function/map/while/Softmax)]]

As stated above, this could happen when the largest voxel value in the discrete segmentation maps is greater than num_classes in the config file.
What do you mean by the largest voxel value in the discrete segmentation maps?
My spatial window is 32 and num_classes in the config is 160.

Thank you very much!
