
Fixing 'Incompatible Shapes' error #3

Closed
boycejam opened this issue Dec 7, 2020 · 6 comments

@boycejam commented Dec 7, 2020

I am trying to train this model on my own data. I got it to work with my own data before, but this time I wanted my images semantically segmented beforehand, so I used a different model to do that. I suspect that in doing so I changed my environment enough to start getting this error, because I highly doubt it's an issue with the new images; they are the same size as the previous ones. I have reinstalled the packages from requirements.txt and am still getting this issue. I have posted the error below. Any help on what the problem might be and how to fix it would be greatly appreciated. Thank you in advance!

```
Starting training...
Train for 200 steps, validate for 1169 steps
Epoch 1/100
2020-12-06 20:35:50.965473: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Invalid argument: Incompatible shapes: [3] vs. [256,512,4]
[[{{node Equal_29}}]]
[[IteratorGetNext]]
2020-12-06 20:35:50.987892: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Invalid argument: Incompatible shapes: [3] vs. [256,512,4]
[[{{node Equal_29}}]]
[[IteratorGetNext]]
[[metrics/mean_io_u_with_one_hot_labels/StatefulPartitionedCall/confusion_matrix/assert_non_negative/assert_less_equal/Assert/AssertGuard/else/_6/Assert/data_1/_20]]
1/200 [..............................] - ETA: 35:21
WARNING:tensorflow:Can save best model only with val_mean_io_u_with_one_hot_labels available, skipping.
WARNING:tensorflow:Early stopping conditioned on metric val_mean_io_u_with_one_hot_labels which is not available. Available metrics are:
Traceback (most recent call last):
File "./train.py", line 185, in
callbacks=callbacks)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training.py", line 819, in fit
use_multiprocessing=use_multiprocessing)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 342, in fit
total_epochs=epochs)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 128, in run_one_epoch
batch_outs = execution_function(iterator)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py", line 98, in execution_function
distributed_function(input_fn))
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/eager/def_function.py", line 568, in call
result = self._call(*args, **kwds)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/eager/def_function.py", line 632, in _call
return self._stateless_fn(*args, **kwds)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/eager/function.py", line 2363, in call
return graph_function._filtered_call(args, kwargs) # pylint: disable=protected-access
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/eager/function.py", line 1611, in _filtered_call
self.captured_inputs)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/eager/function.py", line 1692, in _call_flat
ctx, args, cancellation_manager=cancellation_manager))
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/eager/function.py", line 545, in call
ctx=ctx)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/eager/execute.py", line 67, in quick_execute
six.raise_from(core._status_to_exception(e.code, message), None)
File "", line 3, in raise_from
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [3] vs. [256,512,4]
[[{{node Equal_29}}]]
[[IteratorGetNext]] [Op:__inference_distributed_function_17137]
Function call stack:
distributed_function
```

@lreiher (Member) commented Dec 7, 2020

Hm, not sure how changing parts of the environment would lead to this error.

From the [256,512,4] shape I'm assuming that your images have 4 semantic classes/colors defined in your one-hot conversion file (cf. model/one_hot_conversion). Perhaps verify that this is correct.

The error seems to come up during IoU calculation, see MeanIoUWithOneHotLabels. Judging by the order of the arguments y_true and y_pred, I would assume that the faulty shape [3] is related to y_true, i.e. to the labels that you provide. Thus the first thing that I would examine is whether the one-hot encoding of the label images works correctly. You may execute these 3 lines (load, resize, encode) manually for one label image and check if the output shape is [256, 512, 4].
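For example, a minimal sketch of that check using plain TensorFlow ops (file name and palette are placeholders; in practice you would call the corresponding load/resize/encode functions in model/utils.py):

```python
import tensorflow as tf

# Placeholder 4-color palette; use the colors from your one-hot conversion file.
palette = tf.constant([[128,  64, 128],   # road
                       [244,  35, 232],   # sidewalk
                       [  0,   0, 142],   # vehicle
                       [150, 150, 150]],  # occluded
                      dtype=tf.uint8)

img = tf.io.read_file("label.png")                        # placeholder file name
img = tf.image.decode_png(img, channels=3)                # force RGB
img = tf.image.resize(img, (256, 512), method="nearest")  # nearest neighbor keeps label colors exact
img = tf.cast(img, tf.uint8)

# One channel per class: True where a pixel matches that class color.
one_hot = tf.stack([tf.reduce_all(tf.equal(img, c), axis=-1)
                    for c in tf.unstack(palette)], axis=-1)

print(one_hot.shape)  # should print (256, 512, 4)
```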

@rakshitagouni

When I print out the label image shape, I get (256, 512, 5). In my one_hot_conversion file:

```xml

<SLabel Name="road"             fromColour="128  64 128"    toValue="1" />

<SLabel Name="sidewalk"         fromColour="244  35 232"    toValue="2" />

<SLabel Name="obstacle"         fromColour="  0   0   0"    toValue="3" /> <!-- static -->
<SLabel Name="person"           fromColour="255   0   0"    toValue="3" />
<SLabel Name="parking"          fromColour="250 170 160"    toValue="3" />
<SLabel Name="dynamic"          fromColour="111  74   0"    toValue="3" />
<SLabel Name="ground"           fromColour=" 81   0  81"    toValue="3" />
<SLabel Name="rail track"       fromColour="230 150 140"    toValue="3" />
<SLabel Name="building"         fromColour=" 70  70  70"    toValue="3" />
<SLabel Name="wall"             fromColour="102 102 156"    toValue="3" />
<SLabel Name="fence"            fromColour="190 153 153"    toValue="3" />
<SLabel Name="guard rail"       fromColour="180 165 180"    toValue="3" />
<SLabel Name="bridge"           fromColour="150 100 100"    toValue="3" />
<SLabel Name="tunnel"           fromColour="150 120  90"    toValue="3" />
<SLabel Name="pole"             fromColour="153 153 153"    toValue="3" />
<SLabel Name="polegroup"        fromColour="153 153 153"    toValue="3" />
<SLabel Name="traffic light"    fromColour="250 170  30"    toValue="3" />
<SLabel Name="traffic sign"     fromColour="220 220   0"    toValue="3" />
<SLabel Name="train"            fromColour="  0  80 100"    toValue="3" />
<SLabel Name="vehicle"          fromColour="  0   0 142"    toValue="3" />
<SLabel Name="trailer"          fromColour="  0   0 110"    toValue="3" />
<SLabel Name="truck"            fromColour="  0   0  70"    toValue="3" />
<SLabel Name="bus"              fromColour="  0  60 100"    toValue="3" />
<SLabel Name="caravan"          fromColour="  0   0  90"    toValue="3" />
<SLabel Name="rider"            fromColour="220  20  60"    toValue="3" />
<SLabel Name="motorcycle"       fromColour="  0   0 230"    toValue="3" />
<SLabel Name="bicycle"          fromColour="119  11  32"    toValue="3" />
<SLabel Name="sky"              fromColour=" 70 130 180"    toValue="3" />

<SLabel Name="vegetation"       fromColour="107 142  35"    toValue="4" />
<SLabel Name="terrain"          fromColour="152 251 152"    toValue="4" />

<SLabel Name="occluded"         fromColour="150 150 150"    toValue="5" />
```

I still don't understand where the [3] is coming from.

So to match the [256, 512, 4] shape, I made a new one-hot conversion file with only 4 semantic classes, and now I get the following error:

```
Traceback (most recent call last):
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 1619, in _create_c_op
c_op = c_api.TF_FinishOperation(op_desc)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Dimensions must be equal, but are 4 and 5 for 'loss/activation_loss/mul_1' (op: 'Mul') with input shapes: [5,256,512,4], [5].

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "./train.py", line 187, in
callbacks=callbacks)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training.py", line 819, in fit
use_multiprocessing=use_multiprocessing)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 342, in fit
total_epochs=epochs)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 128, in run_one_epoch
batch_outs = execution_function(iterator)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py", line 98, in execution_function
distributed_function(input_fn))
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/eager/def_function.py", line 568, in call
result = self._call(*args, **kwds)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/eager/def_function.py", line 615, in _call
self._initialize(args, kwds, add_initializers_to=initializers)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/eager/def_function.py", line 497, in _initialize
*args, **kwds))
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/eager/function.py", line 2389, in _get_concrete_function_internal_garbage_collected
graph_function, _, _ = self._maybe_define_function(args, kwargs)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/eager/function.py", line 2703, in _maybe_define_function
graph_function = self._create_graph_function(args, kwargs)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/eager/function.py", line 2593, in _create_graph_function
capture_by_value=self._capture_by_value),
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/framework/func_graph.py", line 978, in func_graph_from_py_func
func_outputs = python_func(*func_args, **func_kwargs)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/eager/def_function.py", line 439, in wrapped_fn
return weak_wrapped_fn().wrapped(*args, **kwds)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py", line 85, in distributed_function
per_replica_function, args=args)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/distribute/distribute_lib.py", line 763, in experimental_run_v2
return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/distribute/distribute_lib.py", line 1819, in call_for_each_replica
return self._call_for_each_replica(fn, args, kwargs)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/distribute/distribute_lib.py", line 2164, in _call_for_each_replica
return fn(*args, **kwargs)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/autograph/impl/api.py", line 292, in wrapper
return func(*args, **kwargs)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py", line 433, in train_on_batch
output_loss_metrics=model._output_loss_metrics)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_eager.py", line 312, in train_on_batch
output_loss_metrics=output_loss_metrics))
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_eager.py", line 253, in _process_single_batch
training=training))
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_eager.py", line 167, in _model_loss
per_sample_losses = loss_fn.call(targets[i], outs[i])
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/keras/losses.py", line 221, in call
return self.fn(y_true, y_pred, **self._fn_kwargs)
File "/home/techlab_grizzly/Desktop/Cam2BEV/model/utils.py", line 248, in wcce
return tf.keras.backend.categorical_crossentropy(y_true, y_pred) * tf.keras.backend.sum(y_true * Kweights, axis=-1)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/ops/math_ops.py", line 902, in binary_op_wrapper
return func(x, y, name=name)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/ops/math_ops.py", line 1201, in _mul_dispatch
return gen_math_ops.mul(x, y, name=name)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/ops/gen_math_ops.py", line 6125, in mul
"Mul", x=x, y=y, name=name)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/framework/op_def_library.py", line 742, in _apply_op_helper
attrs=attr_protos, op_def=op_def)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/framework/func_graph.py", line 595, in _create_op_internal
compute_device)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 3322, in _create_op_internal
op_def=op_def)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 1786, in init
control_input_ops)
File "/home/techlab_grizzly/Desktop/Cam2BEV/env37/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 1622, in _create_c_op
raise ValueError(str(e))
ValueError: Dimensions must be equal, but are 4 and 5 for 'loss/activation_loss/mul_1' (op: 'Mul') with input shapes: [5,256,512,4], [5].
```
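The failing line is the weighted cross-entropy in model/utils.py; isolating just that multiply reproduces the mismatch (weight values hypothetical):

```python
import tensorflow as tf

y_true = tf.zeros([5, 256, 512, 4])                # batch of one-hot labels, 4 classes
Kweights = tf.constant([1.0, 1.0, 1.0, 1.0, 1.0])  # weight vector with 5 entries (hypothetical values)

# Broadcasting aligns the last dimensions, so 4 and 5 are incompatible.
y_true * Kweights  # raises InvalidArgumentError: Incompatible shapes
```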

Now, I don't understand where I'm getting the dimension 5 from.

Any help is appreciated! Thank you!

@lreiher (Member) commented Dec 7, 2020

I can have a look if you upload a single input/label pair and the config and one-hot-conversion files that you use.

Please paste error messages as code to make them readable.

@rakshitagouni

https://github.com/boycejam/Our_BEV_Files

Let me know if there's anything else that I'm missing.

boycejam closed this as completed Dec 7, 2020
boycejam reopened this Dec 7, 2020
@boycejam (Author) commented Dec 7, 2020

I accidentally closed this issue so just commenting again to make sure you can see the last comment:

https://github.com/boycejam/Our_BEV_Files

Above is the link to a repo I made containing the files you mentioned. Let me know if there is anything else you need.

@lreiher (Member) commented Dec 7, 2020

The problem is that your input image has a fourth alpha channel, so the resized image has shape (256, 512, 4). This causes the crash during one-hot encoding.

I will push a fix tomorrow so that images are always loaded as RGB instead of RGBA, even if an alpha channel is present. In the meantime, you can fix it yourself by replacing utils.py#L77 with

```python
img = tf.image.decode_png(img, channels=3)
```
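For reference, channels=3 makes the decoder drop an alpha channel if one is present (file name hypothetical):

```python
import tensorflow as tf

raw = tf.io.read_file("front.png")                 # hypothetical RGBA input image
print(tf.image.decode_png(raw).shape)              # (H, W, 4) for an RGBA PNG
print(tf.image.decode_png(raw, channels=3).shape)  # always (H, W, 3)
```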

Some more notes on your files:

  • The standard implementation expects semantically segmented input and output images, which are then one-hot en-/decoded as part of the pipeline. Your images are a blend of the real-world image and the semantic segmentation; one-hot en-/decoding will not work properly this way.
  • Your input image color-codes vehicles in a purple-ish way, but the standard 0,0,142 (RGB) blue is what is listed in convert_10.xml. You need to check the colors you specify there.
  • Your label image has shape (640, 480), while your input image has shape (480, 640). Keep in mind that both will be center-cropped/resized to (256, 512); a quick check is sketched below.
  • It's important that you provide a good estimate of the homography matrix. Just mentioning this, as I couldn't have a look at your homography file.
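As a quick check on the shape note above (file names hypothetical):

```python
import tensorflow as tf

for path in ["input.png", "label.png"]:  # hypothetical file names
    img = tf.image.decode_png(tf.io.read_file(path), channels=3)
    print(path, img.shape)  # input and label should agree on (height, width)
```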
