weights can not be broadcast to values. values.rank=3. weights.rank=1. values.shape=(2, 448, 448). weights.shape=(2,). #8

Closed
JoshuaChou2018 opened this issue Nov 20, 2019 · 12 comments


@JoshuaChou2018

I got an error while training.

Epoch 1/500
2019-11-20 14:47:08.423134: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2019-11-20 14:47:17.904216: W tensorflow/stream_executor/cuda/redzone_allocator.cc:312] Not found: ./bin/ptxas not found
Relying on driver to perform ptx compilation. This message will be only logged once.
2019-11-20 14:47:24.653919: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
weights can not be broadcast to values. values.rank=3. weights.rank=1. values.shape=(2, 448, 448). weights.shape=(2,).
Traceback (most recent call last):
  File "/home/zhouj0d/.conda/envs/mp/bin/mp", line 11, in <module>
    load_entry_point('MultiPlanarUNet==0.2.3', 'console_scripts', 'mp')()
  File "/home/zhouj0d/.conda/envs/mp/lib/python3.6/site-packages/MultiPlanarUNet-0.2.3-py3.6.egg/MultiPlanarUNet/bin/mp.py", line 55, in entry_func
    mod.entry_func(parsed.args)
  File "/home/zhouj0d/.conda/envs/mp/lib/python3.6/site-packages/MultiPlanarUNet-0.2.3-py3.6.egg/MultiPlanarUNet/bin/train.py", line 398, in entry_func
    raise e
  File "/home/zhouj0d/.conda/envs/mp/lib/python3.6/site-packages/MultiPlanarUNet-0.2.3-py3.6.egg/MultiPlanarUNet/bin/train.py", line 394, in entry_func
    run(project_dir=project_dir, gpu_mon=gpu_mon, logger=logger, args=args)
  File "/home/zhouj0d/.conda/envs/mp/lib/python3.6/site-packages/MultiPlanarUNet-0.2.3-py3.6.egg/MultiPlanarUNet/bin/train.py", line 358, in run
    hparams=hparams, no_im=args.no_images, **hparams["fit"])
  File "/home/zhouj0d/.conda/envs/mp/lib/python3.6/site-packages/MultiPlanarUNet-0.2.3-py3.6.egg/MultiPlanarUNet/train/trainer.py", line 111, in fit
    raise e
  File "/home/zhouj0d/.conda/envs/mp/lib/python3.6/site-packages/MultiPlanarUNet-0.2.3-py3.6.egg/MultiPlanarUNet/train/trainer.py", line 96, in fit
    val_ignore_class_zero)
  File "/home/zhouj0d/.conda/envs/mp/lib/python3.6/site-packages/MultiPlanarUNet-0.2.3-py3.6.egg/MultiPlanarUNet/train/trainer.py", line 204, in _fit_loop
    self.model.fit_generator(**fit_kwargs)
  File "/home/zhouj0d/.conda/envs/mp/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 1297, in fit_generator
    steps_name='steps_per_epoch')
  File "/home/zhouj0d/.conda/envs/mp/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_generator.py", line 265, in model_iteration
    batch_outs = batch_function(*batch_data)
  File "/home/zhouj0d/.conda/envs/mp/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 973, in train_on_batch
    class_weight=class_weight, reset_metrics=reset_metrics)
  File "/home/zhouj0d/.conda/envs/mp/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py", line 264, in train_on_batch
    output_loss_metrics=model._output_loss_metrics)
  File "/home/zhouj0d/.conda/envs/mp/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_eager.py", line 311, in train_on_batch
    output_loss_metrics=output_loss_metrics))
  File "/home/zhouj0d/.conda/envs/mp/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_eager.py", line 252, in _process_single_batch
    training=training))
  File "/home/zhouj0d/.conda/envs/mp/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_eager.py", line 170, in _model_loss
    reduction=losses_utils.ReductionV2.NONE)
  File "/home/zhouj0d/.conda/envs/mp/lib/python3.6/site-packages/tensorflow_core/python/keras/utils/losses_utils.py", line 107, in compute_weighted_loss
    losses, sample_weight)
  File "/home/zhouj0d/.conda/envs/mp/lib/python3.6/site-packages/tensorflow_core/python/ops/losses/util.py", line 148, in scale_losses_by_sample_weight
    sample_weight = weights_broadcast_ops.broadcast_weights(sample_weight, losses)
  File "/home/zhouj0d/.conda/envs/mp/lib/python3.6/site-packages/tensorflow_core/python/ops/weights_broadcast_ops.py", line 167, in broadcast_weights
    with ops.control_dependencies((assert_broadcastable(weights, values),)):
  File "/home/zhouj0d/.conda/envs/mp/lib/python3.6/site-packages/tensorflow_core/python/ops/weights_broadcast_ops.py", line 103, in assert_broadcastable
    weights_rank_static, values.shape, weights.shape))
ValueError: weights can not be broadcast to values. values.rank=3. weights.rank=1. values.shape=(2, 448, 448). weights.shape=(2,).
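
For context, the message comes from the limited broadcasting rule TensorFlow applies to loss and metric weights: weights must be either a scalar or have the same rank as the values, with each dimension equal to 1 or to the matching values dimension. The sketch below restates that rule in plain Python so the shapes from the error can be checked by hand. The function names are hypothetical and are not part of TensorFlow or MultiPlanarUNet, and the rank-matched shape is only an illustration of what would pass the check, not necessarily how the package resolved the problem.

```python
def weights_broadcastable(values_shape, weights_shape):
    """Restate TensorFlow's limited weight-broadcasting rule on plain tuples.

    Hypothetical helper for illustration only; it is not a TensorFlow API.
    """
    # Scalar weights (rank 0) can always be broadcast.
    if len(weights_shape) == 0:
        return True
    # Otherwise the ranks must match exactly.
    if len(weights_shape) != len(values_shape):
        return False
    # Each weights dimension must be 1 or equal to the values dimension.
    return all(w == 1 or w == v for v, w in zip(values_shape, weights_shape))


def rank_matched_shape(values_shape):
    """Shape that per-sample weights need in order to match the rank of the values."""
    return (values_shape[0],) + (1,) * (len(values_shape) - 1)


if __name__ == "__main__":
    values_shape = (2, 448, 448)   # per-pixel losses, as in the error message
    weights_shape = (2,)           # one weight per sample, as in the error message
    print(weights_broadcastable(values_shape, weights_shape))
    print(weights_broadcastable(values_shape, rank_matched_shape(values_shape)))
```

Under this rule, shape `(2,)` fails against `(2, 448, 448)` because the ranks differ (1 versus 3), whereas `(2, 1, 1)` would satisfy the check.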

@perslev
Owner

perslev commented Nov 20, 2019

Hi,

Which version of MultiPlanarUNet are you using?
And which TensorFlow version?

Cheers,
Mathias

@JoshuaChou2018
Author

JoshuaChou2018 commented Nov 20, 2019

Hi,

Multi-Planar UNet (0.2.3)
TensorFlow-gpu 1.14.0

Thank you!

@JoshuaChou2018
Author

I have also tried other versions of tensorflow-gpu, and the same error occurred.

@perslev
Owner

perslev commented Nov 22, 2019

Okay thanks! I will look into it as soon as possible.

@perslev
Owner

perslev commented Dec 5, 2019

Sorry, I still have not had time to look into this. Did you solve the issue or find a workaround in the meantime?

@mcastrorennes

Hi,

I also ran into the same error using the Task02_heart dataset, and I get the same error with my own database.

I use:
MultiPlanarUNet (0.2.3)
tensorflow-gpu (1.15.0)

Do you have a solution?

Thanks in advance.

@mcastrorennes

I get the same error when I run mp train --just_one --overwrite on a toy project with data generated using mp toy_data --out_dir ./toy_data

@perslev
Owner

perslev commented Feb 3, 2020

Hi,

Thank you for your feedback.

I will be working on this project this week and expect to have a solution to your problem no later than Friday.

Cheers,
Mathias

@arashmh

arashmh commented Feb 4, 2020

Hi
I get the same error in the first iteration:
ValueError: weights can not be broadcast to values. values.rank=3. weights.rank=1. values.shape=(?, 304, 304). weights.shape=(?,).

@perslev
Owner

perslev commented Feb 7, 2020

Hi!

Would you mind trying out the newest version (0.2.4) to see if it fixes the issue on your end?
Note that version 0.2.4 and later only support TensorFlow >=2.0. The newest version also slightly updated the package requirements, so I suggest you reinstall it with pip.

Sorry for the trouble! Please let me know if you still have the problem.
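
Before reinstalling, it may help to confirm that the environment meets the TensorFlow >=2.0 requirement mentioned above. The following is a minimal sketch of such a check; the helper names are hypothetical and are not part of the package, and the version strings in the loop are simply the ones reported earlier in this thread.

```python
import re


def parse_version(version):
    """Return the leading numeric components of a version string as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(version, minimum="2.0"):
    """Check whether `version` is at least `minimum` (hypothetical helper)."""
    have, need = parse_version(version), parse_version(minimum)
    # Pad with zeros so that "2" and "2.0.0" compare as equal.
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


def installed_tensorflow_version():
    """Return the installed TensorFlow version string, or None if it is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return tf.__version__


if __name__ == "__main__":
    # Versions reported in this thread, plus the stated minimum.
    for reported in ("1.14.0", "1.15.0", "2.0.0"):
        print(reported, meets_minimum(reported))
    current = installed_tensorflow_version()
    if current is not None:
        print("installed:", current, meets_minimum(current))
```

The comparison only looks at the leading numeric part of the version string, so pre-release suffixes are ignored; that is a simplification chosen for this sketch.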

@perslev
Owner

perslev commented Feb 7, 2020

Hi,
It was only just uploaded, so it may take a few minutes to appear, I believe. Otherwise, you may clone it from GitHub and install it directly, if you prefer.
Also, you might want to uninstall the old 'MultiPlanarUNet' package to make sure you are running the right one, once you get the new (renamed) version installed.

@arashmh

arashmh commented Feb 7, 2020

Thanks! You were right. I installed it successfully via pip and am now training. No errors related to weight allocation or anything else.

Have a good weekend.

@perslev perslev closed this as completed Feb 10, 2020