Error with using a custom dataset and image_util.ImageDataProvider() #6

JackBurdick opened this issue Dec 4, 2016 · 13 comments
@JackBurdick

Hey!

Thank you for your help and for the updates. Unfortunately, we're still having trouble using a custom dataset and we're hoping you can help us.

The main error produced is this:

....
tensorflow.python.framework.errors.InvalidArgumentError: logits and labels must be same size: logits_size=[709520,2] labels_size=[710500,2]
	 [[Node: SoftmaxCrossEntropyWithLogits = SoftmaxCrossEntropyWithLogits[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](Reshape_29, Reshape_30)]]
....

We're working with RGB images of size 767x1022 and using the data provider like this (everything here seems to be working as expected):
data_provider = image_util.ImageDataProvider('reduced_segmentation_dataset/*', data_suffix=".jpg", mask_suffix='_mask.png')

The other thing we noticed is that when we call path = trainer.train(data_provider, "./skin_trained", training_iters=10, epochs=4, display_step=2), it defaults to calling test_x, test_y = data_provider(4) on line 367 of master/tf_unet/unet.py. If we call data_provider(1) instead, we seem to get the results we expect and bypass the error, but the error above still prevents a full run.

Do you have any ideas why we're having this mismatch? I'd be happy to provide more information as needed.

@jakeret
Owner

jakeret commented Dec 4, 2016

Based on the information provided, I have to guess a bit. Are you sure that the data and mask images have the same dimensions?

What is the error you are getting for "the other thing"?
For testing purposes: can you create an instance of the ImageDataProvider as posted above and call test_x, test_y = data_provider(4)? What are the shapes of test_x and test_y?
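
One quick way to answer the dimension question is to compare the size of every image with the size of its mask. The sketch below is only illustrative: `find_mismatched_pairs` and `pil_size` are hypothetical helper names, not part of tf_unet, and the suffixes are the ones from the first post. It assumes that a mask is named by swapping the data suffix for the mask suffix.

```python
import glob


def pil_size(path):
    """Return (width, height) of an image file using Pillow."""
    from PIL import Image  # imported lazily so the checker has no hard dependency
    with Image.open(path) as img:
        return img.size


def find_mismatched_pairs(data_paths, data_suffix=".jpg", mask_suffix="_mask.png",
                          get_size=pil_size):
    """Return (image, mask, image_size, mask_size) for every pair whose sizes differ."""
    mismatches = []
    for data_path in data_paths:
        # Assumed naming scheme: "name.jpg" pairs with "name_mask.png".
        mask_path = data_path[:-len(data_suffix)] + mask_suffix
        data_size, mask_size = get_size(data_path), get_size(mask_path)
        if data_size != mask_size:
            mismatches.append((data_path, mask_path, data_size, mask_size))
    return mismatches


if __name__ == "__main__":
    images = sorted(glob.glob("reduced_segmentation_dataset/*.jpg"))
    for row in find_mismatched_pairs(images):
        print(row)
```

An empty result means every pair agrees, in which case the mismatch has to come from somewhere after loading.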

@JackBurdick
Author

We double-checked and all images have the same dimensions.

If we run test_x, test_y = data_provider(4) we get the following:

>>> print(test_x.shape)
(4, 767, 1022, 3)
>>> print(test_y.shape)
(4, 767, 1022, 2)
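
These shapes make it possible to check the numbers in the error message by hand. The sketch below assumes the network uses unpadded 3x3 convolutions (two per level), 2x2 max pooling that rounds down, and 2x up-sampling on the way back up. That layout is an assumption, not something confirmed in this thread, and `unet_output_size` is a hypothetical helper. Under those assumptions a 767x1022 input gives a 724x980 output, and 724 * 980 = 709520 is exactly the logits_size in the error, while the labels_size of 710500 equals 725 * 980. The labels therefore appear to be one row taller than the logits.

```python
def unet_output_size(size, layers=3, filter_size=3, pool_size=2):
    """Spatial size after a U-Net with unpadded convolutions (assumed layout)."""
    shrink = 2 * (filter_size - 1)  # two unpadded convolutions per level
    # Contracting path: convolutions on every level, pooling between levels.
    for level in range(layers):
        size -= shrink
        if level < layers - 1:
            size //= pool_size  # max pooling rounds odd sizes down
    # Expanding path: up-sample, then two more unpadded convolutions.
    for _ in range(layers - 1):
        size = size * pool_size - shrink
    return size


height, width = unet_output_size(767), unet_output_size(1022)
print(height, width, height * width)  # 724 980 709520
```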

@JackBurdick
Author

We assumed the data_provider(4) call on line 367 of master/tf_unet/unet.py was left over from the toy demo. We can revert and see what the error is if needed, though.

@jakeret
Owner

jakeret commented Dec 4, 2016

No, this line is actually important. That's where I load the dataset used for verification.
I just checked in an update where I replaced the magic number with a constant to make it more explicit.

@JackBurdick
Author

OK, we've reverted and updated, thank you. If we follow this example, where should data come from? I think our logic must be wrong.

@jakeret
Owner

jakeret commented Dec 4, 2016

That's the data you want to run the prediction/test on.

@JackBurdick
Author

Right, OK. So if we try this (not sure we're using it correctly):

data_provider = image_util.ImageDataProvider('reduced_segmentation_dataset/*', data_suffix=".jpg", mask_suffix='_Segmentation.png')

net = unet.Unet(channels=3, n_class=2, layers=3, features_root=64)
trainer = unet.Trainer(net, optimizer="momentum", opt_kwargs=dict(momentum=0.2))

path = trainer.train(data_provider, output_path, training_iters=16, epochs=4)

prediction = net.predict(path, data_provider)

we get this output:

Number of files used: 14
Layers 3, features 64, filter size 3x3, pool size: 2x2
Removing '/Users/MrBurdick/Documents/test_unet/prediction'
Removing '/Users/MrBurdick/Documents/test_unet/jack_trained'
Allocating '/Users/MrBurdick/Documents/test_unet/prediction'
Allocating '/Users/MrBurdick/Documents/test_unet/jack_trained'
Traceback (most recent call last):
  File "/Users/MrBurdick/anaconda/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 965, in _do_call
    return fn(*args)
  File "/Users/MrBurdick/anaconda/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 947, in _run_fn
    status, run_metadata)
  File "/Users/MrBurdick/anaconda/lib/python3.5/contextlib.py", line 66, in __exit__
    next(self.gen)
  File "/Users/MrBurdick/anaconda/lib/python3.5/site-packages/tensorflow/python/framework/errors.py", line 450, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors.InvalidArgumentError: logits and labels must be same size: logits_size=[2838080,2] labels_size=[2842000,2]
	 [[Node: SoftmaxCrossEntropyWithLogits = SoftmaxCrossEntropyWithLogits[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](Reshape_29, Reshape_30)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "skin2.py", line 13, in <module>
    path = trainer.train(data_provider, output_path, training_iters=16, epochs=4)
  File "/Users/MrBurdick/.local/lib/python3.5/site-packages/tf_unet-0.1.0-py3.5.egg/tf_unet/unet.py", line 368, in train
    pred_shape = self.store_prediction(sess, test_x, test_y, "_init")
  File "/Users/MrBurdick/.local/lib/python3.5/site-packages/tf_unet-0.1.0-py3.5.egg/tf_unet/unet.py", line 415, in store_prediction
    self.net.keep_prob: 1.})
  File "/Users/MrBurdick/anaconda/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 710, in run
    run_metadata_ptr)
  File "/Users/MrBurdick/anaconda/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 908, in _run
    feed_dict_string, options, run_metadata)
  File "/Users/MrBurdick/anaconda/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 958, in _do_run
    target_list, options, run_metadata)
  File "/Users/MrBurdick/anaconda/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 978, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors.InvalidArgumentError: logits and labels must be same size: logits_size=[2838080,2] labels_size=[2842000,2]
	 [[Node: SoftmaxCrossEntropyWithLogits = SoftmaxCrossEntropyWithLogits[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](Reshape_29, Reshape_30)]]
Caused by op 'SoftmaxCrossEntropyWithLogits', defined at:
  File "skin2.py", line 10, in <module>
    net = unet.Unet(channels=3, n_class=2, layers=3, features_root=64)
  File "/Users/MrBurdick/.local/lib/python3.5/site-packages/tf_unet-0.1.0-py3.5.egg/tf_unet/unet.py", line 198, in __init__
    tf.reshape(self.y, [-1, n_class])))
  File "/Users/MrBurdick/anaconda/lib/python3.5/site-packages/tensorflow/python/ops/nn_ops.py", line 491, in softmax_cross_entropy_with_logits
    precise_logits, labels, name=name)
  File "/Users/MrBurdick/anaconda/lib/python3.5/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 1427, in _softmax_cross_entropy_with_logits
    features=features, labels=labels, name=name)
  File "/Users/MrBurdick/anaconda/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 703, in apply_op
    op_def=op_def)
  File "/Users/MrBurdick/anaconda/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2317, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/Users/MrBurdick/anaconda/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1239, in __init__
    self._traceback = _extract_stack()

@JackBurdick
Author

We also tried:

path = trainer.train(data_provider, output_path, training_iters=16, epochs=4)

data, _ = data_provider(1)

prediction = net.predict(path, data)

but were met with what looks like the same error.

@jakeret
Owner

jakeret commented Dec 4, 2016

Hard to tell from this. Can you provide the code and the data so that it can be reproduced?

Apart from that, the line prediction = net.predict(path, data_provider) is not going to work.

@jakeret
Owner

jakeret commented Dec 5, 2016

It seems there is a bug in the util.crop_to_shape function when the image size contains an odd number.

One way around this in the meantime is to resize your input:

from PIL import Image
import numpy as np
from tf_unet import image_util

class SkinImageDataProvider(image_util.ImageDataProvider):

    def _load_file(self, path, dtype=np.float32):
        # Resize to even dimensions; PIL's resize takes (width, height).
        img = Image.open(path)
        return np.array(img.resize((500, 376)), dtype)

You will, of course, lose information, but the learning process is going to be much faster.
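
The odd-size problem can be illustrated without TensorFlow. The sketch below is a guess at the failure mode, not the actual util.crop_to_shape code: it assumes the crop trims the same number of pixels from both ends with a slice of the form `[offset:-offset]`. When the difference between input and target is odd, that slice comes out one element too long. The names `symmetric_crop_length` and `exact_crop_length` are hypothetical. The values 724 and 980 are the per-image height and width implied by the error above for a batch of 4, since 4 * 724 * 980 = 2838080 (logits) and 4 * 725 * 980 = 2842000 (labels).

```python
def symmetric_crop_length(size, target):
    """Length left after trimming `offset` items from both ends (assumed buggy form)."""
    offset = (size - target) // 2
    return len(range(size)[offset:-offset])


def exact_crop_length(size, target):
    """Length left when the slice end is computed from the target instead."""
    offset = (size - target) // 2
    return len(range(size)[offset:offset + target])


# 767 rows cropped towards a 724-row prediction: the difference (43) is odd.
print(symmetric_crop_length(767, 724))  # 725, one row too many
print(exact_crop_length(767, 724))      # 724
# 1022 columns cropped towards 980: the difference (42) is even, so both agree.
print(symmetric_crop_length(1022, 980), exact_crop_length(1022, 980))  # 980 980
```

Resizing to even dimensions, as suggested above, sidesteps the odd difference rather than fixing the slice itself.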

@gingerly

Hi,
Thanks for pointing out the bug. I got a different error after using your workaround:

/DLRecon/tf_unet-master/tf_unet/util.py in combine_img_prediction(data, gt, pred)
    101     ch = data.shape[3]
    102     img = np.concatenate((to_rgb(crop_to_shape(data, pred.shape).reshape(-1, ny, ch)),
--> 103                           to_rgb(crop_to_shape(gt[..., 1], pred.shape).reshape(-1, ny, 1)),
    104                           to_rgb(pred[..., 1].reshape(-1, ny, 1))), axis=1)
    105     return img

IndexError: index 1 is out of bounds for axis 3 with size 1

@jakeret
Owner

jakeret commented Apr 26, 2017

Do you have a stacktrace? When does the error happen?

What is the shape of your ground truth?
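
For context on why the shape matters: the IndexError says that axis 3 of the ground truth has size 1, while `gt[..., 1]` asks for a second channel. That would happen if the labels were passed as a single-channel mask instead of one channel per class. A minimal sketch of expanding a binary mask into two channels follows; `to_two_class_labels` is a hypothetical helper, and the background/foreground channel order is an assumption.

```python
import numpy as np


def to_two_class_labels(mask):
    """Turn a binary mask of shape (ny, nx) into labels of shape (ny, nx, 2)."""
    foreground = np.asarray(mask) > 0
    labels = np.zeros(foreground.shape + (2,), dtype=np.float32)
    labels[..., 0] = ~foreground  # channel 0: background
    labels[..., 1] = foreground   # channel 1: foreground
    return labels


mask = np.array([[0, 255], [255, 0]])
labels = to_two_class_labels(mask)
print(labels.shape)  # (2, 2, 2)
```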

@jakeret jakeret added the bug label Apr 26, 2017
@moudsarewju

Hello, I get the same bug when I use my own dataset:

generator = image_util.ImageDataProvider("train/*.png")
net = unet.Unet(channels=generator.channels, n_class=generator.n_class, layers=1, features_root=16)
trainer = unet.Trainer(net, optimizer="momentum", opt_kwargs=dict(momentum=0.2))
path = trainer.train(generator, "./unet_trained", training_iters=20, epochs=100)
prediction = net.predict("./unet_trained/model.cpkt", x_test)
2017-08-10 01:12:23,718 Verification error= 96.5%, loss= 0.7201
2017-08-10 01:12:24,015 Start optimization
2017-08-10 01:12:24,406 Iter 0, Minibatch Loss= 0.4491, Training Accuracy= 1.0000, Minibatch error= 0.0%
2017-08-10 01:12:24,646 Iter 1, Minibatch Loss= 0.2825, Training Accuracy= 0.9981, Minibatch error= 0.2%
2017-08-10 01:12:24,875 Iter 2, Minibatch Loss= 0.1428, Training Accuracy= 0.9958, Minibatch error= 0.4%
2017-08-10 01:12:25,086 Iter 3, Minibatch Loss= 0.0630, Training Accuracy= 0.9999, Minibatch error= 0.0%
2017-08-10 01:12:25,276 Iter 4, Minibatch Loss= 0.0599, Training Accuracy= 0.9886, Minibatch error= 1.1%
2017-08-10 01:12:25,468 Iter 5, Minibatch Loss= 0.0198, Training Accuracy= 0.9999, Minibatch error= 0.0%
2017-08-10 01:12:25,668 Iter 6, Minibatch Loss= 0.0147, Training Accuracy= 1.0000, Minibatch error= 0.0%
Traceback (most recent call last):
  File "demo_code.py", line 64, in <module>
    path = trainer.train(generator, "./unet_trained", training_iters=20, epochs=100)
  File "/media/hd01/unet/ss/tf_unet/unet.py", line 435, in train
    self.net.keep_prob: dropout})
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 789, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 997, in _run
    feed_dict_string, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1132, in _do_run
    target_list, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1152, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: logits and labels must be same size: logits_size=[64516,2] labels_size=[0,2]
         [[Node: SoftmaxCrossEntropyWithLogits = SoftmaxCrossEntropyWithLogits[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"](Reshape_7, Reshape_8)]]

Caused by op u'SoftmaxCrossEntropyWithLogits', defined at:
  File "demo_code.py", line 60, in <module>
    net = unet.Unet(channels=generator.channels, n_class=generator.n_class, layers=1, features_root=16)
  File "/media/hd01/unet/ss/tf_unet/unet.py", line 193, in __init__
    self.cost = self._get_cost(logits, cost, cost_kwargs)
  File "/media/hd01/unet/ss/tf_unet/unet.py", line 243, in _get_cost
    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=flat_logits, labels=flat_labels))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/nn_ops.py", line 1594, in softmax_cross_entropy_with_logits
    precise_logits, labels, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_nn_ops.py", line 2380, in _softmax_cross_entropy_with_logits
    features=features, labels=labels, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2506, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1269, in __init__
    self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): logits and labels must be same size: logits_size=[64516,2] labels_size=[0,2]
         [[Node: SoftmaxCrossEntropyWithLogits = SoftmaxCrossEntropyWithLogits[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"](Reshape_7, Reshape_8)]]
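
A labels_size of [0,2] means the label tensor was empty when it reached the loss. One possible cause, offered as a guess rather than a diagnosis, is a crop that slices with `[offset:-offset]`: when the offset is 0, `-0` equals 0 and the slice selects nothing. The helper name below is hypothetical, and the sizes are made up for illustration.

```python
def crop_rows(rows, target):
    """Centre-crop a list to `target` items using the assumed `[offset:-offset]` form."""
    offset = (len(rows) - target) // 2
    return rows[offset:-offset]


rows = list(range(255))
print(len(crop_rows(rows, 254)))  # 0: offset is 0, so the slice is rows[0:0]
print(len(crop_rows(rows, 251)))  # 251: offset is 2, rows[2:-2]
```

If this is what is happening, checking whether every training image has the same size would be a reasonable first step.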

@jakeret jakeret mentioned this issue Sep 8, 2017