
RuntimeError: 1only batches of spatial targets supported #123

Closed
alkeksibusiii opened this issue Jan 28, 2020 · 8 comments

@alkeksibusiii

Hello Alex,
when trying to train an 8-class U-Net model with 86 images, I receive an error message (below) soon after training starts:

INFO: Network:
3 input channels
8 output channels (classes)
Bilinear upscaling
INFO: Creating dataset with 86 examples
INFO: Starting training:
Epochs: 5
Batch size: 1
Learning rate: 0.1
Training size: 78
Validation size: 8
Checkpoints: True
Device: cuda
Images scaling: 0.5

Epoch 1/5: 0%
RuntimeError: 1only batches of spatial targets supported (non-empty 3D tensors) but got targets of size: : [1, 1, 384, 512]

I think this is related to my uncertainty about how to encode the masks correctly. They are currently stored as one PNG per training image in the masks directory. But how do I "one-hot encode" 8 different classes in one PNG file? Do the color values have to be powers of two like 1, 2, 4, 8, 16, 32, or do I have to modify the dataset loader accordingly?

@milesial (Owner)

Hi, you can't really one-hot encode your masks in a PNG image. You could, however, assign one color to each class (e.g. (10, 0, 0) for the first class, (20, 0, 0) for the second, ...), and then modify the preprocessing function of the Dataset class to convert these colors into one-hot vectors after loading the masks.
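
For illustration, a minimal sketch of that preprocessing step, assuming hypothetical per-class colors (the mapping and helper name below are made up, not code from this repository). It maps colors to integer class indices, which is the target format nn.CrossEntropyLoss actually expects:

    import numpy as np
    from PIL import Image

    # Hypothetical color-to-class mapping: one RGB color per class,
    # matching however the masks were painted.
    COLOR_TO_CLASS = {
        (10, 0, 0): 0,
        (20, 0, 0): 1,
        (30, 0, 0): 2,
        # ... one entry for each of the 8 classes
    }

    def mask_to_class_indices(mask_path):
        # Load the RGB mask and build an HxW map of integer class indices.
        mask = np.array(Image.open(mask_path).convert('RGB'))
        class_map = np.zeros(mask.shape[:2], dtype=np.int64)
        for color, class_idx in COLOR_TO_CLASS.items():
            class_map[np.all(mask == np.array(color), axis=-1)] = class_idx
        return class_map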

@alkeksibusiii (Author)

OK, thanks for your quick response. So I have to modify your code to get it working with multiple classes; I didn't know that. It will take me some time to try this out.

@tim-vdl commented Jan 30, 2020

> Hi, you can't really one-hot encode your masks in a PNG image. You could, however, assign one color to each class (e.g. (10, 0, 0) for the first class, (20, 0, 0) for the second, ...), and then modify the preprocessing function of the Dataset class to convert these colors into one-hot vectors after loading the masks.

Hi, I'm having the same issue with multiclass segmentation. My masks are grayscale PNGs in which each class has its own grayscale value.
@milesial What exactly do you mean by the one-hot vector? I've changed the preprocessing function so that the HxW mask image is transformed into a BxCxHxW array, where B is the batch size and C is the number of channels (= number of classes). Every channel is a binary image for one class, with value 1 where the pixel belongs to that class. However, I still get the same error.

@alkeksibusiii Have you found a solution?

Thank you in advance,
Tim

@milesial (Owner) commented Feb 8, 2020

@tim-vdl I think you don't need to create a batch dimension, as this is the job of the DataLoader.
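
A toy sketch of this division of labor (the dataset class below is illustrative, not from the repository): __getitem__ returns unbatched tensors, and the DataLoader stacks them into batches on its own.

    import torch
    from torch.utils.data import DataLoader, Dataset

    class ToySegDataset(Dataset):
        # Illustrative dataset: each sample is a CxHxW image and an
        # HxW mask of class indices, with no batch dimension.
        def __len__(self):
            return 4

        def __getitem__(self, idx):
            img = torch.rand(3, 128, 128)           # CxHxW, no batch dim
            mask = torch.randint(0, 8, (128, 128))  # HxW class indices
            return img, mask

    loader = DataLoader(ToySegDataset(), batch_size=2)
    imgs, masks = next(iter(loader))
    print(imgs.shape, masks.shape)  # [2, 3, 128, 128] and [2, 128, 128]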

@tim-vdl commented Feb 14, 2020

Hi,

I've found a solution to my problem. It turns out you don't need the one-hot encoding (CxHxW); I could just use my masks as they are: HxW, with pixel values ranging between 0 and C-1. However, I had to change some other parts of the code to make it work (sketched after the list):

  • In train.py, I changed line 78 from loss = criterion(masks_pred, true_masks) to loss = criterion(masks_pred, true_masks.squeeze(1))
  • In eval.py, I changed line 27 from tot += F.cross_entropy(pred.unsqueeze(dim=0), true_mask.unsqueeze(dim=0)).item() to tot += F.cross_entropy(pred.unsqueeze(dim=0), true_mask.unsqueeze(dim=0).squeeze(1)).item()
  • In dataset.py, I divide the image by 255 if img_trans.max() > 1, only when it is an input image, not when it is a mask.

You can check out my fork here.
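
A minimal sketch of the dataset.py idea from the last bullet (simplified and assumption-laden; the repository's real preprocessing also handles resizing and channel layout for masks):

    import numpy as np

    def preprocess(pil_img, is_mask):
        # Simplified sketch: normalize input images to [0, 1], but leave
        # masks untouched so their pixel values stay raw class indices
        # in the range 0 .. C-1.
        img = np.array(pil_img)
        if is_mask:
            return img                        # HxW class indices, no scaling
        if img.ndim == 2:
            img = img[..., np.newaxis]
        img = img.transpose((2, 0, 1))        # HWC -> CHW
        if img.max() > 1:
            img = img / 255                   # scale inputs only, never masks
        return img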

@alkeksibusiii maybe this works for you as well.

@zzzfox commented Mar 6, 2020

@tim-vdl

Did you have to make any changes to your predict.py file in order for it to work?

@luosiwu commented Jul 19, 2021

> Hi,
>
> I've found a solution to my problem. It turns out you don't need the one-hot encoding (CxHxW); I could just use my masks as they are: HxW, with pixel values ranging between 0 and C-1. However, I had to change some other parts of the code to make it work:
>
> • In train.py, I changed line 78 from loss = criterion(masks_pred, true_masks) to loss = criterion(masks_pred, true_masks.squeeze(1))
> • In eval.py, I changed line 27 from tot += F.cross_entropy(pred.unsqueeze(dim=0), true_mask.unsqueeze(dim=0)).item() to tot += F.cross_entropy(pred.unsqueeze(dim=0), true_mask.unsqueeze(dim=0).squeeze(1)).item()
> • In dataset.py, I divide the image by 255 if img_trans.max() > 1, only when it is an input image, not when it is a mask.
>
> You can check out my fork here.
>
> @alkeksibusiii maybe this works for you as well.


Hello, thanks for your solution. It works, but I get another problem:

File "train.py", line 180, in
val_percent=args.val / 100)
File "train.py", line 97, in train_net
val_score = eval_net(net, val_loader, device)
File "/home/xkjs2/NDisk/lsw/Pytorch-UNet-rs/eval.py", line 25, in eval_net
tot += F.cross_entropy(mask_pred.unsqueeze(dim=0), true_masks.unsqueeze(dim=0).squeeze(1)).item()
File "/home/xkjs2/NDisk/lsw/conda_env/snake/lib/python3.7/site-packages/torch/nn/functional.py", line 2693, in cross_entropy
return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
File "/home/xkjs2/NDisk/lsw/conda_env/snake/lib/python3.7/site-packages/torch/nn/functional.py", line 2397, in nll_loss
raise ValueError("Expected target size {}, got {}".format(out_size, target.size()))
ValueError: Expected target size (1, 7, 128, 128), got torch.Size([1, 1, 128, 128])

My mask images are single channel, with pixel values ranging between 0 and C-1.
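
For context: F.cross_entropy expects a BxHxW target of class indices alongside a BxCxHxW input, so a target that still carries a channel dimension of 1 (Bx1xHxW) raises exactly this ValueError. A minimal sketch of the mismatch and the squeeze that resolves it:

    import torch
    import torch.nn.functional as F

    B, C, H, W = 1, 7, 128, 128
    pred = torch.randn(B, C, H, W)               # network output: BxCxHxW
    target = torch.randint(0, C, (B, 1, H, W))   # mask that kept a channel dim

    # F.cross_entropy wants the target as BxHxW class indices; passing the
    # Bx1xHxW tensor above raises "Expected target size (1, 7, 128, 128),
    # got torch.Size([1, 1, 128, 128])". Squeezing the channel dim fixes it:
    loss = F.cross_entropy(pred, target.squeeze(1))
    print(loss.item())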

@milesial (Owner)

Closing, as this issue is now outdated after the recent code refactor.
