
[for your support] multi-class problem #15

Open
wangxinyu199306 opened this issue Jun 26, 2018 · 38 comments



@wangxinyu199306 wangxinyu199306 commented Jun 26, 2018

Hi, thanks for your code, it works well on my own dataset.
However, it seems this project only supports the binary classification problem.
When I try to modify it as follows:
train.py -124 -> n_classes = 3
this call: loss = criterion(masks_probs_flat, true_masks_flat) ---from train.py 81
fails because the shapes of the two arguments do not match.

My mask image is a single 8-bit image with 4 different intensities (0, 1, 2, 3) representing 4 classes (background, class1, class2, class3).
How should I modify the code so that it supports a multi-class problem?
Thanks in advance.


@milesial milesial (Owner) commented Jun 26, 2018

Hey!
Oh, I haven't tested it with more classes, so this isn't really a surprise.
Can you check that the img and true_masks (l. 68-69) fetched from the batch have the correct shape? If so, check the output of the net (masks_pred, l. 75); it should have 4 channels, one per class.
This might be an issue related to the cropping or loading of the true masks.
Feel free to open a pull request if you fix the problem.


@PacificHongyang PacificHongyang commented Jul 4, 2018

Where is the checkpoints directory?


@TonyDongLB TonyDongLB commented Aug 3, 2018

I already fixed this problem: you can just change the final layer from 1 channel to 4 and use a 2D cross-entropy loss (a 2D focal loss also works), but I don't know whether the model performs well for multi-class prediction in your project.
And thanks for the code again.
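For reference, a minimal sketch of that change in PyTorch (the layer sizes, names, and shapes below are assumptions, not the repo's exact code):

    import torch
    import torch.nn as nn

    # Assumed stand-in for the final 1x1 output convolution: change its
    # out_channels from 1 to the number of classes (4 here).
    n_classes = 4
    out_conv = nn.Conv2d(64, n_classes, kernel_size=1)

    # CrossEntropyLoss works directly on 2D maps: logits of shape (N, C, H, W)
    # and integer class targets of shape (N, H, W).
    criterion = nn.CrossEntropyLoss()

    logits = torch.randn(2, n_classes, 64, 64)           # raw net output, no sigmoid/softmax
    targets = torch.randint(0, n_classes, (2, 64, 64))   # pixel labels 0..3
    loss = criterion(logits, targets.long())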


@wangzhe0623 wangzhe0623 commented Oct 23, 2018

@TonyDongLB Hi, may I ask a question? I'm trying to use this code for multi-class segmentation (5 classes) and changed the final layer to multi-channel directly. I processed the corresponding true_mask into a single-channel uint8 image with labels 0-4 — is that the right way to handle it? I'm using the nn.CrossEntropyLoss loss; is it still OK to keep the sigmoid layer before it?


@TonyDongLB TonyDongLB commented Oct 24, 2018


@wangzhe0623 wangzhe0623 commented Oct 24, 2018

@TonyDongLB It's solved now, thank you!


@ChaoLi977 ChaoLi977 commented Nov 17, 2018

@wangzhe0623 Hi, how did you change the code? I set n_classes to 2 for 3-class segmentation, but as @wangxinyu199306 said, the target and input of the criterion have mismatched shapes. I think the true mask is N×1×W×H while the predicted masks are N×2×W×H, so they don't match.

Then I used CrossEntropyLoss, but the shapes still don't match.

Could you please tell me which loss should be used?


@wangzhe0623 wangzhe0623 commented Nov 19, 2018

@ChaoLi977 CrossEntropyLoss worked well with my code. Have you removed the Sigmoid layer? And if you want to predict 3 classes, you should change 'n_classes' to '3', not 2.
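For anyone following along, a rough sketch of what removing the sigmoid looks like (tensor names are assumptions, not the repo's exact train.py):

    import torch
    import torch.nn.functional as F

    # In the model's forward(), return the raw logits instead of F.sigmoid(x);
    # nn.CrossEntropyLoss applies log-softmax internally.
    logits = torch.randn(1, 3, 256, 256)        # (N, n_classes, H, W) straight from the net

    # At prediction time, pick the most likely class per pixel.
    probs = F.softmax(logits, dim=1)            # optional: per-class probabilities
    pred_mask = torch.argmax(logits, dim=1)     # (N, H, W), values in 0..n_classes-1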


@ChaoLi977 ChaoLi977 commented Nov 19, 2018

@wangzhe0623 Thanks a lot. I removed the Sigmoid layer and used CrossEntropyLoss. Now it reports: "RuntimeError: Dimension out of range (expected to be in range of [-1, 0], but got 1)". What is the format of your mask images? I use 8-bit images with the values 0, 1, 2 as the categorical values. But I think maybe the labels should be one-hot encoded rather than categorical values.


@wangzhe0623 wangzhe0623 commented Nov 20, 2018

@ChaoLi977 I do not think anything is wrong with your true mask. Note that the predicted mask should not be flattened or followed by any extra layers.


@zhouzhubin zhouzhubin commented Nov 22, 2018

@wangzhe0623 Hi, may I ask how you prepared the dataset? I annotated with labelme and got JSON files, then converted them to label.png, but I don't know how to turn them into the .gif format used by train_mask in the code. Could you explain? I've only just started with semantic segmentation, thanks.


@zhouzhubin zhouzhubin commented Nov 23, 2018

@wangxinyu199306 Hi, how did you process your own data? What annotation tool did you use to get the .gif format like train_mask in the code? I annotated with labelme, but I don't know how to handle it afterwards. Could you explain? Thanks.


@wangzhe0623 wangzhe0623 commented Nov 23, 2018

@zhouzhubin It doesn't have to be a .gif. Just process the mask into a single-channel image where each pixel value is the corresponding category index.
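For example, a small sketch of converting an annotation into such a single-channel class-index mask (file names and the value-to-class mapping are placeholders):

    import numpy as np
    from PIL import Image

    # Load the annotation as a single-channel image ("label.png" is a placeholder path).
    mask = np.array(Image.open("label.png").convert("L"))

    # Map each original intensity to a class index 0..n_classes-1.
    # For a plain 0/255 binary mask this is simply {0: 0, 255: 1}.
    value_to_class = {0: 0, 255: 1}
    index_mask = np.zeros(mask.shape, dtype=np.uint8)
    for value, cls in value_to_class.items():
        index_mask[mask == value] = cls

    # Each pixel of the saved mask is now its category index.
    Image.fromarray(index_mask).save("train_mask.png")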


@zhouzhubin zhouzhubin commented Nov 23, 2018

@wangzhe0623 Thanks! I currently have only one class. If I convert the mask to single channel, should the corresponding pixel values be 0 and 255? My background is 0 and the class is 255 — does that count as one class or two?


@wangzhe0623 wangzhe0623 commented Nov 23, 2018

@zhouzhubin For two classes, the corresponding pixel values should be 0 and 1. If you use a cross-entropy loss, set the number of predicted classes to 2; if you use the loss in the author's code, set it to 1.


@zhouzhubin zhouzhubin commented Nov 23, 2018

@wangzhe0623 Oh, I see. Thanks, I'll give it a try.


@louxy126 louxy126 commented Dec 6, 2018

@wangxinyu199306 @wangzhe0623 @TonyDongLB Guys, I'd like to ask a question:
I'm also doing multi-class segmentation (10 classes). Based on your discussion I changed
return F.sigmoid(x) in unet_model.py to return torch.nn.CrossEntropyLoss(x)
and in train.py line 123
net = UNet(n_channels=3, n_classes=10) I changed n_classes=1 to n_classes=10.
Nothing else was changed, but during training it reports:

    Starting training:
    Epochs: 5
    Batch size: 10
    Learning rate: 0.1
    Training size: 698
    Validation size: 36
    Checkpoints: True
    CUDA: False

    Starting epoch 1/5.
    /home/wrc/anaconda3/envs/unet/lib/python3.5/site-packages/torch/nn/modules/upsampling.py:122: UserWarning: nn.Upsampling is deprecated. Use nn.functional.interpolate instead.
    warnings.warn("nn.Upsampling is deprecated. Use nn.functional.interpolate instead.")
    Traceback (most recent call last):
    File "/media/wrc/新加卷/lxy/Pytorch-UNet/train.py", line 139, in <module>
    img_scale=args.scale)
    File "/media/wrc/新加卷/lxy/Pytorch-UNet/train.py", line 76, in train_net
    masks_probs_flat = masks_pred.view(-1)
    File "/home/wrc/anaconda3/envs/unet/lib/python3.5/site-packages/torch/nn/modules/module.py", line 518, in __getattr__
    type(self).__name__, name))
    AttributeError: 'CrossEntropyLoss' object has no attribute 'view'

    Process finished with exit code 1

Could you tell me where this is going wrong? Thanks.


@yuyijie1995 yuyijie1995 commented Dec 9, 2018

I made a custom dataset which has 2 classes, and the mask pixel values I chose are 0, 1, 2, where 0 is the background. I changed the U-Net model's class number to 2 and the loss function to CrossEntropyLoss(), but this error happened. Could you give me some suggestions? @wangxinyu199306 @TonyDongLB

    /home/wrc/anaconda3/envs/pytorch0.4/lib/python3.6/site-packages/torch/nn/modules/upsampling.py:122: UserWarning: nn.Upsampling is deprecated. Use nn.functional.interpolate instead.
    warnings.warn("nn.Upsampling is deprecated. Use nn.functional.interpolate instead.")
    /home/wrc/anaconda3/envs/pytorch0.4/lib/python3.6/site-packages/torch/nn/functional.py:1006: UserWarning: nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.
    warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.")
    Traceback (most recent call last):
    File "/home/wrc/yuyijie/Pytorch-UNet/train.py", line 139, in <module>
    img_scale=args.scale)
    File "/home/wrc/yuyijie/Pytorch-UNet/train.py", line 80, in train_net
    loss = criterion(masks_probs_flat, true_masks_flat)
    File "/home/wrc/anaconda3/envs/pytorch0.4/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
    File "/home/wrc/anaconda3/envs/pytorch0.4/lib/python3.6/site-packages/torch/nn/modules/loss.py", line 862, in forward
    ignore_index=self.ignore_index, reduction=self.reduction)
    File "/home/wrc/anaconda3/envs/pytorch0.4/lib/python3.6/site-packages/torch/nn/functional.py", line 1550, in cross_entropy
    return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
    File "/home/wrc/anaconda3/envs/pytorch0.4/lib/python3.6/site-packages/torch/nn/functional.py", line 975, in log_softmax
    return input.log_softmax(dim)
    RuntimeError: Dimension out of range (expected to be in range of [-1, 0], but got 1)


@yuyijie1995 yuyijie1995 commented Dec 9, 2018

@ChaoLi977 I met the same problem you mentioned. Have you solved it?


@Jee-King Jee-King commented Dec 20, 2018

@louxy126
I changed the sigmoid to softmax and the loss function to torch.nn.CrossEntropyLoss().
You need to do some processing on the label and the prediction to satisfy what CrossEntropyLoss() expects.
Roughly like this:

            masks_pred = net(imgs)
            true_masks_flat = true_masks.long()
            n, c, h, w = masks_pred.size()
            # move the class dim last: (n, c, h, w) -> (n, h, w, c)
            masks_probs_flat = masks_pred.transpose(1, 2).transpose(2, 3).contiguous()
            masks_probs_flat = masks_probs_flat[true_masks.view(n, h, w, 1).repeat(1, 1, 1, c) >= 0]
            # predictions: (n*h*w, c)
            masks_probs_flat = masks_probs_flat.view(-1, c)
            # target: (n*h*w,)
            mask = true_masks_flat >= 0
            true_masks_flat = true_masks_flat[mask]
            # true_masks_flat = true_masks#.view(-1)
            print(masks_probs_flat.size(), true_masks_flat.size())
            loss = criterion(masks_probs_flat, true_masks_flat)

The dataset I used is Cityscapes.


@Jee-King Jee-King commented Dec 20, 2018

@wangxinyu199306 @TonyDongLB I have a question: the pixel values of the image predicted by the network are between 0 and 1. Suppose my data has 35 classes; how should I get the final image? Just multiply by 35 directly?


@Jee-King Jee-King commented Dec 20, 2018

I have a problem: the image pixel values obtained through network prediction are between 0 and 1. Suppose my data has 35 categories; how should I get the final image? Multiply by 35 directly?


@milesial milesial (Owner) commented Dec 20, 2018

@Ji-qing If you have 35 categories, your true masks should have 35 values for each pixel (one-hot encoded), and the prediction should output 35 values between 0 and 1 for each pixel. You wouldn't have a single 'image' but 35 layers of masks.
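With a recent PyTorch, one way to build those one-hot mask layers from an index mask (a sketch, not this repo's code):

    import torch
    import torch.nn.functional as F

    n_classes = 35
    index_mask = torch.randint(0, n_classes, (1, 256, 256))   # (N, H, W), values 0..34

    # (N, H, W) -> (N, H, W, C) -> (N, C, H, W): 35 binary mask layers per image.
    one_hot = F.one_hot(index_mask, num_classes=n_classes).permute(0, 3, 1, 2).float()

    # Collapsing the 35 predicted layers back to a single index mask:
    recovered = one_hot.argmax(dim=1)                          # (N, H, W)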


@Jee-King Jee-King commented Dec 21, 2018

@milesial Thank you for your reply.
I am new to semantic segmentation and I don't know how to prepare my datasets, such as VOC2012. The labels in the SegmentationClass folder of VOC2012 are colorful, and I want to convert the pixel values of the images to 0-20, but I don't know how to do it.
And, as you say, how do I merge the 35 layers of masks?


@milesial milesial (Owner) commented Dec 21, 2018

@Ji-qing If the label has multiple colors, you can just iterate over the pixels and check each color. If it is RED, convert the value to [1, 0, 0, 0, ...]; if it's BLUE, to [0, 1, 0, 0, 0, ...], and so on. You should be able to convert a W*H image (if it has only 1 channel, as I believe voc2012 does) into a W*H*35 matrix. This preprocessing can be done in any way you like, but check out the OpenCV and Pillow libraries.
You can also go the other way and convert the 35 output masks into one; it only matters that the model output and the labels have the same shape so you can apply a loss to them. If the 35 layers are mutually exclusive, it's pretty simple to merge them ([1, 0, 0, 0, ...] to RED, ...). Then your 'colors' would be your classes.
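As an illustration of that color-to-class mapping, a NumPy sketch (the palette below is made up; use the dataset's real colors):

    import numpy as np

    # Assumed example palette: class index -> RGB color.
    palette = {
        0: (0, 0, 0),      # background
        1: (255, 0, 0),    # RED  -> class 1
        2: (0, 0, 255),    # BLUE -> class 2
        # ... one entry per class
    }

    def color_to_index(label_rgb):
        """(H, W, 3) color label image -> (H, W) class-index mask."""
        index_mask = np.zeros(label_rgb.shape[:2], dtype=np.uint8)
        for cls, color in palette.items():
            index_mask[np.all(label_rgb == color, axis=-1)] = cls
        return index_mask

    def index_to_color(index_mask):
        """(H, W) class-index mask (e.g. argmax of the outputs) -> (H, W, 3) color image."""
        out = np.zeros(index_mask.shape + (3,), dtype=np.uint8)
        for cls, color in palette.items():
            out[index_mask == cls] = color
        return out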


@yuyijie1995 yuyijie1995 commented Dec 21, 2018

@wangxinyu199306 @TonyDongLB I have a question: the pixel values of the image predicted by the network are between 0 and 1. Suppose my data has 35 classes; how should I get the final image? Just multiply by 35 directly?

Hi, could we add each other on WeChat to discuss? I've also only just started with semantic segmentation.


@wangxinyu199306 wangxinyu199306 (Author) commented Dec 21, 2018


@ChaoLi977 ChaoLi977 commented Dec 22, 2018

@yuyijie1995

I think the mask should be one-hot encoded, just as the author said

@Ji-qing If you have 35 categories, your true masks should have 35 values for each pixel (one-hot encoded), and the prediction should output 35 values between 0 and 1 for each pixel. You wouldn't have a single 'image' but 35 layers of masks.

So if you have three classes, the mask values for the three classes should be (1,0,0), (0,1,0), and (0,0,1). Though I haven't tried this modification, I think it is the crux of the matter.


@Jee-King Jee-King commented Dec 23, 2018

@milesial @ChaoLi977 @wangxinyu199306
Thank you very much for your replies.


@ThomasLiu1021 ThomasLiu1021 commented Mar 11, 2019

@wangzhe0623 Thanks a lot. I removed the Sigmoid layer and used CrossEntropyLoss. Now it reports: "RuntimeError: Dimension out of range (expected to be in range of [-1, 0], but got 1)". What is the format of your mask images? I use 8-bit images with the values 0, 1, 2 as the categorical values. But I think maybe the labels should be one-hot encoded rather than categorical values.

Hi, have you solved this issue? Or do you know what causes it? Thanks.


@greatgeekgrace greatgeekgrace commented Apr 8, 2019

@louxy126 Have you solved it? I've run into the same problem.


@ZengXinyi ZengXinyi commented May 7, 2019

Hello, I have a question. I am predicting two classes. The program runs through, but the prediction result is all black. Why is this?


@WangJie8682 WangJie8682 commented Jun 10, 2019

Hello, I have a question. I am predicting two classes. The program runs through, but the prediction result is all black. Why is this?

I have the same problem. Have you solved it?


@jlevy44 jlevy44 commented Jun 19, 2019

I'm also getting a similar problem: I'm getting vertical lines of my predicted classes across the photo. One column will be entirely 1s and another entirely 2s, which makes the loss difficult to compute well.


@XiaoAHeng XiaoAHeng commented Jul 15, 2019

@TonyDongLB Hi, may I ask a question? I'm trying to use this code for multi-class segmentation (5 classes) and changed the final layer to multi-channel directly. I processed the corresponding true_mask into a single-channel uint8 image with labels 0-4 — is that the right way to handle it? I'm using the nn.CrossEntropyLoss loss; is it still OK to keep the sigmoid layer before it?

Hi, how do you process the corresponding true_mask into a single-channel uint8 image with labels 0-4?


@XiaoAHeng XiaoAHeng commented Jul 15, 2019

Hi, thanks for your code, it works well on my own dataset.
However, it seems this project only supports the binary classification problem.
When I try to modify it as follows:
train.py -124 -> n_classes = 3
this call: loss = criterion(masks_probs_flat, true_masks_flat) ---from train.py 81
fails because the shapes of the two arguments do not match.

My mask image is a single 8-bit image with 4 different intensities (0, 1, 2, 3) representing 4 classes (background, class1, class2, class3).
How should I modify the code so that it supports a multi-class problem?
Thanks in advance.

Can you share how you modified the code so that it supports a multi-class problem?
Thanks in advance.


@cheng321284 cheng321284 commented Aug 1, 2019

@Ji-qing If you have 35 categories, your true masks should have 35 values for each pixel (one-hot encoded), and the prediction should output 35 values between 0 and 1 for each pixel. You wouldn't have a single 'image' but 35 layers of masks.

How can I compute the Dice coefficient between the 35-layer output and the mask? The true mask is an image with pixel gray values from 0 to 34.
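One possible way to get a per-class Dice score in that setting (a sketch under assumed shapes; it is not the repo's dice_coeff):

    import torch
    import torch.nn.functional as F

    def multiclass_dice(logits, target, n_classes, eps=1e-6):
        """logits: (N, C, H, W) raw net output; target: (N, H, W) with values 0..C-1."""
        pred = torch.argmax(logits, dim=1)                                 # (N, H, W)
        pred_1h = F.one_hot(pred, n_classes).permute(0, 3, 1, 2).float()   # (N, C, H, W)
        target_1h = F.one_hot(target.long(), n_classes).permute(0, 3, 1, 2).float()
        dims = (0, 2, 3)                                                   # sum over batch and space
        intersection = (pred_1h * target_1h).sum(dim=dims)
        cardinality = pred_1h.sum(dim=dims) + target_1h.sum(dim=dims)
        dice_per_class = (2.0 * intersection + eps) / (cardinality + eps)  # shape (C,)
        return dice_per_class, dice_per_class.mean()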


@shixiaotong1101 shixiaotong1101 commented Oct 21, 2019

@TonyDongLB It's solved now, thank you!

Hi, I'd like to ask: specifically, how should multi-class segmentation be handled?
