
Extend the code for Cityscapes dataset #21

Open · muralabmahmuds opened this issue Oct 5, 2017 · 17 comments
muralabmahmuds commented Oct 5, 2017

Hi @tkuanlun350, thank you very much for your code.
As many others have said, your code runs perfectly on the CamVid dataset.

I want to extend it to the Cityscapes dataset, so I am trying to make some adjustments.
However, I still cannot get the program set up, since CamVid and Cityscapes are quite different.
The Cityscapes dataset has 19 object classes, 2975 training images, and 500 validation images.
The image size is 1024x2048.

There is an error message:
ValueError: Dimensions must be equal, but are 19 and 11 for 'loss/Mul' (op: 'Mul') with input shapes: [10485760,19], [11].

So, could you tell us how to deal with this, please?

Thank you very much.
Mahmud

tkuanlun350 (Owner) commented

Hi,
You may need to change the classification layer as well as the input pipeline, changing everything that assumes 11 classes to 19 classes.
The current codebase is hard to extend to other datasets. I will make some changes and test them this week.
I am planning to switch the input pipeline to TFRecords and tf-slim.
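
For reference, a minimal sketch of where that mismatch bites in a SegNet-style weighted loss (identifier names like `NUM_CLASSES` and `loss_weights` are illustrative, not necessarily the exact ones in this repo):

```python
import tensorflow as tf

NUM_CLASSES = 19  # was 11 for CamVid; every place that hard-codes 11 must change

# Per-class weights (e.g. from median-frequency balancing). The length MUST equal
# NUM_CLASSES, otherwise the element-wise multiply below fails with
# "Dimensions must be equal, but are 19 and 11 for 'loss/Mul'".
loss_weights = tf.constant([1.0] * NUM_CLASSES)

def weighted_loss(logits, labels):
    # logits: [num_pixels, NUM_CLASSES]; labels: [num_pixels], values 0..NUM_CLASSES-1
    epsilon = 1e-10
    softmax = tf.nn.softmax(logits) + epsilon
    one_hot = tf.one_hot(labels, depth=NUM_CLASSES)  # [num_pixels, NUM_CLASSES]
    # This multiply is the 'loss/Mul' op from the error: [N, 19] * [19] broadcasts,
    # but [N, 19] * [11] cannot.
    cross_entropy = -tf.reduce_sum(one_hot * tf.log(softmax) * loss_weights, axis=1)
    return tf.reduce_mean(cross_entropy)
```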

muralabmahmuds (Author) commented

Hi, thank you for your reply.

I have made some modifications and, fortunately, it now runs on the Cityscapes dataset.
I did my best to change every variable related to the image dimensions and the number of classes.
The last step was modifying your cal_loss function: I re-calculated all the class weights based on my dataset.

However, the performance was not satisfying enough, so I have some further questions.

  1. How do I know which class corresponds to which? For example, if I have classes 'A', 'B', and 'C', and I set the number of classes to 3, the validation result will show "class #1 accuracy = 0.7", "class #2 accuracy = 0.9", "class #3 accuracy = 0.5". So what actually is class #1 there? Is it 'A', 'B', or 'C'? And what about the other classes?

  2. During the training process, some classes get zero accuracy. What could cause that? Maybe you have experienced the same thing.

  3. Can you give a brief clue on how to modify the testing code, so I can tell which input image produced the result saved as "testing_image.png"? I am also wondering how to modify the color map for 19 classes.

Anyway, I am looking forward to seeing your new code.

Thanks a million for your cooperation.
Mahmud

abhigoku10 commented

@muralabmahmuds Can you please point out where you made the modifications in the current files to make them work for the Cityscapes dataset? I am trying this for my application.

Thanking you in advance.

muralabmahmuds (Author) commented

@abhigoku10
I modified four files: "Input.py", "main.py", "model.py", and "Utils.py".
Please have a look at the code I attach below.
I marked the lines I modified with "#change" (though I may have missed marking some).

Note: please change the file extensions from ".txt" to ".py".

I hope it helps.

Utils.txt
main.txt
model.txt
Input.txt

tom-bu commented Jan 30, 2018

@muralabmahmuds Were you ever able to figure out which class corresponds to which, and why you were getting 0 accuracy? I realized that the CamVid dataset that Alex Kendall provides on his GitHub is actually different from the official CamVid dataset. Alex has preprocessed the images, so that the normal RGB label images are converted to one-channel pictures with some labeling scheme. I'm facing the same problem as you, and I'd love to hear how others are dealing with this issue.

muralabmahmuds (Author) commented

@tom-bu I am not really sure about that. However, I believe the numbers shown in the validation results, i.e. "class #1 accuracy = 0.7", "class #2 accuracy = 0.9", "class #3 accuracy = 0.5", correspond to classes 'A', 'B', and 'C', respectively. That was based on something I read in Alex Kendall's GitHub. Thus, we usually need to encode the class labels as consecutive integers starting from 0 up to the number of classes minus one (i.e. 0-18 for Cityscapes in the train_id encoding scheme) instead of arbitrary labels.
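
For concreteness, here is the train_id order from the official cityscapesScripts labels.py; assuming the evaluation prints classes in label order, "class #1" would correspond to train_id 0 (a minimal sketch):

```python
# Cityscapes train_id order, taken from cityscapesScripts labels.py.
# If the evaluation prints classes in label order, "class #1" is train_id 0, etc.
CITYSCAPES_TRAIN_ID_NAMES = [
    "road", "sidewalk", "building", "wall", "fence", "pole",
    "traffic light", "traffic sign", "vegetation", "terrain", "sky",
    "person", "rider", "car", "truck", "bus", "train",
    "motorcycle", "bicycle",
]  # train_id 255 is the ignore label

for i, name in enumerate(CITYSCAPES_TRAIN_ID_NAMES):
    print("class #%d -> train_id %d (%s)" % (i + 1, i, name))
```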

For the zero-accuracy issue, I am also not sure; please correct me if I am wrong. For example, if class 2 got 0 accuracy, my guess is that at validation time no sampled image contained label 2. I think, but I am not sure, that there is a parameter that limits how many images are used for validation. However, when we test the trained model afterward, all classes get non-zero results.
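
A toy illustration of that guess (plain NumPy, not the repo's actual evaluation code): a class that never appears in the sampled ground truth has no pixels to score, so naive per-class accuracy reports 0 for it.

```python
import numpy as np

def per_class_accuracy(pred, gt, num_classes):
    # pred, gt: integer arrays of the same shape containing class indices.
    acc = np.zeros(num_classes)
    for c in range(num_classes):
        mask = gt == c
        # A class absent from this validation sample has zero ground-truth pixels,
        # so it shows up as 0 here even if the model could predict it elsewhere.
        acc[c] = (pred[mask] == c).mean() if mask.any() else 0.0
    return acc
```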

tom-bu commented Jan 30, 2018

How often do you get 0 accuracy? I realized you need to convert the RGB label file to grayscale, and that the different classes are associated with different gray values, I believe. Take a look at this thread: alexgkendall/caffe-segnet#3

tom-bu commented Jan 31, 2018

I am now certain that the grayscale conversion code dictates which class corresponds to which. This step is missing from the SegNet tutorial because the CamVid images given by Alex are already grayscale-labelled from 0-11. For the CamVid dataset, you can see which class number corresponds to which in the Utils file, but if you're using a new dataset, you define your class numbers in the grayscale conversion.
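
A minimal sketch of such a conversion (the colors and class order here are hypothetical; whatever order you pick is the class numbering the accuracy printout will use):

```python
import numpy as np
from PIL import Image

# Hypothetical color -> class mapping for a custom dataset. The order chosen here
# is exactly the class numbering the network and its per-class accuracies will use.
COLOR_TO_CLASS = {
    (128, 64, 128): 0,   # e.g. road
    (244, 35, 232): 1,   # e.g. sidewalk
    (70, 70, 70):   2,   # e.g. building
}

def rgb_label_to_gray(path):
    rgb = np.array(Image.open(path).convert("RGB"))
    gray = np.full(rgb.shape[:2], 255, dtype=np.uint8)  # 255 = unlabeled/ignore
    for color, cls in COLOR_TO_CLASS.items():
        gray[np.all(rgb == np.array(color), axis=-1)] = cls
    return gray
```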

muralabmahmuds (Author) commented

@tom-bu Yes, sure. For training and performance calculation, we need two kinds of sets: the original images and the annotations. The original images are RGB, while the annotations are single-channel uint8 images. Right, the Cityscapes dataset already provides the annotation images. There are two versions: one using the 'id' labels (0-33, plus 255) and one using the 'train_id' labels (0-18, plus 255).

If you want to use the 'id' labels, they are already provided inside the 'gtFine' folder, in the images ending with "_labelIds.png". If you want to use the 'train_id' labels, find and run the appropriate script (if I am not mistaken, it is "createTrainIdLabelImgs.py" inside the "preparation" folder). It will produce another version of the annotation images, ending with "_labelTrainIds.png". So you don't need to convert them yourself.
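
Roughly how that script is usually run (a hedged sketch; the paths are assumptions, and to my knowledge the script reads the CITYSCAPES_DATASET environment variable to locate the dataset root):

```python
import os
import subprocess

# Assumptions: CITYSCAPES_DATASET points at the folder containing leftImg8bit/ and
# gtFine/, and we run from wherever the "preparation" folder lives in your
# cityscapesScripts checkout.
env = dict(os.environ, CITYSCAPES_DATASET="/path/to/cityscapes")
subprocess.check_call(["python", "preparation/createTrainIdLabelImgs.py"], env=env)
```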

tom-bu commented Feb 2, 2018

Oh OK, I was using the KITTI dataset, so I didn't know Cityscapes provides grayscale labels. Do you still have any questions about it, then? I was able to get mine to work.

tom-bu commented Feb 6, 2018

@muralabmahmuds I'm looking at the Cityscapes dataset and noticed that all of the labelled test images are black. Is that the case for you?

muralabmahmuds (Author) commented

Hi @tom-bu,
I already have results from my TF-SegNet on Cityscapes, but they are not good enough.
After training the model, I tested it on the training, validation, and testing sets.

Yes, the same thing happened to me.

Please note that annotations are only released for the training and validation sets; the annotations for the testing set are withheld. This means you can only compute performance yourself on the training and validation images. You can, however, get a performance evaluation of your test results by submitting them to the Cityscapes benchmark: https://www.cityscapes-dataset.com/benchmarks/

tom-bu commented Feb 8, 2018

@muralabmahmuds Yeah, I trained my model on Cityscapes and tested it on the CamVid dataset, and it's not great. Since Cityscapes has so many more pictures, the original max step of 20,000 doesn't seem to be enough. How many steps did you train yours for, and what loss did you get? I've resized my Cityscapes images to 360x480 just to keep the number of parameters reasonable. I also have a question about the loss weights: if the ratios are the same, does changing the magnitude do anything?
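
On that last question, a quick sanity check one can run (TF1-style, matching the era of this repo): scaling all class weights by a constant k scales the loss and its gradients by k, which with plain SGD behaves like scaling the learning rate by k, so only the ratios matter beyond that effective learning-rate change.

```python
import tensorflow as tf

logits = tf.random_normal([4, 19])
labels = tf.one_hot([0, 3, 7, 18], depth=19)
weights = tf.ones([19])

def weighted_ce(w):
    softmax = tf.nn.softmax(logits) + 1e-10
    return tf.reduce_mean(-tf.reduce_sum(labels * tf.log(softmax) * w, axis=1))

with tf.Session() as sess:
    l1, l10 = sess.run([weighted_ce(weights), weighted_ce(10.0 * weights)])
    print(l10 / l1)  # ~10.0: same ratios, 10x magnitude -> 10x loss (and gradients)
```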

aod148 commented Feb 19, 2018

Hi @tkuanlun350, thank you for your code, and @muralabmahmuds, thank you for your edits.
I have some questions about cal_loss.
How did you calculate the cal_loss weight for each class? The original code has 11 classes, but you modified it to 19 classes with new cal_loss weights.
Could you please tell me how to calculate the cal_loss weight for each class? I would also like to apply this code to my own dataset.

Thank you in advance.
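
For what it's worth, the usual recipe for these weights in SegNet is median-frequency balancing from the SegNet paper; a hedged sketch of that recipe, not the exact code from this repo:

```python
import numpy as np

def median_frequency_weights(label_images, num_classes=19):
    # freq(c) = pixels of class c / total pixels in images where class c appears;
    # weight(c) = median(freq) / freq(c). Rare classes get weights > 1.
    class_pixels = np.zeros(num_classes, dtype=np.int64)
    image_pixels = np.zeros(num_classes, dtype=np.int64)
    for lbl in label_images:  # each lbl: 2-D array of class indices (ignore = 255)
        for c in range(num_classes):
            n = np.sum(lbl == c)
            if n > 0:
                class_pixels[c] += n
                image_pixels[c] += lbl.size
    freq = class_pixels / np.maximum(image_pixels, 1)
    return np.median(freq) / np.maximum(freq, 1e-12)
```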

xiaomixiaomi123zm commented

@muralabmahmuds Hello, I also want to try the Cityscapes dataset, but I have not changed the image size. Could you provide your modified images or code? Thank you very much~

xiaomixiaomi123zm commented

@tom-bu Hi, I also want to try the Cityscapes dataset, but I have not changed the image size. Could you provide your modified images or code? Thank you very much~

unrivalle commented

@tom-bu Hi, I want to create a custom dataset for segmentation, and I am facing an issue with masking. Can you please guide me?
