
Running on own dataset #15

Closed
athus1990 opened this issue Oct 18, 2017 · 14 comments

Comments

@athus1990

Hi,
I am planning to run this on the Cityscapes dataset to check the results (detection obtained from instance annotations). However, it seems quite complicated to train even a simple network; just changing the image paths is not sufficient.
Are you planning to add a README on how to train on other datasets?
If not, could you let me know which files I would need to change in order to train?

@fastlater

fastlater commented Oct 19, 2017

@athus1990
Did you try the training.py script?
If yes, you can see the scripts involved in the training process: in general, datasets.py, loader.py, and training.py. I recommend you first try to train using the VOC and COCO datasets with different images. Then create new files based on those three scripts and try to replace the classes with your own. There is also a demo.py.
Did you prepare your data?
You need annotations and segmented images of your raw images; raw images alone are not enough. Then you need to encode the training data into a protobuf (TFRecord) file.
When you said "just changing the image paths is not sufficient", what did you mean? Did you get an error?
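To give a concrete picture of the annotation side, here is a minimal, hypothetical sketch of parsing one VOC-style XML annotation with the standard library. This is not the repo's actual voc_loader.py; the function name and the class-id map are assumptions.

```python
# Hypothetical sketch of parsing a VOC-style XML annotation into boxes and
# labels. NOT the repo's voc_loader.py; names and layout are assumptions.
import xml.etree.ElementTree as ET

def parse_voc_annotation(xml_string, cats_to_ids):
    """Returns (boxes, labels): boxes as [xmin, ymin, xmax, ymax], labels as class ids."""
    root = ET.fromstring(xml_string)
    boxes, labels = [], []
    for obj in root.findall('object'):
        name = obj.find('name').text
        if name not in cats_to_ids:  # skip classes we do not train on
            continue
        bb = obj.find('bndbox')
        boxes.append([int(bb.find(tag).text)
                      for tag in ('xmin', 'ymin', 'xmax', 'ymax')])
        labels.append(cats_to_ids[name])
    return boxes, labels
```

Whatever a loader returns here is what later gets serialized into the TFRecord file, so keeping this interface small (boxes plus integer labels) makes the rest of the pipeline reusable.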

@dvornikita (Owner)

Hi @athus1990,
I'm planning to rewrite the data pipeline with the new TensorFlow Dataset functionality to make it neater.
For now, if you want to run it on another dataset, you need to create a new loader that reads images and their annotations (similar to voc_loader.py or coco_loader.py) and write a function in datasets.py that encodes your data into TFRecords.
P.S. To train on a new dataset is almost never as easy as just changing the paths ;)

@athus1990 (Author)

@fastlater @dvornikita Thanks a lot for the comments. Yes, I prepared my data to match the VOC datasets (XML annotations and image labels for segmentation). It is just that somewhere in the middle things fail, and I need to debug using breakpoints in PyCharm. It gets pretty complicated because even the data loader uses TF and parallel reading modules, so unless you run the graph you won't know what went wrong.

I was just wondering if you could add a simple README with the steps to take for a new dataset, or at least an alternative simple data loader/reader like you mentioned. I will work on it and post my experiences and the changes I made to get it working with Cityscapes too.
Thanks a lot!

@hondaathma

hondaathma commented Oct 20, 2017

@fastlater @dvornikita I have successfully started training on Cityscapes. However, I have a question regarding the classes.

[attached image: overview of Cityscapes classes]

As you can see, in the Cityscapes dataset not all segmentation classes have detection annotations. In your code I used the same class names so they match the segmentation IDs; therefore, for segmentation class IDs 0 to 10 there are no bounding boxes at all in the XML files. Do you think this will affect the performance of the other classes that do have detections?

Also, I only changed the VOC data loader to support Cityscapes data (class names, image locations, and so on). Do you think I missed something else that I need to change in the ResNet-50 net to support this class imbalance?

@fastlater

@hondaathma Well, I haven't trained the model with new classes yet.
Three ideas for the missing bounding boxes:

  1. Check on the internet whether someone has already added the boxes and uploaded the file somewhere.
  2. Write code that scans the pixels of the segmented ground truth and automatically adds the boxes.
  3. Manually add them using labelImg (a desperate approach, but just in case testing this dataset is important to you).

If you want to try changes in the code, you will have to wait for @dvornikita's reply.
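For idea 2, a minimal numpy sketch of deriving boxes from the semantic ground truth might look like the following (a hypothetical helper, not from the repo). Note its main limitation: without instance information it produces at most one box per class, so touching objects of the same class collapse into a single large box.

```python
# Sketch: one bounding box per class from a semantic segmentation mask.
# Hypothetical helper, not from the repo. Touching instances of the same
# class merge into one box, since the mask carries no instance ids.
import numpy as np

def boxes_from_semantic_mask(mask, class_ids):
    """mask: 2D array of class ids. Returns {class_id: (xmin, ymin, xmax, ymax)}."""
    boxes = {}
    for cid in class_ids:
        ys, xs = np.where(mask == cid)
        if len(xs) == 0:  # class absent from this image
            continue
        boxes[cid] = (xs.min(), ys.min(), xs.max(), ys.max())
    return boxes
```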

@hondaathma

hondaathma commented Oct 24, 2017

@fastlater A quick update:
I was able to train with the class imbalance without much change to the code; that is, the object detector predicts boxes for all the classes that have bboxes, and the segmentation head segments all the classes, even background ones (road, sidewalk, etc.). The accuracy is not great, but this might be because Cityscapes images are very large, and resizing them to 300 or 512 doesn't make much sense.

"Create code to check pixels through the segmented ground truth and automatically add the box" is not a good idea: objects that are close together get segmented as one cluster, and putting a bounding box over all of them just gives one huge bbox. What we need is instance segmentation.
Anyway, the point is that the net still works without every class needing both bbox and segmentation ground truth.
Thanks!
Example of a good image (poor segmentation, though!):

[attached images: frankfurt_000000_000576 detection and segmentation results]

@dvornikita (Owner)

Hi @hondaathma, I think it makes perfect sense not to detect the sky, fences, vegetation, and so on. As for traffic signs, you can obtain boxes from the instance segmentation masks. In the mode where you have fine annotations, and ignoring crowd annotations for this class, I guess you will be able to train the model better. Ideally, you would not consider anchor boxes that match crowd annotations at all in the final loss, so as not to confuse the network by penalizing those detections. Unfortunately, this functionality is not implemented in our pipeline.
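Deriving boxes from instance masks while skipping crowd regions could be sketched as below. This assumes the common Cityscapes convention that individual instances are encoded as class_id * 1000 + instance_index, while crowd/group regions carry the bare class id (< 1000); verify this against cityscapesScripts before relying on it.

```python
# Sketch: boxes from a Cityscapes-style instance-id mask, skipping crowd regions.
# Assumes the Cityscapes encoding class_id * 1000 + instance_index for single
# instances, and bare class ids (< 1000) for crowd/group regions; verify
# against cityscapesScripts. Hypothetical helper, not from the repo.
import numpy as np

def boxes_from_instance_mask(inst_mask):
    """Returns a list of (class_id, xmin, ymin, xmax, ymax), one per instance."""
    boxes = []
    for inst_id in np.unique(inst_mask):
        if inst_id < 1000:  # background class or crowd region: skip
            continue
        ys, xs = np.where(inst_mask == inst_id)
        boxes.append((inst_id // 1000, xs.min(), ys.min(), xs.max(), ys.max()))
    return boxes
```

Skipping the bare-class-id regions here is exactly the "ignore crowd annotations" part: those pixels then contribute no box, so no anchor is matched to them.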

@hondaathma

Ideally, you would like to not consider anchor boxes that match to crowd annotations at all in the final loss, to not confuse the network by penalizing these detections.

Hey @dvornikita, when you say crowd annotations, do you mean boxes that fall on sky, fences, vegetation, and so on (and not on cars, pedestrians, etc.)?

One more question: how do you choose the anchor box sizes? For example, if we find the mean bounding box size of pedestrians in the images, can we use that information to change the anchor size? If so, where should I change or add more anchor sizes in the code?
The only place I see is 'aspect_ratios': [[2], [2, 3], [2, 3], [2, 3], [2], [2]] in config.py, but I don't understand what these numbers mean.
Thanks for all your help.

@dvornikita (Owner)

@hondaathma I meant something different: sometimes a bunch of cars or street signs is segmented as one instance of its class and marked "crowd" or similar. This typically happens when the instances are too small to label individually and are close to each other, so they are grouped and labeled as one thing.

The initial tiling is not conditioned on anything (not even on the classes), so you would introduce a bias by doing so. If that is your plan, do it as follows.
Let's say we have L, a list of lists, which is the parameter defining aspect ratios, e.g. L = [[2], [2, 3], [2, 3], [2, 3], [2], [2]]. It is necessary that len(L) = n, where n is the number of layers from which you do detection, i.e. the number of layers in the upscale stream. If L[i] = [2, 3], it means that for the i-th layer we use 5 sets of bounding boxes with different aspect ratios: [1, 2, 1/2, 3, 1/3]. If L[i] = [2], then it's [1, 2, 1/2], and so on.
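The expansion just described can be sketched in a few lines. This is a hypothetical helper for illustration, not the repo's actual anchor-generation code, and it ignores scales, which are set separately.

```python
# Sketch: expand one entry of the aspect_ratios list from config.py into the
# full set of ratios used for that detection layer, as described above.
# Hypothetical helper, not the repo's actual code; scales are ignored here.
def expand_ratios(layer_ratios):
    """[2, 3] -> [1, 2, 1/2, 3, 1/3]; [2] -> [1, 2, 1/2]."""
    ratios = [1.0]
    for r in layer_ratios:
        ratios.extend([float(r), 1.0 / r])
    return ratios

L = [[2], [2, 3], [2, 3], [2, 3], [2], [2]]
per_layer = [expand_ratios(lr) for lr in L]  # one ratio set per detection layer
```

So adding an entry like 4 to L[i] would give that layer two extra anchor shapes (4 and 1/4), which is how you would bias the tiling toward, e.g., tall pedestrian boxes.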

@kecaiwu

kecaiwu commented Nov 22, 2017

I want to test BlitzNet on Cityscapes; however, I get the error 'InvalidArgumentError (see above for traceback): assertion failed: [labels out of bound] [Condition x < y did not hold element-wise: x (mean_iou/confusion_matrix/control_dependency:0) = ] [11 11 11...] [y (mean_iou/ToInt64_2:0) = ] [21]'. @hondaathma, can you provide some advice?

@hondaathma

@kecaiwu I believe something is wrong with the evaluation mean AP calculation. In the code, the mean AP for detection is calculated over all the classes, but the mean for Cityscapes has to be over classes 11 to 20 only (classes 1 to 10 are all background, road, etc.).

So I had to change this function in evaluation.py:

    def compute_ap(self):
        """Computes average precision for the Cityscapes categories that have boxes."""
        aps = {}
        # Skip class ids 0-10 (background, road, etc.), which have no bounding boxes.
        for cid in range(11, self.loader.num_classes - 1):
            cat_name = self.loader.ids_to_cats[cid]
            rec, prec = self.eval_category(cid)
            ap = voc_ap(rec, prec, self.loader.year == '07')
            aps[cat_name] = ap
        print('calculated mAP for number of classes: ' + str(len(aps)))
        return aps

Try debugging it step by step to see where exactly it is failing.

@kecaiwu

kecaiwu commented Nov 23, 2017

Thank you for the help, it works now. @hondaathma

@ghost

ghost commented Dec 6, 2017

Hi @hondaathma
I just started using this network and I am also interested in training it on Cityscapes. Can I ask a few questions about how to train on it? First, where can I download the Cityscapes instance dataset? Their download page does not seem to have instance segmentation. Second, could you share what you changed to make it work on Cityscapes and how you turned the instance segmentation into bounding boxes? I really appreciate your help!

@dvornikita (Owner)

Closing due to lack of recent activity.
