How do I train this model on a custom dataset? #9

Closed
qqaadir opened this issue Aug 6, 2020 · 8 comments

Comments

@qqaadir

qqaadir commented Aug 6, 2020

Hi @soo89,

Wonderful work.

Could you please help me understand how I can train this semi-supervised model on my own dataset? I only have images and XML files, and I have not worked with the VOC/COCO datasets before. Could you suggest where I should start? Do I need to split my dataset into labelled and unlabelled subsets to perform semi-supervised learning?

I appreciate your response.

@soo89
Owner

soo89 commented Aug 7, 2020

Thanks for your interest.

I suggest you try to train the basic CSD model with PASCAL VOC first.

Also, please check the "voc07_consistency.py" file in the data folder.

Once you understand how it works, you can apply the same approach to your dataset.

@qqaadir
Author

qqaadir commented Aug 7, 2020

@soo89
Thank you for the prompt response. I will check out "voc07_consistency.py". I have trained your model on the VOC2007 and VOC2012 datasets, but I am having difficulty understanding how the semi-supervised learning works. Semi-supervised learning needs a set of labelled images and a set of unlabelled images, as shown in Fig. 1 (d) of your paper, but VOC2007 and VOC2012 are both fully labelled. How did you use partial labels in your code to perform semi-supervised learning? Could you please explain this part to me? Thank you.

@soo89
Owner

soo89 commented Aug 7, 2020

Please check 'voc07_consistency.py' in detail.

I did not use the VOC2012 labels.

In other words, training uses the VOC07 images together with their labels, and only the images from VOC12.

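    # VOC2007 samples are labeled: parse the XML annotation and set semi = 1.
    if (img_id[0][(len(img_id[0]) - 7):] == 'VOC2007'):
        target = ET.parse(self._annopath % img_id).getroot()
        img = cv2.imread(self._imgpath % img_id)
        semi = np.array([1])
    # VOC2012 samples are treated as unlabeled: use an all-zero dummy target and set semi = 0.
    elif (img_id[0][(len(img_id[0]) - 7):] == 'VOC2012'):
        img = cv2.imread(self._imgpath % img_id)
        target = np.zeros([1, 5])
        semi = np.array([0])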
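
To see the branching in the snippet above in isolation, here is a minimal sketch. It assumes `img_id` is a `(root_dir, image_name)` tuple whose first element ends in the dataset year, as the slicing in the snippet implies. The function names `semi_flag` and `placeholder_target` and the example paths are hypothetical and are not part of the repository; no image or XML file is read.

```python
import numpy as np


def semi_flag(img_id):
    """Return np.array([1]) for a labeled sample, np.array([0]) for an unlabeled one.

    Assumption: img_id[0] is the dataset root and ends with 'VOC2007' (labeled)
    or 'VOC2012' (unlabeled), mirroring the check in the snippet above.
    """
    root_dir = img_id[0]
    if root_dir.endswith('VOC2007'):
        return np.array([1])
    if root_dir.endswith('VOC2012'):
        return np.array([0])
    raise ValueError(f"unexpected dataset root: {root_dir}")


def placeholder_target():
    """All-zero dummy target used for unlabeled images, shape (1, 5) as in the snippet."""
    return np.zeros([1, 5])


labeled_id = ('/data/VOCdevkit/VOC2007', '000001')    # hypothetical path
unlabeled_id = ('/data/VOCdevkit/VOC2012', '000001')  # hypothetical path
print(semi_flag(labeled_id), semi_flag(unlabeled_id), placeholder_target().shape)
```

For a custom dataset, the same idea would apply: whichever folder holds your annotated images plays the role of VOC2007, and the folder with images only plays the role of VOC2012.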

@qqaadir
Author

qqaadir commented Aug 7, 2020

@soo89 Got it, thanks. But I am still not sure how to train on my own dataset. The easiest method that comes to mind is to replace the XML and JPG files in their respective folders with my own files and create trainval.txt files (lists of filenames) for both VOC2007 and VOC2012. I would also have to rename my files to the 000001.xml (and .jpg) pattern. What do you think? Am I missing something?

@soo89
Owner

soo89 commented Aug 7, 2020

I also think that is the easiest way.
I don't see any problem with it. Good luck.
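
The file-list step discussed above could be sketched roughly as follows. This assumes the standard PASCAL VOC layout (`JPEGImages/`, `Annotations/`, `ImageSets/Main/trainval.txt` under each year folder); the helper name `write_trainval` and its `require_xml` parameter are hypothetical and do not come from this repository.

```python
from pathlib import Path


def write_trainval(voc_year_dir, require_xml=False):
    """Write ImageSets/Main/trainval.txt listing every .jpg stem in JPEGImages.

    Set require_xml=True for the labeled folder (the VOC2007 role) to make sure
    each image has a matching Annotations/<stem>.xml file.
    """
    root = Path(voc_year_dir)
    stems = sorted(p.stem for p in (root / "JPEGImages").glob("*.jpg"))

    if require_xml:
        missing = [s for s in stems
                   if not (root / "Annotations" / f"{s}.xml").exists()]
        if missing:
            raise FileNotFoundError(f"no annotation for: {missing}")

    out_dir = root / "ImageSets" / "Main"
    out_dir.mkdir(parents=True, exist_ok=True)
    # One image id per line, without the file extension.
    (out_dir / "trainval.txt").write_text("".join(s + "\n" for s in stems))
    return stems
```

A possible usage would be `write_trainval("VOCdevkit/VOC2007", require_xml=True)` for the labeled images and `write_trainval("VOCdevkit/VOC2012")` for the unlabeled ones, where both paths are placeholders for your own folders.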

@qqaadir
Author

qqaadir commented Aug 7, 2020

@soo89 One last question before you close this thread. After training the model with train_csd.py and evaluating it with eval.py, what will the output be? Will I get label files (.xml) with bounding boxes for the unlabelled data? Also, do I have to train with both train_ssd.py and train_csd.py and then run eval.py, or can I just run train_csd.py and then eval.py for the semi-supervised case?

@soo89
Owner

soo89 commented Aug 7, 2020

It does not need a pre-trained supervised object detection model.
That means you can train with 'train_csd.py' alone.

The output of 'eval.py' is the mAP score on the VOC07 test set.
To get boxes and scores, you need to find or write the code for that yourself.
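
As a rough illustration of what such code might look like, here is a sketch that turns a list of detections into a VOC-style XML tree. It does not run the detector: it assumes you have already collected detections as `(label, score, xmin, ymin, xmax, ymax)` tuples in pixel coordinates. The function name `detections_to_voc_xml`, that tuple format, and the 0.5 threshold are all assumptions, not part of this repository.

```python
import xml.etree.ElementTree as ET


def detections_to_voc_xml(filename, width, height, detections, score_thresh=0.5):
    """Build a VOC-style annotation tree from already-computed detections.

    detections: iterable of (label, score, xmin, ymin, xmax, ymax) tuples
    (an assumed format; adapt it to whatever your detection code returns).
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)

    for label, score, xmin, ymin, xmax, ymax in detections:
        if score < score_thresh:
            continue  # skip low-confidence boxes
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = label
        # 'score' is not a standard VOC field; it is kept here for reference.
        ET.SubElement(obj, "score").text = f"{score:.3f}"
        box = ET.SubElement(obj, "bndbox")
        for tag, value in zip(("xmin", "ymin", "xmax", "ymax"),
                              (xmin, ymin, xmax, ymax)):
            ET.SubElement(box, tag).text = str(int(round(value)))
    return ET.ElementTree(root)
```

The returned tree can be saved with `tree.write("000001.xml")`, which would give pseudo-label files in the same general shape as the annotations used for the labeled images.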

@qqaadir
Author

qqaadir commented Aug 8, 2020

Thanks for answering all my questions. I will contact you again if I have any others. You can close this thread.

soo89 closed this as completed Aug 16, 2020