Dataset Generation #2

Closed
XiongweiWu opened this issue Aug 10, 2020 · 3 comments

Comments

@XiongweiWu

Three questions:

  1. In 1_split_filter.py#L46-L48, as I understand it, the sampled images should not contain objects from the VOC classes. However, this implementation seems to exclude only the images with tiny objects.

  2. In 2_balance.py#L57, does each category contain no more than 80 instances?

  3. How do I generate final_split_voc_10_shot_instances_train2017.json?

@fanq15
Owner

fanq15 commented Aug 10, 2020

  1. It picks the non-VOC classes.
  2. 80 is the minimum number of instances in each class.
  3. You can use the given final_split_voc_10_shot_instances_train2017.json in the new_annotations directory for a fair comparison.

@XiongweiWu
Author

XiongweiWu commented Aug 10, 2020

@fanq15

  1. So in your non-VOC set, the images may also contain VOC-class instances (just not labeled)?

  2. It seems that you first compute the total number of instances per class across all images and store it in 'all_cls_dict'. Then, for each image, if any category it contains has fewer than 80 instances in 'all_cls_dict', you save all instances in that image for training; otherwise you discard all of its instances and remove the instances whose count is larger than 80. I am a bit confused about this file.

  3. Can you provide a 30-shot json file?
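
For reference, the reading in item 2 above can be sketched roughly as follows. This is only an illustration of that description, not the code in 2_balance.py: the function names, the COCO-style `image_id` / `category_id` fields, and the `threshold` parameter are assumptions, and the final clause about removing instances above 80 is left out because its meaning is unclear.

```python
from collections import Counter


def count_instances_per_class(annotations):
    """Count annotated instances per category (the role 'all_cls_dict' seems to play)."""
    return Counter(ann["category_id"] for ann in annotations)


def keep_images_with_rare_class(annotations, threshold=80):
    """Keep every instance of an image if any of its categories is rare.

    A category is treated as rare when its dataset-wide instance count is
    below `threshold` (hypothetical parameter; the thread mentions 80).
    Images that contain only common categories are dropped entirely.
    """
    all_cls_dict = count_instances_per_class(annotations)

    # Group annotations by the image they belong to.
    by_image = {}
    for ann in annotations:
        by_image.setdefault(ann["image_id"], []).append(ann)

    kept = []
    for anns in by_image.values():
        if any(all_cls_dict[ann["category_id"]] < threshold for ann in anns):
            kept.extend(anns)
    return kept
```

With a rule like this, common categories still enter the kept set whenever they share an image with a rare one, so per-class counts are not capped at the threshold.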

@fanq15
Owner

fanq15 commented Aug 10, 2020

  1. Yes. The VOC instances are ignored.
  2. About 2_balance.py:
    2.1. Yes, it should be the instance number per class. I have fixed the wording in my earlier answer.
    2.2. There is a bug in 2_balance.py: it does not actually balance the categories. This bug does not affect training or evaluation, though. I will fix it and see whether image balancing improves performance.
  3. There is currently no 30-shot json file. I will add one later.
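
To illustrate what "the VOC instances are ignored" in item 1 could look like on a COCO-style annotation dict, here is a minimal sketch. It is not taken from 1_split_filter.py: the function name `drop_voc_annotations`, the `voc_names` argument, and the choice to keep every image (even one left with no annotations) are assumptions made for the example.

```python
def drop_voc_annotations(coco, voc_names):
    """Return a copy of a COCO-style dict without annotations of the given classes.

    Images are kept as they are, so an image may still show objects of the
    dropped classes; those objects simply carry no label any more.
    """
    voc_ids = {c["id"] for c in coco["categories"] if c["name"] in voc_names}
    return {
        "images": list(coco["images"]),
        "categories": [c for c in coco["categories"] if c["id"] not in voc_ids],
        "annotations": [
            a for a in coco["annotations"] if a["category_id"] not in voc_ids
        ],
    }


# Toy input: one image with a VOC-class instance and a non-VOC instance.
toy = {
    "images": [{"id": 1}],
    "categories": [{"id": 1, "name": "person"}, {"id": 2, "name": "giraffe"}],
    "annotations": [
        {"image_id": 1, "category_id": 1},
        {"image_id": 1, "category_id": 2},
    ],
}
non_voc = drop_voc_annotations(toy, voc_names={"person"})
```

In the toy case the image stays in the set and only the giraffe annotation remains, which matches the situation asked about: unlabeled VOC objects can still be visible in non-VOC training images.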
