Description
System information
- What is the top-level directory of the model you are using: object_detection
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes and No
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04
- TensorFlow installed from (source or binary): Source
- TensorFlow version (use command below): 1.7
- Python version: 2.7
- Bazel version (if compiling from source): 0.13.0
- GCC/Compiler version (if compiling from source): gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
- CUDA/cuDNN version: 9 / 7
- GPU model and memory: 4GB GTX 1050
- Exact command to reproduce:
I have a general question that I have been wondering about for quite a while, and nobody has been able to answer it so far:
Why is the num_classes variable set to 90 in all object_detection/samples/configs files for the COCO dataset, although the dataset only contains 80 classes? Does it have something to do with the COCO-Stuff classes?
If I want to train a model from scratch on the standard 80 COCO classes, do I need to set num_classes to 80 or to 90?
Basically, what is the purpose of these 10 extra classes?
EDIT 1: After closely comparing the COCO categories with mscoco_label_map, it is clear that both contain the same 80 classes, but the label map TensorFlow uses by default skips 10 class IDs, seemingly at random.
Does anybody have an idea why that is?
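To make the comparison above reproducible, here is a minimal sketch of how the skipped IDs can be found. It assumes the label map is plain text with `id: N` fields; the excerpt is a simplified four-entry sample in the style of mscoco_label_map.pbtxt (not the full file), and the helper name missing_ids is my own.

```python
import re

# Simplified excerpt in the style of mscoco_label_map.pbtxt.
# Only four entries are shown; the real file has 80.
LABEL_MAP_EXCERPT = """
item {
  id: 10
  display_name: "traffic light"
}
item {
  id: 11
  display_name: "fire hydrant"
}
item {
  id: 13
  display_name: "stop sign"
}
item {
  id: 14
  display_name: "parking meter"
}
"""


def missing_ids(pbtxt_text):
    """Return the IDs skipped between the lowest and highest ID in the text."""
    ids = sorted(int(m) for m in re.findall(r"\bid:\s*(\d+)", pbtxt_text))
    present = set(ids)
    return [i for i in range(ids[0], ids[-1] + 1) if i not in present]


if __name__ == "__main__":
    print("skipped IDs in excerpt:", missing_ids(LABEL_MAP_EXCERPT))
```

Running the same function on the full label map text should list every gap between the lowest and highest ID.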
EDIT 2: This leads me to another question: what influence does the num_classes variable have on the model dimensions? Where is it used, and which value (probably some number of filters) is it multiplied with?
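To make the question concrete, here is a rough sketch of the arithmetic I have in mind. It rests on an assumption I have not verified against the source: that the class prediction layer outputs one score per anchor per class, plus one background slot. The function name and the value of 6 anchors per location are hypothetical.

```python
def class_head_channels(num_classes, anchors_per_location):
    """Output channels of a class prediction layer, assuming one score per
    anchor per class plus one background slot (an assumption, see above)."""
    return anchors_per_location * (num_classes + 1)


if __name__ == "__main__":
    # Hypothetical setting: 6 anchors per feature-map location.
    for n in (80, 90):
        print(n, "classes ->", class_head_channels(n, anchors_per_location=6), "channels")
```

Under this assumption, 90 instead of 80 classes adds 60 output channels to such a layer.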
EDIT 3: I now know that COCO initially had 91 classes, 11 of which were removed. This explains the missing IDs, but it also brings back my main point of confusion:
When training a new model, you need to set the num_classes variable in the config. The default for COCO is 90, although the dataset only has 80 classes. The label map TF uses also contains 80 entries, but the IDs go up to 90, just as in COCO itself.
And here is the crux: as far as I can tell, the num_classes variable defines the size of one dimension of the convolutional filters, so if you set it to 90 but only use 80 classes, you get 10 redundant dimensions.
The problem is that the num_classes variable also sets the highest valid class ID. Therefore I don't know how to train on only a selected subset of classes.
Let's say I want to train on the first ten class IDs and on the last one (the 90th: toothbrush). Then I still have to set num_classes to 90, or the 90th ID won't get added.
This is very confusing, and I have no idea how to work around this problem.
It would be awesome if someone with deeper knowledge of this could let me know!
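The only idea I can sketch so far is remapping the selected IDs to a contiguous range 1..N, so that num_classes could equal the number of selected classes. This is an untested assumption on my side: it presumes the annotations and a custom label map are both rewritten with the new IDs. The helper name build_contiguous_mapping is hypothetical.

```python
def build_contiguous_mapping(selected_ids):
    """Map sparse original class IDs to contiguous IDs starting at 1."""
    return {orig: new for new, orig in enumerate(sorted(selected_ids), start=1)}


if __name__ == "__main__":
    # The example from above: the first ten class IDs plus ID 90 (toothbrush).
    selected = list(range(1, 11)) + [90]
    mapping = build_contiguous_mapping(selected)
    num_classes = len(mapping)
    print("num_classes:", num_classes)
    print("original ID 90 becomes:", mapping[90])
```

With this mapping, the example would need num_classes = 11 rather than 90, provided the remapped IDs are used consistently in the training data.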
Here are links to the relevant files:
label_map_util where num_classes is used: https://github.com/tensorflow/models/blob/master/research/object_detection/utils/label_map_util.py
coco label_map with 90 ids but 80 classes: https://github.com/tensorflow/models/blob/master/research/object_detection/data/mscoco_label_map.pbtxt
Faster RCNN MetaArch where num_classes is used for spatial size: https://github.com/tensorflow/models/blob/master/research/object_detection/meta_architectures/faster_rcnn_meta_arch.py