Prerequisites
Please answer the following questions for yourself before submitting an issue.
- I am using the latest TensorFlow Model Garden release and TensorFlow 2.
- I am reporting the issue to the correct repository. (Model Garden official or research directory)
- I checked to make sure that this issue has not already been filed.
1. The entire URL of the file you are using
https://github.com/tensorflow/models/tree/master/research/object_detection
2. Describe the bug
After the July 2020 update of the Object Detection (OD) API, MobileNet V1 training performance degraded significantly compared with the older version of the API.
Training parameters were captured for a custom dataset, and the observations are included in this issue.
With the older API, the model started producing meaningful detections at around step 500. With the updated API, however, there were no detections and no noticeable reduction in loss up to step 2000. Logs and details are attached below.
3. Steps to reproduce
- Train MobileNet V1 on a custom dataset with both the latest release and the June 2020 source code from the master branch
- Compare the training performance by monitoring the loss and the detection images
- Use TensorBoard for all parameters
Observe the results in TensorBoard and in your custom detection modules. The TensorBoard results are presented here.
ssd_mobilenet_v1_coco.config.zip
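For anyone who wants to compare the two runs without TensorBoard, here is a minimal sketch that pulls the metrics out of a `Saving dict for global step ...` evaluation line. The function name `parse_eval_line` and the regular expression are my own assumptions, not part of the OD API, and the sample string is an abbreviated copy of the old-API log in section 5. A log line that has been wrapped across two lines needs to be joined first.

```python
import re

# Matches "name = value" pairs such as "Loss/total_loss = 6.5624743".
# Metric names may contain slashes, spaces, parentheses, "@" and dots.
PAIR = re.compile(
    r"([A-Za-z_][^=,:]*?)\s*=\s*(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)"
)


def parse_eval_line(line):
    """Return (global_step, metrics) for one 'Saving dict' log line.

    Hypothetical helper: returns None when the line is not an
    evaluation summary line.
    """
    header = re.search(r"Saving dict for global step (\d+):", line)
    if header is None:
        return None
    body = line[header.end():]
    metrics = {name.strip(): float(value) for name, value in PAIR.findall(body)}
    return int(header.group(1)), metrics


# Abbreviated sample taken from the pre-July 2020 log.
sample = (
    "INFO:tensorflow:Saving dict for global step 664: "
    "DetectionBoxes_Precision/mAP = 0.3067213, "
    "DetectionBoxes_Precision/mAP (large) = 0.30674505, "
    "DetectionBoxes_Precision/mAP (medium) = -1.0, "
    "DetectionBoxes_Precision/mAP@.50IOU = 0.6123704, "
    "Loss/classification_loss = 5.283017, Loss/total_loss = 6.5624743"
)

step, metrics = parse_eval_line(sample)
print(step, metrics["DetectionBoxes_Precision/mAP"], metrics["Loss/total_loss"])
```

Running this over the evaluation lines of both training logs gives two dictionaries that can be compared key by key.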
4. Expected behavior
The TF2 upgrade should not break TF1 behavior; backward compatibility should be preserved.
5. Additional context
TF Old API, Pre July 2020 Release
Git Commit ID: 420a725
Training Steps: 701
Loss: 6.5624743
Accumulating evaluation results...
DONE (t=0.14s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.307
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.612
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.274
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.307
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.327
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.535
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.535
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.535
INFO:tensorflow:Finished evaluation at 2020-08-03-16:46:07
I0803 16:46:07.283921 4533624256 evaluation.py:275] Finished evaluation at 2020-08-03-16:46:07
INFO:tensorflow:Saving dict for global step 664: DetectionBoxes_Precision/mAP = 0.3067213, DetectionBoxes_Precision/mAP (large) = 0.30674505, DetectionBoxes_Precision/mAP (medium) = -1.0, DetectionBoxes_Precision/mAP (small) = -1.0, DetectionBoxes_Precision/mAP@.50IOU = 0.6123704, DetectionBoxes_Precision/mAP@.75IOU = 0.2738431, DetectionBoxes_Recall/AR@1 = 0.32653588, DetectionBoxes_Recall/AR@10 = 0.5347421, DetectionBoxes_Recall/AR@100 = 0.5347421, DetectionBoxes_Recall/AR@100 (large) = 0.5347421, DetectionBoxes_Recall/AR@100 (medium) = -1.0, DetectionBoxes_Recall/AR@100 (small) = -1.0, Loss/classification_loss = 5.283017, Loss/localization_loss = 0.9461733, Loss/regularization_loss = 0.33328548,
Loss/total_loss = 6.5624743, global_step = 664, learning_rate = 0.004, loss = 6.5624743


TF New API, Post July 2020 Release
Git Commit ID: 57253eb
Training Steps: 781
Loss: 15.350301
Training Log
Accumulating evaluation results...
DONE (t=0.14s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.001
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.006
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.001
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.001
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.017
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.031
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.031
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.031
INFO:tensorflow:Finished evaluation at 2020-08-03-14:46:19
I0803 14:46:19.184983 4519620032 evaluation.py:275] Finished evaluation at 2020-08-03-14:46:19
INFO:tensorflow:Saving dict for global step 781: DetectionBoxes_Precision/mAP = 0.0014864391, DetectionBoxes_Precision/mAP (large) = 0.0014864391, DetectionBoxes_Precision/mAP (medium) = -1.0, DetectionBoxes_Precision/mAP (small) = -1.0, DetectionBoxes_Precision/mAP@.50IOU = 0.0055461, DetectionBoxes_Precision/mAP@.75IOU = 0.00060619856, DetectionBoxes_Recall/AR@1 = 0.017431447, DetectionBoxes_Recall/AR@10 = 0.031408027, DetectionBoxes_Recall/AR@100 = 0.031408027, DetectionBoxes_Recall/AR@100 (large) = 0.031408027, DetectionBoxes_Recall/AR@100 (medium) = -1.0, DetectionBoxes_Recall/AR@100 (small) = -1.0, Loss/classification_loss = 12.582876, Loss/localization_loss = 2.3171208, Loss/regularization_loss = 0.45030272,
Loss/total_loss = 15.350301, global_step = 781, learning_rate = 0.004, loss = 15.350301
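To make the gap between the two runs easier to read, here is a small sketch that uses only the loss values copied from the two evaluation logs above. It checks that the reported total loss is the sum of the three components and computes how much larger each component is with the new API. The helper names are hypothetical, and the two evaluations were taken at different steps (664 and 781), so the ratios are only a rough comparison.

```python
# Loss components copied from the "Saving dict" lines of the two logs.
old_run = {  # commit 420a725, global step 664
    "classification": 5.283017,
    "localization": 0.9461733,
    "regularization": 0.33328548,
    "total": 6.5624743,
}
new_run = {  # commit 57253eb, global step 781
    "classification": 12.582876,
    "localization": 2.3171208,
    "regularization": 0.45030272,
    "total": 15.350301,
}

COMPONENTS = ("classification", "localization", "regularization")


def components_match_total(run, tol=1e-4):
    """True if the three components add up to the reported total loss."""
    return abs(sum(run[name] for name in COMPONENTS) - run["total"]) < tol


def component_ratios(old, new):
    """Ratio new/old for each component and for the total loss."""
    return {name: new[name] / old[name] for name in COMPONENTS + ("total",)}


ratios = component_ratios(old_run, new_run)
print(components_match_total(old_run), components_match_total(new_run))
for name, value in ratios.items():
    print(f"{name}: {value:.2f}x")
```

In these logs the classification and localization losses are both more than twice as large with the new API, while the regularization loss grows much less.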


6. System information
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): macOS Catalina 10.15.6, 2.7 GHz Quad-Core Intel Core i7, 16 GB 2133 MHz LPDDR3
- TensorFlow version (use command below): v1.15.0-rc3-22-g590d6eef7e 1.15.0
- Python version: Python 3.7.7