Significant performance/precision degradation in Object Detection module after the July release. #9030

@gowrishankarin

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • I am using the latest TensorFlow Model Garden release and TensorFlow 2.
  • I am reporting the issue to the correct repository. (Model Garden official or research directory)
  • I checked to make sure that this issue has not already been filed.

1. The entire URL of the file you are using

https://github.com/tensorflow/models/tree/master/research/object_detection

2. Describe the bug

After the July 2020 update of the Object Detection (OD) API, MobileNet V1 training performance degraded significantly compared to the older version of the API.
The training metrics below were captured on a custom dataset, and the observations are included as part of this issue.

With the older API, the OD module started producing meaningful detections at around the 500th training step. With the updated OD API, training produced no detection results and no noticeable reduction in loss for up to 2000 steps. The logs and details are attached below.

3. Steps to reproduce

  • Train MobileNet V1 on a custom dataset twice: once with the latest release and once with the June 2020 source code from the master branch
  • Compare the training performance by monitoring the loss and the images
  • Use TensorBoard to track all parameters
    Observe the results in TensorBoard and in your custom detection modules. The TensorBoard results are presented here.

ssd_mobilenet_v1_coco.config.zip

4. Expected behavior

The TF2 upgrade should not break TF1 behavior; backward compatibility should be ensured.

5. Additional context

TF OD APIs Pre July 2020 Release

Git Commit ID: 420a725
Training Steps: 701
Loss: 6.5624743

Accumulating evaluation results...
DONE (t=0.14s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.307
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.612
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.274
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.307
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.327
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.535
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.535
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.535
INFO:tensorflow:Finished evaluation at 2020-08-03-16:46:07
I0803 16:46:07.283921 4533624256 evaluation.py:275] Finished evaluation at 2020-08-03-16:46:07
INFO:tensorflow:Saving dict for global step 664: DetectionBoxes_Precision/mAP = 0.3067213, DetectionBoxes_Precision/mAP (large) = 0.30674505, DetectionBoxes_Precision/mAP (medium) = -1.0, DetectionBoxes_Precision/mAP (small) = -1.0, DetectionBoxes_Precision/mAP@.50IOU = 0.6123704, DetectionBoxes_Precision/mAP@.75IOU = 0.2738431, DetectionBoxes_Recall/AR@1 = 0.32653588, DetectionBoxes_Recall/AR@10 = 0.5347421, DetectionBoxes_Recall/AR@100 = 0.5347421, DetectionBoxes_Recall/AR@100 (large) = 0.5347421, DetectionBoxes_Recall/AR@100 (medium) = -1.0, DetectionBoxes_Recall/AR@100 (small) = -1.0, Loss/classification_loss = 5.283017, Loss/localization_loss = 0.9461733, Loss/regularization_loss = 0.33328548,
Loss/total_loss = 6.5624743, global_step = 664, learning_rate = 0.004, loss = 6.5624743
[Two images attached]
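
For readers comparing the two evaluation summaries, the lines above can be turned into a dictionary with a few lines of Python. This is only a minimal sketch: the function name `parse_coco_summary` and the regular expression are illustrative assumptions, not part of the Object Detection API. It also treats `-1.000` as "not applicable", on the understanding that pycocotools-style summaries print -1 when no ground-truth boxes fall into that area bucket (here: small and medium).

```python
import re

# Matches pycocotools-style summary lines such as:
#   Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.307
SUMMARY_RE = re.compile(
    r"Average (?:Precision|Recall)\s+\((AP|AR)\)\s+@\[\s*IoU=([\d.:]+)\s*\|"
    r"\s*area=\s*(\w+)\s*\|\s*maxDets=\s*(\d+)\s*\]\s*=\s*(-?\d+\.\d+)"
)


def parse_coco_summary(text):
    """Return {(metric, iou, area, max_dets): value or None} (hypothetical helper)."""
    results = {}
    for metric, iou, area, max_dets, value in SUMMARY_RE.findall(text):
        value = float(value)
        # A negative value marks an empty bucket, not a real score.
        results[(metric, iou, area, int(max_dets))] = None if value < 0 else value
    return results


# Three lines copied from the pre-July evaluation output above.
sample = """
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.307
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.327
"""
parsed = parse_coco_summary(sample)
for key, value in parsed.items():
    print(key, "n/a" if value is None else value)
```

Mapping the empty buckets to `None` keeps them from being averaged or plotted as if they were real scores.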

TF New API, Post July 2020 Release

Git Commit ID: 57253eb
Training Steps: 781
Loss: 15.350301

Training Log

Accumulating evaluation results...
DONE (t=0.14s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.001
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.006
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.001
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.001
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.017
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.031
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.031
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.031
INFO:tensorflow:Finished evaluation at 2020-08-03-14:46:19
I0803 14:46:19.184983 4519620032 evaluation.py:275] Finished evaluation at 2020-08-03-14:46:19
INFO:tensorflow:Saving dict for global step 781: DetectionBoxes_Precision/mAP = 0.0014864391, DetectionBoxes_Precision/mAP (large) = 0.0014864391, DetectionBoxes_Precision/mAP (medium) = -1.0, DetectionBoxes_Precision/mAP (small) = -1.0, DetectionBoxes_Precision/mAP@.50IOU = 0.0055461, DetectionBoxes_Precision/mAP@.75IOU = 0.00060619856, DetectionBoxes_Recall/AR@1 = 0.017431447, DetectionBoxes_Recall/AR@10 = 0.031408027, DetectionBoxes_Recall/AR@100 = 0.031408027, DetectionBoxes_Recall/AR@100 (large) = 0.031408027, DetectionBoxes_Recall/AR@100 (medium) = -1.0, DetectionBoxes_Recall/AR@100 (small) = -1.0, Loss/classification_loss = 12.582876, Loss/localization_loss = 2.3171208, Loss/regularization_loss = 0.45030272,
Loss/total_loss = 15.350301, global_step = 781, learning_rate = 0.004, loss = 15.350301
[Two images attached]
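
To put the two runs side by side, the `Saving dict for global step` lines can be parsed and compared key by key. The snippet below is a rough sketch under stated assumptions: `parse_eval_dict` is a hypothetical helper, and the two sample lines are abbreviated copies of the log lines above that keep only a few of the reported values.

```python
import re


def parse_eval_dict(line):
    """Parse the 'name = value' pairs that follow 'global step N:' (hypothetical helper)."""
    match = re.search(r"global step \d+:\s*(.*)", line, re.S)
    if match is None:
        return {}
    metrics = {}
    for item in match.group(1).split(","):
        name, sep, value = item.rpartition(" = ")
        if sep:  # skip empty fragments such as a trailing comma
            metrics[name.strip()] = float(value)
    return metrics


# Abbreviated from the logs in this issue (commit 420a725 vs. commit 57253eb).
OLD_LINE = (
    "INFO:tensorflow:Saving dict for global step 664: "
    "DetectionBoxes_Precision/mAP = 0.3067213, "
    "DetectionBoxes_Precision/mAP@.50IOU = 0.6123704, "
    "Loss/total_loss = 6.5624743, global_step = 664"
)
NEW_LINE = (
    "INFO:tensorflow:Saving dict for global step 781: "
    "DetectionBoxes_Precision/mAP = 0.0014864391, "
    "DetectionBoxes_Precision/mAP@.50IOU = 0.0055461, "
    "Loss/total_loss = 15.350301, global_step = 781"
)

old = parse_eval_dict(OLD_LINE)
new = parse_eval_dict(NEW_LINE)
for key in sorted(old.keys() & new.keys()):
    print(f"{key}: {old[key]} -> {new[key]}")
```

The two evaluations were taken at different global steps (664 and 781), so the comparison is indicative rather than exact.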

6. System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): macOS Catalina 10.15.6, 2.7 GHz Quad-Core Intel Core i7, 16 GB 2133 MHz LPDDR3
  • TensorFlow version (use command below): v1.15.0-rc3-22-g590d6eef7e 1.15.0
  • Python version: Python 3.7.7

Labels: models:research (models that come under research directory), type:bug (Bug in the code)