[OD] 6.0 inference regression between CPU and GPU. #2955

jakesabathia2 · 2020-01-24T20:34:21Z

In 6.0, CPU and GPU inference give different predictions.
For the same image:
GPU:

#predictions
{'confidence': 0.6841502785682678, 'type': 'rectangle', 'coordinates': {'y': 167.98015236854553, 'x': 289.0607535839081, 'height': 238.21115112304688, 'width': 301.1848449707031}, 'label': 'Croissant'}
{'confidence': 0.5636424422264099, 'type': 'rectangle', 'coordinates': {'y': 168.73490810394287, 'x': 351.99418142437935, 'height': 183.61944580078125, 'width': 161.6607208251953}, 'label': 'Croissant'}
{'confidence': 0.5182375311851501, 'type': 'rectangle', 'coordinates': {'y': 101.34550631046295, 'x': 123.91261160373688, 'height': 176.6996612548828, 'width': 241.56607055664062}, 'label': 'Croissant'}
{'confidence': 0.4028134346008301, 'type': 'rectangle', 'coordinates': {'y': 210.60317158699036, 'x': 199.7049316763878, 'height': 175.12136840820312, 'width': 237.15415954589844}, 'label': 'Croissant'}
{'confidence': 0.1619817614555359, 'type': 'rectangle', 'coordinates': {'y': 106.30784332752228, 'x': 233.0048382282257, 'height': 219.98683166503906, 'width': 288.35015869140625}, 'label': 'Croissant'}
{'confidence': 0.10118640214204788, 'type': 'rectangle', 'coordinates': {'y': 213.20754289627075, 'x': 361.471713334322, 'height': 101.99461364746094, 'width': 127.2068862915039}, 'label': 'Croissant'}
{'confidence': 0.027622569352388382, 'type': 'rectangle', 'coordinates': {'y': 65.32773971557617, 'x': 268.4330843389034, 'height': 119.54254913330078, 'width': 164.8693084716797}, 'label': 'Croissant'}
{'confidence': 0.010357137769460678, 'type': 'rectangle', 'coordinates': {'y': 84.39711928367615, 'x': 118.78975331783295, 'height': 107.74749755859375, 'width': 172.99261474609375}, 'label': 'Croissant'}
{'confidence': 0.00467892037704587, 'type': 'rectangle', 'coordinates': {'y': 90.58223068714142, 'x': 315.67183434963226, 'height': 175.4208984375, 'width': 256.6126708984375}, 'label': 'Croissant'}

#evaluation
{'average_precision': {'Coffee': 0.0, 'Croissant': 0.23888888955116272, 'Waffle': 0.0, 'Bagel': 0.0, 'Egg': 0.0, 'Banana': 0.0}}

CPU:

{'confidence': 0.6783925890922546, 'type': 'rectangle', 'coordinates': {'y': 167.71613359451294, 'x': 288.49523663520813, 'height': 239.19659423828125, 'width': 301.3323974609375}, 'label': 'Croissant'}
{'confidence': 0.5581294298171997, 'type': 'rectangle', 'coordinates': {'y': 101.20467245578766, 'x': 123.50158989429474, 'height': 175.7798309326172, 'width': 240.85716247558594}, 'label': 'Croissant'}
{'confidence': 0.5499178171157837, 'type': 'rectangle', 'coordinates': {'y': 168.42872500419617, 'x': 351.87367647886276, 'height': 183.6824951171875, 'width': 160.7376708984375}, 'label': 'Croissant'}
{'confidence': 0.4112585783004761, 'type': 'rectangle', 'coordinates': {'y': 210.60165166854858, 'x': 200.05664974451065, 'height': 175.11065673828125, 'width': 237.96539306640625}, 'label': 'Croissant'}
{'confidence': 0.18294312059879303, 'type': 'rectangle', 'coordinates': {'y': 106.45350515842438, 'x': 233.51837396621704, 'height': 220.2215576171875, 'width': 289.7857971191406}, 'label': 'Croissant'}
{'confidence': 0.11228813976049423, 'type': 'rectangle', 'coordinates': {'y': 213.52950632572174, 'x': 361.25523895025253, 'height': 100.96805572509766, 'width': 126.85247039794922}, 'label': 'Croissant'}
{'confidence': 0.06214667111635208, 'type': 'rectangle', 'coordinates': {'y': 88.37236762046814, 'x': 264.0897363424301, 'height': 159.9544677734375, 'width': 179.37753295898438}, 'label': 'Croissant'}
{'confidence': 0.010909981094300747, 'type': 'rectangle', 'coordinates': {'y': 84.82984900474548, 'x': 119.08041089773178, 'height': 108.2442855834961, 'width': 171.14401245117188}, 'label': 'Croissant'}
{'confidence': 0.004227335564792156, 'type': 'rectangle', 'coordinates': {'y': 109.82267260551453, 'x': 322.89858385920525, 'height': 192.29393005371094, 'width': 212.18304443359375}, 'label': 'Croissant'}

#evaluation
{'average_precision': {'Coffee': 0.0, 'Croissant': 0.28333333134651184, 'Waffle': 0.0, 'Bagel': 0.0, 'Egg': 0.0, 'Banana': 0.0}}

The differences in both predict() and evaluate() are not negligible.

[First Step] Compare 5.8's CPU and GPU with 6.0's CPU and GPU
I found that only 6.0's CPU has the inference regression, that is to say, the issue is on the TensorFlow side.

[Second Step] Compare TensorFlow with MXNet
I loaded the same weights into the TF and MXNet models and compared the outputs layer by layer.
The max error in the output feature tensor is on the order of 5e-5, which is fine.
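
A minimal sketch of this kind of layer-by-layer check, assuming each model's activations have already been dumped into a dict of flattened lists keyed by layer name. The helper names, the tolerance and the toy numbers are all made up for illustration; they are not taken from the actual comparison script.

```python
def max_abs_error(a, b):
    """Largest element-wise absolute difference between two flat sequences."""
    if len(a) != len(b):
        raise ValueError("shape mismatch: %d vs %d" % (len(a), len(b)))
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def first_divergent_layer(outputs_a, outputs_b, tol=1e-4):
    """Return (layer_name, error) for the first layer whose error exceeds tol.

    outputs_a / outputs_b map layer name -> flattened activations, inserted in
    forward order (dicts keep insertion order in Python 3.7+).
    Returns None when every layer is within tol.
    """
    for name, activations in outputs_a.items():
        err = max_abs_error(activations, outputs_b[name])
        if err > tol:
            return name, err
    return None


# Toy activations, invented for illustration only.
tf_out = {"conv0": [0.10, 0.20, 0.30], "conv1": [1.00, 2.00, 3.00]}
mx_out = {"conv0": [0.10, 0.20, 0.30001], "conv1": [1.00, 2.00, 3.70]}

result = first_divergent_layer(tf_out, mx_out)
print(result)  # first layer whose max error exceeds the tolerance
```

Walking the layers in forward order helps localize where two backends first disagree, rather than only seeing that the final output differs.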

[Third Step] Compare raw output from TensorFlow and MXNet through tc's API
I compared the raw output tensors before NMS for TensorFlow and MXNet through tc's API.
Surprisingly, the output tensor itself has an error of up to 0.7.

[Fourth Step] Compare raw input for the TensorFlow and MXNet models
I compared the input tensors for TensorFlow and MXNet and found that they differ by up to 0.17.

[Fifth Step] Mock out the augmenter
I resized all images to 412 * 412 beforehand to take the TF image augmenter's resize out of the picture, and observed perfectly aligned results across TF and MXNet.

[Sixth Step] TF's bilinear resize is the source of the regression
I found existing issue reports that TensorFlow's bilinear resize does not match other open-source APIs such as cv2 and MXNet.
So I replaced the TF resize with cv2.
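
To illustrate how two "bilinear" resizers can disagree, here is a small pure-Python sketch of 1-D linear resampling under two common coordinate conventions (corner-aligned vs. half-pixel centers). This is only an illustration of the general effect: the thread does not say which convention each library uses, and `resize_linear_1d` is a hypothetical helper, not code from TensorFlow, cv2 or MXNet.

```python
import math


def resize_linear_1d(values, n_out, align_corners):
    """Linearly resample a 1-D signal to n_out samples.

    align_corners=True maps the first/last output samples exactly onto the
    first/last input samples; False uses the half-pixel-center convention.
    """
    n_in = len(values)
    out = []
    for i in range(n_out):
        if align_corners:
            src = i * (n_in - 1) / (n_out - 1) if n_out > 1 else 0.0
        else:
            src = (i + 0.5) * n_in / n_out - 0.5
        src = min(max(src, 0.0), n_in - 1.0)  # clamp to the valid input range
        lo = int(math.floor(src))
        hi = min(lo + 1, n_in - 1)
        frac = src - lo
        out.append(values[lo] * (1.0 - frac) + values[hi] * frac)
    return out


signal = [0.0, 10.0, 20.0, 30.0]
aligned = resize_linear_1d(signal, 8, align_corners=True)
half_pixel = resize_linear_1d(signal, 8, align_corners=False)

# Same input, same interpolation kernel, different sampling grid.
max_diff = max(abs(a - b) for a, b in zip(aligned, half_pixel))
print(aligned)
print(half_pixel)
print(max_diff)
```

Both variants agree at the endpoints but sample different positions in between, so any model fed by one resizer at training time and another at inference time sees slightly different pixels.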

Now we finally get matching predictions and the same mAP!
GPU:

{'confidence': 0.6841502785682678, 'type': 'rectangle', 'coordinates': {'y': 167.98015236854553, 'x': 289.0607535839081, 'height': 238.21115112304688, 'width': 301.1848449707031}, 'label': 'Croissant'}
{'confidence': 0.5636424422264099, 'type': 'rectangle', 'coordinates': {'y': 168.73490810394287, 'x': 351.99418142437935, 'height': 183.61944580078125, 'width': 161.6607208251953}, 'label': 'Croissant'}
{'confidence': 0.5182375311851501, 'type': 'rectangle', 'coordinates': {'y': 101.34550631046295, 'x': 123.91261160373688, 'height': 176.6996612548828, 'width': 241.56607055664062}, 'label': 'Croissant'}
{'confidence': 0.4028134346008301, 'type': 'rectangle', 'coordinates': {'y': 210.60317158699036, 'x': 199.7049316763878, 'height': 175.12136840820312, 'width': 237.15415954589844}, 'label': 'Croissant'}
{'confidence': 0.1619817614555359, 'type': 'rectangle', 'coordinates': {'y': 106.30784332752228, 'x': 233.0048382282257, 'height': 219.98683166503906, 'width': 288.35015869140625}, 'label': 'Croissant'}
{'confidence': 0.10118640214204788, 'type': 'rectangle', 'coordinates': {'y': 213.20754289627075, 'x': 361.471713334322, 'height': 101.99461364746094, 'width': 127.2068862915039}, 'label': 'Croissant'}
{'confidence': 0.027622569352388382, 'type': 'rectangle', 'coordinates': {'y': 65.32773971557617, 'x': 268.4330843389034, 'height': 119.54254913330078, 'width': 164.8693084716797}, 'label': 'Croissant'}
{'confidence': 0.010357137769460678, 'type': 'rectangle', 'coordinates': {'y': 84.39711928367615, 'x': 118.78975331783295, 'height': 107.74749755859375, 'width': 172.99261474609375}, 'label': 'Croissant'}
{'confidence': 0.00467892037704587, 'type': 'rectangle', 'coordinates': {'y': 90.58223068714142, 'x': 315.67183434963226, 'height': 175.4208984375, 'width': 256.6126708984375}, 'label': 'Croissant'}
{'average_precision': {'Coffee': 0.0, 'Croissant': 0.23888888955116272, 'Waffle': 0.0, 'Bagel': 0.0, 'Egg': 0.0, 'Banana': 0.0}}

CPU:

{'confidence': 0.6868035197257996, 'type': 'rectangle', 'coordinates': {'y': 167.95871257781982, 'x': 289.0740305185318, 'height': 238.32237243652344, 'width': 301.1378479003906}, 'label': 'Croissant'}
{'confidence': 0.5671409368515015, 'type': 'rectangle', 'coordinates': {'y': 168.71926188468933, 'x': 351.95813924074173, 'height': 183.58172607421875, 'width': 161.74087524414062}, 'label': 'Croissant'}
{'confidence': 0.5187163352966309, 'type': 'rectangle', 'coordinates': {'y': 101.33549273014069, 'x': 123.91094863414764, 'height': 176.7119598388672, 'width': 241.39181518554688}, 'label': 'Croissant'}
{'confidence': 0.4052538275718689, 'type': 'rectangle', 'coordinates': {'y': 210.60553193092346, 'x': 199.70719814300537, 'height': 175.0858612060547, 'width': 237.0017547607422}, 'label': 'Croissant'}
{'confidence': 0.1624731868505478, 'type': 'rectangle', 'coordinates': {'y': 106.30392730236053, 'x': 233.04037749767303, 'height': 220.01197814941406, 'width': 288.4682312011719}, 'label': 'Croissant'}
{'confidence': 0.10168814659118652, 'type': 'rectangle', 'coordinates': {'y': 213.19506168365479, 'x': 361.4436239004135, 'height': 101.99732971191406, 'width': 127.12554931640625}, 'label': 'Croissant'}
{'confidence': 0.027944037690758705, 'type': 'rectangle', 'coordinates': {'y': 65.33297896385193, 'x': 268.4473067522049, 'height': 119.56062316894531, 'width': 164.90599060058594}, 'label': 'Croissant'}
{'confidence': 0.010488401167094707, 'type': 'rectangle', 'coordinates': {'y': 84.40530896186829, 'x': 118.79546642303467, 'height': 107.73643493652344, 'width': 173.0646209716797}, 'label': 'Croissant'}
{'confidence': 0.004655329044908285, 'type': 'rectangle', 'coordinates': {'y': 90.5926376581192, 'x': 315.64265191555023, 'height': 175.38531494140625, 'width': 256.51568603515625}, 'label': 'Croissant'}
{'average_precision': {'Coffee': 0.0, 'Croissant': 0.23888888955116272, 'Waffle': 0.0, 'Bagel': 0.0, 'Egg': 0.0, 'Banana': 0.0}}

guihao-liang · 2020-01-24T21:01:04Z

OMG! I'm impressed by your effort to lock this down!

One fundamental question: if the input resizing produces different results, the output should also differ between 5.8 and 6.0. Did we do any functional comparison between 5.8 and 6.0?

shantanuchhabra · 2020-01-24T21:06:46Z

src/python/turicreate/toolkits/object_detector/_tf_image_augmenter.py

@@ -8,6 +8,7 @@
 from __future__ import absolute_import as _

 import numpy as np
+import cv2


I may be wrong, but I'm not sure we have taken a dependency on OpenCV. Is there a compelling reason for us to take a dependency on OpenCV here?

jakesabathia2 · Jan 24, 2020

Yes.
TensorFlow's resize function is not aligned with MPS's and MXNet's,
which causes the inference regression shown in the summary.
NumPy itself doesn't have a resize method for images,
which is why I'm using cv2's resize method now;
it is consistent with MPS and MXNet.

hoytak · Jan 24, 2020

We would need to get legal approval to depend on cv2, and it's a huge dependency to pull in if it's needed only for image resizing. We already resize images in many other places (see, e.g., the image_deep_feature_extraction code path).

If there is a way to use PIL or one of our C++ image resizing utilities instead, we should do that.

jakesabathia2 · Jan 24, 2020

@hoytak Thanks for this suggestion!

jakesabathia2 · 2020-01-24T21:33:50Z

> OMG! I'm impressed by your effort to lock this down!
>
> One fundamental question, if the input resizing produces different results, the output should be different between 5.8 and 6.0. Did we do any functional comparison between 5.8 and 6.0?

I guess this should be caught by the benchmark's backward-compatibility check.
But we only look at evaluation results (mAP) over the whole dataset, so we might not catch this regression, since it gets averaged out across the dataset.
So yes, I think it is a great suggestion to compare predictions between 5.8 and 6.0 for each image in our benchmark pipeline @TobyRoseman.
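
A rough sketch of what such a per-image check could look like, assuming predictions come as lists of dicts in the format shown in this thread. The function names are hypothetical, and the sample values are the first three GPU and CPU predictions from the description, rounded for brevity. Boxes are paired by nearest coordinates rather than by rank, because small confidence changes can reorder the list.

```python
def box_distance(p, q):
    """Largest absolute difference over the four box coordinates."""
    a, b = p["coordinates"], q["coordinates"]
    return max(abs(a[k] - b[k]) for k in ("x", "y", "width", "height"))


def pair_predictions(preds_a, preds_b):
    """Greedily pair each prediction in preds_a with the nearest unused
    same-label prediction in preds_b. Returns (pairs, leftovers_from_b)."""
    unused = list(preds_b)
    pairs = []
    for p in preds_a:
        candidates = [q for q in unused if q["label"] == p["label"]]
        if not candidates:
            pairs.append((p, None))
            continue
        best = min(candidates, key=lambda q: box_distance(p, q))
        unused.remove(best)
        pairs.append((p, best))
    return pairs, unused


def pred(conf, x, y, w, h, label="Croissant"):
    return {"confidence": conf, "type": "rectangle", "label": label,
            "coordinates": {"x": x, "y": y, "width": w, "height": h}}


# Rounded from the GPU / CPU outputs in the description (before the fix).
gpu = [pred(0.6842, 289.06, 167.98, 301.18, 238.21),
       pred(0.5636, 351.99, 168.73, 161.66, 183.62),
       pred(0.5182, 123.91, 101.35, 241.57, 176.70)]
cpu = [pred(0.6784, 288.50, 167.72, 301.33, 239.20),
       pred(0.5581, 123.50, 101.20, 240.86, 175.78),
       pred(0.5499, 351.87, 168.43, 160.74, 183.68)]

pairs, leftovers = pair_predictions(gpu, cpu)
max_conf_delta = max(abs(p["confidence"] - q["confidence"])
                     for p, q in pairs if q is not None)
max_box_delta = max(box_distance(p, q) for p, q in pairs if q is not None)
print(max_conf_delta, max_box_delta)
```

A benchmark could flag any image whose confidence or box delta exceeds a chosen threshold; picking that threshold is a judgment call this sketch does not make.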

jakesabathia2 · 2020-01-27T21:07:24Z

mAP comparison:

#gpu
{'average_precision_50': {'Coffee': 0.3550470471382141, 'Croissant': 0.560469388961792, 'Waffle': 0.6576665043830872, 'Bagel': 0.6644784808158875, 'Egg': 0.6086002588272095, 'Banana': 0.9863040447235107}}

#OpenCV
{'average_precision_50': {'Coffee': 0.3550470471382141, 'Croissant': 0.5609853863716125, 'Waffle': 0.6552671790122986, 'Bagel': 0.665098249912262, 'Egg': 0.6083458662033081, 'Banana': 0.9863040447235107}}

#Turi Create built-in resize
{'average_precision_50': {'Coffee': 0.3550470471382141, 'Croissant': 0.5899662971496582, 'Waffle': 0.6519489288330078, 'Bagel': 0.6636723279953003, 'Egg': 0.6076997518539429, 'Banana': 0.9895591139793396}}

#TensorFlow
{'average_precision_50': {'Coffee': 0.3550470471382141, 'Croissant': 0.5936917066574097, 'Waffle': 0.6564058661460876, 'Bagel': 0.6663496494293213, 'Egg': 0.5886093974113464, 'Banana': 0.9898990392684937}}

#PIL
{'average_precision_50': {'Coffee': 0.3774999976158142, 'Croissant': 0.5606743693351746, 'Waffle': 0.6497138142585754, 'Bagel': 0.6666355729103088, 'Egg': 0.5981951355934143, 'Banana': 0.98731929063797}}

@nickjong @srikris It seems that if we are not using OpenCV, the built-in image_resize function in turicreate produces the closest result.
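
One way to put a number on "closest" is the mean absolute per-class difference from the GPU result. The sketch below uses the average_precision_50 values posted above; the choice of metric is an assumption for illustration, since the comment does not say how closeness was judged.

```python
# average_precision_50 values copied from the results in this comment.
GPU_AP = {"Coffee": 0.3550470471382141, "Croissant": 0.560469388961792,
          "Waffle": 0.6576665043830872, "Bagel": 0.6644784808158875,
          "Egg": 0.6086002588272095, "Banana": 0.9863040447235107}

BACKENDS = {
    "OpenCV": {"Coffee": 0.3550470471382141, "Croissant": 0.5609853863716125,
               "Waffle": 0.6552671790122986, "Bagel": 0.665098249912262,
               "Egg": 0.6083458662033081, "Banana": 0.9863040447235107},
    "Turi Create": {"Coffee": 0.3550470471382141, "Croissant": 0.5899662971496582,
                    "Waffle": 0.6519489288330078, "Bagel": 0.6636723279953003,
                    "Egg": 0.6076997518539429, "Banana": 0.9895591139793396},
    "TensorFlow": {"Coffee": 0.3550470471382141, "Croissant": 0.5936917066574097,
                   "Waffle": 0.6564058661460876, "Bagel": 0.6663496494293213,
                   "Egg": 0.5886093974113464, "Banana": 0.9898990392684937},
    "PIL": {"Coffee": 0.3774999976158142, "Croissant": 0.5606743693351746,
            "Waffle": 0.6497138142585754, "Bagel": 0.6666355729103088,
            "Egg": 0.5981951355934143, "Banana": 0.98731929063797},
}


def mean_abs_diff(ap, reference):
    """Mean absolute per-class AP difference against a reference result."""
    return sum(abs(ap[c] - reference[c]) for c in reference) / len(reference)


def max_abs_diff(ap, reference):
    """Worst-case per-class AP difference against a reference result."""
    return max(abs(ap[c] - reference[c]) for c in reference)


# Rank resize backends from closest to farthest from the GPU result.
ranking = sorted(BACKENDS, key=lambda name: mean_abs_diff(BACKENDS[name], GPU_AP))
for name in ranking:
    print("%-12s mean=%.4f max=%.4f" % (
        name,
        mean_abs_diff(BACKENDS[name], GPU_AP),
        max_abs_diff(BACKENDS[name], GPU_AP)))
```

By mean absolute difference the order is OpenCV, Turi Create built-in, PIL, TensorFlow. The ranking depends on the metric, though: by worst-case per-class difference, PIL comes out ahead of the built-in resize on these numbers.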

nickjong · 2020-01-28T03:47:02Z

src/python/turicreate/toolkits/object_detector/_tf_image_augmenter.py

+        np_img /= 255.
+        return np_img
+
+    def resize_turicretae_image(image, output_shape):


*turicreate

Even then, I don't love the name of this function. But in the long run, if this approach proves stable and accurate, we should move this resizing to the C++ side anyway. There's no point in calling the C++ resizing code from Python from C++ once we converge on the right algorithm.

jakesabathia2 · 2020-01-28T19:29:19Z

Passed on GitLab.

shantanuchhabra · 2020-01-28T21:52:47Z

src/python/turicreate/toolkits/object_detector/_tf_image_augmenter.py

@@ -8,6 +8,8 @@
 from __future__ import absolute_import as _

 import numpy as np
+from PIL import Image
+import PIL


Why do we need to import PIL? Would the line above this one suffice?

change tf resize augmenter.

5ecbccb

jakesabathia2 requested review from srikris, guihao-liang and nickjong January 24, 2020 20:34

jakesabathia2 changed the title ~~change tf resize augmenter.~~ [OD] 6.0 inference regression between CPU and GPU. Jan 24, 2020

guihao-liang added this to the 6.1 milestone Jan 24, 2020

guihao-liang added bug object detection p2 labels Jan 24, 2020

shantanuchhabra suggested changes Jan 24, 2020

jakesabathia2 requested a review from shantanuchhabra January 24, 2020 21:34

guihao-liang mentioned this pull request Jan 24, 2020

[WIP] Pre-commit hooks for cpp files #2958

Closed

PIL image.

e56d6e8

jakesabathia2 added 2 commits January 27, 2020 13:14

remove file change

abf3d03

add turicreate resize function.

461fb11

nickjong approved these changes Jan 28, 2020

change name

3ef0326

shantanuchhabra approved these changes Jan 28, 2020

remove import PIL

f9fa072

jakesabathia2 merged commit a3b38c6 into apple:master Jan 28, 2020

jakesabathia2 deleted the od_eval_regression branch January 28, 2020 22:06
