This repository has been archived by the owner on Nov 28, 2022. It is now read-only.

Divergence in classification between digits and GRE #27

Closed
rperdon opened this issue Dec 20, 2017 · 26 comments

Comments

rperdon commented Dec 20, 2017

When running a DIGITS-exported model in the GRE, I am seeing some divergence in the classification values compared to DIGITS. In some cases, a binary classifier flips completely to the opposite class. When using the DIGITS REST API (not the GRE), I get classifications identical to DIGITS, so I am wondering where the divergence lies.

I altered the input definition at the top of the deploy.prototxt of my exported model from
input: "data"
input_shape {
dim: 1
dim: 3
dim: 227
dim: 227
}

To

name: "AlexNet"
layer {
name: "data"
type: "Input"
top: "data"
input_param { shape: { dim: 1 dim: 3 dim: 227 dim: 227 } }
}

I mimicked the deploy.prototxt in the GRE model folder and altered the shape to match my model.

I saw in a previous issue that there was a problem with varying confidence and am wondering if it is related. #3

flx42 (Member) commented Dec 20, 2017

Are you using the caffe version of GRE?

rperdon (Author) commented Dec 20, 2017

I am using the caffe version of GRE.

flx42 (Member) commented Dec 21, 2017

The issue you mentioned was about varying confidence results for the same image. Is that the case here?

rperdon (Author) commented Dec 21, 2017

It was actually the second post of that thread, about a difference in confidence results, that piqued my interest.

flx42 (Member) commented Dec 21, 2017

Ah yes. It might be the preprocessing that's the culprit. Unfortunately the current code is not very generic in this regard.

rperdon (Author) commented Dec 21, 2017

I'm hoping we can identify the cause of the divergence as the GRE would work perfectly for what I need it to do.

flx42 (Member) commented Dec 21, 2017

Unfortunately, it's challenging to debug since I don't have access to your model. Do you know what DIGITS is using in the pre-processing steps? Crops? Resizes? Augmentation?

rperdon (Author) commented Dec 21, 2017

For the model: subtract mean, no crop option. For the database, the defaults were squash to 256x256 with color image encoding PNG. In the deploy.prototxt, the dimensions are as shown above for the exported model. Let me know if you require more details.

I just finished running a comparison of the false positives of the GRE vs. the DIGITS REST API for the same model.
Sample set of random non-trained images: 6323

DIGITS REST false positives: 101
GRE false positives: 366

If I can get the GRE to match the DIGITS REST API output, it would greatly improve our classification times on our material. Due to limitations of our current image carving tool and the DIGITS REST API, large volumes of calls overwhelm the REST API; the GRE solves this problem for us.
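
Roughly, the preprocessing I have in mind looks something like the sketch below. This is only my own illustration, not code from DIGITS or the GRE, assuming a plain squash resize straight to the 227x227 network input followed by mean subtraction; the file path and function name are hypothetical.

```python
# Hypothetical sketch of the preprocessing being compared; not taken from
# DIGITS or gpu-rest-engine. Assumes BGR uint8 input, a 227x227 AlexNet
# input blob, and mean subtraction as configured in the model above.
import cv2
import numpy as np

def preprocess(image_path, mean, size=(227, 227)):
    img = cv2.imread(image_path)                 # BGR, uint8, H x W x C
    # "Squash": resize without preserving aspect ratio (no crop).
    img = cv2.resize(img, size, interpolation=cv2.INTER_LINEAR)
    img = img.astype(np.float32)
    img -= mean                                  # mean image (227x227x3) or mean pixel (1x1x3)
    return img.transpose(2, 0, 1)                # HWC -> CHW for the Caffe "data" blob
```

The interpolation flag and the mean format (mean image vs. mean pixel) are the knobs I would expect to differ between the two stacks.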

rperdon (Author) commented Dec 28, 2017

Any thoughts on this? I'm hoping Griffin and I can work out a solution with you.

flx42 (Member) commented Dec 28, 2017

If you pre-resize the images to the right size (227x227?) before sending them to GRE, does it change anything?

rperdon (Author) commented Dec 28, 2017

Just toss them into Photoshop and do a bilinear resize to 227x227 for all the images prior to the GRE? In your classification code, do you use OpenCV for that part? I wonder if I can do it there before the image is classified.

flx42 (Member) commented Dec 28, 2017

Yes, I use OpenCV for that.
For now I'm just trying to find out whether the resize is the culprit, so take one of those pre-resized 227x227 images and send it to both the GRE and DIGITS.

rperdon (Author) commented Dec 28, 2017

I can test it out.

rperdon (Author) commented Dec 28, 2017

Image bilinear-resized to 227x227:

DIGITS classify.py:
0: 21.5443%
1: 78.4557%

GRE:
0: 0.5819
1: 0.4181

DIGITS, unaltered image:
0: 21.8740%
1: 78.1260%

GRE, unaltered image:
0: 0.553
1: 0.447

I find it interesting that a pre-resize of the image alters the values in Digits as well.

rperdon (Author) commented Dec 28, 2017

I will try a different approach: since I've worked a lot with the classify.py file from DIGITS, I can try to modify it to see if I can reproduce the same values that the GRE produces.

flx42 (Member) commented Dec 28, 2017

Ok. And it should be pretty straightforward to compare the pre-processing steps from DIGITS and gpu-rest-engine; it's just a few lines of Python/C++ code.
I think the resize is the only piece that might differ (since there are multiple interpolation algorithms). The other steps should produce the same results, but since we still see a divergence, the two stacks are probably not performing the same operations.
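
For example, something like the sketch below (an illustration only, not code from either project; the image name is a placeholder) shows how much the interpolation flag alone can shift pixel values in OpenCV:

```python
# Illustrative only: compares cv2.resize outputs under different
# interpolation flags for the same source image.
import cv2
import numpy as np

img = cv2.imread("sample.png")  # placeholder test image
flags = {
    "nearest": cv2.INTER_NEAREST,
    "bilinear": cv2.INTER_LINEAR,
    "bicubic": cv2.INTER_CUBIC,
    "area": cv2.INTER_AREA,
}
resized = {name: cv2.resize(img, (227, 227), interpolation=flag)
           for name, flag in flags.items()}
ref = resized["bilinear"].astype(np.float32)
for name, out in resized.items():
    diff = np.abs(out.astype(np.float32) - ref).mean()
    print(f"{name:8s} mean abs diff vs bilinear: {diff:.3f}")
```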

rperdon (Author) commented Dec 28, 2017

Griffin mentioned something about mapping values or weights being handled differently with regard to DeepDetect. I'm wondering if something similar is happening here. I'll have to check back with him on what exactly was happening.

laceyg commented Dec 28, 2017

Originally I thought there might be an issue with something like dimension ordering (e.g. NKHW), but that seems unlikely to be the culprit here. I agree with @flx42 that something like resizing may be the problem, e.g. a different interpolation being used.

Let me see if I can reproduce this divergence with a simple example.

flx42 (Member) commented Dec 28, 2017

I think we ruled out the resize now, since @rperdon tried with images that are already at 227x227. So the culprit is probably something else: mean subtraction, cropping, etc.

rperdon (Author) commented Dec 29, 2017

I have experimented further with some other images whose classifications were flipping.

I resized in Photoshop, bilinear to 227x227, saved with baseline compression (the previous test was PNG, no compression).

DIGITS:
0: 95.79%
1: 4.21%

GRE:
0: 0.501
1: 0.499

Unaltered image (1024x1024):
DIGITS:
0: 92.2%
1: 7.8%

GRE:
0: 0.042
1: 0.958

I made a mistake in my DIGITS classification example.py vs. DIGITS REST comparisons: the defaults in example.py are squash and subtract pixel, whereas in DIGITS I was using subtract image. The model that gives the desired results is running from the DIGITS REST API, not example.py, so I will have to modify example.py to match what DIGITS is doing before I can properly compare resizing code changes.

rperdon (Author) commented Dec 29, 2017

NVIDIA/DIGITS#169

I think this may provide key insight into adapting the GRE to work like DIGITS. I'm still wrapping my head around the implications, and hopefully we can sort this out.

Edit: I retrained the model using mean pixel instead of mean image and then reloaded it into the GRE.
The GRE was still consistent in which images it flipped, so mean image vs. mean pixel isn't the missing link.

rperdon (Author) commented Jan 2, 2018

I tested the differences between the resize function of PIL (which the DIGITS classification example uses via SciPy) and OpenCV's resize function:

(min, mean, max):

OpenCV: (15, 160.10562983950786, 254)
SciPy (which wraps PIL's resize): (15, 160.21342674351661, 254)

Noting the discrepancy in the mean value of the resized image, I think this is one part of the divergence problem (see the sketch below for how I compared them). I suspect there are more causes, but it feels like each implementation has its own order of operations for image load, mean subtract (pixel/image), crop, and resize, with no defined standard to follow.
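
The comparison can be reproduced with something like this sketch (my own, with a placeholder image path; PIL's bilinear resize stands in for the deprecated scipy.misc.imresize, which wraps PIL internally). The exact numbers will of course depend on the image.

```python
# Rough reproduction of the min/mean/max comparison above.
# Channel order differs (OpenCV is BGR, PIL is RGB) but that does not
# affect these per-image statistics.
import cv2
import numpy as np
from PIL import Image

path = "sample.png"  # placeholder test image

cv_img = cv2.resize(cv2.imread(path), (227, 227),
                    interpolation=cv2.INTER_LINEAR)
pil_img = np.array(Image.open(path).convert("RGB")
                   .resize((227, 227), Image.BILINEAR))

for name, arr in (("OpenCV", cv_img), ("PIL/SciPy", pil_img)):
    print(name, arr.min(), arr.mean(), arr.max())
```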

I thought this was an interesting read about the hidden operations performed by various matrix resize implementations, in this case OpenCV vs. MATLAB:

https://stackoverflow.com/questions/21997094/why-opencv-cv2-resize-gives-different-answer-than-matlab-imresize

flx42 (Member) commented Jan 2, 2018

I would expect a difference in resize to explain a divergence on the order of 10^-2 or 10^-3, not more.

rperdon (Author) commented Jan 3, 2018

I did some further testing with SciPy, OpenCV, and skimage and found that each produced different results, leading me to believe something like anti-aliasing could be part of the problem.

The DIGITS classification example uses SciPy, whose imresize function is now deprecated:

https://docs.scipy.org/doc/scipy/reference/generated/scipy.misc.imresize.html

I want to read more into OpenCV's and skimage's resize implementations to see if I can get them to generate the same results.

rperdon (Author) commented Jan 4, 2018

Some more reading on the differences.

Short version: SciPy's imresize converts to uint8 and is now deprecated.
Accuracy of 90% with the SciPy resize drops to 60% with the skimage resize and to 53% with the OpenCV resize.
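
A quick sketch of the differing return conventions (illustrative only, placeholder image path; exact behaviour depends on library versions): OpenCV keeps uint8 in 0..255, skimage returns float64 rescaled to 0..1 with optional anti-aliasing, and the deprecated scipy.misc.imresize converted its output back to uint8.

```python
# Illustrates the differing return conventions; not from either codebase.
import cv2
from skimage.transform import resize as sk_resize

img = cv2.imread("sample.png")          # placeholder test image, uint8, 0..255

cv_out = cv2.resize(img, (227, 227), interpolation=cv2.INTER_LINEAR)
print(cv_out.dtype, cv_out.min(), cv_out.max())   # uint8, values stay in 0..255

sk_out = sk_resize(img, (227, 227), anti_aliasing=True)
print(sk_out.dtype, sk_out.min(), sk_out.max())   # float64, values rescaled to 0..1
```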

flx42 (Member) commented Jul 20, 2018

There has been further back and forth over email about this issue, so I will close it.

flx42 closed this as completed on Jul 20, 2018