Divergence in classification between digits and GRE #27
Comments
Are you using the Caffe version of GRE?
I am using the Caffe version of GRE.
The issue you mentioned had varying confidence results for the same image. Is this the case?
It was actually the 2nd post of that thread that piqued my interest regarding a difference in confidence results.
Ah yes. It might be the preprocessing that's the culprit. Unfortunately, the current code is not very generic in this regard.
I'm hoping we can identify the cause of the divergence, as GRE would work perfectly for what I need it to do.
Unfortunately, it's challenging to debug since I don't have access to your model. Do you know what DIGITS is using in the pre-processing steps? Crops? Resizes? Augmentation?
For the Model portion: subtract mean, no crop option. On the database, the defaults were squash, image size 256x256, color image, PNG encoding. In deploy.prototxt, the dimensions are as above for the exported model. Let me know if you require more details.
I just finished running a comparison of the false positives of GRE vs the DIGITS REST API for the same model:
DIGITS REST false positives: 101
GRE false positives: 366
If I can get GRE to match the output of the DIGITS REST API, it would greatly improve our classification times on our material. Due to limitations of our current image carving tool, large volumes of calls overwhelm the DIGITS REST API; GRE solves this problem for us.
Any thoughts on this? I'm hoping Griffin and I can work out a solution with you.
If you pre-resize the images to the right size (227x227?) before sending them to GRE, does it change anything?
Just toss them into Photoshop and do a bilinear resize to 227x227 for all the images prior to GRE? In the classification code you have, do you use OpenCV for that part? I wonder if I can do the resize there before it's classified.
Yes, I use OpenCV for that.
I can test it out.
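For reference, bilinear interpolation (what OpenCV's INTER_LINEAR flag and Photoshop's bilinear mode both implement) can be sketched in pure Python. Implementations differ in coordinate conventions, edge handling, and rounding, which is exactly where small divergences creep in. This is an illustrative sketch using the half-pixel center convention, not GRE's actual code:

```python
def bilinear_resize(img, out_h, out_w):
    """Resize a 2D grayscale image (list of lists of floats) with
    bilinear interpolation, using the half-pixel center convention
    that OpenCV's INTER_LINEAR follows."""
    in_h, in_w = len(img), len(img[0])
    out = [[0.0] * out_w for _ in range(out_h)]
    for y in range(out_h):
        # Map the output pixel center back into input coordinates.
        src_y = (y + 0.5) * in_h / out_h - 0.5
        y0 = max(0, min(in_h - 1, int(src_y)))
        y1 = min(in_h - 1, y0 + 1)
        fy = max(0.0, min(1.0, src_y - y0))
        for x in range(out_w):
            src_x = (x + 0.5) * in_w / out_w - 0.5
            x0 = max(0, min(in_w - 1, int(src_x)))
            x1 = min(in_w - 1, x0 + 1)
            fx = max(0.0, min(1.0, src_x - x0))
            # Interpolate horizontally on both rows, then vertically.
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bot = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            out[y][x] = top * (1 - fy) + bot * fy
    return out
```

A resize to the same dimensions is the identity, and different libraries diverge mainly in how they compute `src_y`/`src_x` and clamp at the borders.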
Image bilinear resize to 227x227. DIGITS classify.py and GRE results: [screenshots attached]
I will try a different approach. Since I've worked a lot with the classify.py file from DIGITS, I can try to modify it to see if I can reproduce the same value that GRE produces.
Ok. And it should be pretty straightforward to compare the pre-processing steps from DIGITS and gpu-rest-engine. It's just a few lines of Python/C++ code.
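One simple way to compare the two pipelines numerically (function names here are illustrative, not from either codebase): dump the preprocessed tensor from each side, flatten both, and diff them. The min/mean/max summary is the same quick check discussed later for the PIL vs OpenCV resizes.

```python
def max_abs_diff(a, b):
    """Max absolute per-pixel difference between two flattened
    preprocessed images. If the two pipelines really matched, this
    would be ~0; a large value points at the diverging step
    (resize, mean subtraction, crop, ...)."""
    assert len(a) == len(b), "preprocessed tensors must have equal size"
    return max(abs(x - y) for x, y in zip(a, b))

def stats(pixels):
    """min/mean/max summary of a flattened image, a cheap fingerprint
    for spotting preprocessing differences at a glance."""
    return min(pixels), sum(pixels) / len(pixels), max(pixels)
```

Feed each function the flattened output of the DIGITS-side and GRE-side preprocessing on the same input image.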
Griffin has mentioned something about mapping values or weights being handled differently with regards to DeepDetect. I'm wondering if something similar is happening here. I'll have to check back with him on what was happening.
Originally I thought there might be an issue with something like dimension ordering (e.g. NCHW vs NHWC), but it seems unlikely that this is the culprit here. I agree with @flx42 that something like resizing may be the problem, e.g. a different interpolation being used. Let me see if I can reproduce this divergence with a simple example.
I think we ruled out the resize now, since @rperdon tried with images that are already at 227x227. So the culprit is probably something else: mean subtraction, cropping, etc.
I have played further with some other images which were flipping the classifications. I resized in Photoshop, baseline compression (the previous run was PNG, no compression), bilinear to 227x227. [DIGITS vs GRE results, plus GRE on the unaltered 1024x1024 images, shown in the attached screenshots.]
I made a mistake in my DIGITS classification example.py vs DIGITS REST comparisons: the defaults for example.py are squash and subtract pixel, while in DIGITS I was working with subtract image. The model with the desired results is running from DIGITS REST, not example.py, so I will have to modify example.py to match what DIGITS is doing before I can properly compare resizing code changes.
I think this may provide key insight into adapting GRE to work like DIGITS. I'm just wrapping my head around the implications and hopefully we can sort this out. Edit: I retrained the model using mean pixel instead of mean image, then reloaded the model into GRE.
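The mean image vs mean pixel distinction matters here: a mean image subtracts a per-pixel average of the training set, while a mean pixel collapses that to a single scalar (per channel, in the color case) and subtracts it everywhere. The two are not equivalent, so a model trained with one mode and served with the other will diverge. A minimal grayscale sketch with illustrative names:

```python
def subtract_mean_image(img, mean_img):
    """Per-pixel mean subtraction: mean_img has the same shape as img."""
    return [[p - m for p, m in zip(row, mrow)]
            for row, mrow in zip(img, mean_img)]

def subtract_mean_pixel(img, mean_img):
    """Mean-pixel subtraction: collapse the mean image to one scalar
    and subtract it everywhere. Only equal to the per-pixel version
    when the mean image happens to be flat."""
    flat = [m for row in mean_img for m in row]
    mu = sum(flat) / len(flat)
    return [[p - mu for p in row] for row in img]
```

Running both on the same input makes the per-pixel deltas between the two modes obvious.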
I tested the differences between the resize function of PIL, which the DIGITS classification example uses, and OpenCV's resize function, comparing the min, mean, and max pixel values of each output. Noting a discrepancy in the mean value of the image, I think this is one part of the divergence problem. I suspect there are more causes, but it feels like each pipeline uses its own order of operations (image load, mean subtract (pixel/image), crop, resize) with no defined standard to follow. I thought this was an interesting read about the "unknown" operations done by various resize implementations, in this example OpenCV vs MATLAB.
I would expect that a difference in resize could explain a divergence on the order of 10^-2 or 10^-3, not more.
I did some further testing between scipy, OpenCV, and skimage and found each produced different results, leading me to believe something like anti-aliasing could be a problem. The DIGITS classification example uses scipy, whose imresize function is now deprecated: https://docs.scipy.org/doc/scipy/reference/generated/scipy.misc.imresize.html I want to read more into OpenCV's and skimage's versions to see if I can get them to generate the same results.
Some more reading on the differences. Short version: scipy does a conversion to uint8, and is now deprecated.
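That uint8 round-trip alone introduces per-pixel error even before interpolation differences enter the picture. A minimal illustration (no scipy required, since imresize is deprecated; this approximates its internal conversion rather than reproducing it exactly):

```python
def to_uint8(x):
    """Clamp and round a float pixel into the 0-255 integer range,
    roughly the quantization scipy.misc.imresize applied internally."""
    return max(0, min(255, int(round(x))))

# A float pixel produced by interpolation survives intact in a
# float pipeline (e.g. OpenCV operating on float input)...
interpolated = 127.6
# ...but is quantized once it passes through a uint8 conversion,
# and that per-pixel error accumulates over mean subtraction etc.
quantized = to_uint8(interpolated)
error = abs(quantized - interpolated)
```

Values outside the displayable range are also silently clamped, which is another place two pipelines can disagree.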
There has been other back and forth over email about this issue, so I will close this.
When running a DIGITS-exported model in GRE, I am getting some divergence in the classification values compared to DIGITS. In some cases it flips the classification completely to the opposite class with a binary classifier. When using the DIGITS REST API (no GRE), I get classifications identical to DIGITS, so I am wondering where the divergence lies.
I altered the first lines in the deploy.prototxt of my exported model from
input: "data"
input_shape {
  dim: 1
  dim: 3
  dim: 227
  dim: 227
}
To
name: "AlexNet"
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 1 dim: 3 dim: 227 dim: 227 } }
}
I mimicked this based off the deploy.prototxt in the GRE model folder and altered the shape to match my model.
I saw in a previous issue that there was a problem with varying confidence and am wondering if it is related. #3