
NVDLA result of AlexNet is different from Caffe #45

Closed
MINZHIJI opened this issue May 17, 2018 · 11 comments
@MINZHIJI

MINZHIJI commented May 17, 2018

[Solved] The problem was that the NVDLA runtime's image preprocessing must match the preprocessing used during training.

This includes things like raw scaling, mean subtraction, and the order of the RGB channels, etc.

Environment:

  • OS: Ubuntu 14.04
  • NVDLA Version: NV_FULL
  • Model: Alexnet (Link)
  • Weight: BAIR/BVLC AlexNet Model (Link)
  • Image: Quail (Link) and some ImageNet images

Question:
I ran AlexNet on both NVDLA and Caffe, but the results differ.

Result: (Link)

(result screenshots omitted)

@JunningWu

Mine doesn't match either.

@chagyun0213

Hi,
I had the same issue before, but I found it was due to a misunderstanding of the data preprocessing. Read the train_val.prototxt file to figure out which preprocessing the data layer applies there; then you should do the same preprocessing during inference, via deploy.prototxt or the runtime engine. Usually it is RGB mean subtraction, with or without scaling by 1/255.0, etc. You also have to take care of the RGB vs. BGR order of your training and inference data. With these, I got almost the same result as Caffe, but with a small accuracy reduction. I believe the reduction might be related to FP16 or weight compression, or something I cannot understand yet.
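The knobs described here (mean subtraction, optional 1/255 scaling, RGB vs. BGR order) can be collected into one preprocessing helper. This is a minimal numpy sketch for illustration, not NVDLA runtime code; the function name and defaults are my own:

```python
import numpy as np

def preprocess(img_rgb, mean_bgr, scale=1.0, to_bgr=True):
    """Replicate Caffe-style transform_param at inference time.

    img_rgb : HxWx3 uint8 array in RGB order (as most image loaders return).
    mean_bgr: per-channel means in BGR order, as used during training.
    scale   : raw scale factor; 1.0 if the net was trained on 0..255 pixels.
    """
    x = img_rgb.astype(np.float32)
    if to_bgr:
        x = x[:, :, ::-1]                        # RGB -> BGR to match LMDB data
    x -= np.asarray(mean_bgr, dtype=np.float32)  # per-channel mean subtraction
    return x * scale

# Solid-gray 4x4 "image": every pixel is (128, 128, 128).
img = np.full((4, 4, 3), 128, dtype=np.uint8)
out = preprocess(img, mean_bgr=[104.0, 116.0, 122.0])
```

If a network was instead trained on 0..1 inputs, the same helper applies with scale=1/255.0.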

@MINZHIJI
Author

MINZHIJI commented May 28, 2018

@chagyun0213 Thank you for your answer. Could you provide your example code?

@ned-varnica

Hi @chagyun0213

Can you please share an example of network and code (AlexNet or ResNet) that you were able to get to work? For example, how do you feed the input image into the network and where do you apply raw scaling? Really would appreciate any answer on this. Thanks!

@prasshantg
Collaborator

@ned-varnica previously we were normalizing the input image by 255.0 by default, which caused incorrect results due to incorrect pre-processing, as pointed out by @chagyun0213. We have fixed it by providing the runtime argument --normalize to specify the value. Please try 1.0 with this option; we were able to get correct results with it.

@chagyun0213

Here is the key information.
Let's look at the "train_val.prototxt" for BVLC AlexNet:
https://github.com/BVLC/caffe/blob/master/models/bvlc_alexnet/train_val.prototxt

name: "AlexNet"
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mirror: true
    crop_size: 227
    mean_file: "data/ilsvrc12/imagenet_mean.binaryproto"
  }
  data_param {
    source: "examples/imagenet/ilsvrc12_train_lmdb"
    batch_size: 256
    backend: LMDB
  }
}
-- DELETED other layers --

  1. AlexNet was trained on 227x227 color images. Since the training database "ilsvrc12_train_lmdb" is in "BGR" format, you have to confirm that the input data for inference is written to the DLA in the same "BGR" order, especially in the createFF16ImageCopy() block.

  2. Due to the "mean_file" option above, you have to use the same mean values for subtraction.
    The values are (BGR) = (104, 116, 122). The subtraction can also easily be done in the
    createFF16ImageCopy() block.

  3. No scaling was chosen.

In summary, the correct pre-processing for AlexNet as included in BVLC is
input_DLA = 1.0 * (input_pixel - [104, 116, 122])
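As a sanity check, that formula can be written out in numpy. This is only an illustration of the arithmetic, not the actual createFF16ImageCopy() code; the center crop to 227x227 is assumed from the crop_size setting above:

```python
import numpy as np

MEAN_BGR = np.array([104.0, 116.0, 122.0], dtype=np.float32)

def alexnet_input(img_rgb):
    """input_DLA = 1.0 * (input_pixel - [104, 116, 122]), in BGR order."""
    h, w, _ = img_rgb.shape
    top, left = (h - 227) // 2, (w - 227) // 2
    crop = img_rgb[top:top + 227, left:left + 227]  # center-crop to 227x227
    bgr = crop[:, :, ::-1].astype(np.float32)       # RGB -> BGR
    return bgr - MEAN_BGR                           # no extra scaling (scale = 1.0)

# All-black 256x256 input, as a trivial example:
img = np.zeros((256, 256, 3), dtype=np.uint8)
x = alexnet_input(img)
```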

I hope it would be sufficient information for your problem.

CJ

@ned-varnica

@chagyun0213
Thanks very much, this is very helpful!

@MINZHIJI
Author

MINZHIJI commented Jun 7, 2018

I used cifar10_quick as the inference model and ran some images, including both test and training images.
Then I ran cifar10_quick with both Caffe and NVDLA and compared the results.
I just want to ask whether the results are correct or sensible.
Result analysis link
Analysis condition:

  • Model: cifar10_quick (BVLC)
  • Image preprocessing:
    • Caffe: images just scaled to [0..255] (raw_scale=255)
    • NVDLA: images just scaled to [0..255] (--normalize 1.0)
  • Comparison mechanism:
    • Loss functions: absolute loss and cross entropy
  • Images:
    • Randomly selected 42 images from the CIFAR-10 train/test sets and converted them to .jpg
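The two loss functions used in the comparison can be sketched as follows; the softmax vectors here are made-up placeholders, not actual run results:

```python
import numpy as np

def abs_loss(p_ref, p_dut):
    """Sum of absolute differences between two probability vectors."""
    return float(np.abs(np.asarray(p_ref) - np.asarray(p_dut)).sum())

def cross_entropy(p_ref, p_dut, eps=1e-12):
    """H(p_ref, p_dut) = -sum(p_ref * log(p_dut)); clipped to avoid log(0)."""
    p_dut = np.clip(np.asarray(p_dut, dtype=np.float64), eps, 1.0)
    return float(-(np.asarray(p_ref) * np.log(p_dut)).sum())

caffe_out = [0.70, 0.20, 0.10]  # hypothetical softmax from Caffe (FP32)
nvdla_out = [0.68, 0.21, 0.11]  # hypothetical softmax from NVDLA (FP16)
print(abs_loss(caffe_out, nvdla_out))       # about 0.04
print(cross_entropy(caffe_out, nvdla_out))  # small when the two agree
```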

@prasshantg
Collaborator

@MINZHIJI can you generate a similar loss report only for top-5 or top-1? The results look sensible; we are reviewing the cases where there is a mismatch in top-1/top-5.

@prasshantg
Collaborator

@MINZHIJI results look good, please reopen issue if you see any problem

@gitosu67

gitosu67 commented Jan 31, 2020

(quoting @chagyun0213's pre-processing explanation above: train_val.prototxt, BGR order, mean (BGR) = (104, 116, 122), no scaling)

How are you getting the mean value?
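For reference, the (104, 116, 122) values are the per-channel averages of the training mean image. A minimal numpy sketch, assuming the mean blob has already been decoded from imagenet_mean.binaryproto (in Caffe this is typically done with caffe.io.blobproto_to_array); the synthetic blob below just stands in for that decoded array:

```python
import numpy as np

# Stand-in for the array decoded from imagenet_mean.binaryproto
# (shape (1, 3, 256, 256), channels in BGR order).
mean_blob = np.empty((1, 3, 256, 256), dtype=np.float32)
mean_blob[0, 0] = 104.0  # B plane
mean_blob[0, 1] = 116.0  # G plane
mean_blob[0, 2] = 122.0  # R plane

# Average over batch, height, and width to get one mean per channel.
per_channel = mean_blob.mean(axis=(0, 2, 3))
print(per_channel)  # [104. 116. 122.]
```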
