ReQuEST accuracy evaluation on ImageNet: ArmCL 18.03 vs. TensorFlow 1.7 #431

psyhtest · 2018-04-16T16:36:32Z

While evaluating MobileNets using ArmCL for the ReQuEST@ASPLOS'18 competition, we observed noticeable discrepancies in predictions of ArmCL 18.03 vs. TensorFlow 1.7. We used the same pretrained MobileNets-v1 weights shared by Google in 2017.

For example, the standard accuracy metrics (Top 1 / Top 5, in percent, on the ImageNet validation set of 50,000 images) for four models (all with a width multiplier of 1.0) are given in the following tables.

Top 1

Model	TensorFlow 1.1 (?) (claimed)	TensorFlow 1.7 (measured)	ArmCL 18.03 (measured)
MobileNet_v1_1.0_224	70.700	70.466	70.464
MobileNet_v1_1.0_192	69.300	68.824	68.830
MobileNet_v1_1.0_160	67.200	66.504	66.458
MobileNet_v1_1.0_128	64.100	63.580	63.586

Top 5

Model	TensorFlow 1.1 (?) (claimed)	TensorFlow 1.7 (measured)	ArmCL 18.03 (measured)
MobileNet_v1_1.0_224	89.500	89.410	89.398
MobileNet_v1_1.0_192	88.900	88.466	88.474
MobileNet_v1_1.0_160	87.500	87.084	87.088
MobileNet_v1_1.0_128	85.300	84.928	84.940
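
To put the measured columns in perspective, here is a minimal sketch that converts the Top 1 percentages above into image counts, assuming the full validation set of 50,000 images. The helper names (`correct_count`, `net_difference`) are made up for illustration and are not part of the artifact.

```python
# Convert accuracy percentages from the Top 1 table into image counts.
# Assumes the full ImageNet validation set of 50,000 images.
NUM_IMAGES = 50000

# Top 1 accuracy in percent: model -> (TensorFlow 1.7, ArmCL 18.03), as measured.
TOP1 = {
    "MobileNet_v1_1.0_224": (70.466, 70.464),
    "MobileNet_v1_1.0_192": (68.824, 68.830),
    "MobileNet_v1_1.0_160": (66.504, 66.458),
    "MobileNet_v1_1.0_128": (63.580, 63.586),
}


def correct_count(accuracy_percent, num_images=NUM_IMAGES):
    """Number of correctly classified images implied by an accuracy in percent."""
    return round(accuracy_percent / 100.0 * num_images)


def net_difference(table):
    """Per model: ArmCL correct count minus TensorFlow correct count."""
    return {
        model: correct_count(armcl) - correct_count(tf)
        for model, (tf, armcl) in table.items()
    }


if __name__ == "__main__":
    for model, diff in net_difference(TOP1).items():
        print(f"{model}: {diff:+d} images")
```

These are net differences only: an image that ArmCL gets wrong and another that only ArmCL gets right cancel out, so the number of images on which the two implementations actually disagree can be larger.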

psyhtest · 2018-04-16T17:30:01Z

We first noticed accuracy degradation (up to 0.5%) when comparing the ArmCL results with the TensorFlow accuracy figures claimed by Google researchers. However, when we measured the TensorFlow accuracy ourselves using the same methodology as for ArmCL, it became clear that the degradation is probably due to differences in input preprocessing (e.g. resizing to a larger input resolution and then cropping to the model resolution; cf. the ReQuEST artifact from Intel).
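
To illustrate the "resize larger, then crop" style of preprocessing mentioned above, here is a rough sketch of the geometry only. The crop fraction of 0.875 and the function name are assumptions for illustration; this thread does not state which values either implementation actually uses, and the sketch ignores the aspect ratio of the original image.

```python
def resize_then_crop_box(model_resolution, crop_fraction=0.875):
    """Geometry of 'resize to a larger square, then crop the centre'.

    crop_fraction is an ASSUMED value for illustration, not taken from
    either implementation. Returns (resize_side, (left, top, right, bottom)),
    where the box is the central model_resolution x model_resolution region
    of the resized square image.
    """
    # Side of the intermediate (larger) square image.
    resize_side = round(model_resolution / crop_fraction)
    # Margin removed on each side by the central crop.
    offset = (resize_side - model_resolution) // 2
    box = (offset, offset, offset + model_resolution, offset + model_resolution)
    return resize_side, box


if __name__ == "__main__":
    for resolution in (224, 192, 160, 128):
        side, box = resize_then_crop_box(resolution)
        print(f"{resolution}: resize to {side}x{side}, crop {box}")
```

With a crop fraction of 1.0 the box covers the whole image, which corresponds to resizing directly to the model resolution. Two pipelines that differ only in this one parameter feed different pixels to the same weights.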

psyhtest · 2018-04-16T17:37:29Z

Now, a fraction of a percent might not sound like much (e.g. 0.002% means just 1 image out of 50,000, 0.006% means 3 images, etc.), but this is smoothed out over 50,000 images. In fact, for the first 500 images, our ArmCL-based implementation disagrees with TensorFlow (and the correct label) on 3 images:

ILSVRC2012_val_00000060.JPEG

$ ck run program:image-classification-tf-py \
--env.CK_IMAGE_FILE=/home/anton/CK_TOOLS/dataset-imagenet-ilsvrc2012-val-min/ILSVRC2012_val_00000060.JPEG

---------------------------------------
ILSVRC2012_val_00000060.JPEG - (588) n03482405 hamper
0.47 - (588) n03482405 hamper
0.43 - (492) n03014705 chest
0.02 - (626) n03666591 lighter, light, igniter, ignitor
0.01 - (526) n03179701 desk
0.01 - (681) n03832673 notebook, notebook computer
---------------------------------------

$ ck run program:mobilenets-armcl-opencl \
--env.CK_IMAGE_FILE=/home/anton/CK_TOOLS/dataset-imagenet-ilsvrc2012-val-min/ILSVRC2012_val_00000060.JPEG

---------------------------------------
ILSVRC2012_val_00000060.JPEG - (588) n03482405 hamper
0.45 - (492) n03014705 chest
0.45 - (588) n03482405 hamper
0.02 - (626) n03666591 lighter, light, igniter, ignitor
0.01 - (526) n03179701 desk
0.01 - (681) n03832673 notebook, notebook computer
---------------------------------------

ILSVRC2012_val_00000302.JPEG

$ ck run program:image-classification-tf-py \
--env.CK_IMAGE_FILE=/home/anton/CK_TOOLS/dataset-imagenet-ilsvrc2012-val-min/ILSVRC2012_val_00000302.JPEG

---------------------------------------
ILSVRC2012_val_00000302.JPEG - (469) n02939185 caldron, cauldron
0.47 - (469) n02939185 caldron, cauldron
0.47 - (926) n07590611 hot pot, hotpot
0.02 - (925) n07584110 consomme
0.01 - (996) n13052670 hen-of-the-woods, hen of the woods, Poly...
0.01 - (809) n04263257 soup bowl
---------------------------------------

$ ck run program:mobilenets-armcl-opencl \
--env.CK_IMAGE_FILE=/home/anton/CK_TOOLS/dataset-imagenet-ilsvrc2012-val-min/ILSVRC2012_val_00000302.JPEG

---------------------------------------
ILSVRC2012_val_00000302.JPEG - (469) n02939185 caldron, cauldron
0.48 - (926) n07590611 hot pot, hotpot
0.47 - (469) n02939185 caldron, cauldron
0.02 - (925) n07584110 consomme
0.01 - (996) n13052670 hen-of-the-woods, hen of the woods, Poly...
0.01 - (809) n04263257 soup bowl
---------------------------------------

ILSVRC2012_val_00000313.JPEG

$ ck run program:image-classification-tf-py \
--env.CK_IMAGE_FILE=/home/anton/CK_TOOLS/dataset-imagenet-ilsvrc2012-val-min/ILSVRC2012_val_00000313.JPEG

---------------------------------------
ILSVRC2012_val_00000313.JPEG - (979) n09468604 valley, vale
0.17 - (979) n09468604 valley, vale
0.17 - (984) n11879895 rapeseed
0.12 - (978) n09428293 seashore, coast, seacoast, sea-coast
0.09 - (525) n03160309 dam, dike, dyke
0.07 - (975) n09332890 lakeside, lakeshore
---------------------------------------

$ ck run program:mobilenets-armcl-opencl \
--env.CK_IMAGE_FILE=/home/anton/CK_TOOLS/dataset-imagenet-ilsvrc2012-val-min/ILSVRC2012_val_00000313.JPEG

---------------------------------------
ILSVRC2012_val_00000313.JPEG - (979) n09468604 valley, vale
0.18 - (984) n11879895 rapeseed
0.17 - (979) n09468604 valley, vale
0.11 - (978) n09428293 seashore, coast, seacoast, sea-coast
0.09 - (525) n03160309 dam, dike, dyke
0.07 - (483) n02980441 castle
---------------------------------------

It's peculiar that for ILSVRC2012_val_00000060.JPEG TensorFlow separates "hamper" and "chest" considerably (47% vs. 43% probability), while ArmCL gives them roughly the same probability (45%). On the other hand, for the other two images, where TensorFlow gives roughly the same probability to the top two classes, ArmCL separates them more (but pushes the incorrect answer to the top).
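
The near-ties can be made explicit with a small sketch that takes the top two predictions exactly as printed above and reports the Top 1 class and its margin over the runner-up. The probabilities are the two-decimal values from the logs, so the margins are approximate; the helper names are made up for illustration.

```python
# Top two predictions as printed above: (probability, class id), in printed order.
TF = {
    "ILSVRC2012_val_00000060.JPEG": [(0.47, 588), (0.43, 492)],
    "ILSVRC2012_val_00000302.JPEG": [(0.47, 469), (0.47, 926)],
    "ILSVRC2012_val_00000313.JPEG": [(0.17, 979), (0.17, 984)],
}
ARMCL = {
    "ILSVRC2012_val_00000060.JPEG": [(0.45, 492), (0.45, 588)],
    "ILSVRC2012_val_00000302.JPEG": [(0.48, 926), (0.47, 469)],
    "ILSVRC2012_val_00000313.JPEG": [(0.18, 984), (0.17, 979)],
}


def top1_and_margin(predictions):
    """Return (top class id, margin over the runner-up).

    The printed order is trusted for ties, because the logs round
    probabilities to two decimals.
    """
    (p1, c1), (p2, _) = predictions[0], predictions[1]
    return c1, round(p1 - p2, 2)


def disagreements(reference, candidate):
    """File names for which the two implementations pick different Top 1 classes."""
    return [
        name for name in reference
        if reference[name][0][1] != candidate[name][0][1]
    ]


if __name__ == "__main__":
    for name in disagreements(TF, ARMCL):
        print(name, "TF:", top1_and_margin(TF[name]),
              "ArmCL:", top1_and_margin(ARMCL[name]))
```

In all three cases the margin is at most a few percentage points in both implementations, so even a small numerical difference is enough to swap the top two classes.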

AnthonyBarbier · 2018-04-16T17:43:09Z

Hi Anton,
I only went quickly through the "MobileNets using ArmCL" repo as I was triaging this bug and didn't find the answer: how is ACL integrated with TensorFlow to run the graphs? Is it using the graph API? Hand-written networks? Through Android NN?

psyhtest · 2018-04-16T17:55:14Z

Hi @AnthonyARM, the ArmCL-based implementation is built on top of your graph API example. The TensorFlow implementation is a separate Python program (which uses exactly the same pretrained weights though).

I'll shortly provide full instructions on how to reproduce this issue using Collective Knowledge.

gmiodice · 2018-04-17T09:01:14Z

Hi @psyhtest

Many thanks for reporting this. Do you know if you have the same discrepancy for both the NEON and CL backends?

Thanks

psyhtest · 2018-04-17T09:31:34Z

Please find ArmCL instructions below. (TensorFlow instructions to follow shortly.)

Please let me know if anything is unclear (or ask a friendly person from @sztaylor's team:)).

Installing artifact dependencies

$ sudo apt install python python-pip
$ sudo apt install libblas-dev liblapack-dev libatlas-base-dev
$ sudo apt install python-numpy python-scipy
$ sudo pip install pillow
$ sudo pip install ck

Detecting GCC, Python, OpenCL

$ ck detect soft:compiler.gcc
$ ck detect soft:compiler.python
$ ck detect platform.gpgpu --opencl

Installing the MobileNets artifact

$ ck pull repo --url=https://github.com/dividiti/ck-request-asplos18-mobilenets-armcl-opencl
$ ck install ck-env:package:imagenet-2012-aux
$ ck install ck-env:package:imagenet-2012-val-min-resized
$ ck install ck-math:package:lib-armcl-opencl-18.03 --env.USE_GRAPH=ON --env.USE_NEON=ON
$ ck install ck-request-asplos18-mobilenets-armcl-opencl:package:weights-mobilenet-v1-1.0-224-npy

Running the classification program

To classify ILSVRC2012_val_00000001.JPEG:

$ ck compile program:mobilenets-armcl-opencl
$ ck run program:mobilenets-armcl-opencl

To classify another image (e.g. ILSVRC2012_val_00000060.JPEG):

$ ck virtual env --tags=imagenet,small_dataset
$ ck run program:mobilenets-armcl-opencl \
--env.CK_IMAGE_FILE=$CK_ENV_DATASET_IMAGENET_VAL/ILSVRC2012_val_00000060.JPEG
...
  (run ...)
executing code ...
Kernel path: /home/anton/CK_TOOLS/lib-armcl-opencl-18.03-gcc-6.3.0-linux-64/src/src/core/CL/cl_kernels/
Image list file: ../images-224-1-1.txt
Image count in file: 1
Batch list file: ../batches-224-1-1.txt
Batch count in file: 1

Prepare graph...

Run graph...

Batch 1 of 1
File: ../batches-224-1-1/ILSVRC2012_val_00000060.JPEG.npy
Loaded in 0.0215138 s
Classified in 0.0713635s
Test passed
-------------------------------
Graph loaded in 1.36456 s
All batches loaded in 0.0215138 s
All batches classified in 0.0713635 s
Average classification time: 0.0713635 s
-------------------------------

  (post processing: "python /home/anton/CK_REPOS/ck-request-asplos18-mobilenets-armcl-opencl/program/mobilenets-armcl-opencl/postprocess.py"

--------------------------------
Process results in predictions
---------------------------------------
ILSVRC2012_val_00000060.JPEG - (588) n03482405 hamper
0.45 - (492) n03014705 chest
0.45 - (588) n03482405 hamper
0.02 - (626) n03666591 lighter, light, igniter, ignitor
0.01 - (526) n03179701 desk
0.01 - (681) n03832673 notebook, notebook computer
---------------------------------------
Accuracy top 1: 0.000000 (0 of 1)
Accuracy top 5: 1.000000 (1 of 1)
--------------------------------

  (reading fine grain timers from tmp-ck-timer.json ...)

{
  "accuracy_top1": 0.0,
  "accuracy_top5": 1.0,
  "execution_time": 0.071363,
  "execution_time_sum": 1.45744,
  "frame_predictions": [
    {
      "accuracy_top1": "no",
      "accuracy_top5": "yes",
      "class_correct": 588,
      "class_topmost": 492,
      "file_name": "ILSVRC2012_val_00000060.JPEG"
    }
  ],
  "images_load_time_avg_s": 0.021514,
  "images_load_time_s": 0.021514,
  "prediction_time_avg_s": 0.071363,
  "prediction_time_total_s": 0.071363,
  "setup_time_s": 1.364563,
  "test_time_s ": 0.103095
}

Execution time: 0.071 sec.
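
When classifying more than one image, the Top 1 misses can be pulled out of the JSON summary printed above. The following is only a sketch: it embeds a trimmed copy of that summary as a string rather than assuming where CK stores it, and `top1_misses` is a hypothetical helper, not part of the artifact.

```python
import json

# Trimmed copy of the JSON summary printed by the run above.
CK_OUTPUT = """
{
  "accuracy_top1": 0.0,
  "accuracy_top5": 1.0,
  "frame_predictions": [
    {
      "accuracy_top1": "no",
      "accuracy_top5": "yes",
      "class_correct": 588,
      "class_topmost": 492,
      "file_name": "ILSVRC2012_val_00000060.JPEG"
    }
  ]
}
"""


def top1_misses(results):
    """List (file name, correct class, predicted class) for every Top 1 miss."""
    return [
        (frame["file_name"], frame["class_correct"], frame["class_topmost"])
        for frame in results["frame_predictions"]
        if frame["accuracy_top1"] != "yes"
    ]


if __name__ == "__main__":
    for file_name, correct, predicted in top1_misses(json.loads(CK_OUTPUT)):
        print(f"{file_name}: expected {correct}, got {predicted}")
```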

psyhtest · 2018-04-17T09:34:41Z

@gmiodice I am afraid I don't know. Our ArmCL program currently supports only OpenCL. It should not be hard to extend it to support Neon.

psyhtest · 2018-04-17T10:05:06Z

Please find TensorFlow instructions below (assuming CK has been installed as per the ArmCL instructions above).

Installing artifact dependencies

$ sudo apt install liblapack-dev libatlas-dev
$ sudo pip install enum34 mock pillow wheel absl-py scipy
$ ck install ck-env:package:tool-bazel-0.11.1-linux

Installing the artifact

$ ck pull repo:ck-tensorflow
$ ck install package:tensorflowmodel-mobilenet-v1-1.0-224-py
$ ck install package:lib-tensorflow-1.7.0-src-cpu [--env.CK_HOST_CPU_NUMBER_OF_PROCESSORS=1]

NB: You may want to restrict the number of build threads to 1 or 2 on a dev board with < 4 GB RAM. For example, add --env.CK_HOST_CPU_NUMBER_OF_PROCESSORS=2 on HiKey 960 (3 GB RAM with swap enabled) or --env.CK_HOST_CPU_NUMBER_OF_PROCESSORS=1 on Tegra TX1 (4 GB RAM without swap enabled).

Running the classification program

To classify e.g. ILSVRC2012_val_00000060.JPEG:

$ ck virtual env --tags=imagenet,small_dataset
$ ck run program:image-classification-tf-py \
--env.CK_IMAGE_FILE=$CK_ENV_DATASET_IMAGENET_VAL/ILSVRC2012_val_00000060.JPEG
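
To compare both programs on the three images discussed earlier, the command lines can be generated rather than typed by hand. This is a sketch that only builds the command strings, using the program names and the `--env.CK_IMAGE_FILE` option shown in this thread; the helper names are made up, and the printed commands are meant to be pasted into the shell opened by `ck virtual env`, where `$CK_ENV_DATASET_IMAGENET_VAL` is expected to be set.

```python
# Programs and image numbers taken from this thread.
PROGRAMS = ("image-classification-tf-py", "mobilenets-armcl-opencl")
IMAGE_IDS = (60, 302, 313)


def image_file_name(image_id):
    """ImageNet validation file name, e.g. 60 -> ILSVRC2012_val_00000060.JPEG."""
    return f"ILSVRC2012_val_{image_id:08d}.JPEG"


def ck_run_command(program, image_id, dataset_dir="$CK_ENV_DATASET_IMAGENET_VAL"):
    """Build (but do not execute) the 'ck run' command for one program and image."""
    return (
        f"ck run program:{program} "
        f"--env.CK_IMAGE_FILE={dataset_dir}/{image_file_name(image_id)}"
    )


if __name__ == "__main__":
    for image_id in IMAGE_IDS:
        for program in PROGRAMS:
            print(ck_run_command(program, image_id))
```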

psyhtest mentioned this issue May 23, 2018

Formalize input preprocessing mlcommons/training#48

Closed

This was referenced Jun 14, 2018

Formalize input preprocessing mlcommons/training_policies#30

Closed

Rename ck-tensorflow:classification-* programs ctuning/ck-tensorflow#68

Closed

morgolock added the Question label Aug 14, 2018

psyhtest mentioned this issue Dec 11, 2018

Examples benchmarking problem in RaspberryPi 3B+ #599

Closed

mpekatsoula closed this as completed Feb 26, 2019
