Opencl - half floating point support and introduce layer fusion for inference #5745
Open
33 commits
4556630 set MKL_USE_SINGLE_DYNAMIC_LIBRARY as disable. (gongzg)
c5ebc9e Add optimized GEMM/GEMV into caffe greentea math library. (gongzg)
3c9c415 Prepare to support layer fusions. (gongzg)
bc013a5 Implement layer fusion in spatial convolution engine. (gongzg)
e270af6 Add LRN fusion with Pooling layer. (gongzg)
a5dd297 Enable image based GEMM interface for inner product layer. (gongzg)
6abe23d Optimize BN layer for inference only. (gongzg)
052c332 softmax layer cpu fwd - no need to max values with themselves
5a1d290 Refine zero copy support. (gongzg)
0e4994a Add new lt option to caffe tool. (gongzg)
89f6315 Use explicit constant value type rather than the default double type. (gongzg)
4eaf108 Fix a bug in inner product layer. (gongzg)
742d803 Reduce the maximum block size for spatial convolution engine. (gongzg)
db3da6e Simplify IDLF kernel's output logic. (gongzg)
d121f01 Add one inference optimized model file for AlexNet. (gongzg)
ba5ac27 Disable viennacl cache mechanism during spatial engine's tuning phase. (gongzg)
dab271d Fix segfault when VIENNACL_CACHE_PATH is not set. (gongzg)
44b345a Add fused activation function. (gongzg)
1af4a95 Enable model fuse script to generate merged-model and adding an examp… (listenlink)
dd8555a 1, Enable gemm_fast_image blocks computing logic; (listenlink)
6ab9944 Always allocate zero-copy capable memory. (gongzg)
9681b26 Fix "nan" value bug for matvec_mul.cl
5cb43d8 Enable FP16 support for OpenCL backend. (gongzg)
ca71f1b Optimize buffer based gemm_nt kernel with both float and fp16 versions. (wujunkai166)
683bd2f Add negative slope support for relu fusion. (gongzg)
b7f36b3 We still need EXAMPLES_SOURCE_DIR for some test cases. (gongzg)
3733407 Fix two OCL kernel compilation warnings. (gongzg)
5072f06 Fix inner product layer for non-intel platform. (gongzg)
3031f4e Fix kernel compilation issue for non-Intel Gen platforms. (gongzg)
1a99aea Adjust test cases for half precision. (gongzg)
12f62b3 Fix lrn fusion for non-Intel Gen platform. (gongzg)
03cd924 Lint fix. (gongzg)
e38c619 Move half.hpp's license to 3rdparty/half. (gongzg)
| @@ -0,0 +1,20 @@ | ||
| +#!/bin/bash | ||
| +export CAFFE_ROOT="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"/../.. | ||
| +export PYTHONPATH=$CAFFE_ROOT/"python" | ||
| +# generate new fused model | ||
| +python $CAFFE_ROOT/tools/inference-optimize/model_fuse.py \ | ||
| + --indefinition $CAFFE_ROOT/models/bvlc_googlenet/deploy.prototxt \ | ||
| + --inmodel $CAFFE_ROOT/models/bvlc_googlenet/bvlc_googlenet.caffemodel \ | ||
| + --outdefinition $CAFFE_ROOT/models/bvlc_googlenet/fused_deploy.prototxt \ | ||
| + --outmodel $CAFFE_ROOT/models/bvlc_googlenet/fused_bvlc_googlenet.caffemodel \ | ||
| + --half_precision_mode=HALF_NONE | ||
| + | ||
| +# Use cpp_classification to test | ||
| +$CAFFE_ROOT/build/examples/cpp_classification/classification.bin \ | ||
| + $CAFFE_ROOT/models/bvlc_googlenet/fused_deploy.prototxt \ | ||
| + $CAFFE_ROOT/models/bvlc_googlenet/fused_bvlc_googlenet.caffemodel \ | ||
| + $CAFFE_ROOT/data/ilsvrc12/imagenet_mean.binaryproto \ | ||
| + $CAFFE_ROOT/data/ilsvrc12/synset_words.txt \ | ||
| + $CAFFE_ROOT/examples/images/cat.jpg | ||
| + | ||
| + |
| @@ -0,0 +1,18 @@ | ||
| +# Using model fuse to run inference-optimized Caffe | ||
| + | ||
| +This example uses a fused-model prototxt and weight file to run layer-fused classification. | ||
| + | ||
| +Take GoogLeNet as an example: | ||
| + | ||
| +1. Download the GoogLeNet model from the "Model Zoo" using the following script: | ||
| +``` | ||
| + $CAFFE_ROOT/scripts/download_model_binary.py models/bvlc_googlenet | ||
| +``` | ||
| +2. Download the required ImageNet label file with: | ||
| +``` | ||
| + $CAFFE_ROOT/data/ilsvrc12/get_ilsvrc_aux.sh | ||
| +``` | ||
| +3. Use model_fuse.py to generate the fused model and cpp_classification to test the classification functionality with the script: | ||
| +``` | ||
| + ./googlenet_inference_test.sh | ||
| +``` |
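The three README steps above depend on files that the first two steps download. As a minimal sketch (not part of this PR), a preflight check along the following lines could report which inputs are still missing before `googlenet_inference_test.sh` is run. The function name `check_inputs` is hypothetical; the relative paths are the ones referenced in the test script above.

```shell
# Illustrative preflight check; check_inputs is a hypothetical helper,
# not something this PR provides.
# Prints one "missing: <path>" line per absent input file and returns
# non-zero if any of them is absent.
check_inputs() {
    root="$1"      # Caffe checkout root, e.g. "$CAFFE_ROOT"
    missing=0
    for rel in \
        models/bvlc_googlenet/deploy.prototxt \
        models/bvlc_googlenet/bvlc_googlenet.caffemodel \
        data/ilsvrc12/imagenet_mean.binaryproto \
        data/ilsvrc12/synset_words.txt \
        examples/images/cat.jpg
    do
        if [ ! -f "$root/$rel" ]; then
            echo "missing: $rel"
            missing=1
        fi
    done
    return "$missing"
}
```

One possible use is `check_inputs "$CAFFE_ROOT" && ./googlenet_inference_test.sh`, which runs the test only when every input is present.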
| @@ -0,0 +1,21 @@ | ||
| +The MIT License | ||
| + | ||
| +Copyright (c) 2012-2017 Christian Rau | ||
| + | ||
| +Permission is hereby granted, free of charge, to any person obtaining a copy | ||
| +of this software and associated documentation files (the "Software"), to deal | ||
| +in the Software without restriction, including without limitation the rights | ||
| +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | ||
| +copies of the Software, and to permit persons to whom the Software is | ||
| +furnished to do so, subject to the following conditions: | ||
| + | ||
| +The above copyright notice and this permission notice shall be included in | ||
| +all copies or substantial portions of the Software. | ||
| + | ||
| +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
| +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
| +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
| +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
| +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
| +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN | ||
| +THE SOFTWARE. |