Compute Library for Deep Neural Networks (clDNN)
Compute Library for Deep Neural Networks (clDNN) is an open source performance library for Deep Learning (DL) applications intended for acceleration of DL Inference on Intel® Processor Graphics, including HD Graphics and Iris® Graphics.
clDNN includes highly optimized building blocks for implementation of convolutional neural networks (CNN) with C and C++ interfaces. We created this project to enable the DL community to innovate on Intel® processors.
Usages supported: Image recognition, image detection, and image segmentation.
Validated Topologies: AlexNet*, VGG(16,19)*, GoogleNet(v1,v2,v3)*, ResNet(50,101,152)*, Faster R-CNN*, Squeezenet*, SSD_googlenet*, SSD_VGG*, PVANET*, PVANET_REID*, age_gender*, FCN* and yolo*.
As with any technical preview, APIs may change in future updates.
clDNN is licensed under Apache License Version 2.0.
clDNN uses 3rd-party components licensed under the following licenses:
- boost under Boost* Software License - Version 1.0
- googletest under Google* License
- OpenCL™ ICD and C++ Wrapper under Khronos™ License
The latest clDNN documentation is at GitHub pages.
There is also inline documentation available that can be generated with Doxygen.
See also the whitepaper: Accelerate Deep Learning Inference with Intel® Processor Graphics.
Intel® OpenVINO™ Toolkit and clDNN
clDNN is also released together with Intel® OpenVINO™ Toolkit, which contains:
- Model Optimizer: a Python*-based command line tool, which imports trained models from popular deep learning frameworks such as Caffe*, TensorFlow*, and Apache MXNet*.
- Inference Engine: an execution engine which uses a common API to deliver inference solutions on the platform of your choice (for example, GPU with the clDNN library).
You can find more information here.
New features:
- support for img_info=4 in proposal_gpu
- support images format in winograd
- support for 2 or more inputs in eltwise
- priority and throttle hints
- deconvolution_grad_input primitive
- fc_grad_input and fc_grad_weights primitives
Bug fixes:
- tensor fixes (i.e. less operator fix)
- cascade concat fixes
- winograd fixes for bfyx format
- auto-tuning fixes for weights calculation
UX:
- memory pool (reusing memory buffers)
- added chosen kernel name in graph dump
- flush memory functionality
Performance:
- graph optimizations
- depth-concatenation with fused relu optimization
- winograd optimizations
- deconvolution optimizations (i.e. bfyx opt)
New features:
- fused winograd
- image support for weights
- yolo_region primitive support
- yolo_reorg primitive support
Bug fixes:
- winograd bias fix
- mean subtract fix
UX:
- update boost to 1.64.0
- extend graph dumps
Performance:
- update offline caches for newer drivers
- conv1x1 byxf optimization
- conv1x1 with images
- cascade depth concatenation fuse optimization
New features:
- split primitive
- upsampling primitive
- add preliminary Coffee Lake support
- uint8 weights support
- versioning
- offline autotuner cache
- Winograd phase 1 (not used yet)
Bug fixes:
- in-place crop optimization bug fix
- output spatial padding in yxfb kernels fix
- local work sizes fix in softmax
- underflow fix in batch normalization
- average pooling corner case fix
UX:
- graph logger, dumps GraphViz format files
- extended documentation with API diagram and graph compilation steps
Performance:
- softmax optimization
- lrn within channel optimization
- priorbox optimization
- constant propagation
New features:
- OOOQ execution model implementation
- depthwise separable convolution implementation
- kernel auto-tuner implementation
Bug fixes:
- dump hidden layer fix
- run single layer fix
- reshape fix
UX:
- enable RTTI
- better error handling/reporting
Performance:
- lrn optimization
- dynamic pruning for sparse fc layers
- reorder optimization
- concatenation optimization
- eltwise optimization
- activation fusing
Added:
- kernel selector
- custom layer
Changed:
- performance improvements
- bug fixes (deconvolution, softmax, reshape)
- apply fixes from community reported issues
Added:
- step by step tutorial
Changed:
- performance optimization for: softmax, fully connected, eltwise, reshape
- bug fixes (conformance)
- initial drop of clDNN
Please report issues and suggestions via GitHub issues.
How to Contribute
We welcome community contributions to clDNN. If you have an idea how to improve the library:
- Share your proposal via GitHub issues
- Ensure you can build the product and run all the examples with your patch
- In the case of a larger feature, create a test
- Submit a pull request
We will review your contribution and, if any additional fixes or modifications are necessary, may provide feedback to guide you. When accepted, your pull request will be merged into our internal and GitHub repositories.
clDNN supports Intel® HD Graphics and Intel® Iris® Graphics and is optimized for:
- Codename Skylake:
  - Intel® HD Graphics 510 (GT1, client market)
  - Intel® HD Graphics 515 (GT2, client market)
  - Intel® HD Graphics 520 (GT2, client market)
  - Intel® HD Graphics 530 (GT2, client market)
  - Intel® Iris® Graphics 540 (GT3e, client market)
  - Intel® Iris® Graphics 550 (GT3e, client market)
  - Intel® Iris® Pro Graphics 580 (GT4e, client market)
  - Intel® HD Graphics P530 (GT2, server market)
  - Intel® Iris® Pro Graphics P555 (GT3e, server market)
  - Intel® Iris® Pro Graphics P580 (GT4e, server market)
- Codename Apollolake:
  - Intel® HD Graphics 500
  - Intel® HD Graphics 505
- Codename Kabylake:
  - Intel® HD Graphics 610 (GT1, client market)
  - Intel® HD Graphics 615 (GT2, client market)
  - Intel® HD Graphics 620 (GT2, client market)
  - Intel® HD Graphics 630 (GT2, client market)
  - Intel® Iris® Graphics 640 (GT3e, client market)
  - Intel® Iris® Graphics 650 (GT3e, client market)
  - Intel® HD Graphics P630 (GT2, server market)
  - Intel® Iris® Pro Graphics 630 (GT2, server market)
clDNN currently uses OpenCL™ with multiple Intel® OpenCL™ extensions and requires Intel® Graphics Driver to run.
clDNN requires a CPU with Intel® SSE/Intel® AVX support.
The software dependencies are:
- CMake* 3.9 or later
(The project is compatible with CMake 3.1. However, because of issues with boost library resolution in CMake 3.4.3, issues with the CheckCXXCompilerFlag module in CMake 3.5.2, and a hard dependency between the supported boost version and the CMake version, we strongly recommend 3.9+.)
NOTE: In the rare situation where updating CMake is not possible, you can try to update / override only the FindBoost.cmake module. To do that, download the FindBoost.cmake file from a newer version of CMake (e.g. from here) and put the file into the common/boost/cmake/modules directory (create it if necessary). This directory will be added to the list of modules if your CMake version is lower than 3.9.
- C++ compiler with partial or full C++14 standard support compatible with:
- GNU* Compiler Collection 5.2 or later
- clang 3.5 or later
- Intel® C++ Compiler 17.0 or later
- Visual C++ 2015 (MSVC++ 19.0) or later
Intel® CPU intrinsics header (<immintrin.h>) must be available during compilation.
- python™ 2.7 or later (scripts are compatible with both python™ 2.7.x and python™ 3.x)
- (optional) Doxygen* 1.8.13 or later
Needed for manual generation of documentation from inline comments or for running the docs custom target, which will generate it automatically.
GraphViz* (2.38 or later) is also recommended to generate documentation with all embedded diagrams.
(Make sure that the dot application is visible in the
The software was validated on:
- CentOS* 7.2 with GNU* Compiler Collection 5.2 (64-bit only), using Intel® Graphics Compute Runtime for OpenCL™.
- Windows® 10 and Windows® Server 2012 R2 with MSVC 14.0, using Intel® Graphics Driver for Windows* [24.20] driver package.
More information on Intel® OpenCL™ drivers can be found here.
Download clDNN source code or clone the repository to your system:
git clone https://github.com/intel/cldnn.git
Satisfy all software dependencies and ensure that the versions are correct before building.
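One way to check tool versions before configuring is sketched below. This is only an illustration: the helper names version_ge and check_tool are hypothetical, the comparison assumes GNU sort with the -V (version sort) option, and the minimum versions are the ones listed in the dependency list above.

```shell
# version_ge A B: succeeds when version A is greater than or equal to B.
# Assumes GNU coreutils `sort -V` (version sort) is available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_tool NAME MINIMUM FOUND: print a one-line verdict for a dependency.
check_tool() {
    if version_ge "$3" "$2"; then
        echo "$1 $3: OK (>= $2)"
    else
        echo "$1 $3: too old (need >= $2)"
    fi
}

# Query the installed tools only when they are present on this machine.
if command -v cmake >/dev/null 2>&1; then
    check_tool CMake 3.9 "$(cmake --version | head -n 1 | awk '{print $3}')"
fi
if command -v gcc >/dev/null 2>&1; then
    check_tool GCC 5.2 "$(gcc -dumpversion)"
fi
```

The same check_tool call can be repeated for the other listed tools once you know how each one reports its version.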
clDNN uses multiple 3rd-party components. They are stored in binary form in the common subdirectory. Currently they are prepared for MSVC++ and GCC*. They will be cloned with the repository.
clDNN uses a CMake-based build system. You can use the CMake command-line tool or CMake GUI (cmake-gui) to generate the required solution.
For Windows system, you can call in
@REM Generate 32-bit solution (solution contains multiple build configurations)...
cmake -E make_directory build && cd build && cmake -G "Visual Studio 14 2015" ..
@REM Generate 64-bit solution (solution contains multiple build configurations)...
cmake -E make_directory build && cd build && cmake -G "Visual Studio 14 2015 Win64" ..
Created solution can be opened in Visual Studio 2015 or built using appropriate
(you can also use
cmake --build . to select build tool automatically).
For Unix and Linux systems:
# Create GNU makefile for release clDNN and build it...
cmake -E make_directory build && cd build && cmake -DCMAKE_BUILD_TYPE=Release .. && make
# Create Ninja makefile for debug clDNN and build it...
cmake -E make_directory build && cd build && cmake -G Ninja -DCMAKE_BUILD_TYPE=Debug .. && ninja -k 20
You can also call the scripts in the main directory of the project, which will create solutions/makefiles for clDNN (they will generate solutions/makefiles in the build subdirectory, and binary outputs will be written to
- create_msvc_mscc.bat (Windows*, Visual Studio* 2015)
- create_unixmake_gcc.sh [Y|N] [<devtoolset-version>] (Linux*, GNU* or Ninja* makefiles, optional devtoolset support)
  - If you specify the first parameter as Y, the Ninja makefiles will be generated.
  - If you specify the second parameter (number), CMake will be called via
CMake solution offers multiple options which you can specify using normal CMake syntax (
| CMake option | Type | Description |
|---|---|---|
| CMAKE_BUILD_TYPE | STRING | Build configuration that will be used by generated makefiles (it does not affect multi-configuration generators like generators for Visual Studio solutions). Currently supported: |
| CMAKE_INSTALL_PREFIX | PATH | Install directory prefix. |
| CLDNN__ARCHITECTURE_TARGET | STRING | Architecture of target system (where binary output will be deployed). CMake will try to detect it automatically (based on selected generator type, host OS and compiler properties). Specify this option only if CMake has a problem with detection. Currently supported: |
| CLDNN__OUTPUT_DIR (CLDNN__OUTPUT_BIN_DIR, CLDNN__OUTPUT_LIB_DIR) | PATH | Location where built artifacts will be written to. It is set automatically to roughly |
| CMake advanced option | Type | Description |
|---|---|---|
| PYTHON_EXECUTABLE | FILEPATH | Path to Python interpreter. CMake will try to detect Python. Specify this option only if CMake has a problem with locating Python. |
| CLDNN__BOOST_VERSION | STRING | Version of boost prebuilt binaries to use (from |
| CLDNN__IOCL_ICD_USE_EXTERNAL | BOOL | Use this option to enable use of external Intel® OpenCL™ SDK as a source for ICD binaries and headers (based on |
| CLDNN__IOCL_ICD_VERSION | STRING | Version of Intel® OpenCL™ ICD binaries and headers to use (from |
| CLDNN__COMPILE_LINK_ALLOW_UNSAFE_SIZE_OPT | BOOL | Allow unsafe optimizations during linking (like aggressive dead code elimination, etc.). Default: |
| CLDNN__COMPILE_LINK_USE_STATIC_RUNTIME | BOOL | Link with static C++ runtime. Default: |
| CLDNN__INCLUDE_CORE | BOOL | Include core clDNN library project in generated makefiles/solutions. Default: |
| CLDNN__INCLUDE_TESTS | BOOL | Include tests application project (based on googletest framework) in generated makefiles/solutions. Default: |
| CLDNN__RUN_TESTS | BOOL | Run tests after building |
| CLDNN__CMAKE_DEBUG | BOOL | Enable extended debug messages in CMake. Default: |
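As an illustration of how the options above combine, the sketch below prints a configure command that sets the build type and turns on the test-related options. It uses the standard CMake -D<name>=<value> cache-entry syntax. The function name cldnn_configure_cmd is hypothetical, and the choice of options is only an example, not a recommended configuration.

```shell
# cldnn_configure_cmd [BUILD_TYPE]: print a CMake configure command that
# enables the unit tests. The function name is hypothetical; the option
# names come from the tables above. Nothing is executed here.
cldnn_configure_cmd() {
    build_type="${1:-Release}"
    printf 'cmake -DCMAKE_BUILD_TYPE=%s -DCLDNN__INCLUDE_TESTS=ON -DCLDNN__RUN_TESTS=ON ..\n' "$build_type"
}

# Print the command for a debug configuration.
cldnn_configure_cmd Debug
```

The printed command is meant to be run from inside the build directory, which is why it ends with `..` (the source tree one level up).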
clDNN includes unit tests implemented using the googletest framework. To validate your build, run the tests target, e.g.:
(Make sure that the test-related options, including CLDNN__RUN_TESTS, were set to ON when invoking CMake.)
Documentation templates and configuration files are stored in the docs subdirectory. You can simply call:
cd docs && doxygen
to generate HTML documentation in
There is also a custom CMake target named docs, which will generate documentation in the CLDNN__OUTPUT_BIN_DIR/html directory. For example, when using Unix makefiles, you can run:
in order to create it.
The install target will place the API header files and libraries in the default install location (C:/Program Files/clDNN or C:/Program Files (x86)/clDNN on Windows). To change the installation path, use the option -DCMAKE_INSTALL_PREFIX=<prefix> when invoking CMake.
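The tests, docs and install targets named in this document can all be driven through the generator-independent cmake --build interface. The sketch below is a minimal illustration: the helper run_target and the DRY_RUN variable are hypothetical, and with DRY_RUN=1 the commands are only printed so that they can be reviewed first.

```shell
# run_target NAME: build one CMake target from the current build directory.
# When DRY_RUN=1 the command is printed instead of executed.
# (run_target and DRY_RUN are illustrative names, not part of clDNN.)
run_target() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "cmake --build . --target $1"
    else
        cmake --build . --target "$1"
    fi
}

# Preview the commands for the targets named in this document.
DRY_RUN=1
for target in tests docs install; do
    run_target "$target"
done
```

Setting DRY_RUN=0 and calling run_target from the build directory would execute the same commands for real.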
* Other names and brands may be claimed as the property of others.
Copyright © 2017, Intel® Corporation