Version: 0.1 alpha2
Project Page: https://github.com/imisra/esvmTestCPP
HOG and Spatial Convolution on SIMD Architecture
Authors: Ishan Misra, Abhinav Shrivastava, Martial Hebert.
The details and aims of this project are described in our tech report. Please cite it if you use this code for any purpose. The code can be downloaded freely from our GitHub project page.
Ishan Misra, Abhinav Shrivastava, Martial Hebert, “HOG and Spatial Convolution on SIMD Architecture”, CMU Tech Report XXXX (2013)
One may also use this code only for computing HOG features or performing spatial convolution, independently of Exemplar SVMs. The code was designed to be modular to allow for such a use case.
If you are using this pipeline for testing, it is assumed that you have basic familiarity with the MATLAB esvm code.
Disclaimer: This is alpha quality software. (Notice the alpha2 in the version number). Alpha is Latin for “doesn’t work and may burn your computer”. The code hasn’t been tested very thoroughly, and we will try to fix any bugs that you report.
The code depends on the following open-source projects
- ISPC (http://ispc.github.com) : For optimized SIMD code generation
- OpenCV (http://opencv.willowgarage.com) : For Image I/O.
- OpenMP and pthreads: For spawning threads (in ISPC as well as omp parallel for constructs).
It requires an x86-64 ISA compatible machine. The code is released for a GNU/Linux compatible operating system. There are a few dependencies on the GNU C Compiler (GCC), mainly due to macros defined in esvm_utils.h. The dependencies on GCC will be resolved in future releases.
- ISPC Setup : The code base includes the ISPC binary for x86-64 (version 1.3.0 as of this writing), so nothing special needs to be done for this setup. If you are using a 32-bit operating system, you will need to compile ISPC from source. As of this writing, pre-built 32-bit binaries are not available from the ISPC GitHub page.
- OpenCV : Any version 2.x should suffice. In reality, any version above 1.2 will be fine, but may need a change in the includes (since OpenCV 2.x has a different way of including header files).
demos/Makefile: The Makefile in the demos directory compiles the demos. One may need to change architecture-specific flags like corei7-avx etc. depending on the exact CPU model. These flags are marked separately for convenience (as ARCH flags). Discarding architecture-specific flags generally reduces performance, but may be useful if you just want to try out the code.
.
├── common: ISPC 64-bit Linux binary and internal files (Task system)
├── demos: contains demo files for HOG, Exemplar testing.
├── internal
├── matlab-files: easy conversion between HOG format in MATLAB/C++ codebases
└── sample-data: sample data for demo files to run
    └── exemplars
        ├── exemplar-mat-files
        └── exemplar-txt-files

8 directories
Input format for Exemplars
Suppose you have $C$ classes, and for each class $i$ ($i=1\ldots C$) you have $N_i$ exemplars. The input format as of this version is:
descFile: This is a file containing the names of the $C$ classes followed by the name of a classDescFile. The format is
classDescFile: This file contains 4 fields per line. The first field is the path to the txt file containing the exemplar data; the second and third fields are the number of rows and columns respectively. The fourth field is the offset ($b$ in the SVM decision function $w^T x + b$), which is exemplar-specific.
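As a rough illustration of the four-field layout described above, one line of a classDescFile could be read as follows. This is only a sketch: the struct and function names are hypothetical (they are not part of the code base), and it assumes the fields are separated by whitespace and that the path contains no spaces.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical record for one classDescFile line (not a library type).
struct ExemplarEntry {
    std::string txtPath;  // path to the exemplar .txt file
    int rows = 0;         // number of rows of the exemplar
    int cols = 0;         // number of columns of the exemplar
    float offset = 0.0f;  // b in the decision function w^T x + b
};

// Parses "path rows cols offset". Assumes whitespace-separated fields.
// Returns false (and leaves *out untouched) if the line is malformed.
bool parseClassDescLine(const std::string& line, ExemplarEntry* out) {
    assert(out != nullptr);
    std::istringstream in(line);
    ExemplarEntry e;
    if (!(in >> e.txtPath >> e.rows >> e.cols >> e.offset)) return false;
    if (e.rows <= 0 || e.cols <= 0) return false;
    *out = e;
    return true;
}
```

For a line such as `exemplars/bus-01.txt 8 10 -1.25` (a made-up example), the entry would hold the path, 8 rows, 10 columns and an offset of -1.25.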
Generating the exemplar data files
The exemplar data is written in ASCII files. These files can be generated by using the writeHogTxt function in C++. It is also expected that the user will have trained exemplars in the form of .mat files from the MATLAB esvm code. In order to convert these .mat files and generate the necessary classDescFiles, there are helper scripts in the matlab-files directory. The script convert_mat_txt.m is provided for reference. The functions used for reading and writing HOG features or exemplars (since exemplars and HOG features are both 3D arrays of the form $m×
The parameters for exemplar testing can be put together in the struct esvmParameters. A user can get default parameters by calling the function esvmDefaultParameters. These correspond to the default parameters of the MATLAB esvm code. The following are the main fields to be concerned with:
levelsPerOctave: Defines the number of times an image is resized between two scalings of 1/2, i.e. the number of pyramid levels per halving of the image size. A larger value means a tighter bounding box (in terms of “where exactly is the object?”). Empirically, useful values lie between 3 and 10. The actual value is application specific.
maxHogLevels: Maximum number of HOG levels computed. The actual value also depends on
minHogDim: Minimum dimension of HOG before any sort of zero-padding.
minImageScale: A number between 0 and 1. Determines the minimum scaling factor for resizing the image.
useMexResize: A boolean parameter. When set to true (the default), image resizing uses a C++ version of the resize function from the original MATLAB esvm code. Setting this to false uses the native OpenCV image resizing, which is faster.
detectionThreshold: A number between 0 and 1. The threshold for exemplar detection. A higher threshold means fewer false positives (but also a lower detection rate).
nmsOverlapThreshold: A number between 0 and 1. The non-maximum suppression (NMS) threshold. Decides when two overlapping detections are considered two different detections.
maxWindowsPerExemplar: Maximum number of detections per exemplar.
maxTotalBoxesPerExemplar: This value is used for pre-allocation of memory. It should be greater than
userTasks: Maximum number of threads to spawn. Setting this to 1 or 2 times the number of physical cores usually gives reasonable performance.
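To make the interaction of levelsPerOctave, maxHogLevels and minImageScale more concrete, the following sketch lists the resize factors of an image pyramid. It is not taken from the code base: the function name is hypothetical, it assumes level k is resized by 2^(-k/levelsPerOctave), and it ignores the additional stop condition imposed by minHogDim.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper (not a library function): returns the resize factor
// of every pyramid level, with levelsPerOctave levels per halving.
std::vector<float> pyramidScales(int levelsPerOctave, int maxHogLevels,
                                 float minImageScale) {
    assert(levelsPerOctave > 0 && maxHogLevels >= 0);
    std::vector<float> scales;
    for (int level = 0; level < maxHogLevels; ++level) {
        // Assumed rule: each level shrinks by a factor 2^(1/levelsPerOctave).
        float s = std::pow(2.0f, -static_cast<float>(level) / levelsPerOctave);
        if (s < minImageScale) break;  // image would become too small
        scales.push_back(s);
    }
    return scales;
}
```

Under these assumptions, a larger levelsPerOctave gives more closely spaced scales (and more levels to compute) before minImageScale or maxHogLevels cuts the pyramid off.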
Bounding box information
The bounding boxes are stored in struct esvmBoxes. It internally stores them in a float array. It is recommended to use the pre-defined macros for accessing/copying the bounding boxes. These are defined in
The demos directory contains an example showing how to use them.
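The general idea of keeping boxes in one flat float array and hiding the index arithmetic behind macros can be sketched as below. Everything here is hypothetical: the macro names, the number of fields per box and their order are assumptions made for illustration, so please use the macros shipped with the code for real work.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical layout for illustration only; the real field order and
// macro names are defined by the library headers and may differ.
#define SKETCH_BOX_DIM 5  // xmin, ymin, xmax, ymax, score
#define SKETCH_BOX_PTR(arr, i)   ((arr) + (i) * SKETCH_BOX_DIM)
#define SKETCH_BOX_XMIN(arr, i)  ((arr)[(i) * SKETCH_BOX_DIM + 0])
#define SKETCH_BOX_YMIN(arr, i)  ((arr)[(i) * SKETCH_BOX_DIM + 1])
#define SKETCH_BOX_XMAX(arr, i)  ((arr)[(i) * SKETCH_BOX_DIM + 2])
#define SKETCH_BOX_YMAX(arr, i)  ((arr)[(i) * SKETCH_BOX_DIM + 3])
#define SKETCH_BOX_SCORE(arr, i) ((arr)[(i) * SKETCH_BOX_DIM + 4])

// Copies box number `src` of `from` into slot `dst` of `to`.
inline void sketchCopyBox(float* to, int dst, const float* from, int src) {
    assert(to != nullptr && from != nullptr && dst >= 0 && src >= 0);
    std::memcpy(SKETCH_BOX_PTR(to, dst), SKETCH_BOX_PTR(from, src),
                SKETCH_BOX_DIM * sizeof(float));
}
```

Going through accessors like these keeps calling code independent of the stride and field order, which is the main reason to prefer the macros over hand-written indices.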
Detection precision depends on which image resize function is used. As far as we can tell, it is best to use the same resize function for training and testing. The default option of useMexResize uses the resize function from the MATLAB implementation of Exemplar-SVM. If speed is an issue, one can switch over to the OpenCV resize function, but the detection results will differ.
Another thing to note is that the HOG implementation uses float precision for computing the features (as opposed to double in the MATLAB HOG implementation of Pedro Felzenszwalb).
Read the Tech-Report for more details on how the performance compares to the MATLAB testing pipeline.
The detection demos aren’t even close to perfect
Yes. It is just a demo. You will need to adjust the thresholds depending on your particular dataset/exemplars.
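To see what the two thresholds do, here is a small sketch of score filtering followed by greedy non-maximum suppression. It is not the library's implementation: the struct and function names are hypothetical, and it assumes overlap is measured as intersection-over-union, which may differ from the measure the code actually uses.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical detection record (not the library's esvmBoxes).
struct Detection { float x1, y1, x2, y2, score; };

// Intersection-over-union of two axis-aligned boxes.
float overlapRatio(const Detection& a, const Detection& b) {
    float w = std::min(a.x2, b.x2) - std::max(a.x1, b.x1);
    float h = std::min(a.y2, b.y2) - std::max(a.y1, b.y1);
    if (w <= 0.0f || h <= 0.0f) return 0.0f;  // boxes do not intersect
    float inter = w * h;
    float areaA = (a.x2 - a.x1) * (a.y2 - a.y1);
    float areaB = (b.x2 - b.x1) * (b.y2 - b.y1);
    return inter / (areaA + areaB - inter);
}

// Drops detections scoring below scoreThreshold, then keeps a detection
// only if it does not overlap a better-scoring kept one by more than
// overlapThreshold.
std::vector<Detection> filterDetections(std::vector<Detection> dets,
                                        float scoreThreshold,
                                        float overlapThreshold) {
    assert(overlapThreshold >= 0.0f && overlapThreshold <= 1.0f);
    dets.erase(std::remove_if(dets.begin(), dets.end(),
                              [scoreThreshold](const Detection& d) {
                                  return d.score < scoreThreshold;
                              }),
               dets.end());
    std::sort(dets.begin(), dets.end(),
              [](const Detection& a, const Detection& b) {
                  return a.score > b.score;  // best score first
              });
    std::vector<Detection> kept;
    for (const Detection& d : dets) {
        bool suppressed = false;
        for (const Detection& k : kept) {
            if (overlapRatio(d, k) > overlapThreshold) {
                suppressed = true;
                break;
            }
        }
        if (!suppressed) kept.push_back(d);
    }
    return kept;
}
```

In this sketch, raising the score threshold removes weak detections outright, while raising the overlap threshold lets more near-duplicate boxes survive. Both usually need tuning per dataset.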
I am getting an Illegal Instruction when running demos on a Virtual Machine
This happens because a lot of VMs do not support sse4-2 instructions. In the Makefile, set the variables ARCHFLAGS to blank (i.e. delete whatever follows the “=” sign, but keep the sign itself). This should generally resolve the issue. This of course means that you are no longer using SIMD code.
You can try setting the ARCHFLAGS to blank and
You keep mentioning C++, but all of your programming is
Correct. I mention C++ because I did use a few libraries. There were a few headaches using classes and our flavor of SIMD optimizations (ISPC).
Can I use this for HOG computation only?
Yes. Check out the examples (demo02) in the demos directory.
Can I use this for convolution computation only?
Yes. Check out the examples (demo00) in the demos directory.
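For orientation, the operation in question slides a 3D template over a 3D feature map and takes a dot product at every position (strictly a cross-correlation, usually called convolution in detection code), then adds the exemplar offset $b$. The sketch below is a plain scalar reference version, not the ISPC-optimized routine from this code base. The function name and the memory layout (rows, then columns, then feature bins) are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference implementation (not the library's API).
// Assumed layout: element (row, col, bin) is at (row * cols + col) * bins + bin.
// Returns a (fRows - tRows + 1) x (fCols - tCols + 1) score map, row-major,
// where each score is w^T x + offset for the window at that position.
std::vector<float> slideTemplate(const std::vector<float>& feat,
                                 int fRows, int fCols,
                                 const std::vector<float>& tmpl,
                                 int tRows, int tCols,
                                 int bins, float offset) {
    assert(bins > 0 && tRows > 0 && tCols > 0);
    assert(fRows >= tRows && fCols >= tCols);
    assert(feat.size() == static_cast<std::size_t>(fRows) * fCols * bins);
    assert(tmpl.size() == static_cast<std::size_t>(tRows) * tCols * bins);

    const int outRows = fRows - tRows + 1;
    const int outCols = fCols - tCols + 1;
    std::vector<float> scores(static_cast<std::size_t>(outRows) * outCols,
                              offset);
    for (int r = 0; r < outRows; ++r)
        for (int c = 0; c < outCols; ++c) {
            float sum = 0.0f;
            for (int i = 0; i < tRows; ++i)
                for (int j = 0; j < tCols; ++j)
                    for (int k = 0; k < bins; ++k)
                        sum += feat[((r + i) * fCols + (c + j)) * bins + k] *
                               tmpl[(i * tCols + j) * bins + k];
            scores[r * outCols + c] += sum;
        }
    return scores;
}
```

A scalar loop like this is mainly useful as a correctness check against an optimized implementation.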
What HOG feature do you compute?
It is based on the paper
- P. F. Felzenszwalb, R. B. Girshick, D. McAllester, and D. Ramanan, “Object detection with discriminatively trained part based models”, PAMI 2010
It is different from the HOG popularized by the “Pedestrian detection” application from Navneet Dalal’s paper (N. Dalal and B. Triggs, “Histograms of oriented gradients for human detection”, CVPR 2005).
This latest reincarnation of the HOG feature is generally considered to be more discriminative than the earlier versions, for object detection tasks.
Is this library thread-safe?
Unfortunately, no. The reason has to do with the ISPC task implementation. A request for changing this has been filed (https://groups.google.com/forum/#!topic/ispc-users/FgQgCVFMWTs), and as soon as this gets fixed, the library should be thread-safe.
- BLOCK convolution: Model convolution as matrix multiplication and use ATLAS for performing the matrix multiplication. Hope to achieve speeds comparable to the MATLAB design.
- Reduce memory reads/writes in NMS
- Better I/O format for Exemplars. This will involve changing the read/write functions in MATLAB and C++. No changes expected in the API. I need feedback from users as to what they would like!
- Fix dependency issues on GCC and Linux. The memalign calls need to be changed.
- Include a 32-bit binary for ISPC?
- parameters->flipImage to be implemented.