minor fixes

commit 1d798b109384f0cfe76a8a5601fdba87c6bbc5b5 1 parent 176d262
@shinpei0208 authored
2  draft/abstract.tex
@@ -1,5 +1,5 @@
\begin{abstract}
-Visione-based object detection using camera sensors is an essential piece
+Vision-based object detection using camera sensors is an essential piece
of perception for autonomous vehicles.
Various combinations of features and models can be applied to increase
the quality and the speed of object detection.
4 draft/assumption.tex
@@ -7,7 +7,7 @@ \section{Assumption}
environment can be downloaded from NVIDIA's website \cite{NVIDIA_NVCC}.
Input images are loaded from pre-captured JPEG files, since we focus on
a high computational cost of image processing.
-Systemized coordinations of computations and I/O devices are outside the
+Systematized coordination of computations and I/O devices is outside the
scope of this paper.
The use of multiple GPUs is also not in consideration.
@@ -37,7 +37,7 @@ \section{Assumption}
A brief concept of this approach is illustrated in
Fig.~\ref{fig:deformable_model}.
Although these models achieve a high detection rate, the computational
-cost of scoring similarity of an imput image and the models using HOG
+cost of scoring similarity of an input image and the models using HOG
features is very expensive.
Specifically they include $2$ root filters and $12$ part filters, each
of which needs to be scored against $32$ resized images.
2  draft/conclusion.tex
@@ -26,7 +26,7 @@ \section{Conclusion}
the impact of GPUs in performance.
Our conclusion is that GPUs are promising to meet the required
performance of vision-based object detection in the real world, while
-performance optimizations remain open problems.
+performance optimization remains an open problem.
In future work, we plan to complement this work with systematized
coordination of computations and I/O devices.
BIN  draft/draft.pdf
Binary file not shown
4 draft/evaluation.tex
@@ -94,7 +94,7 @@ \subsection{Experimental Results}
state-of-the-art GPUs for practical vehicle detection.
Fig. \ref{fig:double_exe_time} shows the execution times of all variants
-of the vehicle detection problem configuired to use double-precision
+of the vehicle detection problem configured to use double-precision
floating-point operations.
Unlike the single-precision scenario, the Kepler GPUs outperform the
Fermi GPUs.
@@ -171,6 +171,6 @@ \subsection{Experimental Results}
can contain up to 32 threads and a set of two warps is executed every
two cycles according to the NVIDIA GPU architecture.
Having less threads per block looses parallelism while introducing more
-threads could cause resource confliction within a block.
+threads could cause resource conflict within a block.
Therefore a more in-depth investigation is required to truly optimize
performance.
4 draft/implementation.tex
@@ -202,7 +202,7 @@ \subsection{GPU Programming}
}
\end{lstlisting}
-As aforementioned, GPU programming involes some trade-offs.
+As aforementioned, GPU programming involves some trade-offs.
It is not straightforward to address these trade-offs due to a complex
architecture of the GPU.
For example, parallel threads may conflict on some functional unit.
@@ -215,7 +215,7 @@ \subsection{GPU Programming}
still providing much better performance than CPU implementations.
An optimization of GPU programming is left open for future work.
-Listing~\ref{lst:detect}~and~\ref{lst:hog} illustrate the remainig parts
+Listing~\ref{lst:detect}~and~\ref{lst:hog} illustrate the remaining parts
of the program that we parallelize using the GPU.
We also unroll all the loops of these blocks to accelerate computations
on the GPU.
4 draft/introduction.tex
@@ -37,7 +37,7 @@ \section{Introduction}
consideration of real-world applications using deformable part models
\cite{Felzenszwalb10}.
While this is a popular vision-based object detection approach, what
-remains an open question is a genelized programming technique and a
+remains an open question is a generalized programming technique and a
quantification of performance characteristics for practical use.
We begin with an analysis of traditional CPU
implementations to find fundamental performance bottlenecks of HOG-based
@@ -45,7 +45,7 @@ \section{Introduction}
This analysis reasons about our approach to GPU implementations that we
offload only the detected compute-intensive blocks of the object
detection program to the GPU.
-The experimental results obstained from a real-world vehicle detection
+The experimental results obtained from a real-world vehicle detection
program show that commodity GPUs outperform high-performance multi-core
CPUs by 3x to 5x in frame-rate, though at least another 5x improvement
would be needed to deploy in the real world.