updated Evaluation

1 parent 040222a commit 1450f52a4dafee27cfc36e186c89949337a053a1 @shinpei0208 committed Apr 3, 2013
Showing with 29 additions and 10 deletions.
  1. +1 −1 draft/assumption.tex
  2. BIN draft/draft.pdf
  3. +26 −7 draft/evaluation.tex
  4. +1 −1 draft/implementation.tex
  5. +1 −1 draft/introduction.tex
draft/assumption.tex
@@ -35,7 +35,7 @@ \section{Assumption}
independently.
In consequence, there are approximately 10 million computational
blocks for a single high-definition image, while the frame-rate needs to
-meet $10$ to $20$ frames per second (FPS) for practical use.
+meet 10$\sim$20 frames per second (FPS) for practical use.
This data-parallel compute-intensive nature of HOG-based object
detection motivates the use of GPUs in this paper.
draft/draft.pdf
Binary file not shown.
draft/evaluation.tex
@@ -30,12 +30,12 @@ \subsection{Experimental Results}
\begin{figure}[t]
\begin{center}
\includegraphics[width=\hsize]{fig/float_exe_time.eps}\\
- \caption{Execution times of the single precision floating point program.}
+ \caption{Computation times of the single precision floating point program.}
\label{fig:float_exe_time}
\end{center}
\end{figure}
-Fig.~\ref{fig:float_exe_time} shows the execution times of all variants
+Fig.~\ref{fig:float_exe_time} shows the computation times of all variants
of the vehicle detection program configured to use the single precision
for floating operations.
The dimensions of input images are 640$\times$480 pixels.
@@ -77,12 +77,12 @@ \subsection{Experimental Results}
\begin{figure}[t]
\begin{center}
\includegraphics[width=\hsize]{fig/double_exe_time.eps}\\
- \caption{Execution times of the double precision floating point program.}
+ \caption{Computation times of the double precision floating point program.}
\label{fig:double_exe_time}
\end{center}
\end{figure}
-Fig. \ref{fig:double_exe_time} shows the execution times of all variants
+Fig. \ref{fig:double_exe_time} shows the computation times of all variants
of the vehicle detection problem configured to use the double precision
for floating operations.
Unlike the single-precision scenario, the Kepler GPUs outperform the
@@ -98,13 +98,13 @@ \subsection{Experimental Results}
\begin{figure}[t]
\begin{center}
\includegraphics[width=\hsize]{fig/time_on_image_size.eps}\\
- \caption{Impact of the image size on execution times.}
+ \caption{Impact of the image size on computation times.}
\label{fig:time_on_image_size}
\end{center}
\end{figure}
Fig. \ref{fig:time_on_image_size} shows the impact of the image size on
-execution times.
+computation times.
We herein use the program configured to use the single precision for
floating-point operations.
The GPU implementation uses the GeForce GTX 580 GPU, which is the best
@@ -114,4 +114,23 @@ \subsection{Experimental Results}
image size.
This means that the benefit of our GPU implementations as compared to
the traditional CPU implementations would hold for more high-resolution
-image processing.
+image processing.
+
+\begin{figure}[t]
+ \begin{center}
+ %\includegraphics[width=\hsize]{fig/time_on_image_size.eps}\\
+ ADD FIGURE HERE
+ \caption{Impact of the block and thread shapes on computation times.}
+ \label{fig:time_on_block_thread_shapes}
+ \end{center}
+\end{figure}
+
+
+\begin{figure}[t]
+ \begin{center}
+ %\includegraphics[width=\hsize]{fig/time_on_image_size.eps}\\
+ ADD FIGURE HERE
+ \caption{The breakdown of computation times of the GPU implementation.}
+ \label{fig:breakdown_gpu}
+ \end{center}
+\end{figure}
draft/implementation.tex
@@ -79,7 +79,7 @@ \subsection{Program Analysis}
\begin{figure}[t]
\begin{center}
\includegraphics[width=0.5\hsize]{fig/breakdown.eps}\\
- \caption{The breakdown of computation times.}
+ \caption{The breakdown of computation times of the sequential implementation.}
\label{fig:breakdown}
\end{center}
\end{figure}
draft/introduction.tex
@@ -47,7 +47,7 @@ \section{Introduction}
using the GPU step by step to minimize its makespan.
The experimental results obtained from a real-world car detection
program using a commodity GPU show that the GPU outperforms the CPU by
-1.5x to 3x in frame-rate, while another 2x improvement would be needed
+3x to 5x in frame-rate, while another 5x improvement would be needed
at least to deploy in the real world.
\textbf{Organization:}
