Commit
naishh committed Jun 26, 2012
1 parent d37fe7e commit cf58544
Showing 9 changed files with 43 additions and 42 deletions.
9 changes: 5 additions & 4 deletions ch0_abstract.tex
@@ -1,9 +1,10 @@
\begin{abstract}
% borrow from the conclusion
We build an automatic system that adds semantics to a urban scene.
A 3D model was extracted by combining the \emph{FIT3D toolbox\cite{FIT3D}} combined with a skyline detection algorithm.
The skyline detector extracts straight line segments from the upper edges of an
image. These are used to set the heights of the walls of the 3D model.
We build an automatic system that adds semantics to an urban scene.
A 3D model was extracted by combining the \emph{FIT3D toolbox\cite{FIT3D}} with a skyline detection algorithm.
The skyline detector extracts straight line segments from the upper edge of an
image. These are used to determine the wall heights of an extended 2D model extracted from
\emph{Openstreetmap\cite{Openstreetmap}}.
Next, windows are extracted by two different methods. The first method
searches for connected horizontal and vertical edges that originate from the window
frame. The second method works on rectified facades and consists of 1) the
26 changes: 13 additions & 13 deletions ch1_intro.tex
@@ -1,7 +1,7 @@
\section{Introduction}
When we humans look at an urban scene we can immediately tell which part
represents a building, a tree, a door, a window or a parked car.
Even if the scene suffers from high occlusion (a tree occluding the largest part
Even if the scene suffers from high occlusion (e.g. a tree occluding the largest part
of a building) or extreme perspective distortion (a building seen from the
corner of your eye), we perform this task with very high accuracy.
For a computer system, however, this task is far from trivial.\\
@@ -12,22 +12,21 @@ \section{Introduction}

The most important reason for our excellent visual perception is that we combine
a series of depth cues \cite{psy} (which enable us to experience depth) with top-down
processing (which enables us to classify objects).\\

processing (which enables us to classify objects).
One of the most important depth cues is stereopsis. We use two eyes and look at
the same scene from slightly different angles. This makes it possible to
triangulate the distance to an object with a high degree of accuracy
\cite{psy}\cite{hartley}. \\
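For an ideal rectified stereo pair this triangulation reduces to a one-line relation: depth is inversely proportional to disparity, $Z = fB/d$. A minimal sketch in Python (the focal length, baseline and disparity values are invented for illustration, not taken from the thesis):

```python
def depth_from_disparity(f_px, baseline_m, disparity_px):
    """Ideal rectified stereo pair: depth Z = f * B / d."""
    return f_px * baseline_m / disparity_px

# Hypothetical numbers: 800 px focal length, 6.5 cm baseline, 10 px disparity.
z = depth_from_disparity(800, 0.065, 10)
print(z)  # 5.2 metres
```

The same relation also explains why stereo accuracy degrades with distance: a one-pixel disparity error matters far more for small disparities (far objects) than for large ones.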

Objects are classified using a widely supported theory in psychology: top down processing \cite{anderson}
If we want to perceive a window, we tap into the neurons that are activated according to
Objects are classified using a widely supported theory in psychology: top-down
processing \cite{anderson}. If we want to perceive a window, we tap into the neurons that are activated according to
our (generalized) description of a window. E.g. because windows are often
rectangular and their color stands out, we tap into the neurons that
perceive straight parallel lines, orthogonal corners and intense color changes.
These visual processes are extremely informative if we want to build a computer system
that acts accordingly. \\

This thesis is about our work of an automatic system that adds semantics to a
This thesis describes our work on an automatic system that adds semantics to an
urban scene, inspired by the human brain. We use stereopsis to generate a 3D
model of the building. This model is improved by detecting the building contour
using edge features. The 3D model is used to rectify a facade, which makes
@@ -94,7 +93,7 @@ \subsection{Application examples}
structure explicit. For a more accurate disambiguation, other types of
contextual information are desired. The semantic interpretation of the
facade can fulfill this need. In this context, window detection can be used
as a strong discriminator.
as a strong discriminator.\\

We can conclude that semantic interpretation plays an important role in the
understanding of urban scenes and is applied in a wide range of domains.
@@ -103,15 +102,16 @@ \subsection{Application examples}
\subsection{Thesis outline}
The outline of this thesis is as follows:\\

We start with explaining basic computer vision techniques and the 3rd party
software \emph{FIT3D toolbox} in Chapter 2. These techniques are the driving
We start by explaining basic computer vision techniques and the \emph{FIT3D toolbox} in Chapter 2.
These techniques are the driving
force behind the algorithms used in both skyline detection and window detection.
In Chapter 3 we explain a novel application of skyline detection: the detection
of building contours in urban scenes. Next we use this result in Chapter 4 to
extract a 3D model of a building.
In Chapter 5 we start a new topic: window detection. We propose a window
of building contours in urban scenes. Next, we use this result to
extract a 3D model of a building in Chapter 4.

In Chapter 5 we start a new topic: window detection. First, we propose a window
detection method that operates on an unrectified facade. The second method uses
a rectified facade. We discuss and compare two window alignment methods and two
a rectified facade. We discuss and compare two window alignment and
classification methods. We conclude in Chapter 6 and finish with additional
results in the Appendices.\\

32 changes: 14 additions & 18 deletions ch2_preliminaries.tex
@@ -3,7 +3,7 @@ \section{Preliminaries on Computer Vision}
\label{sec:ch2}
In this chapter we discuss the basic computer vision techniques that are used for
skyline detection and window detection. Furthermore, we discuss the 3rd
party software, the \emph{FIT3D toolbox} \cite{FIT3D} which is used in 3D building extraction and facade rectification.
party software, the \emph{FIT3D toolbox} \cite{FIT3D}, which is used for 3D building extraction and facade rectification.

\subsection{Hough transform}
\label{sec:prelimHough}
@@ -31,20 +31,22 @@ \subsubsection{Theory}

This means that a point in $(x,y)$ space appears as a sinusoidal
curve in the Hough parameter $(r,\theta)$ space. Furthermore, a line in
$(x,y)$ space appears as a point in $(r,\theta)$ space. Let's see an example, the
following image is transformed into the space $(r,\theta)$.
$(x,y)$ space appears as a point in $(r,\theta)$ space.\\
Let's look at an example: the image in Figure \ref{fig:HoughTransform_edge.eps}
is transformed into $(r,\theta)$ space.

%todo introduce the accumulator array,
%todo transform gif to eps
\fig{HoughTransform_edge.eps}{An input image, consisting of eight straight lines, for the Hough transform}{0.5}
\figsHor{HoughTransform_peaks}{HoughTransform_peaks1.eps}{HoughTransform_peaks2.eps}{Hough transform} {$(r, \theta)$ values}{$(r, \theta)$ accumulator array (quantized)}
\clearpage


As you can see, for every edge point
in Figure \ref{fig:HoughTransform_edge.eps}
a curve is generated in $(r,\theta)$ space, shown in Figure
\ref{fig:HoughTransform_peaks1.eps}.
On eight positions (white dots) the number of intersecting sinusoidal
On eight positions (dots) the number of intersecting sinusoidal
curves is high. These positions correspond to the eight separate straight
line segments in Figure \ref{fig:HoughTransform_edge.eps}.
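The point-to-curve mapping above can be checked numerically: all points on one straight line vote for the same $(r,\theta)$ cell. A minimal sketch in Python (the line and sample points are invented for illustration):

```python
import math

# Points on the line x + y = 10, so r = 10 / sqrt(2) at theta = 45 degrees.
points = [(0, 10), (2, 8), (5, 5), (9, 1)]

theta = math.radians(45)
rs = [x * math.cos(theta) + y * math.sin(theta) for x, y in points]

print([round(r, 3) for r in rs])  # all four values round to 7.071
```

Every point on the line contributes a sinusoid that passes through this single $(r,\theta)$ value, which is why collinear points produce a sharp peak in the accumulator.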

@@ -53,18 +55,14 @@ \subsubsection{Theory}

\subsubsection{Implementation}
The input of a Hough transform is a binary image. In our research it is the output of
the skyline detector (\ref{sec:skylinedetection}). In the case of window
detection (\ref{sec:windowDetection}) it is the output of an edge image.\\
the skyline detector (Chapter \ref{sec:skylinedetection}). In the case of window
detection (Chapter \ref{sec:windowDetection}) it is the output of an edge image.\\

The Hough transform develops an accumulator array of a quantized parameter space $(r, \theta)$.

It loops through the binary image and for each positive value
The Hough transform builds an accumulator array over a quantized parameter space $(r, \theta)$. It loops through the binary image and for each positive pixel
it generates all possible lines, quantized $(r, \theta)$ pairs, that intersect with this point.
For each candidate it casts a vote in the accumulator array.
Lines $(r, \theta)$ that receive a large number of votes,
i.e. the dots in Figure \ref{fig:HoughTransform_peaks1.eps} are the found straight lines in the $(x,y)$ space.

These positions are found by looking for local maxima in the accumulator array.
i.e. the dots in Figure \ref{fig:HoughTransform_peaks1.eps}, are the detected straight lines in $(x,y)$ space. These positions are found by looking for local maxima in the accumulator array.
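The voting loop described above can be sketched in a few lines. This is a minimal illustration in Python (the array sizes, test image and all names are made up; the thesis itself uses MATLAB's implementation):

```python
import math

def hough_lines(points, r_max, n_r=64, n_theta=180):
    """Vote for quantized (r, theta) pairs; points is a list of (x, y) edge pixels."""
    acc = [[0] * n_theta for _ in range(n_r)]
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            r = x * math.cos(theta) + y * math.sin(theta)
            # quantize r from [-r_max, r_max] into an index in [0, n_r)
            r_idx = int((r + r_max) * (n_r - 1) / (2 * r_max))
            if 0 <= r_idx < n_r:
                acc[r_idx][t] += 1
    return acc

# A horizontal line y = 3 sampled at 20 points: every point votes for one shared cell.
pts = [(x, 3) for x in range(20)]
acc = hough_lines(pts, r_max=20)
votes, r_idx, t_idx = max((v, r, t) for r, row in enumerate(acc) for t, v in enumerate(row))
print(votes)  # 20: the winning cell collects one vote per edge point
```

In a full implementation the peak search would be a proper local-maxima scan rather than a global `max`, but the voting principle is the same.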

\subsubsection{$\theta$ constrained Hough transform}
The accumulator array consists of two dimensions, $r$ and $\theta$.
@@ -76,13 +74,11 @@ \subsubsection{$\theta$ constrained Hough transform}
For example, the skyline of a building will appear approximately horizontal. If we
want to detect windows, we would like to detect edges in the horizontal and vertical directions.
This can easily be achieved by adjusting the $\theta$ range.
For example if one would detect lines in the horizontal direction of
a photograph of a building taken by a user, $\theta = [-10..0..10]$.Although
only $\theta = 0$ presents an exact horizontal line we broaden the interval
because the user hardly ever holds the camera exactly orthogonal.
For example, to detect lines in the horizontal direction one can use $\theta = [-10..0..10]$. An interval is used because in practice the lines
often deviate slightly from an exact horizontal line, where $\theta = 0$.
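Restricting the sweep is a one-line change: instead of the full $\theta$ range, only an interval around the expected orientation is scanned. A tiny sketch (Python, following the thesis convention that $\theta = 0$ corresponds to a horizontal line):

```python
# Full sweep versus a theta-constrained sweep for near-horizontal lines.
full_thetas = list(range(-90, 90))        # generic straight-line detection
horizontal_thetas = list(range(-10, 11))  # theta = [-10..0..10] as in the text

print(len(full_thetas), len(horizontal_thetas))  # 180 versus 21 candidate angles
```

Besides suppressing peaks of the wrong orientation, this shrinks the accumulator and the voting loop proportionally, so the constrained transform is also considerably cheaper.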

\subsubsection{MATLAB\cite{matlab} parameters}
We used a standard MATLAB\cite{matlab} implementation of the Hough transform for straight lines. This implementation comes with some interesting parameters:\\
We used a standard MATLAB\cite{matlab} implementation of the Hough transform. This implementation comes with some interesting parameters:\\

The \emph{MinimumLength} parameter specifies the minimum length that a line must have to be valid. This is especially useful if we want to detect a long straight skyline, or if we want to discard lines that are too small to form, for example, a window.\\
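The effect of such a minimum-length filter is easy to sketch (plain Python; the segments and threshold are made up, and the parameter name follows the thesis text, not necessarily MATLAB's exact API):

```python
import math

segments = [((0, 0), (120, 2)),    # long, skyline-like segment
            ((10, 10), (14, 13)),  # short noise segment (length 5)
            ((5, 40), (80, 41))]   # long segment

def length(seg):
    (x1, y1), (x2, y2) = seg
    return math.hypot(x2 - x1, y2 - y1)

minimum_length = 20
kept = [s for s in segments if length(s) >= minimum_length]
print(len(kept))  # 2 segments survive the filter
```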

@@ -171,7 +167,7 @@ \subsection{Camera calibration}
explained next.


\subsection{FIT3D toolbox \ref{FIT3D}}
\subsection{FIT3D toolbox \cite{FIT3D}}
\label{sec:prelimFIT3D}
%todo give FIT3D a proper intro
The \emph{FIT3D toolbox} \cite{FIT3D} is used for several purposes in this thesis.
6 changes: 3 additions & 3 deletions ch5_windowDetection.tex
@@ -328,7 +328,7 @@ \subsubsection{3D plane based rectification}
computationally very expensive, as each pixel needs to be projected. To keep the
computational cost to a minimum we project only the necessary data. Since we
are using Hough lines, we project only the coordinates of the endpoints of the found Hough lines.
This is allowed because the projective transformation is we apply is a affine
This is allowed because the projective transformation we apply is an affine
transformation, which preserves the
straightness of lines \cite{linearalgebra}. Note that this means we apply the edge detection and
Hough line extraction on the unrectified image.\\
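The endpoint shortcut relies exactly on that straightness property: mapping only the two endpoints and redrawing the segment yields the same line as mapping every pixel. A small self-check in Python (the transform matrix is a made-up rectifying transform, not the one computed in the thesis):

```python
def apply_affine(M, p):
    """Apply a 2x3 affine matrix M to point p = (x, y)."""
    x, y = p
    return (M[0][0] * x + M[0][1] * y + M[0][2],
            M[1][0] * x + M[1][1] * y + M[1][2])

# Hypothetical rectifying transform: scale + shear + translation.
M = [[1.25, 0.25,  5.0],
     [0.0,  0.75, -2.0]]

a, b = (10.0, 40.0), (30.0, 40.0)          # endpoints of a detected Hough line
mid = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)

# Straightness check: the projected midpoint lies exactly on the projected segment.
pa, pb = apply_affine(M, a), apply_affine(M, b)
pmid = apply_affine(M, mid)
print(pmid == ((pa[0] + pb[0]) / 2, (pa[1] + pb[1]) / 2))  # True
```

For a general projective (non-affine) homography, straightness is still preserved but midpoints are not, so the endpoint trick would still recover the correct line, just not its interior parameterization.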
@@ -734,7 +734,7 @@ \subsubsection{Basic window classification (based on line amount)}
non-window areas: the window classification. We developed two different methods for this.

Instead of classifying each block independently, we classify full rows and
columns of blocks as window or non-window areas. This approach results in a accurate
columns of blocks as window or non-window areas. This approach results in an accurate
classification, as it combines a full block row and block column as evidence for a single
window.
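A minimal sketch of this row/column scheme (Python; the grid of per-block line counts and the threshold are invented for illustration, and treating a block as window only when both its row and its column qualify is one plausible reading of the combination step):

```python
# Per-block line counts on a hypothetical 4x5 facade grid.
grid = [
    [0, 5, 0, 6, 0],
    [0, 4, 1, 5, 0],
    [0, 0, 0, 0, 0],
    [0, 6, 0, 4, 0],
]
T = 6  # illustrative evidence threshold per row/column

row_is_window = [sum(row) >= T for row in grid]
col_is_window = [sum(col) >= T for col in zip(*grid)]

# A block is classified as window only if both its row and its column qualify.
windows = [[r and c for c in col_is_window] for r in row_is_window]
print(sum(map(sum, windows)))  # 6 window blocks on this grid
```

Pooling evidence per row and column makes the decision robust to a single noisy block, which is the advantage claimed over independent per-block classification.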

@@ -1027,7 +1027,7 @@ \subsubsection{Improved window classification (based on shape of the histogram f
window areas are falsely classified as negatives; see the left side of
Figure \ref{fig:w_Dirk4Trans_ImClassRect.eps}. This is caused by a small area
between the windows that is classified as a non-window area. This could be
solved by by adding a minimum size constraint of a area to be threaded as a
solved by adding a minimum size constraint for an area to be treated as a
non-window area. In this way small negatively classified areas cannot interrupt
the adjacent windows.

4 changes: 2 additions & 2 deletions commandsPre.tex
@@ -46,11 +46,11 @@
\begin{figure}[!ht]
\centering
\subfigure[#5]{
\includegraphics[width=6cm]{img/#2}
\includegraphics[width=5cm]{img/#2}
\label{fig:#2}
}
\subfigure[#6]{
\includegraphics[width=6cm]{img/#3}
\includegraphics[width=5cm]{img/#3}
\label{fig:#3}
}
\caption{#4}
2 changes: 1 addition & 1 deletion header.tex
@@ -12,7 +12,7 @@
%\subject {subject comes here}
%\keywords {keywords come here}

\title {\LARGE \sc{[DRAFT]Semantic annotation of urban scenes:}\\Skyline and window detection}
\title {\LARGE \sc{Semantic annotation of urban scenes:}\\Skyline and window detection}
%\title{\huge{Dynamic Programming For \\Extensive Form Games With \\Imperfect
%Information}}

Binary file modified main.pdf
Binary file not shown.
4 changes: 3 additions & 1 deletion references.bib
@@ -274,7 +274,9 @@ @Book{anderson
}

@MISC{kovesi,
title = {P. D. Kovesi. MATLAB and Octave functions for computer vision and image processing. School of Computer Science \& Software Engineering, The University of Western Australia. Available from: }
author = {P. D. Kovesi},
title = {MATLAB and Octave functions for computer vision and image processing},
note = {School of Computer Science \& Software Engineering, The University of Western Australia},
howpublished = "\url{http://www.csse.uwa.edu.au/~pk/research/matlabfns/}"
}

2 changes: 2 additions & 0 deletions todalooMain.txt
@@ -10,6 +10,8 @@ How do I achieve this? By simply DOING it; if this does not work:


---------------------------------------------------------------------------------
ch1
explain and look up the L and H patterns

---------------------------------------------------------------------------------
todos are listed in 2 places!
