Modifications after presentation
joaovcarvalho committed Mar 7, 2016
1 parent 69dd1e3 commit 834412b
Showing 4 changed files with 271 additions and 229 deletions.
10 changes: 5 additions & 5 deletions AR01/main.cpp
@@ -182,9 +182,9 @@ mat4 generateFrustrumWithCamera(float nearf, float farf) {
}
}

- Rt.at<double>(0, 3) = -tvec.at<double>(0, 0) + 4;
- Rt.at<double>(1, 3) = -tvec.at<double>(1, 0) - 1;
- Rt.at<double>(2, 3) = tvec.at<double>(2, 0) + 4;
+ Rt.at<double>(0, 3) = -tvec.at<double>(0, 0);
+ Rt.at<double>(1, 3) = -tvec.at<double>(1, 0);
+ Rt.at<double>(2, 3) = tvec.at<double>(2, 0) ;

Rt.at<double>(3, 3) = 1.0;

@@ -280,7 +280,7 @@ void display() {

if (cameraCalibraded && found) {
mat4 mvp = generateFrustrumWithCamera(0.1, 5000);
- sun->rotate(0, 1.0f, 0);
+ //sun->rotate(0, 1.0f, 0);
sun->display(mvp);
}

@@ -319,7 +319,7 @@ void init()
cout << "Textures finished loading." << endl;

// Instantitate the objects
- sun = new Object(nonDiffuseShader, "models/Earth.obj", sunTexture);
+ sun = new Object(nonDiffuseShader, "models/space_frigate.obj", sunTexture);
cout << "Objects finished loading." << endl;

// Doing initial transformations
244 changes: 132 additions & 112 deletions report.lyx
@@ -101,6 +101,104 @@ The goal of this application is to implement a simple Augmented Reality
the 3D model.
\end_layout

\begin_layout Subsection
Pipeline/Brief Approach
\end_layout

\begin_layout Standard
The implementation uses the following technologies: C++ as the main programming
language, OpenCV as the computer vision library, OpenGL as the graphics library
and GLEW/FreeGLUT to integrate the C++ application with OpenGL.
A chessboard is used as the marker that identifies the plane above which
the 3D model is projected.
\end_layout

\begin_layout Paragraph
The Pipeline
\end_layout

\begin_layout Standard
The pipeline used for the implementation, which is explained in detail later,
was the following:
\end_layout

\begin_layout Standard
- Open the camera and store N frames in which the chessboard is detected.
\end_layout

\begin_layout Standard
- Use the N frames to calibrate the camera and get the camera intrinsic
matrix.
\end_layout

\begin_layout Standard
- For every new frame, compute the extrinsic matrix from the intrinsic matrix
and the detected chessboard.
\end_layout

\begin_layout Standard
- Convert the rotation vector to a rotation matrix using the Rodrigues operation.
\end_layout

\begin_layout Standard
- Generate the frustum projection matrix and the RT (Rotation and Translation)
matrix using the rotation matrix, translation vector and intrinsic matrix.
\end_layout

\begin_layout Standard
- Generate a polygon and use the frame as its texture.
Render it on the OpenGL screen.
\end_layout

\begin_layout Standard
- Render the model to the OpenGL screen, in front of the polygon holding
the frame as texture, using the projection and RT matrices as the
Model-View-Projection matrix.
\end_layout

\begin_layout Subsection
Challenges
\end_layout

\begin_layout Standard
The challenges of the project are:
\end_layout

\begin_layout Standard
- Detection of the markers
\end_layout

\begin_layout Standard
- Calibration of the camera and finding the intrinsic matrix.
\end_layout

\begin_layout Standard
- Finding the extrinsic matrix using correspondences of points.
\end_layout

\begin_layout Standard
- Finding the Model-View-Projection matrix using the extrinsic and intrinsic
matrices.
\end_layout

\begin_layout Standard
- Sending the MVP matrix to the graphics pipeline and rendering the 3D model
correctly.
\end_layout

\begin_layout Standard
Other kinds of challenges that I faced were:
\end_layout

\begin_layout Standard
- Integration of the computer vision library with the computer graphics
library.
\end_layout

\begin_layout Section
Approach
\end_layout

\begin_layout Subsection
Problem to be solved
\end_layout
@@ -111,7 +209,8 @@ When approaching this problem we need to think about the camera model we
We need to understand the process involved when the camera takes a picture
of the world.
We are dealing with a transformation from the 3D point ( position of the
object in the world) to a 2D ( image pixel ).
object in the world) to a 2D point ( image pixel ), performed by the
camera.
\end_layout

\begin_layout Standard
@@ -187,7 +286,7 @@ With the definition of P and p we can think about the process of the camera
In theory this process works by projecting the rays through a hole in a
lightproof box and the rays are going to be projected in the back of the
box as shown below.
This is the idea behind the pinhole camera.
This is the idea behind the pinhole camera model, which is used to describe
modern cameras.
\end_layout

\begin_layout Standard
@@ -395,7 +494,7 @@ Z
\begin_layout Standard
The RT matrix is used to rotate and translate the camera, so it holds
information about the orientation and position of the camera in space.
The RT matrix is also know as the extrinsic matrix as it holds information
The RT matrix is also known as the extrinsic matrix as it holds information
of the world around the camera.
The K matrix is called intrinsic matrix and it holds information about
how the camera works( field of view, aperture, etc.).
@@ -434,104 +533,6 @@ camera calibration
.
\end_layout
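The roles of the K (intrinsic) and RT (extrinsic) matrices described above can be illustrated with a minimal sketch of the pinhole projection. It is not code from the project, which works on cv::Mat; the names `Vec3`, `Mat3`, `Pixel` and `projectPoint` are hypothetical and chosen only for this example.

```cpp
#include <array>
#include <cassert>

// Hypothetical minimal types for this sketch (the project itself uses cv::Mat).
using Vec3 = std::array<double, 3>;
using Mat3 = std::array<std::array<double, 3>, 3>;

struct Pixel { double u, v; };

// Projects a world point P to a pixel: p ~ K * (R * P + t).
// R and t form the extrinsic (RT) part, K is the intrinsic matrix.
Pixel projectPoint(const Mat3& K, const Mat3& R, const Vec3& t, const Vec3& P) {
    Vec3 cam{};  // the point expressed in camera coordinates
    for (int i = 0; i < 3; ++i)
        cam[i] = R[i][0] * P[0] + R[i][1] * P[1] + R[i][2] * P[2] + t[i];

    Vec3 img{};  // homogeneous image coordinates
    for (int i = 0; i < 3; ++i)
        img[i] = K[i][0] * cam[0] + K[i][1] * cam[1] + K[i][2] * cam[2];

    assert(img[2] != 0.0 && "the point must not lie in the camera plane");
    // Divide by the third coordinate to get pixel coordinates.
    return {img[0] / img[2], img[1] / img[2]};
}
```

With an identity rotation and zero translation, a point on the optical axis lands on the principal point stored in the last column of K.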

\begin_layout Subsection
Pipeline
\end_layout

\begin_layout Standard
The approach used was to use the following technologies: C++ as main programming
language,OpenCV as computer vision library, OpenGL as the graphical library
and GLEW/FreeGLUT for the integration of the C++ application with the OpenGL.
We are going to use the chessboard as marker for identification of the
plane which the object that is going to be projected.
The image used for identification is the following.
\end_layout

\begin_layout Paragraph
The Pipeline
\end_layout

\begin_layout Standard
The pipeline used for the implementation was the following:
\end_layout

\begin_layout Standard
- Get the camera and store N frames when the chessboard is detected.
\end_layout

\begin_layout Standard
- Use the N frames to calibrate the camera and get the camera intrinsic
matrix.
\end_layout

\begin_layout Standard
- Now use every frame to get the extrinsic matrix using the intrinsic matrix
and the detected chessboard.
\end_layout

\begin_layout Standard
- Convert the rotation vector to a rotation matrix using the Rodrigues operation.
\end_layout

\begin_layout Standard
- Generate the frustrum projection matrix and RT ( Rotation and Translation)
matrix using the rotation matrix,translated vector and intrinsic matrix.
\end_layout

\begin_layout Standard
- Generate a polygon and use the frame as texture to it.
Render it on the OpenGL screen.
\end_layout

\begin_layout Standard
- Render the model using the projection and RT matrix as Model-View-Projection
matrix to the OpenGL screen in front of the polygon holding the frame as
texture.
\end_layout

\begin_layout Subsection
Challenges
\end_layout

\begin_layout Standard
The challenges of the project are:
\end_layout

\begin_layout Standard
- Detection of the markers
\end_layout

\begin_layout Standard
- Calibration of the camera and finding intrinsic matrix.
\end_layout

\begin_layout Standard
- Finding the extrinsic matrix using correspondences of points.
\end_layout

\begin_layout Standard
- Finding the Model-View-Projection matrix using the extrinsic and intrinsic
matrices.
\end_layout

\begin_layout Standard
- Sending the MVP matrix to the graphics pipeline and rendering the 3D model
correctly.
\end_layout

\begin_layout Standard
Others kinds of challenges that I faced were:
\end_layout

\begin_layout Standard
- Integration of the computer vision library with the computer graphics
library.
\end_layout

\begin_layout Section
Approach
\end_layout

\begin_layout Subsection
Image Capturing and Marker Detection
\end_layout
@@ -550,6 +551,8 @@ A Chessboard, as showed in the image, was used because of its regular pattern
get the connected components ( or contours as in OpenCV ), for each contour
it checks the aspect size and box size ( as given by the minimum rect area
).
These are the methods that the chessboard detection is composed of, but I
could not find more information on why it works or how exactly it is done.
\end_layout

\begin_layout Standard
@@ -569,7 +572,7 @@ This is going to let us know if the chessboard is present in the scene and
where the corners are located in the image.
We are going to use these images to calibrate the camera and calculate
the extrinsic matrix.
So when the chessboard is present we store this image in an array of images
So when the chessboard is present, we store this image in an array of images
that we are going to use to calibrate.
\end_layout

@@ -598,8 +601,11 @@ p=C*P
In our case we have the points of the chessboard from the image and we can
generate a set of 3D points for the chessboard by knowing how many squares
we have and giving a certain size to them.
Is important to give them a Z coordinate as well, in this case because
it's a plane, we can set the z coordinate to be 0 to all points.
It is important to give them a Z coordinate as well.
In this case, because the chessboard is a plane, we can set the z coordinate
of all points to 0.
We take one of the points as the origin, and the others lie in the plane
where z equals 0.
\end_layout
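A minimal sketch of how such a set of 3D chessboard points could be generated is shown here. It is an illustration under assumptions, not the project's code: `Point3` and `makeBoardPoints` are hypothetical names, and the board is described by its number of inner corners per row and column plus an arbitrary square size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical point type for this sketch (OpenCV code would use cv::Point3f).
struct Point3 { float x, y, z; };

// Builds the reference points for a board with cols x rows inner corners.
// The first corner is the origin and every point lies in the plane z = 0.
std::vector<Point3> makeBoardPoints(int cols, int rows, float squareSize) {
    assert(cols > 0 && rows > 0 && squareSize > 0.0f);
    std::vector<Point3> points;
    points.reserve(static_cast<std::size_t>(cols) * static_cast<std::size_t>(rows));
    for (int r = 0; r < rows; ++r)
        for (int c = 0; c < cols; ++c)
            points.push_back({c * squareSize, r * squareSize, 0.0f});
    return points;
}
```

The same list can be reused for every stored frame, because the board itself does not change; only the detected 2D corners differ from frame to frame.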

\begin_layout Standard
@@ -666,9 +672,8 @@ key "key-1"

\end_inset

as great inspiration, is to use: a closed-form solution, followed by a
nonlinear refinement based on the maximum likelihood criterion as described
by Zhang
, is to use: a closed-form solution, followed by a nonlinear refinement
based on the maximum likelihood criterion as described by Zhang
\begin_inset CommandInset citation
LatexCommand cite
key "key-2"
@@ -755,7 +760,7 @@ In the last step we have found both the intrinsic matrix and the extrinsic
We have calibrated the camera, and now we need to take every frame and find
the rotation and translation vectors.
With these we can assemble the RT matrix to be passed to OpenGL.
We are going to use the solvePnP function from OpenGL that follows the
We are going to use the solvePnP function from OpenCV, which follows the
same method as the calibration function, except that it only calculates the
rotation vector and translation vector, since the intrinsic matrix is already
known.
We then use the Rodrigues method to convert the rotation vector into the
rotation matrix.
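For reference, the Rodrigues conversion mentioned here can be written out by hand. The sketch below is a standalone illustration of the formula R = I + sin(theta) K + (1 - cos(theta)) K^2, where theta is the length of the rotation vector and K is the skew-symmetric matrix of its unit axis. It is not the OpenCV implementation, and `Vec3`, `Mat3` and `rodrigues` are hypothetical names.

```cpp
#include <array>
#include <cmath>

// Hypothetical minimal types for this sketch (the project itself uses cv::Mat).
using Vec3 = std::array<double, 3>;
using Mat3 = std::array<std::array<double, 3>, 3>;

// Converts an axis-angle rotation vector (direction = axis, length = angle in
// radians) into a 3x3 rotation matrix using the Rodrigues formula.
Mat3 rodrigues(const Vec3& rvec) {
    const double theta = std::sqrt(rvec[0] * rvec[0] + rvec[1] * rvec[1] + rvec[2] * rvec[2]);

    Mat3 R{};
    if (theta < 1e-12) {
        // A (near) zero vector means no rotation: return the identity.
        R[0][0] = R[1][1] = R[2][2] = 1.0;
        return R;
    }

    const double x = rvec[0] / theta, y = rvec[1] / theta, z = rvec[2] / theta;
    const double c = std::cos(theta), s = std::sin(theta), v = 1.0 - c;

    R[0] = {c + x * x * v,     x * y * v - z * s, x * z * v + y * s};
    R[1] = {y * x * v + z * s, c + y * y * v,     y * z * v - x * s};
    R[2] = {z * x * v - y * s, z * y * v + x * s, c + z * z * v};
    return R;
}
```

The resulting 3x3 block, together with the translation vector, fills the upper part of the 4x4 RT matrix; any sign changes needed to match the OpenGL axes are applied separately when that matrix is assembled.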
@@ -861,7 +866,10 @@ Our intrinsic matrix give us the distance between the image plane and the

\begin_layout Standard
Similarly we can define the top and bottom parameters.
That way we can define our perspective projection.
This way we can define our perspective projection.
The near and far planes are easy to choose: use a small near value and a
large far value.
These values are arbitrary, but you need to consider the size of the board
and its distance from the camera so that the 3D model is not clipped.
\end_layout
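One common way to turn the intrinsic parameters into frustum bounds is by similar triangles at the near plane, as sketched below. This is an assumption-laden illustration and not the project's `generateFrustrumWithCamera`: the names `Frustum`, `frustumFromIntrinsics` and `perspectiveFromFrustum` are hypothetical, the image y axis is assumed to point down, and the matrix is stored row-major with the same entries that the OpenGL glFrustum call produces.

```cpp
#include <array>
#include <cassert>

// Bounds of the viewing frustum on the near plane.
struct Frustum { double left, right, bottom, top; };

using Mat4 = std::array<std::array<double, 4>, 4>;

// fx, fy: focal lengths in pixels; (cx, cy): principal point in pixels;
// width, height: image size in pixels. Assumes the image y axis points down.
Frustum frustumFromIntrinsics(double fx, double fy, double cx, double cy,
                              double width, double height, double nearPlane) {
    assert(fx > 0.0 && fy > 0.0 && nearPlane > 0.0);
    return {-cx * nearPlane / fx,
            (width - cx) * nearPlane / fx,
            -(height - cy) * nearPlane / fy,
            cy * nearPlane / fy};
}

// Row-major perspective matrix with the same entries as glFrustum.
Mat4 perspectiveFromFrustum(const Frustum& f, double nearPlane, double farPlane) {
    assert(farPlane > nearPlane && nearPlane > 0.0);
    Mat4 m{};  // all entries start at zero
    m[0][0] = 2.0 * nearPlane / (f.right - f.left);
    m[0][2] = (f.right + f.left) / (f.right - f.left);
    m[1][1] = 2.0 * nearPlane / (f.top - f.bottom);
    m[1][2] = (f.top + f.bottom) / (f.top - f.bottom);
    m[2][2] = -(farPlane + nearPlane) / (farPlane - nearPlane);
    m[2][3] = -2.0 * farPlane * nearPlane / (farPlane - nearPlane);
    m[3][2] = -1.0;
    return m;
}
```

When the principal point sits exactly in the image centre the frustum is symmetric, so the two off-centre terms in the third column vanish.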

\begin_layout Subsection
@@ -871,13 +879,15 @@ Using the frame as a texture and rendering in the OpenGL screen
\begin_layout Standard
In order to display the image and the 3D model we need to render it in some
way.
I decided to render the frame and the 3D in the OpenGL pipeline and screen.
I decided to render the frame and the 3D model in the OpenGL pipeline and
screen.
My idea was to generate a simple plane to use as a medium to display the
frame and render the 3D model in front of it.
I used the frame data as the texture of the plane.
I also temporarily disabled the z-buffer test before rendering the plane
so the 3D model is always in front of it.
I used a simple shader to render the plane.
More details can be found in the source code of the project.
\end_layout

\begin_layout Standard
@@ -900,6 +910,10 @@ Finally I used the Assimp Library to load the 3D model from a .obj file.
Also I used a texture in order to improve the graphics and to more easily
distinguish the rotations.
The 3D model is a sphere with the texture of the sun.
I render it using a simple shader that samples the texture and receives an
MVP matrix as a parameter.
Every vertex is multiplied by this matrix in order to get its final position.
No illumination model was used, in order to keep it simple.
\end_layout
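What the shader does with the MVP matrix can be mimicked on the CPU. The sketch below is only an illustration of the arithmetic, not the project's shader code; `Vec4`, `Mat4`, `multiply`, `apply` and `perspectiveDivide` are hypothetical names and the matrices are assumed to be row-major.

```cpp
#include <array>
#include <cassert>

using Vec4 = std::array<double, 4>;
using Mat4 = std::array<std::array<double, 4>, 4>;

// Matrix product a * b, e.g. projection * RT to obtain the MVP matrix.
Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 out{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                out[i][j] += a[i][k] * b[k][j];
    return out;
}

// Multiplies a homogeneous vertex by the matrix (what the shader does per vertex).
Vec4 apply(const Mat4& m, const Vec4& v) {
    Vec4 out{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            out[i] += m[i][j] * v[j];
    return out;
}

// Division by w, which the graphics pipeline performs after the vertex stage.
Vec4 perspectiveDivide(const Vec4& clip) {
    assert(clip[3] != 0.0);
    return {clip[0] / clip[3], clip[1] / clip[3], clip[2] / clip[3], 1.0};
}
```

In this convention the MVP matrix would be `multiply(projection, rt)`, and each model vertex is passed through `apply` before the division by w.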

\begin_layout Section
@@ -911,11 +925,11 @@ Results
\end_layout

\begin_layout Standard
The results were good after a series of complications with the integration
The results were good, after a series of complications with the integration
of the libraries.
I had some major problems getting Visual Studio, OpenCV and OpenGL to work
together.
After that most of the steps went okay.
After that, most of the steps went okay.
You can see some of the results below:
\end_layout

@@ -976,6 +990,12 @@ Other great tip is to wait a certain number of frames between acquiring
A simple counter that is restarted every time you store an image is enough.
\end_layout

\begin_layout Standard
After much experimenting with the number of frames used for the camera
calibration, I found that 15 is a good number to use.
It doesn't take much time to calibrate and gives us good results.
\end_layout
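The counter-based frame selection described above could look roughly like the sketch below. It is an illustration, not the project's code: `FrameSampler` and its members are hypothetical names, the target of 15 frames comes from the text, and the gap of 10 frames used in the usage note is an assumed value.

```cpp
#include <cassert>

// Decides which captured frames are kept for calibration.
class FrameSampler {
public:
    FrameSampler(int framesNeeded, int gap)
        : needed_(framesNeeded), gap_(gap), sinceLast_(gap), stored_(0) {
        assert(framesNeeded > 0 && gap >= 0);
    }

    // Call once per captured frame. Returns true when the frame should be
    // stored: the chessboard was found and enough frames have passed since
    // the last stored one.
    bool offer(bool chessboardFound) {
        if (ready()) return false;
        if (chessboardFound && sinceLast_ >= gap_) {
            sinceLast_ = 0;  // restart the counter on every stored frame
            ++stored_;
            return true;
        }
        ++sinceLast_;
        return false;
    }

    // True once enough frames are available to run the calibration.
    bool ready() const { return stored_ >= needed_; }
    int stored() const { return stored_; }

private:
    int needed_;
    int gap_;
    int sinceLast_;
    int stored_;
};
```

A capture loop would create `FrameSampler sampler(15, 10);`, call `sampler.offer(found)` for every frame, and start the calibration as soon as `sampler.ready()` returns true.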

\begin_layout Section
Conclusion
\end_layout
