Augmented Reality: Render 3D object in a video frame

(Work in Progress)

This repository is a simple implementation of rendering 3D objects in a video frame. It is inspired by this work: Augmented reality with Python and OpenCV.

Details:

The link above gives a very detailed explanation of this topic. In short, the points below cover the basic idea of this project.

Capture a reference image (a flat 2D surface) on which you wish to render your 3D object (shown below is an example of what it should look like).

(ps: this is not a paid promotion :P)

Capture a video in which the above-mentioned reference image appears.
Download an .obj file. (I used clara.io, which was also recommended on the website above.)
What is an .obj file? - Wiki link
Extract feature keypoints and feature descriptors from the reference image.
Loop over every frame in the video:
- Extract feature keypoints and feature descriptors.
- Find the best matches between the descriptors from the reference image and the descriptors extracted from the video frame.
- Estimate the homography matrix from these matched descriptors (and their keypoints). The homography matrix tells us the transformation that the reference image underwent in the video frame.
- Once we have this transformation matrix, we can transform the 3D object to match the orientation of the reference flat surface in the video frame.
Next is the tricky part.

Let's start by first describing the homography matrix mentioned above:
Source: F. Moreno
We have the calibration matrix, or intrinsic matrix (shaded blue), and the external calibration matrix, or extrinsic matrix (shaded red). The extrinsic matrix is made up of a rotation matrix and a translation vector. On the left-hand side we have the (u, v) coordinates in the image plane of a given point p (any 3D point, denoted [x, y, z]). The product of the intrinsic and extrinsic matrices is called the projection matrix; restricted to points on a plane, it becomes the homography matrix.

We assume that any position on the flat surface plane (reference image) can be described by a 3D position p = [x, y, z]^T. Here, the z-coordinate represents the direction perpendicular to the plane, and is hence always zero. This reduces the point to p = [x, y, 0]^T.

For this reason, we can drop the third column of the rotation matrix: it is multiplied by the z-coordinate, which is 0 for every point we want to map.
Source: F. Moreno
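
Written out, the general projection and its planar special case look roughly like this. This is a reconstruction from the description above, not a copy of the figures: the scale factor s, the symbol K for the intrinsic matrix, and the label H for the resulting homography are notation assumed here.

```latex
s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
= K \begin{bmatrix} R_1 & R_2 & R_3 & t \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
\quad \xrightarrow{\; z = 0 \;} \quad
s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
= \underbrace{K \begin{bmatrix} R_1 & R_2 & t \end{bmatrix}}_{H}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
```

With z = 0, the column R_3 is always multiplied by zero, which is why the 3x4 matrix collapses to the 3x3 matrix H.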
However, to project all the points of the 3D object, we now need to project points whose z-coordinate is nonzero.
The basic idea is to first extract [R1, R2, t] from the extrinsic camera matrix by multiplying the inverse of the intrinsic camera matrix (assuming we know it) with the homography matrix that we calculated earlier. We then find a new orthonormal pair of vectors close to (R1, R2) and compute R3 as their cross product. [R1, R2, R3, t] is the new extrinsic matrix. Combining this extrinsic matrix with the intrinsic matrix gives us a new 3x4 projection matrix that lets us place any point of the 3D object in our video frame.
If this was difficult to understand, I highly recommend visiting the website mentioned above.
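
The recovery of [R1, R2, R3, t] described above can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not necessarily what augment.py does: the function names `projection_from_homography` and `project_points` are hypothetical, the scale is removed with the geometric mean of the two column norms, and sign conventions are not handled (some implementations negate the homography first, depending on their coordinate frame).

```python
import numpy as np


def projection_from_homography(K, H):
    """Build a 3x4 projection matrix K [R1 R2 R3 t] from intrinsics K and homography H."""
    # Undo the intrinsics: the columns of G are [R1, R2, t] up to a common scale.
    G = np.linalg.inv(K) @ H
    g1, g2, g3 = G[:, 0], G[:, 1], G[:, 2]
    # Remove the scale using the geometric mean of the two rotation-column norms.
    scale = np.sqrt(np.linalg.norm(g1) * np.linalg.norm(g2))
    r1, r2, t = g1 / scale, g2 / scale, g3 / scale
    # With noisy data r1 and r2 are rarely exactly orthonormal. Replace them with
    # an orthonormal pair in the same plane, symmetric about their bisector c.
    c = r1 + r2
    p = np.cross(r1, r2)
    d = np.cross(c, p)
    c, d = c / np.linalg.norm(c), d / np.linalg.norm(d)
    r1 = (c + d) / np.sqrt(2)
    r2 = (c - d) / np.sqrt(2)
    # The third axis is perpendicular to the other two.
    r3 = np.cross(r1, r2)
    return K @ np.column_stack((r1, r2, r3, t))


def project_points(P, points_3d):
    """Project an (N, 3) array of 3D points to (N, 2) pixel coordinates."""
    homogeneous = np.hstack([points_3d, np.ones((len(points_3d), 1))])
    uvw = homogeneous @ P.T
    return uvw[:, :2] / uvw[:, 2:3]
```

The vertices of the loaded .obj model, scaled as needed, could then be passed to `project_points` to obtain the pixel positions at which to draw each face.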

Requirements:

ToDo

Implementation Details:

I have tried to make this repository as simple as possible. There are only two things to keep in mind while running the code.

config.yml file:

This YAML file contains the following fields:

DATASET:
- INPUT_DIR: Path to the folder where all the images are stored. (default=test)
- REF_IMG: The reference image that we are looking for in every video frame.
- VIDEO_PATH: Path where the video file is saved.
- OUTPUT_DIR: Path to the folder where all the results should be stored. (default=results)
- RENDERED_OBJ: Path to the .obj file that you wish to render in the video frame.
FEATURES:
- FEATURE_DESCRIPTORS: Default is set (Other choices are provided in comments)
- FEATURE_MATCHING: Default is set (Other choices are provided in comments)
- FEATURE_MATCHING_THRESHOLD: Default is set (Other choices are provided in comments)
CAMERA_PARAMETERS:
- INTRINSIC: Intrinsic camera parameters as a 3x3 matrix (written as a list of lists).
RENDERING:
- SCALE_FACTOR:

One can simply change the parameters in the config file to try out the different techniques.
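
As an illustration of the INTRINSIC field, the sketch below turns a list of lists into a NumPy matrix and checks its shape. The dictionary is a hypothetical stand-in for the parsed config, the numbers are placeholders rather than values from this repository, and `load_intrinsic` is an assumed helper name, not a function from config.py. The layout shown (focal lengths on the diagonal, principal point in the last column) is the usual pinhole-camera convention.

```python
import numpy as np

# Hypothetical stand-in for the parsed CAMERA_PARAMETERS section.
# The numbers are placeholders, not values from this repository.
example_config = {
    "CAMERA_PARAMETERS": {
        "INTRINSIC": [
            [800.0, 0.0, 320.0],  # fx, skew, cx
            [0.0, 800.0, 240.0],  # 0,  fy,   cy
            [0.0, 0.0, 1.0],
        ]
    },
}


def load_intrinsic(config):
    """Return the intrinsic matrix as a 3x3 float array, or raise if malformed."""
    K = np.asarray(config["CAMERA_PARAMETERS"]["INTRINSIC"], dtype=float)
    if K.shape != (3, 3):
        raise ValueError(f"INTRINSIC must be 3x3, got {K.shape}")
    return K
```

Checking the shape early gives a clearer error than a failed matrix product later in the pipeline.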

Command to run the program: python -m run --c [path to config.yml]

I have kept the path to config.yml as an argument so that the user can keep multiple config files for different projects (with different images and different feature settings).
