DTAM: Dense Tracking and Mapping in Real-time

Usage

docker build -t rin/python:opencv ./python_docker/
docker run --rm -it -v $HOME/coding/:/opt/coding/ -w /opt/coding/ rin/python:opencv
python mapper.py

After the script finishes, you should see the generated file photometric_loss_vs_depth__Fountain_P11.png.




Overview

This is an implementation of DTAM: Dense Tracking and Mapping in Real-Time by Richard A. Newcombe et al. The paper proposes using all pixels, rather than a sparse set of feature points, both for tracking the camera pose and for dense mapping of the environment. It shows advantages over feature-based tracking: dense tracking is more robust to motion blur and camera defocus, even at high frame rates. On the mapping side, a depth is obtained for every pixel, with a smooth-surface assumption handling low-textured regions. The system was designed for AR/VR applications, where a map of a confined environment is built first and then used for real-time tracking on a GPU.

Motivation

Reasons for implementation:

  • To better understand the paper.
  • Paul Foster implemented OpenDTAM under the GSoC program, with detailed weekly logs. That implementation seems rough but is well commented, with citations. However, a direct GPU implementation is difficult to understand.
  • This implementation will later be extended for navigation purposes, reconstructing coarser, trajectory-relevant regions.

Resources

Dense Visual SLAM - R.A. Newcombe, PhD thesis

On Benchmarking Camera Calibration and Multi-View Stereo for High Resolution Imagery - C. Strecha

super3d - Kitware impl. (looks similar to DTAM)

TeleSculptor - End-to-end 3D reconstruction for UAV videos (Kitware)

Development log

New entries are created with a VS Code snippet using the prefix log (see the VS Code documentation on user-defined snippets).

  • 03 Aug '20: Read thesis Ch. 1-5, up to Convex Optimisation Based Multi-view Stereo Depth Estimation. Need to read further about how the dual formulation is derived. Revisit general primal-dual optimization.

  • 04 Aug '20: Read thesis Ch. 6: Surface Representation, Integration and Prediction, Ch. 8: Direct Tracking from Surface Models, 9.2: DTAM: Dense Tracking and Mapping in Real-time

  • 05 Aug '20: To begin with, we will implement depth estimation for a keyframe (5.3 Global Cost Volume Optimisation). This assumes that the camera poses are already known. To test, I will use the same fountain-P11 dataset used by the author, for which the camera poses of all 11 images are given. Initially, the search will be exhaustive over all discrete depths at every iteration. Then, incrementally, the search will be restricted to bounds given by a parabola, and solution accuracy will be increased with sub-sample optimization using a single Newton step. In the complete framework, the system is bootstrapped with poses given by PTAM until the first keyframe. After that, the system uses dense methods with a virtual camera loss.

  • 06 Aug '20: Set up a CMake project with an OpenCV dependency. IntelliSense was not working in VS Code. It turned out I had missed the popup asking whether to use compile_commands.json from the build directory for IntelliSense, which creates the entry "compileCommands": "${workspaceFolder}/build/compile_commands.json". Build tested by reading an image.

  • 16 Aug '20: C++ implementation discontinued. One reason is that the Eigen library is not well documented, e.g. when constructing a matrix from an array, is it row-major or column-major? Debugging is also tough for non-native types: a matrix can't be visualized as anything other than raw data. I was unable to get a natvis file working with the VS Code CMake debug task; natvis seems well documented for Visual Studio rather than VS Code. I was able to visualize matrices with a Python lldb script, but still can't debug matrix expressions such as an inverse. All of this demanded too much attention on the implementation side rather than the theoretical side. I decided to use Python instead, as all of the above problems are solved there. One shortcoming, of course, is that we lose static type checking, and a separate statically typed implementation will be needed for production and GPU. But the pros far outweigh the cons, as rapid prototyping is what is needed right now.

  • 18 Aug '20: Photometric loss vs. depth plotted for a marker point in the Fountain-P11 dataset. The photometric loss along the epipolar line can be noisy, without a clear minimum, if only one corresponding frame is considered. A clear minimum appears when all images in the sequence with covisibility are taken into account: the averaged loss has a clear local minimum.
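
The depth search described in the 05 Aug and 18 Aug entries can be illustrated with a minimal sketch. It is not the code in mapper.py; the function names photometric_cost and refine_minimum, the pose convention (x_cam = R @ x_ref + t, reference camera at identity), grayscale images, shared intrinsics K, nearest-neighbour sampling, and uniform sampling in depth (rather than inverse depth) are all assumptions made for illustration. The first function averages the absolute intensity difference over all frames for each depth hypothesis of one reference pixel. The second applies a single Newton step, built from finite differences, to the raw averaged cost around the discrete minimum, which is equivalent to fitting a parabola through the three neighbouring samples.

```python
import numpy as np


def photometric_cost(ref_img, K, pixel, depths, frames):
    """Average photometric cost per depth hypothesis for one reference pixel.

    Assumptions (hypothetical, for illustration only):
      - ref_img and every frame image are 2D grayscale arrays,
      - the reference camera is at the identity pose,
      - frames is a list of (img, R, t) with x_cam = R @ x_ref + t,
      - all cameras share the intrinsic matrix K.
    """
    u, v = pixel
    depths = np.asarray(depths, dtype=float)
    # Back-project the pixel to a ray with unit z, then scale by each depth.
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])
    points = depths[:, None] * ray[None, :]          # (D, 3) points on the ray
    ref_val = float(ref_img[v, u])

    cost_sum = np.zeros(len(depths))
    count = np.zeros(len(depths))
    for img, R, t in frames:
        cam = points @ R.T + t                       # points in this camera
        uvw = cam @ K.T                              # homogeneous pixel coords
        w = uvw[:, 2]
        in_front = w > 1e-9
        safe_w = np.where(in_front, w, 1.0)          # avoid division by zero
        cols = np.rint(uvw[:, 0] / safe_w).astype(int)
        rows = np.rint(uvw[:, 1] / safe_w).astype(int)
        h, wid = img.shape
        valid = in_front & (cols >= 0) & (cols < wid) & (rows >= 0) & (rows < h)
        # Nearest-neighbour lookup; absolute intensity difference as the loss.
        diff = np.abs(img[rows[valid], cols[valid]].astype(float) - ref_val)
        cost_sum[valid] += diff
        count[valid] += 1

    # Depth hypotheses not visible in any frame get an infinite cost.
    return np.where(count > 0, cost_sum / np.maximum(count, 1), np.inf)


def refine_minimum(depths, cost):
    """Sub-sample refinement of the discrete minimum with one Newton step.

    Gradient and curvature are central finite differences, so the result is
    the vertex of the parabola through the three samples around the minimum.
    Assumes uniformly spaced depth samples.
    """
    depths = np.asarray(depths, dtype=float)
    cost = np.asarray(cost, dtype=float)
    i = int(np.argmin(cost))
    if i == 0 or i == len(cost) - 1:
        return depths[i]                             # no neighbours on one side
    if not np.all(np.isfinite(cost[i - 1:i + 2])):
        return depths[i]
    step = depths[i + 1] - depths[i]
    grad = (cost[i + 1] - cost[i - 1]) / (2.0 * step)
    hess = (cost[i + 1] - 2.0 * cost[i] + cost[i - 1]) / step ** 2
    if hess <= 0:
        return depths[i]                             # not a convex neighbourhood
    return depths[i] - grad / hess
```

Passing a single frame versus all covisible frames to photometric_cost and plotting the result against depths is one way to reproduce the kind of comparison described in the 18 Aug entry.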
