EgoCOL: Egocentric Camera pose estimation for Open-world 3D object Localization

Cristhian Forigua, Maria Escobar, Jordi Pont-Tuset, Kevis-Kokitsi Maninis, Pablo Arbeláez
Center for Research and Formation in Artificial Intelligence (CINFONIA), Universidad de los Andes, Bogotá 111711, Colombia.

[arXiv]

We present EgoCOL, an egocentric camera pose estimation method for open-world 3D object localization. Our method leverages sparse camera pose reconstructions in a two-fold manner, per video and per scan independently, to estimate the camera pose of egocentric frames in 3D renders with high recall and precision. We extensively evaluate our method on the Visual Queries (VQ) 3D object localization Ego4D benchmark. EgoCOL can estimate 62% and 59% more camera poses than the Ego4D baseline on the val and test sets, respectively, of the Ego4D Visual Queries 3D Localization challenge at CVPR 2023.


Installation instructions

  1. Please follow the installation instructions from the Ego4D Episodic Memory repository.
  2. You need to install COLMAP to compute the reconstructions. Please follow these instructions to install it.
  3. Finally, you need to install the Open3D library. Follow these instructions to install it; a quick environment check is sketched after this list.
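
After installing, a sanity check like the sketch below can confirm that Open3D imports and that the COLMAP binary is reachable. This snippet is not part of the repository; it only verifies your environment.

# Hypothetical environment check, not part of the EgoCOL repository:
# verifies that Open3D imports and that the COLMAP binary is on PATH.
import shutil

import open3d as o3d

print("Open3D version:", o3d.__version__)

colmap_bin = shutil.which("colmap")
if colmap_bin is None:
    raise SystemExit("COLMAP binary not found on PATH; check your installation.")
print("COLMAP binary:", colmap_bin)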

Data

Please follow the instructions from the Ego4D Episodic Memory repository to download the VQ3D data here.

Run EgoCOL

First, you need to compute the initial PnP camera poses by using the camera pose estimation workflow proposed by Ego4D. Follow these instructions to compute them.
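
For intuition only, the snippet below shows what a single PnP pose solve looks like with OpenCV on synthetic data. It is not the Ego4D workflow itself (which matches egocentric frames against the scans); the intrinsics, points, and poses are made up.

# Minimal PnP illustration on synthetic data (not the Ego4D pipeline):
# given 3D points and their 2D projections, recover the camera pose.
import cv2
import numpy as np

# Hypothetical pinhole intrinsics (fx, fy, cx, cy are made-up values).
K = np.array([[600.0, 0.0, 320.0],
              [0.0, 600.0, 240.0],
              [0.0, 0.0, 1.0]])

# Synthetic 3D points in scan coordinates and a known ground-truth pose.
pts3d = np.random.RandomState(0).uniform(-1.0, 1.0, (8, 3)) + np.array([0.0, 0.0, 5.0])
rvec_gt = np.array([0.1, -0.2, 0.05])
tvec_gt = np.array([0.3, -0.1, 0.2])

# Project the points to get the 2D observations a matcher would provide.
pts2d, _ = cv2.projectPoints(pts3d, rvec_gt, tvec_gt, K, None)

# Solve for the camera pose from the 2D-3D correspondences.
ok, rvec, tvec = cv2.solvePnP(pts3d, pts2d, K, None)
R, _ = cv2.Rodrigues(rvec)            # rotation matrix, world -> camera
cam_center = -R.T @ tvec.reshape(3)   # camera center in world coordinates
print(ok, cam_center)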

Once you have computed the initial camera poses, you can use COLMAP to create the sparse reconstructions using both the video and clip configurations:

$ cd colmap
$ python run_registrations.py --input_poses_dir {PATH_CLIPS_CAMERA_POSES} \
 --clips_dir {PATH_CLIPS_FRAMES} --output_dir {OUTPUT_PATH_COLMAP}

Similarly, you must run the registration for the scan configuration:

$ python run_registrations_by_scans.py --input_poses_dir {PATH_CLIPS_CAMERA_POSES} \
--clips_dir {PATH_CLIPS_FRAMES} --output_dir {OUTPUT_PATH_COLMAP_SCAN} --camera_intrinsics_filename {PATH_TO_INTRINSICS} --query_filename {PATH_TO_QUERY_ANNOT_FILE}

The paths {PATH_CLIPS_CAMERA_POSES}, {PATH_CLIPS_FRAMES}, {PATH_TO_INTRINSICS}, and {PATH_TO_QUERY_ANNOT_FILE} are produced by running the camera pose estimation workflow proposed by Ego4D. You can use the default value of each argument in the .py files to help you locate the right paths.

Then, you can compute the Procrustes transformation between the PnP and sparse points by running the following scripts. Make sure to change the paths for the "--annotations_dir", "--input_dir_colmap", and "--clips_dir" flags before you run the code.

$ python extract_dict_from_colmap.py
$ python extract_dict_from_colmap_by_scans.py
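
For reference, the sparse side of that alignment comes from the COLMAP model. The sketch below shows one generic way to pull camera centers out of a text-format images.txt; it is illustrative, not the repository's extract_dict_from_colmap.py, and the path at the bottom is hypothetical.

# Illustrative only: read camera centers from a COLMAP sparse model exported
# as text. In images.txt every image occupies two lines: a pose line
# (IMAGE_ID QW QX QY QZ TX TY TZ CAMERA_ID NAME) followed by its 2D points.
# Assumes every image has a non-empty 2D-point line (true for registered images).
import numpy as np

def quat_to_rotmat(qw, qx, qy, qz):
    # World-to-camera rotation matrix from a unit quaternion (COLMAP convention).
    return np.array([
        [1 - 2 * (qy * qy + qz * qz), 2 * (qx * qy - qz * qw),     2 * (qx * qz + qy * qw)],
        [2 * (qx * qy + qz * qw),     1 - 2 * (qx * qx + qz * qz), 2 * (qy * qz - qx * qw)],
        [2 * (qx * qz - qy * qw),     2 * (qy * qz + qx * qw),     1 - 2 * (qx * qx + qy * qy)],
    ])

def read_camera_centers(images_txt):
    with open(images_txt) as f:
        lines = [l.strip() for l in f if l.strip() and not l.startswith("#")]
    centers = {}
    for pose_line in lines[0::2]:  # pose lines alternate with 2D-point lines
        fields = pose_line.split()
        qw, qx, qy, qz, tx, ty, tz = map(float, fields[1:8])
        name = fields[9]
        R = quat_to_rotmat(qw, qx, qy, qz)
        centers[name] = -R.T @ np.array([tx, ty, tz])  # camera center in world coords
    return centers

# centers = read_camera_centers("sparse/0/images.txt")  # hypothetical path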

Then run the following lines:

$ python transform_ext.py --constrain --filter
$ python transform_ext_by_scan.py --constrain --filter

Make sure to change the paths for the flags. Also change the hard-coded paths in lines 341 and 370 of transform_ext.py and in lines 286, 207, and 369 of transform_ext_by_scan.py. The --filter and --constrain flags apply the 3D constraints.
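
Conceptually, the alignment those scripts estimate is a similarity (Procrustes) transform between corresponding camera centers from the PnP poses and the COLMAP reconstruction. The sketch below is a generic Umeyama-style solver on toy data, given for intuition; the function and variable names are illustrative, not the repository's API.

# Generic similarity (Procrustes/Umeyama) alignment sketch, illustrative only:
# find scale s, rotation R, translation t mapping source points onto targets.
import numpy as np

def umeyama(src, dst):
    """Return s, R, t such that dst ~= s * R @ src + t (least squares)."""
    mu_src, mu_dst = src.mean(0), dst.mean(0)
    src_c, dst_c = src - mu_src, dst - mu_dst
    cov = dst_c.T @ src_c / len(src)
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0                       # keep R a proper rotation
    R = U @ S @ Vt
    scale = np.trace(np.diag(D) @ S) / src_c.var(0).sum()
    t = mu_dst - scale * R @ mu_src
    return scale, R, t

# Toy check: recover a known similarity transform from 10 random camera centers.
rng = np.random.RandomState(1)
colmap_centers = rng.randn(10, 3)
R_true, _ = np.linalg.qr(rng.randn(3, 3))
if np.linalg.det(R_true) < 0:
    R_true[:, 0] *= -1
s_true, t_true = 2.5, np.array([0.5, -1.0, 3.0])
pnp_centers = s_true * colmap_centers @ R_true.T + t_true

s, R, t = umeyama(colmap_centers, pnp_centers)
aligned = s * colmap_centers @ R.T + t
print(np.allclose(aligned, pnp_centers, atol=1e-6))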

Evaluate

Center scan

To evaluate our method using the center of the scan, follow these steps:

  1. Compute the ground-truth vectors in the query frame coordinate system for queries with an estimated camera pose.
$ python scripts/prepare_ground_truth_for_queries.py --input_dir {PATH_CLIPS_CAMERA_POSES} --vq3d_queries {VQ3D_QUERIES_ANNOT_JSON_FILE} --output_filename {OUTPUT_JSON_FILE} --vq2d_queries {VQ2D_QUERIES_ANNOT_JSON_FILE} --check_colmap
  2. Compute the 3D vector predictions.
$ python3 scripts/run.py --input_dir {PATH_CLIPS_CAMERA_POSES} --output_filename {OUTPUT_RUN_JSON_FILE} --vq2d_results {VQ2D_RESULTS_JSON_FILE} --vq2d_annot {VQ2D_ANNOT_JSON_FILE} --vq2d_queries {VQ2D_QUERIES_ANNOT_JSON_FILE} --vq3d_queries {OUTPUT_JSON_FILE} --check_colmap --constrain --baseline_center
  3. Run the evaluation (a sketch of the underlying distance metrics follows below).
$ python scripts/eval.py --vq3d_results {OUTPUT_RUN_JSON_FILE}
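
For orientation, the sketch below computes the two per-query quantities the VQ3D evaluation is commonly reported on: the L2 distance and the angle between the predicted and ground-truth displacement vectors in the query frame's coordinate system. This is an assumption-level illustration, not the official scripts/eval.py; thresholds and aggregation follow the official evaluation code.

# Illustrative per-query distances (not the official eval script).
import numpy as np

def l2_and_angle(pred_vec, gt_vec):
    pred_vec, gt_vec = np.asarray(pred_vec, float), np.asarray(gt_vec, float)
    l2 = np.linalg.norm(pred_vec - gt_vec)
    cos = np.dot(pred_vec, gt_vec) / (np.linalg.norm(pred_vec) * np.linalg.norm(gt_vec))
    angle_deg = np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
    return l2, angle_deg

# Toy example with made-up vectors (meters, query-frame coordinates).
print(l2_and_angle([1.0, 0.2, 2.5], [0.8, 0.1, 2.9]))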

License and Acknowledgement

This project borrows heavily from the Ego4D Episodic Memory repository; we thank the authors for their contributions to the community.

Contact

If you have any questions, please email cd.forigua@uniandes.edu.co.
