
FINS: Fast Image-to-Neural Surface


Wei-Teng Chu1, Tianyi Zhang2, Matthew Johnson-Roberson5, Weiming Zhi3,4,5

1 Dept. of Electrical Engineering, Stanford University · 2 Aurora · 3 School of Computer Science, The University of Sydney · 4 Australian Centre for Robotics · 5 College of Connected Computing, Vanderbilt University


Table of Contents

  • Overview
  • Prerequisites
  • Quick Start
  • Dataset Preparation
  • Image Preprocess
  • Training and Inferring
  • Acknowledgements
  • Citation
  • License

Overview

FINS: Fast Image-to-Neural Surface reconstructs high-fidelity signed distance fields (SDFs) from as little as a single RGB image in just a few seconds.

Unlike traditional neural surface methods that require dense multi-view supervision and long optimization times, FINS leverages pretrained 3D foundation models to generate geometric priors, combined with multi-resolution hash encoding and lightweight SDF heads for rapid convergence.

The resulting implicit representation enables real-time surface reconstruction and supports downstream robotics tasks such as motion planning, obstacle avoidance, and surface following.

FINS bridges single-image perception and fast neural implicit modeling, making SDF construction practical for real-world robotic systems.
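As a toy illustration of why an SDF is useful for these downstream tasks, the snippet below queries the analytic SDF of a unit sphere the way a planner would query a trained FINS network: the distance gives clearance for obstacle avoidance, and the gradient gives the surface normal for surface following. The sphere is a stand-in for the learned network, not code from this repository:

```python
import numpy as np

# Toy stand-in for a trained FINS network: the analytic signed
# distance field of a unit sphere. A learned SDF is queried the
# same way (autodiff would replace the finite differences below).
def sdf(p):
    return np.linalg.norm(p, axis=-1) - 1.0

def surface_normal(p, eps=1e-4):
    # Central finite differences approximate the SDF gradient,
    # which points along the outward surface normal.
    g = np.array([
        sdf(p + eps * np.eye(3)[i]) - sdf(p - eps * np.eye(3)[i])
        for i in range(3)
    ]) / (2 * eps)
    return g / np.linalg.norm(g)

# Obstacle avoidance: a waypoint is safe if its distance to the
# surface exceeds a clearance margin.
waypoint = np.array([0.0, 0.0, 1.3])
clearance = sdf(waypoint)        # ~0.3 above the sphere
is_safe = clearance > 0.25       # True

# Surface following: project the waypoint onto the surface by
# stepping along the negative gradient.
on_surface = waypoint - clearance * surface_normal(waypoint)
```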

Prerequisites

This repository is designed to run with Docker only. Please make sure the following are ready before installation:

  • Linux environment (validated on Ubuntu) with an NVIDIA GPU.
  • Docker Engine.
  • NVIDIA Driver installed on host.
  • NVIDIA Container Toolkit (--gpus all support).

Optional:

  • f3d for quick point cloud visualization.

Quick Start

git clone https://github.com/waynechu1109/FINS.git
cd FINS

Docker

# pull docker image from docker hub
sudo docker pull waynechu1109/droplab_research:latest

# run docker  
docker run -it --gpus all \
  -p 8000:8000 \
  -e DISPLAY=$DISPLAY \
  -v /tmp/.X11-unix:/tmp/.X11-unix \
  -v $HOME/FINS:/FINS \
  -v /etc/passwd:/etc/passwd:ro \
  -v /etc/group:/etc/group:ro \
  --name FINS \
  waynechu1109/droplab_research:latest /bin/bash

Dataset Preparation

  • We use the DTU training dataset for our experiments. Please download the preprocessed DTU dataset provided by MVSNet.
  • The data should be organized as follows:
data/
├── dtu_105_09/
│   └── dtu_105_09.png
├── dtu_108_32/
│   └── dtu_108_32.png
└── ...
  • You can also try custom data.
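If you are preparing custom data, a small helper like the one below (hypothetical, not part of this repository; `my_scene.png` is a made-up filename) can copy an image into the expected layout:

```python
import shutil
from pathlib import Path

def prepare_scene(src_png, data_root="data"):
    # Copy an image into the data/<image_name>/<image_name>.png
    # layout that the preprocessing step expects.
    name = Path(src_png).stem
    dest = Path(data_root) / name / f"{name}.png"
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy(src_png, dest)
    return dest

# prepare_scene("my_scene.png")  ->  data/my_scene/my_scene.png
```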

Image Preprocess

Clone VGGT first.

# clone VGGT for preprocess data
mkdir deps && cd deps
git clone https://github.com/facebookresearch/vggt.git
cd ..

The image should be placed in data/<image_name>/<image_name>.png. VGGT can generate a point cloud from a single image.

cd tools

# vggt preprocess
python3 vggt_pointcloud_generate.py --file dtu_118_60 --thres 65 --max_points 90000
  • Set --thres to tune the confidence threshold (in percent).
  • Set --concave true when the scene is concave. The direction of the point cloud's normals matters for training.
  • Set --max_points when computing resources are limited. The default is 200,000; raise it if higher mesh quality is needed.

For more options, see python3 vggt_pointcloud_generate.py -h.
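As a rough sketch (not the repository's implementation) of what --thres and --max_points do, assuming --thres acts as a percentile cutoff on per-point confidence:

```python
import numpy as np

def filter_points(points, conf, thres_pct=65, max_points=200_000, seed=0):
    # Keep points whose confidence clears the percentile cutoff
    # (--thres), then randomly subsample to the budget (--max_points).
    keep = conf >= np.percentile(conf, thres_pct)
    pts = points[keep]
    if len(pts) > max_points:
        rng = np.random.default_rng(seed)
        pts = pts[rng.choice(len(pts), size=max_points, replace=False)]
    return pts
```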

After preprocessing, you can find the preprocessed point cloud in data/vggt_preprocessed/<file_name>. Preprocessed point clouds are easy to inspect with F3D, which you can install with sudo apt install f3d.

Training and Inferring

The script for the whole pipeline is scripts/experiment.sh, which includes the commands for both training and inference. To run a series of trainings (for example, multiple scenes in a single run), see scripts/run_exp_series.sh.

# Start series training
./scripts/run_exp_series.sh

The results can be found in output/.

Acknowledgements

This project would not have been possible without prior work such as VGGT and DUSt3R. We thank the authors of these works and the broader research community for making this project possible.

Citation

If you find this repository useful, please cite our arXiv paper:

@misc{chu2025fins,
  title         = {Efficient Construction of Implicit Surface Models From a Single Image for Motion Generation}, 
  author        = {Wei-Teng Chu and Tianyi Zhang and Matthew Johnson-Roberson and Weiming Zhi},
  year          = {2025},
  eprint        = {2509.20681},
  archivePrefix = {arXiv},
  primaryClass  = {cs.RO},
  url           = {https://arxiv.org/abs/2509.20681}, 
}

License

This project is licensed under the MIT License. See LICENSE for details.

About

[ICRA 2026 Accepted] FINS: Fast Image-to-Neural Surface
