Setting up, training, and testing custom pose estimation pipelines is non-trivial and can be a tedious, time-consuming process. This repository aims to simplify it.
The main contributions can be summarized as follows:
- a Docker container ready to run an extended version of OnePose++
- OnePose++ extended with:
  - DeepSingleCameraCalibration for running inference on in-the-wild videos
  - CoTracker2 for pose estimation optimization, improving the pose tracking performance by leveraging temporal cues as well[^1]
- a low-entry demo to help understand the whole pipeline and readily debug/test the code
- custom data for Spot & instructions on how you can create the synthetic data for your own use case
Having a CUDA-enabled GPU is a must. The code was tested on the following GPUs:
- NVIDIA GeForce RTX 2080
with the following OS & driver versions:
```
DISTRIB_DESCRIPTION="Ubuntu 20.04.6 LTS"
NVIDIA Driver Version: 470.223.02
CUDA Version: 11.4
Docker Version: 24.0.7, build afdd53b
```
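If you want to compare this against your own machine, the versions above can be checked with standard commands (nothing repo-specific):

```shell
lsb_release -d      # OS / distribution description
nvidia-smi          # NVIDIA driver and the CUDA version reported by the driver
docker --version    # Docker version
```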
Set up the code by cloning the repository, initializing the submodules and downloading the necessary models and demo data:
```shell
git clone git@github.com:mizeller/OnePose_ST.git
cd OnePose_ST
git submodule update --init --recursive
mkdir -p data weight
```
The pre-trained models for OnePose++, LoFTR and CoTracker2 as well as the demo data can be found here. Place the model files in `weight/` and the demo data in `data/`.
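For example, assuming the files were downloaded to `~/Downloads` (a hypothetical location; adjust the paths to wherever you saved them):

```shell
# model checkpoints go into weight/, the demo sequence into data/
mv ~/Downloads/LoFTR_wsize9.ckpt ~/Downloads/OnePosePlus_model.ckpt ~/Downloads/cotracker2.pth weight/
mv ~/Downloads/spot_demo data/
```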
At this point, the project directory should look like this:
```
.
├── assets
...
├── data
│   └── spot_demo
├── submodules
│   ├── CoTracker
│   ├── DeepLM
│   └── LoFTR
└── weight
    ├── LoFTR_wsize9.ckpt
    ├── OnePosePlus_model.ckpt
    └── cotracker2.pth
```
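To double-check the layout, something along these lines works (assuming `tree` is installed; plain `ls` does the job too):

```shell
tree -L 2 data weight submodules
```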
To set up the Docker container, either build it locally:

```shell
docker build -t="mizeller/spot_pose_estimation:00" .
```

or pull a pre-built container from DockerHub:

```shell
docker pull mizeller/spot_pose_estimation:00
```
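Either way, an optional sanity check that the image is now available locally:

```shell
docker images | grep spot_pose_estimation
```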
Next, the container needs to be run. Again, there are several options to do this.
In case you're using VSCode's `devcontainer` feature, simply press `CTRL+SHIFT+P` and select `Rebuild and Reopen in Container`. This will re-open the project in a Docker container.
Alternatively, you can run the Docker container directly from the terminal. The following command mounts `${REPO_ROOT}` in the container. Note that the shared memory size is set to 32 GB; adjust it to your hardware if necessary.

```shell
REPO_ROOT=$(pwd)
docker run --gpus all --shm-size=32g -w /workspaces/OnePose_ST -v ${REPO_ROOT}:/workspaces/OnePose_ST -it mizeller/spot_pose_estimation:00
```
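Once inside the container (via the devcontainer or the command above), a quick check that the GPU is actually visible; the second line assumes PyTorch is installed in the image, which the pipeline relies on anyway:

```shell
nvidia-smi
python -c "import torch; print(torch.cuda.is_available())"
```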
To test the setup (training and inference), run the demo script from a terminal inside the Docker container: `sh demo.sh`. This will run the following steps:

- Parse the demo data
- Train the OnePose++ model for Spot
- Run inference on the demo data captured using my phone

The results will be saved in the `temp/` directory.
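A minimal way to exercise this, with an optional log so you can inspect the training/inference output afterwards (the log file name is just an example):

```shell
# run the full pipeline: parse data -> train -> run inference
sh demo.sh 2>&1 | tee demo_run.log

# the outputs of the demo are written to temp/
ls temp/
```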
FYI: There are also custom debug entry points for each step of the pipeline. Have a look at `.vscode/launch.json`.
TODO: add comments about the synthetic data pipeline & clean up the other repo as well
This repository is essentially a fork of the original OnePose++ repository - for more details, have a look at the original source here. Thanks to the original authors for their great work!
This repository uses several submodules, please refer to the respective repositories for their licenses.
This project was developed as part of the Semester Thesis for my (Michel Zeller) MSc. Mechanical Engineering at ETH Zurich. The project was supervised by Dr. Hermann Blum (ETH, Computer Vision and Geometry Group) and Francesco Milano (ETH, Autonomous Systems Lab).
[^1]: Note: As of this writing, CoTracker2 is still a work in progress. The online tracker can only run on every 4th frame, which does not suffice for optimizing the pose estimation. That's why we currently use CoTracker as a post-processing step to optimize the poses for a given sequence. The 'yet' in this reply by the authors suggests that this feature will be added to CoTracker in the future. A possible initial implementation is on this feature branch. It has not been updated in a while...