
Doduo: Dense Visual Correspondence from Unsupervised Semantic-Aware Flow

Zhenyu Jiang, Hanwen Jiang, Yuke Zhu

Project | arXiv | Huggingface Model

News

2023-09-26: Initial code release.

Huggingface Model

We provide a pretrained Doduo model on the Huggingface model hub. To use it, run the following Python code:

from transformers import AutoModel
from PIL import Image

# Load the pretrained Doduo model from the Huggingface hub.
model = AutoModel.from_pretrained("stevetod/doduo", trust_remote_code=True)

# Source and destination frames to match.
frame_src = Image.open("path/to/src/frame.png")
frame_dst = Image.open("path/to/dst/frame.png")

# Predict dense flow from the source frame to the destination frame.
flow = model(frame_src, frame_dst)
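
The exact format of the returned flow is defined by the model code on the hub. As a minimal sketch, assuming flow is a torch tensor of shape (1, 2, H, W) holding per-pixel (x, y) displacements in pixels (an assumption; verify against the model card), a single source pixel can be mapped into the destination frame like this:

import torch

# Minimal sketch: assumes `flow` has shape (1, 2, H, W) with per-pixel
# (x, y) displacements in pixels; verify against the model card.
_, _, H, W = flow.shape
x, y = W // 2, H // 2               # query pixel in the source frame
dx, dy = flow[0, :, y, x].tolist()  # predicted displacement at (x, y)
print(f"({x}, {y}) in frame_src -> ({x + dx:.1f}, {y + dy:.1f}) in frame_dst")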

Installation

  1. Create a conda environment and install the necessary packages. You can modify the PyTorch and CUDA versions in the env.yaml file.

conda env create -f env.yaml

  2. The data path is stored in .env. Run cp .env.example .env to create a .env file, then edit it to set your data path.

Data and Pretrained Model

Training

We use frames from the YouTube-VOS dataset for training. Download the data from this source.

Note: We use Mask2Former to generate instance masks for visible region discovery. You can find the predicted masks here. After downloading, unzip the file and place it in the Youtube-VOS/train/ directory.

Testing

Point Correspondence

We evaluate point correspondence on the DAVIS validation set from the TAP-Vid benchmark. Please download the data from here.
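
Point correspondence follows directly from Doduo's dense flow: sample the flow at each query point and add the sampled displacement. Below is a minimal sketch; the (1, 2, H, W) pixel-displacement flow layout is an assumption, and the repository's evaluation code remains the authoritative reference.

import torch
import torch.nn.functional as F

def propagate_points(flow: torch.Tensor, pts: torch.Tensor) -> torch.Tensor:
    # Map (N, 2) query points (x, y) in the source frame to the destination
    # frame by bilinearly sampling the dense flow (1, 2, H, W) at each point.
    # The flow layout here is an assumption, not the repo's exact interface.
    _, _, H, W = flow.shape
    grid = pts.clone().float()
    grid[:, 0] = 2 * grid[:, 0] / (W - 1) - 1   # normalize x to [-1, 1]
    grid[:, 1] = 2 * grid[:, 1] / (H - 1) - 1   # normalize y to [-1, 1]
    grid = grid.view(1, 1, -1, 2)               # (1, 1, N, 2), as grid_sample expects
    disp = F.grid_sample(flow, grid, align_corners=True)[0, :, 0, :].T  # (N, 2)
    return pts.float() + disp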

Pretrained Model

You can download the pretrained model using this link.

Demos

We provide two demonstration notebooks for Doduo:

  1. Visualizing correspondence with any local checkpoint: make sure you have set up the environment above before running this notebook.
  2. Visualizing correspondence with the Huggingface model: no environment setup is required to run this notebook.

Training

Use the following commands to start training the model:

# single GPU debug
python src/train.py model.mixed_precision=True experiment=doduo_train debug=fdr

# multiple GPUs + wandb logging
torchrun --rdzv_backend=c10d --rdzv_endpoint=localhost:0 --nnodes=1 --nproc_per_node=4 src/train.py model.mixed_precision=True experiment=doduo_train logger=wandb_csv

Testing

Run the following command, replacing /path/to/ckpt with the path to your checkpoint:

python src/eval.py experiment=doduo_train ckpt_path=/path/to/ckpt

Related Repositories

  1. Our code is based on the fantastic Lightning-Hydra-Template.

  2. We use Unimatch as our backbone.

Citing

@article{jiang2023doduo,
   title={Doduo: Dense Visual Correspondence from Unsupervised Semantic-Aware Flow},
   author={Jiang, Zhenyu and Jiang, Hanwen and Zhu, Yuke},
   journal={arXiv preprint arXiv:2309.15110},
   year={2023}
}
