This repo contains code for the paper Track2Act: Predicting Point Tracks from Internet Videos enables Diverse Zero-Shot Robot Manipulation.
Use the environment.yml file to create a conda environment and install the dependencies.
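For example (the environment name below is an assumption; check the name field inside environment.yml for the actual one):

conda env create -f environment.yml
conda activate track2act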
To train the point track prediction model, run the following command, adjusting the number of nodes, GPUs per node, and global batch size as needed:
torchrun --nnodes=1 --nproc_per_node=8 train_track_pred.py --global-batch-size=480 --data-path=<folder with data files>
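As an illustrative variant, a single-node, single-GPU run might scale the global batch size down proportionally (480 / 8 = 60 per GPU in the default command; the data folder path here is hypothetical):

torchrun --nnodes=1 --nproc_per_node=1 train_track_pred.py --global-batch-size=60 --data-path=./track_data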
To run inference, specify the paths to the initial image, the goal image, and the checkpoint (the trained model is available at this link). The visualization will be saved in the folder save_track_pred.
python inference_track_pred.py --ckpt=<path to model> --init=<path to initial image> --goal=<path to goal image>
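A concrete invocation might look like the following (all paths below are hypothetical placeholders for your own files):

python inference_track_pred.py --ckpt=checkpoints/track_pred.pt --init=examples/init.png --goal=examples/goal.png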
For any questions about the project, feel free to email Homanga Bharadhwaj at hbharadh@cs.cmu.edu.
The code is licensed under the CC-BY-NC license (see License.md).
The code in this repo is based on Diffusion Transformers (https://github.com/facebookresearch/DiT) and uses open-source packages including diffusers, scipy, opencv, numpy, and pytorch.
If you find the repository helpful, please consider citing our paper:
@misc{bharadhwaj2024track2act,
title={Track2Act: Predicting Point Tracks from Internet Videos enables Diverse Zero-shot Robot Manipulation},
author={Homanga Bharadhwaj and Roozbeh Mottaghi and Abhinav Gupta and Shubham Tulsiani},
year={2024},
eprint={2405.01527},
archivePrefix={arXiv},
primaryClass={cs.RO}
}