uORF with BO-QSA

This repo is forked from uORF and modified by Yu Liu. We adapt BO-QSA to uORF to further investigate its effectiveness and generality.

uORF: [ICLR 2022] Unsupervised Discovery of Object Radiance Fields, by Hong-Xing Yu, Leonidas J. Guibas, and Jiajun Wu

BO-QSA: [ICLR 2023] Improving Object-centric Learning with Query Optimization, by Baoxiong Jia*, Yu Liu*, and Siyuan Huang


Project website: uORF, BO-QSA

Main modifications

  • Changed model.py, uorf_gan_model, and uorf_nogan_model in models, and the *.sh scripts in scripts, to adapt BO-QSA to uORF. We only modify the initialization and optimization of the Slot Attention module in uORF, leaving all other hyperparameters unchanged (see the sketch after this list).
  • Added vis.py and vis_utils.py in utils, uorf_vis_model.py in models, and vis_*.sh in scripts to visualize the results of uORF and BO-QSA.
  • Added generate_video.ipynb to generate videos and GIFs of the results. (For visualization, we sample 128 views of a scene and render them with the learned uORF. Because an RTX 3090 can only render 8-14 views at a time, we render 8 views at a time, sampling 16 times per scene.)
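
For reference, below is a minimal sketch of the two BO-QSA changes to Slot Attention: learnable query initialization and the straight-through (bi-level) update at the last iteration. This is our illustration under stated assumptions, not this repo's exact code; the name SlotAttentionBOQSA is ours, and the residual MLP of standard Slot Attention is omitted for brevity.

import torch
import torch.nn as nn

class SlotAttentionBOQSA(nn.Module):  # illustrative name, not from this repo
    def __init__(self, num_slots, dim, iters=3, eps=1e-8):
        super().__init__()
        self.iters, self.eps, self.scale = iters, eps, dim ** -0.5
        # (1) slots start from learnable queries instead of samples
        # drawn from a learned Gaussian
        self.slots_init = nn.Parameter(torch.randn(1, num_slots, dim) * dim ** -0.5)
        self.to_q, self.to_k, self.to_v = (nn.Linear(dim, dim) for _ in range(3))
        self.gru = nn.GRUCell(dim, dim)
        self.norm_in, self.norm_slots = nn.LayerNorm(dim), nn.LayerNorm(dim)

    def step(self, slots, k, v):
        q = self.to_q(self.norm_slots(slots))
        attn = (q @ k.transpose(1, 2) * self.scale).softmax(dim=1) + self.eps
        attn = attn / attn.sum(dim=-1, keepdim=True)  # weighted mean over inputs
        updates = attn @ v
        return self.gru(updates.flatten(0, 1), slots.flatten(0, 1)).view_as(slots)

    def forward(self, inputs):  # inputs: (batch, num_inputs, dim)
        inputs = self.norm_in(inputs)
        k, v = self.to_k(inputs), self.to_v(inputs)
        init = self.slots_init.expand(inputs.shape[0], -1, -1)
        slots = init
        for i in range(self.iters):
            if i == self.iters - 1:
                # (2) detach the unrolled slots and re-attach the queries,
                # so the queries receive gradient only through the final step
                slots = slots.detach() + init - init.detach()
            slots = self.step(slots, k, v)
        return slots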

Environment

We recommend using Conda:

conda env create -f environment.yml
conda activate uorf-3090

or install the packages listed therein. Please make sure you have NVIDIA drivers supporting CUDA 11.0, or modify the version specifications in environment.yml.
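
To quickly verify that the installed PyTorch build sees your GPU (assuming the environment includes PyTorch, which uORF requires), you can run:

python -c "import torch; print(torch.version.cuda, torch.cuda.is_available())"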

Data and model

Please download datasets and models here. If you want to train on your own dataset or generate your own dataset similar to our used ones, please refer to this README.

Evaluation

We assume you have a GPU. If you have already downloaded and unzipped the datasets and models into the root directory, simply run

bash scripts/eval_nvs_seg_chair.sh

from the root directory. Replace the script filename with eval_nvs_seg_clevr.sh, eval_nvs_seg_diverse.sh, or eval_scene_manip.sh for the other evaluations. Results will be saved into ./results/. During evaluation, intermediate results are also sent to visdom in a nicer form and can be viewed at localhost:8077.
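
If no visdom server is running on that port, you can start one in a separate terminal (assuming the scripts do not launch one themselves):

python -m visdom.server -port 8077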

Training

We assume you have a GPU with at least 24GB of memory, e.g., an RTX 3090. (Evaluation does not require this much memory, since rendering can be done ray-wise, but some training losses are defined in image space.) Then run

bash scripts/train_clevr_567.sh

or other training scripts. If you unzipped the datasets somewhere else, pass that location as the first parameter:

bash scripts/train_clevr_567.sh PATH_TO_DATASET

Training takes ~6 days on a 3090 for CLEVR-567 and Room-Chair, and ~9 days for Room-Diverse. It can take even longer on less powerful GPUs (e.g., ~10 days on a Titan RTX for CLEVR-567 and Room-Chair). During training, visualizations are sent to localhost:8077.
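
The memory note above hinges on ray-wise rendering: rays can be processed in fixed-size chunks, so peak memory does not grow with image resolution. A minimal sketch of that pattern follows; render_rays and the chunk size are illustrative assumptions, not this repo's exact API.

import torch

def render_image(render_rays, rays, chunk=4096):
    # rays: (num_rays, ...) for the view(s) to render; render_rays is any
    # function mapping a batch of rays to per-ray outputs (a hypothetical
    # stand-in for the model's renderer)
    outputs = []
    with torch.no_grad():  # rendering only; no image-space loss, no graph
        for i in range(0, rays.shape[0], chunk):
            outputs.append(render_rays(rays[i:i + chunk]))
    return torch.cat(outputs, dim=0)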

Bibtex

@inproceedings{yu2022unsupervised,
  author    = {Yu, Hong-Xing and Guibas, Leonidas J. and Wu, Jiajun},
  title     = {Unsupervised Discovery of Object Radiance Fields},
  booktitle = {International Conference on Learning Representations},
  year      = {2022},
}
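
If you use the BO-QSA modifications in this fork, please also cite the BO-QSA paper (entry assembled from the paper information above; the citation key is ours):

@inproceedings{jia2023improving,
  author    = {Jia, Baoxiong and Liu, Yu and Huang, Siyuan},
  title     = {Improving Object-centric Learning with Query Optimization},
  booktitle = {International Conference on Learning Representations},
  year      = {2023},
}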

Acknowledgement

Our code framework is adapted from Jun-Yan Zhu's CycleGAN. Some code related to the adversarial loss is adapted from a PyTorch implementation of StyleGAN2. Some snippets are adapted from PyTorch Slot Attention and NeRF. If you find any problem, please don't hesitate to email me at koven@stanford.edu or open an issue.
