Factorizable Net (F-Net)

This is the PyTorch implementation of our ECCV 2018 paper: Factorizable Net: An Efficient Subgraph-based Framework for Scene Graph Generation. This project is based on our previous work: Multi-level Scene Description Network.

Progress

  • Guide for Project Setup
  • Guide for Model Evaluation with pretrained model
  • Guide for Model Training
  • Uploading pretrained model and format-compatible datasets.
  • Update the Model link for VG-DR-Net (We will upload a new model by Aug. 27).
  • Update the Dataset link for VG-DR-Net.
  • A demonstration of our Factorizable Net
  • Migrate to PyTorch 1.0.1
  • Multi-GPU support (beta version): one image per GPU

Updates

  • Feb 26, 2019: We have released the beta Multi-GPU version of Factorizable Net. The stable single-GPU version is available on branch 0.3.1.
  • Aug 28, 2018: Bug fix for running the evaluation with "--use_gt_boxes". VG-DR-Net contains some self-relations, e.g. A-relation-A; previously, we assumed no such relations exist. This commit may affect model performance on Scene Graph Generation.
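
Since the beta Multi-GPU version assigns one image per GPU, a multi-GPU run can presumably be launched by exposing several devices. This is a hedged sketch based only on the standard CUDA_VISIBLE_DEVICES convention; we have not confirmed whether any additional flags are required:

    # Hypothetical multi-GPU run (beta): expose four devices, one image per GPU.
    # Assumes the data-parallel path activates when multiple GPUs are visible.
    CUDA_VISIBLE_DEVICES=0,1,2,3 python train_FN.py --dataset_option=normal \
        --path_opt options/models/VG-MSDN.yaml --rpn output/RPN.h5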

Project Settings

  1. Install the requirements (you can use pip or Anaconda):

    conda install pip pyyaml sympy h5py cython numpy scipy click
    conda install -c menpo opencv3
    conda install pytorch torchvision cudatoolkit=8.0 -c pytorch 
    pip install easydict
    
  2. Clone the Factorizable Net repository

    git clone git@github.com:yikang-li/FactorizableNet.git
  3. Build the Cython modules for NMS, RoI pooling, and RoI align:

    cd lib
    make all
    cd ..
  4. Download the three datasets VG-MSDN, VG-DR-Net, and VRD to F-Net/data, and extract each archive with tar xzvf ${Dataset}.tgz. We have converted the original annotations to JSON.

  5. Download Visual Genome images and VRD images.

  6. Link the image data folder to the target folder: ln -s /path/to/images F-Net/data/${Dataset}/images

    • p.s. You can change the default data directory by modifying dir in options/data_xxx.json.
  7. [optional] Download the pretrained RPN for Visual Genome and VRD. Place them into output/.

  8. [optional] Download the pretrained Factorizable Net on VG-MSDN, VG-DR-Net and VG-VRD, and place them in output/trained_models/.
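
After completing the steps above, a quick sanity check (our suggestion, not part of the original guide) can confirm that the environment is usable:

    # Verify that PyTorch sees the GPU and that OpenCV imports cleanly.
    python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
    python -c "import cv2; print(cv2.__version__)"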

Project Organization

The project contains several subfolders:

  • lib: the dataset loader, NMS, RoI pooling, evaluation metrics, etc.
  • options: configurations for data, RPN, F-Net, and hyperparameters.
  • models: model definitions for the RPN, Factorizable Net, and related modules.
  • data: contains VG-DR-Net (svg/), VG-MSDN (visual_genome/), and VRD (VRD/).
  • output: stores trained models, checkpoints, and logs.
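
For orientation, the layout after setup should look roughly like this (a sketch assembled from the folder names above; exact contents may differ):

    F-Net/
    ├── data/
    │   ├── svg/            # VG-DR-Net annotations, with images/ symlink
    │   ├── visual_genome/  # VG-MSDN annotations, with images/ symlink
    │   └── VRD/            # VRD annotations, with images/ symlink
    ├── lib/
    ├── models/
    ├── options/
    └── output/
        └── trained_models/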

Evaluation with our Pretrained Models

Pretrained models on VG-MSDN, VG-DR-Net, and VG-VRD are provided. Pass --evaluate to enable evaluation mode. Additionally, --use_gt_boxes feeds the ground-truth object bounding boxes to the model instead of RPN proposals (see the sketch after the list below).

  • Evaluation on VG-MSDN with the pretrained model. Scene Graph Generation results: Recall@50: 12.984%, Recall@100: 16.506%.
CUDA_VISIBLE_DEVICES=0 python train_FN.py --evaluate --dataset_option=normal \
	--path_opt options/models/VG-MSDN.yaml \
	--pretrained_model output/trained_models/Model-VG-MSDN.h5
  • Evaluation on VG-VRD with the pretrained model. Scene Graph Generation results: Recall@50: 19.453%, Recall@100: 24.640%.
CUDA_VISIBLE_DEVICES=0 python train_FN.py --evaluate \
	--path_opt options/models/VRD.yaml \
	--pretrained_model output/trained_models/Model-VRD.h5
  • Evaluation on VG-DR-Net with the pretrained model. Scene Graph Generation results: Recall@50: 19.807%, Recall@100: 25.488%.
CUDA_VISIBLE_DEVICES=0 python train_FN.py --evaluate --dataset_option=normal \
	--path_opt options/models/VG-DR-Net.yaml \
	--pretrained_model output/trained_models/Model-VG-DR-Net.h5
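
As referenced above, evaluating with ground-truth boxes adds the documented --use_gt_boxes flag to the same command. A sketch for VG-MSDN, assuming the flag composes with the options shown above:

    # Evaluate with ground-truth object boxes instead of RPN proposals.
    CUDA_VISIBLE_DEVICES=0 python train_FN.py --evaluate --use_gt_boxes \
        --dataset_option=normal \
        --path_opt options/models/VG-MSDN.yaml \
        --pretrained_model output/trained_models/Model-VG-MSDN.h5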

Training

  • Training the Region Proposal Network (RPN). The shared convolutional layers are fixed. We also provide pretrained RPNs for Visual Genome and VRD.

     # Train RPN for VG-MSDN and VG-DR-Net
     CUDA_VISIBLE_DEVICES=0 python train_rpn.py --dataset_option=normal 
     
     # Train RPN for VRD
     CUDA_VISIBLE_DEVICES=0 python train_rpn_VRD.py 
     
    
  • Training Factorizable Net: detailed training options are included in options/models/.

     # Train F-Net on VG-MSDN:
     CUDA_VISIBLE_DEVICES=0 python train_FN.py --dataset_option=normal \
     	--path_opt options/models/VG-MSDN.yaml --rpn output/RPN.h5
     	
     # Train F-Net on VRD:
     CUDA_VISIBLE_DEVICES=0 python train_FN.py  \
     	--path_opt options/models/VRD.yaml --rpn output/RPN_VRD.h5
     	
     # Train F-Net on VG-DR-Net:
     CUDA_VISIBLE_DEVICES=0 python train_FN.py --dataset_option=normal \
     	--path_opt options/models/VG-DR-Net.yaml --rpn output/RPN.h5
     
    

    --rpn xxx.h5 can be omitted when training end-to-end from a pretrained VGG16 (see the sketch after this list). Occasionally, unexpected and confusing errors may appear; ignore them and restart training.

  • For better results, we usually retrain the model for additional epochs by resuming from the checkpoint with --resume ckpt:

     # Resume F-Net training on VG-MSDN:
     CUDA_VISIBLE_DEVICES=0 python train_FN.py --dataset_option=normal \
     	--path_opt options/models/VG-MSDN.yaml --resume ckpt --epochs 30
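
As noted above, omitting --rpn trains the whole pipeline end-to-end from the pretrained VGG16. A sketch for VG-MSDN, using only the documented options:

    # End-to-end training from pretrained VGG16: no --rpn argument.
    CUDA_VISIBLE_DEVICES=0 python train_FN.py --dataset_option=normal \
        --path_opt options/models/VG-MSDN.yaml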
    

Acknowledgement

We thank longcw for his generous release of the PyTorch Implementation of Faster R-CNN.

Reference

If you find our project helpful, your citations are highly appreciated:

@inproceedings{li2018fnet,
author={Li, Yikang and Ouyang, Wanli and Zhou, Bolei and Shi, Jianping and Zhang, Chao and Wang, Xiaogang},
title={Factorizable Net: An Efficient Subgraph-based Framework for Scene Graph Generation},
booktitle = {ECCV},
year = {2018}
}

We also have two other papers on scene graph generation / visual relationship detection:

@inproceedings{li2017msdn,
author={Li, Yikang and Ouyang, Wanli and Zhou, Bolei and Wang, Kun and Wang, Xiaogang},
title={Scene graph generation from objects, phrases and region captions},
booktitle = {ICCV},
year = {2017}
}

@inproceedings{li2017vip,
author={Li, Yikang and Ouyang, Wanli and Zhou, Bolei and Wang, Kun and Wang, Xiaogang},
title={ViP-CNN: Visual Phrase Guided Convolutional Neural Network},
booktitle = {CVPR},
year = {2017}
}

License

The pretrained models and the Factorizable Net technique are released for non-commercial use.

Contact Yikang LI if you have questions.
