Visual relationships such as "a person is talking to another person" or "a clock is above the person" give a comprehensive description of a scene. Inspired by recent advances in relational representation learning for knowledge bases and in convolutional object detection networks, this repository uses a Visual Translation Embedding network (VTransE) for visual relation detection.

vtranse/STA Tensorflow

Visual Translation Embedding Network for Visual Relation Detection, CVPR 2017, TensorFlow

Shuffle-Then-Assemble: Learning Object-Agnostic Visual Relationship Features, ECCV, TensorFlow
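
For orientation, the core idea of VTransE in the CVPR 2017 paper is that a relation acts as a translation in a low-dimensional relation space: the projected subject feature plus a predicate vector should land close to the projected object feature. The toy sketch below illustrates that scoring step with plain Python lists. It is not the code of this repository; the function and argument names are hypothetical, and the real model learns the projections and predicate vectors on top of a detection network.

```python
import math

def project(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def predicate_probabilities(x_s, x_o, W_s, W_o, predicates):
    """Score every predicate for one subject-object pair.

    x_s, x_o   : feature vectors of the subject and the object
    W_s, W_o   : projection matrices into the relation space
    predicates : one translation vector t_p per predicate
    """
    s = project(W_s, x_s)
    o = project(W_o, x_o)
    # If subject + t_p is close to object, then (object - subject) is close to t_p.
    diff = [oi - si for oi, si in zip(o, s)]
    scores = [sum(t * d for t, d in zip(t_p, diff)) for t_p in predicates]
    # Numerically stable softmax over the predicate scores.
    m = max(scores)
    exps = [math.exp(v - m) for v in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

With identity projections, a subject at the origin and an object at [1, 0], a predicate vector [1, 0] receives a higher probability than [-1, 0].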


  1. Install ipython (strongly recommended) if you do not already have it:
pip install ipython
  2. Install TensorFlow v1.3.0 or newer:
pip install tensorflow-gpu==1.3.0

  3. Download this repository or clone it with Git:

git clone
  4. Install easydict:
pip install easydict
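
After the installation steps above, a quick check like the following can confirm that the packages are importable. This is a minimal sketch: `installed_versions` is a hypothetical helper, not part of this repository. Note that import names differ from pip names (`tensorflow-gpu` is imported as `tensorflow`, `ipython` as `IPython`).

```python
import importlib

def installed_versions(module_names):
    """Map each module name to its version string, or None if it cannot be imported."""
    versions = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
            continue
        # Not every module exposes __version__, so fall back to a placeholder.
        versions[name] = getattr(module, "__version__", "unknown")
    return versions
```

Calling `installed_versions(["IPython", "tensorflow", "easydict"])` returns a dictionary in which any missing package maps to `None`.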

Training and Testing Vtranse

1. Download dataset (VRD dataset is used as example)

a). Download the dataset from, and the file is named ''.

b). Use the following command to unzip the downloaded data:

unzip -d sg_dataset

c). In the directory where you put the vtranse folder, use the following commands to create the new folder 'dataset/VRD':

mkdir -p ~/dataset/VRD/json_dataset
mkdir -p ~/dataset/VRD/sg_dataset

d). Move the files in sg_dataset into the newly created folders with the following commands:

mv sg_dataset/annotations_test.json dataset/VRD/json_dataset
mv sg_dataset/annotations_train.json dataset/VRD/json_dataset
mv sg_dataset/sg_test_images dataset/VRD/sg_dataset
mv sg_dataset/sg_train_images dataset/VRD/sg_dataset
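
Before moving on, it can help to verify that the files ended up where the later steps expect them. The sketch below checks the four paths named in the commands above; `missing_vrd_paths` is a hypothetical helper, and the VRD root directory you pass in depends on where you created 'dataset/VRD'.

```python
import os

# Paths taken from the mkdir/mv commands above, relative to the VRD dataset root.
EXPECTED = [
    os.path.join("json_dataset", "annotations_train.json"),
    os.path.join("json_dataset", "annotations_test.json"),
    os.path.join("sg_dataset", "sg_train_images"),
    os.path.join("sg_dataset", "sg_test_images"),
]

def missing_vrd_paths(vrd_root):
    """Return the expected files/folders that do not exist under vrd_root."""
    return [p for p in EXPECTED if not os.path.exists(os.path.join(vrd_root, p))]
```

An empty list from `missing_vrd_paths("dataset/VRD")` means the layout matches the commands above.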

e). Change the root path in the file 'vtranse/model/': open this file, find the term '__C.DIR', which is set to '/home/yangxu/rd', and change it to the path where you put the vtranse folder.

f). Pre-process the VRD dataset into 'vrd_roidb.npz', which is used to train the network. Open ipython with the following command:

ipython

Then use the following command to pre-process the data in the vrd folder:

run process/

After running this file, you will find a 'vrd_roidb.npz' file in the folder 'vtranse/input'.
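
To check what the pre-processing step produced, the generated file can be inspected with NumPy. The sketch below lists the arrays stored in an .npz file without assuming their names, since the keys inside 'vrd_roidb.npz' are not documented here. `summarize_npz` is a hypothetical helper, and `allow_pickle=True` is an assumption that the file stores Python objects; only enable it for files you created yourself.

```python
import numpy as np

def summarize_npz(path):
    """Return {array_name: description} for every array stored in an .npz file."""
    summary = {}
    # allow_pickle=True is only needed if the archive contains pickled Python objects.
    with np.load(path, allow_pickle=True) as data:
        for name in data.files:
            arr = data[name]
            summary[name] = "dtype=%s shape=%s" % (arr.dtype, arr.shape)
    return summary
```

For example, `summarize_npz("vtranse/input/vrd_roidb.npz")` returns one entry per stored array.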

2. Training

a). Download the pre-trained Faster R-CNN model for the VRD dataset from, and the file names are '', 'vrd_vgg_pretrained.ckpt.index', 'vrd_vgg_pretrained.ckpt.meta' and 'vrd_vgg_pretrained.ckpt.pkl'. After downloading them, use the following commands to move them into the 'vtranse/pretrained_para' folder:

mv vtranse/pretrained_para
mv vrd_vgg_pretrained.ckpt.index vtranse/pretrained_para
mv vrd_vgg_pretrained.ckpt.meta vtranse/pretrained_para
mv vrd_vgg_pretrained.ckpt.pkl vtranse/pretrained_para

b). Create a folder to save the training results:

mkdir -p ~/vtranse/pred_para/vrd_vgg

c). After downloading the files and moving them to the right folders, use 'vtranse/train_file/' to train the VTransE network on the VRD dataset:

run train_file/

d). During training, you will see output like this:

t: 100.0, rd_loss: 4.83309404731, acc: 0.0980000074953
t: 200.0, rd_loss: 3.81237616211, acc: 0.263000019006
t: 300.0, rd_loss: 3.51845422685, acc: 0.290333356783
t: 400.0, rd_loss: 3.31810754955, acc: 0.292666691653
t: 500.0, rd_loss: 3.48527273357, acc: 0.277666689083
t: 600.0, rd_loss: 3.06100189149, acc: 0.340666691475
t: 700.0, rd_loss: 3.02625158072, acc: 0.334666692317
t: 800.0, rd_loss: 3.06034492403, acc: 0.330333357863
t: 900.0, rd_loss: 3.16739703059, acc: 0.322666690871
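
If you save this output to a file, the loss and accuracy can be extracted for plotting or comparison between runs. The following is a minimal sketch that assumes the exact line format shown above; `parse_training_log` is a hypothetical helper, not part of the training script.

```python
import re

# Matches lines such as "t: 100.0, rd_loss: 4.83309404731, acc: 0.0980000074953".
LOG_LINE = re.compile(
    r"t:\s*([\d.]+),\s*rd_loss:\s*([\d.eE+-]+),\s*acc:\s*([\d.eE+-]+)"
)

def parse_training_log(text):
    """Extract (step, rd_loss, acc) tuples from training output in the format above."""
    records = []
    for match in LOG_LINE.finditer(text):
        step, loss, acc = match.groups()
        records.append((int(float(step)), float(loss), float(acc)))
    return records
```

Each returned tuple holds the iteration count, the relation detection loss and the accuracy, in the order they appear in the log.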

3. Testing

a). After training, you will find files like 'vrd_vgg0001.ckpt' in the 'vtranse/pred_para/vrd_vgg' folder. You can then test your trained model.

b). Open the file 'vtranse/test_file/' and change the variable 'model_path' to the name of the trained model you want to test.
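
To see which trained models are available for 'model_path', the checkpoint folder can be listed programmatically. The sketch below assumes that each checkpoint comes with a '.index' file (as the pre-trained 'vrd_vgg_pretrained.ckpt.index' does) and that names end in a number followed by '.ckpt', as in 'vrd_vgg0001.ckpt'. `list_checkpoints` is a hypothetical helper.

```python
import os
import re

# Captures the checkpoint prefix (e.g. "vrd_vgg0001.ckpt") and its trailing number.
CKPT_INDEX = re.compile(r"^(?P<prefix>.*?(?P<num>\d+)\.ckpt)\.index$")

def list_checkpoints(folder):
    """Return checkpoint prefixes found in folder, sorted by their numeric suffix."""
    found = []
    for name in os.listdir(folder):
        m = CKPT_INDEX.match(name)
        if m:
            found.append((int(m.group("num")), os.path.join(folder, m.group("prefix"))))
    return [path for _, path in sorted(found)]
```

The last element of the returned list is the checkpoint with the highest number, which is usually the most recent one.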

c). Create a folder to save the detected relationships using the following command:

mkdir -p ~/vtranse/pred_res

d). After setting the model name, use the following command to get the relationship detection results:

run test_file/

e). After testing, run the file 'vtranse/test_file/' to evaluate the detected results:

run test_file/

VG dataset

1). Download the VG dataset. It can be downloaded from the official website: After downloading the files, use the following commands to put the images into the folder 'dataset/VG/images/VG_100K':

mkdir -p ~/dataset/VG/images/VG_100K
mv images/VG100K dataset/VG/images/VG_100K
mv images/VG100K dataset/VG/images/VG_100K

2). Download the training/testing split. This dataset is very noisy, so I use a filtered version which is provided by, and you can download the split from this link. After downloading the file, use the following commands to pre-process the VG dataset:

mkdir -p ~/dataset/VG/imdb
mv vg1_2_meta.h5 dataset/VG/imdb
run process/

3). Training, testing and evaluation. After pre-processing the VG dataset, you can train, test and evaluate your model with a process similar to the one for the VRD dataset, using the following commands:

run train_file/
run test_file/
run test_file/


  author    = {Hanwang Zhang and Zawlin Kyaw and Shih-Fu Chang and Tat-Seng Chua},
  title     = {Visual Translation Embedding Network for Visual Relation Detection},
  booktitle = {CVPR},
  year      = {2017},

Results on VRD (R100)

                     predicate   phrase   relation
published result     44.76       22.42    15.20
implemented result   46.48       24.32    16.27

Results on VG (R100)

                     predicate   phrase   relation
published result     62.87       10.45    6.04
implemented result   61.70       13.62    11.62
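
Assuming R100 in the tables above denotes Recall@100, as is conventional in visual relation detection, the metric is the fraction of ground-truth relations recovered among the 100 highest-scoring predictions per image. The sketch below is a simplified illustration on label triplets only. It is not this repository's evaluation script: `recall_at_k` is a hypothetical helper, and matching of bounding boxes is deliberately left out.

```python
def recall_at_k(ground_truth, predictions, k=100):
    """Simplified Recall@K over a set of images.

    ground_truth : {image_id: set of (subject, predicate, object) labels}
    predictions  : {image_id: list of (score, (subject, predicate, object))}
    Box localisation (IoU matching) is not part of this sketch.
    """
    hits = 0
    total = 0
    for image_id, gt_triplets in ground_truth.items():
        # Rank this image's predictions by score, highest first, and keep the top k.
        ranked = sorted(predictions.get(image_id, []), key=lambda p: p[0], reverse=True)
        top_k = {triplet for _, triplet in ranked[:k]}
        hits += len(gt_triplets & top_k)
        total += len(gt_triplets)
    return hits / float(total) if total else 0.0
```

Lowering `k` makes the metric stricter, because fewer predictions per image are allowed to count as hits.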


  1. VRD project:

  2. Visual Genome

  3. VTransE Caffe version:

  4. The Faster R-CNN code which I used to train the detection part of this project:


  1. If you have any problems with this code, you can send an email to