This repo is based on Faster-RCNN.
Running the repo in Colab is recommended: copy the file HOI detection.ipynb and run it in Colab (remember to change the runtime type to GPU).
- Python 3.7
- Pytorch 1.6
- CUDA 10.1
Create a new conda environment called handobj and install PyTorch 1.6.0:
conda create --name handobj python=3.7
conda activate handobj
# for cpu
conda install pytorch==1.6.0 torchvision==0.7.0 cpuonly -c pytorch
# for GPU
conda install pytorch==1.6.0 torchvision==0.7.0 cudatoolkit=10.1 -c pytorch
Clone the code
git clone https://github.com/forever208/Hand-Object-Interaction-detection.git
Install all the python dependencies using pip:
pip install -r requirements.txt
Compile the CUDA dependencies with the following commands:
cd lib
python setup.py build develop
cd ..
Install the COCO Python API:
mkdir data
cd data
git clone https://github.com/pdollar/coco.git
cd coco/PythonAPI
make
cd ../../..
Create a folder ./models/res101_handobj_100K/pascal_voc, then download the model:
mkdir -p ./models/res101_handobj_100K/pascal_voc
cd models/res101_handobj_100K/pascal_voc
wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=166IM6CXA32f9L6V7-EMd9m8gin6TFpim' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')&id=166IM6CXA32f9L6V7-EMd9m8gin6TFpim" -O faster_rcnn_1_8_89999.pth && rm -rf /tmp/cookies.txt
cd ../../..
The folder structure looks like this:
models
└── res101_handobj_100K
└── pascal_voc
└── faster_rcnn_{checksession}_{checkepoch}_{checkpoint}.pth
Put your images in the images/ folder and run the command:
python demo.py --checkepoch=8 --checkpoint=89999
A new folder images_det/ will be created with the detection results.
demo.py saves the detection results in two arrays you may need for your task:
- hand_dets: detected results for hands, [boxes(4), score(1), state(1), offset_vector(3), left/right(1)]
- obj_dets: detected results for objects, [boxes(4), score(1), state(1), offset_vector(3), left/right(1)]
We did not train the contact_state, offset_vector, and hand_side parts for objects; we keep them only to make the data format consistent. So, only use the bbox and confidence score information for objects.
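For reference, here is a minimal sketch of how these arrays can be unpacked, assuming each row follows the 10-value layout listed above (the helper name unpack_dets and the dict keys are just illustrative, not part of the repo):

```python
import numpy as np  # demo.py already works with NumPy arrays

def unpack_dets(dets):
    """Illustrative helper (not part of the repo): unpack one detection array."""
    parsed = []
    if dets is None:  # assumed convention when nothing is detected
        return parsed
    for row in np.asarray(dets):
        parsed.append({
            "box": row[0:4],           # x1, y1, x2, y2 (assumed corner order)
            "score": float(row[4]),    # detection confidence
            "state": int(row[5]),      # contact state (meaningful for hands only)
            "offset": row[6:9],        # offset vector (meaningful for hands only)
            "side": int(row[9]),       # hand side (meaningful for hands only)
        })
    return parsed
```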
Matching:
Check the additional matching.py script to match the detection results (hand_dets and obj_dets) if needed.
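If you want to roll your own matching, the idea is that each in-contact hand predicts an offset vector pointing toward the object it touches. The sketch below is only a rough illustration of that idea, not the actual matching.py logic; in particular, the ordering and scaling of the three offset values are assumptions you should verify against matching.py:

```python
import numpy as np

def match_hands_to_objects(hand_dets, obj_dets):
    """Rough illustration, NOT the matching.py logic: shift each in-contact
    hand center by its predicted offset and pick the closest object center."""
    matches = {}
    if hand_dets is None or obj_dets is None:
        return matches
    obj_dets = np.asarray(obj_dets)
    obj_centers = np.stack([(obj_dets[:, 0] + obj_dets[:, 2]) / 2,
                            (obj_dets[:, 1] + obj_dets[:, 3]) / 2], axis=1)
    for i, hand in enumerate(np.asarray(hand_dets)):
        if int(hand[5]) == 0:  # assumed: contact state 0 means "no contact"
            continue
        hand_center = np.array([(hand[0] + hand[2]) / 2, (hand[1] + hand[3]) / 2])
        # The 3 offset values encode a direction and a magnitude; their exact
        # ordering and scaling are assumptions here -- verify against matching.py.
        magnitude, dx, dy = float(hand[6]), float(hand[7]), float(hand[8])
        target = hand_center + magnitude * np.array([dx, dy])
        matches[i] = int(np.argmin(np.linalg.norm(obj_centers - target, axis=1)))
    return matches
```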
Color definitions:
- yellow: object bbox
- red: right hand bbox
- blue: left hand bbox
Label definitions:
- L: left hand
- R: right hand
- N: no contact
- S: self contact
- O: other person contact
- P: portable object contact
- F: stationary object contact (e.g. furniture)
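In code, the contact state and hand side come back as integer ids. A small lookup table like the one below makes printed results easier to read; the 0-4 ordering is an assumption based on the usual convention for this detector, so confirm it against the repo's visualization code before relying on it:

```python
# Assumed id -> label mapping; confirm against the repo's visualization code.
STATE_LABELS = {
    0: "N: no contact",
    1: "S: self contact",
    2: "O: other person contact",
    3: "P: portable object contact",
    4: "F: stationary object contact",
}
SIDE_LABELS = {0: "L: left hand", 1: "R: right hand"}  # assumed 0 = left, 1 = right
```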
Create a folder ./data, then download the dataset and unzip it:
mkdir data
cd data
wget https://fouheylab.eecs.umich.edu/~dandans/projects/100DOH/downloads/pascal_voc_format.zip
unzip pascal_voc_format.zip
rm -rf pascal_voc_format.zip
cd ..
mv data/pascal_voc_format/VOCdevkit2007_handobj_100K/ data/
Download the pretrained ResNet-101 model by running the commands:
cd data
mkdir pretrained_model
cd pretrained_model
wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=1Pyv546ss5q4idvcXE_Z_GFn_qOhDz10H' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')&id=1Pyv546ss5q4idvcXE_Z_GFn_qOhDz10H" -O resnet101_caffe.pth && rm -rf /tmp/cookies.txt
So far, the data/ folder should look like this:
data/
├── pretrained_model
│ └── resnet101_caffe.pth
├── VOCdevkit2007_handobj_100K
│ └── VOC2007
│ ├── Annotations
│ │ └── *.xml
│ ├── ImageSets
│ │ └── Main
│ │ └── *.txt
│ └── JPEGImages
│ └── *.jpg
To train a hand-object detector with ResNet-101 on the pascal_voc format data, run:
python trainval_net.py --model_name handobj_100K --log_name=handobj_100K --dataset pascal_voc --net res101 --bs 1 --nw 2 --lr 1e-3 --lr_decay_step 3 --cuda --epoch=10 --use_tfb
To evaluate the detection performance, run:
python test_net_Colab.py --model_name=handobj_100k --save_name=handobj_100k --cuda --checkepoch=8 --checkpoint=89999
- Tested on the testset of our 100K and ego dataset:
| Name | Hand | Obj | H+Side | H+State | H+O | All | Model Download Link |
|------|------|-----|--------|---------|-----|-----|----------------------|
| handobj_100K+ego | 90.4 | 66.3 | 88.4 | 73.2 | 47.6 | 39.8 | faster_rcnn_1_8_132028.pth |
| handobj_100K | 89.8 | 51.5 | 65.8 | 62.4 | 27.9 | 20.9 | faster_rcnn_1_8_89999.pth |
- Tested on the testset of our 100K dataset:
| Name | Hand | Obj | H+Side | H+State | H+O | All |
|------|------|-----|--------|---------|-----|-----|
| handobj_100K+ego | 89.6 | 64.7 | 79.0 | 63.8 | 45.1 | 36.8 |
| handobj_100K | 89.6 | 64.0 | 78.9 | 64.2 | 46.9 | 38.6 |
- Tested on the testset of our ego dataset:
| Name | Hand | Obj | H+Side | H+State | H+O | All |
|------|------|-----|--------|---------|-----|-----|
| handobj_100K+ego | 90.5 | 67.2 | 90.0 | 75.0 | 47.4 | 46.3 |
| handobj_100K | 89.8 | 41.7 | 59.5 | 62.8 | 20.3 | 12.7 |
The model handobj_100K is trained on the trainset of 100K YouTube frames.
The model handobj_100K+ego is trained on the 100K trainset plus additional egocentric data that we annotated; it works much better on egocentric data.
We provide the frame names of the egocentric data we used here: trainval.txt, test.txt. This split is backward compatible with Epic-Kitchens 2018 (EK), EGTEA, and CharadesEgo (CE).
If this work is helpful in your research, please cite:
@INPROCEEDINGS{Shan20,
author = {Shan, Dandan and Geng, Jiaqi and Shu, Michelle and Fouhey, David},
title = {Understanding Human Hands in Contact at Internet Scale},
booktitle = {CVPR},
year = {2020}
}
When you use the model trained on our ego data, make sure to also cite the original datasets (Epic-Kitchens, EGTEA, and CharadesEgo) that we collected from, and agree to the original conditions for using that data.