
DeePoint: Visual Pointing Recognition and Direction Estimation, ICCV 2023

This repository provides an implementation of our paper DeePoint: Visual Pointing Recognition and Direction Estimation (ICCV 2023). If you use our code and data, please cite our paper.

Please note that this is research software and may contain bugs or other issues. Please use it at your own risk. If you experience major problems with it, you may contact us, but please note that we do not have the resources to address every issue.

@InProceedings{Nakamura_2023_ICCV,
	author    = {Shu Nakamura and Yasutomo Kawanishi and Shohei Nobuhara and Ko Nishino},
	title     = {DeePoint: Visual Pointing Recognition and Direction Estimation},
	booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
	month     = {October},
	year      = {2023},
}

[Figure: DeePoint architecture]

Prerequisites

We tested our code with Python 3.10 and external libraries including:

  • numpy
  • opencv-python
  • opencv-contrib-python
  • torch
  • torchvision
  • pytorch-lightning

Please refer to environment/pip_freeze.txt for the specific versions we used.
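If you prefer a plain Python environment, installing from the freeze file should work along these lines (a sketch assuming a Python 3.10 virtual environment and that pip_freeze.txt is pip-compatible):

python3.10 -m venv venv
. venv/bin/activate
pip install -r environment/pip_freeze.txt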
You can also use Singularity to replicate our environment:

singularity build environment/deepoint.sif environment/deepoint.def
singularity run --nv environment/deepoint.sif

Usage

Demo

You can download the pretrained model from here and save the file as models/weight.ckpt.

You can apply the model to your own video and visualize the result by running the script below.

python src/demo.py movie=./demo/example.mp4 lr=l ckpt=./models/weight.ckpt
  • The video has to contain one person within the frame, and ideally the person's whole body should be visible.
  • You need to specify the pointing hand (left or right) for visualization; see the example after this list.
  • Since this script uses OpenGL (PyOpenGL and glfw) to draw a 3D arrow, you need a window system for it to work.
    • We use the script that was used in Gaze360 for drawing 3D arrows.
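For example, to run the demo on your own footage with the right pointing hand (assuming lr=r selects the right hand, by analogy with lr=l above):

python src/demo.py movie=./path/to/your_video.mp4 lr=r ckpt=./models/weight.ckpt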

DP Dataset

[Figure: Examples of the DP Dataset]

You can download the DP Dataset from Google Drive (link).

License

The DP Dataset is distributed under the Creative Commons Attribution-Noncommercial 4.0 International License (CC BY-NC 4.0).

Training with the DP Dataset

After downloading the dataset, follow the instructions in data/README.md. The structure of the data directory should then look like this:

deepoint
├── data
│   ├── README.md
│   ├── mount_frames.sh
│   ├── frames
│   │   ├── 2023-01-17-livingroom
│   │   └── ...
│   ├── labels
│   │   ├── 2023-01-17-livingroom
│   │   └── ...
│   └── keypoints
│       ├── collected_json.pickle
│       └── triangulation.pickle
└── ...
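Before launching training, a short sanity check like the following (a hypothetical helper, not part of the repository) can confirm that the paths above are in place:

from pathlib import Path

# Paths the training code expects, per the tree above.
data = Path("data")
expected = [
    data / "frames",
    data / "labels",
    data / "keypoints" / "collected_json.pickle",
    data / "keypoints" / "triangulation.pickle",
]
for p in expected:
    print(("ok      " if p.exists() else "MISSING ") + str(p))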

Once the data is in place, run

python src/main.py task=train

to train the model. Refer to conf/model for model configurations.

Evaluation with the DP Dataset

After training, you can evaluate the model by running:

python src/main.py task=test ckpt=./path/to/the/model.ckpt
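PyTorch Lightning checkpoints are ordinary torch files, so if you want to double-check which checkpoint you are about to evaluate, a quick inspection might look like this (a sketch; the exact keys depend on the Lightning version):

import torch

# Load on CPU so no GPU is needed just to inspect metadata.
ckpt = torch.load("./path/to/the/model.ckpt", map_location="cpu")
print("epoch:", ckpt.get("epoch"))
print("weight tensors:", len(ckpt["state_dict"]))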
