
Gesture-aware Interactive Machine Teaching with In-situ Object Annotations (UIST 22)

(Demo video)


Gesture-aware Interactive Machine Teaching with In-situ Object Annotations
Zhongyi Zhou, Koji Yatani
The University of Tokyo
UIST 2022
Abstract: Interactive Machine Teaching (IMT) systems allow non-experts to easily create Machine Learning (ML) models. However, existing vision-based IMT systems either ignore annotations on the objects of interest or require users to annotate in a post-hoc manner. Without the annotations on objects, the model may misinterpret the objects using unrelated features. Post-hoc annotations cause additional workload, which diminishes the usability of the overall model building process. In this paper, we develop LookHere, which integrates in-situ object annotations into vision-based IMT. LookHere exploits users' deictic gestures to segment the objects of interest in real time. This segmentation information can be additionally used for training. To achieve the reliable performance of this object segmentation, we utilize our custom dataset called HuTics, including 2040 front-facing images of deictic gestures toward various objects by 170 people. The quantitative results of our user study showed that participants were 16.3 times faster in creating a model with our system compared to a standard IMT system with a post-hoc annotation process while demonstrating comparable accuracies. Additionally, models created by our system showed a significant accuracy improvement ($\Delta mIoU=0.466$) in segmenting the objects of interest compared to those without annotations.

News

Getting Started

This code has been tested on PyTorch 1.12 with CUDA 11.6 and PyTorch 1.10 with CUDA 11.3.

To install PyTorch 1.12 with CUDA 11.6,

chmod +x ./install/init_cuda_11_6.sh
./install/init_cuda_11_6.sh

To install PyTorch 1.10 with CUDA 11.3,

chmod +x ./install/init_cuda_11_3.sh
./install/init_cuda_11_3.sh
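
Either way, a quick check (not part of the original instructions) confirms that the expected PyTorch build and a GPU are visible from Python:

import torch

# Print the installed PyTorch build and the CUDA toolkit it was compiled against,
# and confirm that a GPU is visible. Expect 1.12/11.6 or 1.10/11.3 per the notes above.
print("PyTorch:", torch.__version__)
print("CUDA build:", torch.version.cuda)
print("GPU available:", torch.cuda.is_available())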

If you are using other versions

(Not necessary if the code above succeeds.) This project may also work with other versions of PyTorch. You can examine the required packages under ./install and install them yourself. You also need to download two checkpoint files from Google Drive:

  • put resnet18_adam.pth.tar under ./demo_app/src/ckpt/ and ./object_highlights/ckpt/
  • put unet-b0-bgr-100epoch.pt under ./demo_app/src/ckpt/
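
As an optional sanity check (not from the original instructions), you can confirm that the downloaded files deserialize. This sketch assumes resnet18_adam.pth.tar is a regular torch.save archive; the .pt file may instead be a TorchScript module, hence the fallback to torch.jit.load. Run it from the repository root:

import torch

# Paths follow the bullet points above.
cls_ckpt = torch.load("demo_app/src/ckpt/resnet18_adam.pth.tar", map_location="cpu")
print("resnet18_adam.pth.tar loaded:", type(cls_ckpt))

seg_path = "demo_app/src/ckpt/unet-b0-bgr-100epoch.pt"
try:
    seg_ckpt = torch.load(seg_path, map_location="cpu")
except RuntimeError:
    # TorchScript archives must be opened with torch.jit.load instead.
    seg_ckpt = torch.jit.load(seg_path, map_location="cpu")
print("unet-b0-bgr-100epoch.pt loaded:", type(seg_ckpt))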

Website Demo

Initialization

conda activate lookhere
cd demo_app
./gen_keys.sh

Run the server

python app.py

Teaching

Then you can access the teaching interface via

You can also access this website through LAN:

Check demo_app/README.md for more details on how to use the app.

Training

All your teaching data will be stored at ./tmp/000_test/. You can start training using

./src/trainer/train.sh ./tmp/000_test/ours/ 1

This project does not include a function for automatic training in the system. Please implement this function yourself by referring to the command used above.
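
A minimal sketch of such a trigger, assuming the session directory and arguments shown above, could simply call the training script from Python once a teaching session has been saved:

import subprocess
from pathlib import Path

def train_session(session_dir: str = "./tmp/000_test/ours/", extra_arg: str = "1") -> None:
    # Launch the provided training script for one teaching session.
    # The trailing "1" mirrors the command above; its exact meaning is defined by train.sh.
    if not Path(session_dir).exists():
        raise FileNotFoundError(f"no teaching data found at {session_dir}")
    subprocess.run(["./src/trainer/train.sh", session_dir, extra_arg], check=True)

if __name__ == "__main__":
    train_session()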

Model Assessment

Once the training process finishes, you can assess your model via this link:

HuTics: Human Deictic Gestures Dataset

HuTics covers four kinds of deictic gestures toward objects. Note that we only annotate the segmentation masks of the objects. The hand segmentation masks are generated from this work.

This dataset is released under the [CC-BY-NonCommercial] license.

Download: [google drive]

(Example images from HuTics: Exhibiting, Pointing, Presenting, Touching)
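
For reference, a minimal PyTorch loading sketch for HuTics-style image/object-mask pairs is given below. The directory layout (images/ and object_masks/ with matching file names) is hypothetical and must be adapted to the actual release:

import glob
import os

import cv2
import torch
from torch.utils.data import Dataset

class HuTicsSegDataset(Dataset):
    # Illustrative loader only: folder names and file extensions are assumptions.
    def __init__(self, root: str):
        self.image_paths = sorted(glob.glob(os.path.join(root, "images", "*.jpg")))

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        img_path = self.image_paths[idx]
        mask_path = os.path.join(
            os.path.dirname(img_path).replace("images", "object_masks"),
            os.path.splitext(os.path.basename(img_path))[0] + ".png",
        )
        img = cv2.imread(img_path)                          # BGR, HxWx3 uint8
        mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)  # HxW uint8
        img = torch.from_numpy(img).permute(2, 0, 1).float() / 255.0
        mask = (torch.from_numpy(mask) > 0).float()
        return img, mask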

Gesture-aware Object-Agnostic Segmentation

You need to first download the HuTics dataset above.

Start training the network. Set PATH_TO_HUTICS in the command below to the location of your dataset.

cd object_highlights
conda activate lookhere
./trainer/train.sh PATH_TO_HUTICS

After the training process finishes, you need to convert the RGB-based checkpoint into a BGR-based one.

python utils/ckpt_rgb2bgr.py --input ${YOUR_INPUT_RGB_MODEL.pt} --output ${YOUR_OUTPUT_BGR_MODEL.pt}
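
For intuition, such a conversion typically reverses the input-channel order of the network's first convolution so that a model trained on RGB tensors can consume BGR frames (e.g. from OpenCV) directly. The sketch below shows the general idea only; the weight key is hypothetical, and the provided utils/ckpt_rgb2bgr.py should be treated as the reference:

import torch

def flip_first_conv_channels(state_dict: dict, key: str = "encoder.conv1.weight") -> dict:
    # Reverse the 3 input channels (dim 1) of the first conv weight: RGB -> BGR.
    # "encoder.conv1.weight" is a placeholder; the real key depends on the model definition.
    new_sd = dict(state_dict)
    new_sd[key] = new_sd[key].flip(dims=[1])  # weight shape: (out_ch, 3, kH, kW)
    return new_sd

# Hypothetical usage:
# sd = torch.load("rgb_model.pt", map_location="cpu")
# torch.save(flip_first_conv_channels(sd), "bgr_model.pt")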

The model is now ready, and you can use the trained model for inference.

python demo_video.py --objckpt ${YOUR_OUTPUT_BGR_MODEL.pt} 

The output video will be at vids/tissue_out.mp4

Related Work

Citations

@inproceedings{zhou2022gesture,
author = {Zhou, Zhongyi and Yatani, Koji},
title = {Gesture-Aware Interactive Machine Teaching with In-Situ Object Annotations},
year = {2022},
isbn = {9781450393201},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3526113.3545648},
doi = {10.1145/3526113.3545648},
booktitle = {Proceedings of the 35th Annual ACM Symposium on User Interface Software and Technology},
articleno = {27},
numpages = {14},
keywords = {dataset, deictic gestures, in-situ annotation, Interactive machine teaching},
location = {Bend, OR, USA},
series = {UIST '22}
}

@inproceedings{zhou2021enhancing,
author = {Zhou, Zhongyi and Yatani, Koji},
title = {Enhancing Model Assessment in Vision-Based Interactive Machine Teaching through Real-Time Saliency Map Visualization},
year = {2021},
isbn = {9781450386555},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3474349.3480194},
doi = {10.1145/3474349.3480194},
pages = {112–114},
numpages = {3},
keywords = {Visualization, Saliency Map, Interactive Machine Teaching},
location = {Virtual Event, USA},
series = {UIST '21}
}
