AttnGrounder: Talking to Cars with Attention by Vivek Mittal.
Accepted at the ECCV'20 C4AV Workshop. The Talk2Car dataset used in this paper is available at https://talk2car.github.io/.
Abstract:
We propose Attention Grounder (AttnGrounder), a single-stage, end-to-end trainable model for the task of visual grounding. Visual grounding aims to localize a specific object in an image based on a given natural language text query. Unlike previous methods that use the same text representation for every image region, we use a visual-text attention module that relates each word in the given query with every region in the corresponding image to construct a region-dependent text representation. Furthermore, to improve the localization ability of our model, we use our visual-text attention module to generate an attention mask around the referred object. The attention mask is trained as an auxiliary task using a rectangular mask generated from the provided ground-truth coordinates. We evaluate AttnGrounder on the Talk2Car dataset and show an improvement of 3.26% over existing methods.
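To make the two ideas in the abstract concrete, here is a minimal, illustrative PyTorch sketch of a visual-text attention module and of the rectangular ground-truth mask used for the auxiliary task. This is not the code from this repository: the class VisualTextAttention, the helper rectangular_gt_mask, the feature dimensions, and the 1x1-convolution mask head are all assumptions made for the example.

import torch
import torch.nn as nn
import torch.nn.functional as F

class VisualTextAttention(nn.Module):
    """Illustrative sketch (not the repo's code): relates every word in the
    query to every spatial region of the image feature map, producing a
    region-dependent text representation and a 1-channel attention mask."""

    def __init__(self, vis_dim=256, txt_dim=256):
        super().__init__()
        # Project word embeddings into the visual feature space (assumed design).
        self.word_proj = nn.Linear(txt_dim, vis_dim)
        # Auxiliary mask prediction head (assumed to be a 1x1 convolution).
        self.mask_head = nn.Conv2d(vis_dim, 1, kernel_size=1)

    def forward(self, vis_feats, txt_feats):
        # vis_feats: (B, C, H, W) image regions; txt_feats: (B, T, D) word features.
        B, C, H, W = vis_feats.shape
        regions = vis_feats.flatten(2).transpose(1, 2)      # (B, H*W, C)
        words = self.word_proj(txt_feats)                   # (B, T, C)
        # Similarity of each region with each word.
        scores = torch.bmm(regions, words.transpose(1, 2))  # (B, H*W, T)
        attn = F.softmax(scores / C ** 0.5, dim=-1)         # attend over words, per region
        # Region-dependent text representation: each region gets its own mix of words.
        region_txt = torch.bmm(attn, words)                 # (B, H*W, C)
        region_txt = region_txt.transpose(1, 2).reshape(B, C, H, W)
        # Auxiliary attention mask around the referred object.
        mask_logits = self.mask_head(region_txt)            # (B, 1, H, W)
        return region_txt, mask_logits


def rectangular_gt_mask(boxes, H, W):
    """Rectangular training target built from ground-truth box coordinates
    (x1, y1, x2, y2), assumed here to be given in feature-map units."""
    mask = torch.zeros(boxes.shape[0], 1, H, W)
    for i, (x1, y1, x2, y2) in enumerate(boxes.long().tolist()):
        mask[i, 0, y1:y2 + 1, x1:x2 + 1] = 1.0
    return mask

Under these assumptions, the predicted mask_logits would be supervised against rectangular_gt_mask with a binary cross-entropy loss (e.g., F.binary_cross_entropy_with_logits), matching the auxiliary-task setup described in the abstract.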
The preprocessed Talk2Car data is available at this link; extract it under the ln_data folder. Download the images following the instructions given at this link, and extract all of them into the ln_data/images folder.
All hyperparameters are preset; just run the following command from the working directory (if you run into memory issues, try decreasing the batch size).
python train_yolo.py --batch_size 14
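For example, if a batch size of 14 does not fit in your GPU memory, a smaller value can be passed the same way:

python train_yolo.py --batch_size 8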
Parts of the code and models are adapted from DMS, MAttNet, Yolov3, Pytorch-yolov3, and One Stage Grounding.