SAC-VGNet

The datasets for the paper 'Semantic Aligned Cross-Modal Visual Grounding Network with Transformer'

Data preparation

We manually annotated a text description for each image in two fine-grained object detection datasets ("Military Aircraft Dataset" and "FGVC Aircraft Dataset"). The annotated datasets can be downloaded from Baidu Netdisk (百度网盘; extraction code: v3c0).