
AerialVLN: Vision-and-language Navigation for UAVs

Code | Paper | Data

Official implementation of the ICCV 2023 paper: AerialVLN: Vision-and-language Navigation for UAVs

Shubo LIU*, Hongsheng ZHANG*, Yuankai QI, Peng WANG, Yanning ZHANG, Qi WU

Instruction: Take off, fly through the tower of cable bridge and down to the end of the road. Turn left, fly over the five-floor building with a yellow shop sign and down to the intersection on the left. Head to the park and turn right, fly along the edge of the park. March forward, at the intersection turn right, and finally land in front of the building with a red billboard on its rooftop.

Abstract

Recently emerged Vision-and-Language Navigation (VLN) tasks have drawn significant attention in both the computer vision and natural language processing communities. Existing VLN tasks are built for agents that navigate on the ground, either indoors or outdoors. However, many tasks require intelligent agents to operate in the sky, such as UAV-based goods delivery, traffic/security patrol, and scenery tours, to name a few. Navigating in the sky is more complicated than on the ground because agents must reason about flying height and more complex spatial relationships. To fill this gap and facilitate research in this field, we propose a new task named AerialVLN, which is UAV-based and oriented towards outdoor environments. We develop a 3D simulator rendered with near-realistic imagery of 25 city-level scenarios. Our simulator supports continuous navigation, environment extension and configuration. We also propose an extended baseline model based on the widely used cross-modal alignment (CMA) navigation methods. We find that there is still a significant gap between the baseline model and human performance, which suggests that AerialVLN is a new and challenging task.


Updates

2023/08/30🔥: We release the AerialVLN dataset, code and simulators.

2023/07/14: AerialVLN is accepted by ICCV 2023! 🎉

TODOs

  • AerialVLN Dataset

  • AerialVLN Code

  • AerialVLN Simulators

  • AerialVLN Challenge

Installation

Please follow the steps below to install the simulator.

Download and extract the AerialVLN simulator:

bash scripts/download_simulator.sh

Download and extract the AerialVLN dataset:

bash scripts/download_dataset_aerialvln.sh
# if you want to use aerialvln-s dataset, run: bash download_dataset_aerialvln-s.sh instead
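Once the download finishes, you can peek at the annotations with the minimal sketch below. It assumes the dataset is extracted under DATA/data/aerialvln and stored as JSON; the file name train.json and the episodes field are hypothetical, so adjust them to whatever the download script actually produces.

import json
from pathlib import Path

# Minimal sketch for inspecting the downloaded annotations.
# NOTE: the file name and field names below are assumptions; adjust them
# to match the files extracted by the download script.
anno_path = Path("DATA/data/aerialvln/train.json")  # hypothetical file name

with anno_path.open() as f:
    data = json.load(f)

episodes = data.get("episodes", data)  # fall back to the top-level object
print(f"loaded {len(episodes)} episodes from {anno_path}")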

Download the AerialVLN code and install the environment:

conda create -n AerialVLN python=3.8
conda activate AerialVLN
pip install -r requirements.txt

Finally, your project directory should look like this:

  • Project dir
    • AirVLN
    • DATA
      • data
        • aerialvln
    • ENVs
      • env_1
      • env_2
      • ...
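
As a quick sanity check before launching anything, the short sketch below verifies this layout. The directory names are taken directly from the tree above; run it from the project root.

from pathlib import Path

# Quick sanity check of the project layout described above.
project_dir = Path(".")  # run from the project root
expected = ["AirVLN", "DATA/data/aerialvln", "ENVs"]

for rel in expected:
    path = project_dir / rel
    status = "ok" if path.is_dir() else "MISSING"
    print(f"{status:>7}  {path}")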

Running

Please see the examples in the scripts directory.
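
For a rough idea of what interacting with the simulator involves, the sketch below connects to a running simulator instance through the AirSim Python API. Treat it as an illustration only: whether the released simulator exposes this exact API (an airsim.MultirotorClient on the default RPC port) is an assumption, and the scripts above remain the actual entry points.

import airsim  # assumption: the simulator speaks the standard AirSim RPC API

# Illustrative only: connect to a simulator instance that is already running.
client = airsim.MultirotorClient()  # default host 127.0.0.1, port 41451
client.confirmConnection()
client.enableApiControl(True)
client.armDisarm(True)

# Take off, then query the current pose of the drone.
client.takeoffAsync().join()
state = client.getMultirotorState()
print(state.kinematics_estimated.position)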

Questions?

Feel free to contact Shubo LIU or Hongsheng ZHANG.

Citing

If you use AerialVLN in your research, please cite the following paper:

@inproceedings{liu_2023_AerialVLN,
  title={AerialVLN: Vision-and-language Navigation for UAVs},
  author={Shubo Liu and Hongsheng Zhang and Yuankai Qi and Peng Wang and Yanning Zhang and Qi Wu},
  booktitle={International Conference on Computer Vision (ICCV)},
  year={2023}
}
