- For anyone who has downloaded the Street or Trees sub-dataset before 2021.10.25 8:40 Beijing Time (2021.10.25 0:40 UTC), please update the label files. The reason can be found in this issue. We are very sorry for our mistake.
SimUAV is a simulated (not real-world) dataset for UAV small object detection. It contains 29,568 images across 8 scenes or backgrounds: pool, street, trees, grass, mountain lake, palace, seaside temple, and winter town. Each scene features 4 multi-rotor models: Parrot A.R. Drone 2.0, DJI Inspire 1, DJI Mavic 2 Pro, and DJI Phantom 4 Pro.
Figure 1 Examples of the 8 simulated scenes, each annotated with red boxes. Note that the white cross-line in each image helps to locate the objects (i.e., which quadrant they are in), and the multi-rotor model in each image is the DJI Mavic 2 Pro.
Figure 2 3D models of the multi-rotors used in simulation.
| Count | Pool | Street | Trees | Grass | Mountain Lake | Palace | Seaside Temple | Winter Town |
|---|---|---|---|---|---|---|---|---|
| A.R. Drone | 653 | 1248 | 1230 | 600 | 1203 | 602 | 604 | 1201 |
| DJI Inspire | 641 | 1240 | 1274 | 600 | 1202 | 603 | 601 | 1206 |
| DJI Mavic | 757 | 1460 | 1326 | 601 | 1204 | 600 | 602 | 1205 |
| DJI Phantom | 623 | 1251 | 1216 | 603 | 1204 | 604 | 600 | 1204 |
Table 1 Image count for each model and scene.
Figure 3 Proportion of each scene and model.
The SimUAV dataset can be accessed via the following links:
- Pool(1.01GB): Google Drive; Baidu Cloud Token: dq0z
- Street(2.3GB): Google Drive; Baidu Cloud Token: nzkz
- Trees(2.41GB): Google Drive; Baidu Cloud Token: ooxx
- Grass(872MB): Google Drive; Baidu Cloud Token: dzyr
- Mountain Lake(2.44GB): Google Drive; Baidu Cloud Token: 58zm
- Palace(1.6GB): Google Drive; Baidu Cloud Token: j8ze
- Seaside Temple(1.08GB): Google Drive; Baidu Cloud Token: u7zy
- Winter Town(2.9GB): Google Drive; Baidu Cloud Token: t1aa
Tip 1: before downloading a scene, it may help to preview the example images in the 'examples' folder.
Tip 2: in our published paper, we also used a real-world dataset to train our deep learning model along with the SimUAV dataset. You can download the real-world dataset if needed: Google Drive ; Baidu Cloud Token: n7mo
In recent years, UAVs (unmanned aerial vehicles; here we mean multi-rotors) have been widely used in agriculture, industry, and the military, and they also pose great risks to public security, such as illegal flights, espionage, and, more seriously, terrorist attacks by dropping bombs. Therefore, research on the detection and recognition of UAV targets has already been carried out. Popular approaches to UAV small object detection mainly rely on deep learning algorithms from computer vision, which typically require large amounts of data. Unfortunately, few object detection datasets are available for multi-rotors. To alleviate this issue, we constructed the SimUAV dataset, which is generated entirely by simulation software without any real-world images.
We define the task of UAV small object detection as follows: determine whether there are any instances of UAVs, specifically multi-rotors, in an image and, if present, return the spatial location and extent of each object instance (e.g., via a bounding box). "Small" here means the ground-truth box of each object is always small: no larger than 30×30 pixels in a 640×640 image.
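The size criterion above can be expressed as a simple check. This is a sketch of ours, not part of the dataset tooling; the function name and the rescaling of boxes from other resolutions to the 640×640 reference are our assumptions:

```python
def is_small_object(box_w, box_h, img_w=640, img_h=640, max_side=30):
    """Return True if a ground-truth box counts as 'small':
    no larger than max_side x max_side pixels at the 640x640
    reference resolution. Boxes from images of other sizes are
    rescaled to that reference before the comparison."""
    scale_w = 640 / img_w
    scale_h = 640 / img_h
    return box_w * scale_w <= max_side and box_h * scale_h <= max_side
```

For example, a 60×56 box in a 1280×1280 image rescales to 30×28 at the reference resolution and therefore still counts as small.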
We use AirSim and UE4 as our simulation software. AirSim is an open-source simulator for drones, built on Epic Games' Unreal Engine 4 as a platform for AI research. AirSim allows us to fly multi-rotors in UE4 rendering environments at various velocities and along predefined paths, which can be set easily via the Python API. Besides, AirSim supports substituting any 3D model file for the default multi-rotor model. UE4 provides a variety of environments or scenes in the Epic Store, including forests, beaches, cities, parks, and so on.
Figure 4 Left: A snapshot from AirSim shows a multi-rotor flying in an urban environment. Right: A search page of the UE4 Marketplace.
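Flying a multi-rotor along a preset path with the AirSim Python API can be sketched as below. This is a minimal illustration, not our actual generation script: the helper names (`circular_path`, `fly_path`), the circular trajectory, and the speed are our choices, while `enableApiControl`, `takeoffAsync`, and `moveToPositionAsync` are standard AirSim client calls (AirSim uses NED coordinates, so altitude above ground is a negative z):

```python
import math

def circular_path(radius, altitude, n_points=36):
    """Generate (x, y, z) waypoints on a circle at a fixed altitude.
    z is negative because AirSim's NED frame points z downward."""
    return [(radius * math.cos(2 * math.pi * i / n_points),
             radius * math.sin(2 * math.pi * i / n_points),
             -altitude)
            for i in range(n_points)]

def fly_path(client, path, velocity=2.0):
    """Fly an AirSim multi-rotor along the waypoints.
    `client` is an airsim.MultirotorClient connected to a running
    AirSim instance."""
    client.confirmConnection()
    client.enableApiControl(True)
    client.armDisarm(True)
    client.takeoffAsync().join()
    for x, y, z in path:
        client.moveToPositionAsync(x, y, z, velocity).join()
```

The path generation is kept separate from the flight commands so it can be inspected (or reused) without a running simulator.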
The generation of the SimUAV dataset is implemented with the AirSim Python API. As described in Figure 5, we create four multi-rotors: one as the observer, which carries several cameras (we only depict one camera), and the others as targets flying at a certain distance from the observer. Targets are expected to be captured by one or more cameras, and the corresponding camera images are stored. Note that we keep the observer vibrating during simulation in order to imitate hand-held shooting.
Figure 5 The observer multi-rotor shoots the target multi-rotors with onboard cameras.
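The hand-held vibration of the observer can be imitated by perturbing the camera orientation at each frame. The sketch below is an illustration of that idea; the function name and the jitter bound are our assumptions, not values from the paper:

```python
import random

def vibrate_pose(pitch, roll, yaw, max_jitter_deg=1.5, rng=None):
    """Add small random angular jitter (in degrees) to a camera pose
    to imitate hand-held shooting. The jitter bound is an assumed
    value; the real generation script may use a different model."""
    rng = rng or random.Random()
    def jitter():
        return rng.uniform(-max_jitter_deg, max_jitter_deg)
    return pitch + jitter(), roll + jitter(), yaw + jitter()
```

In AirSim, the perturbed angles would then be applied to the observer's camera before each capture.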
Tip 3: GitHub's Markdown does not support math expressions; please use a Chrome extension, or download the whole project and read it locally.
Since the positions of the targets and the attitudes of the cameras are available from AirSim during shooting, the labels of each image can be computed analytically. Taking Target1 in Figure 5 as an example, three coordinate frames are involved: the world frame, the camera frame, and the pixel frame.
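The world-to-pixel transform is the standard pinhole projection: a world point is first moved into the camera frame by a rotation and translation, then mapped to pixel coordinates by the intrinsic matrix. A sketch under our own matrix conventions (the function name and the `R_wc`/`t_wc`/`K` naming are ours):

```python
import numpy as np

def world_to_pixel(p_world, R_wc, t_wc, K):
    """Project a 3D point in the world frame onto the pixel frame.
    R_wc, t_wc: rotation and translation taking world coordinates to
    the camera frame (p_cam = R_wc @ p_world + t_wc).
    K: 3x3 camera intrinsic matrix."""
    p_cam = R_wc @ np.asarray(p_world, dtype=float) + t_wc
    if p_cam[2] <= 0:          # point behind the camera: not visible
        return None
    uvw = K @ p_cam            # homogeneous pixel coordinates
    return uvw[:2] / uvw[2]    # (u, v) pixel coordinates
```

Projecting the corners of a target's 3D extent this way and taking their bounding rectangle yields the kind of 2D box annotation described above.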
Overall, we generate annotated images across all scenes and drone models, forming the SimUAV dataset.
Please cite this paper in your publications if SimUAV helps your research.
@misc{rui2021comprehensive,
title={A Comprehensive Approach for UAV Small Object Detection with Simulation-based Transfer Learning and Adaptive Fusion},
author={Chen Rui and Guo Youwei and Zheng Huafei and Jiang Hongyu},
year={2021},
eprint={2109.01800},
archivePrefix={arXiv},
primaryClass={cs.CV}
}