
About the AOT dataset


📚 This guide explains the details of the AOT Dataset 🚀. UPDATED 24 August 2021.

Reference: https://www.aicrowd.com/challenges/airborne-object-tracking-challenge

📖 Airborne Object Tracking (AOT) dataset description

The Airborne Object Tracking (AOT) dataset is a collection of flight sequences captured onboard aerial vehicles with high-resolution cameras. To generate those sequences, two aircraft are equipped with sensors and fly planned encounters (e.g., Helicopter1 in Figure 1(a)). The trajectories are designed to create a wide distribution of distances, closing velocities, and approach angles. In addition to these so-called planned aircraft, AOT also contains other, unplanned airborne objects that may be present in the sequences (e.g., Airborne1 in Figure 1(a)). Those objects are also labeled, but their distance information is not available.

Airborne objects usually appear quite small at the distances relevant for early detection: 0.01% of the image size on average, down to a few pixels in area (in contrast, common object detection datasets contain objects that cover a considerably larger portion of the image). This makes AOT a new and challenging dataset for the detection and tracking of potentially approaching aerial objects.

Figure 1: Overview of the Airborne Object Tracking (AOT) dataset; more details in the "Dataset diversity" section below

In total, AOT includes close to 164 hours of flight data:

  • 4,943 flight sequences of around 120 seconds each, collected at 10 Hz in diverse conditions. Each sequence typically includes at most one planned encounter, although some may include more.
  • 5.9M+ images
  • 3.3M+ 2D annotations (a quick consistency check of these figures is sketched below)
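
The headline figures follow directly from the sequence count, duration, and frame rate. The short Python sketch below reproduces them; it assumes exactly 120-second sequences at 10 Hz, whereas actual sequence lengths vary slightly.

```python
# Quick consistency check of the headline AOT figures quoted above.
# Assumes exactly 120-second sequences at 10 Hz; real sequence lengths vary slightly.

NUM_SEQUENCES = 4_943     # flight sequences
SEQUENCE_SECONDS = 120    # approximate duration of each sequence
FRAME_RATE_HZ = 10        # capture rate

total_seconds = NUM_SEQUENCES * SEQUENCE_SECONDS
total_hours = total_seconds / 3600
total_frames = total_seconds * FRAME_RATE_HZ

print(f"~{total_hours:.0f} hours of flight data")   # close to the quoted 164 hours
print(f"~{total_frames / 1e6:.1f}M images")          # ~5.9M images
```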

Video 1: Flight sequence of the drone, with other airborne objects annotated

📰 Dataset diversity

A unique feature of AOT compared to existing datasets is the wide spectrum of challenging conditions it covers for the detection and tracking of airborne objects.

Figure 2: Sample images showcasing the diversity of the AOT dataset

  • Airborne object size: often a direct proxy for the distance to the object, the area of objects in the dataset varies from 4 to 1,000 pixels, as illustrated in Fig. 1(b). Note that the ground truth for tiny and small objects cannot be marked perfectly tight; instead it is approximated with circles of radius 3 and 8 pixels respectively, which produces the two bright horizontal lines in Fig. 1(b) (a small sketch of this approximation follows Table 1).
  • Planned encounters:
    • Distance to the object: concentrated between 600 and 2,000 meters (25th to 75th percentiles)
    • Closing velocity (the velocity with which the object approaches the camera): up to 70 meters per second.
    • Angle of approach:
      • Azimuth: from -60 to 60 degrees
      • Elevation: from -45 to 45 degrees
    • Collision risk: an estimated 55% of the planned encounters qualify as potential collision trajectories or close encounters
  • Camera roll angle: related to the camera aircraft's trajectory, the bank angle reaches up to 60 degrees in high-bank turns
  • Altitude: the altitude of the camera varies from 24 to 1,600 meters above mean sea level (MSL), with most captures between 260 and 376 meters MSL. Some captures are as low as 150 meters above ground level, a regime that is challenging to capture.
  • Distance to visual horizon: 80% of targets are above the horizon, 1% on the horizon, and 19% below. This feature particularly affects the amount of clutter in the background of the object.
  • Airborne object type: see Figure 1(d) and Table 1 below
  • Sky conditions and visibility: sequences with clear, partly cloudy, cloudy, and overcast skies are provided; 69% of the sequences have good visibility, 26% have medium visibility, and 5% exhibit poor visibility conditions.
  • Light conditions:
    • Back-lit aircraft, sun flare, or overexposure are present in 5% of the sequences
    • Time of the day: data was captured only during well-lit daylight operations but at different times of the day, creating different sun-angle conditions
  • Terrain: flat horizon, hilly terrain, mountainous terrain, and shorelines

Table 1 below provides an overview of the objects present in the dataset; the "Other" category also includes hot air balloons, ultralights, drones, etc. There are 3,306,350 frames without labels, as they contain no airborne objects. Note that all airborne objects are labeled. For images with labels, there are on average 1.3 labels per image.
Split          All Airborne Objects   Airplane   Helicopter   Bird    Other*
Training       2.89M                  0.79M      1.22M        0.33M   0.54M
  Planned      1.39M                  0.24M      1.15M        0.00M   0.00M
  Unplanned    1.50M                  0.56M      0.07M        0.33M   0.54M
Test           0.50M                  0.13M      0.17M        0.06M   0.14M
  Planned      0.20M                  0.04M      0.16M        0.00M   0.00M
  Unplanned    0.29M                  0.08M      0.00M        0.06M   0.14M
Total          3.39M                  0.92M      1.39M        0.39M   0.69M

Table 1: Types and distribution of airborne object labels
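
As noted in the "Airborne object size" bullet above, tiny and small objects are not annotated with pixel-tight boxes; they are approximated by circles of radius 3 and 8 pixels. The sketch below is a minimal illustration of that approximation; the helper function is hypothetical and not part of the dataset's tooling. Because every object labeled this way receives the same nominal extent regardless of its true size, the approximated labels collapse onto fixed areas, which is what produces the bright horizontal lines in Fig. 1(b).

```python
# Hypothetical illustration of the circle approximation used for tiny/small
# object labels (radii of 3 and 8 pixels are quoted in the text above).
# This is not the dataset's actual labeling code.

def circle_to_bbox(cx: float, cy: float, radius_px: float):
    """Axis-aligned (left, top, width, height) box enclosing a labeling circle."""
    side = 2 * radius_px
    return (cx - radius_px, cy - radius_px, side, side)

for radius in (3, 8):  # tiny and small objects, respectively
    _, _, w, h = circle_to_bbox(cx=100.0, cy=200.0, radius_px=radius)
    # All objects labeled with the same radius end up with the same box size,
    # regardless of their true extent.
    print(f"radius {radius}px -> box {w:.0f}x{h:.0f} px")
```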

📜 Data collection process

During a given data capture, the two sensor-equipped aircraft perform several planned rectilinear encounters, repositioning in between maneuvers. This large single data record is then split into digestible sequences of 120 seconds. Because of those cuts in the original record, an individual sequence may contain a mix of rectilinear approaches and steep turns, with or without the planned aircraft in sight. For example, a single rectilinear approach may be split across two sequences. In addition to the planned aircraft, a given sequence might contain other unplanned airborne objects, such as birds and small airplanes, or no airborne objects at all.

🗄 Data format

Two front-facing cameras, an Inertial Navigation System (INS), and GPS provide the onboard imagery and the orientation and position of the aircraft. The provided dataset therefore includes:

  1. Front-view, low-altitude videos (sampled as .png images at 10 FPS)
  2. Distance to the planned aircraft (calculated from their GPS positions; see the sketch below)
  3. Manually labeled ground truth bounding boxes for all visible airborne objects.
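
Item 2 notes that the distance to the planned aircraft is derived from the GPS positions of the two aircraft. The sketch below shows one way such a range (and the closing velocity mentioned in the "Dataset diversity" section) could be computed from latitude/longitude/altitude fixes; the conversion, field layout, and example values are assumptions for illustration and do not describe the dataset's published format.

```python
import math

# Hedged sketch: estimating the range to the planned aircraft from two GPS fixes
# and the closing velocity from the change in range between consecutive frames.
# The dataset's actual geodesy, conventions, and field names may differ.

WGS84_A = 6378137.0            # WGS-84 semi-major axis (m)
WGS84_E2 = 6.69437999014e-3    # WGS-84 first eccentricity squared

def geodetic_to_ecef(lat_deg, lon_deg, alt_m):
    """Convert a latitude/longitude/altitude fix to Earth-centered XYZ (meters)."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    n = WGS84_A / math.sqrt(1.0 - WGS84_E2 * math.sin(lat) ** 2)
    x = (n + alt_m) * math.cos(lat) * math.cos(lon)
    y = (n + alt_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - WGS84_E2) + alt_m) * math.sin(lat)
    return x, y, z

def range_m(camera_fix, target_fix):
    """Straight-line distance in meters between two (lat, lon, alt) fixes."""
    return math.dist(geodetic_to_ecef(*camera_fix), geodetic_to_ecef(*target_fix))

# Two consecutive frames, 0.1 s apart at the 10 Hz capture rate (made-up fixes).
camera = (47.60000, -122.30000, 500.0)
r0 = range_m(camera, (47.61000, -122.30000, 520.0))   # frame t
r1 = range_m(camera, (47.60995, -122.30000, 520.0))   # frame t + 0.1 s
closing_velocity = (r0 - r1) / 0.1   # positive when the object is approaching
print(f"range: {r1:.0f} m, closing velocity: {closing_velocity:.1f} m/s")
```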