# SkyScenes

In this notebook, we'll explore a subset of the [SkyScenes dataset](https://arxiv.org/abs/2312.06719). 

[This subset of SkyScenes](https://huggingface.co/datasets/Voxel51/SkyScenes) focuses on aerial perspectives captured at three heights (15, 35, and 60 meters) with a fixed pitch angle of 0 degrees. All images were captured under clear daytime conditions (ClearNoon) across four urban environments: Town01, Town02, Town05, and Town07. Each scene configuration includes an RGB image along with its corresponding depth map and segmentation mask.

We'll explore this dataset using the open-source library FiftyOne. If you haven't already, install FiftyOne:

```bash
pip install -U fiftyone
```

Once you've installed FiftyOne, you can download the dataset like so:

In [2]:
import fiftyone as fo
from fiftyone.utils.huggingface import load_from_hub

dataset = load_from_hub(
    "Voxel51/SkyScenes",
    name="skyscenes",
    persistent=True,
    # overwrite=True,  # uncomment to discard an existing "skyscenes" dataset and start from scratch
)

Downloading config file fiftyone.yml from Voxel51/SkyScenes
Loading dataset


Importing samples...
 100% |█████████████████| 840/840 [22.9ms elapsed, 0s remaining, 36.7K samples/s]      
Downloading 840 media files...


100%|██████████| 9/9 [05:01<00:00, 33.50s/it]


Once the dataset has been downloaded, you can load it again at any time by name:

```python
import fiftyone as fo
dataset = fo.load_dataset("skyscenes")
```
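Because the download can take several minutes, it may be convenient to wrap both paths in one helper. The following is a minimal sketch, not part of the original notebook: `get_skyscenes` is a hypothetical name, and it relies on `fo.dataset_exists` to check whether a dataset called `"skyscenes"` is already stored locally.

```python
def get_skyscenes(name="skyscenes"):
    """Load the local copy of the dataset if present, otherwise download it.

    `get_skyscenes` is a hypothetical helper written for this sketch.
    """
    # Imported inside the function so that defining the helper does not
    # itself require FiftyOne to be importable.
    import fiftyone as fo
    from fiftyone.utils.huggingface import load_from_hub

    if fo.dataset_exists(name):
        # Already stored in the local FiftyOne database: no download needed
        return fo.load_dataset(name)

    # First run: fetch from the Hugging Face Hub and persist it under `name`
    return load_from_hub("Voxel51/SkyScenes", name=name, persistent=True)


# Usage:
#   dataset = get_skyscenes()
```

With this pattern the same cell can be re-run safely, since the Hub download is only attempted when no local dataset with that name exists.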

This dataset is parsed as a grouped dataset.

In FiftyOne, a grouped dataset organizes samples into groups. Each group collects related samples, such as different views or modalities of the same scene, and each sample occupies a named slice of its group. This keeps related data linked together instead of leaving it as a flat list of unrelated samples.

In this FiftyOne dataset, each scene is grouped by the camera angle, allowing for easy comparison of the same scene from different heights while maintaining the relationship between RGB images and their corresponding depth and segmentation information.

For each scene, it associates the RGB image with its corresponding depth map and segmentation mask across three different aerial perspectives (15m, 35m, and 60m height at 0° pitch). 
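To see how the groups are actually laid out, you can inspect the dataset's group metadata. The sketch below is illustrative only: `describe_groups` is a hypothetical helper, and it reads the `media_type`, `group_field`, `group_slices`, `group_media_types`, and `default_group_slice` properties that FiftyOne exposes on grouped datasets. The slice names themselves are not listed in this notebook, so none are assumed here.

```python
def describe_groups(dataset):
    """Summarize the group structure of a FiftyOne grouped dataset.

    `describe_groups` is a hypothetical helper written for this sketch.
    """
    if dataset.media_type != "group":
        raise ValueError("Expected a grouped dataset")

    return {
        "group_field": dataset.group_field,              # field holding group membership
        "slices": list(dataset.group_slices),            # names of the group slices
        "slice_media_types": dict(dataset.group_media_types),  # slice name -> media type
        "default_slice": dataset.default_group_slice,    # slice shown by default
    }


# Usage (requires the dataset downloaded earlier):
#   import fiftyone as fo
#   print(describe_groups(fo.load_dataset("skyscenes")))
```

Once you know the slice names, `dataset.select_group_slices(...)` returns a flat view of the chosen slices, and `dataset.get_group(sample.group.id)` returns every sample in one group keyed by slice name.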

The segmentation masks are mapped to 28 distinct classes, ranging from urban infrastructure elements like buildings, roads, and bridges to dynamic objects such as pedestrians, vehicles, and cyclists. 
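A quick way to get a feel for a segmentation mask is to measure how much of the image each class covers. The following is a minimal sketch under stated assumptions: it assumes a mask decodes to a 2D integer array of class indices in the range 0 to 27, and `class_pixel_fractions` is a hypothetical helper. How the masks are stored on each sample is not covered here, so the example runs on a small made-up array rather than real data.

```python
import numpy as np

NUM_CLASSES = 28  # number of segmentation classes stated above


def class_pixel_fractions(mask, num_classes=NUM_CLASSES):
    """Return the fraction of pixels assigned to each class index.

    Assumes `mask` is a non-empty 2D array of integer class indices
    in [0, num_classes).
    """
    mask = np.asarray(mask)
    if mask.ndim != 2 or mask.size == 0:
        raise ValueError("Expected a non-empty 2D mask")
    if not np.issubdtype(mask.dtype, np.integer):
        raise ValueError("Expected integer class indices")
    if mask.min() < 0 or mask.max() >= num_classes:
        raise ValueError("Class index out of range")

    # Count the occurrences of every class index, padding unused classes with 0
    counts = np.bincount(mask.ravel().astype(np.intp), minlength=num_classes)
    return counts / mask.size


# Toy 2x3 mask, made up for illustration (not taken from SkyScenes)
toy_mask = np.array([[0, 0, 1],
                     [2, 2, 2]])
fractions = class_pixel_fractions(toy_mask)
print(fractions[:3])
```

The returned array has one entry per class and sums to 1, so it can be compared directly across images captured at different heights.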

Let's take a look at the dataset:

In [1]:
session = fo.launch_app(dataset)
