## **1. Import & Extract Dataset**

The Aeroscapes dataset (aerial images and their segmentation masks) was collected and labelled by Ishan Nigam, Chen Huang, and Deva Ramanan for their “Ensemble Knowledge Transfer for Semantic Segmentation” research project. The images, which were acquired by a drone flying about 5 to 50 meters above the ground, are roughly 720p in resolution. Information about the dataset can be found at [https://github.com/ishann/aeroscapes](https://github.com/ishann/aeroscapes), and the dataset itself can be downloaded from [https://drive.google.com/file/d/1W7yQtrGUnPQ1fB2dPb5wPjrLrlQi395g/view](https://drive.google.com/file/d/1W7yQtrGUnPQ1fB2dPb5wPjrLrlQi395g/view).

As part of the data preparation step, we will download this dataset and extract it to a directory.

In [1]:
############################################################################
# Mount Google Drive to '/content/drive' and set default module path
############################################################################
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [2]:
############################################################################
# Import Dataset from Link
############################################################################
import os
import gdown

# Change directory to the dataset directory
dataset_dir = '/content/drive/My Drive/Colab Notebooks/aerial-image-segmentation-with-unet/dataset/'
os.chdir(dataset_dir)

# Import dataset into the dataset directory
url = 'https://drive.google.com/u/0/uc?id=1W7yQtrGUnPQ1fB2dPb5wPjrLrlQi395g'
output = 'aeroscapes.tar.gz'
gdown.download(url, output, quiet=False)

Downloading...
From: https://drive.google.com/u/0/uc?id=1W7yQtrGUnPQ1fB2dPb5wPjrLrlQi395g
To: /content/drive/My Drive/Colab Notebooks/aerial-image-segmentation-with-unet/dataset/aeroscapes.tar.gz
100%|██████████| 788M/788M [00:07<00:00, 104MB/s]


'aeroscapes.tar.gz'

In [None]:
############################################################################
#  Extract Dataset to a directory
############################################################################
#os.chdir('/')

# Note: this is an absolute path at the filesystem root, not inside dataset_dir
extraction_path = "/aeroscapes/"

# Create the target directory if it does not exist, then extract the archive into it
os.makedirs(extraction_path, exist_ok=True)
!tar -xzvf 'aeroscapes.tar.gz' -C "{extraction_path}" # extract tar.gz file
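
The same extraction step can also be done without the shell, which may be handy when running outside a notebook. The following is a minimal sketch using Python's standard `tarfile` module; the helper name `extract_archive` is hypothetical and not part of the original notebook.

```python
import os
import tarfile


def extract_archive(archive_path, destination):
    """Extract a .tar.gz archive into destination and return its top-level entries."""
    # Create the destination directory if it does not exist yet
    os.makedirs(destination, exist_ok=True)

    # "r:gz" opens a gzip-compressed tar archive for reading
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(path=destination)

    # List what ended up at the top level of the destination
    return sorted(os.listdir(destination))
```

In this notebook it could be called as `extract_archive('aeroscapes.tar.gz', '/aeroscapes/')`, assuming the archive sits in the current working directory.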


## **2. Load and preview sample images and their segmentation labels**

In [4]:
dataset_directory = '/content/drive/My Drive/Colab Notebooks/aerial-image-segmentation-with-unet/dataset/aeroscapes/'
image_directory_path = dataset_directory + 'JPEGImages/'
mask_directory_path = dataset_directory + 'SegmentationClass/'

In [5]:
def list_image_paths(image_directory, mask_directory):
  """
  Builds the file paths of the images and their segmentation masks.
  Args:
    image_directory: path to the image directory
    mask_directory: path to the segmentation labels
                    directory
  Returns:
    image_paths: a list containing the file paths of the
                 images in the specified directory
    mask_paths: a list containing the file paths of the
                corresponding segmentation labels, derived
                by swapping the '.jpg' extension for '.png'
  """
  
  image_paths = []
  mask_paths = []
  image_filenames = os.listdir(image_directory)
  
  for image_filename in image_filenames:
    image_paths.append(image_directory + "/" + image_filename)
    mask_filename = image_filename.replace('.jpg', '.png')
    mask_paths.append(mask_directory + "/" + mask_filename)
    
  return image_paths, mask_paths

In [6]:
image_paths, mask_paths = list_image_paths(image_directory_path, mask_directory_path) 
number_of_images, number_of_masks = len(image_paths), len(mask_paths)
print(f"1. There are {number_of_images} images and {number_of_masks} masks in our dataset")
print(f"2. An example of an image path is: \n {image_paths[1]}")
print(f"3. An example of a mask path is: \n {mask_paths[1]}")


1. There are 3269 images and 3269 masks in our dataset
2. An example of an image path is: 
 /content/drive/My Drive/Colab Notebooks/aerial-image-segmentation-with-unet/dataset/aeroscapes/JPEGImages//000001_071.jpg
3. An example of a mask path is: 
 /content/drive/My Drive/Colab Notebooks/aerial-image-segmentation-with-unet/dataset/aeroscapes/SegmentationClass//000001_071.png
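
The example paths above contain a doubled slash (`JPEGImages//`) because the directory strings already end in `/` before another `/` is concatenated. This is harmless on Linux, but a slightly more defensive variant is sketched below. It uses `os.path.join`, sorts the filenames so the order is reproducible, and keeps only images whose mask file actually exists. The helper name `pair_images_with_masks` is hypothetical, and the sketch assumes the same `.jpg` to `.png` naming convention as the function above.

```python
import os


def pair_images_with_masks(image_directory, mask_directory):
    """Return sorted (image_path, mask_path) pairs, skipping images without a mask."""
    pairs = []
    for image_filename in sorted(os.listdir(image_directory)):
        stem, extension = os.path.splitext(image_filename)
        if extension.lower() != ".jpg":
            continue  # ignore anything that is not a JPEG image
        # Assumed convention: the mask shares the image's name with a .png extension
        mask_path = os.path.join(mask_directory, stem + ".png")
        if os.path.isfile(mask_path):
            image_path = os.path.join(image_directory, image_filename)
            pairs.append((image_path, mask_path))
    return pairs
```

The two lists used elsewhere in the notebook could then be recovered with `image_paths, mask_paths = map(list, zip(*pairs))`, provided at least one pair was found.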


In [7]:
import random
import imageio
import numpy as np
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')  # suppress all warnings in the notebook output

number_of_samples = len(image_paths)

# Preview 20 randomly chosen image/mask pairs side by side
for _ in range(20):
    sample_index = random.randint(0, number_of_samples - 1)

    img = imageio.imread(image_paths[sample_index])
    mask = imageio.imread(mask_paths[sample_index])

    fig, arr = plt.subplots(1, 2, figsize=(20, 12))
    arr[0].imshow(img)
    arr[0].set_title('Image')
    arr[0].axis("off")
    arr[1].imshow(mask)
    arr[1].set_title('Segmentation')
    arr[1].axis("off")
    plt.show()

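
Besides looking at the masks, it can help to check which label values they contain. The sketch below counts the pixels per class in a mask. It assumes each mask is a 2-D array holding one integer class ID per pixel, which the notebook does not confirm, and the helper name `count_mask_classes` is hypothetical. A small synthetic array stands in for a real mask.

```python
import numpy as np


def count_mask_classes(mask):
    """Return a {class_id: pixel_count} dictionary for a 2-D label mask."""
    class_ids, pixel_counts = np.unique(mask, return_counts=True)
    return {int(c): int(n) for c, n in zip(class_ids, pixel_counts)}


# Synthetic 2x3 mask standing in for a real segmentation label
example_mask = np.array([[0, 0, 1],
                         [2, 1, 1]], dtype=np.uint8)
print(count_mask_classes(example_mask))  # {0: 2, 1: 3, 2: 1}
```

For a real sample, the same function could be applied to an array loaded with `imageio.imread(mask_paths[0])`, if the masks turn out to be single-channel.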