# Download Aerial Images using Bing Maps

## Method 1 (fast): Using the Bing Aerial API to retrieve Medium Resolution Images

This method quickly retrieves ready-to-download images, which are cut out from predefined Bing Maps imagery. The images have a lower resolution, and the requested coordinates are not always at the center of the image.

Because the same images will be used in production, where their fast retrieval matters, they can be used to train the model as well.
During training, each image is downscaled to 224x224 pixels and transformed in several ways, so the original images do not need to be 1080p or larger.

In [73]:
import urllib.request
from pathlib import Path
import pandas as pd
import os
import shutil
import time

In [None]:
# set up paths for data input and downloaded output
path = Path("./data")
output_dir = Path("./output_full")

# if the output directory does not exist, create it.
# Otherwise clear all files in directory to start anew
if not os.path.exists(output_dir):
    os.makedirs(output_dir)
else:
    shutil.rmtree(output_dir)
    os.makedirs(output_dir)

In [74]:
# Bing Map Aerial API to retrieve aerial images for given coordinates
bing_url = "https://dev.virtualearth.net/REST/v1/Imagery/Map/Aerial/" \
            "{lat},{lon}/17?ms={width},{height}&od=1&c=de-DE&key={api_key}"
KEY = "AijbFhynMi9YlUoC5sbBKfrfbnkcMJ34sYBEORQwbsviodnw8nTkkgh5se5COtMs" # API Key
# All images are requested at the same size (600x900 pixels, width x height)
WIDTH = 600
HEIGHT = 900
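
Hard-coding the key in a notebook makes it easy to leak once the notebook is shared. A minimal sketch of an alternative, assuming the key is stored in an environment variable: the variable name `BING_MAPS_KEY` and the helper `build_bing_url` are hypothetical and not part of the original notebook, while the URL template and zoom level 17 are copied from the cell above.

```python
import os

# Same template as in the cell above; zoom level 17 is kept fixed.
BING_URL = ("https://dev.virtualearth.net/REST/v1/Imagery/Map/Aerial/"
            "{lat},{lon}/17?ms={width},{height}&od=1&c=de-DE&key={api_key}")


def build_bing_url(lat, lon, width=600, height=900, api_key=None):
    """Return the request URL for one coordinate pair.

    The key is read from the BING_MAPS_KEY environment variable
    (hypothetical name) unless it is passed explicitly.
    """
    if api_key is None:
        api_key = os.environ.get("BING_MAPS_KEY", "")
    if not api_key:
        raise ValueError("No API key given and BING_MAPS_KEY is not set")
    return BING_URL.format(lat=lat, lon=lon, width=width,
                           height=height, api_key=api_key)
```

Calling `build_bing_url(row["Lat"], row["Lon"])` inside the download loop would then replace the inline `bing_url.format(...)` call.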

In [75]:
# Read labeled dataset
data = pd.read_excel(path/"Standorte_labeled.xlsx")

In [76]:
data.head()

| Unnamed: 0.1 | Unnamed: 0 | Straße | PLZ | Ort | emp_land | Lat | Lon | Kategorie | Markiert |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | REWESTR. 1 | 1683 | STARBACH | D | 51.086086 | 13.278351 | Sehr Gut | |
| 1 | 1 | RAIFFEISENSTR. 5-9 | 61191 | ROSBACH | D | 50.29535 | 8.687106 | Sehr Gut | |
| 2 | 2 | AM RÖMERFELD 6 | 71149 | BONDORF | D | 48.506751 | 8.836023 | Sehr Gut | |
| 3 | 3 | SEEBERGER STRASSE 10 | 15345 | ALTLANDSBERG | D | 52.548581 | 13.694739 | Sehr Gut | |
| 4 | 4 | IN DEN WEINÄCKERN 1 | 69168 | WIESLOCH | D | 49.296811 | 8.668717 | Sehr Gut | |


In [77]:
# Quick summary of data
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 21433 entries, 0 to 21432
Data columns (total 9 columns):
Unnamed: 0    21433 non-null int64
Straße        21433 non-null object
PLZ           21433 non-null int64
Ort           21433 non-null object
emp_land      21433 non-null object
Lat           21433 non-null float64
Lon           21433 non-null float64
Kategorie     965 non-null object
Markiert      19 non-null object
dtypes: float64(2), int64(2), object(5)
memory usage: 1.5+ MB


In [78]:
# Only labeled rows are needed, i.e. rows whose
# "Kategorie" column is not null
data = data[data.Kategorie.notnull()]

In [79]:
# A total of 965 entries have been successfully labeled. 
# We continue by only using these 965 locations.
data.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 965 entries, 0 to 964
Data columns (total 9 columns):
Unnamed: 0    965 non-null int64
Straße        965 non-null object
PLZ           965 non-null int64
Ort           965 non-null object
emp_land      965 non-null object
Lat           965 non-null float64
Lon           965 non-null float64
Kategorie     965 non-null object
Markiert      19 non-null object
dtypes: float64(2), int64(2), object(5)
memory usage: 75.4+ KB


In [80]:
# Retrieve all possible classes
classes = data.Kategorie.unique()

In [81]:
classes

array(['Sehr Gut', 'Mittel', 'Gut', 'Schlecht'], dtype=object)

In [82]:
# create a subfolder in our output directory for each category
# if the folder already exists, delete all images currently in it
for c in classes:
    if not os.path.exists(output_dir/c):
        os.makedirs(output_dir/c)
    else:
        shutil.rmtree(output_dir/c)
        os.makedirs(output_dir/c)
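
The "delete if present, then recreate" pattern is used in this notebook for both the output directory and the class subfolders. As a sketch, it could be factored into one small helper; the name `reset_dir` is hypothetical, and the example runs inside a temporary directory so that nothing real is deleted.

```python
import shutil
import tempfile
from pathlib import Path


def reset_dir(directory):
    """Delete `directory` if it exists, then create it empty."""
    directory = Path(directory)
    if directory.exists():
        shutil.rmtree(directory)
    directory.mkdir(parents=True)
    return directory


# Usage with a throwaway directory and the four class labels.
with tempfile.TemporaryDirectory() as tmp:
    for label in ["Sehr Gut", "Gut", "Mittel", "Schlecht"]:
        reset_dir(Path(tmp) / label)
    created = sorted(p.name for p in Path(tmp).iterdir())
```

Note that `shutil.rmtree` removes everything below the directory without asking, so the helper should only ever be pointed at a dedicated output folder.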

In [84]:
# Retrieve an image for each row in our dataset using the Bing API
counter = 0
for idx, row in data.iterrows(): # iterate over dataset
    url = bing_url.format(lat=row["Lat"],
                          lon=row["Lon"],
                          width=WIDTH,
                          height=HEIGHT,
                          api_key=KEY
                         ) # create URL
    fname = "image{num}.jpeg".format(num=counter)

    # retrieve image and save it to the subfolder of its category
    urllib.request.urlretrieve(url, output_dir/row["Kategorie"]/fname)
    counter += 1
    if counter % 10 == 0: # wait for 5 seconds after every 10 images
        time.sleep(5) # to avoid rate limiting
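
A single failed request aborts the whole loop, because `urlretrieve` raises on network and HTTP errors. One possible way to make the download more tolerant is a small retry wrapper. This is a sketch only: the helper name `download_with_retry` and the defaults for `retries` and `wait` are assumptions, and the `fetch` parameter exists so the function can be tried without network access.

```python
import time
import urllib.request


def download_with_retry(url, target, retries=3, wait=5.0,
                        fetch=urllib.request.urlretrieve):
    """Try to save `url` to `target`; return True on success.

    `retries` and `wait` are illustrative defaults, not values
    recommended by the Bing Maps documentation.
    """
    for attempt in range(1, retries + 1):
        try:
            fetch(url, target)
            return True
        except OSError as err:
            # urllib's URLError and HTTPError are subclasses of OSError
            print(f"Attempt {attempt}/{retries} failed for {target}: {err}")
            if attempt < retries:
                time.sleep(wait)
    return False
```

In the loop, rows for which the function returns `False` could be collected in a list and retried or inspected later instead of stopping the run.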

## Method 2 (slow): Retrieve High Resolution Aerial Images stitched together using the Bing Maps Tile System

This method takes much longer but produces images with a much higher resolution than the previous method. In addition, it calculates a bounding box around each coordinate pair, so the coordinates are always at the center of the image and every image covers the same radius.

Use it when a task requires very high resolution imagery.
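
To illustrate what stitching tiles involves, the following sketch implements the coordinate conversions from the publicly documented Bing Maps Tile System: latitude/longitude to pixel coordinates, pixel to tile coordinates, and tile coordinates to a quadkey. The function names are chosen here for illustration, and whether the script used below performs exactly these steps is an assumption.

```python
import math

MIN_LAT, MAX_LAT = -85.05112878, 85.05112878  # limits of the Mercator projection
TILE_SIZE = 256  # Bing Maps tiles are 256x256 pixels


def clip(value, low, high):
    """Restrict `value` to the range [low, high]."""
    return min(max(value, low), high)


def latlon_to_pixel(lat, lon, level):
    """Convert WGS84 coordinates to global pixel XY at a zoom level."""
    lat = clip(lat, MIN_LAT, MAX_LAT)
    lon = clip(lon, -180.0, 180.0)
    x = (lon + 180.0) / 360.0
    sin_lat = math.sin(math.radians(lat))
    y = 0.5 - math.log((1 + sin_lat) / (1 - sin_lat)) / (4 * math.pi)
    map_size = TILE_SIZE << level  # map width and height in pixels
    px = int(clip(x * map_size + 0.5, 0, map_size - 1))
    py = int(clip(y * map_size + 0.5, 0, map_size - 1))
    return px, py


def pixel_to_tile(px, py):
    """Return the XY index of the tile containing a pixel."""
    return px // TILE_SIZE, py // TILE_SIZE


def tile_to_quadkey(tx, ty, level):
    """Interleave the bits of tile X and Y into a base-4 quadkey."""
    digits = []
    for i in range(level, 0, -1):
        mask = 1 << (i - 1)
        digit = 0
        if tx & mask:
            digit += 1
        if ty & mask:
            digit += 2
        digits.append(str(digit))
    return "".join(digits)
```

A bounding box around a location can then be expressed as a pixel range centered on `latlon_to_pixel(lat, lon, level)`, and every tile overlapping that range is downloaded and pasted into one large image.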

In [None]:
import subprocess

In [None]:
# define path to the Bing Maps Tile System script
# Python Implementation for Map Tile System can be found 
# @GitHub: https://github.com/jpkunkler/Bing_Aerial_API
script = r"Bing_Aerial_API/imageRetrieval.py"

In [None]:
# Go through dataset row by row
for idx, row in data.iterrows():
    out_path = output_dir/row["Kategorie"]
    
    # use subprocess to call script with required parameters
    p = subprocess.Popen(['python', script, str(row["Lat"]),
                          str(row["Lon"]),
                          str(out_path)]
                        )
    p.wait() # wait until subprocess has finished retrieving the image

    print("Proceeding to next row. Please stand by.")
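
The loop above does not notice when the script fails for a row. A hedged variant using `subprocess.run` could report the exit status per call. The helper names `build_command` and `run_retrieval` are hypothetical, and the argument order (latitude, longitude, output path) is taken from the cell above rather than from the script's documentation.

```python
import subprocess
import sys


def build_command(script, lat, lon, out_path):
    """Return the argument list for one call of the retrieval script."""
    # sys.executable is the interpreter running this notebook, which
    # avoids picking up a different "python" from PATH.
    return [sys.executable, str(script), str(lat), str(lon), str(out_path)]


def run_retrieval(script, lat, lon, out_path):
    """Run the script once and report whether it exited with code 0."""
    result = subprocess.run(build_command(script, lat, lon, out_path))
    return result.returncode == 0
```

Inside the loop, `run_retrieval(script, row["Lat"], row["Lon"], out_path)` would return `False` for rows whose image could not be retrieved, so they can be logged and retried.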