# Identify Sample Locations near Intersections and Download GSV images

Find all intersections in an OSM extract, then create a CSV file with all the points near the intersections that we want to sample.

We walk down each "way" in the OSM data and work out a heading at each point, taken as the average of the bearing from the previous point and the bearing to the next point. This gives us some idea of which heading to use in a Google Street View request in order to sample images looking (roughly) forward, backward, left, and right.
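
The heading calculation described above can be sketched as follows. This is a minimal illustration, not the code in `osm_walker`: the function names `bearing_deg` and `mean_heading_deg` are hypothetical, and the use of a circular mean (so that bearings of 350 and 10 degrees average to 0 rather than 180) is an assumption about how the averaging should behave.

```python
import math

def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2, in degrees [0, 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    x = math.sin(dlon) * math.cos(phi2)
    y = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlon)
    return math.degrees(math.atan2(x, y)) % 360.0

def mean_heading_deg(bearing_in, bearing_out):
    """Circular mean of two bearings (undefined if they are exactly opposite)."""
    s = math.sin(math.radians(bearing_in)) + math.sin(math.radians(bearing_out))
    c = math.cos(math.radians(bearing_in)) + math.cos(math.radians(bearing_out))
    return math.degrees(math.atan2(s, c)) % 360.0

# Hypothetical usage: bearing between two consecutive nodes of a way
first_to_second = bearing_deg(-38.161128, 145.103564, -38.161668, 145.103287)
print(round(first_to_second))
```

At the first node of a way there is no previous point, so under this sketch the heading would simply be the bearing to the next point.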

The items we are looking for might not be most visible from right in the middle of an intersection, so there is an option to specify a range around each intersection that we want to sample, along the heading. E.g. if we specify 20m, then we will sample the point in the middle of the intersection, plus the points at +/- 10m and +/- 20m. We use 10m intervals within this range because Google Street View typically gives a different image roughly every 10m.
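
As a rough illustration of the sampling range, the sketch below lists the offsets for a given margin and shifts a point along a heading. The names `sample_offsets` and `offset_point` are hypothetical, and the flat-earth approximation is an assumption chosen for brevity; the real walker may instead follow the geometry of the way.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, adequate for offsets of tens of metres

def sample_offsets(margin_m, step_m=10):
    """Offsets in metres from -margin_m to +margin_m inclusive, in steps of step_m."""
    return list(range(-margin_m, margin_m + 1, step_m))

def offset_point(lat, lon, heading_deg, distance_m):
    """Move distance_m along heading_deg (negative distance moves backward)."""
    h = math.radians(heading_deg)
    dlat = distance_m * math.cos(h) / EARTH_RADIUS_M
    dlon = distance_m * math.sin(h) / (EARTH_RADIUS_M * math.cos(math.radians(lat)))
    return lat + math.degrees(dlat), lon + math.degrees(dlon)

# Hypothetical usage: the five sample positions around one intersection
for offset in sample_offsets(20):
    print(offset, offset_point(-38.1611285, 145.1035642, 202, offset))
```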

This "ordered" version supersedes an earlier version. In OSM, a long street may be divided up into multiple connecting "ways", each with the same name. Rather than walking down ways in random order, we attempt to process ways in order of their name, and within each name we attempt to identify a logical order: start with a way whose first node is NOT an intersection, then find the next way with the same name that intersects with the end of the first, and so on. This isn't really necessary to generate a map of all the detections, but it is useful when visually inspecting the results to assess or measure their quality: it is less disorientating for a human to see the images in a logical "walking" order.
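
The ordering idea can be sketched as below. This is only an approximation under stated assumptions: ways are represented as plain dicts with hypothetical `id`, `name`, and `nodes` keys, and "intersects with the end of the first" is simplified to "the next way's first node equals the current way's last node". The function name `order_ways` is also hypothetical.

```python
from collections import defaultdict

def order_ways(ways, intersection_nodes):
    """Return ways grouped by name, chained end-to-start within each name."""
    by_name = defaultdict(list)
    for way in ways:
        by_name[way["name"]].append(way)

    ordered = []
    for name in sorted(by_name):
        remaining = list(by_name[name])
        while remaining:
            # Prefer a starting way whose first node is not an intersection
            current = next(
                (w for w in remaining if w["nodes"][0] not in intersection_nodes),
                remaining[0],
            )
            remaining.remove(current)
            ordered.append(current)
            # Follow the chain: next way starts where the current one ends
            while True:
                nxt = next(
                    (w for w in remaining if w["nodes"][0] == current["nodes"][-1]),
                    None,
                )
                if nxt is None:
                    break
                remaining.remove(nxt)
                ordered.append(nxt)
                current = nxt
    return ordered
```

If no way in a group starts at a non-intersection node, the sketch falls back to whichever way comes first, so every way is still visited exactly once.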

Once we have a list of points to sample, we output a batch "csv" containing the way id, the node id of the sample point, the offset in metres (e.g. "-20"), the latitude, longitude, and bearing.  This can then be used to download and cache Google Street View images, and process them with our detection model.
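
A minimal sketch of writing such a batch file with the standard `csv` module is shown below. The column names and the function `write_batch` are assumptions for illustration; the actual layout written by `gsv_loader.write_batch_file` may differ. The input rows follow the sample point structure `[lat, lon, bearing, offset, way_start_id, way_id, node_id]`.

```python
import csv
import io

def write_batch(stream, sample_points):
    """Write one CSV row per sample point (hypothetical column layout)."""
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(["way_id", "node_id", "offset_m", "lat", "lon", "bearing"])
    for lat, lon, bearing, offset, _way_start_id, way_id, node_id in sample_points:
        writer.writerow([way_id, node_id, offset, lat, lon, bearing])

# Hypothetical usage with an in-memory buffer instead of a file on disk
buffer = io.StringIO()
write_batch(buffer, [[-38.1611285, 145.1035642, 202, 0, "809849739", "770308568", "7190614708"]])
print(buffer.getvalue())
```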

## Configuration

Which "locality" do we wish to process?

This assumes that we can find a pair of OSM files with corresponding names, extracted with the "osmium" tool. One file follows the official shape of the locality, while the second follows a bounding box around the locality with a 200m margin. The second file ensures that when we look for intersections, we don't miss any because the intersecting road lies just outside the boundary of the locality (apart from the intersection itself).

In [1]:
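locality = 'Mount Eliza Sample'   # locality name, used to derive the OSM file names
margin   = 20                     # sample range around each intersection, in metres (used in output names)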

## Import required code

In [2]:
import os
import sys

# Make sure local modules can be imported
module_path_root = os.path.abspath(os.pardir)
if module_path_root not in sys.path:
    sys.path.append(module_path_root)
    
# Import local modules
import osm_gsv_utils.osm_walker as osm_walker
import osm_gsv_utils.gsv_loader as gsv_loader

## Identify sample points

Load the OSM data, and then generate lists of sample points at margins of 0, +/- 10m, and +/- 20m from each intersection,
and report on how many samples are found for each sample setting, to get an idea of the impact of increasing/decreasing
the margin.

In [3]:
# Derive paths for configuration

locality_clean = locality.replace(' ', '_')

filename_main       = os.path.join(os.pardir, 'data_sources', 'Locality_' + locality_clean + '.osm')
filename_margin     = os.path.join(os.pardir, 'data_sources', 'Locality_' + locality_clean + '_margin.osm')
locality_margin     = '{0:s}_{1:d}m'.format(locality_clean, margin)

detection_filename  = os.path.join(
    module_path_root,
    'detections',
    locality_margin,
    'detection_log.csv'
)

In [4]:
# Load OSM data
walker = osm_walker(filename_main, filename_margin, verbose=False)

# Generate sample lists with different margin settings, and report sample point count for each
sample_points_20 = walker.sample_all_way_intersections(-20, +20, 10, ordered=True, verbose=False)
sample_points_10 = walker.sample_all_way_intersections(-10, +10, 10, ordered=True, verbose=False)
sample_points_00 = walker.sample_all_way_intersections(  0,   0, 10, ordered=True, verbose=False)

print('+/- 20m: ' + str(len(sample_points_20)))
print('+/- 10m: ' + str(len(sample_points_10)))
print('+/- 00m: ' + str(len(sample_points_00)))

+/- 20m: 1851
+/- 10m: 1181
+/- 00m: 511


In [5]:
# Show sample structure of each point
# [lat, lon, bearing, offset, way_start_id, way_id, node_id]

print(sample_points_20[0])

[-38.1611285, 145.1035642, 202, 0, '809849739', '770308568', '7190614708']


## Download/Cache GSV images

Download GSV images (if not already cached)

In [6]:
download_directory = os.path.join(module_path_root, 'data_sources', 'gsv')
apikey_filename    = os.path.join(module_path_root, 'apikey.txt')
batch_filename     = os.path.join(module_path_root, 'batches', locality_clean + '_20m.csv')
output_geojson     = os.path.join(module_path_root, 'detections', locality_margin, 'detected_points.geojson')

# Initialise interface to Google Street View
gsv = gsv_loader(apikey_filename, download_directory)

# Create a batch file (CSV) with the list of locations to download from Google
# limit=0 means unlimited, set to a small integer to test a few downloads
gsv.write_batch_file(batch_filename, sample_points_20, limit=0)

Backup D:\TensorFlow2\TFODCourse\minor_thesis\batches\Mount_Eliza_Sample_20m.csv to D:\TensorFlow2\TFODCourse\minor_thesis\batches\Mount_Eliza_Sample_20m20210920_142048.csv
Write D:\TensorFlow2\TFODCourse\minor_thesis\batches\Mount_Eliza_Sample_20m.csv


In [7]:
# Process the batch file (with progress bar) and report how many images were fetched vs. skipped
# Working from a batch file means we have a permanent record of what was loaded (in case we need to resume later)
# and helps us implement a progress bar via tqdm
gsv.process_batch_file(batch_filename, progress=True, verbose=False)

  0%|          | 0/1851 [00:00<?, ?it/s]

GSV Cache Hits:       7404 Misses:          0
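
The cache hit count above is consistent with four images per sample point (1851 x 4 = 7404), which would match one request each for the forward, right, backward, and left views. The sketch below shows how four such headings could be derived from a sample point's bearing; the function name `view_headings` and the exact order of the views are assumptions, not taken from `gsv_loader`.

```python
def view_headings(bearing):
    """Headings for forward, right, backward, and left views, in degrees [0, 360)."""
    return [(bearing + turn) % 360 for turn in (0, 90, 180, 270)]

# Hypothetical usage with the bearing of the first sample point
print(view_headings(202))
```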


## Load Detections

In [8]:
# Load detections from Apply_Model.ipynb

walker.load_detection_log(detection_filename)

In [9]:
walker.draw_way_segment('809849739', intersection_skip_limit=1, verbose=True)

7190614708     0 => 0 1 0 -38.161128, 145.103564
638345374      1 => 1 0 2 -38.161668, 145.103287
113302198      2 => 0 0 2 -38.162533, 145.102853
367133525      3 => 1 1 0 -38.163992, 145.102131
30204320       4 => 0 0 1 -38.164266, 145.102000
638345569      5 => 0 0 1 -38.164538, 145.101855
30204321       6 => 0 0 1 -38.164804, 145.101766
638345571      7 => 0 0 1 -38.165148, 145.101686
768939206      8 => 1 1 0 -38.165326, 145.101653
30204322       9 => 0 0 1 -38.165519, 145.101643
638345574     10 => 0 0 1 -38.165842, 145.101664
638345577     11 => 0 0 1 -38.166244, 145.101691
30204323      12 => 1 1 0 -38.166706, 145.101747
607647718     13 => 1 1 0 -38.166812, 145.101755
30204324      14 => 0 0 1 -38.167470, 145.101785
638345589     15 => 0 0 1 -38.167824, 145.101774
638345592     16 => 0 0 1 -38.168130, 145.101712
30204325      17 => 0 0 1 -38.168518, 145.101619
638345603     18 => 0 0 1 -38.168994, 145.101463
30204326      19 => 0 0 1 -38.169395, 145.101265
638345610     20 => 

In [10]:
walker.write_detected_geojson(locality_margin, output_geojson, intersection_skip_limit=1, verbose=False)

Writing to: D:\TensorFlow2\TFODCourse\minor_thesis\detections\Mount_Eliza_Sample_20m\detected_points.geojson
