# Process Lane Detection Log

From all the stats in the "metadata.csv" produced by lane detection:

* For each point, find the corresponding "intersection" nodes in OSM for the way segment
    * Find the nearest way, excluding unnamed ways (roundabouts etc. that are really intersections themselves)
    * On that way, find the three nearest points
    * The nearest point is one intersection node
    * The second-nearest point is normally the other intersection node, but if it lies on the same side of the point as the nearest one (as in the diagram below), use the third-nearest instead
    
    |-------*-|-|
    
* If the distance from the previous point is less than a threshold number of metres, disregard the point as a duplicate that might otherwise skew our assessment, e.g. a car waiting at an intersection for many frames.

* What is the difference between the slopes of the potential bike lane/shoulder?

* What is the difference between the intercepts at the top line?
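The intersection lookup and the duplicate filter described in the list above could be sketched roughly as follows. This is an illustration under simplifying assumptions, not the logic inside `osm_walker`: intersections are represented by their distance in metres along a single way, and `bounding_intersections`, `haversine_m`, `drop_near_duplicates` and the 5 m default threshold are all hypothetical.

```python
import math


def bounding_intersections(point_pos, intersection_pos):
    """Return the two intersection positions that enclose point_pos.

    Positions are distances in metres along one way (an assumption made
    for this sketch). Returns None if no enclosing pair is found among
    the three nearest intersections.
    """
    nearest = sorted(intersection_pos, key=lambda p: abs(p - point_pos))[:3]
    if len(nearest) < 2:
        return None
    first = nearest[0]
    # Take the closest remaining candidate on the opposite side of the point,
    # so the second-nearest is skipped when it sits behind the nearest one.
    for other in nearest[1:]:
        if (first - point_pos) * (other - point_pos) <= 0:
            return tuple(sorted((first, other)))
    return None


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    r = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def drop_near_duplicates(points, min_distance_m=5.0):
    """Keep a (lat, lon) point only if it is at least min_distance_m
    from the last point that was kept."""
    kept = []
    for lat, lon in points:
        if not kept or haversine_m(*kept[-1], lat, lon) >= min_distance_m:
            kept.append((lat, lon))
    return kept
```

For the layout in the diagram, `bounding_intersections(80, [0, 90, 110])` skips the second-nearest intersection at 110 and pairs the nearest (90) with the third-nearest (0). Comparing against the last kept point, rather than the immediately preceding frame, is a design choice in this sketch so that slow drift while stationary does not accumulate into new points.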

==> Set thresholds for these

==> Count what proportion of points in a way section meet both criteria

==> Compare to ground truth for a suburb to select a score/proportion to use for a yes/no answer

==> Draw on a map
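As a rough sketch of the thresholding and counting steps above, the proportion of points per way section that meet both criteria could be computed with pandas. The column names `slope_diff` and `intercept_diff`, the function name, and the threshold values are hypothetical; the grouping columns `way_id_start`, `node_id1` and `node_id2` are the ones used for the summary statistics later in this notebook. The notes do not say whether the thresholds are upper or lower bounds, so this sketch assumes upper bounds.

```python
import pandas as pd


def proportion_meeting_criteria(df, max_slope_diff, max_intercept_diff):
    """Proportion of points in each way section that meet both thresholds.

    Assumes hypothetical columns 'slope_diff' and 'intercept_diff' and
    treats both thresholds as upper bounds on the absolute difference.
    Rows with missing values count as not meeting the criteria.
    """
    meets = (
        (df['slope_diff'].abs() <= max_slope_diff)
        & (df['intercept_diff'].abs() <= max_intercept_diff)
    )
    return (
        df.assign(meets=meets)
          .groupby(['way_id_start', 'node_id1', 'node_id2'])['meets']
          .mean()  # mean of a boolean column is the proportion of True values
          .rename('prop_meeting')
          .reset_index()
    )
```

The resulting `prop_meeting` column is the per-section score that would then be compared against ground truth to choose a cut-off for a yes/no answer.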

In [1]:
import_directory = 'dashcam_tour_mount_eliza'
#import_directory = 'dashcam_tour_frankston'

locality = 'Mount Eliza'
#locality = 'Frankston'
margin   = 20

In [2]:
import os
import sys
import shutil

from pathlib import Path

import pandas as pd
import numpy as np

from tqdm.notebook import tqdm, trange

from shapely.geometry import Point

module_path_root = os.path.abspath(os.pardir)
if module_path_root not in sys.path:
    sys.path.append(module_path_root)
    
# Import local modules
from osm_gsv_utils.osm_walker import osm_walker

In [3]:
# Derive paths
metadata_dir        = os.path.join(module_path_root, 'data_sources', import_directory, 'split')
metadata_csv_in     = os.path.join(metadata_dir, 'metadata.csv')
metadata_csv_out    = os.path.join(metadata_dir, 'metadata_out.csv')
metadata_csv_sum    = os.path.join(metadata_dir, 'metadata_with_summary.csv')

lanes_dir_in        = os.path.join(metadata_dir, 'lanes')
lanes_dir_out       = os.path.join(metadata_dir, 'lanes_filtered')

locality_clean = locality.replace(' ', '_')

filename_main       = os.path.join(os.pardir, 'data_sources', 'Locality_' + locality_clean + '.osm')
filename_margin     = os.path.join(os.pardir, 'data_sources', 'Locality_' + locality_clean + '_margin.osm')
locality_margin     = '{0:s}_{1:d}m'.format(locality_clean, margin)

In [4]:
# Load OSM data
walker = osm_walker(filename_margin, filename_margin, verbose=False)

  0%|          | 0/4274 [00:00<?, ?it/s]

  0%|          | 0/4274 [00:00<?, ?it/s]

In [5]:
# Find nearest intersection for each record in metadata.csv
walker.find_nearest_intersections_for_csv(metadata_csv_in, metadata_csv_out)

  0%|          | 0/13545 [00:00<?, ?it/s]

In [6]:
# Some images were filtered out because we couldn't find the two closest intersections,
# or we were too close to an intersection and wanted to avoid noise

# Create a folder containing only the "lanes" images that made the cut, for ease of browsing

# Create output directory for filtered lane images and delete any existing files
shutil.rmtree(lanes_dir_out, ignore_errors=True)
Path(lanes_dir_out).mkdir(parents=True, exist_ok=True)

# Copy every file that made the cut
df = pd.read_csv(metadata_csv_out)

# Iterate over the filenames directly rather than by index
for path in tqdm(df['filename']):
    filename = os.path.basename(path)
    shutil.copyfile(os.path.join(lanes_dir_in, filename), os.path.join(lanes_dir_out, filename))

  0%|          | 0/6942 [00:00<?, ?it/s]

## Pandas Stats

For each combination of way_id_start, node_id1, node_id2:

* Proportion where left_slope2 and left_slope1 are not None or 'None'
* Standard Deviation of intersection_x
* Average of intersection_y
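The three statistics listed above could be computed along these lines. This is a minimal sketch and not necessarily what `summarise_lane_detections_csv` does internally; the function name and the output column names `prop_detected`, `intersection_x_std` and `intersection_y_mean` are assumptions chosen for illustration.

```python
import pandas as pd


def summarise_sections(df):
    """Per-section summary for each (way_id_start, node_id1, node_id2)."""

    def has_value(col):
        # Treat both real nulls and the literal string 'None' as missing
        return col.notna() & (col.astype(str) != 'None')

    detected = has_value(df['left_slope1']) & has_value(df['left_slope2'])

    work = df.assign(
        detected=detected,
        # Coerce to numbers so that 'None' strings become NaN and are ignored
        intersection_x=pd.to_numeric(df['intersection_x'], errors='coerce'),
        intersection_y=pd.to_numeric(df['intersection_y'], errors='coerce'),
    )
    return (
        work.groupby(['way_id_start', 'node_id1', 'node_id2'])
            .agg(
                prop_detected=('detected', 'mean'),
                intersection_x_std=('intersection_x', 'std'),
                intersection_y_mean=('intersection_y', 'mean'),
            )
            .reset_index()
    )
```

Note that pandas uses the sample standard deviation by default, so a section with a single detection gets `NaN` for `intersection_x_std`.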

In [8]:
walker.summarise_lane_detections_csv(metadata_csv_out, metadata_csv_sum)

## Criteria

The following criteria appear to pick up most true bike lanes/paved shoulders without significant false positives:

* intersection_y_std < 50
* intersection_x_std < 50
* prop_missing < 0.2
* width_top_mean >= 75
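Applied to the summary data, these criteria might look like the sketch below. It assumes the summary has columns named exactly as in the criteria above, which has not been checked against `metadata_with_summary.csv`; the function name and the `likely_bike_lane` column are hypothetical.

```python
import pandas as pd


def flag_likely_bike_lanes(summary):
    """Add a boolean column marking sections that meet all four criteria."""
    meets = (
        (summary['intersection_y_std'] < 50)
        & (summary['intersection_x_std'] < 50)
        & (summary['prop_missing'] < 0.2)
        & (summary['width_top_mean'] >= 75)
    )
    # Comparisons against NaN evaluate to False, so sections with
    # missing statistics are never flagged
    return summary.assign(likely_bike_lane=meets)
```

A typical use would be `flag_likely_bike_lanes(pd.read_csv(metadata_csv_sum))`, again assuming the column names match.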

Perhaps also set a limit that the length of a LineString must 