# 20. Detect Paved Shoulders Dashcam

This is an updated version of notebook 12. This time we run a lane detection algorithm while splitting the video into frames, and store the slope and intercept of the detected lanes in metadata.csv.

Then we interpret those statistics to decide whether there is a paved shoulder (or bicycle lane) along each stretch of road, from intersection to intersection.

## Configuration

Any configuration required to run this notebook can be customized in the next cell.

In [1]:
# Name of a folder containing input MP4 videos and their corresponding NMEA files,
# imported from the dash camera.  This notebook will assume this folder is found
# inside the 'data_sources' folder.
# We will create a "split" subdirectory inside the import_directory, containing
# each of the images, plus a "metadata.csv" file to describe each of them in terms
# of latitude/longitude, altitude, and heading
import_directory = 'dashcam_tour_mount_eliza'
#import_directory = 'dashcam_tour_frankston'

# Required frames per second for output images, reduced down from 60 fps
output_fps = 5

# Which "locality" do we wish to process?
locality = 'Mount Eliza'

# We will sample the middle of each intersection, but we can also sample a
# "margin" around the intersection, at 10m intervals.
# E.g. if we set this to "20" then we will sample points at:
#    -20m, -10m, 0m, 10m, and 20m
# from the centre of the intersection, along the assumed bearing of the road
# Used here just to get the right filename
margin = 20
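
As a quick illustration of the `margin` setting, the sample offsets described in the comments above can be generated as follows. The 10 m step comes from those comments; the helper name `sample_offsets` is hypothetical and is not part of `osm_gsv_utils`.

```python
def sample_offsets(margin, step=10):
    """Return offsets in metres from -margin to +margin inclusive, in `step` increments."""
    return list(range(-margin, margin + 1, step))

print(sample_offsets(20))  # [-20, -10, 0, 10, 20]
```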

## Code

In [2]:
# General imports
import os
import sys
import shutil

from pathlib import Path

import pandas as pd
import numpy as np

from tqdm.notebook import tqdm, trange

from shapely.geometry import Point


module_path_root = os.path.abspath(os.pardir)
if module_path_root not in sys.path:
    sys.path.append(module_path_root)
    
# Import local modules
import osm_gsv_utils.dashcam_parser as dashcam_parser
import osm_gsv_utils.lane_detection as lane_detection
import osm_gsv_utils.osm_walker as osm_walker

In [3]:
# Derived paths

# Full path to the directory containing the MP4 videos and NMEA files
dashcam_dir = os.path.join(module_path_root, 'data_sources', import_directory)

# "Split" subdirectory where the output frames will be created, along with a "metadata.csv"
# with metadata about each frame, loaded and interpolated from the NMEA files
output_dir  = os.path.join(dashcam_dir, 'split')

# Configuration file to correct for optical distortion, created from calibration notebook 18
calibration_config = os.path.join(module_path_root, 'data_sources', 'dashcam_calibration.yml')

# Location of basic metadata.csv from the video-> image split process
metadata_dir        = os.path.join(module_path_root, 'data_sources', import_directory, 'split')
metadata_csv_in     = os.path.join(metadata_dir, 'metadata.csv')

# We read the basic metadata.csv and create a version where we have matched each point
# to the nearest intersection
metadata_csv_out    = os.path.join(metadata_dir, 'metadata_out.csv')

# We then make a final version where we have taken summary statistics for each segment of
# road along a way, from intersection to intersection, and join that back on as additional
# columns that can help us decide which segments of road appear to have a paved shoulder
metadata_csv_sum    = os.path.join(metadata_dir, 'metadata_with_summary.csv')

# Output geojson file with the paved shoulders we think we have detected
geojson_out         = os.path.join(metadata_dir, 'lanes.geojson')

# Directory where the lane detection process creates images with a detection overlay
lanes_dir_in        = os.path.join(metadata_dir, 'lanes')

# We filter out images that are too close to an intersection; this directory holds a copy
# of the lane detection overlay images that made the cut
lanes_dir_out       = os.path.join(metadata_dir, 'lanes_filtered')

# A version of the locality name with spaces replaced by underscores
locality_clean = locality.replace(' ', '_')

# Work out paths to OSM data
filename_main        = os.path.join(module_path_root, 'data_sources', 'Locality_' + locality_clean + '.osm')
filename_margin      = os.path.join(module_path_root, 'data_sources', 'Locality_' + locality_clean + '_margin.osm')
locality_margin     = '{0:s}_{1:d}m'.format(locality_clean, margin)

In [4]:
# Lane detector
ld = lane_detection(calibration_config=calibration_config)

# Initialise an object to parse dashcam footage and correlate it with NMEA data
# This time we are passing a lane detector model, which will run as we split the video into images
parser = dashcam_parser(source_fps=60, lane_detector=ld, write_lane_images=True)

In [5]:
# Split all videos in the directory at the required output frames per second
parser.split_videos(dashcam_dir, output_dir, output_fps=output_fps, suffix='MP4', verbose=False)

# First progress bar shows progress through the input video files
# Subsequent progress bar shows progress within an individual video

# Ignore occasional warning "processing line" due to missing fields in NMEA file.
# As long as there are only a few of these, the values will be interpolated
# from nearby entries.

  0%|          | 0/45 [00:00<?, ?it/s]

FILE210924-100801F:   0%|          | 0/3612 [00:00<?, ?it/s]

[43 further per-video progress bars omitted, FILE210924-100902F to FILE210924-105212F, each 0/3612; the truncated warning "], using previous values" appears four times among them]
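
The counts in this output are consistent with keeping every twelfth frame. Below is a minimal sketch of that arithmetic, assuming the parser keeps one frame out of every `source_fps / output_fps`; the actual selection logic lives inside `dashcam_parser` and is not shown in this notebook.

```python
source_fps = 60          # frame rate of the dashcam footage
output_fps = 5           # configured above
frames_per_video = 3612  # from the per-video progress bars
video_count = 45         # from the first progress bar

# Assumed sampling rule: keep one frame in every `frame_step`
frame_step = source_fps // output_fps
kept_per_video = frames_per_video // frame_step
total_kept = kept_per_video * video_count

print(frame_step, kept_per_video, total_kept)  # 12 301 13545
```

The total of 13545 agrees with the `0/13545` progress bar shown when `metadata.csv` is read back in cell 7.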

## Process Lane Detection Log

From all the stats in the "metadata.csv" produced by lane detection:

* For each point, find the two "intersection" nodes in OSM that bound its way segment
    * Find the nearest way, excluding unnamed ways (roundabouts etc. that are really intersections themselves)
    * On that way, find the three nearest intersection nodes
    * The nearest node is one end of the segment
    * The second-nearest node is normally the other end. If it lies on the same side of the point as the nearest node, as in the sketch below (`*` is the point, `|` is a node), use the third-nearest node instead
    
    |-------*-|-|
    
* If the distance from the previous point is less than a threshold number of metres, disregard the point as a duplicate that might otherwise skew our assessment, e.g. a car waiting at an intersection for many frames.

* What is the difference between the slopes of the two lines bounding the potential bike lane/shoulder?

* What is the difference between the intercepts at the top line?

==> Set thresholds for these

==> Count what proportion of points in a way section meet both criteria

==> Compare to ground truth for a suburb to select a score/proportion to use for a yes/no answer

==> Draw on a map
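
The first two steps in the list above can be sketched as follows. This is a simplified illustration, not the `osm_walker` implementation: node positions are reduced to one-dimensional distances along the way in metres, and the function names, the 1 m threshold, and the sample coordinates are all assumptions made for the example.

```python
import math

def bracketing_nodes(sample_pos, node_positions):
    """Pick the two nodes bounding a sample, given 1-D positions along a way in metres.

    Assumes at least one node. Returns (nearest, other); `other` is None if no
    node on the opposite side is found among the three nearest.
    """
    ranked = sorted(node_positions, key=lambda pos: abs(pos - sample_pos))[:3]
    nearest = ranked[0]
    for candidate in ranked[1:]:
        # Opposite side: the sample lies between `nearest` and `candidate`
        if (candidate - sample_pos) * (nearest - sample_pos) <= 0:
            return nearest, candidate
    return nearest, None

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    r = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def drop_near_duplicates(points, min_distance_m=1.0):
    """Drop points closer than `min_distance_m` to the previously kept point."""
    kept = []
    for lat, lon in points:
        if kept and haversine_m(*kept[-1], lat, lon) < min_distance_m:
            continue
        kept.append((lat, lon))
    return kept

# The layout from the sketch |-------*-|-| with nodes at 0, 10 and 12 m, sample at 8 m
print(bracketing_nodes(8, [0, 10, 12]))  # (10, 0)

# A stationary car produces repeated coordinates; only the first is kept
track = [(-38.19, 145.09), (-38.19, 145.09), (-38.1901, 145.09)]
print(len(drop_near_duplicates(track)))  # 2
```

Comparing against the previously kept point, rather than the immediately preceding raw point, means a car creeping forward very slowly is still collapsed to a single sample until it has moved past the threshold.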

In [6]:
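# Load OSM data
# Note: filename_margin is passed for both arguments, so filename_main (defined above) is unused here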
walker = osm_walker(filename_margin, filename_margin, verbose=False)

  0%|          | 0/4274 [00:00<?, ?it/s]

  0%|          | 0/4274 [00:00<?, ?it/s]

In [7]:
# Find nearest intersection for each record in metadata.csv
walker.find_nearest_intersections_for_csv(metadata_csv_in, metadata_csv_out)

  0%|          | 0/13545 [00:00<?, ?it/s]

In [8]:
# Some images were filtered out because we couldn't find the two closest intersections,
# or we were too close to an intersection and wanted to avoid noise

# Create a folder containing only the "lanes" images that were kept, for ease of browsing

# Create output directory for filtered lane images and delete any existing files
shutil.rmtree(lanes_dir_out, ignore_errors=True)
Path(lanes_dir_out).mkdir(parents=True, exist_ok=True)

# Copy every file that made the cut
df = pd.read_csv(metadata_csv_out)

for i in trange(0, len(df['filename'])):
    path = df['filename'][i]
    filename = os.path.basename(path)
    shutil.copyfile(os.path.join(lanes_dir_in, filename), os.path.join(lanes_dir_out, filename))

  0%|          | 0/6942 [00:00<?, ?it/s]

## Pandas Stats

For each combination of way_id_start, node_id1, node_id2:

* Proportion of points where left_slope1 and left_slope2 are both present (neither None nor the string 'None')
* Standard deviation of intersection_x
* Average of intersection_y
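
A minimal pandas sketch of these per-segment statistics is shown below. It is not the code inside `summarise_lane_detections_csv`; the input column names come from the list above, while the output names `prop_detected` and `intersection_y_mean` and all sample values are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative rows: one segment with three good detections out of four, one with none
df = pd.DataFrame({
    'way_id_start':   [1, 1, 1, 1, 2, 2],
    'node_id1':       [10, 10, 10, 10, 20, 20],
    'node_id2':       [11, 11, 11, 11, 21, 21],
    'left_slope1':    [0.5, 0.6, 'None', 0.55, 'None', 'None'],
    'left_slope2':    [0.9, 1.0, 'None', 0.95, 'None', 0.8],
    'intersection_x': [100.0, 110.0, np.nan, 105.0, np.nan, np.nan],
    'intersection_y': [200.0, 210.0, np.nan, 205.0, np.nan, np.nan],
})

# Coerce the literal string 'None' (and real missing values) to NaN
slopes = df[['left_slope1', 'left_slope2']].apply(pd.to_numeric, errors='coerce')
df['both_detected'] = slopes.notna().all(axis=1)

# One row per (way, node pair) with the three statistics
summary = (
    df.groupby(['way_id_start', 'node_id1', 'node_id2'])
      .agg(prop_detected=('both_detected', 'mean'),
           intersection_x_std=('intersection_x', 'std'),
           intersection_y_mean=('intersection_y', 'mean'))
      .reset_index()
)
print(summary)
```

Taking the mean of a boolean column gives the proportion of `True` values directly, which avoids a separate count and division.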

In [9]:
walker.summarise_lane_detections_csv(metadata_csv_out, metadata_csv_sum)

## Criteria

The following criteria appear to pick up most true bike lanes/paved shoulders without significant false positives:

* intersection_y_std < 50
* intersection_x_std < 50
* prop_missing < 0.2
* width_top_mean >= 75
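
Applied to one summary row, the four thresholds can be combined as in the sketch below. The function name `meets_criteria` and the sample values are hypothetical; the column names and thresholds are the ones listed above.

```python
def meets_criteria(segment):
    """Return True if a summarised road segment passes all four thresholds."""
    return (
        segment['intersection_y_std'] < 50
        and segment['intersection_x_std'] < 50
        and segment['prop_missing'] < 0.2
        and segment['width_top_mean'] >= 75
    )

# Illustrative values only, not taken from the real summary file
steady_lane = {'intersection_y_std': 12.0, 'intersection_x_std': 30.0,
               'prop_missing': 0.05, 'width_top_mean': 90.0}
patchy_lane = {'intersection_y_std': 12.0, 'intersection_x_std': 30.0,
               'prop_missing': 0.40, 'width_top_mean': 90.0}

print(meets_criteria(steady_lane))  # True
print(meets_criteria(patchy_lane))  # False
```

Because the function only indexes by key, it works equally on a plain dict or on a row of a pandas DataFrame.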

Perhaps also require a LineString to be at least 20 m long before it is drawn.
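
One way such a length limit might be checked is sketched below, assuming coordinates in GeoJSON order (longitude, latitude) and measuring length as the sum of great-circle distances between consecutive vertices. The helper names are hypothetical and this check is not part of `draw_lane_detections` as shown in this notebook.

```python
import math

def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two (lon, lat) points."""
    r = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def linestring_length_m(coords):
    """Total length in metres of a LineString given as [(lon, lat), ...]."""
    return sum(haversine_m(*a, *b) for a, b in zip(coords, coords[1:]))

def long_enough(coords, min_length_m=20.0):
    """Return True if the LineString is at least `min_length_m` metres long."""
    return linestring_length_m(coords) >= min_length_m

# Illustrative coordinates: roughly 11 m and 111 m of northward travel
short_line = [(145.09, -38.19), (145.09, -38.1901)]
long_line = [(145.09, -38.19), (145.09, -38.191)]
print(long_enough(short_line), long_enough(long_line))  # False True
```

Measuring in metres matters here because the `length` of a shapely geometry built from raw latitude/longitude values is expressed in degrees, not metres.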

In [10]:
walker.draw_lane_detections(metadata_csv_sum, geojson_out)

  0%|          | 0/52 [00:00<?, ?it/s]

Writing 52 features to: E:\Release\minor_thesis\data_sources\dashcam_tour_mount_eliza\split\lanes.geojson
