                                                Reflections




General Overview:

The approach I took for the pipeline was to select reasonable parameters for the Gaussian blur, Canny edge detection and Hough line voting procedures.  These parameters were similar to those that were effective for the practice problems, but adjusted for the specific images and videos presented in this problem set.

Once these preliminaries were performed, the resulting image was masked to a bounding isosceles trapezoid that contains the lanes.  The trapezoid is defined as proportions of the image size, as the position of the camera in the car relative to the lanes is reasonably invariant.  Doing it this way allowed the same routine to be used for the two test videos as well as the larger challenge video.
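A minimal sketch of the proportional trapezoid, assuming the same fractions that appear in process_image.  The function name, its parameters and the two frame sizes are only for illustration, and the right-hand corners are mirrored from the left ones so the shape stays symmetric.

```python
def trapezoid_vertices(width, height, bottom_inset=0.10, top_inset=0.45, top_y=0.50):
    """Return the four corners of an isosceles trapezoid scaled to the image."""
    left_bottom_x = int(bottom_inset * width)
    left_top_x = int(top_inset * width)
    top = int(top_y * height)
    # Mirror the left corners about the vertical centre line of the image.
    return [
        (left_bottom_x, height),
        (left_top_x, top),
        (width - left_top_x, top),
        (width - left_bottom_x, height),
    ]


# The same fractions give a consistent mask at two example frame sizes.
print(trapezoid_vertices(960, 540))
print(trapezoid_vertices(1280, 720))
```

To pass the result to cv2.fillPoly it would be wrapped as np.array(vertices, dtype=np.int32).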

The horizon is set closer than the actual horizon.  The reason for this is that there is a lot of contrast on the horizon, and this contributes to noise that interferes with lane line detection.  Choosing the minimum line length proved to be a balance: if it is too small, road noise is picked up; if it is too large, the broken road segments are harder to detect.  The same holds for the kernel size of the Gaussian blur: if it is too small, road noise contributes excessively, but if it is too large, features can be missed more easily.

For the left and right lane lines separately, I check whether the slope of each line returned from the above meets a reasonable criterion.  If so, the line joins a pool of lines for that side, and the pool is used for two things:  computing the mean slope for the side, and determining the nearest point to the camera.  The reason for this choice is that it seemed like a reasonable rubric for proper operation -- the farther out points are less important since they are farther away.  I looked at lines using both the nearest and farthest points as endpoints, but in practice this did not work as well, at least in my opinion.
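The pooling step can be sketched in plain Python as below.  This is an illustration rather than the notebook's code: pool_segments and its parameter names are made up for the example, and the slope band of 0.5 to 1.0 and the half-height cutoff are taken from draw_lines.

```python
def pool_segments(segments, width, height, min_abs_slope=0.5, max_abs_slope=1.0):
    """Return {side: (mean slope, nearest point) or None} for left and right."""
    pools = {"left": [], "right": []}
    for x1, y1, x2, y2 in segments:
        if min(y1, y2) < 0.5 * height:
            continue  # far-away segments are noisy
        if x1 == x2:
            continue  # vertical segment: slope undefined
        m = (y2 - y1) / (x2 - x1)
        if not (min_abs_slope < abs(m) < max_abs_slope):
            continue
        # Image y grows downward, so the left lane line has negative slope.
        if m < 0 and x1 < width / 2:
            pools["left"].append((m, x1, y1, x2, y2))
        elif m > 0 and x1 > width / 2:
            pools["right"].append((m, x1, y1, x2, y2))

    result = {}
    for side, pool in pools.items():
        if not pool:
            result[side] = None
            continue
        mean_slope = sum(entry[0] for entry in pool) / len(pool)
        # The point nearest the camera is the endpoint with the largest y.
        endpoints = [(x, y) for _, xa, ya, xb, yb in pool for x, y in ((xa, ya), (xb, yb))]
        result[side] = (mean_slope, max(endpoints, key=lambda p: p[1]))
    return result


segments = [
    (200, 500, 400, 350),   # left lane
    (220, 520, 380, 400),   # left lane
    (600, 350, 800, 500),   # right lane
    (100, 300, 500, 300),   # horizontal clutter, rejected by the slope test
]
pooled = pool_segments(segments, width=960, height=540)
print(pooled["left"], pooled["right"])
```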

The slope and the nearest point are smoothed with a moving average filter over a specified number of samples (the code implements this as a recursive running average that weights the previous value by the sample count).  This smooths the jumpiness that is otherwise present.  Also, if a slope matching the requisite conditions is not detected, it simply reuses the last moving average.  This gives much better general performance, but it comes at the expense of more difficulty getting started.  The reason is that the filter's initial conditions have to be satisfied over the first few frames.  The right way to do that is to have a sequence of stored values that are pre-initialized to some sane value.  What I chose to do instead was ignore the first frames, by invoking the pipeline frame by frame and skipping the first few when writing the output.  The goal was steady state performance at the expense of how it starts.  There are two video writing routines in the code: run_video2 truncates the first few frames and run_video does not.
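A minimal sketch of the pre-initialized alternative mentioned above, assuming a windowed simple moving average rather than the recursive form in draw_lines.  The class name, the window size and the seed value of -0.75 (the middle of the left-lane slope band) are assumptions for illustration only.

```python
from collections import deque


class SeededMovingAverage:
    """Simple moving average over the last `size` samples, pre-filled with a seed."""

    def __init__(self, size, seed):
        self.window = deque([seed] * size, maxlen=size)

    def update(self, value=None):
        """Add a measurement, or pass None to hold the current average."""
        if value is not None:
            self.window.append(value)  # the oldest sample drops out automatically
        return sum(self.window) / len(self.window)


slope_filter = SeededMovingAverage(size=4, seed=-0.75)
first = slope_filter.update(-0.71)   # one new sample blended with three seeds
held = slope_filter.update(None)     # no detection: the average is unchanged
print(first, held)
```

Because the window starts full, the first frames already produce a usable estimate, so no frames would need to be dropped from the output.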



What it does not do well / how to improve:

It really should have a better solution for the moving average filter.  While getting to steady state with an MA filter is always troublesome, chopping frames off is probably not the right solution.  It is a little disingenuous, I suppose.

The method should be more restrictive in terms of color: not just RGB color, but probably something more along the lines of hue banding.  This would help reduce the impact of low contrast areas like the concrete section of lane in the challenge, and reduce the effect of things like trees and their shadows.  Those issues seem to account for most of the problem in the challenge, and much of the solution involves proper color pre-processing.
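As a rough illustration of hue banding, the sketch below classifies a single RGB pixel using the standard library's colorsys module.  The function name and every threshold are guesses rather than tuned values.  In the actual pipeline the same idea would more likely be vectorized, for example with cv2.cvtColor(img, cv2.COLOR_RGB2HLS) followed by cv2.inRange, keeping in mind that OpenCV stores 8-bit hue in the range 0 to 179.

```python
import colorsys


def is_lane_color(r, g, b):
    """Return True if an 8-bit RGB pixel looks like yellow or white lane paint."""
    h, l, s = colorsys.rgb_to_hls(r / 255.0, g / 255.0, b / 255.0)
    hue_deg = h * 360.0
    # White paint: very bright, hue is irrelevant. Threshold is illustrative.
    is_white = l > 0.80
    # Yellow paint: a hue band around yellow, reasonably saturated and not too dark.
    is_yellow = 40.0 <= hue_deg <= 70.0 and s > 0.40 and l > 0.30
    return is_white or is_yellow


print(is_lane_color(255, 220, 0))     # yellow paint
print(is_lane_color(250, 250, 250))   # white paint
print(is_lane_color(150, 150, 150))   # grey concrete
print(is_lane_color(40, 120, 40))     # tree foliage
```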

Another problem is the relative performance of the solid line versus the dashed line.  The dashed line has fewer line samples and more variability, and consequently does not track as well.  I am not sure that the two can be made to perform the same, as there is no real solution to the underlying difference between them, but tracking of the dashed line can certainly be improved.

It would be better if there were an initial step that could help identify the active area for the lane.  Perhaps this could be done on the fly based on the detected lane.  For example, take the detected lane, expand it a bit, and use that as the active area for subsequent lane detection.  I suspect this works well when it works and is terrible when it gets into a situation where it doesn't work properly.  It needs to be done, though, as it is not reasonable to assume some car and camera geometry as invariant -- it needs to be dynamic.  One reason this is important can be seen in the challenge.  That car in the right lane on the curve interferes with the little isosceles trapezoid bound I use.  Camera position and such may be reasonable invariants, but road curve isn't.
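One way the expand-the-detected-lane idea might look is sketched below.  Nothing here is in the notebook: lane_roi, the margin and the fallback argument are hypothetical, and falling back to a fixed region when a side is missing is my assumption about how to recover from the failure case.

```python
def lane_roi(left_line, right_line, margin, fallback):
    """Build a four-corner region around two lane lines, widened by `margin` pixels.

    Each line is ((x_near, y_near), (x_far, y_far)). If either line is missing,
    return the fixed `fallback` region instead.
    """
    if left_line is None or right_line is None:
        return fallback
    (lx_near, y_near), (lx_far, y_far) = left_line
    (rx_near, _), (rx_far, _) = right_line
    return [
        (lx_near - margin, y_near),
        (lx_far - margin, y_far),
        (rx_far + margin, y_far),
        (rx_near + margin, y_near),
    ]


fixed = [(96, 540), (432, 270), (528, 270), (864, 540)]
left = ((180, 540), (430, 351))
right = ((860, 540), (540, 351))
print(lane_roi(left, right, margin=40, fallback=fixed))
print(lane_roi(None, right, margin=40, fallback=fixed))
```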

Another thing it does not do well is track curving roads.  The way I implemented it, it is basically a moving average slope and a moving average nearest point.  From what I have seen this keeps it tangent to the line nearest you fairly well.  This works reasonably well for straight line roads, but it is not as good a model for curved roads.  It would be better to fit a higher order function, spline, etc., to a number of line segment centers in order to better estimate the road curvature.
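A minimal sketch of the higher order fit, assuming a quadratic and using np.polyfit.  It fits x as a function of y, since lane lines run mostly up the image, which matches the inverted slope used in draw_lines.  The function name and the sample centers are made up for the example; the centers are generated from a known curve only so the fit can be checked.

```python
import numpy as np


def fit_lane_curve(centers, degree=2):
    """Fit x = f(y) through (x, y) segment centers and return a callable polynomial."""
    xs = np.array([c[0] for c in centers], dtype=float)
    ys = np.array([c[1] for c in centers], dtype=float)
    return np.poly1d(np.polyfit(ys, xs, degree))


# Synthetic centers lying on x = 0.001*y**2 - 1.2*y + 700.
centers = [(402.5, 350), (380.0, 400), (362.5, 450), (350.0, 500), (343.6, 540)]
curve = fit_lane_curve(centers)

# Evaluate the fitted lane position at an image row between the samples.
x_at_470 = float(curve(470))
print(round(x_at_470, 3))
```

The fitted polynomial can then be sampled at several rows and drawn as a polyline instead of a single straight segment.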



In [88]:
import os
import math
import cv2
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
from moviepy.editor import VideoFileClip, ImageSequenceClip
from IPython.display import HTML
%matplotlib inline



# Mundane functions
def grayscale(img):
    # mpimg.imread and moviepy both supply RGB frames, so convert from RGB
    return cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
    
def canny(img, low_threshold, high_threshold):
    return cv2.Canny(img, low_threshold, high_threshold)

def gaussian_blur(img, kernel_size):
    return cv2.GaussianBlur(img, (kernel_size, kernel_size), 0)

def region_of_interest(img, vertices):
    mask = np.zeros_like(img)   
    if len(img.shape) > 2:
        channel_count = img.shape[2]  
        ignore_mask_color = (255,) * channel_count
    else:
        ignore_mask_color = 255
    cv2.fillPoly(mask, [vertices], ignore_mask_color)
    return cv2.bitwise_and(img, mask)

def hough_lines(img, rho, theta, threshold, min_line_len, max_line_gap):
    lines = cv2.HoughLinesP(img, rho, theta, threshold, np.array([]), minLineLength=min_line_len, maxLineGap=max_line_gap)
    line_img = np.zeros(img.shape, dtype=np.uint8)
    return draw_lines(line_img, lines)
   
def weighted_img(img, initial_img, α=1.0, β=1., λ=0.):
    return cv2.addWeighted(initial_img, α, img, β, λ)

def image_list():
    return os.listdir('test_images/')

def video_list():
    return os.listdir('test_videos/')

def load_image(image_name):
    return mpimg.imread('test_images/' + image_name)

def load_video(video_name):
    return VideoFileClip('test_videos/' + video_name)

def save_image(image, image_name):
    plt.imshow(image)  #cmap='gray'
    plt.savefig('test_images_output/' + image_name, bbox_inches='tight')

def save_video(video, video_name):
    video.write_videofile('test_videos_output/' + video_name, audio=False)
    
def run_image(image_name):
    image = load_image(image_name)
    processed_image = process_image(image)
    save_image(processed_image, image_name)
    
def run_video(video_name):
    video = load_video(video_name)
    processed_video = video.fl_image(process_image)
    save_video(processed_video, video_name)
    
    
def run_video2(video_name):
    clip = load_video(video_name)
    frame_count = 0
    new_frames = []
    for frame in clip.iter_frames():
        frame_count += 1
        new_frame = process_image(frame)
        if (frame_count > 40):
            new_frames.append(new_frame)
    new_clip = ImageSequenceClip(new_frames, fps = clip.fps)
    save_video(new_clip, video_name)
    
    
    
# Global vars, yuck    
x_near_left_old = 0
x_far_left_old = 0
x_near_right_old = 0 
x_far_right_old = 0

slope_left_old = -1.0
slope_right_old = 1.0
    
    
    
# Functions that matter
def draw_lines(img, lines, color=[255, 0, 0], thickness=12, samples = 8):
   
    global x_near_left_old
    global x_far_left_old
    global x_near_right_old
    global x_far_right_old
    
    global slope_left_old
    global slope_right_old
    
    height = img.shape[0]
    width = img.shape[1]
    
    slope_left_sum = 0
    slope_left_count = 0
    slope_right_sum = 0
    slope_right_count = 0
    
    nearest_left_x = 0
    nearest_left_y = 0
    nearest_right_x = 0
    nearest_right_y = 0
        
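    # HoughLinesP returns None when no segments are found
    if lines is None:
        lines = []

    for line in lines:
        for x1,y1,x2,y2 in line:
            
            # Discount things far away as they are noisy
            if ((y1 < int(0.5 * height)) or (y2 < int(0.5 * height))):
                continue
            
            # Skip vertical segments to avoid dividing by zero
            if x2 == x1:
                continue
            
            # Slope
            m = (y2 - y1) / (x2 - x1)
            
            # Left - constrain slope and horizontal position
            if ((m < -0.5) and (m > -1.0) and (x1 < int(0.5 * width))):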
                slope_left_count += 1
                slope_left_sum += m
                if y1 > nearest_left_y:
                    nearest_left_y = y1
                    nearest_left_x = x1
                if y2 > nearest_left_y:
                    nearest_left_y = y2
                    nearest_left_x = x2
             
            # Right - constrain slope and horizontal position
            if ((m > 0.5) and (m < 1.0) and (x1 > int(0.5 * width))):
                slope_right_count += 1
                slope_right_sum += m
                if y1 > nearest_right_y:
                    nearest_right_y = y1
                    nearest_right_x = x1
                if y2 > nearest_right_y:
                    nearest_right_y = y2
                    nearest_right_x = x2
             
    if slope_left_count > 0:
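        slope_left_avg = slope_left_sum / slope_left_count
        # Invert to dx/dy so x can be extrapolated from y
        slope_left_avg = 1 / slope_left_avg
        
        # Recursive running average against the previous frame's value
        slope_left_avg = (slope_left_avg + (samples - 1) * slope_left_old ) / samples
        slope_left_old = slope_left_avg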
        
        x_near = int(nearest_left_x + slope_left_avg * (height - nearest_left_y))
        x_far = int(nearest_left_x - slope_left_avg * (nearest_left_y - int(0.65 * height)))
        
        x_near = int((x_near + (samples - 1) * x_near_left_old) / samples)
        x_near_left_old = x_near
        x_far = int((x_far + (samples - 1) * x_far_left_old) / samples)
        x_far_left_old = x_far
        
        cv2.line(img, (x_near, height), (x_far, int(0.65 * height)), color, thickness)
    else:
        cv2.line(img, (x_near_left_old, height), (x_far_left_old, int(0.65 * height)), color, thickness)
    
    if slope_right_count > 0:
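        slope_right_avg = slope_right_sum / slope_right_count
        # Invert to dx/dy so x can be extrapolated from y
        slope_right_avg = 1 / slope_right_avg
        
        # Recursive running average against the previous frame's value
        slope_right_avg = (slope_right_avg + (samples - 1) * slope_right_old ) / samples
        slope_right_old = slope_right_avg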
        
        x_near = int(nearest_right_x + slope_right_avg * (height - nearest_right_y))
        x_far = int(nearest_right_x - slope_right_avg * (nearest_right_y - int(0.65 * height)))
        
        x_near = int((x_near + (samples - 1) * x_near_right_old) / samples)
        x_near_right_old = x_near
        x_far = int((x_far + (samples - 1) * x_far_right_old) / samples)
        x_far_right_old = x_far
        
        cv2.line(img, (x_near, height), (x_far, int(0.65 * height)), color, thickness)
    else:
        cv2.line(img, (x_near_right_old, height), (x_far_right_old, int(0.65 * height)), color, thickness)

    return img


    
# Main image processing pipeline    
def process_image(image):
    height = image.shape[0]
    width = image.shape[1]
    g = grayscale(image)
    b = gaussian_blur(g, 15)
    c = canny(b, 40, 120)
    h = hough_lines(c, 1, 0.01, 10, 25, 30)  
    # cv2.fillPoly expects 32-bit integer points
    v = np.array([ [int(0.1 * width), height], [int(0.45 * width), int(0.5 * height)], [int(0.55 * width), int(0.5 * height)], [int(0.9 * width), height] ], dtype=np.int32)
    m = np.zeros_like(h) 
    ch = np.dstack( (h, m, m) )
    r = region_of_interest(ch, v)
    return weighted_img(r, image)
    

# Videos
for video_name in video_list():
    run_video2(video_name)
    

[MoviePy] >>>> Building video test_videos_output/solidYellowLeft.mp4
[MoviePy] Writing video test_videos_output/solidYellowLeft.mp4


100%|██████████| 641/641 [00:06<00:00, 101.97it/s]


[MoviePy] Done.
[MoviePy] >>>> Video ready: test_videos_output/solidYellowLeft.mp4 

[MoviePy] >>>> Building video test_videos_output/solidWhiteRight.mp4
[MoviePy] Writing video test_videos_output/solidWhiteRight.mp4


100%|██████████| 182/182 [00:01<00:00, 100.10it/s]


[MoviePy] Done.
[MoviePy] >>>> Video ready: test_videos_output/solidWhiteRight.mp4 

[MoviePy] >>>> Building video test_videos_output/challenge.mp4
[MoviePy] Writing video test_videos_output/challenge.mp4


100%|██████████| 211/211 [00:04<00:00, 50.12it/s]


[MoviePy] Done.
[MoviePy] >>>> Video ready: test_videos_output/challenge.mp4 

