Finding Lane Lines on the Road
The goal of this project is to build a pipeline that finds lane lines on the road. Either images or video can be used as input to test the pipeline. The project is done in Python with the OpenCV library and can be opened in a Jupyter Notebook.
A digest and discussion can be found in this blog post.
My pipeline consists of 10 steps:
- Reading image or video frame
- Filtering white and yellow colors
- Conversion to grayscale
- Gaussian blurring
- Edge detection
- Region of interest definition
- Hough lines detection
- Filtering Hough lines
- Averaging line segments
- Applying moving average on final lines
The main function processing an image takes its path as an argument. Below, the draw_lanes_image function is called for each image found in a given directory.
import os
import matplotlib.image as mpimg

def draw_lanes_image(imageName):
    image = mpimg.imread(imageName)
    # ... the pipeline steps described below are applied to `image`

for img in os.listdir("test_images/"):
    draw_lanes_image("test_images/" + img)
Below, there are 3 examples of loaded images. Later, after each step, intermediate results will be shown for these sample images. The third one is the most demanding to process, as it contains shadows and the contrast between the yellow line and the road is very small.
This step wouldn't be necessary for the first two, easier images. In the third example, however, proceeding directly to the next step (grayscale conversion) would produce very similar gray values for the yellow lane and the bright road. We would like to differentiate these two objects somehow, hence the idea of initially filtering the two key colors which are the main components of road lanes. First, the image is converted to the HSL color space. The HSL (Hue, Saturation, Lightness) color space is based on human color perception, which makes it easier to distinguish the desired colors (yellow and white) than in RGB space, even when there are shadows in the image. The code below is inspired by a similar project.
import cv2
import numpy as np

def convert_hls(img):
    return cv2.cvtColor(img, cv2.COLOR_RGB2HLS)

def mask_white_yellow(image):
    converted = convert_hls(image)
    # white color mask: high lightness in the L channel
    lower = np.uint8([0, 200, 0])
    upper = np.uint8([255, 255, 255])
    white_mask = cv2.inRange(converted, lower, upper)
    # yellow color mask: hue between 10 and 40, saturation at least 100
    lower = np.uint8([10, 0, 100])
    upper = np.uint8([40, 255, 255])
    yellow_mask = cv2.inRange(converted, lower, upper)
    # combine the masks and apply to the original image
    mask = cv2.bitwise_or(white_mask, yellow_mask)
    whiteYellowImage = cv2.bitwise_and(image, image, mask=mask)
    return whiteYellowImage

whiteYellowImage = mask_white_yellow(image)
To extract white, I kept only pixels with high lightness in the "L" component of the HSL color space. For yellow lanes, I chose the hue to be around 30 to select yellow, and filtered for quite high saturation. Below, there are test images after such filtering.
As in many computer vision applications, the image is converted to grayscale, mainly for the simplicity and speed of further operations. For instance, edge detectors look for large gradients between adjacent pixels, so it is easier to compare pixels in only one dimension (grayscale) than in the RGB or HSL color spaces.
def grayscale(img):
    return cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)

grayImage = grayscale(whiteYellowImage)
To suppress noise and spurious gradients, Gaussian smoothing is applied. A kernel of size 5 was chosen by experiment. Again, this is preparation for the edge detection step: borders between the lane and the road may not be smooth, and we don't want the edge detector to classify such regions as additional lines.
def gaussian_blur(img, kernel_size):
    return cv2.GaussianBlur(img, (kernel_size, kernel_size), 0)

blurredImage = gaussian_blur(grayImage, 5)
To detect edges, let's use the popular Canny method. It is called with two parameters, low and high thresholds, which should be found by trial and error. According to the OpenCV documentation:
- If a pixel gradient is higher than the upper threshold, the pixel is accepted as an edge.
- If a pixel gradient is below the lower threshold, it is rejected.
- If a pixel gradient is between the two thresholds, it is accepted only if it is connected to a pixel that is above the upper threshold.
Canny recommended an upper:lower ratio between 2:1 and 3:1. I chose a lower threshold of 40 and an upper threshold of 80. Below, there are outputs of this operation.
def canny(img, low_threshold, high_threshold):
    return cv2.Canny(img, low_threshold, high_threshold)

edgesImage = canny(blurredImage, 40, 80)
To filter out unnecessary objects in the image, a region of interest is defined. Such a mask (here a trapezoid) is then applied to the working image.
def region_of_interest(img, vertices):
    mask = np.zeros_like(img)
    if len(img.shape) > 2:
        channel_count = img.shape[2]  # i.e. 3 or 4 depending on the image
        ignore_mask_color = (255,) * channel_count
    else:
        ignore_mask_color = 255
    cv2.fillPoly(mask, vertices, ignore_mask_color)
    masked_image = cv2.bitwise_and(img, mask)
    return masked_image

imgHeight, imgWidth = edgesImage.shape[:2]
yTopMask = imgHeight * 0.55
vertices = np.array([[0, imgHeight], [imgWidth * 0.45, yTopMask],
                     [imgWidth * 0.55, yTopMask], [imgWidth, imgHeight]], np.int32)
maskedImage = region_of_interest(edgesImage, [vertices])
Now, having edges detected in our region of interest, all straight lines need to be identified. This is done by the Hough transform. This operation has quite a few parameters which need to be tuned experimentally. At a high level, they define how long or how "straight" a sequence of pixels should be to be classified as one line.
rho = 2                # distance resolution of the accumulator (pixels)
theta = np.pi / 180    # angle resolution of the accumulator (radians)
threshold = 15         # minimum number of votes to accept a line
min_line_length = 15   # minimum length of a detected segment (pixels)
max_line_gap = 5       # maximum gap between segments to link them
houghLines = cv2.HoughLinesP(maskedImage, rho, theta, threshold, np.array([]),
                             minLineLength=min_line_length, maxLineGap=max_line_gap)
Below, there are test images with the found lines plotted.
As we can see above, some line segments are unwanted, for example small horizontal lines or lines appearing on cars inside the region of interest. Therefore, for each Hough line we calculate a slope parameter. After some experimentation, only lines with slopes between 17 and 56 degrees were kept for further analysis (this corresponds to tangent values between 0.3 and 1.5).
linesFiltered = []
for line in houghLines:
    for x1, y1, x2, y2 in line:
        a = float((y2 - y1) / (x2 - x1))
        # skip vertical (inf/nan) and horizontal (zero-slope) segments
        if not (np.isnan(a) or np.isinf(a) or a == 0):
            if -1.5 < a < -0.3:
                linesFiltered.append(line)
            if 0.3 < a < 1.5:
                linesFiltered.append(line)
Below, there are only filtered Hough lines.
All found Hough lines should now be averaged/extrapolated to produce only two lines representing the lanes. The first task is to divide the lines into two groups (left and right, deduced from the slope sign). Then, one can fit a line to the points of the segments or take an average of the lines. I decided to apply a weighted average to calculate the resulting slopes and intercepts, with the lengths of the line segments serving as weights: the longer a segment is, the more influence it has on the result. In addition, to amplify the importance of segment length, the weight is the square of the length.
import math

cumLengthLeft, cumLengthRight = 0, 0
a_left, b_left, a_right, b_right = 0, 0, 0, 0
for line in houghLines:
    for x1, y1, x2, y2 in line:
        a = float((y2 - y1) / (x2 - x1))
        b = y1 - a * x1
        length = math.sqrt(pow(y2 - y1, 2) + pow(x2 - x1, 2))
        if not (np.isnan(a) or np.isinf(a) or a == 0):
            if -1.5 < a < -0.3:   # left lane: negative slope
                cumLengthLeft += pow(length, 2)
                a_left += a * pow(length, 2)
                b_left += b * pow(length, 2)
            if 0.3 < a < 1.5:     # right lane: positive slope
                cumLengthRight += pow(length, 2)
                a_right += a * pow(length, 2)
                b_right += b * pow(length, 2)
if cumLengthLeft != 0:
    a_left /= cumLengthLeft
    b_left /= cumLengthLeft
if cumLengthRight != 0:
    a_right /= cumLengthRight
    b_right /= cumLengthRight
The resulting parameters are a_left, b_left, a_right and b_right. Now, having previously defined the y positions of the output lines, we can calculate their x coordinates, which gives us the full information about the points defining the final lines.
if a_left != 0:
    x1_left = int((y_max - b_left) / a_left)
    x2_left = int((y_min - b_left) / a_left)
if a_right != 0:
    x1_right = int((y_max - b_right) / a_right)
    x2_right = int((y_min - b_right) / a_right)
The final output of the pipeline is shown below.
While running the pipeline on a video stream, we can observe that the lines flicker. To avoid this, we can apply a moving average to the line parameters: for each frame, the current result is blended with the accumulated history, so that more recent results carry more weight. By keeping the last averaged parameters in memory, we can also fall back on them when no line is found in a frame by some mistake. A possible implementation, giving weight 0.9 to the stored parameters and 0.1 to the current frame, is as follows.
def get_averaged_line_params(lineParams, leftHoughLinesExist, rightHoughLinesExist):
    global aLeftStored
    global bLeftStored
    global aRightStored
    global bRightStored
    param1 = 0.9  # weight of the stored (historical) value
    param2 = 0.1  # weight of the current frame's value
    a_left, b_left, a_right, b_right = lineParams
    # on the very first frame, initialize the stored values
    if aLeftStored == 0:
        aLeftStored = a_left
    if bLeftStored == 0:
        bLeftStored = b_left
    if aRightStored == 0:
        aRightStored = a_right
    if bRightStored == 0:
        bRightStored = b_right
    if not leftHoughLinesExist:
        # no left line found in this frame - reuse the stored parameters
        a_left = aLeftStored
        b_left = bLeftStored
    else:
        a_left = aLeftStored * param1 + a_left * param2
        aLeftStored = a_left
        b_left = bLeftStored * param1 + b_left * param2
        bLeftStored = b_left
    if not rightHoughLinesExist:
        # no right line found in this frame - reuse the stored parameters
        a_right = aRightStored
        b_right = bRightStored
    else:
        a_right = aRightStored * param1 + a_right * param2
        aRightStored = a_right
        b_right = bRightStored * param1 + b_right * param2
        bRightStored = b_right
    return [a_left, b_left, a_right, b_right]
lineParams = [a_left, b_left, a_right, b_right]
lineParams = get_averaged_line_params(lineParams, leftHoughLinesExist,
                                      rightHoughLinesExist)
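To see the smoothing effect in isolation, here is a small pure-Python simulation of the same 0.9/0.1 update applied to a noisy slope sequence (the per-frame values are made up for illustration):

```python
# exponential smoothing of a noisy slope estimate, as in get_averaged_line_params
noisy_slopes = [0.70, 0.90, 0.65, 0.95, 0.72, 0.88, 0.68, 0.92]  # raw per-frame values

smoothed = []
stored = noisy_slopes[0]             # initialize with the first measurement
for a in noisy_slopes:
    stored = 0.9 * stored + 0.1 * a  # heavy weight on history damps the flicker
    smoothed.append(stored)

raw_spread = max(noisy_slopes) - min(noisy_slopes)
smoothed_spread = max(smoothed) - min(smoothed)
```

The smoothed sequence varies far less between frames than the raw one, which is exactly what removes the visible flicker in the video output.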
One potential shortcoming of the described pipeline is what happens when another car appears in our region of interest: it could produce lines that get identified as lanes. For practical use, there may also be too many parameters fixed by a priori assumptions, such as the lane colors or the strictly defined region of interest. The pipeline probably would not work well when the lanes are curvy or when white/yellow signs are painted flat on the road.
A possible improvement would be to use some kind of higher-order polynomial fit to handle curvy lanes. Also, instead of guessing and experimenting with the many parameters used here, we could automate the parameter search: given the desired output (ground truth) for many images, we could process the inputs with different parameter combinations and optimize for the best set. Some machine learning methods could also be used to learn the specific shapes of lanes and then detect them. That could be necessary when the road has no lane markings at all, where more complicated scene understanding algorithms would be required.
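The automated parameter search mentioned above could start as a simple grid search. The following is only a sketch: score_pipeline is a hypothetical placeholder for a metric that would compare the pipeline's output against ground-truth lane annotations, and the scoring formula below is purely illustrative.

```python
from itertools import product

def score_pipeline(low_thr, high_thr, hough_thr):
    # placeholder metric (an assumption): a real one would run the pipeline
    # with these parameters and measure overlap with labeled lane lines;
    # here it simply peaks at the hand-tuned values from this write-up
    return -abs(low_thr - 40) - abs(high_thr - 80) - abs(hough_thr - 15)

# grid over candidate Canny low/high thresholds and Hough vote thresholds
grid = product([20, 40, 60], [60, 80, 100], [10, 15, 20])
best = max(grid, key=lambda p: score_pipeline(*p))
```

With a real scoring function, the same loop (or a smarter optimizer) would select the parameter set that best reproduces the ground truth across the test images.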