## Lab 8: Line detection on the Raspberry Pi 3 B+

Written by: Enrique Mireles Gutiérrez  
ID Number: 513944  
Bachelor: ITR  
Date: 2019-04-01  

### Introduction

Until now, all proposed algorithms throughout lab reports 2 and 7 have been running in a laptop using a last generation processor. However, since the target device for the autonomous vehicle will be a Raspberry Pi 3 B+, it makes perfect sense to start migrating to this platform. Some key aspects must be taken into account when migrating. 

The processing power of both devices is really different. This means that some performance measurements must done in order to create a baseline to further optimize the algorithms. This will help maintain a reliable frame rate for autonomous diving. Other aspects such as software architecture play an important role during migration. Since the Raspberry Pi 3 B+ uses an ARM processor, the libraries used in the PC can’t be downloaded to a Raspberry Pi with ease. Therefore, OpenCV must be compiled from source.

### Objectives

This lab has the following objectives:
- Implement the algorithms presented during the lab 07 in a Raspberry Pi 3 B+.
- Tweak algorithms to maximize performance on target device.
- Present results and possible solutions for better performance.

### Procedure

This lab report is subdivided in smaller numbered programs shown below.

#### 1. Program

This lab report contains a single program, similar to the one presented at the end of lab 07. With this in mind, no other sections are presented with runnable code (such as in other lab reports). This version performs time measurements in order to determine which settings perform faster. No other changes were made.

In [10]:
import cv2
import numpy as np
from time import sleep

HIGHWAY_VIDEO = '../fig/videos/highway.mp4'
SCALE = 0.5

settings = {
    'video_file': HIGHWAY_VIDEO,
    'scale': SCALE,
    'blur_kernel_size': (7, 7),
    'blur_iterations': 1,
    'canny_low_threshold': 10,
    'canny_high_threshold': 40,
    'hough_rho': 1,
    'hough_threshold': 20,
    'hough_theta': np.pi / 180,
    'hough_min_line_length': 5,
    'hough_max_line_gap': 60,
    'show_all_hough_lines': False,
    'show_canny': False,
    'roi_vertices': np.array([[
        (430, 830),           # Bottom left roi coordinate.
        (900, 600),           # Top left roi coordinate.
        (1020, 600),          # Top Right roi coordinate.
        (1530, 830)           # Bottom right roi coordinate.
    ]]) * 0.5 * SCALE,
    'abs_min_line_angle': 20,
    'lane_min_y': int(600 * 0.5 * SCALE),
    'lane_max_y': int(830 * 0.5 * SCALE)
}

def isGrayscale(img):
    """
        Returns true if the image is grayscale (channels == 1).
        Returns false if channels > 1.
    """
    
    # If img.shape has a channel value, read it and determine if
    # it is a grayscale image. If it doesn't have, assume that the
    # image is grayscale.
    if (len(img.shape) == 3):
        return img.shape[2] == 1
    return True

def createRoi(img, vertices):
    """
        Applies an image mask.

        Only keeps the region of the image defined by the polygon
        formed from `vertices`. The rest of the image is set to black.
    """
    
    # defining a blank mask to start with
    mask = np.zeros_like(img)

    # defining a 3 channel or 1 channel color to fill the mask with depending on the input image
    if isGrayscale(img):
        ignore_mask_color = 255
    else:
        ignore_mask_color = (255, 255, 255)

    # filling pixels inside the polygon defined by "vertices" with the fill color
    cv2.fillPoly(mask, vertices, ignore_mask_color)

    # returning the image only where mask pixels are nonzero
    return cv2.bitwise_and(img, mask)

def findLanes(hough_lines, img, settings):
    """
        Given an array of hough lines, detects which lines correspond to the left and right lane.
    """

    # Empty arrays for storing line coordinates (sets of points (x, y)).
    right_lines = {
        'x': [],
        'y': []
    }
    left_lines = {
        'x': [],
        'y': []
    }
    
    # Stores the fitted line coordinates for both lanes.
    # Its format is [x0, y0, x1, y1].
    lanes = {
        'left': [],
        'right': []
    }
    
    # Make sure at least some lines have been detected.
    if (type(hough_lines) == type(np.array([]))):

        # Iterate through every line.
        for line in hough_lines:

            # Usualy every line object contains a single set of coordinates,
            # nonetheless, it is placed inside a for for safety.
            for x1, y1, x2, y2 in line:

                # Calculate the direction of the line found.
                direction = np.rad2deg(np.arctan2(y2 - y1, x2 - x1))

                # Only draw lines whose angle is greater than the threshold.
                if (np.abs(direction) > settings['abs_min_line_angle']):

                    # If lines have a positive direction they are from the
                    # right lane.
                    if (direction > 0):
                        # Right lane.
                        right_lines['x'].extend([x1, x2])
                        right_lines['y'].extend([y1, y2])
                        if (settings['show_all_hough_lines']): 
                            cv2.line(img, (x1, y1), (x2, y2), (255, 0, 0), 1)
                    else:
                        # Left lane.
                        left_lines['x'].extend([x1, x2])
                        left_lines['y'].extend([y1, y2])
                        if (settings['show_all_hough_lines']):
                            cv2.line(img, (x1, y1), (x2, y2), (0, 0, 255), 1)
                    
                    # Write the angle of the line for debugging purposes.
                    if (settings['show_all_hough_lines']): cv2.putText(img, '%.1f' % (direction), (x1, y1), cv2.FONT_HERSHEY_SIMPLEX, fontScale=0.5, color=(255, 255, 0), thickness=1)
    
    # Make sure points on the left side were found.
    if (len(left_lines['x']) > 0 and len(left_lines['y']) > 0) and not settings['show_all_hough_lines']:
        
        # Using the points found, find a 1st order polynomial that best fits the data.
        poly_left = np.poly1d(np.polyfit(left_lines['y'], left_lines['x'], deg=1))
        
        # Evaluate the function found for the desired lane lengths.
        lanes['left'] = [
            int(poly_left(settings['lane_min_y'])),
            settings['lane_min_y'],
            int(poly_left(settings['lane_max_y'])),
            settings['lane_max_y']
        ]
    
    # Make sure points on the right side were found.
    if (len(right_lines['x']) > 0 and len(right_lines['y']) > 0) and not settings['show_all_hough_lines']:
        
        # Using the points found, find a 1st order polynomial that best fits the data.
        poly_right = np.poly1d(np.polyfit(right_lines['y'], right_lines['x'], deg=1))
        
        # Evaluate the function found for the desired lane lengths.
        lanes['right'] = [
            int(poly_right(settings['lane_min_y'])), 
            settings['lane_min_y'], 
            int(poly_right(settings['lane_max_y'])), 
            settings['lane_max_y']
        ]

    return lanes

def drawLanes(lanes, img):
    """
        Given the output of findLanes, draws the lanes in the image.
    """
    
    if (len(lanes['left'])):
        cv2.line(img, (lanes['left'][0], lanes['left'][1]), (lanes['left'][2], lanes['left'][3]), (0, 0, 255), 2)
    
    if (len(lanes['right'])):
        cv2.line(img, (lanes['right'][0], lanes['right'][1]), (lanes['right'][2], lanes['right'][3]), (0, 0, 255), 2)

def findLines(img, settings):
    """
        Given an BGR image, finds all lines in the image using a Probabilistic Hough Transform.
    """
    
    # Convert to gray scale.
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    
    # Blur image so that the Canny edge detector doesn't find useless edges.
    blur_gray = gray
    for i in range(settings['blur_iterations']):
        blur_gray = cv2.GaussianBlur(blur_gray, (settings['blur_kernel_size']), sigmaX=0, sigmaY=0)
    
    # Detect edges and create a ROI for the section defined.
    edges = cv2.Canny(blur_gray, settings['canny_low_threshold'], settings['canny_high_threshold'], apertureSize=3)
    masked_edges = createRoi(edges, settings['roi_vertices'].astype(int))
    
    # Detect lines.
    hough_lines = cv2.HoughLinesP(
        masked_edges, 
        rho = settings['hough_rho'], 
        theta = settings['hough_theta'], 
        threshold = settings['hough_threshold'], 
        lines = np.array([]), 
        minLineLength = settings['hough_min_line_length'], 
        maxLineGap = settings['hough_min_line_length']
    )
    
    # Return all data.
    return  {
        'gray': gray,
        'blur_gray': blur_gray,
        'edges': edges,
        'masked_edges': masked_edges,
        'hough_lines': hough_lines
    }

# Open selected video file.
cap = cv2.VideoCapture(settings['video_file'])

# Get scaled video properties.
FRAME_WIDTH = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH) * settings['scale'])
FRAME_HEIGHT = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT) * settings['scale'])
FRAME_FPS = cap.get(cv2.CAP_PROP_FPS)
print('Dimensions: %dx%d' % (FRAME_WIDTH, FRAME_HEIGHT))

# Create input and ROI windows. Place each window side by side.
cv2.namedWindow('input', cv2.WINDOW_AUTOSIZE)
cv2.moveWindow('input', 0, 200)
if (settings['show_canny']):
    cv2.namedWindow('ROI', cv2.WINDOW_AUTOSIZE)
    cv2.moveWindow('ROI', FRAME_WIDTH, 200)

# Open video.
while(cap.isOpened()):
    
    # Read frame form video.
    t0 = cv2.getTickCount()
    ret, frame = cap.read()
    
    # If video ended exit.
    if ret:
        
        # Resize frame to desired scale.
        if (settings['scale'] != 1):
            img = cv2.resize(frame, (0, 0), fx=settings['scale'], fy=settings['scale'])
        else:
            img = frame
        
        # Find all lines in the image.
        e1 = cv2.getTickCount()
        output = findLines(img, settings)
        e2 = cv2.getTickCount()
        dt_findLines = (e2 - e1) / cv2.getTickFrequency()
        
        # With the lines found, find which of them belong to the left and right lanes.
        e1 = cv2.getTickCount()
        lanes = findLanes(output['hough_lines'], img, settings)
        e2 = cv2.getTickCount()
        dt_findLanes = (e2 - e1) / cv2.getTickFrequency()
        
        # Draw results.
        drawLanes(lanes, img)
        
        # Calculate loop time.
        e2 = cv2.getTickCount()
        dt = (e2 - t0) / cv2.getTickFrequency()
        
        # Draw times to output image.
        cv2.putText(img, 'Cycle: %.4fms - %.1ffps' % (dt, 1 / dt), (10, 15), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 255, 0), 1)
        cv2.putText(img, 'findLines: %.4fms' % (dt_findLines), (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 255, 0), 1)
        cv2.putText(img, 'findLanes: %.4fms' % (dt_findLanes), (10, 45), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 255, 0), 1)
        
        cv2.imshow('input', img)
        if (settings['show_canny']):
            cv2.imshow('ROI', output['masked_edges'])

    else:
        break
        
    # If 'q' is pressed, then exit.
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
        
# Clear memory and windows.
cap.release()
cv2.destroyAllWindows()

Dimensions: 480x270


#### 2. Results

The performance measurements in the Raspberry Pi were made by changing the scale of the video and reporting the time it took to process the frame. These values were displayed onto the frame and included:
- **cycle time** and **frame rate** (total time between frames)
- **findLines time** (time required to perform a Hough Transform including the preprocessing required for it.)
- **findLanes time** (time required to determine which lines correspond to the left and right lanes and then fit a 1st order polynomial to the data.)

The scale of the video was changed between a 100%, 50%, 35%, 30% and 25%. These percentages yielded a wide variety of results shown in the following table. All times are presented in milliseconds (ms).

| Scale | Width | Height |  FPS | Cycle [ms] | findLines [ms] | findLanes [ms] | Total Algorithm [ms] | Frame Capture [ms] |
|:-----:|:-----:|:------:|:----:|:-----:|:---------:|:---------:|:---------------:|:-------------:|
|   1   |  960  |   540  | 10.2 |  97.7 |    70.7   |    5.2    |       75.9      |      21.8     |
|  0.5  |  480  |   270  | 18.6 |  53.9 |    22.9   |    5.5    |       28.4      |      25.5     |
|  0.35 |  336  |   189  | 19.4 |  51.5 |    13.6   |    4.6    |       18.2      |      33.3     |
|  0.3  |  228  |   162  |  22  |  45.4 |    11.7   |    4.8    |       16.5      |      28.9     |
|  0.25 |  240  |   135  | 24.6 |  40.6 |    9.4    |     5     |       14.4      |      26.2     |

###### Figure 1. Results at 100% Scale
<p align='center'>
    <img src='scale-100.png' alt='Video Scale 100%'>
</p>

###### Figure 2. Results at 50% Scale
<p align='center'>
    <img src='scale-50.png' alt='Video Scale 50%'>
</p>
    
###### Figure 3. Results at 35% Scale
<p align='center'>
    <img src='scale-35.png' alt='Video Scale 35%'>
</p>

###### Figure 4. Results at 30% Scale
<p align='center'>
    <img src='scale-30.png' alt='Video Scale 30%'>
</p>

###### Figure 5. Results at 25% Scale
<p align='center'>
    <img src='scale-25.png' alt='Video Scale 25%'>
</p>

### Conclusions

All in all the migration was quite simple. The algorithms used in lab 07 worked flawlessly in the Raspberry Pi. By using a settings dictionary the testing process was quite simple. The scale changes were made by changing a variable and all other dimensions scaled accordingly. It was found that every window being displayed resulted in a negative impact on the time response of the system. Therefore, the canny edge output display was disabled by a setting in the dictionary.

Some interesting facts arise from analyzing the table:
- The time that the function `findLanes` uses remains pretty much constant independently from the frame size. This makes sense since the input to the function is an array of lines, which in the most part remain constant regarding on the frame size.
- The function `findLines` depends directly on frame size. Quite a reduction is seen between 100% and 50%. At 50% the frame size is 480x270, which approximates to a more standard dimension like 480x360. At this frame size the Probabilistic Hough Transform took about 23ms to execute. This time is smaller than the time between frames (30 fps or 1/30sec); the importance of this is discussed in the following point.
- An area of opportunity for optimizing the performance of the Raspberry Pi relies in the last column of the results table: the frame capture time. This is the time that it takes to receive a frame from the camera. The highway video has a frame rate of 29.97 or 33 ms per frame. Since the total algorithm execution time at 50% is 28.4ms, this means that if the video acquisition and video processing are separated in multiple processes, an almost real-time execution could be achieved.

A next step for the migration would be to implement a threaded or multi-process application which acquires and processes video separately. Furthermore, several section of the code use nested for loops, which impact greatly on performance. Other alternatives must be used instead of for loops.

_I hereby affirm that I have done this activity with academic integrity._