## Vehicle Detection and Tracking Project

### Project Details
For this project, a labeled dataset was provided, and my job is to decide which features to extract, train a classifier, and ultimately track vehicles in a video stream. Here are links to the labeled data for [vehicle](https://s3.amazonaws.com/udacity-sdc/Vehicle_Tracking/vehicles.zip) and [non-vehicle](https://s3.amazonaws.com/udacity-sdc/Vehicle_Tracking/non-vehicles.zip) examples used to train the classifier. These example images come from a combination of the [GTI vehicle image database](http://www.gti.ssr.upm.es/data/Vehicle_database.html), the [KITTI vision benchmark suite](http://www.cvlibs.net/datasets/kitti/), and examples extracted from the project video itself.

Udacity recently released a labeled dataset that can be used to augment the training data. The Udacity data can be found [here](https://github.com/udacity/self-driving-car/tree/master/annotations). Each folder containing images also has a CSV file with all the labels and bounding boxes. To add vehicle images to the training data, we can use the CSV files to extract the bounding box regions and scale them to the same size as the rest of the training images.
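
As a rough illustration of the extraction step described above, the sketch below crops one bounding box from a frame and rescales it to a square patch. The `(xmin, ymin, xmax, ymax)` box layout, the 64-pixel target size, and the helper name `crop_and_resize` are assumptions for illustration, not taken from the Udacity CSV files; check the CSV header for the actual column order. The resize uses plain NumPy nearest-neighbor sampling to keep the example dependency-free; in practice `cv2.resize` would be the usual choice.

```python
import numpy as np


def crop_and_resize(image, box, size=64):
    """Crop box = (xmin, ymin, xmax, ymax) from image and return a size x size patch.

    Hypothetical helper: the box layout and the default size are assumptions.
    """
    xmin, ymin, xmax, ymax = box
    height, width = image.shape[:2]

    # Clamp the box to the image so slightly out-of-frame labels still work.
    xmin, xmax = max(0, xmin), min(width, xmax)
    ymin, ymax = max(0, ymin), min(height, ymax)
    if xmax <= xmin or ymax <= ymin:
        raise ValueError("bounding box is empty after clamping")

    patch = image[ymin:ymax, xmin:xmax]

    # Nearest-neighbor resize: pick one source row/column per output pixel.
    rows = np.arange(size) * patch.shape[0] // size
    cols = np.arange(size) * patch.shape[1] // size
    return patch[rows[:, None], cols]
```

Each row of a label file would be turned into one such patch and saved alongside the existing vehicle images.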

The project video will be the same one as for the Advanced Lane Finding Project. 

The goals / steps of this project are the following:

* [Step 1](#step1): Perform a Histogram of Oriented Gradients (HOG) feature extraction on a labeled training set of images and train a Linear SVM classifier.
* [Step 2](#step2): Optionally, apply a color transform and append binned color features, as well as histograms of color, to the HOG feature vector.
* [Step 3](#step3): Normalize the features and randomize a selection for training and testing.
* [Step 4](#step4): Implement a sliding-window technique and use the trained classifier to search for vehicles in images.
* [Step 5](#step5): Run the pipeline on a video stream and create a heat map of recurring detections frame by frame to reject outliers and follow detected vehicles.
* [Step 6](#step6): Estimate a bounding box for vehicles detected.

[//]: # (Image References)
[image1]: ./examples/car_not_car.png
[image2]: ./examples/HOG_example.jpg
[image3]: ./examples/sliding_windows.jpg
[image4]: ./examples/sliding_window.jpg
[image5]: ./examples/bboxes_and_heat.png
[image6]: ./examples/labels_map.png
[image7]: ./examples/output_bboxes.png
[video1]: ./project_video.mp4


### Import necessary libraries and packages

In [1]:
import numpy as np
import cv2
import glob
import os
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
%matplotlib inline
from PIL import Image

---
## Pipeline (single images)
<a id='step1'></a>
### Step 1 Histogram of Oriented Gradients (HOG)
#### 1. Explain how (and identify where in your code) you extracted HOG features from the training images.

The code for this step is contained in the first code cell of the IPython notebook (or in lines # through # of the file called `some_file.py`).  

I started by reading in all the `vehicle` and `non-vehicle` images.  Here is an example of one of each of the `vehicle` and `non-vehicle` classes:

![alt text][image1]

I then explored different color spaces and different `skimage.feature.hog()` parameters (`orientations`, `pixels_per_cell`, and `cells_per_block`). I grabbed random images from each of the two classes and displayed them to get a feel for what the `skimage.feature.hog()` output looks like.

Here is an example using the `YCrCb` color space and HOG parameters of `orientations=8`, `pixels_per_cell=(8, 8)` and `cells_per_block=(2, 2)`:


![alt text][image2]
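
To see what these parameters imply for the classifier input, the length of the resulting HOG vector can be worked out by hand. The sketch below assumes a square 64x64 training image (an assumption; the image size is not stated here) and the block layout used by `skimage.feature.hog()`, where blocks slide one cell at a time. `hog_feature_length` is a hypothetical helper, not part of any library.

```python
def hog_feature_length(image_size, orientations, pixels_per_cell,
                       cells_per_block, channels=1):
    """Length of a flattened HOG vector for a square image.

    Assumes square cells and blocks, and a block stride of one cell.
    """
    cells = image_size // pixels_per_cell        # cells along one side
    blocks = cells - cells_per_block + 1         # block positions along one side
    per_block = cells_per_block ** 2 * orientations
    return channels * blocks * blocks * per_block


# orientations=8, pixels_per_cell=(8, 8), cells_per_block=(2, 2) on 64x64:
# 8 cells per side, 7 block positions per side, 32 values per block.
print(hog_feature_length(64, 8, 8, 2))              # 1568
print(hog_feature_length(64, 8, 8, 2, channels=3))  # 4704
```

Increasing `orientations` or using all three color channels scales the vector linearly, which is worth keeping in mind when trading accuracy against training and search time.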

#### 2. Explain how you settled on your final choice of HOG parameters.

I tried various combinations of parameters and...

#### 3. Describe how (and identify where in your code) you trained a classifier using your selected HOG features (and color features if you used them).

I trained a linear SVM using...

<a id='step2'></a>
### Step 2 Optionally, apply a color transform and append binned color features, as well as histograms of color, to HOG feature vector.

In this step, I computed the camera calibration and distortion coefficients using `cv2.calibrateCamera()` from `objpoints` and `imgpoints`, then applied the distortion correction to the test images using the `cv2.undistort()` function.
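
For the color features named in this step's heading, a minimal sketch might look like the following. It assumes the image has already been converted to the chosen color space (for example with `cv2.cvtColor`), that pixel values lie in 0..255, and that the image sides are at least `size` pixels. The function names, the 32x32 binning size, and the 32 histogram bins are illustrative assumptions. Spatial binning is done by block averaging in NumPy here; `cv2.resize` is the more common choice.

```python
import numpy as np


def bin_spatial(img, size=32):
    """Downsample an H x W x C image to size x size by block averaging and flatten it."""
    h, w, c = img.shape
    fh, fw = h // size, w // size
    if fh == 0 or fw == 0:
        raise ValueError("image is smaller than the requested binning size")
    # Trim to a multiple of the block size, then average each fh x fw block.
    trimmed = img[:fh * size, :fw * size]
    small = trimmed.reshape(size, fh, size, fw, c).mean(axis=(1, 3))
    return small.ravel()


def color_hist(img, nbins=32, bins_range=(0, 256)):
    """Concatenate one intensity histogram per channel."""
    hists = [np.histogram(img[:, :, ch], bins=nbins, range=bins_range)[0]
             for ch in range(img.shape[2])]
    return np.concatenate(hists)


def color_features(img):
    """Spatially binned colors followed by per-channel histograms."""
    return np.concatenate([bin_spatial(img), color_hist(img)]).astype(np.float64)
```

The resulting vector would be concatenated with the HOG vector before scaling. Because the two parts have very different value ranges, the normalization in Step 3 matters.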

<a id='step3'></a>
### Step 3 Normalize the features and randomize a selection for training and testing.

The combination of color and gradient thresholding was applied using the `pipeline()` function, which is called in `process_image()` below. Other gradient thresholding and color transform functions are provided in the lectures for reference.
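
For the normalization and random split named in this step's heading, a minimal NumPy sketch is shown here. The function name, the 20% test fraction, and the fixed seed are assumptions; scikit-learn's `StandardScaler` and `train_test_split` offer the same functionality. The scaling statistics are computed on the training rows only, so no information from the test set leaks into training.

```python
import numpy as np


def split_and_scale(X, y, test_fraction=0.2, seed=0):
    """Shuffle, split into train/test, and standardize each feature column."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    n_test = int(round(test_fraction * len(X)))
    test_idx, train_idx = order[:n_test], order[n_test:]

    X_train, X_test = X[train_idx], X[test_idx]

    # Fit mean and standard deviation on the training set only.
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    std[std == 0] = 1.0  # constant columns: avoid division by zero

    return ((X_train - mean) / std, (X_test - mean) / std,
            y[train_idx], y[test_idx])
```

Note that GTI images come from video sequences, so a purely random split can place near-identical frames in both sets and make the test accuracy look better than it is.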

<a id='step4'></a>
### Step 4 Sliding Window Search

#### 1. Describe how (and identify where in your code) you implemented a sliding window search.  How did you decide what scales to search and how much to overlap windows?

I decided to search random window positions at random scales all over the image and came up with this (ok just kidding I didn't actually ;):

![alt text][image3]
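
A sliding-window generator can be sketched as below. This is an illustration under stated assumptions, not the code used for the figure: the function name `slide_window`, the 64-pixel window, the 50% overlap, and the 1280x720 frame size in the usage example are all assumed values.

```python
def slide_window(img_shape, x_range=(None, None), y_range=(None, None),
                 window=(64, 64), overlap=(0.5, 0.5)):
    """Return a list of ((left, top), (right, bottom)) windows covering a region.

    img_shape is (height, width, ...); window and overlap are (x, y) pairs.
    """
    height, width = img_shape[:2]
    x_start = 0 if x_range[0] is None else x_range[0]
    x_stop = width if x_range[1] is None else x_range[1]
    y_start = 0 if y_range[0] is None else y_range[0]
    y_stop = height if y_range[1] is None else y_range[1]

    # Step size in pixels follows from the requested overlap fraction.
    step_x = int(window[0] * (1 - overlap[0]))
    step_y = int(window[1] * (1 - overlap[1]))
    if step_x <= 0 or step_y <= 0:
        raise ValueError("overlap must be less than 1")

    windows = []
    for top in range(y_start, y_stop - window[1] + 1, step_y):
        for left in range(x_start, x_stop - window[0] + 1, step_x):
            windows.append(((left, top), (left + window[0], top + window[1])))
    return windows


# Search only the road area of an assumed 1280x720 frame.
windows = slide_window((720, 1280), y_range=(400, 656))
print(len(windows))  # 273
```

Restricting `y_range` to the lower part of the frame skips the sky and treetops, which cuts both the search time and the number of false positives. Calling the function again with a larger `window` gives a second search scale.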

#### 2. Show some examples of test images to demonstrate how your pipeline is working.  What did you do to optimize the performance of your classifier?

Ultimately I searched on two scales using YCrCb 3-channel HOG features plus spatially binned color and histograms of color in the feature vector, which provided a nice result.  Here are some example images:

![alt text][image4]

---
## Video Implementation
<a id='step5'></a>
### Step 5 Run the pipeline on a video stream and create a heat map of recurring detections frame by frame to reject outliers and follow detected vehicles.

#### 1. Provide a link to your final video output.  Your pipeline should perform reasonably well on the entire project video (somewhat wobbly or unstable bounding boxes are ok as long as you are identifying the vehicles most of the time with minimal false positives.)
Here's a [link to my video result](./project_video.mp4)


#### 2. Describe how (and identify where in your code) you implemented some kind of filter for false positives and some method for combining overlapping bounding boxes.

I recorded the positions of positive detections in each frame of the video.  From the positive detections I created a heatmap and then thresholded that map to identify vehicle positions.  I then used `scipy.ndimage.measurements.label()` to identify individual blobs in the heatmap.  I then assumed each blob corresponded to a vehicle.  I constructed bounding boxes to cover the area of each blob detected.  
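
The heatmap and threshold steps described above can be sketched as follows. The function names and the threshold value used in the usage example are assumptions for illustration; the labeling step itself is not shown.

```python
import numpy as np


def add_heat(heatmap, boxes):
    """Add 1 to every pixel inside each ((x1, y1), (x2, y2)) detection box."""
    for (x1, y1), (x2, y2) in boxes:
        heatmap[y1:y2, x1:x2] += 1
    return heatmap


def apply_threshold(heatmap, threshold):
    """Return a copy with pixels at or below the threshold set to zero."""
    result = heatmap.copy()
    result[result <= threshold] = 0
    return result


# Two overlapping detections on a car and one isolated false positive.
heat = np.zeros((10, 10), dtype=int)
heat = add_heat(heat, [((1, 1), (5, 5)), ((3, 3), (7, 7)), ((8, 8), (10, 10))])
hot = apply_threshold(heat, threshold=1)
```

Only the region where the two boxes overlap survives the threshold; the isolated detection is rejected. Accumulating the heatmap over several consecutive frames before thresholding applies the same idea across time.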

Here's an example result showing the heatmap from a series of frames of video, the result of `scipy.ndimage.measurements.label()` and the bounding boxes then overlaid on the last frame of video:

### Here are six frames and their corresponding heatmaps:

![alt text][image5]

### Here is the output of `scipy.ndimage.measurements.label()` on the integrated heatmap from all six frames:
![alt text][image6]

### Here the resulting bounding boxes are drawn onto the last frame in the series:
![alt text][image7]


<a id='step6'></a>
### Step 6 Estimate a bounding box for vehicles detected.
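
A minimal sketch of this step, assuming the thresholded heatmap has already been labeled so that each blob carries an integer id from 1 to `n_labels` (the kind of array and count that `label()` returns). The helper name `boxes_from_labels` is hypothetical.

```python
import numpy as np


def boxes_from_labels(labels, n_labels):
    """Return one ((xmin, ymin), (xmax, ymax)) box per labeled blob."""
    boxes = []
    for blob_id in range(1, n_labels + 1):
        ys, xs = np.nonzero(labels == blob_id)
        if ys.size == 0:
            continue  # id not present in the array
        boxes.append(((int(xs.min()), int(ys.min())),
                      (int(xs.max()), int(ys.max()))))
    return boxes
```

Each returned box is the tightest axis-aligned rectangle around one blob, with inclusive pixel coordinates, and can be drawn on the frame with `cv2.rectangle`.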





In [11]:
# Import everything needed to edit/save/watch video clips
from moviepy.editor import VideoFileClip
from IPython.display import HTML

In [12]:
import imageio
imageio.plugins.ffmpeg.download()

In [13]:
def process_image(img):
    # NOTE: The output you return should be a color image (3 channel) for processing video below
    # TODO: put your pipeline here,
    # you should return the final output (image where lines are drawn on lanes)

    # Step 2 Apply distortion correction to the raw image
    # ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(objpoints, imgpoints, img.shape[1::-1], None, None)
    undist = cv2.undistort(img, mtx, dist, None, mtx) 
    
    # Step 3 Use color transforms, gradients, etc., to create a thresholded binary image
    kernel_size = 5
    color_binary, combined_binary = pipeline(undist)   
    combined_binary = np.uint8(combined_binary * 255)

    # Step 4 Apply a perspective transform to rectify binary image ("birds-eye view") 
    img_size = (undist.shape[1], undist.shape[0])
    src = np.float32(
        [[(img_size[0] / 2) - 55, img_size[1] / 2 + 100],
        [((img_size[0] / 6) - 10), img_size[1]],
        [(img_size[0] * 5 / 6) + 40, img_size[1]],
        [(img_size[0] / 2 + 60), img_size[1] / 2 + 100]])
    dst = np.float32(
        [[(img_size[0] / 5), 0],
        [(img_size[0] / 5), img_size[1]],
        [(img_size[0] * 4 / 5), img_size[1]],
        [(img_size[0] * 4 / 5), 0]])
    
    warped, perspective_M = warper(combined_binary, src, dst)

    # Step 5 Detect lane pixels and fit to find the lane boundary
    binary_warped = np.uint8(warped/255)
    Minv = cv2.getPerspectiveTransform(dst, src) 
    result = detect_lane(binary_warped, undist, Minv, is_plotting=False)
    
    return result


In [14]:
white_output = '../project_video_out.mp4'
## To speed up the testing process you may want to try your pipeline on a shorter subclip of the video
## To do so add .subclip(start_second,end_second) to the end of the line below
## Where start_second and end_second are integer values representing the start and end of the subclip
## You may also uncomment the following line for a subclip of the first 5 seconds
##clip1 = VideoFileClip("test_videos/solidWhiteRight.mp4").subclip(0,5)
clip1 = VideoFileClip("../project_video.mp4")
white_clip = clip1.fl_image(process_image) #NOTE: this function expects color images!!
%time white_clip.write_videofile(white_output, audio=False)

[MoviePy] >>>> Building video ../project_video_out.mp4
[MoviePy] Writing video ../project_video_out.mp4


100%|█████████▉| 1260/1261 [07:03<00:00,  3.05it/s]


[MoviePy] Done.
[MoviePy] >>>> Video ready: ../project_video_out.mp4 

CPU times: user 6min 44s, sys: 1min 59s, total: 8min 44s
Wall time: 7min 4s


In [15]:
HTML("""
<video width="960" height="540" controls>
  <source src="{0}">
</video>
""".format(white_output))

---
## Discussion
Here I'll talk about the approach I took, what techniques I used, what worked and why, where the pipeline might fail and how I might improve it if I were going to pursue this project further.

The pipeline successfully detected the lane area of the project video. During implementation, I ran into the following problems:
- I used color transforms and gradients to create a thresholded binary image. It is difficult to extract clear lines from images with large, dark shadows or a very bright road surface. I used the B channel of the LAB color space to extract the yellow line and the L channel to extract the white line. In addition, I combined the gradient in the x direction, the gradient magnitude, and the gradient direction to capture more edges.
- I manually defined fixed `src` and `dst` points for the perspective transform, based on a test image with a straight lane. Sometimes the perspective transform did not give a good result because the car drifted off the center of the lane, which changed the view. As a further improvement, I might develop an algorithm that detects the `src` and `dst` points automatically.
- Currently, I always run a blind search to detect lane pixels in each frame. This makes the lane boundary fluctuate slightly across consecutive frames. As a further improvement, I will search for lane pixels based on the lane position in previous frames, or average over frames, for smoother lane detection.
- Currently, the combined thresholded binary image still contains noisy pixels at the sides of the lane. I might apply a region-of-interest mask so that only the image inside the wanted region is processed.

