## Final Project for COMS 3101 - Web App for People Detection
## Tomer Solomon Mate, ts2838

IMPORTANT: Please run with Python 2.7. The code uses the older OpenCV 2.4 Python API (the cv2.cv module), which does not support Python 3, and OpenCV is the main library I used for the computer vision.

### Description

For my final project, I decided it would be fun to create a basic web app in Flask that uses machine learning to detect people in videos (MP4 format) and images (JPEG format). I am interested in the current hype around autonomous driving, such as the work that Tesla and traditional car companies like Ford are doing to spur the next generation of autonomous vehicles, and I thought this would be a cool way to get my hands dirty with how autonomous driving works at an elementary (read: very elementary) level. Computer vision in general interests me and is all around us, from Facebook's facial recognition to Google generating tags for the photos we upload to Google Photos (it is almost scary how well Google Photos searches through your photos).

As such, I thought it would be cool to implement a basic people detection algorithm as a final project for this class. I also experimented with Flask to dabble in how a basic web application works. I used OpenCV as the main library for the computer vision work, as it is one of the leading and most powerful computer vision libraries, and I wanted to get experience with how it works. However, because it was first written in C/C++ and only later made available in Python, the Python documentation isn't the best. The machine learning algorithm itself is a support vector machine pretrained on the INRIA Person dataset.

A video demo is saved as "video_demo.mp4" in the folder; it is a screencast of the app in action!


### Code Structure

Part 1, the machine learning part, is made up of three functions. The first function, detect_human, reads in an image using matplotlib's imread, creates a HOG feature space, detects people, draws a green box around each detected person, and finally saves the file in the project_directory + '/static/results/' folder. HOG stands for Histogram of Oriented Gradients; it is one of many feature spaces used in machine learning, and it reduces the dimensions of the image using some mathematics I'm not 100% familiar with.

The video_detect_human function is pretty much the same, except that its input is a NumPy array (a single video frame) rather than a file path string.

The final function, save_video, was a little trickier. I used OpenCV's VideoCapture to read in a video, where fourcc corresponds to the type of compression and the VideoWriter object saves the created movie. The function iterates over every frame, applies the video_detect_human function, writes each analyzed frame to the output, and then releases everything when done. A point to note is that it takes a while to create the video (about 20 seconds of processing for a 2-3 second clip).

Part 2 is made up of the Flask application. The first part of the app is the index.html page, which takes a movie or image and uploads it to the /static/uploads folder. If the file is an image, the app redirects to the image HTML page, and if the file is a movie, it redirects to the video HTML page.
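
To make the HOG step a little less mysterious, here is a minimal pure-Python sketch of its core idea: for one small cell of a grayscale image, compute the gradient at each interior pixel and add its magnitude to a bin chosen by the gradient's orientation. The function name cell_histogram and the toy image are made up for illustration, and this is not the code OpenCV runs; as far as I understand, the real HOGDescriptor also interpolates between bins and normalizes over blocks of cells, which this sketch leaves out.

```python
import math

def cell_histogram(cell, n_bins=9):
    """Histogram of unsigned gradient orientations (0-180 degrees) for one cell.

    `cell` is a list of rows of grayscale values. Border pixels are skipped
    so that every gradient can use a centered difference.
    """
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    for r in range(1, len(cell) - 1):
        for c in range(1, len(cell[0]) - 1):
            gx = cell[r][c + 1] - cell[r][c - 1]  # horizontal change
            gy = cell[r + 1][c] - cell[r - 1][c]  # vertical change
            magnitude = math.hypot(gx, gy)
            # fold the direction into [0, 180) so opposite directions share a bin
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % n_bins] += magnitude
    return hist

# Toy 4x4 cell with a vertical edge: dark on the left, bright on the right.
edge = [[0, 0, 255, 255]] * 4
hist = cell_histogram(edge)
```

Because the brightness in the toy cell only changes horizontally, all of the gradient magnitude lands in the first bin (orientations near 0 degrees). A full descriptor concatenates many such histograms, and that vector is what the SVM classifies.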

### How to Run

The app runs locally on http://localhost:5000/. To run it, change the project_directory variable to the directory where the Project folder is saved, then run the code below. You can then upload images/videos to the app; I put sample images and videos in the sample_images and sample_videos folders of the Project folder for you to play around with. Don't move or rearrange any of the files within the Project folder, since the current structure is what allows images/videos to be saved in the uploads and results folders and lets the Flask app perform the render_template function properly. The IPython notebook should also not be moved. Certain packages need to be installed, which I cover below.
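
Since the app depends on the folder layout, a quick check like the following can confirm that project_directory points at the right place before the app is started. This is only a sketch: the helper name missing_folders is hypothetical, and the 'templates' entry is an assumption based on Flask's default template folder rather than something stated in this write-up.

```python
import os

# Folders the notebook reads from or writes to, relative to project_directory.
# 'templates' is assumed because it is Flask's default template folder.
EXPECTED_FOLDERS = ['static/uploads', 'static/results', 'templates']

def missing_folders(project_directory):
    """Return the expected sub-folders that do not exist under project_directory."""
    return [folder for folder in EXPECTED_FOLDERS
            if not os.path.isdir(os.path.join(project_directory, folder))]
```

Calling missing_folders(project_directory) should return an empty list when everything is in place; any names it returns are the folders that would make saving or rendering fail.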


### Dependencies

As far as dependencies go, the trickiest package to install is OpenCV. I spent a while figuring out how to do this, as one can't simply pip install OpenCV. Instead, run "conda install -c https://conda.binstar.org/menpo opencv" in the terminal. Make sure that the Python you are using is the Anaconda distribution. When I run "python -V" in the terminal, I get Python 2.7.12 :: Anaconda 4.1.1 (x86_64).

You also need to install imutils (simply pip install imutils), which is used for image processing. The other libraries used, which can be seen below, are os (part of the standard library), flask, werkzeug, numpy, and matplotlib. All of the third-party ones have straightforward pip installs.
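
A short check along these lines can tell you which of the packages are still missing before you run the notebook. The module list is taken from the imports in the code below; the helper name missing_modules is hypothetical and just for illustration.

```python
import importlib

# Third-party modules imported by the notebook (os ships with Python itself).
REQUIRED_MODULES = ('cv2', 'imutils', 'flask', 'werkzeug', 'numpy', 'matplotlib')

def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be imported in this environment."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing
```

Running missing_modules() in the same environment as the notebook returns an empty list once everything is installed.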


### Problems Encountered

There were definitely a good number of roadblocks along the way. For the ML algorithm part, fortunately I didn't have to build any algorithm from scratch, as it came nicely packaged in OpenCV, but I still had to figure out how exactly HOGDescriptor and setSVMDetector worked. Saving the video as an MP4 also took a while: the VideoCapture and VideoWriter classes don't have great documentation, and I had to guess at which FourCC code to use, since initially when I was using MP4, the video wouldn't play in the HTML for some unknown reason. I had hiccups figuring out how to get the filename variable to pass through to the other app routes, but eventually figured it out. I also took some time trying to figure out how to display variables in the HTML, which is more of a Jinja syntax problem than a pure Python one. The last part of the Flask app simply runs the app.
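
For anyone else puzzled by FourCC codes: a FourCC is just four characters packed into one 32-bit integer that identifies the codec. The sketch below shows that packing in plain Python, with the first character in the lowest byte, which to my knowledge mirrors what OpenCV's CV_FOURCC does. The helper name fourcc_code is made up for illustration and is not part of OpenCV.

```python
def fourcc_code(c1, c2, c3, c4):
    """Pack four characters into one integer, first character in the lowest byte."""
    return ((ord(c1) & 255)
            | ((ord(c2) & 255) << 8)
            | ((ord(c3) & 255) << 16)
            | ((ord(c4) & 255) << 24))

avc1 = fourcc_code('a', 'v', 'c', '1')  # the code used in save_video
mp4v = fourcc_code('m', 'p', '4', 'v')  # the alternative left in its comment
```

The two codes are simply different integers, so switching codecs only changes the number handed to VideoWriter; whether the resulting file plays in a browser depends on the codec itself.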


### Python Evaluation

Overall, I thought Python, and Anaconda in particular, was a good platform for this project, especially for the machine learning component. Python is particularly good because you can import libraries like OpenCV, which get the ball rolling much quicker than having people write entire classes themselves. I also found Matplotlib helpful for rendering the images and visualizing the people detection, and thought that Python has a good system for saving files locally (like how I saved files into the uploads and results folders). I wonder how effective Python would be when it comes to interacting with servers. I know there was a brief PDF on SQL and Python on Courseworks, but I didn't see any information on storing images/videos in a SQL database. While I found Flask to be a good starting point for web applications and good for this purpose, I wonder how it would scale to larger apps, so I'm not sure Python with Flask would be as good for bigger web applications. I looked into it, and there are other web frameworks like Django which might be more effective for bigger web applications.

### Final Thoughts

Overall, I had fun making this little app. In the future, I want to work on storing images in a database rather than just saving them locally, and I also want to learn how deployment works on platforms like Heroku. Another area for improvement is speeding up the analysis of movies, which currently takes a while.

Please email me at ts2838@columbia.edu if you have any questions or run into any problems when running it. Hope you enjoy!

In [None]:
import os
from flask import Flask, render_template, request, redirect, url_for, send_from_directory, flash, session
from werkzeug.utils import secure_filename

#OpenCV library, backbone of computer vision
#Installed using: 
# conda install -c https://conda.binstar.org/menpo opencv
import cv2

#imutils is an image processing library. Install by: pip install imutils
import imutils
import numpy as np
from imutils.object_detection import non_max_suppression
import matplotlib.pyplot as plt
import matplotlib.image as mpimg

#define directory where Project is saved. User changes this to where Project folder is stored locally. 
project_directory = '/Users/Tomer/Documents/Columbia/2016-2017/Python/ts2838_project'


# PART 1
# Machine learning algorithm for pictures
#This is a SVM that uses OpenCV and is pre-trained with the INRIA Person dataset: 
#http://pascal.inrialpes.fr/data/human/

#This code is adapted from Adrian Rosebrock's Pedestrian Detection OpenCV Tutorial
#located at http://www.pyimagesearch.com/2015/11/09/pedestrian-detection-opencv/
#which is primarily based off of the paper: 
#N. Dalal, B. Triggs, Histograms of Oriented Gradients for Human Detection.
#IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2005.

def detect_human(query_img):
    img = mpimg.imread(query_img)

    #define HOG as feature space
    svm = cv2.HOGDescriptor()
    
    #set SVM detector and define which detector is being used
    svm.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())
    
    #detects people 
    positive_windows, weights = svm.detectMultiScale(img, winStride=(8, 8), padding=(8, 8), scale=1.05, useMeanshiftGrouping=False)
    positive_windows = np.array([[x, y, x + w, y + h] for (x, y, w, h) in positive_windows])
    filtered_windows = non_max_suppression(positive_windows, probs=None, overlapThresh=0.65)
    
    #draws a box around detected person 
    for (x_TL, y_TL, x_BR, y_BR) in filtered_windows:
        cv2.rectangle(img, (x_TL, y_TL), (x_BR, y_BR), (0, 255, 0), 3)
        
    plt.imshow(img)
    plt.axis('off')
    
    #saves analyzed file in results folder within static, keeping only the file name
    #(the part of query_img after 'static/uploads/')
    plt.savefig(project_directory + '/static/results/' + os.path.basename(query_img))
    return img

# only difference between video_detect_human and detect_human is first line
# don't need to imread since query_img is already a numpy array

def video_detect_human(query_img):
    img = query_img
    
    #define HOG as feature space
    svm = cv2.HOGDescriptor()
    
    #set SVM detector and define which detector is being used
    svm.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

    #detect people
    positive_windows, weights = svm.detectMultiScale(img, winStride=(8, 8), padding=(8, 8), scale=1.05, useMeanshiftGrouping=False)
    positive_windows = np.array([[x, y, x + w, y + h] for (x, y, w, h) in positive_windows])
    filtered_windows = non_max_suppression(positive_windows, probs=None, overlapThresh=0.65)
    
    #draw box
    for (x_TL, y_TL, x_BR, y_BR) in filtered_windows:
        cv2.rectangle(img, (x_TL, y_TL), (x_BR, y_BR), (0, 255, 0), 4)
    return img

#save file using video capture
# Adapted from: http://docs.opencv.org/3.0-beta/doc/py_tutorials/py_gui/py_video_display/py_video_display.html
def save_video(video_file):
    cap = cv2.VideoCapture(video_file)
    # Define the codec and create VideoWriter object
    #fourcc defines the type of compression
    fourcc = cv2.cv.CV_FOURCC('a','v','c','1') #'m','p','4','v'
    # properties 3 and 4 are the frame width and height of the input video
    out = cv2.VideoWriter(project_directory + '/static/results/' + os.path.basename(video_file), fourcc, 20.0, (int(cap.get(3)), int(cap.get(4))))
    cap.set(cv2.cv.CV_CAP_PROP_FPS, 10)
    while cap.isOpened():
        ret, frame = cap.read()
        if ret:
            frame = video_detect_human(frame)
            # write the analyzed frame
            out.write(frame)
            
            cv2.imshow('COMS 3101 - Video',frame)
            
            if cv2.waitKey(1) & 0xFF == ord('q'):
                break
        else:
            break
    # Release everything if job is finished
    cap.release()
    out.release()
    cv2.destroyAllWindows()

# PART 2
# Flask App 
# Running on http://localhost:5000/
# File uploads portion adapted from http://flask.pocoo.org/docs/0.11/patterns/fileuploads/

app = Flask(__name__)
app.secret_key = 'secret'

UPLOAD_FOLDER = project_directory + '/static/uploads'

ALLOWED_EXTENSIONS = set(['mp4', 'jpg', 'jpeg'])
app.config['UPLOAD_FOLDER'] = UPLOAD_FOLDER

def allowed_file(filename):
    return '.' in filename and \
           filename.rsplit('.', 1)[1].lower() in ALLOWED_EXTENSIONS

@app.route('/',methods=['GET', 'POST'])
def upload_file():
    if request.method == 'POST':
        file = request.files['file']
        if file and allowed_file(file.filename):
            filename = secure_filename(file.filename)
            #saves file in uploads folder
            file.save(os.path.join(app.config['UPLOAD_FOLDER'], filename))            
            #image files: match the extension case-insensitively
            if filename.lower().endswith(('jpg', 'jpeg')):
                #runs detection algorithm on file saved in upload folder
                detect_human('static/uploads/' + filename)
                
                #redirects to the results page
                return redirect(url_for('image',filename=filename))
                
            elif filename.lower().endswith('mp4'):
                #runs detection algorithm
                save_video('static/uploads/' + filename)
                
                #redirects to results page
                return redirect(url_for('video',filename=filename))

    return render_template('index.html')

#page showing the resulting image with pedestrians detected
@app.route("/image/<filename>")
def image(filename):
    #before_image = "uploads/" + filename
    after_image = "results/" + filename
    return render_template('image.html', after_image = after_image)

@app.route("/video/<filename>")
def video(filename):
    #before_video = "uploads/" + filename
    after_video = "results/" + filename
    return render_template('video.html', after_video = after_video)

#runs app
if __name__ == "__main__":
    app.run()