## Lab 2: Getting started with OpenCV for image and video processing

Written by: Enrique Mireles Gutiérrez  
ID Number: 513944  
Bachelor: ITR  
Date: 2019-02-08  

### Introduction

OpenCV is a computer vision library that contains more than 2500 algorithms. It is an open-source project that Intel started in 1999, and it has grown to become a de facto standard for image analysis and computer vision. Nowadays, the project continues to expand, with new algorithms added to its repository regularly.  

Currently, OpenCV supports C++, Python, and Java, and is available for operating systems such as Windows, Linux, macOS, and Android. OpenCV for Python is used throughout this lab. Python is slower than C++; however, the OpenCV Python bindings are wrappers around the underlying C++ implementation, so they keep most of the performance of C++ while offering the ease of use of Python. All in all, OpenCV is a good tool for fast prototyping and testing of computer vision algorithms.  

In order to write optimized code, help from other libraries is needed. An essential library for this is NumPy, which is used for manipulating matrices and performing scientific computations. It has a powerful N-dimensional array object that outperforms plain Python lists. Therefore, combining OpenCV with NumPy is essential for writing vision algorithms.  
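
To make the NumPy point concrete, the following minimal sketch (not part of the original lab) inverts a tiny synthetic grayscale image with one vectorized expression and compares it with a pure-Python loop. The pixel values are made up for illustration only.

```python
import numpy as np

# A tiny 2x3 "image" with 8-bit pixels (values are illustrative only)
img = np.array([[0, 64, 128],
                [192, 255, 32]], dtype=np.uint8)

# Vectorized inversion: one expression touches every pixel, no Python loop
inverted = 255 - img

# The equivalent pure-Python version needs nested loops over a list of lists
inverted_list = [[255 - p for p in row] for row in img.tolist()]

print(inverted)
print(inverted.dtype, inverted.shape)
```

Both versions produce the same values, but the NumPy expression runs in compiled code, which is why OpenCV for Python represents images as NumPy arrays.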

### Objectives

This lab has the following objectives:
- Getting started with images: Reading, visualising, and saving processed images are among the fundamentals of image processing and computer vision applications. Hence, in this section, you will learn how to read an image from disk, how to visualise it, and how to write it back to disk.
- Getting started with videos: You will focus on how to grab and visualise a live video sequence acquired from a web camera connected to your Raspberry Pi. In addition, you will learn how to read a video file from disk and process it before the processed video is written to a video file.

### Procedure

This lab report is divided into the smaller numbered programs shown below.

#### 1. Importing Libraries

The following libraries are used throughout the lab report:
- cv2: OpenCV library used for artificial vision.
- numpy: Library used for matrix operations.
- argparse: Used for parsing arguments passed through the console. In Jupyter, where no command line is available, the arguments are written manually in a dictionary instead.

In [1]:
import cv2
import numpy as np
import argparse

#### 2. Constant definitions

The following lines define the constants used throughout the lab report:

In [2]:
IMAGE_IN_FILENAME = '../fig/vehicular-traffic.jpg'
IMAGE_OUT_FILENAME = 'vehicular-traffic-out.jpg'
VIDEO_IN_FILENAME = '../fig/highway_right_solid_white_line_short.mp4'

#### 3. Reading, visualizing and saving images

Some key functions used in this section are:

- `cv2.imread(filename[, flags])`
    - Loads an image from a file.
    - filename – Name of file to be loaded.
    - flags – IMREAD_ANYDEPTH | IMREAD_COLOR | IMREAD_GRAYSCALE
    - **returns** – Image as a NumPy array, or None if the file could not be read.
- `cv2.namedWindow(winname[, flags])`
    - Creates a window.
    - winname – Name of the window in the window caption that may be used as a window identifier.
    - flags – WINDOW_NORMAL | WINDOW_AUTOSIZE | WINDOW_OPENGL
    - **returns** – None
- `cv2.imshow(winname, mat)`
    - Displays an image in the specified window.
    - winname – Name of the window.
    - mat – Image to be shown.
    - **returns** – None
- `cv2.waitKey([delay])`
    - Waits for a pressed key.
    - delay – Delay in milliseconds. 0 is the special value that means “forever”.
    - **returns** – the code of the pressed key or -1 if no key was pressed.
- `cv2.imwrite(filename, img[, params])`
    - Saves an image to a specified file.
    - filename – Name of the file.
    - img – Image to be saved.
    - params – Format-specific save parameters encoded as pairs. Check docs.
    - **returns** – True if the image was saved successfully, False otherwise.
- `cv2.destroyAllWindows()`
    - Destroys all of the HighGUI windows.
    - **returns** – None

Information retrieved from: 
- https://docs.opencv.org/3.0-beta/modules/imgcodecs/doc/reading_and_writing_images.html?highlight=imwrite#cv2.imwrite
- https://docs.opencv.org/3.0-beta/modules/highgui/doc/user_interface.html?highlight=waitkey#waitkey

In [3]:
# read in input image
# alternatively, you can use cv2.IMREAD_GRAYSCALE
img_in = cv2.imread(IMAGE_IN_FILENAME, cv2.IMREAD_COLOR)

# create a new window for image visualisation purposes
# alternatively, you can use cv2.WINDOW_NORMAL
cv2.namedWindow("input image", cv2.WINDOW_AUTOSIZE)

# visualise input image
cv2.imshow("input image", img_in)

# convert input image from colour to greyscale
img_out = cv2.cvtColor(img_in, cv2.COLOR_BGR2GRAY)

# visualise greyscale image
cv2.imshow("greyscale image", img_out)

# wait for the user to press a key
key = cv2.waitKey(0)

# if user presses 's', the grayscale image is written to an image file
if key == ord("s"):
    cv2.imwrite(IMAGE_OUT_FILENAME, img_out)
    print('output image has been saved in %s' % IMAGE_OUT_FILENAME)

# destroy windows to free memory  
cv2.destroyAllWindows()
print('windows have been closed properly - bye!')

windows have been closed properly - bye!


#### 4. A more elaborate program to read, visualise, and save an image

Some key functions used in this section are:

- `cv2.cvtColor(src, code[, dst[, dstCn]])`
    - Converts an image from one color space to another.
    - src – input image
    - dst – output image of the same size and depth as src.
    - code – color space conversion code. COLOR_BGR2GRAY, COLOR_BGR2XYZ, COLOR_BGR2YCrCb, COLOR_BGR2HSV, COLOR_BGR2HLS, COLOR_BGR2Lab, COLOR_BGR2Luv, COLOR_BayerBG2BGR
    - dstCn – number of channels in the destination image.
    - **returns** – Destination image
- `argparse.ArgumentParser(prog=None, usage=None, description=None, epilog=None, parents=[], formatter_class=argparse.HelpFormatter, prefix_chars='-', fromfile_prefix_chars=None, argument_default=None, conflict_handler='error', add_help=True)`
    - Create a new ArgumentParser object.
    - prog – The name of the program (default: sys.argv[0])
    - usage – The string describing the program usage (default: generated from arguments added to parser)
    - description – Text to display before the argument help (default: none)
    - epilog – Text to display after the argument help (default: none)
    - parents – A list of ArgumentParser objects whose arguments should also be included
    - formatter_class – A class for customizing the help output
    - prefix_chars – The set of characters that prefix optional arguments (default: '-')
    - fromfile_prefix_chars – The set of characters that prefix files from which additional arguments should be read (default: None)
    - argument_default – The global default value for arguments (default: None)
    - conflict_handler – The strategy for resolving conflicting optionals (usually unnecessary)
    - add_help – Add a -h/--help option to the parser (default: True)
    - **returns** – an ArgumentParser Object
- `parser.add_argument(name or flags...[, action][, nargs][, const][, default][, type][, choices][, required][, help][, metavar][, dest])`
    - Define how a single command-line argument should be parsed.
    - name or flags – Either a name or a list of option strings, e.g. foo or -f, --foo.
    - action – The basic type of action to be taken when this argument is encountered at the command line.
    - nargs – The number of command-line arguments that should be consumed.
    - const – A constant value required by some action and nargs selections.
    - default – The value produced if the argument is absent from the command line.
    - type – The type to which the command-line argument should be converted.
    - choices – A container of the allowable values for the argument.
    - required – Whether or not the command-line option may be omitted (optionals only).
    - help – A brief description of what the argument does.
    - metavar – A name for the argument in usage messages.
    - dest – The name of the attribute to be added to the object returned by parse_args().
    - **returns** – None
- `vars([object])`
    - Returns the \_\_dict__ attribute of the given object if the object has \_\_dict__ attribute.
    - object – can be module, class, instance, or any object having \_\_dict__ attribute.
    - **returns** – dict Object.


Information retrieved from:
- https://docs.opencv.org/3.0-beta/modules/imgproc/doc/miscellaneous_transformations.html#void%20cvtColor(InputArray%20src,%20OutputArray%20dst,%20int%20code,%20int%20dstCn)
- https://docs.python.org/2/library/argparse.html#argparse.ArgumentParser.add_argument
- https://www.programiz.com/python-programming/methods/built-in/vars
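
To see how `argparse` and `vars` work together without a terminal, the following sketch passes the arguments as a list instead of reading them from the command line. The helper name `build_parser` and the file names `in.jpg` and `out.jpg` are assumptions made for this illustration.

```python
import argparse


def build_parser():
    # description= is passed by keyword; the first positional argument
    # of ArgumentParser is 'prog', not the description
    parser = argparse.ArgumentParser(description="Read, visualise and write an image")
    parser.add_argument("-i", "--in_image_name", help="input image name", required=True)
    parser.add_argument("-o", "--out_image_name", help="output image name", required=True)
    return parser


# Passing a list to parse_args() avoids reading sys.argv, which is handy in Jupyter
namespace = build_parser().parse_args(["-i", "in.jpg", "-o", "out.jpg"])

# vars() turns the Namespace object into a plain dictionary
args = vars(namespace)
print(args)
```

The dictionary keys come from the long option names, so `args['in_image_name']` and `args['out_image_name']` are available afterwards.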

In [4]:
def options():
    # parse command line arguments
    parser = argparse.ArgumentParser(description='Read, visualise and write image into disk')
    parser.add_argument('-i', '--in_image_name', help='input image name', required=True)
    parser.add_argument('-o', '--out_image_name', help='output image name', required=True)
    args = vars(parser.parse_args())
    return args

def processing_image(img_in_name, img_out_name):

    # read in image from file
    # alternatively, you can use cv2.IMREAD_GRAYSCALE
    img_in = cv2.imread(img_in_name, cv2.IMREAD_COLOR)

    # verify that image exists
    if img_in is None:
        print('ERROR: image ', img_in_name, 'could not be read')
        exit()

    # convert input image from colour to grayscale
    img_out = cv2.cvtColor(img_in, cv2.COLOR_BGR2GRAY)

    # create a new window for image purposes
    # alternatively, you can use cv2.WINDOW_NORMAL
    # that option will allow you for window resizing
    cv2.namedWindow("input image", cv2.WINDOW_AUTOSIZE)
    cv2.namedWindow("output image", cv2.WINDOW_AUTOSIZE)

    # visualise input and output image
    cv2.imshow("input image", img_in)
    cv2.imshow("output image", img_out)

    # wait for the user to press a key
    key = cv2.waitKey(0)

    # if user pressed 's', the grayscale image is written to disk
    if key == ord("s"):
        cv2.imwrite(img_out_name, img_out)
        print('output image has been saved in %s' % img_out_name)

    # destroy windows to free memory  
    cv2.destroyAllWindows()
    print('windows have been closed properly')

# main function
def main():    

    # when running in a Jupyter notebook, define the arguments manually;
    # comment out this dictionary when running as a script in a Linux terminal
    args = {
            "in_image_name": IMAGE_IN_FILENAME,
            "out_image_name": IMAGE_OUT_FILENAME
            }

    # when running as a script in a Linux terminal, uncomment the following
    # line to parse the arguments from the command line instead
    # args = options()

    in_image_name = args['in_image_name']
    out_image_name = args['out_image_name']

    # call processing image
    processing_image(in_image_name, out_image_name)


# run first
if __name__=='__main__':
    main()

windows have been closed properly


#### 5. Basic program to capture live video from camera

Some key functions used in this section are:

- `cv2.VideoCapture(filename)` or `cv2.VideoCapture(device)`
    - VideoCapture constructors.
    - filename – name of the opened video file.
    - device – id of the opened video capturing device
    - **returns** – VideoCapture object
- `cv2.VideoCapture.read()`
    - Grabs, decodes and returns the next video frame.
    - **returns** – retval, image 
- `cv2.VideoCapture.release()`
    - Closes video file or capturing device.
    - **returns** – None
    
Information retrieved from:
- https://docs.opencv.org/3.0-beta/modules/videoio/doc/reading_and_writing_video.html?highlight=videocapture#videocapture-release

In [5]:
# create a VideoCapture object
cap = cv2.VideoCapture(0)

# main loop
while True:

    # capture new frame
    ret, frame = cap.read()

    # stop if the frame could not be read
    if not ret:
        print("ERROR: current frame could not be read")
        break

    # convert from colour to grayscale image
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # visualise grayscale image
    cv2.imshow('frame', gray)

    # wait for the user to press 'q' to close the window
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# release VideoCapture object
cap.release()

# destroy windows to free memory
cv2.destroyAllWindows()
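
The exit test `cv2.waitKey(1) & 0xFF == ord('q')` keeps only the lowest byte of the key code before comparing it with the code of 'q'. In Python, `&` binds more tightly than `==`, so no extra parentheses are needed. The sketch below reproduces the comparison in plain Python; the helper `is_quit_key` and the sample key codes are hypothetical and only serve as an illustration.

```python
def is_quit_key(key_code, quit_char="q"):
    """Return True when the low byte of key_code matches quit_char."""
    return (key_code & 0xFF) == ord(quit_char)


print(ord("q"))                          # 113
print(is_quit_key(ord("q")))             # True
print(is_quit_key(0x100000 | ord("q")))  # True: the high bits are masked away
print(is_quit_key(-1))                   # False: -1 means no key was pressed
```

The mask is a common precaution for platforms where the returned code may carry extra bits above the lowest byte.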

#### 6. Basic program to capture live video from Raspberry Pi camera

Some key functions used in this section are:

- `PiCamera(camera_num=0, stereo_mode='none', stereo_decimate=False, resolution=None, framerate=None, sensor_mode=0, led_pin=None, clock_mode='reset', framerate_range=None)`
    - Provides a pure Python interface to the Raspberry Pi’s camera module.
    - **returns** – PiCamera object
- `PiCamera.resolution`
    - Retrieves or sets the resolution at which image captures, video recordings, and previews will be captured.
- `PiCamera.framerate`
    - Retrieves or sets the framerate at which video-port based image captures, video recordings, and previews will run.
- `PiRGBArray(camera, size=None)`
    - Produces a 3-dimensional RGB array from an RGB capture.
    - camera – PiCamera object.
    - **returns** – array
- `time.sleep(t)`
    - Suspends execution for the given number of seconds.
    - t – The number of seconds for which execution is suspended.
    - **returns** – None
- `np.float32(c)`
    - Creates a single-precision float.
    - c – number to convert.
    - **returns** – single-precision float.
- `cv2.Canny(image, threshold1, threshold2[, edges[, apertureSize[, L2gradient]]])`
    - Finds edges in an image using the Canny algorithm.
    - image – 8-bit input image.
    - edges – output edge map; single-channel 8-bit image, which has the same size as image.
    - threshold1 – first threshold for the hysteresis procedure.
    - threshold2 – second threshold for the hysteresis procedure.
    - apertureSize – aperture size for the Sobel() operator.
    - L2gradient – a flag indicating whether the more accurate L2 norm should be used to compute the image gradient magnitude.
    - **returns** – edges
- `ord(c)`
    - Returns an integer representing Unicode code point for the given Unicode character.
    - c – character string of length 1 whose Unicode code point is to be found.
    - **returns** – int

At the moment, no Raspberry Pi with Raspicam is available. Therefore, this section has no code to execute.

Information retrieved from:
- https://picamera.readthedocs.io/en/release-1.10/api_array.html#pirgbarray
- https://www.tutorialspoint.com/python/time_sleep.htm
- https://docs.scipy.org/doc/numpy-1.13.0/user/basics.types.html
- https://www.programiz.com/python-programming/methods/built-in/ord
- https://docs.opencv.org/3.0-beta/modules/imgproc/doc/feature_detection.html?highlight=cv2.canny#cv2.Canny

#### 7. A more elaborate program to capture, process and visualise video

Some key functions used in this section are:

- `cv2.VideoCapture.isOpened()`
    - Returns true if video capturing has been initialized already.
    - **returns** – retval
- `cv2.VideoCapture.get(propId)`
    - Returns the specified VideoCapture property.
    - propId – Property identifier.
    - **returns** – retval
- `cv2.flip(src, flipCode[, dst])`
    - Flips a 2D array around vertical, horizontal, or both axes
    - src – input array.
    - dst – output array of the same size and type as src.
    - flipCode – a flag to specify how to flip the array; 0 means flipping around the x-axis and positive value means flipping around y-axis. Negative value (for example, -1) means flipping around both axes.
    - **returns** – dst
    
Information retrieved from:
- https://docs.opencv.org/3.0-beta/modules/videoio/doc/reading_and_writing_video.html?highlight=cv2.videocapture.isopened#videocapture-get
- https://docs.opencv.org/3.0-beta/modules/core/doc/operations_on_arrays.html?highlight=flip#cv2.flip

In [6]:
# import required libraries
import numpy as np
import cv2


def configure_videoCapture(device_index):

    """
    Configure video capture object to handle video device.

    Parameters
        device_index: int value indicating the index number to access camera

    Returns
        cap: videoCapture-type object

    """

    # create a VideoCapture object for the given device index
    cap = cv2.VideoCapture(device_index)

    # if camera could not be opened, it displays an error and exits
    if not cap.isOpened():
        print("ERROR: Camera could not be opened")
        exit()

    # return videoCapture object 'cap'
    return cap


def print_video_frame_specs(cap):

    """
    Print video specifications such as video frame width and height, fps,
    brightness, contrast, saturation, gain, and exposure.

    Parameters
        cap: video capture object

    Returns
        None: this definition only prints information on the command line
              window.
    """    

    # read one frame to verify that the camera delivers images
    ret, frame = cap.read()

    # verify that frame was properly captured
    if not ret:
        print("ERROR: current frame could not be read")
        exit()

    else: # if so, video frame stats are displayed

        # print video frames specifications
        print('\nVideo specifications:')
        print('\tframe width: ', cap.get(cv2.CAP_PROP_FRAME_WIDTH))
        print('\tframe height: ', cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
        print('\tframe rate: ', cap.get(cv2.CAP_PROP_FPS))
        print('\tbrightness: ', cap.get(cv2.CAP_PROP_BRIGHTNESS))
        print('\tcontrast: ', cap.get(cv2.CAP_PROP_CONTRAST))
        print('\tsaturation: ', cap.get(cv2.CAP_PROP_SATURATION))
        print('\tgain: ', cap.get(cv2.CAP_PROP_GAIN))
        print('\texposure: ', cap.get(cv2.CAP_PROP_EXPOSURE))

    # return None
    return None


def capture_and_process_video(cap):

    """
    Capture live video from a camera connected to your computer. Each frame is
    flipped and visualised together with the original frame on separate windows.

    Parameters
        cap: video capture object

    Returns
        None: none

    """

    # create a new window for image purposes
    cv2.namedWindow("input image", cv2.WINDOW_AUTOSIZE)  # alternatively, you can use cv2.WINDOW_NORMAL
    cv2.namedWindow("output image", cv2.WINDOW_AUTOSIZE) # that option will allow you for window resizing


    # main loop
    print('\ncapturing video ...')
    while(cap.isOpened()):

        # capture frame by frame
        ret, frame = cap.read()

        # verify that frame was properly captured
        if ret == False:
            print("ERROR: current frame could not be read")
            break

        # if frame was properly captured, it is flipped
        # around the x-axis (upside down)
        frame_out = cv2.flip(frame,0)

        # visualise current frame and flipped frame
        cv2.imshow("input image", frame)
        cv2.imshow("output image", frame_out)


        # wait for the user to press a key
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

    # return none
    return None


def free_memory(cap):

    """
    Free memory by releasing videoCapture 'cap' and by destroying/closing all
    open windows.

    Parameters
        cap: video capture object

    Returns
        None: none
    """

    # when finished, release the VideoCapture object and close windows to free memory
    print('closing camera ...')
    cap.release()
    print('camera closed')
    cv2.destroyAllWindows()
    print('program finished - bye!\n')

    # return none
    return None


def run_pipeline(device_index=0):
    """
    Run pipeline to capture, process and visualise both the original frame and
    processed frame.

    Parameters
        device_index: device index - 0 default

    Returns
        arg: None

    """

    # pipeline
    cap = configure_videoCapture(device_index)
    print_video_frame_specs(cap)
    capture_and_process_video(cap)
    free_memory(cap)

    # return none
    return None


# run pipeline    
run_pipeline(device_index = 0)


Video specifications:
	frame width:  640.0
	frame height:  480.0
	frame rate:  30.0
	brightness:  0.5019607843137255
	contrast:  0.12549019607843137
	saturation:  0.64
	hue:  -1.0
	exposure:  inf

capturing video ...
closing camera ...
camera closed
program finished - bye!



#### 8. Basic program to play a video file

In [7]:
# import required libraries
import numpy as np
import cv2

# create a VideoCapture object and specify video file to be read
cap = cv2.VideoCapture(VIDEO_IN_FILENAME)

# main loop
while(cap.isOpened()):

    # read current frame
    ret, frame = cap.read()

    # stop when no frame is returned (end of video or read error)
    if not ret:
        break

    # convert frame from colour to grayscale
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # show current frame
    cv2.imshow('frame', gray)

    # wait for the user to press 'q' to exit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# release VideoCapture object
cap.release()

# destroy windows to free memory
cv2.destroyAllWindows()
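
With `cv2.waitKey(1)` the video is shown as fast as frames can be decoded, which is usually faster than its nominal speed. One possible refinement, sketched below as an assumption and not as part of the original lab, is to derive the delay from the frame rate. The helper `frame_delay_ms` and its fallback of 33 ms are hypothetical.

```python
def frame_delay_ms(fps, fallback_ms=33):
    """Return the per-frame delay in milliseconds for a given frame rate.

    fallback_ms is used when fps is missing or not positive, which can
    happen when a backend cannot report the frame rate.
    """
    if not fps or fps <= 0:
        return fallback_ms
    # waitKey treats 0 as "wait forever", so never return less than 1 ms
    return max(1, int(round(1000.0 / fps)))


print(frame_delay_ms(30.0))  # 33
print(frame_delay_ms(25.0))  # 40
print(frame_delay_ms(0.0))   # 33 (fallback)
```

The result could be passed to the loop as `cv2.waitKey(frame_delay_ms(cap.get(cv2.CAP_PROP_FPS)))`. This ignores the time spent processing each frame, so playback would still be only approximately real time.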

### Conclusions

In this lab, basic OpenCV operations were learned. These included reading images, opening video streams, and displaying them in windows. After that, some basic modifications were made and the results were saved to files. I believe the key to learning OpenCV is getting to know the methods the library provides. My first experience with it was more than 6 years ago with a Raspberry Pi, and I remember that it felt really overwhelming. Nonetheless, with the constant boom of artificial intelligence and computer-vision-aided algorithms, I believe this lab will prove really useful.

### References

- NumPy. (2018). NumPy. Retrieved from: http://www.numpy.org/
- OpenCV. (2019). About. Retrieved from: https://opencv.org/about.html

_I hereby affirm that I have done this activity with academic integrity._