# Chapter 2 (Handling Files, Cameras, and GUIs)

In this chapter, the author concentrates on the following:
- Basic I/O Operations
- Reading/writing an image file
- Displaying images in a window
- Why is the Python module called cv2 not cv?
- Modes of imread()
- Modes of imwrite()
- Converting between an image and raw bytes
- Accessing image data with numpy.array
- Reading/writing a video file
- Capturing camera frames
- Displaying camera frames in a window

## Basic I/O Operations

Basic I/O covers two kinds of operations:
- Getting images or videos as input from disk, a camera, or an online source.
- Producing images or videos as output, written to disk or returned as raw data.

## Reading/Writing an image file

OpenCV provides the imread() and imwrite() functions, which support various file formats for still images. Every pixel has a value; what differs between formats is how that value is represented. In a grayscale image, each pixel is represented by a single 8-bit integer, so its value lies in the 0-255 range.


By default, imread() returns an image in BGR (blue, green, red) order, but you can change this with a flag, as in the following example:

In [1]:
import cv2 as cv
import numpy as np

In [2]:
image = cv.imread('photos/rose.jfif', cv.IMREAD_GRAYSCALE)
image

array([[45, 46, 48, ..., 13, 13, 13],
       [44, 46, 48, ..., 13, 13, 13],
       [44, 45, 47, ..., 13, 13, 13],
       ...,
       [30, 30, 30, ..., 25, 25, 25],
       [31, 31, 31, ..., 26, 26, 26],
       [32, 32, 32, ..., 27, 27, 27]], dtype=uint8)

In [3]:
#Obtaining Image dimension (image structure)
image.shape

(183, 275)

In [4]:
#Obtaining 1st row in the Image
image[0]

array([45, 46, 48, 50, 53, 56, 58, 59, 58, 57, 56, 55, 54, 52, 51, 51, 45,
       45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 46, 46,
       46, 46, 46, 46, 46, 46, 52, 58, 67, 75, 80, 80, 77, 74, 77, 75, 70,
       64, 58, 52, 48, 45, 32, 32, 32, 31, 31, 30, 30, 30, 33, 33, 34, 35,
       37, 38, 39, 40, 44, 44, 44, 44, 44, 44, 44, 44, 43, 42, 41, 41, 40,
       39, 38, 38, 36, 37, 37, 38, 39, 40, 41, 41, 45, 46, 47, 48, 49, 51,
       52, 52, 58, 57, 57, 57, 56, 56, 55, 55, 53, 52, 51, 50, 48, 47, 46,
       45, 40, 40, 41, 42, 43, 44, 44, 45, 45, 44, 44, 44, 43, 43, 42, 42,
       44, 43, 43, 43, 42, 42, 41, 41, 44, 44, 43, 43, 42, 42, 42, 41, 44,
       43, 42, 40, 38, 36, 35, 34, 35, 34, 33, 32, 30, 29, 28, 27, 29, 29,
       29, 28, 28, 27, 27, 27, 33, 33, 34, 36, 37, 39, 40, 40, 40, 40, 41,
       41, 42, 42, 42, 43, 40, 40, 40, 39, 39, 38, 38, 38, 40, 40, 40, 40,
       40, 40, 40, 40, 36, 36, 35, 33, 32, 31, 30, 29, 29, 29, 28, 28, 27,
       27, 27, 27, 26, 26

In [5]:
#Obtaining the 1st pixel value at position(0,0)
image[0][0]

45

In [6]:
cv.imshow('gray', image)
cv.waitKey(0)

-1

Let's look at an interesting case: the image is grayscale, but it is read in the default BGR mode. How are the pixel values represented?



In [7]:
bgr_image = cv.imread('photos/rose.jfif')
bgr_image[0].shape, bgr_image.shape

((275, 3), (183, 275, 3))

In [8]:
bgr_image[0][0]

array([45, 45, 45], dtype=uint8)

As the output shows, although the image is gray, it is read as BGR and therefore has three channels. The gray value is simply repeated in the second and third channels.
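One way to confirm this observation programmatically is to compare the three channels with each other. The following is a minimal sketch that uses a small synthetic array in place of the rose image; the helper name `is_effectively_gray` is hypothetical and is not an OpenCV function.

```python
import numpy as np

def is_effectively_gray(bgr):
    """Return True if all three channels of a BGR image hold identical values."""
    b, g, r = bgr[:, :, 0], bgr[:, :, 1], bgr[:, :, 2]
    return bool(np.array_equal(b, g) and np.array_equal(g, r))

# Synthetic stand-in for a grayscale picture that was loaded as BGR:
gray = np.array([[45, 46], [44, 46]], dtype=np.uint8)
bgr_from_gray = np.stack([gray, gray, gray], axis=-1)  # shape (2, 2, 3)

print(bgr_from_gray[0, 0])                  # the same value in B, G and R
print(is_effectively_gray(bgr_from_gray))
```

The same check could be applied to an array returned by imread() to decide whether storing three channels is worthwhile.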

## Displaying images in a window
cv.imshow() displays an image in a window. Used on its own, however, the window appears and then disappears immediately, so it needs to be followed by cv.waitKey().

cv.waitKey() takes a delay argument:
- A positive value waits that many milliseconds for a key event.
- A value of 0 or less waits for a key event indefinitely.

Note that cv.waitKey() returns the code of the pressed key, or -1 if no key was pressed before the specified time elapsed.

In [9]:
'''
imshow() takes two parameters: the name of the frame and the image itself.

'''

cv.imshow('bgr', bgr_image)
cv.waitKey(0)

-1

## Why is the Python module called cv2 not cv?

The 2 in cv2 does not refer to the OpenCV version. The cv2 module introduced a better API, which leverages object-oriented programming, as opposed to the previous cv module, which adhered to a more procedural style of programming.

## Modes of imread()

- IMREAD_ANYCOLOR = 4
- IMREAD_ANYDEPTH = 2
- IMREAD_COLOR = 1
- IMREAD_GRAYSCALE = 0
- IMREAD_LOAD_GDAL = 8
- IMREAD_UNCHANGED = -1


In [10]:
colored_image = cv.imread('photos/coloredRose.jfif')
cv.imshow('bgr', colored_image)
cv.waitKey(0)

-1

In [11]:
colored_image[0][0:10]

array([[20, 45, 19],
       [27, 55, 25],
       [35, 71, 35],
       [40, 84, 43],
       [45, 91, 49],
       [47, 93, 51],
       [48, 92, 53],
       [48, 91, 54],
       [49, 93, 57],
       [46, 90, 54]], dtype=uint8)

## Modes of imwrite()

The imwrite() function requires an image to be in the BGR or grayscale format with a certain number of bits per channel that the output format can support.

In [12]:
cv.imwrite('photos/MyPicGray.png', colored_image)

True
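Because imwrite() expects a bit depth that the output format supports, images held as floating-point arrays usually have to be converted first. The sketch below assumes a float image whose values are meant to lie in [0.0, 1.0], which is a common convention but an assumption here; `to_uint8` is a hypothetical helper name.

```python
import numpy as np

def to_uint8(image):
    """Convert a float image with values in [0.0, 1.0] to 8 bits per channel."""
    scaled = np.clip(image, 0.0, 1.0) * 255.0   # out-of-range values are clamped
    return np.round(scaled).astype(np.uint8)

float_image = np.array([[0.0, 0.5, 1.0, 1.5]], dtype=np.float32)
writable = to_uint8(float_image)
print(writable, writable.dtype)
```

The resulting array has the uint8 type that formats such as PNG and JPEG accept.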

## Converting between an image and raw bytes

 An 8-bit grayscale image is a 2D array containing byte values. A 24-bit BGR image is a 3D array, which also contains byte values.

In [13]:
colored_image.shape

(148, 238, 3)

In colored_image:

- The first index is the pixel's y coordinate or row, 0 being the top. 
- The second index is the pixel's x coordinate or column, 0 being the leftmost. 
- The third index (if applicable) represents a color channel.
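The index order described above can be tried out on a small synthetic array. This is only an illustrative sketch; the array size and pixel position are arbitrary.

```python
import numpy as np

# Synthetic BGR image: 3 rows (height), 5 columns (width), 3 channels.
image = np.zeros((3, 5, 3), dtype=np.uint8)

y, x = 2, 4                 # bottom-right pixel: row 2, column 4
image[y, x] = (255, 0, 0)   # B=255, G=0, R=0, a pure blue pixel

height, width, channels = image.shape
print(height, width, channels)
print(image[y, x, 0])       # blue value of that pixel
```

Note that the row (y) comes first, which is the opposite of the usual (x, y) order used for coordinates.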

In [14]:
#To obtain each channel individually (OpenCV stores the channels in B, G, R order)
b1 = colored_image[:,:,0] # blue channel: index 0, the 1st of the 3 channels
g1 = colored_image[:,:,1] # green channel: index 1, the 2nd of the 3 channels
r1 = colored_image[:,:,2] # red channel: index 2, the 3rd of the 3 channels
b1,g1,r1

(array([[20, 27, 35, ..., 22, 16, 13],
        [22, 27, 37, ..., 14, 13, 12],
        [25, 30, 38, ...,  9, 12,  9],
        ...,
        [ 9, 11, 11, ..., 42, 38, 36],
        [10, 12, 10, ..., 42, 38, 35],
        [10, 12, 10, ..., 42, 37, 34]], dtype=uint8),
 array([[45, 55, 71, ..., 46, 40, 36],
        [47, 58, 73, ..., 35, 32, 32],
        [50, 61, 76, ..., 22, 25, 26],
        ...,
        [11, 10, 10, ..., 78, 74, 73],
        [12, 11,  9, ..., 78, 74, 72],
        [12, 11,  9, ..., 78, 73, 71]], dtype=uint8),
 array([[19, 25, 35, ..., 22, 16, 14],
        [21, 27, 37, ..., 13, 13, 13],
        [24, 30, 40, ...,  6, 11, 13],
        ...,
        [12, 12, 12, ..., 32, 28, 27],
        [13, 13, 11, ..., 32, 28, 26],
        [13, 13, 11, ..., 32, 27, 25]], dtype=uint8))

In [15]:
r1.shape,g1.shape,b1.shape

((148, 238), (148, 238), (148, 238))

Provided that an image has 8 bits per channel, we can cast it to a standard Python bytearray, which is one-dimensional as follows,


In [16]:
byteArray = bytearray(colored_image)


Conversely, provided that bytearray contains bytes in an appropriate order, we can cast and then reshape it to get a numpy.array type that is an image:

In [17]:
width=238
height= 148
bgrImage = np.array(byteArray).reshape(height, width, 3)
bgrImage[0][0:10]

array([[20, 45, 19],
       [27, 55, 25],
       [35, 71, 35],
       [40, 84, 43],
       [45, 91, 49],
       [47, 93, 51],
       [48, 92, 53],
       [48, 91, 54],
       [49, 93, 57],
       [46, 90, 54]], dtype=uint8)

In [18]:
cv.imshow('BGRByteArray', bgrImage)
cv.waitKey(0)

-1
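Since any byte sequence of the right length can be reshaped into an image, random bytes make a quick test image. The sketch below builds one grayscale and one BGR image from `os.urandom()`; the dimensions are arbitrary choices for illustration.

```python
import os

import numpy as np

height, width = 100, 120

# 1 byte per pixel gives a grayscale image.
random_bytes = bytearray(os.urandom(height * width))
gray_image = np.array(random_bytes).reshape(height, width)

# 3 bytes per pixel give a BGR image.
random_bytes = bytearray(os.urandom(height * width * 3))
bgr_image = np.array(random_bytes).reshape(height, width, 3)

print(gray_image.shape, bgr_image.shape, bgr_image.dtype)
```

The reshape only succeeds when the number of bytes equals height × width × channels, which is a useful sanity check on the data.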

## Accessing image data with numpy.array
Let's explore image manipulation step by step, starting with a basic example: say you want to read or change the blue value of a particular pixel, for example the pixel at row 140, column 120.

The numpy.array type provides a handy method, item(), which here takes three parameters: the y position (row, counted from the top), the x position (column, counted from the left), and the channel index. It returns the value stored at that position.

In [19]:
colored_image.item(140, 120, 0) 


2

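The companion method itemset() sets the value of a particular channel of a particular pixel. It takes two arguments: a three-element tuple (y, x, and channel index) and the new value. Note that itemset() was removed in NumPy 2.0; with recent NumPy versions, the index assignment colored_image[0, 0, 0] = 0 has the same effect.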

In [20]:
colored_image.itemset( (0, 0, 0), 0)
cv.imshow('Itemset', colored_image)
cv.waitKey(0)

-1

In [21]:
#Check the value at the 0,0,0
colored_image[0][0]

array([ 0, 45, 19], dtype=uint8)
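Plain index assignment covers the same ground as item() and itemset() and also works on whole channels at once. The following sketch uses a small synthetic array rather than the rose image.

```python
import numpy as np

image = np.zeros((4, 4, 3), dtype=np.uint8)
image[1, 2] = (10, 20, 30)      # B, G, R at row 1, column 2

# Reading one value: item() and plain indexing give the same number.
print(image.item(1, 2, 0), image[1, 2, 0])

# Writing one value with index assignment.
image[1, 2, 0] = 255

# Writing a whole channel at once: zero out every green value.
image[:, :, 1] = 0
print(image[1, 2])
```

Slicing a whole channel is generally much faster than looping over pixels one by one.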

One of the interesting things we can do by accessing raw pixels with NumPy's array indexing is to define regions of interest (ROI). Once a region is defined, we can bind it to a variable, and then define a second region and assign the first one to it, which copies the pixels across.

In [22]:
my_roi = colored_image[0:40, 0:40]
colored_image[100:140, 100:140] = my_roi


In [23]:
cv.imshow('Itemset', colored_image)
cv.waitKey(0)

-1
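One detail worth knowing when working with regions is that a NumPy slice is a view onto the original pixels, not a copy. The sketch below, on a synthetic array, contrasts a view with an explicit copy and then pastes the copy into another region of the same shape.

```python
import numpy as np

image = np.arange(6 * 6 * 3, dtype=np.uint8).reshape(6, 6, 3)

roi_view = image[0:2, 0:2]          # a view: shares memory with image
roi_copy = image[0:2, 0:2].copy()   # an independent snapshot

image[0, 0] = (0, 0, 255)           # modify the source region

print(roi_view[0, 0])   # reflects the change
print(roi_copy[0, 0])   # still holds the original values

# Pasting a region elsewhere requires the two shapes to match.
image[4:6, 4:6] = roi_copy
```

Calling copy() is the safer choice whenever the source region may be modified before the region is reused.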

Finally, there are a few useful details we can obtain from numpy.array:

- Shape: NumPy returns a tuple containing the height, width, and, if the image is in color, the number of channels. 
  This is useful for debugging the type of an image; if the image is monochromatic or grayscale, the tuple has no channel entry.


- Size: This property is the total number of elements in the array, that is, height × width × channels (for a grayscale image, simply the number of pixels).


- Datatype: This property (dtype) is the data type used for the image values, normally an unsigned integer type such as uint8, whose name also gives the number of bits.
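These three properties can be inspected directly. The sketch below uses blank synthetic arrays with the same dimensions as the colored rose image, one in color and one in grayscale.

```python
import numpy as np

bgr = np.zeros((148, 238, 3), dtype=np.uint8)
gray = np.zeros((148, 238), dtype=np.uint8)

print(bgr.shape, bgr.size, bgr.dtype)
print(gray.shape, gray.size, gray.dtype)

# The length of the shape tuple tells color and grayscale apart.
print(bgr.ndim == 3, gray.ndim == 2)
```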

## Reading/writing a video file

OpenCV provides the VideoCapture and VideoWriter classes that support various video file formats.

In [24]:
# videoCapture is an object that reads the input video; its properties are queried via get()
videoCapture = cv.VideoCapture('Video.mp4')
# Obtaining the frames per second of the input video
fps = videoCapture.get(cv.CAP_PROP_FPS)

# Obtaining the frame size (width, height) of the input video
size = (int(videoCapture.get(cv.CAP_PROP_FRAME_WIDTH)), 
        int(videoCapture.get(cv.CAP_PROP_FRAME_HEIGHT)))
'''
VideoWriter needs the output filename, the video codec, the frames per second, and the frame size.

Some of the available codec options:

• cv2.VideoWriter_fourcc('I','4','2','0'): This option is an uncompressed YUV encoding, 4:2:0 chroma subsampled. 
This encoding is widely compatible but produces large files. The file extension should be .avi.

• cv2.VideoWriter_fourcc('P','I','M','1'): This option is MPEG-1. The file extension should be .avi.

• cv2.VideoWriter_fourcc('X','V','I','D'): This option is MPEG-4 and a preferred option if you want the resulting 
video size to be average. The file extension should be .avi.

• cv2.VideoWriter_fourcc('T','H','E','O'): This option is Ogg Theora. The file extension should be .ogv.

• cv2.VideoWriter_fourcc('F','L','V','1'): This option is a Flash video. The file extension should be .flv.

'''
videoWriter = cv.VideoWriter( 'MyOutputVid.avi', cv.VideoWriter_fourcc('X','V','I','D'), fps, size)

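#Read the first frame of the captured video
success, frame = videoCapture.read()

#For debugging: False means no frame could be read (for example, the file could not be opened)
print(success)
# Loop until there are no more frames.
while success: 
    #Write every frame until success=False
    videoWriter.write(frame)
    success, frame = videoCapture.read()

# Release both objects so that the output file is finalized
videoCapture.release()
videoWriter.release()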


False


In [25]:
fps


0.0
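The codec options listed in the cell above are four-character codes, that is, four ASCII characters packed into one integer. The sketch below reproduces that packing in plain Python, on the assumption that it matches the byte order used by cv.VideoWriter_fourcc() (first character in the lowest byte); both helper names are hypothetical.

```python
def fourcc(c1, c2, c3, c4):
    """Pack four characters into one integer, first character in the lowest byte."""
    return ord(c1) | (ord(c2) << 8) | (ord(c3) << 16) | (ord(c4) << 24)

def fourcc_to_text(code):
    """Recover the four characters from a packed integer code."""
    return ''.join(chr((int(code) >> shift) & 0xFF) for shift in (0, 8, 16, 24))

code = fourcc('X', 'V', 'I', 'D')
print(code, fourcc_to_text(code))
```

The decoding direction can be handy for reading the value returned by videoCapture.get(cv.CAP_PROP_FOURCC), which is reported as a number.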

## Capturing camera frames

A stream of camera frames is represented by the VideoCapture class too. However, for a camera, we construct a VideoCapture object by passing the camera's device index instead of a video's filename.

Note:

- The following code assumes an fps value, because the get() method of VideoCapture often does not return an accurate frame rate for a camera; it may simply return 0. We therefore have to supply our own value, for example a typical frame rate taken from ordinary videos.

- If an invalid index is used to construct a VideoCapture object, it will not yield any frames; its read() method will return (False, None). A good way to avoid trying to retrieve frames from a VideoCapture that was not opened correctly is to check VideoCapture.isOpened(), which returns a Boolean.

In [26]:
#0 is the index of your camera
cameraCapture = cv.VideoCapture(0)

fps = 30 # an assumption
# defining the size of the output frames
size = (int(cameraCapture.get(cv.CAP_PROP_FRAME_WIDTH)), int(cameraCapture.get(cv.CAP_PROP_FRAME_HEIGHT)))

videoWriter = cv.VideoWriter('MyCamVid.avi', cv.VideoWriter_fourcc('I','4','2','0'),fps, size)
success, frame = cameraCapture.read()
numFramesRemaining = 10 * fps - 1  # roughly 10 seconds of video
print(success)
while success and numFramesRemaining > 0:
    videoWriter.write(frame)
    success, frame = cameraCapture.read()
    numFramesRemaining -= 1
# Release the camera only after the loop has finished
cameraCapture.release()

True


The number of cameras and their order is of course system-dependent. Unfortunately, OpenCV does not provide any means of querying the number of cameras or their properties.

The read() method is inappropriate when we need to synchronize a set of cameras (such as a stereo camera or Kinect). In that case, we use the grab() and retrieve() methods instead. Note that retrieve() returns a (success, frame) pair.

cameraCapture0 = cv.VideoCapture(index_camera0)

cameraCapture1 = cv.VideoCapture(index_camera1)


success0 = cameraCapture0.grab()

success1 = cameraCapture1.grab()

if success0 and success1:

    _, frame0 = cameraCapture0.retrieve()
    
    _, frame1 = cameraCapture1.retrieve()
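The grab-then-retrieve pattern generalizes to any number of cameras. The sketch below is written against the grab() and retrieve() methods only, so it can be run without hardware; `read_synchronized` and `FakeCapture` are hypothetical names, and `FakeCapture` is merely a stand-in for cv.VideoCapture.

```python
def read_synchronized(captures):
    """Grab from every capture first, then decode; return the frames or None."""
    # Step 1: grab() is cheap, so calling it back to back keeps the
    # capture times of the cameras close together.
    grabbed = [capture.grab() for capture in captures]
    if not all(grabbed):
        return None
    # Step 2: the slower decoding happens only after all grabs are done.
    frames = []
    for capture in captures:
        success, frame = capture.retrieve()
        if not success:
            return None
        frames.append(frame)
    return frames


class FakeCapture:
    """Stand-in for cv.VideoCapture so the sketch runs without cameras."""
    def __init__(self, name):
        self.name = name
        self.grabbed = False

    def grab(self):
        self.grabbed = True
        return True

    def retrieve(self):
        return self.grabbed, 'frame from ' + self.name


frames = read_synchronized([FakeCapture('camera 0'), FakeCapture('camera 1')])
print(frames)
```

With real cameras, the list would hold cv.VideoCapture objects and the returned frames would be NumPy arrays.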

## Displaying camera frames in a window

 Let's look at an example where we show the frames of a live camera input. 

In [1]:

import cv2 as cv

clicked=False

def onMouse(event, x, y, flags, param):
    #global keyword is used if you want to change a global variable inside a function.
    global clicked
    #Indicates that the left mouse button is double clicked
    if event == cv.EVENT_LBUTTONDBLCLK:
        clicked = True
        
        #print(clicked)
        #print(x,y)
        #print()


# Obtain camera input
cameraCapture = cv.VideoCapture(0)

# Create a window with the given name
cv.namedWindow('MyWindow')

# Set onMouse() as the mouse handler for the 'MyWindow' window.
cv.setMouseCallback('MyWindow', onMouse)

print('Showing camera feed. Double-click the window or press any key to stop.')  
 
#read the video frames    
success, frame = cameraCapture.read()

# Keep showing frames until a double-click or a key press occurs
while (success) and (not clicked) and (cv.waitKey(1) == -1):
    cv.imshow('MyWindow', frame)
    success, frame = cameraCapture.read()

#For closing and releasing camera.    
cv.destroyWindow('MyWindow')
cameraCapture.release()

Showing camera feed. Double-click the window or press any key to stop.


waitKey() returns the code of the pressed key, so you can react to a specific character. For a list of ASCII key codes, see 
http://www.asciitable.com/
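Turning a key code into a character takes only a couple of lines. In the sketch below, masking with 0xFF is a common precaution based on the assumption that some platforms return codes with extra high bits set; `decode_key` is a hypothetical helper name.

```python
def decode_key(key_code):
    """Translate a waitKey() return value into a character, or None if no key."""
    if key_code == -1:          # no key was pressed within the delay
        return None
    return chr(key_code & 0xFF)  # keep only the low byte

print(decode_key(-1))
print(decode_key(ord('q')))
print(decode_key(27) == '\x1b')  # 27 is the ASCII code of the Escape key
```

A display loop could then stop on a specific key, for example when decode_key(cv.waitKey(1)) equals 'q'.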

Note the following:
- OpenCV windows are only updated when waitKey() is called, and waitKey() only captures input when an OpenCV window has focus.

- The event argument passed to the mouse callback can be one of the following:

  - cv2.EVENT_MOUSEMOVE: This event refers to mouse movement.
  - cv2.EVENT_LBUTTONDOWN: This event refers to the left button down.
  - cv2.EVENT_RBUTTONDOWN: This event refers to the right button down.
  - cv2.EVENT_MBUTTONDOWN: This event refers to the middle button down.
  - cv2.EVENT_LBUTTONUP: This event refers to the left button up.
  - cv2.EVENT_RBUTTONUP: This event refers to the right button up.
  - cv2.EVENT_MBUTTONUP: This event refers to the middle button up.
  - cv2.EVENT_LBUTTONDBLCLK: This event refers to the left button being double-clicked.
  - cv2.EVENT_RBUTTONDBLCLK: This event refers to the right button being double-clicked.
  - cv2.EVENT_MBUTTONDBLCLK: This event refers to the middle button being double-clicked.

- The flags argument of the callback describes the button and key state during the event and can combine the following:

  - cv2.EVENT_FLAG_LBUTTON: This flag refers to the left button being pressed.
  - cv2.EVENT_FLAG_RBUTTON: This flag refers to the right button being pressed.
  - cv2.EVENT_FLAG_MBUTTON: This flag refers to the middle button being pressed.
  - cv2.EVENT_FLAG_CTRLKEY: This flag refers to the Ctrl key being pressed.
  - cv2.EVENT_FLAG_SHIFTKEY: This flag refers to the Shift key being pressed.
  - cv2.EVENT_FLAG_ALTKEY: This flag refers to the Alt key being pressed.

