# Class 3

## Computer Vision
---

### Introduction to Computer Vision

Computer vision is a subfield of artificial intelligence and computer science that focuses on enabling machines to interpret and understand visual data from the world around them. The goal of computer vision is to enable computers to analyze, process, and understand images or videos in a way that mimics human vision.

### Applications of computer vision are numerous, including:

- Autonomous vehicles: Computer vision is essential in enabling autonomous vehicles to perceive their environment and make decisions based on what they see.

- Healthcare: Computer vision can be used in medical imaging to identify and diagnose diseases from X-rays, MRIs, and other medical images.

- Surveillance: Computer vision can be used in security systems to detect and track people and objects in real time.

- Robotics: Computer vision is crucial in enabling robots to perceive their surroundings, identify objects, and interact with their environment.

### Some of the challenges in computer vision include:

1. Image recognition: Teaching a computer to recognize objects and patterns within an image is a complex process, especially with variations in lighting, orientation, and image quality.

2. Object tracking: Following a moving object across multiple frames in a video can be challenging, especially when objects are occluded, obscured, or change in appearance.

3. Depth perception: Determining the distance of objects from the camera is difficult, and it requires sophisticated algorithms and sensors to accurately measure depth.

4. Understanding context: Computers lack the contextual knowledge and common sense that humans possess, making it difficult for them to understand the meaning and significance of visual information.

## Computer vision libraries
- [OpenCV](https://opencv.org/)
  - OpenCV (Open Source Computer Vision): A powerful and widely used library for image and video processing that supports a wide range of algorithms and features for object detection, recognition, and tracking.

- [TensorFlow](https://www.tensorflow.org/)
  - TensorFlow: An open-source machine learning library for building and training deep neural networks that can be used for various computer vision tasks such as image classification, object detection, and segmentation.

- [scikit-image](https://scikit-image.org/)
  - scikit-image: A collection of algorithms and tools for image processing and computer vision that provides a simple and intuitive interface for performing common tasks such as segmentation, feature extraction, and filtering.
  
- [Keras](https://keras.io/)
  - Keras: A high-level neural network API for building and training neural networks for computer vision tasks. It runs on top of a backend such as TensorFlow; older releases also supported Theano and CNTK.

- [PyTorch](https://pytorch.org/)
  - PyTorch: A popular deep learning framework that provides a flexible and efficient way to build and train neural networks for computer vision applications.


- [Dlib](http://dlib.net/)
  - Dlib: A library of machine learning algorithms and tools for solving various computer vision problems such as face detection, landmark detection, and object tracking.

Install requirements:

In [None]:
pip install -r requirements.txt

## Basics of OpenCV

### Reading and displaying images


In [None]:
import cv2

# Load an image from file
img = cv2.imread('images/cat.jpg')

cv2.startWindowThread()
# Display the image
cv2.imshow('Cat', img)
# Wait for input
cv2.waitKey(0)
# close the window
cv2.destroyAllWindows()
# Mac Users hack (sic!) https://stackoverflow.com/questions/6116564/destroywindow-does-not-close-window-on-mac-using-python-and-opencv
cv2.waitKey(1)

In [None]:
import cv2

# Load an image from file
img = cv2.imread('images/tree.jpg')

cv2.startWindowThread()

cv2.imshow('Tree', img)

cv2.waitKey(0)

cv2.destroyWindow('Tree')

cv2.waitKey(1)

Note that the `imread()` function returns a NumPy array representing the image, which can be further processed using other OpenCV functions or standard NumPy operations. If the file cannot be read, `imread()` returns `None` instead of raising an error. The `imshow()` function displays the image in a window, and `waitKey(0)` blocks until a key is pressed; the window itself is closed by `destroyAllWindows()` or `destroyWindow()`.
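
To make the array layout concrete, the sketch below builds a tiny synthetic array in the same layout `imread()` returns for a color image: rows, columns, and three channels in blue-green-red order. The pixel values are made up for illustration and no image file is read.

```python
import numpy as np

# Stand-in for a 2 x 3 color image: shape is (height, width, channels), dtype uint8
img = np.zeros((2, 3, 3), dtype=np.uint8)

# OpenCV stores channels in BGR order, so this makes the pixel at row 0, column 2 pure red
img[0, 2] = (0, 0, 255)

height, width, channels = img.shape
blue, green, red = img[0, 2]

print(height, width, channels)  # 2 3 3
print(int(blue), int(green), int(red))  # 0 0 255
```

Indexing is `img[row, column]`, that is `img[y, x]`, while OpenCV drawing functions take points as `(x, y)`.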

In [None]:
# Reading a video

import cv2

capture = cv2.VideoCapture('videos/ski.mp4')

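while True:
    ret, frame = capture.read()
    # ret is False once there are no more frames to read
    if not ret:
        break
    cv2.imshow('Ski', frame)
    # Press 'q' to stop playback early
    if cv2.waitKey(25) & 0xFF == ord('q'):
        break
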
capture.release()
cv2.destroyAllWindows()
cv2.waitKey(1)


In the above example, `VideoCapture()` opens the video file `ski.mp4` in the `videos` directory. A loop then reads the frames one by one with the capture object's `read()` method, which returns a success flag together with the frame. Playback ends when `read()` reports that there are no more frames or when the user presses the `q` key.
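
The read-until-exhausted pattern can be isolated from the display code. The following is a minimal sketch: `iter_frames` and `FakeCapture` are hypothetical names introduced here, and `FakeCapture` only imitates the `(success, frame)` return value of `read()` so the loop can be tried without a video file.

```python
def iter_frames(capture):
    """Yield frames until capture.read() reports that no frame is available."""
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        yield frame


class FakeCapture:
    """Hypothetical stand-in for a video capture object, used only for this sketch."""

    def __init__(self, frames):
        self._frames = list(frames)

    def read(self):
        # Mirror the (success flag, frame) pair returned by a real capture object
        if self._frames:
            return True, self._frames.pop(0)
        return False, None


frames = list(iter_frames(FakeCapture(['frame 1', 'frame 2', 'frame 3'])))
print(len(frames))  # 3
```

With a real `cv2.VideoCapture` object in place of `FakeCapture`, the same generator would stop cleanly at the end of the file instead of passing an empty frame to `imshow()`.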

### Resizing and rescaling images

<div style="display: grid; grid-template-columns: 2fr 1fr; gap: 16px">
<div>
<p>
Rescaling and resizing images are two common operations used in image processing and computer vision applications to adjust the size and aspect ratio of an image.

Rescaling an image involves changing the size of the image while maintaining the same aspect ratio. This operation can be performed by multiplying the width and height of the original image by a scaling factor, which can be greater than or less than one depending on whether you want to increase or decrease the size of the image. Rescaling an image can be useful for reducing the computational load of processing large images or increasing the resolution of small images.

Resizing an image, on the other hand, involves changing the size and aspect ratio of the image by specifying new width and height dimensions. This operation can be performed by interpolating the pixel values of the original image to fit the new dimensions, which can result in loss of information or distortion if the aspect ratio is significantly changed. Resizing an image can be useful for preparing images for specific display or processing requirements, such as training a neural network or displaying images in a specific aspect ratio.

In summary, rescaling and resizing images are important operations in image processing and computer vision that can be used to adjust the size and aspect ratio of an image for specific applications. It is important to understand the differences between these two operations and choose the appropriate method depending on your specific needs.
</p>
</div>
<img src="https://cdn.mobilesyrup.com/wp-content/uploads/2019/02/zoom-enhance.gif"/>
</div>
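
The difference between the two operations comes down to how the target dimensions are computed. The sketch below uses two hypothetical helpers: `scaled_size` applies one factor to both sides, as in rescaling, and `fit_within` picks the largest factor that keeps the image inside a bounding box without changing its aspect ratio.

```python
def scaled_size(width, height, scale):
    """Rescaling: one factor for both sides, so the aspect ratio is preserved."""
    return int(width * scale), int(height * scale)


def fit_within(width, height, max_width, max_height):
    """Largest size that fits in the box while keeping the aspect ratio."""
    scale = min(max_width / width, max_height / height)
    return round(width * scale), round(height * scale)


print(scaled_size(1920, 1080, 0.25))      # (480, 270)
print(fit_within(1920, 1080, 500, 500))   # (500, 281)
```

Resizing the same 1920 x 1080 frame directly to 500 x 500 would instead squeeze it horizontally, because the two sides are scaled by different factors.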

In [None]:
def rescale_frame(frame, scale=0.50):
    width = int(frame.shape[1] * scale)
    height = int(frame.shape[0] * scale)
    dim = (width, height)
    return cv2.resize(frame, dim, interpolation = cv2.INTER_AREA)

In [None]:
capture = cv2.VideoCapture('videos/ski.mp4')

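while True:
    ret, frame = capture.read()
    # ret is False once there are no more frames to read
    if not ret:
        break
    frame_resized = rescale_frame(frame, scale=0.25)
    cv2.imshow('Ski', frame)
    cv2.imshow('Ski Resized', frame_resized)
    # Press 'q' to stop playback early
    if cv2.waitKey(25) & 0xFF == ord('q'):
        break
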
capture.release()
cv2.destroyAllWindows()
cv2.waitKey(1)

In [None]:
img = cv2.imread('images/cat.jpg')
cv2.imshow('Cat', img)
cv2.imshow('Cat resized', rescale_frame(img, scale=0.25))
cv2.waitKey(0)

cv2.destroyWindow('Cat')
cv2.destroyWindow('Cat resized')

cv2.waitKey(1)

### Drawing and writing on images and videos

In [None]:
import numpy as np

In [None]:
# Drawing a red image
blank = np.zeros((300,300,3), dtype='uint8')

blank[:] = 0, 0, 255

cv2.imshow('Blank', blank)
cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.waitKey(1)


In [None]:
# Drawing a partially red image
blank = np.zeros((300,300,3), dtype='uint8')

blank[100:150, 100:150] = 0, 0, 255

cv2.imshow('Blank', blank)
cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.waitKey(1)


In [None]:
# Drawing a partially red image with a rectangle
blank = np.zeros((300,300,3), dtype='uint8')

# Background
blank[100:150, 100:150] = 0, 0, 255

cv2.rectangle(blank, (200,200), (300, 300), (255,255,0), thickness=cv2.FILLED)
cv2.rectangle(blank, (0,0), (blank.shape[1]//4, blank.shape[0]//4), (0,255,0), thickness=cv2.FILLED)
cv2.circle(blank, (blank.shape[1]//2, blank.shape[0]//2), 40, (255,255,255), thickness=2)

cv2.imshow('Blank', blank)
cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.waitKey(1)


### <span style="color:red">**Task**</span>

Recreate the following image using OpenCV:

<img src="images/face1.png" width="400"/>


In [None]:
# Code goes here

### Basic Functions

#### Converting to grayscale

In [None]:
img = cv2.imread('images/cat.jpg')
cv2.imshow('Cat', img)

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

cv2.imshow('Gray', gray)

cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.waitKey(1)

#### Blurring images

In [None]:
img = cv2.imread('images/cat.jpg')
cv2.imshow('Cat', img)

# The third positional argument is sigmaX; 0 lets OpenCV derive it from the kernel size
blur = cv2.GaussianBlur(img, (13,13), 0)

cv2.imshow('Blur', blur)

cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.waitKey(1)

#### Edge detection

In [None]:
img = cv2.imread('images/cat.jpg')
cv2.imshow('Cat', img)


canny = cv2.Canny(img, 125, 175)
# The third positional argument is sigmaX; 0 lets OpenCV derive it from the kernel size
blur = cv2.GaussianBlur(img, (9,9), 0)
canny_blur = cv2.Canny(blur, 125, 175)

cv2.imshow('Canny', canny)
cv2.imshow('Canny blur', canny_blur)

cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.waitKey(1)

#### Dilating and eroding images

In [None]:
img = cv2.imread('images/cat.jpg')
cv2.imshow('Cat', img)


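# The third positional argument is sigmaX; 0 lets OpenCV derive it from the kernel size
blur = cv2.GaussianBlur(img, (9,9), 0)
canny_blur = cv2.Canny(blur, 125, 175)

# Structuring element: a 3 x 3 block of ones (a plain tuple is not a 3 x 3 kernel)
kernel = np.ones((3,3), dtype='uint8')
dilated = cv2.dilate(canny_blur, kernel, iterations=3)

cv2.imshow('Canny', canny_blur)
cv2.imshow('Dilated', dilated)

eroded = cv2.erode(dilated, kernel, iterations=3)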

cv2.imshow('Eroded', eroded)

cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.waitKey(1)

#### Resizing images

In [None]:
img = cv2.imread('images/cat.jpg')
cv2.imshow('Cat', img)

resized = cv2.resize(img, (500,500), interpolation=cv2.INTER_AREA)
cv2.imshow('Resized', resized)
cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.waitKey(1)

#### Cropping images

In [None]:
img = cv2.imread('images/cat.jpg')
cv2.imshow('Cat', img)

cropped = img[600:900, 900:1600]
cv2.imshow('Cropped', cropped)
cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.waitKey(1)


### <span style="color:red">**Task**</span>

Crop the `tree` image to show only the tree. Then, resize the cropped image to 250 x 250 pixels. 

Prepare the image for display by converting it to grayscale and blurring it.

Do the edge detection on the blurred image, and then dilate the edges to make them thicker.

Finally, save the detected edges image to the `images` directory.


### Transformations

#### Translating images

In [None]:
img = cv2.imread('images/husky.jpg')
cv2.imshow('Original', img)

x_offset = 100
y_offset = 100

transMat = np.float32([[1, 0, x_offset], [0, 1, y_offset]])
dimensions = (img.shape[1], img.shape[0])
translated = cv2.warpAffine(img, transMat, dimensions)


cv2.imshow('Translated', translated)
cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.waitKey(1)
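
The 2 x 3 matrix passed to `warpAffine()` can be read as a function on pixel coordinates. As a small sketch with a made-up point, multiplying the matrix by a position written as `(x, y, 1)` shows where that source pixel ends up:

```python
import numpy as np

x_offset, y_offset = 100, 100
trans_mat = np.float32([[1, 0, x_offset],
                        [0, 1, y_offset]])

# A pixel position in homogeneous coordinates (x, y, 1)
point = np.float32([10, 20, 1])
moved = trans_mat @ point

print(moved.tolist())  # [110.0, 120.0]
```

Positive offsets move the content right and down; negative offsets move it left and up.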

#### Rotating images

In [None]:
img = cv2.imread('images/husky.jpg')
cv2.imshow('Original', img)

angle = 45

rotateMat = cv2.getRotationMatrix2D((img.shape[1]//2, img.shape[0]//2), angle, 1)
dimensions = (img.shape[1], img.shape[0])
rotated = cv2.warpAffine(img, rotateMat, dimensions)

rotated2 = cv2.rotate(img, cv2.ROTATE_90_CLOCKWISE)

cv2.imshow('Rotated', rotated)
cv2.imshow('Rotated 2', rotated2)
cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.waitKey(1)
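
The matrix built by `getRotationMatrix2D()` can be reproduced with NumPy to see what it does to coordinates. This is a sketch that follows the formula given in the OpenCV documentation; `rotation_matrix` is a hypothetical helper name and the sample points are made up.

```python
import numpy as np

def rotation_matrix(center, angle_deg, scale=1.0):
    """2 x 3 affine matrix for a rotation about `center` (angle in degrees)."""
    cx, cy = center
    a = scale * np.cos(np.radians(angle_deg))
    b = scale * np.sin(np.radians(angle_deg))
    return np.array([[a, b, (1 - a) * cx - b * cy],
                     [-b, a, b * cx + (1 - a) * cy]])

# The rotation center is a fixed point of the transform
M = rotation_matrix((200, 150), 45)
center_after = M @ np.array([200, 150, 1])

# Rotating (1, 0) by 90 degrees about the origin gives (0, -1);
# with the image y axis pointing down this is a counter-clockwise turn on screen
M90 = rotation_matrix((0, 0), 90)
p = M90 @ np.array([1, 0, 1])

print(np.round(center_after, 6).tolist())
print(np.round(p, 6).tolist())
```

Because the output canvas keeps the original dimensions, corners that rotate outside it are cut off.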

#### Flipping images

In [None]:
img = cv2.imread('images/husky.jpg')
cv2.imshow('Original', img)

flip = cv2.flip(img, 1)
cv2.imshow('Flip', flip)

cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.waitKey(1)

### Contours

Contours are a useful tool in image recognition because they provide a way to represent the shape and boundaries of objects in an image. A contour is a curve that connects continuous points of an object with the same intensity or color. By finding and analyzing contours in an image, we can extract important information about the objects present in the image, such as their shape, size, orientation, and position relative to other objects.

Contour detection is a common preprocessing step in many computer vision applications, such as object detection, segmentation, and tracking. For example, in object detection, we can use contours to isolate and identify individual objects in an image, by applying techniques such as thresholding and morphological operations to extract the contours from the image. In image segmentation, we can use contours to partition an image into regions based on their shapes and boundaries, allowing us to extract meaningful features from the image.

Overall, contours provide a powerful and flexible way to represent the structure and content of images, and are a fundamental tool in many image recognition and computer vision tasks.
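
The idea of a contour as the boundary of a region can be illustrated without any image file. The sketch below is a simplified NumPy illustration, not the algorithm behind `findContours()`: the hypothetical helper `boundary_pixels` marks foreground pixels that touch the background, and the bounding box is then read off those pixels.

```python
import numpy as np

def boundary_pixels(mask):
    """Mark foreground pixels that have at least one background 4-neighbour."""
    mask = mask.astype(bool)
    padded = np.pad(mask, 1, constant_values=False)
    # A pixel is interior when its four direct neighbours are all foreground
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:])
    return mask & ~interior

# Synthetic binary image containing one filled 3 x 5 rectangle
mask = np.zeros((7, 7), dtype=bool)
mask[2:5, 1:6] = True

edge = boundary_pixels(mask)
ys, xs = np.nonzero(edge)

# Bounding box as (x, y, width, height)
bbox = (int(xs.min()), int(ys.min()),
        int(xs.max() - xs.min() + 1), int(ys.max() - ys.min() + 1))

print(int(edge.sum()))  # 12
print(bbox)             # (1, 2, 5, 3)
```

Of the 15 foreground pixels, the 3 in the middle row are interior, which leaves 12 on the boundary.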

#### Finding contours


In [None]:
img = cv2.imread('images/tree.jpg')
cv2.imshow('Tree', img)

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Scenario 1: edges from the raw grayscale image
canny = cv2.Canny(gray, 125, 175)

# Scenario 2 (with blur): overwrites the scenario 1 result and typically yields fewer contours.
# Comment out the next two lines to compare.
blur = cv2.GaussianBlur(gray, (5,5), 0)
canny = cv2.Canny(blur, 125, 175)

cv2.imshow('Gray', gray)
cv2.imshow('Canny', canny)

contours, hierarchies = cv2.findContours(canny, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
print(f'{len(contours)} contour(s) found.')

cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.waitKey(1)

### Color spaces

In [None]:
img = cv2.imread('images/cat.jpg')
cv2.imshow('Original', img)

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
cv2.imshow('Gray', gray)
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
cv2.imshow('hsv', hsv)
lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)
cv2.imshow('lab', lab)
rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
cv2.imshow('rgb', rgb)

cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.waitKey(1)

### Color channels

In [None]:
img = cv2.imread('images/parrot.jpg')
cv2.imshow('Original', img)

b,g,r = cv2.split(img)
cv2.imshow('Blue', b)
cv2.imshow('Green', g)
cv2.imshow('Red', r)

merged = cv2.merge([b,g,r])

cv2.imshow('Merged', merged)

cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.waitKey(1)

### Blurring

In [None]:
img = cv2.imread('images/parrot.jpg')
cv2.imshow('Original', img)

average33 = cv2.blur(img, (3,3))
cv2.imshow('Average 3 x 3', average33)

average77 = cv2.blur(img, (7,7))
cv2.imshow('Average 7 x 7', average77)

gaussian33 = cv2.GaussianBlur(img, (3,3), 0)
cv2.imshow('Gaussian 3 x 3', gaussian33)

gaussian77 = cv2.GaussianBlur(img, (7,7), 0)
cv2.imshow('Gaussian 7 x 7', gaussian77)

median33 = cv2.medianBlur(img, 3)
cv2.imshow('Median 3 x 3', median33)

median77 = cv2.medianBlur(img, 7)
cv2.imshow('Median 7 x 7', median77)

bilateral = cv2.bilateralFilter(img, 10, 40, 30)
cv2.imshow('Bilateral', bilateral)

cv2.waitKey(0)
cv2.destroyAllWindows() 
cv2.waitKey(1)

### Bitwise operations & masking

In [None]:
img = cv2.imread('images/cat.jpg')
# cv2.imshow('Original', img)

blank = np.zeros(img.shape[:2], dtype='uint8')

circle = cv2.circle(blank.copy(), (img.shape[1]//2, img.shape[0]//2), 350, 255, -1)
# cv2.imshow('Circle', circle)

bitwise_and = cv2.bitwise_and(img, img, mask=circle)
cv2.imshow('Bitwise AND', bitwise_and)

# Do not pass img as the second argument: it is the output array and would be overwritten
bitwise_not = cv2.bitwise_not(img, mask=circle)
cv2.imshow('Bitwise NOT', bitwise_not)
 
# cv2.imshow('Circle', circle) 
cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.waitKey(1)
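
What a mask does to the pixels can be shown on a tiny synthetic array. The sketch below is a NumPy approximation of masked bitwise operations on a freshly allocated output, with made-up values: pixels under a non-zero mask value are processed and everything else is left at zero.

```python
import numpy as np

# 4 x 4 image filled with the value 200 in every channel
img = np.full((4, 4, 3), 200, dtype=np.uint8)

# Single-channel mask: 255 inside a 2 x 2 square, 0 elsewhere
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 1:3] = 255

# Add a trailing axis so the mask broadcasts over the three channels
inside = mask[..., None] > 0

# Masked AND of the image with itself: keep pixels under the mask, zero the rest
masked = np.where(inside, img, 0).astype(np.uint8)

# Masked NOT: invert the bits of pixels under the mask, zero the rest
inverted = np.where(inside, np.bitwise_not(img), 0).astype(np.uint8)

print(masked[1, 1].tolist(), masked[0, 0].tolist())  # [200, 200, 200] [0, 0, 0]
print(inverted[1, 1].tolist())                       # [55, 55, 55]
```

For `uint8` data, inverting the bits is the same as computing `255 - value`, which is why 200 becomes 55.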

In [None]:
# Display an image of a cat. Use bitwise operations and color channels to mask out the cat's yellow eyes, and convert everything else to grayscale.

img = cv2.imread('images/cat.jpg')

blank = np.zeros(img.shape[:2], dtype='uint8')
circle = cv2.circle(blank.copy(), (img.shape[1]//2 + 180, img.shape[0]//2 + 50), 275, 255, -1)

hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
cv2.imshow('hsv', hsv) 


bitwise_and = cv2.bitwise_and(img, img, mask=circle)
cv2.imshow('Bitwise AND', bitwise_and) 
# Detect the yellow color in the bitwise_and image
mask = cv2.inRange(bitwise_and, (80, 100, 0), (255, 255, 255))
cv2.imshow('Mask', mask)

# Bitwise AND the mask and the original image to extract the yellow eyes
eyes = cv2.bitwise_and(img, img, mask=mask)

# Create a grayscale version of the image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Invert the mask and apply it to the grayscale image so that the eye region is left black
mask_inv = cv2.bitwise_not(mask)
gray_masked = cv2.bitwise_and(gray, gray, mask=mask_inv)

# Combine the extracted yellow eyes with the masked grayscale image
result = cv2.add(cv2.cvtColor(gray_masked, cv2.COLOR_GRAY2BGR), eyes)

# Display the original image and the result
# cv2.imshow('Original Image', img)
# cv2.imshow('Mask', circle)
cv2.imshow('Masked Image', result)

cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.waitKey(1)

### <span style="color:red">**Task**</span>

Recreate the following image using OpenCV:

<img src="images/cat_eyes.jpg" width="600"/>


In [None]:
img = cv2.imread('images/cat.jpg')

# Might be useful to start with:
# Convert the image to HSV space
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# Define lower and upper bounds of yellow color in HSV space
lower_yellow = np.array([10, 100, 100])
upper_yellow = np.array([30, 255, 255])
# Create a mask of yellow pixels
yellow_mask = cv2.inRange(hsv, lower_yellow, upper_yellow)


#
# Code goes here
#


cv2.imshow('Original Image', img)
cv2.imshow('Yellow Eyes', result)
cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.waitKey(1)

### Thresholding

Thresholding is a technique used in image processing and computer vision to separate objects or regions from the background based on their intensity values. The basic idea of thresholding is to convert an image into a binary image, where the pixels are classified as either foreground or background based on a specified threshold value. We use thresholding for several purposes, such as image segmentation, object detection, feature extraction, image enhancement, and image analysis.
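
The binary rule itself is short enough to write out in NumPy. This sketch uses a hypothetical helper, `threshold_binary`, that mirrors the behavior of the `THRESH_BINARY` and `THRESH_BINARY_INV` modes on a made-up 2 x 3 grayscale array: pixels strictly above the threshold become the maximum value, all others become 0, and the inverse mode swaps the two.

```python
import numpy as np

def threshold_binary(gray, thresh, maxval=255, invert=False):
    """Set pixels strictly above `thresh` to `maxval` and the rest to 0 (swapped if invert)."""
    above = gray > thresh
    if invert:
        above = ~above
    return np.where(above, maxval, 0).astype(np.uint8)

gray = np.array([[10, 75, 76],
                 [200, 0, 255]], dtype=np.uint8)

print(threshold_binary(gray, 75).tolist())               # [[0, 0, 255], [255, 0, 255]]
print(threshold_binary(gray, 75, invert=True).tolist())  # [[255, 255, 0], [0, 255, 0]]
```

Note that a pixel equal to the threshold (75 here) falls on the background side. Adaptive thresholding applies the same comparison, but against a threshold computed separately for each pixel's neighborhood.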


In [None]:
img = cv2.imread('images/cat.jpg')
cv2.imshow('Cat', img)

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

ret, thresh = cv2.threshold(gray, 75, 255, cv2.THRESH_BINARY)
cv2.imshow('Thresh', thresh)

ret, thresh_inv = cv2.threshold(gray, 75, 255, cv2.THRESH_BINARY_INV)
cv2.imshow('Thresh Inv', thresh_inv)

adaptive_thresh_m = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 13, 5)
cv2.imshow('Adaptive Thresh Mean', adaptive_thresh_m)

adaptive_thresh_g = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 13, 5)
cv2.imshow('Adaptive Thresh Gaussian', adaptive_thresh_g)

cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.waitKey(1)

#### Edge detection - Laplacian and Sobel

In [None]:
img = cv2.imread('images/cat.jpg')
cv2.imshow('Original', img)

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
cv2.imshow('Gray', gray)

# Laplacian
lap = cv2.Laplacian(gray, cv2.CV_64F)
lap = np.uint8(np.absolute(lap))
cv2.imshow('Laplacian', lap)

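# Sobel
# Gradients are signed floats; convertScaleAbs takes the absolute value and converts to uint8
sobelx = cv2.convertScaleAbs(cv2.Sobel(gray, cv2.CV_64F, 1, 0))
sobely = cv2.convertScaleAbs(cv2.Sobel(gray, cv2.CV_64F, 0, 1))
combined_sobel = cv2.bitwise_or(sobelx, sobely)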
 
cv2.imshow('Sobel X', sobelx)
cv2.imshow('Sobel Y', sobely)
cv2.imshow('Combined Sobel', combined_sobel)
 
cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.waitKey(1)

Edge detection is a fundamental process in computer vision that aims to identify boundaries in an image. These boundaries represent regions where the pixel intensities undergo significant changes, such as corners, lines, or edges. The edges can be used to identify objects, track movements, and perform image segmentation.

There are several methods for edge detection, and the most popular ones are:

**Sobel Operator**: The Sobel operator is a gradient-based approach that detects edges by computing the gradient of the image intensity function.

**Canny Edge Detector**: The Canny edge detector is a multi-stage algorithm that involves smoothing the image, calculating the gradient, applying non-maximum suppression, and hysteresis thresholding.

**Laplacian Operator**: The Laplacian operator is a second-order differential operator that detects edges by computing the Laplacian of the image intensity function.


The differences between these methods lie in their accuracy, speed, and robustness to noise. The Sobel operator is fast and simple, but on its own it yields thick, noisy gradient maps rather than clean edges. The Canny edge detector combines gradient computation with smoothing, non-maximum suppression, and hysteresis thresholding, so it produces thin, well-connected edges at a higher computational cost. The Laplacian operator responds to edges in all directions with a single kernel, but as a second-derivative operator it amplifies noise and is usually applied after smoothing. Choosing the right method depends on the application requirements and the characteristics of the image.
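
The first-derivative and second-derivative behavior can be seen on a synthetic step edge. The sketch below applies the standard 3 x 3 Sobel and Laplacian kernels with a hand-written sliding window; `correlate_valid` is a hypothetical helper, and the result covers only positions where the kernel fits entirely inside the image.

```python
import numpy as np

def correlate_valid(image, kernel):
    """Slide `kernel` over `image` and sum the element-wise products (no padding)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=np.float64)
    for y in range(out_h):
        for x in range(out_w):
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out

sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=np.float64)
sobel_y = sobel_x.T
laplacian = np.array([[0, 1, 0],
                      [1, -4, 1],
                      [0, 1, 0]], dtype=np.float64)

# Vertical step edge: dark on the left (0), bright on the right (10)
image = np.zeros((5, 6), dtype=np.float64)
image[:, 3:] = 10

gx = correlate_valid(image, sobel_x)
gy = correlate_valid(image, sobel_y)
lap = correlate_valid(image, laplacian)

print(gx[0].tolist())   # [0.0, 40.0, 40.0, 0.0]
print(gy[0].tolist())   # [0.0, 0.0, 0.0, 0.0]
print(lap[0].tolist())  # [0.0, 10.0, -10.0, 0.0]
```

The horizontal Sobel response peaks at the edge, the vertical one is zero because nothing changes from row to row, and the Laplacian changes sign across the edge, which is the zero crossing that second-derivative detectors look for.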

## Face detection

[Haar Cascades classifiers](https://github.com/opencv/opencv/tree/master/data/haarcascades)


#### Single face detection

In [None]:
import cv2
img = cv2.imread('images/woman.jpg')
cv2.imshow('Original', img)

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
cv2.imshow('Gray', gray)

haar_cascade_face = cv2.CascadeClassifier('haar_face.xml')
faces_rect = haar_cascade_face.detectMultiScale(gray, scaleFactor=1.2, minNeighbors=5)

print(f'Number of faces found = {len(faces_rect)}')

for (x, y, w, h) in faces_rect:
    cv2.rectangle(img, (x, y), (x+w, y+h), (0, 0, 255), thickness=4)

cv2.imshow('Detected Faces', img)

cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.waitKey(1)



#### Group of people

In [None]:
import cv2
img = cv2.imread('images/family.jpg')
cv2.imshow('Original', img)

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
cv2.imshow('Gray', gray)

haar_cascade_face = cv2.CascadeClassifier('haar_face.xml')
faces_rect = haar_cascade_face.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=3)

print(f'Number of faces found = {len(faces_rect)}')

for (x, y, w, h) in faces_rect:
    cv2.rectangle(img, (x, y), (x+w, y+h), (0, 0, 255), thickness=4)

cv2.imshow('Detected Faces', img)

cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.waitKey(1)



In [None]:
import cv2
img = cv2.imread('images/employees.jpg')
cv2.imshow('Original', img)

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
cv2.imshow('Gray', gray)

haar_cascade_face = cv2.CascadeClassifier('haar_face.xml')
faces_rect = haar_cascade_face.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

print(f'Number of faces found = {len(faces_rect)}')

for (x, y, w, h) in faces_rect:
    cv2.rectangle(img, (x, y), (x+w, y+h), (0, 0, 255), thickness=4)

cv2.imshow('Detected Faces', img)

cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.waitKey(1)

### Face recognition

#### Face recognition model training using LBPH

In [None]:
import os 

DIR = r'images/recognizer'
people = [i for i in os.listdir(DIR) if not i.startswith('.')]

haar_cascade_face = cv2.CascadeClassifier('haar_face.xml')

def create_train():
    features = []
    labels = []
    
    for person in people:
        path = os.path.join(DIR, person)
        label = people.index(person)
        
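        for img_name in os.listdir(path):
            # Skip hidden files such as .DS_Store
            if img_name.startswith('.'):
                continue
            img_path = os.path.join(path, img_name)
            img = cv2.imread(img_path)
            # Skip files that OpenCV cannot read as images
            if img is None:
                continue
            gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)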
            
            faces_rect = haar_cascade_face.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
            
            for (x, y, w, h) in faces_rect:
                faces_roi = gray[y:y+h, x:x+w]
                features.append(faces_roi)
                labels.append(label)
            
    return (features, labels)


features, labels = create_train()


In [None]:
len(features), len(labels)

In [None]:
import numpy as np
import cv2

# cv2.face is provided by the opencv-contrib-python package
face_recognizer = cv2.face.LBPHFaceRecognizer_create()

features = np.array(features, dtype='object')
labels = np.array(labels)

face_recognizer.train(features, labels)

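# Make sure the output directory exists before saving (os was imported in an earlier cell)
os.makedirs('train', exist_ok=True)
face_recognizer.save('train/face_trained.yml')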

np.save('train/features.npy', features)
np.save('train/labels.npy', labels)
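
LBPH stands for Local Binary Patterns Histograms. As a rough sketch of the basic operator, each pixel is compared with its eight neighbours and the results are packed into one byte; the recognizer then works with histograms of these codes over regions of the face. The helper `lbp_code`, the sample patch, and the neighbour ordering below are illustrative assumptions, and the sampling used inside OpenCV's implementation may differ.

```python
import numpy as np

def lbp_code(patch):
    """Local binary pattern code of the center pixel of a 3 x 3 patch."""
    center = patch[1, 1]
    # Neighbours clockwise from the top-left corner (an ordering chosen for this sketch)
    neighbours = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
                  patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    code = 0
    for value in neighbours:
        # Append a 1 bit when the neighbour is at least as bright as the center
        code = (code << 1) | int(value >= center)
    return code

patch = np.array([[5, 9, 1],
                  [4, 6, 7],
                  [2, 3, 8]])

print(lbp_code(patch))  # 88
```

Here the neighbours 9, 7, and 8 are at least as bright as the center value 6, giving the bit string `01011000`, which is 88.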


#### Testing out the face recognition

In [None]:
DIR = r'images/recognizer'
people = [i for i in os.listdir(DIR) if not i.startswith('.')]
img = cv2.imread('images/test/biden.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
cv2.imshow('Test', gray)

features = np.load('train/features.npy', allow_pickle=True)
labels = np.load('train/labels.npy')
face_recognizer = cv2.face.LBPHFaceRecognizer_create()
face_recognizer.read('train/face_trained.yml')

haar_cascade_face = cv2.CascadeClassifier('haar_face.xml')

faces_rect = haar_cascade_face.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=4)

for (x, y, w, h) in faces_rect:
    faces_roi = gray[y:y+h, x:x+w]
    # LBPH returns a distance: lower values mean a closer match, so it is not a percentage
    label, confidence = face_recognizer.predict(faces_roi)
    print(f'Label = {people[label]} with a distance of {confidence}')
    cv2.putText(img, f'{people[label]} - {round(confidence, 2)}', (50, 50), cv2.FONT_HERSHEY_COMPLEX, 1.0, (0, 255, 0), thickness=2)
    cv2.rectangle(img, (x, y), (x+w, y+h), (0, 0, 255), thickness=2)


cv2.imshow('Detected Face', img)

cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.waitKey(1)

### Using Camera

In [None]:
import cv2

cam = cv2.VideoCapture(0)
haar_cascade_face = cv2.CascadeClassifier('haar_face.xml')


while True:
    check, frame = cam.read()
    # Stop if the camera did not return a frame
    if not check:
        break

    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    cv2.imshow('Test', gray)

    # Esc (key code 27) exits the loop
    key = cv2.waitKey(1)
    if key == 27:
        break

cam.release()
cv2.destroyAllWindows()
cv2.waitKey(1)

In [None]:
import cv2

cam = cv2.VideoCapture(0)

# This cell loads the smile cascade, so the detections are smiles rather than faces
haar_cascade_smile = cv2.CascadeClassifier('haar_smile.xml')

while True:
    check, frame = cam.read()
    # Stop if the camera did not return a frame
    if not check:
        break

    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    cv2.imshow('gray', gray)

    smiles_rect = haar_cascade_smile.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=10)
    for (x, y, w, h) in smiles_rect:
        cv2.rectangle(frame, (x, y), (x+w, y+h), (0, 255, 0), thickness=2)

    cv2.imshow('Detected Smile', frame)

    # Esc (key code 27) exits the loop
    key = cv2.waitKey(1)
    if key == 27:
        break

cam.release()
cv2.destroyAllWindows()
cv2.waitKey(1)

### References

<a href="https://pl.freepik.com/darmowe-zdjecie/dosc-piekna-kobieta-z-blond-dlugie-wlosy-o-podekscytowany-i-szczesliwy-wyraz-twarzy_9116613.htm#page=2&query=usmiech&position=2&from_view=search&track=sph">Image by cookie_studio</a> on Freepik