## [Main Playlist we study from](https://www.youtube.com/playlist?list=PLb49csYFtO2HAdNGChGzohFJGnJnXBOqd)


#### What are images..?

1. _Images are NumPy arrays made of pixels_
2. _Image Shape : (Height, Width, Number of Channels)_
3. _In 8-bit images, pixel values range from 0 to 255_
   1. _In binary images -> pixel values are 0 (OFF) and 1 (ON), or 0 and 255_
   2. _In 16-bit images, pixel values range from 0 to 65535_
4. _2 popular types of images : (Gray-Scale & RGB)_
   1. RGB has 3 channels (3 gray-scale images stacked together) -> 3D array OR 3 2D arrays
   2. Gray-scale images have only 1 channel -> 2D array
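
A small sketch of the points above, using synthetic arrays instead of a file on disk. The array sizes and variable names are made up for illustration.

```python
import numpy as np

# Gray-scale image: 2D array (height, width)
gray = np.zeros((4, 6), dtype=np.uint8)

# Color image: 3D array (height, width, channels)
color = np.zeros((4, 6, 3), dtype=np.uint8)
color[..., 0] = 255  # fill the first channel with its maximum value

print(gray.shape, color.shape)

# Value ranges of 8-bit and 16-bit unsigned pixels
print(np.iinfo(np.uint8).min, np.iinfo(np.uint8).max)
print(np.iinfo(np.uint16).min, np.iinfo(np.uint16).max)
```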


In [14]:
import cv2

# Read an image
img = cv2.imread('../Images/cat.jpg')

img.shape

(220, 352, 3)

In [15]:
print(img.dtype)
print(type(img))

uint8
<class 'numpy.ndarray'>


In [16]:
# Write(Save) an Image
cv2.imwrite('cat.jpg', img)

True

In [17]:
# Visualize an Image
cv2.imshow('CAT', img)
cv2.waitKey(5000)

-1

#### Deal with Videos

_video : series of images (frames)_


In [12]:
# Read Video
video = cv2.VideoCapture('video.mp4')

print(type(video))
video

<class 'cv2.VideoCapture'>


< cv2.VideoCapture 0000021DAE31B8F0>

In [None]:
# Visualize Video
ret, frame = video.read()
while ret:
    cv2.imshow('frame', frame)
    key = cv2.waitKey(40) & 0xFF
    # 40 = time in milliseconds to wait between frames (about 25 frames per second)
    if key == ord('q'):
        break

    ret, frame = video.read()

video.release()
cv2.destroyAllWindows()

#### Webcam


In [20]:
# Read a webcam
webcam = cv2.VideoCapture(0)

webcam

< cv2.VideoCapture 0000021DAE31BF70>

In [21]:
# Visualize a webcam
while True:
    ret, frame = webcam.read()
    if not ret:  # stop if the camera returned no frame
        break
    cv2.imshow('Frame', frame)

    key = cv2.waitKey(40) & 0xFF
    if key == ord('q'):
        break

webcam.release()
cv2.destroyAllWindows()

#### Basic Image Operations

1. Resizing : Scale up/down the image.
2. Cropping : Focus on important regions and discard the other.


In [None]:
# Resizing
import cv2

img = cv2.imread('../house.png')

cv2.imshow('Original Image', img)

# dsize is (width, height), the reverse of the (height, width) order in img.shape
img_resized = cv2.resize(img, dsize=(540, 800))

cv2.imshow('Resized Image', img_resized)
cv2.waitKey(0)

-1

In [17]:
# Cropping

print(img.shape)
cv2.imshow('Original Image', img)

# An image is a NumPy array -> you can slice it as img[rows (y), columns (x)]

cropped_img = img[50:200, 200:300]
cv2.imshow('Cropped Image', cropped_img)

cv2.waitKey(0)

(213, 326, 3)


-1

#### Colorspaces

[OpenCV Colorspaces](https://opencv.org/blog/color-spaces-in-opencv/#:~:text=While%20the%20RGB%20%28Red%2C%20Green%2C%20Blue%29%20is%20the,offering%20unique%20advantages%20for%20different%20image%20processing%20tasks.)

1. BGR = Blue, Green, Red -> cv2 reads a colored image as BGR by default
2. RGB = Red, Green, Blue
3. Gray-Scale = Shades of Gray
4. HSV = Hue, Saturation, Value


In [None]:
img = cv2.imread('../bird.png')
cv2.imshow('BGR BIRD', img)

img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
cv2.imshow('RGB BIRD', img_rgb)

img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Only 1 channel, so the color information is lost
cv2.imshow('Gray BIRD', img_gray)

img_hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
cv2.imshow('HSV BIRD', img_hsv)

cv2.waitKey(0)

-1

#### Image Blurring : Smooth Image & Remove noise

[Theory](https://docs.opencv.org/4.x//dc/dd3/tutorial_gausian_median_blur_bilateral_filter.html)

[Code Example](https://docs.opencv.org/4.x/d4/d13/tutorial_py_filtering.html)

1. Normalized Box Filter
2. Gaussian Filter
3. Median Filter
4. Bilateral Filter
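
A tiny NumPy illustration of why the median filter handles isolated noisy pixels better than the box filter. It only computes the value for one 3x3 neighbourhood and is not how OpenCV implements the filters.

```python
import numpy as np

# 3x3 neighbourhood of value 10 with one "salt" noise pixel in the middle
patch = np.full((3, 3), 10, dtype=np.uint8)
patch[1, 1] = 255

# Normalized box filter: the centre becomes the mean of the neighbourhood
box_value = patch.mean()

# Median filter: the centre becomes the median of the neighbourhood
median_value = np.median(patch)

print(box_value)     # pulled upwards by the noisy pixel
print(median_value)  # the noisy pixel is ignored
```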


In [None]:
img = cv2.imread('../person.png')

cv2.imshow('Person', img)

# If we increase the kernel size -> blurring increases, but we can lose some information
kernel_size = 10  # The box filter accepts any positive size (even or odd)
blurred_img = cv2.blur(img, ksize=(kernel_size, kernel_size))
cv2.imshow('Blurred Person', blurred_img)

# Gaussian Blur : Kernel must be Odd
gauss_blur_img = cv2.GaussianBlur(img, ksize=(15, 15), sigmaX=3.0)
cv2.imshow('Gaussian Blurred Person', gauss_blur_img)


# Median Blur : Kernel must be Odd
median_blur_img = cv2.medianBlur(img, 13)
cv2.imshow('Median Blurred Person', median_blur_img)

cv2.waitKey(0)

-1

In [2]:
img = cv2.imread('../noise.png')

cv2.imshow('Noised Image', img)

blurred_img = cv2.blur(img, ksize=(6, 5))
cv2.imshow('Blurred Person', blurred_img)

gauss_blur_img = cv2.GaussianBlur(img, ksize=(15, 15), sigmaX=3.0)
cv2.imshow('Gaussian Blurred Person', gauss_blur_img)

median_blur_img = cv2.medianBlur(img, 13)
cv2.imshow('Median Blurred Person', median_blur_img)

cv2.waitKey(0)

-1

#### Thresholding

[OpenCV Thresholding](https://docs.opencv.org/3.4/d7/d4d/tutorial_py_thresholding.html)

##### Thresholding : If the pixel value is smaller than or equal to the threshold, it is set to 0; otherwise it is set to a maximum value.

_The function cv.threshold(..,..,..,..) is used to apply the thresholding_

1. The first argument is the source image, which should be a grayscale image.
2. The second argument is the threshold value which is used to classify the pixel values.
3. The third argument is the maximum value which is assigned to pixel values exceeding the threshold.
4. OpenCV provides different types of thresholding which is given by the fourth parameter of the function.

##### Simple Global Thresholding

1. cv.THRESH_BINARY
2. cv.THRESH_BINARY_INV
3. cv.THRESH_TRUNC
4. cv.THRESH_TOZERO
5. cv.THRESH_TOZERO_INV
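
The five types can be written out in NumPy to see what each one does to a pixel. This is a sketch of the rules only; the function name `simple_threshold` and the string labels are made up here and are not part of OpenCV.

```python
import numpy as np


def simple_threshold(gray, thresh, maxval, kind):
    """Apply one of the five simple thresholding rules to a gray-scale array."""
    above = gray > thresh
    if kind == "binary":        # above -> maxval, else 0
        out = np.where(above, maxval, 0)
    elif kind == "binary_inv":  # above -> 0, else maxval
        out = np.where(above, 0, maxval)
    elif kind == "trunc":       # above -> thresh, else unchanged
        out = np.where(above, thresh, gray)
    elif kind == "tozero":      # above -> unchanged, else 0
        out = np.where(above, gray, 0)
    elif kind == "tozero_inv":  # above -> 0, else unchanged
        out = np.where(above, 0, gray)
    else:
        raise ValueError(f"unknown kind: {kind}")
    return out.astype(gray.dtype)


row = np.array([0, 80, 81, 200], dtype=np.uint8)
for kind in ("binary", "binary_inv", "trunc", "tozero", "tozero_inv"):
    print(kind, simple_threshold(row, 80, 255, kind))
```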

##### Adaptive Thresholding

1. cv.ADAPTIVE_THRESH_MEAN_C: The threshold value is the mean of the neighbourhood area minus the constant C.
2. cv.ADAPTIVE_THRESH_GAUSSIAN_C: The threshold value is a gaussian-weighted sum of the neighbourhood values minus the constant C.

_The blockSize determines the size of the neighbourhood area and C is a constant that is subtracted from the mean or weighted sum of the neighbourhood pixels._


In [None]:
import cv2

img = cv2.imread('../bear.png')
cv2.imshow('BGR image', img)

img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
cv2.imshow('Gray-Scale image', img_gray)

# The method returns two outputs. The first is the threshold that was used and the second output is the thresholded image.
ret, threshold = cv2.threshold(
    src=img_gray, thresh=80, maxval=255, type=cv2.THRESH_BINARY)
cv2.imshow('THRESH-BINARY image', threshold)

# Blur the thresholded image, then threshold the blurred result again to smooth the mask
threshold = cv2.blur(threshold, (7, 7))
ret, threshold = cv2.threshold(
    src=threshold, thresh=80, maxval=255, type=cv2.THRESH_BINARY)
cv2.imshow('BLUR-THRESH-BINARY image', threshold)


cv2.waitKey(0)

-1

_Adaptive Thresholding : the algorithm determines the threshold for a pixel based on a small region around it. So we get different thresholds for different regions of the same image which gives better results for images with varying illumination._
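
A rough NumPy sketch of the mean variant, to show why a local threshold copes with uneven lighting. The helper `adaptive_mean_threshold` is hypothetical and written for clarity, not speed; it is an illustration of the idea and is not claimed to match `cv2.adaptiveThreshold` pixel for pixel.

```python
import numpy as np


def adaptive_mean_threshold(gray, max_value, block_size, c):
    """Threshold each pixel against (mean of its neighbourhood - c)."""
    pad = block_size // 2
    # Repeat the border pixels so every pixel has a full neighbourhood
    padded = np.pad(gray.astype(np.float64), pad, mode="edge")
    out = np.zeros_like(gray)
    for y in range(gray.shape[0]):
        for x in range(gray.shape[1]):
            window = padded[y:y + block_size, x:x + block_size]
            if gray[y, x] > window.mean() - c:
                out[y, x] = max_value
    return out


# Bright left half, dark right half, one darker mark in each half
img = np.empty((7, 14), dtype=np.uint8)
img[:, :7] = 200
img[:, 7:] = 100
img[3, 3] = 150
img[3, 10] = 50

global_result = np.where(img > 120, 255, 0).astype(np.uint8)
adaptive_result = adaptive_mean_threshold(img, 255, block_size=3, c=5)

# Global threshold: the left mark is lost and the whole right half turns black
print(global_result[3, 3], global_result[0, 13])
# Adaptive threshold: both marks are black, the background in both halves is white
print(adaptive_result[3, 3], adaptive_result[3, 10], adaptive_result[0, 0], adaptive_result[0, 13])
```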


In [None]:
img = cv2.imread('../paper.png')
cv2.imshow('BGR image', img)

img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
cv2.imshow('Gray-Scale image', img_gray)

ret, threshold = cv2.threshold(
    src=img_gray, thresh=80, maxval=255, type=cv2.THRESH_BINARY)
cv2.imshow('THRESH-BINARY image', threshold)

adaptive_threshold = cv2.adaptiveThreshold(
    img_gray, 250, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 21, 30)
cv2.imshow('ADAPTIVE-THRESH-GAUSS image', adaptive_threshold)

cv2.waitKey(0)

-1

#### Edge Detection : identify sharp changes in brightness (boundaries between different regions) that typically signify object boundaries, edges, lines, or textures.

1. Sobel : discrete differentiation operator that computes an approximation of the gradient of the image intensity function. It emphasizes edge detection in both horizontal and vertical directions by combining Gaussian smoothing and differentiation, making it less sensitive to noise.

2. Laplacian : second-order derivative operator that highlights regions of rapid intensity change, effective in edge detection. Unlike Sobel, which is directional, the Laplacian is non-directional and detects edges in all directions.

3. Canny : It combines gradient-based edge detection with advanced logic to ensure that the detected edges are thin, connected, and free from noise. Canny Edge Detection is one of the most popular edge-detection methods in use today because it is so robust and flexible. The algorithm itself follows a three-stage process for extracting edges from an image. Add to it image blurring, a necessary preprocessing step to reduce noise. This makes it a four-stage process, which includes:

   1. Noise Reduction
   2. Calculating the Intensity Gradient of the Image
   3. Suppression of False Edges
   4. Hysteresis Thresholding
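
Stage 4 is the least obvious one, so here is a one-dimensional sketch of the idea: pixels above the upper threshold are kept, pixels between the two thresholds are kept only if they connect to a kept pixel. The function `hysteresis_1d`, the sample values and the use of `>=` are assumptions for illustration; the real algorithm works on a 2D gradient image.

```python
import numpy as np


def hysteresis_1d(magnitude, low, high):
    """Keep strong values, plus weak values connected to a kept neighbour."""
    strong = magnitude >= high
    weak = (magnitude >= low) & ~strong
    keep = strong.copy()
    changed = True
    while changed:  # repeat until no more weak values get attached
        changed = False
        for i in np.flatnonzero(weak & ~keep):
            left = i > 0 and keep[i - 1]
            right = i < len(magnitude) - 1 and keep[i + 1]
            if left or right:
                keep[i] = True
                changed = True
    return keep


gradient = np.array([10, 120, 150, 250, 130, 20, 140, 30])
edges = hysteresis_1d(gradient, low=100, high=200)
print(edges)
```

The value 140 at index 6 is above the lower threshold but is dropped, because it is not connected to any strong value.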

_Edge in an image represents a boundary where there is a significant change in intensity or color. Edge detection is crucial for understanding the structure and features within an image, aiding in tasks like object recognition, segmentation, and tracking. Edges are typically detected by identifying areas with high intensity gradients, which can be achieved using various operators that compute derivatives of the image intensity function._

_Note:- To detect edges in color images, we typically convert the image to grayscale (or to a color space where intensity and color information are more easily separable), or we compute gradients in each channel separately._

######

=============================================================================


_Sobel Edge-Detection_ -> cv2.Sobel(src, ddepth, dx, dy, ksize=3)

-

1. src: Input image (should be grayscale for edge detection).
2. dst (C++ only): Output image where the result is stored.
3. ddepth: Desired depth of the output image (e.g., CV_64F allows negative gradients).
4. dx: Order of the derivative in the x-direction (set to 1 to measure intensity changes along x, which highlights vertical edges).
5. dy: Order of the derivative in the y-direction (set to 1 to measure intensity changes along y, which highlights horizontal edges).
6. ksize: Size of the extended Sobel kernel (must be 1, 3, 5, or 7; pass ksize=-1, i.e. cv2.FILTER_SCHARR, to use the 3×3 Scharr operator).
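
To make dx and dy concrete, this NumPy sketch applies the standard 3×3 Sobel kernels by hand to a synthetic image with a single vertical edge. The helper `filter3x3` is made up for this example and skips the border pixels, so it is a simplification of what `cv2.Sobel` does.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# 3x3 Sobel kernels: SOBEL_X differentiates along x, SOBEL_Y along y
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=np.float64)
SOBEL_Y = SOBEL_X.T


def filter3x3(img, kernel):
    """Slide the kernel over the image (interior pixels only, no border handling)."""
    windows = sliding_window_view(img.astype(np.float64), (3, 3))
    return (windows * kernel).sum(axis=(2, 3))


# Left half black, right half white -> one vertical edge
img = np.zeros((5, 6), dtype=np.uint8)
img[:, 3:] = 255

gx = filter3x3(img, SOBEL_X)  # responds to the vertical edge
gy = filter3x3(img, SOBEL_Y)  # stays zero: nothing changes along y
magnitude = np.hypot(gx, gy)

print(gx[0])
print(gy[0])
```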

-

<img src='../sobel.png'>


In [None]:
img = cv2.imread('../tiger.png')
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Gradient along the X-direction (responds to vertical edges)
sobelx = cv2.Sobel(src=img_gray, ddepth=cv2.CV_64F, dx=1, dy=0, ksize=3)
# Gradient along the Y-direction (responds to horizontal edges)
sobely = cv2.Sobel(src=img_gray, ddepth=cv2.CV_64F, dx=0, dy=1, ksize=3)

# Combine the horizontal and vertical components -> Overall Edge Strength
gradient_magnitude = cv2.magnitude(sobelx, sobely)
# Convert it to uint8 (8-bit Unsigned Integer)
gradient_magnitude = cv2.convertScaleAbs(gradient_magnitude)

cv2.imshow('Sobel Edge-Detection', gradient_magnitude)
cv2.waitKey(0)

-1

_Laplacian Edge-Detection_ -> cv2.Laplacian(src, ddepth, ksize=1)

-

1. src: Input image (usually in grayscale).
2. dst (C++ only): Output image to store the Laplacian result.
3. ddepth: Desired depth of the output image (e.g., CV_64F to capture negative values).
4. ksize: Size of the Laplacian kernel (must be odd and positive; typically 1, 3, 5, or 7). Use ksize=1 to apply a 3×3 kernel without scaling.

-

<img src='../laplacian.png'>


In [None]:
laplacian = cv2.Laplacian(src=img_gray, ddepth=cv2.CV_64F, ksize=3)

# Since the Laplacian output includes negative values (indicating direction of change),
# it’s converted to an absolute 8-bit format using convertScaleAbs to make it visually displayable.
laplacian = cv2.convertScaleAbs(laplacian)

cv2.imshow('Laplacian Edge-Detection', laplacian)

cv2.waitKey(0)

-1

_Canny Edge-Detection_ -> cv2.Canny(image, threshold1, threshold2)

-

1. image: Input image (must be in grayscale).
2. edges (C++ only): Output image where edges will be marked as white (255) on a black background.
3. threshold1: Lower boundary for the hysteresis thresholding.
4. threshold2: Upper boundary for the hysteresis thresholding.

-

<img src='../noise-reduction.png'>
<img src='../intensity-gradient.png'>
<img src='../suppression.png'>
<img src='../hysteresis.png'>


In [None]:
# Apply Gaussian Blur -> reduces noise and smoothens the image to prevent false edges.
img_gray_blur = cv2.GaussianBlur(src=img_gray, ksize=(5, 5), sigmaX=1.4)

# Select the threshold values by trial & error.
# Canny detects edges by computing gradients and applying non-maximum suppression followed by hysteresis thresholding.
edges = cv2.Canny(image=img_gray_blur, threshold1=100, threshold2=200)
cv2.imshow('Canny Edge-Detection', edges)

cv2.waitKey(0)

-1

#### Morphological Operations

Morphological operations are image processing techniques that process images based on their shapes. These operations are typically applied to binary images but can also work on grayscale images. They are used to remove noise, separate objects, or enhance certain structures in an image.

1. Erosion:
   _Shrinks the white regions (foreground) in an image. Removes small noise by eroding boundaries of objects._

2. Dilation:
   _Expands the white regions (foreground) in an image. Useful for connecting broken parts of an object._

3. Opening: Erosion then Dilation
   _Removes small noise while preserving the shape of larger objects._

4. Closing: Dilation then Erosion
   _Fills small holes or gaps in the foreground._

5. Morphological Gradient:
   _The difference between dilation and erosion. Highlights the edges of objects._

6. Top Hat:
   _The difference between the input image and its opening. Useful for extracting small bright regions on a dark background._

7. Black Hat:
   _The difference between the closing of the image and the input image. Useful for extracting small dark regions on a bright background._
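
The definitions above can be spelled out in a few lines of NumPy on tiny binary images. The `erode` and `dilate` helpers below are simplified stand-ins (square kernel, fixed padding), written only to show how the other operations are built from them; they are not OpenCV's implementation.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def erode(img, k=3):
    """Each pixel becomes the minimum of its k x k neighbourhood."""
    padded = np.pad(img, k // 2, mode="constant", constant_values=255)
    return sliding_window_view(padded, (k, k)).min(axis=(2, 3))


def dilate(img, k=3):
    """Each pixel becomes the maximum of its k x k neighbourhood."""
    padded = np.pad(img, k // 2, mode="constant", constant_values=0)
    return sliding_window_view(padded, (k, k)).max(axis=(2, 3))


# Clean image: a white 5x5 square on a black background
clean = np.zeros((9, 9), dtype=np.uint8)
clean[2:7, 2:7] = 255

# Same square plus one isolated white noise pixel
noisy = clean.copy()
noisy[0, 8] = 255

# Same square with a one-pixel black hole inside
holed = clean.copy()
holed[4, 4] = 0

opening = dilate(erode(noisy))            # removes the noise pixel
top_hat = noisy - opening                 # only the noise pixel remains
closing = erode(dilate(holed))            # fills the hole
black_hat = closing - holed               # only the hole remains
gradient = dilate(clean) - erode(clean)   # outline of the square

print(np.array_equal(opening, clean), np.array_equal(closing, clean))
```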

#####

==============================================================================


In [4]:
import cv2
import numpy as np

img = cv2.imread('../view.jpg')
cv2.imshow('BGR image', img)

img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
cv2.imshow('Gray-Scale image', img_gray)

# Structuring Element = Kernel
kernel = np.ones((5, 5), np.uint8)

cv2.waitKey(0)

-1

In [14]:
# Erosion : cv2.erode(img,kernel,iterations = 1)

cv2.imshow('Gray-Scale image', img_gray)

erosion = cv2.erode(img_gray, kernel=kernel, iterations=1)
cv2.imshow("Erosion", erosion)

cv2.waitKey(0)

-1

In [None]:
# Dilation : cv2.dilate(img,kernel,iterations = 1)

cv2.imshow('Gray-Scale image', img_gray)

dilation = cv2.dilate(img_gray, kernel=kernel, iterations=1)
cv2.imshow("Dilation", dilation)

cv2.waitKey(0)

-1

In [None]:
# Opening : cv2.morphologyEx(img, cv.MORPH_OPEN, kernel)
# Opening (Erosion ➜ Dilation)

cv2.imshow('Gray-Scale image', img_gray)

# Named "opening" to avoid shadowing Python's built-in open()
opening = cv2.morphologyEx(img_gray, cv2.MORPH_OPEN, kernel=kernel, iterations=1)
cv2.imshow("Opening", opening)

cv2.waitKey(0)

-1

In [None]:
# Closing : cv2.morphologyEx(img, cv.MORPH_CLOSE, kernel)
# Closing (Dilation ➜ Erosion)

cv2.imshow('Gray-Scale image', img_gray)

close = cv2.morphologyEx(img_gray, cv2.MORPH_CLOSE,
                         kernel=kernel, iterations=1)
cv2.imshow("Closing", close)

cv2.waitKey(0)

-1

In [None]:
# Morphological Gradient : cv2.morphologyEx(img, cv.MORPH_GRADIENT, kernel)
# Morphological Gradient (Dilation - Erosion)

cv2.imshow('Gray-Scale image', img_gray)

gradient = cv2.morphologyEx(
    img_gray, cv2.MORPH_GRADIENT, kernel=kernel, iterations=1)
cv2.imshow("Morphological Gradient", gradient)

cv2.waitKey(0)

-1

In [None]:
# Top Hat : cv2.morphologyEx(img, cv.MORPH_TOPHAT, kernel)
# Top Hat (Original - Opening)

cv2.imshow('Gray-Scale image', img_gray)

top_hat = cv2.morphologyEx(img_gray, cv2.MORPH_TOPHAT,
                           kernel=kernel, iterations=1)
cv2.imshow("Top Hat", top_hat)

cv2.waitKey(0)

-1

In [None]:
# Black Hat : cv2.morphologyEx(img, cv.MORPH_BLACKHAT, kernel)
# Black Hat (Closing - Original)

cv2.imshow('Gray-Scale image', img_gray)

black_hat = cv2.morphologyEx(
    img_gray, cv2.MORPH_BLACKHAT, kernel=kernel, iterations=1)
cv2.imshow("Black Hat", black_hat)

cv2.waitKey(0)

-1

#### [Drawing](https://docs.opencv.org/4.x/dc/da5/tutorial_py_drawing_functions.html)

_Learn to draw different geometric shapes with OpenCV_

1. Line
2. Rectangle
3. Circle
4. Text

- _To put text on an image, you need to specify:_

1. Text data that you want to write
2. Position coordinates of where you want to put it (i.e. the bottom-left corner where the text starts).
3. Font type (Check cv.putText() docs for supported fonts)
4. Font Scale (specifies the size of font)
5. Regular things like color, thickness, lineType etc. For a better look, lineType = cv.LINE_AA is recommended.

- _Common arguments as given below:_

-

1. img : The image where you want to draw the shapes
2. color : Color of the shape. for BGR, pass it as a tuple, eg: (255,0,0) for blue. For grayscale, just pass the scalar value.
3. thickness : Thickness of the line or circle etc. If -1 is passed for closed figures like circles, it will fill the shape. default thickness = 1
4. lineType : Type of line, whether 8-connected, anti-aliased line etc. By default, it is 8-connected. cv.LINE_AA gives anti-aliased line which looks great for curves.

-


In [None]:
# Create a black image
img = np.zeros((512, 512, 3), np.uint8)
cv2.imshow('Original', img)

# Draw a diagonal blue line with thickness of 5 px
line_img = cv2.line(img, (0, 0), (511, 511), (255, 0, 0), 5)
cv2.imshow('Line', line_img)

cv2.waitKey(0)

-1

_To draw an ellipse, we need to pass several arguments:_

1. center is the center location (x, y).
2. axes is the pair of axis sizes, given as half-lengths (half of the major axis length, half of the minor axis length).
3. angle is the angle of rotation of ellipse in anti-clockwise direction.
4. startAngle and endAngle denotes the starting and ending of ellipse arc measured in clockwise direction from major axis. i.e. giving values 0 and 360 gives the full ellipse.


In [None]:
ellipse = cv2.ellipse(img=img, center=(256, 256), axes=(
    100, 50), angle=0, startAngle=0, endAngle=180, color=(0, 0, 255), thickness=-1)

cv2.imshow('Ellipse', ellipse)

cv2.waitKey(0)

-1

_To draw a polygon, first you need the coordinates of its vertices. Make those points into an array of shape ROWSx1x2, where ROWS is the number of vertices, and it should be of type int32. Here we draw a small polygon with four vertices in yellow color._
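
The reshape step is easy to get wrong, so here is a minimal look at the shapes involved. The vertex coordinates are arbitrary example values.

```python
import numpy as np

# Four vertices, one (x, y) pair per row
vertices = np.array([[10, 5], [20, 30], [70, 20], [50, 10]], np.int32)
print(vertices.shape)  # ROWS x 2

# -1 lets NumPy work out ROWS from the number of points
reshaped = vertices.reshape((-1, 1, 2))
print(reshaped.shape)  # ROWS x 1 x 2
```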


In [15]:
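# Four vertices, one (x, y) pair per row
pts = np.array([[10, 5], [20, 30], [70, 20], [50, 10]], np.int32)
pts = pts.reshape((-1, 1, 2))

# isClosed=True joins the last vertex back to the first; (0, 255, 255) is yellow in BGR
poly = cv2.polylines(img=img, pts=[pts], isClosed=True, color=(0, 255, 255))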

cv2.imshow('Polygon', poly)

cv2.waitKey(0)

-1

In [None]:
board = cv2.imread('../board.png')
cv2.imshow('Board', board)

print(board.shape)

# Line
line_board = cv2.line(img=board, pt1=(50, 100), pt2=(
    30, 150), color=(0, 255, 0), thickness=5)
cv2.imshow('Line on Board', line_board)

# Rectangle : you need top-left corner and bottom-right corner of rectangle
rectangle_board = cv2.rectangle(img=board, pt1=(
    50, 70), pt2=(100, 150), color=(255, 255, 0), thickness=2)
cv2.imshow('Rectangle on Board', rectangle_board)

# Circle : you need its center coordinates and radius
circle_board = cv2.circle(img=board, center=(
    50, 100), radius=20, color=(0, 0, 255), thickness=5)
cv2.imshow('Circle on Board', circle_board)

# Text
text = "I am Ahmed Akram"
text_board = cv2.putText(img=board, text=text, org=(30, 70), fontFace=cv2.FONT_HERSHEY_SIMPLEX,
                         fontScale=0.5,
                         color=(255, 255, 255),
                         thickness=2)
cv2.imshow('Text on Board', text_board)

cv2.waitKey(0)

(198, 262, 3)


-1

#### [Contours](https://docs.opencv.org/3.4/d4/d73/tutorial_py_contours_begin.html)

_Contours can be explained simply as a curve joining all the continuous points (along the boundary), having same color or intensity. The contours are a useful tool for shape analysis and object detection and recognition._

_For better accuracy, use binary images. So before finding contours, apply threshold or canny edge detection.
Since OpenCV 3.2, findContours() no longer modifies the source image. In OpenCV 3.x it returns three values (a modified image, the contours, and the hierarchy); in OpenCV 4.x it returns only two (the contours and the hierarchy), which is the form used in the code below.
In OpenCV, finding contours is like finding a white object on a black background. So remember, the object to be found should be white and the background should be black._


In [None]:
import cv2

img = cv2.imread('../skybirds.png')
cv2.imshow('Birds', img)

img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
cv2.imshow('Bird - Gray', img_gray)

ret, thresh = cv2.threshold(img_gray, 90, 200, cv2.THRESH_BINARY)
cv2.imshow('Threshold', thresh)


contours, hierarchy = cv2.findContours(
    image=thresh, mode=cv2.RETR_TREE, method=cv2.CHAIN_APPROX_SIMPLE)

# contours is a Python list of all the contours in the image
for cnt in contours:
    if cv2.contourArea(cnt) > 150:
        # Wrap cnt in a list so it is drawn as one connected contour, on the color image
        cv2.drawContours(img, [cnt], -1, (0, 255, 0), 1)
        x1, y1, w, h = cv2.boundingRect(cnt)
        cv2.rectangle(img, (x1, y1), (x1 + w, y1 + h), (0, 255, 255), 2)

cv2.imshow('Contours', img)

cv2.waitKey(0)

-1

#### Human Vision

##### Perception of Color

_The human eye has 3 types of photoreceptor cells for color -> cones. The responsivity spectra of human cone cells are centered at blue, green and red. Trichromacy is a characteristic of humans and some animals._


#### Project 1 : Color Detector


In [18]:
import cv2
from PIL import Image
import numpy as np

In [None]:
# utils.py

# Helper Function
def get_limits(color):
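    # Insert the BGR values which you want to convert to HSV
    c = np.uint8([[color]])
    hsvC = cv2.cvtColor(c, cv2.COLOR_BGR2HSV)

    # Cast to int first: arithmetic on uint8 wraps around (e.g. hue 5 - 10)
    hue = int(hsvC[0][0][0])

    # OpenCV stores 8-bit hue in the range 0..179, so clamp to that range.
    # Colors near hue 0 (red) wrap around and need a second range, not handled here.
    lower_limit = max(hue - 10, 0), 100, 100
    upper_limit = min(hue + 10, 179), 255, 255

    lower_limit = np.array(lower_limit, dtype=np.uint8)
    upper_limit = np.array(upper_limit, dtype=np.uint8)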

    return lower_limit, upper_limit

In [None]:
cam = cv2.VideoCapture(0)
yellow = [0, 255, 255]  # Yellow in BGR colorspace

while True:
    ret, frame = cam.read()
    if not ret:  # stop if the camera returned no frame
        break
    hsv_img = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)

    lower, upper = get_limits(color=yellow)
    mask = cv2.inRange(hsv_img, lower, upper)

    mask_ = Image.fromarray(mask)
    bbox = mask_.getbbox()
    # print(bbox)
    if bbox is not None:
        x1, y1, x2, y2 = bbox
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 5)

    cv2.imshow('Frame', frame)

    if cv2.waitKey(40) & 0xFF == ord('q'):
        break

cam.release()
cv2.destroyAllWindows()

#### Face Anonymizer
