# OpenCV for Beginners

*This Jupyter notebook is designed to introduce beginners to OpenCV, covering the basics with practical implementations and theoretical explanations.*

## **1. Introduction to OpenCV**

**OpenCV** (Open Source Computer Vision Library) is an open-source computer vision and machine learning library developed to provide a common infrastructure for computer vision applications and to accelerate the use of machine perception in commercial products. Initially developed by Intel, it supports various programming languages such as C++, Python, Java, and MATLAB, and is available on different platforms including Windows, Linux, macOS, iOS, and Android.

### **Why OpenCV?**

- **Wide Range of Functions**: OpenCV contains more than 2500 optimized algorithms for computer vision and machine learning.
- **Real-Time Applications**: Designed for computational efficiency with a strong focus on real-time applications.
- **Community and Support**: Large community of developers and comprehensive documentation.

### **Applications of OpenCV**

- **Image Processing**: Manipulation and transformation of images.
- **Video Analysis**: Real-time video processing, object detection, and tracking.
- **Machine Learning**: Implementing algorithms for facial recognition, object classification, etc.
- **Augmented Reality**: Overlaying virtual content on real-world scenes.


## **2. Installing OpenCV in Google Colab**

Google Colab is a cloud-based Jupyter notebook environment that allows you to write and execute Python code through your browser. It comes with many libraries pre-installed, usually including OpenCV. Running `pip install opencv-python` makes sure the package is present; add `--upgrade` if you specifically want the newest release.


In [None]:
!pip install opencv-python

## **3. Importing Libraries**

We import the essential libraries required for image processing tasks:

- `cv2`: The OpenCV library in Python.
- `numpy`: A fundamental package for numerical computations, used for handling arrays.
- `matplotlib`: A plotting library for creating static, animated, and interactive visualizations.

In [None]:
import cv2
import numpy as np
from matplotlib import pyplot as plt

# For displaying videos
from IPython.display import clear_output


## **4. Reading and Displaying Images**

### **Theory**

#### **Reading Images**

- **Function**: `cv2.imread(filename, flags)`
- **Parameters**:
  - `filename`: Name of the image file to be loaded.
  - `flags`: Specifies the way in which the image should be read.
    - `cv2.IMREAD_COLOR`: Loads a color image.
    - `cv2.IMREAD_GRAYSCALE`: Loads image in grayscale mode.
    - `cv2.IMREAD_UNCHANGED`: Loads image as is (including alpha channel).

#### **Displaying Images**

- In Jupyter notebooks, we use `matplotlib.pyplot.imshow()` to display images.
- OpenCV uses BGR color space by default, while Matplotlib uses RGB. Therefore, we need to convert BGR images to RGB before displaying.

#### **Color Spaces**

- **BGR vs. RGB**: OpenCV uses BGR (Blue, Green, Red) while most libraries use RGB (Red, Green, Blue).
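
To make the channel order concrete, here is a minimal NumPy-only sketch. The tiny two-pixel image and the variable names are made up for illustration. For a 3-channel image, converting BGR to RGB amounts to reversing the last axis.

```python
import numpy as np

# A 1x2 image in OpenCV's BGR order: one pure-blue pixel, one pure-red pixel
pixels_bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the channel axis turns BGR into RGB
pixels_rgb = pixels_bgr[..., ::-1]

print(pixels_rgb[0, 0])  # the blue pixel, now written as (R, G, B)
print(pixels_rgb[0, 1])  # the red pixel, now written as (R, G, B)
```

If an image is displayed with Matplotlib without this conversion, reds and blues appear swapped.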

### **Image Loading Process**

1. **Upload**: Use Google Colab's `files.upload()` to upload images from your local machine.
2. **Read**: Use `cv2.imread()` to load the image into a NumPy array.
3. **Convert**: Change color space from BGR to RGB using `cv2.cvtColor()`.
4. **Display**: Use `plt.imshow()` to display the image.



In [None]:
# Upload an image from your local machine
from google.colab import files
uploaded = files.upload()

# files.upload() returns a dict keyed by file name; read the first uploaded file
img = cv2.imread(list(uploaded.keys())[0])

# Check if image was loaded
if img is None:
    print('Image not loaded. Please check the file name and path.')
else:
    # Convert BGR to RGB for display
    img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

    # Display the image
    plt.imshow(img_rgb)
    plt.axis('off')  # Hide axis
    plt.show()


## **5. Basic Image Operations**

Image processing involves manipulating images to achieve desired results. Basic operations include resizing, cropping, rotating, and translating images.

### **a. Resizing**

#### **Theory**

- **Resizing**: Changing the dimensions of an image.
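- **Function**: `cv2.resize(src, dsize, fx=..., fy=..., interpolation=...)`
- **Parameters**:
  - `src`: Input image.
  - `dsize`: Desired output size as `(width, height)`. If it is `(0, 0)` or `None`, the size is calculated from `src` using `fx` and `fy`.
  - `fx` and `fy`: Scale factors along the x and y axes. Pass them as keyword arguments, because the third positional argument is the optional output array `dst`.
  - `interpolation`: Method used for resizing.
    - Common methods: `cv2.INTER_LINEAR` (the default), `cv2.INTER_AREA` (usually preferred for shrinking), `cv2.INTER_CUBIC` (slower, often better for enlarging).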

#### **Mathematical Model**

Resizing an image involves mapping pixel coordinates from the original image to the new image based on scaling factors:

$$ x_{\text{new}} = x_{\text{old}} \times \text{fx} $$
$$ y_{\text{new}} = y_{\text{old}} \times \text{fy} $$
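
The coordinate mapping can be sketched with plain NumPy. This is a nearest-neighbour illustration under simplifying assumptions, not how `cv2.resize` works internally (its default is bilinear interpolation). The helper name `resize_nearest` is hypothetical. In practice the mapping is applied in reverse: for each output pixel, the source coordinate is found by dividing by the scale factor.

```python
import numpy as np

def resize_nearest(src, fx, fy):
    """Nearest-neighbour resize: look up the source pixel for every output pixel."""
    h, w = src.shape[:2]
    new_h, new_w = int(round(h * fy)), int(round(w * fx))
    # Inverse mapping: y_old = y_new / fy, x_old = x_new / fx (clamped to the image)
    ys = np.minimum((np.arange(new_h) / fy).astype(int), h - 1)
    xs = np.minimum((np.arange(new_w) / fx).astype(int), w - 1)
    return src[ys[:, None], xs[None, :]]

src_demo = np.arange(16, dtype=np.uint8).reshape(4, 4)
half = resize_nearest(src_demo, 0.5, 0.5)
print(half)  # every second row and column of src_demo
```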

In [None]:
# Resize the image to half its dimensions
resized_img = cv2.resize(img, (0,0), fx=0.5, fy=0.5)

# Display
plt.imshow(cv2.cvtColor(resized_img, cv2.COLOR_BGR2RGB))
plt.axis('off')
plt.show()

### **b. Cropping**

#### **Theory**

- **Cropping**: Extracting a region of interest (ROI) from an image.
- Achieved by slicing the NumPy array representing the image.
- Syntax: `cropped_img = img[startY:endY, startX:endX]`
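
One detail worth knowing: a NumPy slice is a view onto the original array, not a copy. The sketch below uses a small blank array (an assumption for illustration, no image file needed) to show the `[rows, columns]` order and the effect of `.copy()`.

```python
import numpy as np

canvas = np.zeros((6, 8, 3), dtype=np.uint8)    # height 6, width 8

view = canvas[1:4, 2:7]                  # rows 1..3, columns 2..6 (shares memory)
independent = canvas[1:4, 2:7].copy()    # separate copy of the same region

view[:] = 255                            # also modifies canvas
print(view.shape)                        # (height, width, channels) of the crop
print(canvas[1, 2], independent[0, 0])
```

Call `.copy()` on a crop whenever you plan to draw on it or modify it without touching the source image.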

In [None]:
# Crop the image [startY:endY, startX:endX]
cropped_img = img[100:400, 200:500]

# Display
plt.imshow(cv2.cvtColor(cropped_img, cv2.COLOR_BGR2RGB))
plt.axis('off')
plt.show()

### **c. Rotating**

#### **Theory**

- **Rotation**: Rotating an image by a certain angle around a pivot point.
- **Function**: `cv2.getRotationMatrix2D(center, angle, scale)`
- **Transformation Matrix**:

The rotation matrix $M$ for rotating an image is defined as:

$$ M = \begin{bmatrix}
\alpha & \beta & (1 - \alpha) \cdot center_x - \beta \cdot center_y \\
-\beta & \alpha & \beta \cdot center_x + (1 - \alpha) \cdot center_y \\
\end{bmatrix} $$


Where:

$$ \alpha = scale \cdot \cos(\theta) $$  
$$ \beta = scale \cdot \sin(\theta) $$



- **Warping**: Apply the rotation matrix to the image using `cv2.warpAffine()`.
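
As a sanity check on the matrix above, the following NumPy sketch builds $M$ directly from the formula. The helper name `rotation_matrix` and the sample center are assumptions for illustration. It confirms that the pivot point maps onto itself, which is what the translation column is designed to achieve.

```python
import numpy as np

def rotation_matrix(center, angle_deg, scale):
    """Build the 2x3 matrix from the formula: alpha = s*cos(theta), beta = s*sin(theta)."""
    cx, cy = center
    theta = np.deg2rad(angle_deg)
    alpha = scale * np.cos(theta)
    beta = scale * np.sin(theta)
    return np.array([
        [alpha, beta, (1 - alpha) * cx - beta * cy],
        [-beta, alpha, beta * cx + (1 - alpha) * cy],
    ])

M_demo = rotation_matrix((100, 50), 45, 1.0)

# Apply M to the center in homogeneous coordinates (x, y, 1)
mapped_center = M_demo @ np.array([100, 50, 1.0])
print(mapped_center)  # the pivot point stays where it is
```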

In [None]:
# Get image dimensions
(h, w) = img.shape[:2]
center = (w // 2, h // 2)

# Rotate the image by 45 degrees
M = cv2.getRotationMatrix2D(center, 45, 1.0)
rotated_img = cv2.warpAffine(img, M, (w, h))

# Display
plt.imshow(cv2.cvtColor(rotated_img, cv2.COLOR_BGR2RGB))
plt.axis('off')
plt.show()

## **6. Color Spaces**

### **Theory**

- **Color Spaces**: Different ways to represent the colors in an image.
- Common color spaces:
  - **RGB/BGR**: Red, Green, Blue.
  - **Grayscale**: Shades of gray, from black to white.
  - **HSV**: Hue, Saturation, Value.
  - **LAB**: Lightness and color-opponent dimensions.

#### **Conversions**

- Use `cv2.cvtColor()` to convert images between different color spaces.
- **Example**: Convert BGR to Grayscale using `cv2.COLOR_BGR2GRAY`.
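
To give a feel for what the grayscale conversion computes, here is a NumPy sketch using the luma weights that OpenCV documents for RGB-to-gray conversion (0.299 R + 0.587 G + 0.114 B). The sample pixels are made up, and the rounding in this sketch may differ from `cv2.cvtColor` by one intensity level.

```python
import numpy as np

# Two pixels in BGR order: white and pure green
sample_bgr = np.array([[[255, 255, 255], [0, 255, 0]]], dtype=np.uint8)

b = sample_bgr[..., 0].astype(float)
g = sample_bgr[..., 1].astype(float)
r = sample_bgr[..., 2].astype(float)

# Weighted sum of the channels; green contributes most to perceived brightness
sample_gray = np.rint(0.299 * r + 0.587 * g + 0.114 * b).astype(np.uint8)
print(sample_gray)
```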

### **Importance of Color Spaces**

- Different color spaces are useful for different types of image processing tasks.
- For example, HSV is useful for color-based segmentation because hue is largely separated from brightness, which makes color thresholds more robust to lighting changes.


In [None]:
# Convert BGR to Grayscale
gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Display
plt.imshow(gray_img, cmap='gray')
plt.axis('off')
plt.show()

## **7. Drawing Functions**

OpenCV provides functions to draw shapes like lines, rectangles, circles, and text on images.


### **a. Drawing a Line**

#### **Theory**

- **Function**: `cv2.line(img, pt1, pt2, color, thickness)`
- **Parameters**:
  - `img`: Image on which to draw.
  - `pt1`: Starting point (x1, y1).
  - `pt2`: Ending point (x2, y2).
  - `color`: Line color (B, G, R).
  - `thickness`: Line thickness.

In [None]:
# Copy the image to draw on
line_img = img.copy()

# Draw a line from top-left to bottom-right
cv2.line(line_img, (0, 0), (line_img.shape[1], line_img.shape[0]), (255, 0, 0), 5)

# Display
plt.imshow(cv2.cvtColor(line_img, cv2.COLOR_BGR2RGB))
plt.axis('off')
plt.show()

### **b. Drawing a Rectangle**

#### **Theory**

- **Function**: `cv2.rectangle(img, pt1, pt2, color, thickness)`
- **Parameters**:
  - `pt1`: One corner of the rectangle.
  - `pt2`: Opposite corner.

In [None]:
# Copy the image to draw on
rect_img = img.copy()

# Draw a rectangle
cv2.rectangle(rect_img, (100, 100), (400, 300), (0, 255, 0), 3)

# Display
plt.imshow(cv2.cvtColor(rect_img, cv2.COLOR_BGR2RGB))
plt.axis('off')
plt.show()

### **c. Drawing a Circle**

#### **Theory**

- **Function**: `cv2.circle(img, center, radius, color, thickness)`
- **Parameters**:
  - `center`: Center of the circle (x, y).
  - `radius`: Radius of the circle.
  - `thickness`: If `-1`, the circle is filled.

In [None]:
# Copy the image to draw on
circle_img = img.copy()

# Draw a circle at the center
center_coordinates = (circle_img.shape[1] // 2, circle_img.shape[0] // 2)
cv2.circle(circle_img, center_coordinates, 100, (0, 0, 255), -1)

# Display
plt.imshow(cv2.cvtColor(circle_img, cv2.COLOR_BGR2RGB))
plt.axis('off')
plt.show()

## **8. Image Arithmetic**

### **Theory**

#### **Image Addition**

- Combining two images by adding the pixel values.
- **Function**: `cv2.add(img1, img2)`
- The result saturates: for 8-bit images, sums above 255 are clipped to 255, unlike NumPy's `+`, which wraps around.

#### **Blending**

- Weighted addition of two images to create a blending effect.
- **Function**: `cv2.addWeighted(img1, alpha, img2, beta, gamma)`
- **Formula**:

$$
\text{Result} = \alpha \times \text{img1} + \beta \times \text{img2} + \gamma
$$

Where:

- $\alpha$ and $\beta$ are weights.
- $\gamma$ is a scalar added to each sum.

#### **Requirements**

- Images must be of the same size and type.
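
The difference between saturating and wrapping addition, and the blending formula, can be illustrated with NumPy alone. This is a sketch with made-up pixel values; the saturated sum and the blend are emulated with `np.clip` rather than by calling OpenCV.

```python
import numpy as np

a = np.array([250, 100], dtype=np.uint8)
b = np.array([10, 200], dtype=np.uint8)

# NumPy uint8 arithmetic wraps modulo 256
wrapped = a + b

# Saturating addition: compute in a wider type, then clip to [0, 255]
saturated = np.clip(a.astype(np.int16) + b.astype(np.int16), 0, 255).astype(np.uint8)

# Weighted blend: alpha * img1 + beta * img2 + gamma
alpha, beta, gamma = 0.7, 0.3, 0
blended = np.clip(np.rint(alpha * a + beta * b + gamma), 0, 255).astype(np.uint8)

print(wrapped, saturated, blended)
```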



In [None]:
# Upload another image
uploaded = files.upload()
img2 = cv2.imread(list(uploaded.keys())[0])

# Resize second image to match the first image
img2 = cv2.resize(img2, (img.shape[1], img.shape[0]))

# Blend images
blended_img = cv2.addWeighted(img, 0.7, img2, 0.3, 0)

# Display
plt.imshow(cv2.cvtColor(blended_img, cv2.COLOR_BGR2RGB))
plt.axis('off')
plt.show()

## **9. Thresholding**

### **Theory**

- **Thresholding**: Binarization of an image based on pixel intensity.
- Converts grayscale images to binary images.
- **Types of Thresholding**:
  - **Global Thresholding**: Single threshold value for the entire image.
  - **Adaptive Thresholding**: Threshold value calculated for smaller regions.
  - **Otsu's Binarization**: Automatic threshold calculation.

#### **Global Thresholding**

- **Function**: `cv2.threshold(src, thresh, maxval, type)`
- **Parameters**:
  - `thresh`: Threshold value.
  - `maxval`: Maximum value assigned to pixel values exceeding the threshold.
  - `type`: Thresholding type (e.g., `cv2.THRESH_BINARY`).

#### **Mathematical Model**

$$
\text{dst}(x,y) = \begin{cases}
\text{maxval} & \text{if } \text{src}(x,y) > \text{thresh} \\
0 & \text{otherwise}
\end{cases}
$$
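
The piecewise rule above translates almost literally into NumPy. The function name `threshold_binary` and the sample values are assumptions for illustration. Note that the comparison is strict, so a pixel equal to the threshold maps to 0.

```python
import numpy as np

def threshold_binary(src, thresh, maxval):
    """dst = maxval where src > thresh, else 0."""
    return np.where(src > thresh, maxval, 0).astype(np.uint8)

gray_demo = np.array([[0, 127, 128],
                      [200, 50, 255]], dtype=np.uint8)
binary_demo = threshold_binary(gray_demo, 127, 255)
print(binary_demo)
```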




In [None]:
# Apply global thresholding
ret, thresh_global = cv2.threshold(gray_img, 127, 255, cv2.THRESH_BINARY)

# Display
plt.imshow(thresh_global, cmap='gray')
plt.axis('off')
plt.show()

## **10. Blurring and Smoothing**

### **Theory**

- **Purpose**: Reduce noise and details in images.
- **Methods**:
  - **Averaging (Mean Filter)**: Replaces each pixel value with the average of its neighbors.
  - **Gaussian Blurring**: Weights pixels based on Gaussian distribution.
  - **Median Blurring**: Replaces each pixel value with the median of its neighbors.


#### **a. Averaging**

- **Function**: `cv2.blur(src, ksize)`
- **Kernel**: A matrix where all elements are equal and sum to 1.

In [None]:
# Apply average blurring
avg_blur = cv2.blur(img, (5, 5))

# Display
plt.imshow(cv2.cvtColor(avg_blur, cv2.COLOR_BGR2RGB))
plt.axis('off')
plt.show()

#### **b. Gaussian Blur**

- **Function**: `cv2.GaussianBlur(src, ksize, sigmaX)`
- **Kernel**: Gaussian kernel calculated using the Gaussian function.

#### **Gaussian Function**:

$$
G(x, y) = \frac{1}{2\pi\sigma^2} e^{-\frac{x^2 + y^2}{2\sigma^2}}
$$
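
A kernel can be sampled straight from this function. The sketch below is a simplified illustration with a hypothetical helper `gaussian_kernel`; OpenCV builds its kernels separably and may pick coefficients differently. After sampling, the kernel is normalized so that it sums to 1 and the blurred image keeps its overall brightness.

```python
import numpy as np

def gaussian_kernel(ksize, sigma):
    """Sample G(x, y) on a ksize x ksize grid centered at the origin and normalize."""
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    g = np.exp(-(x**2 + y**2) / (2 * sigma**2)) / (2 * np.pi * sigma**2)
    return g / g.sum()

kernel_demo = gaussian_kernel(5, 1.0)
print(np.round(kernel_demo, 3))  # largest weight in the middle, falling off outward
```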



In [None]:
# Apply Gaussian blurring
gauss_blur = cv2.GaussianBlur(img, (5, 5), 0)

# Display
plt.imshow(cv2.cvtColor(gauss_blur, cv2.COLOR_BGR2RGB))
plt.axis('off')
plt.show()

#### **c. Median Blur**

- **Function**: `cv2.medianBlur(src, ksize)`
- Effective for removing salt-and-pepper noise.
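
A tiny made-up example shows why the median copes well with isolated outliers. In a 3x3 neighborhood with one "salt" pixel, the mean is pulled toward the outlier while the median ignores it.

```python
import numpy as np

patch = np.full((3, 3), 10, dtype=np.uint8)
patch[1, 1] = 255  # a single "salt" pixel

mean_value = patch.mean()        # pulled upward by the outlier
median_value = np.median(patch)  # unaffected by a single outlier

print(round(mean_value, 2), median_value)
```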

In [None]:
# Apply median blurring
median_blur = cv2.medianBlur(img, 5)

# Display
plt.imshow(cv2.cvtColor(median_blur, cv2.COLOR_BGR2RGB))
plt.axis('off')
plt.show()

## **11. Edge Detection**

### **Theory**

- **Edges**: Significant transitions in pixel intensity.
- **Purpose**: Identify object boundaries and features.

#### **Canny Edge Detector**

- **Function**: `cv2.Canny(image, threshold1, threshold2)`
- **Steps**:
  1. Noise reduction using Gaussian filter.
  2. Finding intensity gradient of the image.
  3. Non-maximum suppression to get rid of spurious response to edge detection.
  4. Hysteresis thresholding to decide potential edges.

#### **Mathematical Concepts**

- **Gradient Calculation**: Using Sobel operators to find the gradient magnitude and direction.
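
The gradient step can be sketched for a single pixel using the standard 3x3 Sobel kernels. The patch below is a made-up vertical step edge (dark on the left, bright on the right), so the gradient should point along the x axis.

```python
import numpy as np

sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
sobel_y = sobel_x.T

# 3x3 neighborhood around a pixel on a vertical edge
patch = np.array([[0, 0, 255]] * 3, dtype=float)

gx = (sobel_x * patch).sum()   # horizontal intensity change
gy = (sobel_y * patch).sum()   # vertical intensity change

magnitude = np.hypot(gx, gy)
direction_deg = np.degrees(np.arctan2(gy, gx))
print(magnitude, direction_deg)
```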


In [None]:
# Apply Canny Edge Detection
edges = cv2.Canny(img, 100, 200)

# Display
plt.imshow(edges, cmap='gray')
plt.axis('off')
plt.show()

## **12. Contours**

### **Theory**

- **Contours**: Curves joining all continuous points along the boundary with the same color or intensity.
- **Applications**: Shape analysis, object detection, recognition.

#### **Finding Contours**

- **Function**: `cv2.findContours(image, mode, method)`
- **Parameters**:
  - `mode`: Retrieval mode (e.g., `cv2.RETR_TREE`).
  - `method`: Approximation method (e.g., `cv2.CHAIN_APPROX_SIMPLE`).

#### **Drawing Contours**

- **Function**: `cv2.drawContours(image, contours, contourIdx, color, thickness)`



In [None]:
# Convert to grayscale and apply a binary threshold
ret, thresh = cv2.threshold(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY), 127, 255, cv2.THRESH_BINARY)

# Find contours
contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

# Draw contours
contour_img = img.copy()
cv2.drawContours(contour_img, contours, -1, (0, 255, 0), 3)

# Display
plt.imshow(cv2.cvtColor(contour_img, cv2.COLOR_BGR2RGB))
plt.axis('off')
plt.show()

## **13. Histograms**

### **Theory**

- **Histogram**: A graphical representation of the distribution of pixel intensities.
- **Uses**:
  - Analyze image contrast, brightness.
  - Perform histogram equalization for contrast enhancement.

#### **Calculating Histograms**

- **Function**: `cv2.calcHist(images, channels, mask, histSize, ranges)`
- **Parameters**:
  - `images`: Source image.
  - `channels`: Index of the channel (e.g., `[0]` for grayscale).
  - `histSize`: Number of bins.
  - `ranges`: Range of pixel values.
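
For a single-channel 8-bit image, the histogram is simply a count of how often each intensity occurs. The NumPy sketch below shows this on a made-up 2x3 image; `cv2.calcHist` returns the same counts, but as a float column of shape `(256, 1)`.

```python
import numpy as np

tiny_gray = np.array([[0, 0, 10],
                      [10, 10, 255]], dtype=np.uint8)

# One bin per intensity value 0..255
hist_demo = np.bincount(tiny_gray.ravel(), minlength=256)

print(hist_demo[0], hist_demo[10], hist_demo[255])
```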

In [None]:
# Calculate histogram
hist = cv2.calcHist([gray_img], [0], None, [256], [0, 256])

# Plot histogram
plt.plot(hist)
plt.title('Grayscale Histogram')
plt.xlabel('Pixel Intensity')
plt.ylabel('Frequency')
plt.show()

---

## **14. Morphological Operations**

### **Theory**

- **Purpose**: Remove imperfections introduced during thresholding.
- **Types**:
  - **Erosion**: Removes pixels on object boundaries.
  - **Dilation**: Adds pixels to object boundaries.
  - **Opening**: Erosion followed by dilation.
  - **Closing**: Dilation followed by erosion.

#### **Structuring Element (Kernel)**

- Defines the neighborhood over which the operation is applied.
- Commonly a square or circular matrix.

#### **a. Erosion**

- **Function**: `cv2.erode(src, kernel, iterations)`
- **Effect**: Shrinks bright regions.

#### **Mathematical Model**

$$
\text{Erosion}: A \ominus B = \{ z | (B)_z \subseteq A \}
$$

In [None]:
# Define kernel
kernel = np.ones((5,5), np.uint8)

# Apply erosion
erosion = cv2.erode(thresh_global, kernel, iterations=1)

# Display
plt.imshow(erosion, cmap='gray')
plt.axis('off')
plt.show()

#### **b. Dilation**

- **Function**: `cv2.dilate(src, kernel, iterations)`
- **Effect**: Expands bright regions.

#### **Mathematical Model**

$$
\text{Dilation}: A \oplus B = \{ z | (\hat{B})_z \cap A \neq \emptyset \}
$$

Where $A$ is the image, $B$ is the structuring element, $(B)_z$ is $B$ translated by $z$, and $\hat{B}$ is the reflection of $B$ about its origin.
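
For a binary image and a square structuring element, both set definitions reduce to a minimum (erosion) or maximum (dilation) over each pixel's neighborhood. The sketch below is a NumPy illustration with a hypothetical helper `binary_morph`. It treats pixels outside the image as background, which is an assumption of this sketch and not necessarily OpenCV's border handling.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def binary_morph(img_bin, ksize, op):
    """Erode or dilate a binary image with a ksize x ksize square element."""
    pad = ksize // 2
    padded = np.pad(img_bin, pad, mode="constant", constant_values=0)
    windows = sliding_window_view(padded, (ksize, ksize))
    if op == "erode":
        return windows.min(axis=(2, 3))   # 1 only if the whole neighborhood is 1
    return windows.max(axis=(2, 3))       # 1 if any neighbor is 1

square = np.zeros((5, 5), dtype=np.uint8)
square[1:4, 1:4] = 1                      # 3x3 block of foreground

eroded = binary_morph(square, 3, "erode")
dilated = binary_morph(square, 3, "dilate")
print(eroded)
print(dilated)
```

Erosion shrinks the 3x3 block to its center pixel, while dilation grows it by one pixel on every side.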



In [None]:
# Apply dilation
dilation = cv2.dilate(thresh_global, kernel, iterations=1)

# Display
plt.imshow(dilation, cmap='gray')
plt.axis('off')
plt.show()

## **15. Image Transformations**

Image transformations involve changing the geometric configuration of an image.


### **a. Affine Transformation**

#### **Theory**

- Preserves lines and parallelism (not necessarily distances and angles).
- **Transformation Matrix**:

$$
\begin{bmatrix}
x' \\
y' \\
\end{bmatrix}
=
\begin{bmatrix}
a_{00} & a_{01} \\
a_{10} & a_{11} \\
\end{bmatrix}
\begin{bmatrix}
x \\
y \\
\end{bmatrix}
+
\begin{bmatrix}
b_0 \\
b_1 \\
\end{bmatrix}
$$

- **Function**: `cv2.getAffineTransform(pts1, pts2)`
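
Three point pairs determine the six unknowns of the affine matrix. The sketch below solves for them with NumPy as an illustration of the underlying linear system; it is not a description of how OpenCV implements the function. The variable names are made up for the example.

```python
import numpy as np

pts_src = np.array([[50, 50], [200, 50], [50, 200]], dtype=float)
pts_dst = np.array([[10, 100], [200, 50], [100, 250]], dtype=float)

# Each source point in homogeneous form (x, y, 1)
src_h = np.hstack([pts_src, np.ones((3, 1))])

# Solve src_h @ X = pts_dst; the 2x3 affine matrix is X transposed
M_affine = np.linalg.solve(src_h, pts_dst).T

# Check: applying M to the source points reproduces the destination points
mapped = src_h @ M_affine.T
print(np.round(M_affine, 3))
```

The three source points must not lie on one line, otherwise the system has no unique solution.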

In [None]:
# Define points for transformation
rows, cols = img.shape[:2]
pts1 = np.float32([[50,50], [200,50], [50,200]])
pts2 = np.float32([[10,100], [200,50], [100,250]])

# Get the transformation matrix
M = cv2.getAffineTransform(pts1, pts2)

# Apply affine transformation
affine_img = cv2.warpAffine(img, M, (cols, rows))

# Display
plt.imshow(cv2.cvtColor(affine_img, cv2.COLOR_BGR2RGB))
plt.axis('off')
plt.show()

### **b. Perspective Transformation**

#### **Theory**

- Straight lines remain straight, but parallel lines may not remain parallel.
- **Transformation Matrix**:

$$
\begin{bmatrix}
x' \\
y' \\
w \\
\end{bmatrix}
=
\begin{bmatrix}
p_{00} & p_{01} & p_{02} \\
p_{10} & p_{11} & p_{12} \\
p_{20} & p_{21} & p_{22} \\
\end{bmatrix}
\begin{bmatrix}
x \\
y \\
1 \\
\end{bmatrix}
$$

The final coordinates are obtained by dividing by $w$: $x_{\text{new}} = x'/w$ and $y_{\text{new}} = y'/w$.

- **Function**: `cv2.getPerspectiveTransform(pts1, pts2)`
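
The division by $w$ is what distinguishes a perspective transform from an affine one. The sketch below applies a made-up 3x3 matrix to one point; `apply_homography` and `H_demo` are hypothetical names chosen for illustration.

```python
import numpy as np

def apply_homography(H, point):
    """Map (x, y) through a 3x3 matrix, then divide by the third coordinate w."""
    x, y = point
    xp, yp, w = H @ np.array([x, y, 1.0])
    return xp / w, yp / w

# Identity except for a small perspective term in the last row
H_demo = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [0.001, 0.0, 1.0]])

x_new, y_new = apply_homography(H_demo, (100, 50))
print(x_new, y_new)  # w = 1.1 here, so the point is pulled toward the origin
```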

In [None]:
# Define points for transformation
pts1 = np.float32([[56,65], [368,52], [28,387], [389,390]])
pts2 = np.float32([[0,0], [300,0], [0,300], [300,300]])

# Get the transformation matrix
M = cv2.getPerspectiveTransform(pts1, pts2)

# Apply perspective transformation
perspective_img = cv2.warpPerspective(img, M, (300, 300))

# Display
plt.imshow(cv2.cvtColor(perspective_img, cv2.COLOR_BGR2RGB))
plt.axis('off')
plt.show()

---

## **16. Video Processing**

### **Theory**

- Processing video frames in real-time.
- Capturing video involves reading frames in a loop.

### **a. Capturing Video from Webcam**

#### **Note**

- Not supported in Google Colab due to lack of access to local hardware.

#### **Function**

- **Initialization**: `cv2.VideoCapture(0)`
- **Reading Frames**: `cap.read()`
- **Displaying Frames**: `cv2.imshow()`

In [None]:
# This code may not work in Google Colab due to lack of webcam access

# Initialize webcam
cap = cv2.VideoCapture(0)

while True:
    ret, frame = cap.read()
    if not ret:
        break

    # Display the resulting frame
    cv2.imshow('Webcam', frame)

    # Press 'q' to exit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release the capture
cap.release()
cv2.destroyAllWindows()

### **b. Processing Video File**

#### **Theory**

- Similar to image processing, but applied to each frame.
- Useful for tasks like video stabilization, object tracking.

In [None]:
# Upload a video file
uploaded = files.upload()
video_path = list(uploaded.keys())[0]

# Open the video file
cap = cv2.VideoCapture(video_path)

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

    # Convert to grayscale
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Display frame
    plt.imshow(gray, cmap='gray')
    plt.axis('off')
    plt.show()

    # Clear output to display next frame
    clear_output(wait=True)

cap.release()

## **17. Face Detection with Haar Cascades**

### **Theory**

- **Haar Cascades**: Machine learning-based approach where a cascade function is trained from a lot of positive and negative images.
- **Features**: Uses Haar-like features which are digital image features used in object recognition.

#### **Algorithm Steps**

1. **Haar Feature Selection**: Identifies features such as edges and lines.
2. **Integral Images**: Quickly computes the sum of values in a rectangular subset of a grid.
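3. **AdaBoost Training**: Selects the best features and combines many weak classifiers into a strong classifier.
4. **Cascading Classifiers**: Arranges the classifiers in stages so that windows that clearly contain no face are rejected early, and only promising windows reach the later, more expensive stages.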
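
The integral image from step 2 can be sketched in a few lines of NumPy. Once it is built, the sum of any rectangle takes four lookups regardless of the rectangle's size. The helper `rect_sum` and the sample values are assumptions for illustration; the extra row and column of zeros mirror the layout that `cv2.integral` returns.

```python
import numpy as np

values = np.arange(1, 17).reshape(4, 4)

# Integral image with a leading row and column of zeros
integral = np.zeros((5, 5), dtype=np.int64)
integral[1:, 1:] = values.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, top, left, bottom, right):
    """Sum of values[top:bottom, left:right] using four lookups."""
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

fast = rect_sum(integral, 1, 1, 3, 4)
slow = values[1:3, 1:4].sum()
print(fast, slow)
```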

#### **Function**

- **Loading Classifier**: `cv2.CascadeClassifier()`
- **Detection**: `detectMultiScale()`

In [None]:
# Load the cascade
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

# Convert to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Detect faces
faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=4)

# Draw rectangles on a copy so the original image stays unmodified for later cells
face_img = img.copy()
for (x, y, w, h) in faces:
    cv2.rectangle(face_img, (x, y), (x+w, y+h), (255, 0, 0), 2)

# Display the output
plt.imshow(cv2.cvtColor(face_img, cv2.COLOR_BGR2RGB))
plt.axis('off')
plt.show()

## **18. Feature Detection using ORB**

### **Theory**

- **ORB (Oriented FAST and Rotated BRIEF)**: Efficient alternative to SIFT and SURF.
- **Features**:
  - **FAST**: Features from Accelerated Segment Test for keypoint detection.
  - **BRIEF**: Binary Robust Independent Elementary Features for descriptor extraction.

#### **Steps**

1. **Keypoint Detection**: Identifies points of interest in the image.
2. **Descriptor Computation**: Describes the neighborhood of keypoints.
3. **Feature Matching**: Matching descriptors between images.
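
Because BRIEF-style descriptors are binary strings, matching them comes down to counting differing bits (the Hamming distance). The sketch below uses made-up 2-byte descriptors to keep the example readable; real ORB descriptors are 32 bytes long. The helper name is hypothetical.

```python
import numpy as np

def hamming_distance(d1, d2):
    """Number of bits that differ between two uint8 descriptor arrays."""
    return int(np.unpackbits(np.bitwise_xor(d1, d2)).sum())

query = np.array([0b11110000, 0b10101010], dtype=np.uint8)
candidates = np.array([
    [0b00001111, 0b10101010],   # first byte fully inverted
    [0b11110001, 0b10101010],   # differs in a single bit
], dtype=np.uint8)

distances = [hamming_distance(query, c) for c in candidates]
best = int(np.argmin(distances))
print(distances, best)
```

In OpenCV, a brute-force matcher created with `cv2.NORM_HAMMING` performs this kind of comparison across all descriptor pairs.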

#### **Advantages**

- Fast and computationally efficient.
- Rotation invariant, and tolerant of moderate scale changes thanks to an image pyramid.



In [None]:
# Initialize ORB detector
orb = cv2.ORB_create()

# Find the keypoints with ORB
kp = orb.detect(img, None)

# Compute the descriptors
kp, des = orb.compute(img, kp)

# Draw only keypoints location (not size and orientation)
img_with_kp = cv2.drawKeypoints(img, kp, None, color=(0,255,0), flags=0)

# Display the image
plt.imshow(cv2.cvtColor(img_with_kp, cv2.COLOR_BGR2RGB))
plt.axis('off')
plt.show()

## **19. Object Tracking with Camshift**

### **Theory**

- **Camshift Algorithm**: Continuously Adaptive Mean Shift algorithm for object tracking.
- **Uses**: Tracking objects whose appearance may change over time.

#### **Algorithm Steps**

1. **Select ROI**: Define the initial position of the object.
2. **Compute Histogram**: Calculate the color histogram of the ROI.
3. **Backprojection**: Project the histogram back onto the image to find the object.
4. **Mean Shift**: Iteratively move the search window to the area of highest density.
5. **Update**: Adjust the size and orientation of the window.

#### **Mathematical Concepts**

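- **Mean Shift Update**: Each iteration moves the window center to the weighted centroid of the pixels inside the window:

$$
m = \frac{\sum_{i} x_i \cdot w(x_i)}{\sum_{i} w(x_i)}
$$

Where $w(x_i)$ is the weight (here, the back-projection value) of pixel $x_i$. The mean shift vector is the difference between $m$ and the current window center.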
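
The centroid update can be sketched with NumPy on a synthetic weight map. This is a simplified fixed-size mean shift under stated assumptions, not OpenCV's CamShift: the window size and orientation are never adapted, and the function name and test data are made up.

```python
import numpy as np

def mean_shift(weights, window, max_iter=10):
    """Move a fixed-size window (x, y, w, h) to the weighted centroid until it stops."""
    x, y, w, h = window
    H, W = weights.shape
    for _ in range(max_iter):
        patch = weights[y:y + h, x:x + w]
        total = patch.sum()
        if total == 0:
            break                         # nothing to track inside the window
        ys, xs = np.mgrid[y:y + h, x:x + w]
        cx = (xs * patch).sum() / total   # weighted centroid, x
        cy = (ys * patch).sum() / total   # weighted centroid, y
        # Re-center the window on the centroid, keeping it inside the image
        new_x = int(np.clip(round(cx - w / 2), 0, W - w))
        new_y = int(np.clip(round(cy - h / 2), 0, H - h))
        if (new_x, new_y) == (x, y):
            break                         # converged
        x, y = new_x, new_y
    return x, y, w, h

# Synthetic back-projection: a 3x3 blob of high weight centered at (x=11, y=13)
weights_demo = np.zeros((20, 20))
weights_demo[12:15, 10:13] = 1.0

final_window = mean_shift(weights_demo, (6, 8, 8, 8))
print(final_window)
```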


In [None]:
# Simplified example: tracks a hardcoded window in the video uploaded in section 16

# Reopen the video, since the capture was released at the end of section 16
cap = cv2.VideoCapture(video_path)

# Set up initial location of window
r, h, c, w = 250, 90, 400, 125  # Hardcoded values; adjust to your video
track_window = (c, r, w, h)

# Read the first frame of the video
ret, frame = cap.read()
if not ret:
    raise RuntimeError('Could not read the first frame of the video.')

# Set up the ROI for tracking
roi = frame[r:r+h, c:c+w]
hsv_roi = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
mask = cv2.inRange(hsv_roi, np.array((0., 60., 32.)), np.array((180., 255., 255.)))
roi_hist = cv2.calcHist([hsv_roi], [0], mask, [180], [0, 180])
cv2.normalize(roi_hist, roi_hist, 0, 255, cv2.NORM_MINMAX)

# Termination criteria: stop after 10 iterations or when the window moves by less than 1 pixel
term_crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)

while True:
    ret, frame = cap.read()
    if not ret:
        break

    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    dst = cv2.calcBackProject([hsv], [0], roi_hist, [0, 180], 1)

    # Apply CamShift to get the rotated rectangle and the updated window
    rot_rect, track_window = cv2.CamShift(dst, track_window, term_crit)

    # Draw the rotated rectangle on the frame
    pts = cv2.boxPoints(rot_rect).astype(np.int32)  # np.int0 was removed in NumPy 2.0
    tracked = cv2.polylines(frame, [pts], True, 255, 2)

    # Display
    plt.imshow(cv2.cvtColor(tracked, cv2.COLOR_BGR2RGB))
    plt.axis('off')
    plt.show()

    # Clear output to display next frame
    clear_output(wait=True)

cap.release()


## **20. Template Matching**

### **Theory**

- **Template Matching**: Finding the occurrence of a template image within a larger image.
- **Methods**:
  - **Correlation-Based Matching**: Measures similarity between template and image regions.
  - **Matching Methods**: `cv2.TM_CCOEFF`, `cv2.TM_SQDIFF`, etc.

#### **Function**

- **Matching**: `cv2.matchTemplate(image, template, method)`
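- **Normalization**: The `_NORMED` methods return normalized scores: `cv2.TM_CCOEFF_NORMED` lies in [-1, 1], while `cv2.TM_CCORR_NORMED` and `cv2.TM_SQDIFF_NORMED` lie in [0, 1] for non-negative images. The other methods are not normalized.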
- **Thresholding**: Define a threshold to identify matches.

#### **Mathematical Model**

- **Cross-Correlation** (`cv2.TM_CCORR`):

$$
R(x, y) = \sum_{i,j} \left[ T(i,j) \cdot I(x+i, y+j) \right]
$$

Where $ T $ is the template, and $ I $ is the image.
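
The sum above can be implemented literally with two loops. This is a brute-force sketch for illustration only (the function name and test data are made up) and is far slower than `cv2.matchTemplate`. Plain cross-correlation also favors bright regions, which is one reason the normalized variants are usually preferred on real images.

```python
import numpy as np

def cross_correlate(image, template):
    """R(x, y) = sum over (i, j) of T(i, j) * I(x + i, y + j), for every valid offset."""
    H, W = image.shape
    h, w = template.shape
    result = np.zeros((H - h + 1, W - w + 1))
    for y in range(result.shape[0]):
        for x in range(result.shape[1]):
            result[y, x] = (template * image[y:y + h, x:x + w]).sum()
    return result

template_demo = np.array([[1.0, 2.0],
                          [3.0, 4.0]])
image_demo = np.zeros((8, 8))
image_demo[3:5, 5:7] = template_demo      # paste the template at x=5, y=3

scores = cross_correlate(image_demo, template_demo)
best_y, best_x = np.unravel_index(np.argmax(scores), scores.shape)
print(best_x, best_y, scores[best_y, best_x])
```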

In [None]:
# Upload template image
uploaded = files.upload()
template = cv2.imread(list(uploaded.keys())[0], cv2.IMREAD_GRAYSCALE)
w, h = template.shape[::-1]

# Convert main image to grayscale
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Apply template Matching
res = cv2.matchTemplate(img_gray, template, cv2.TM_CCOEFF_NORMED)
threshold = 0.8
loc = np.where(res >= threshold)

# Draw matches on a copy so the original image stays unmodified
match_img = img.copy()
for pt in zip(*loc[::-1]):
    cv2.rectangle(match_img, pt, (pt[0] + w, pt[1] + h), (0,0,255), 2)

# Display the result
plt.imshow(cv2.cvtColor(match_img, cv2.COLOR_BGR2RGB))
plt.axis('off')
plt.show()

## **Conclusion**

This notebook provided an extensive introduction to OpenCV, covering fundamental concepts, practical implementations, and theoretical underpinnings of various image processing techniques. Understanding both the code and the mathematical models behind these operations is crucial for developing advanced computer vision applications.