# Computer Vision with OpenCV

OpenCV (Open Source Computer Vision Library) is an open-source computer vision and machine learning software library. It provides tools and algorithms for image and video manipulation, object detection, tracking, machine learning, and more. The sections below give an overview of its main capabilities:

### 1. **Image Representation in OpenCV**:
   - **Color Spaces**: OpenCV supports various color spaces, including RGB, BGR, grayscale, HSV, LAB, YUV, etc.
   - **Image Storage**: Images are represented as NumPy arrays, with dimensions (height, width, channels) for color images, and (height, width) for grayscale images.
   - **Data Types**: Pixel values are typically represented as unsigned 8-bit integers (0-255) or floating-point numbers (0.0-1.0).
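
As a rough illustration of the points above, the sketch below builds a small synthetic image directly in NumPy instead of loading a file, so the array names and sizes are assumptions made for the example. It shows the (height, width, channels) layout, the B, G, R channel order that OpenCV uses by default, and the conversion from 8-bit integers to floats in [0.0, 1.0].

```python
import numpy as np

# A synthetic color image: 4 rows (height), 6 columns (width), 3 channels
image = np.zeros((4, 6, 3), dtype=np.uint8)

# OpenCV orders channels as B, G, R, so this makes the pixel at row 1, column 2 pure red
image[1, 2] = (0, 0, 255)

height, width, channels = image.shape
print(height, width, channels)   # 4 6 3
print(image.dtype)               # uint8

# A grayscale image has no channel axis
gray = np.zeros((4, 6), dtype=np.uint8)
print(gray.shape)                # (4, 6)

# Convert 8-bit values (0-255) to floating point (0.0-1.0)
normalized = image.astype(np.float32) / 255.0
print(normalized[1, 2])          # [0. 0. 1.]
```

Note that indexing is `image[row, column]`, that is `image[y, x]`, which is the reverse of the (x, y) order used by most drawing functions.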

### 2. **Image Input and Output**:
   - **Reading and Writing Images**: OpenCV provides functions like `cv2.imread()` and `cv2.imwrite()` for reading and writing images in various formats (JPEG, PNG, BMP, etc.).

### 3. **Image Processing**:
   - **Filtering and Convolution**: OpenCV offers functions for applying linear filters and convolutions to images, including Gaussian blur, median blur, Sobel edge detection, etc.
   - **Thresholding**: Thresholding techniques can be used to segment images into binary masks based on pixel intensity.
   - **Morphological Operations**: Erosion, dilation, opening, and closing operations for shape analysis and noise reduction.
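
To make the filtering bullet concrete, here is a minimal NumPy sketch of what a linear filter computes: each output pixel is a weighted sum of its neighborhood. The helper `filter2d` is a hypothetical name written for this example (it is not an OpenCV function). It handles only single-channel images, applies the kernel without flipping it (correlation), and skips border pixels instead of padding them.

```python
import numpy as np

def filter2d(image, kernel):
    """Correlate a 2-D image with a kernel, keeping only fully covered pixels."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    output = np.zeros((out_h, out_w), dtype=np.float64)
    for i in range(out_h):
        for j in range(out_w):
            window = image[i:i + kh, j:j + kw]
            output[i, j] = np.sum(window * kernel)
    return output

# A 5x5 image with a vertical step edge: left columns dark, right columns bright
image = np.array([[0, 0, 10, 10, 10]] * 5, dtype=np.float64)

box_kernel = np.ones((3, 3)) / 9.0                         # averaging (box blur)
sobel_x_kernel = np.array([[-1, 0, 1],
                           [-2, 0, 2],
                           [-1, 0, 1]], dtype=np.float64)  # horizontal gradient

blurred = filter2d(image, box_kernel)
gradient_x = filter2d(image, sobel_x_kernel)
print(blurred[0])      # the step is smoothed into a ramp
print(gradient_x[0])   # strong response at the edge, zero in the flat region
```

The explicit loops are slow and only meant to show the arithmetic; OpenCV's built-in filters are the practical choice.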

### 4. **Feature Detection and Description**:
   - **Feature Detection**: OpenCV provides implementations of feature detection algorithms like Harris corner detector, Shi-Tomasi corner detector, FAST, etc.
   - **Feature Description**: Feature descriptors like SIFT, SURF, ORB, and BRISK are available for describing keypoints detected in images.

### 5. **Object Detection and Recognition**:
   - **Cascade Classifiers**: OpenCV includes pre-trained Haar cascade classifiers for detecting objects like faces, eyes, and cars.
   - **Deep Learning Models**: OpenCV's `dnn` module can load and run models trained in frameworks like TensorFlow and PyTorch, which makes it possible to use state-of-the-art object detection models (e.g., SSD, YOLO, Faster R-CNN).

### 6. **Image Transformation and Geometric Operations**:
   - **Image Resizing and Scaling**: Functions for resizing and scaling images to desired dimensions.
   - **Rotation and Affine Transformation**: OpenCV provides functions for rotating images and applying affine transformations.

### 7. **Camera Calibration and 3D Vision**:
   - **Camera Calibration**: Functions for camera calibration, including intrinsic and extrinsic parameter estimation using chessboard or circle grid patterns.
   - **Stereo Vision**: OpenCV supports stereo vision techniques for depth estimation and 3D reconstruction using calibrated stereo camera systems.

### 8. **Video Processing**:
   - **Video Input and Output**: Functions for reading and writing videos, and capturing frames from live camera streams.
   - **Video Analysis**: Techniques for object tracking, motion detection, and optical flow estimation in video sequences.

### 9. **Machine Learning and Deep Learning**:
   - **Integration with ML Libraries**: Because OpenCV images are NumPy arrays, they can be passed directly to machine learning libraries like scikit-learn. OpenCV also ships its own classical algorithms in the `ml` module.
   - **Deep Neural Networks**: The `dnn` module runs inference on models exported from frameworks like TensorFlow, Keras, and PyTorch.

### 10. **Graphical User Interface (GUI)**:
   - **HighGUI Module**: OpenCV's HighGUI module provides functions for creating graphical user interfaces for visualizing images, videos, and processing results.

### 11. **Performance Optimization**:
   - **Hardware Acceleration**: OpenCV supports hardware acceleration using technologies like Intel OpenVINO and CUDA for NVIDIA GPUs.
   - **Multithreading**: Functions are optimized for multithreading to leverage multicore CPUs.

OpenCV documentation and tutorials provide detailed explanations of algorithms and functions along with examples of usage.

# List of Topics in Computer Vision with OpenCV

The following list covers the main topics in computer vision with OpenCV:

1. **Image Input and Output**:
   - Reading and writing images in various formats
   - Displaying images in windows
   - Video input and output

2. **Image Processing**:
   - Filtering and convolution
   - Thresholding and binary image operations
   - Morphological operations (erosion, dilation, opening, closing)
   - Image transformations (resizing, rotation, cropping)
   - Histogram equalization and image enhancement
   - Image gradients and edge detection

3. **Feature Detection and Description**:
   - Corner detection (Harris corner detector, Shi-Tomasi corner detector)
   - Blob detection
   - Scale-invariant feature detection (SIFT, SURF)
   - Binary Robust Invariant Scalable Keypoints (BRISK)
   - Oriented FAST and rotated BRIEF (ORB)

4. **Object Detection and Tracking**:
   - Haar cascade classifiers for object detection (e.g., face detection)
   - Histogram of Oriented Gradients (HOG) for object detection
   - Deep learning-based object detection models (YOLO, SSD, Faster R-CNN)
   - Object tracking algorithms (KCF, CSRT, MIL)

5. **Image Segmentation**:
   - Thresholding techniques
   - Contour detection and extraction
   - Region-based segmentation
   - GrabCut algorithm for interactive segmentation

6. **Camera Calibration and 3D Vision**:
   - Camera calibration (intrinsic and extrinsic parameters)
   - Stereo vision and depth estimation
   - Structure from Motion (SfM)
   - Epipolar geometry and fundamental matrix estimation

7. **Motion Analysis**:
   - Optical flow estimation
   - Sparse (Lucas-Kanade) and dense (Farneback) optical flow algorithms
   - Background subtraction and motion detection

8. **Feature Matching and Homography**:
   - Feature matching algorithms (Brute-Force, FLANN)
   - Homography estimation for image registration
   - Perspective transformation and image warping

9. **Machine Learning and Deep Learning Integration**:
   - Interface with machine learning libraries (scikit-learn)
   - Integration with deep learning frameworks (TensorFlow, Keras, PyTorch)
   - Fine-tuning pre-trained deep learning models

10. **Graphical User Interface (GUI)**:
    - Displaying images and results in windows
    - Handling mouse events for interactive applications
    - Creating custom GUI elements for user interaction

11. **Performance Optimization**:
    - Hardware acceleration (OpenVINO, CUDA)
    - Multithreading for parallel processing
    - Memory management and efficient data structures

12. **Augmented Reality (AR) and Virtual Reality (VR)**:
    - Marker-based AR applications
    - Image and pattern recognition for AR
    - Integration with game engines for VR development

13. **Object Recognition and Scene Understanding**:
    - Scene classification and recognition
    - Object counting and density estimation
    - Text detection and recognition

14. **Deep Learning-based Image Generation**:
    - Generative Adversarial Networks (GANs) for image synthesis
    - Style transfer and artistic rendering
    - Super-resolution and image inpainting

15. **Embedded and IoT Applications**:
    - Optimization for embedded platforms (e.g., Raspberry Pi)
    - Real-time processing for IoT devices
    - Edge computing and deployment on resource-constrained devices

These topics cover a wide range of computer vision techniques and applications that can be implemented using OpenCV. Each topic can be explored further with tutorials, documentation, and examples provided by the OpenCV community.

# Image Representation in OpenCV

In [1]:
import cv2
import numpy as np

# Read an image using OpenCV
image_path = "My.jpg"
image = cv2.imread(image_path)

# Check if the image was successfully loaded
if image is None:
    print("Error: Unable to load image.")
else:
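    # Get image shape; cv2.imread returns (height, width, 3) by default,
    # while a grayscale array has only (height, width)
    height, width = image.shape[:2]
    channels = image.shape[2] if image.ndim == 3 else 1

    # Check if the image is grayscale or color
    if channels == 1:
        print("Grayscale image")
    elif channels == 3:
        print("Color image")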

    # Display image dimensions
    print("Image dimensions: {} x {}".format(width, height))

    # Display pixel values at specific coordinates
    x, y = 100, 50
    pixel_value = image[y, x]
    print("Pixel value at ({}, {}): {}".format(x, y, pixel_value))

    # Display pixel values in the entire image
    print("Pixel values in the entire image:")
    print(image)

    # Display the image using OpenCV
    cv2.imshow("Image", image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()


Color image
Image dimensions: 1920 x 1080
Pixel value at (100, 50): [ 2 31  6]
Pixel values in the entire image:
[[[  1  32   3]
  [ 10  41  12]
  [ 12  43  12]
  ...
  [ 22  39  48]
  [ 42  54  64]
  [ 30  43  51]]

 [[  2  33   6]
  [  0  31   4]
  [  0  28   0]
  ...
  [ 35  51  63]
  [ 51  66  75]
  [ 28  44  51]]

 [[  5  34  11]
  [  0  23   0]
  [  8  37  12]
  ...
  [  0  15  27]
  [  5  21  33]
  [  0  16  25]]

 ...

 [[102 178 184]
  [104 180 186]
  [106 180 184]
  ...
  [ 75 149 173]
  [ 71 147 170]
  [ 79 155 178]]

 [[ 98 175 178]
  [101 178 181]
  [110 184 188]
  ...
  [ 83 157 183]
  [ 83 158 184]
  [ 93 168 194]]

 [[105 182 185]
  [102 179 182]
  [107 181 185]
  ...
  [ 98 172 198]
  [ 90 165 191]
  [ 93 168 194]]]


In [2]:
import cv2
import numpy as np

# Read an image using OpenCV
image_path = "My.jpg"
image = cv2.imread(image_path)

# Check if the image was successfully loaded
if image is None:
    print("Error: Unable to load image.")
else:
    # Convert BGR (default) to RGB color space
    rgb_image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

    # Convert BGR to Grayscale
    grayscale_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # Convert BGR to HSV
    hsv_image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)

    # Convert BGR to LAB
    lab_image = cv2.cvtColor(image, cv2.COLOR_BGR2LAB)

    # Convert BGR to YUV
    yuv_image = cv2.cvtColor(image, cv2.COLOR_BGR2YUV)

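    # Display the original and converted images.
    # Note: cv2.imshow always interprets 3-channel data as BGR, so the RGB, HSV,
    # LAB, and YUV results appear with false colors; this only visualizes the raw channels.
    cv2.imshow("Original Image", image)
    cv2.imshow("RGB Image", rgb_image)
    cv2.imshow("Grayscale Image", grayscale_image)
    cv2.imshow("HSV Image", hsv_image)
    cv2.imshow("LAB Image", lab_image)
    cv2.imshow("YUV Image", yuv_image)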

    # Wait for a key press and close all windows
    cv2.waitKey(0)
    cv2.destroyAllWindows()
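
For reference, the BGR-to-grayscale conversion is a weighted sum of the three channels, Y = 0.299 R + 0.587 G + 0.114 B. The NumPy sketch below applies those weights to a tiny synthetic BGR array (an assumption for illustration, standing in for a loaded image). The helper name `bgr_to_gray` is hypothetical, and results may differ from `cv2.cvtColor` by one intensity level because of rounding.

```python
import numpy as np

def bgr_to_gray(bgr_image):
    """Weighted-sum grayscale conversion for a uint8 image in BGR channel order."""
    weights = np.array([0.114, 0.587, 0.299])  # weights for B, G, R
    gray = bgr_image.astype(np.float64) @ weights
    return np.round(gray).astype(np.uint8)

# Four pixels in BGR order: pure blue, pure green, pure red, and white
pixels = np.array([[[255, 0, 0], [0, 255, 0]],
                   [[0, 0, 255], [255, 255, 255]]], dtype=np.uint8)

gray = bgr_to_gray(pixels)
print(gray)
# [[ 29 150]
#  [ 76 255]]
```

Green contributes most to perceived brightness, which is why the pure green pixel ends up much lighter than the pure blue one.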



In [3]:
import cv2
import numpy as np

# Read an image using OpenCV
image_path = "My.jpg"
image = cv2.imread(image_path)

# Check if the image was successfully loaded
if image is None:
    print("Error: Unable to load image.")
else:
    # Display image dimensions
    height, width = image.shape[:2]
    print("Image dimensions: {} x {}".format(width, height))

    # Check if the image is grayscale or color
    if len(image.shape) == 2:
        print("Grayscale image")
    elif len(image.shape) == 3:
        print("Color image with {} channels".format(image.shape[2]))

    # Display pixel values at specific coordinates
    x, y = 100, 50
    pixel_value = image[y, x]
    print("Pixel value at ({}, {}): {}".format(x, y, pixel_value))

    # Display pixel values in the entire image
    print("Pixel values in the entire image:")
    print(image)

    # Display the image using OpenCV
    cv2.imshow("Image", image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()


Image dimensions: 1920 x 1080
Color image with 3 channels
Pixel value at (100, 50): [ 2 31  6]
Pixel values in the entire image:
[[[  1  32   3]
  [ 10  41  12]
  [ 12  43  12]
  ...
  [ 22  39  48]
  [ 42  54  64]
  [ 30  43  51]]

 [[  2  33   6]
  [  0  31   4]
  [  0  28   0]
  ...
  [ 35  51  63]
  [ 51  66  75]
  [ 28  44  51]]

 [[  5  34  11]
  [  0  23   0]
  [  8  37  12]
  ...
  [  0  15  27]
  [  5  21  33]
  [  0  16  25]]

 ...

 [[102 178 184]
  [104 180 186]
  [106 180 184]
  ...
  [ 75 149 173]
  [ 71 147 170]
  [ 79 155 178]]

 [[ 98 175 178]
  [101 178 181]
  [110 184 188]
  ...
  [ 83 157 183]
  [ 83 158 184]
  [ 93 168 194]]

 [[105 182 185]
  [102 179 182]
  [107 181 185]
  ...
  [ 98 172 198]
  [ 90 165 191]
  [ 93 168 194]]]


In [5]:
import cv2
import numpy as np

# Read an image using OpenCV
image_path = "My.jpg"
image = cv2.imread(image_path)

# Check if the image was successfully loaded
if image is None:
    print("Error: Unable to load image.")
else:
    # Display pixel values before normalization
    print("Pixel values before normalization (unsigned 8-bit integers):")
    print(image)

    # Normalize pixel values to the range [0, 1]
    normalized_image = image.astype(np.float32) / 255.0

    # Display pixel values after normalization
    print("\nPixel values after normalization (floating-point numbers):")
    print(normalized_image)

    # Display the image using OpenCV
    cv2.imshow("Original Image", image)
    cv2.imshow("Normalized Image", normalized_image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()


Pixel values before normalization (unsigned 8-bit integers):
[[[  1  32   3]
  [ 10  41  12]
  [ 12  43  12]
  ...
  [ 22  39  48]
  [ 42  54  64]
  [ 30  43  51]]

 [[  2  33   6]
  [  0  31   4]
  [  0  28   0]
  ...
  [ 35  51  63]
  [ 51  66  75]
  [ 28  44  51]]

 [[  5  34  11]
  [  0  23   0]
  [  8  37  12]
  ...
  [  0  15  27]
  [  5  21  33]
  [  0  16  25]]

 ...

 [[102 178 184]
  [104 180 186]
  [106 180 184]
  ...
  [ 75 149 173]
  [ 71 147 170]
  [ 79 155 178]]

 [[ 98 175 178]
  [101 178 181]
  [110 184 188]
  ...
  [ 83 157 183]
  [ 83 158 184]
  [ 93 168 194]]

 [[105 182 185]
  [102 179 182]
  [107 181 185]
  ...
  [ 98 172 198]
  [ 90 165 191]
  [ 93 168 194]]]

Pixel values after normalization (floating-point numbers):
[[[0.00392157 0.1254902  0.01176471]
  [0.03921569 0.16078432 0.04705882]
  [0.04705882 0.16862746 0.04705882]
  ...
  [0.08627451 0.15294118 0.1882353 ]
  [0.16470589 0.21176471 0.2509804 ]
  [0.11764706 0.16862746 0.2       ]]

 [[0.00784314 0.129

# Image Input and Output

In [6]:
import cv2

# Read an image using OpenCV
image_path = "My.jpg"
image = cv2.imread(image_path)

# Check if the image was successfully loaded
if image is None:
    print("Error: Unable to load image.")
else:
    # Display the original image
    cv2.imshow("Original Image", image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

    # Write the image to a new file; cv2.imwrite returns True on success
    output_image_path = "output_image.jpg"
    if cv2.imwrite(output_image_path, image):
        print("Image saved successfully as:", output_image_path)
    else:
        print("Error: Unable to save image.")


Image saved successfully as: output_image.jpg


# Image Processing

In [7]:
import cv2

# Read an image using OpenCV
image_path = "My.jpg"
image = cv2.imread(image_path)

# Check if the image was successfully loaded
if image is None:
    print("Error: Unable to load image.")
else:
    # Apply Gaussian blur
    blurred_image = cv2.GaussianBlur(image, (5, 5), 0)

    # Apply median blur
    median_blurred_image = cv2.medianBlur(image, 5)

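    # Apply Sobel edge detection (gradient in x and y, then the magnitude)
    sobel_x = cv2.Sobel(image, cv2.CV_64F, 1, 0, ksize=5)
    sobel_y = cv2.Sobel(image, cv2.CV_64F, 0, 1, ksize=5)
    sobel_edges = cv2.magnitude(sobel_x, sobel_y)

    # Scale the magnitude to 0-255 before casting; a bare astype('uint8')
    # would wrap around for values above 255
    sobel_display = cv2.normalize(sobel_edges, None, 0, 255, cv2.NORM_MINMAX).astype('uint8')

    # Display the original and processed images
    cv2.imshow("Original Image", image)
    cv2.imshow("Gaussian Blur", blurred_image)
    cv2.imshow("Median Blur", median_blurred_image)
    cv2.imshow("Sobel Edges", sobel_display)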

    # Wait for a key press and close all windows
    cv2.waitKey(0)
    cv2.destroyAllWindows()


In [8]:
import cv2

# Read an image using OpenCV
image_path = "My.jpg"
image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)

# Check if the image was successfully loaded
if image is None:
    print("Error: Unable to load image.")
else:
    # Apply simple thresholding
    _, binary_threshold = cv2.threshold(image, 127, 255, cv2.THRESH_BINARY)

    # Apply adaptive thresholding
    adaptive_threshold = cv2.adaptiveThreshold(image, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 2)

    # Display the original and processed images
    cv2.imshow("Original Image", image)
    cv2.imshow("Binary Threshold", binary_threshold)
    cv2.imshow("Adaptive Threshold", adaptive_threshold)

    # Wait for a key press and close all windows
    cv2.waitKey(0)
    cv2.destroyAllWindows()
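
The fixed threshold in the cell above amounts to a per-pixel comparison: with `cv2.THRESH_BINARY`, pixels strictly greater than the threshold become the maximum value and everything else becomes 0. A minimal NumPy sketch of that rule follows, using a small hand-made array in place of the loaded image; the helper name `binary_threshold` is hypothetical.

```python
import numpy as np

def binary_threshold(gray_image, thresh, max_value=255):
    """Return max_value where the pixel exceeds thresh, and 0 elsewhere."""
    return np.where(gray_image > thresh, max_value, 0).astype(np.uint8)

gray = np.array([[0, 100, 127],
                 [128, 200, 255]], dtype=np.uint8)

mask = binary_threshold(gray, 127)
print(mask)
# [[  0   0   0]
#  [255 255 255]]
```

The pixel with value 127 stays at 0 because the comparison is strict. Adaptive thresholding applies the same comparison but computes a separate threshold for each pixel from its neighborhood.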


In [9]:
import cv2
import numpy as np

# Read an image using OpenCV
image_path = "My.jpg"
image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)

# Check if the image was successfully loaded
if image is None:
    print("Error: Unable to load image.")
else:
    # Apply erosion
    kernel = np.ones((5, 5), np.uint8)
    eroded_image = cv2.erode(image, kernel, iterations=1)

    # Apply dilation
    dilated_image = cv2.dilate(image, kernel, iterations=1)

    # Apply opening (erosion followed by dilation)
    opened_image = cv2.morphologyEx(image, cv2.MORPH_OPEN, kernel)

    # Apply closing (dilation followed by erosion)
    closed_image = cv2.morphologyEx(image, cv2.MORPH_CLOSE, kernel)

    # Display the original and processed images
    cv2.imshow("Original Image", image)
    cv2.imshow("Eroded Image", eroded_image)
    cv2.imshow("Dilated Image", dilated_image)
    cv2.imshow("Opened Image", opened_image)
    cv2.imshow("Closed Image", closed_image)

    # Wait for a key press and close all windows
    cv2.waitKey(0)
    cv2.destroyAllWindows()
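
With a rectangular kernel of ones, erosion replaces each pixel by the minimum of its neighborhood and dilation by the maximum. The NumPy sketch below illustrates this for a 3x3 neighborhood on a synthetic image. The helper names are hypothetical, and the border is handled by edge replication (an assumption of this sketch), so neighbors outside the image never change the minimum or maximum.

```python
import numpy as np

def _neighborhoods(image, size=3):
    """Stack every shifted copy of the image covered by a size x size window."""
    pad = size // 2
    padded = np.pad(image, pad, mode="edge")
    h, w = image.shape
    return np.stack([padded[dy:dy + h, dx:dx + w]
                     for dy in range(size) for dx in range(size)])

def erode(image, size=3):
    return _neighborhoods(image, size).min(axis=0)

def dilate(image, size=3):
    return _neighborhoods(image, size).max(axis=0)

# A 3x3 white square on a 7x7 black background, plus one isolated noise pixel
image = np.zeros((7, 7), dtype=np.uint8)
image[2:5, 2:5] = 255
image[0, 6] = 255

eroded = erode(image)            # the square shrinks to its center, the noise disappears
opened = dilate(erode(image))    # opening: the square is restored, the noise stays removed
print(opened)
```

This is why opening is a common choice for removing small speckles while keeping larger shapes roughly intact.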


# Feature Detection and Description

In [11]:
import cv2
import numpy as np

# Load image
image = cv2.imread('My.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Harris corner detection
harris_corners = cv2.cornerHarris(gray, 2, 3, 0.04)  # You can adjust parameters here
harris_corners = cv2.dilate(harris_corners, None)  # Dilate corner points to enhance visibility

# Threshold for an optimal value, it may vary depending on the image.
image[harris_corners > 0.01 * harris_corners.max()] = [0, 0, 255]  # Draw red dots on corners

# Shi-Tomasi corner detection
corners = cv2.goodFeaturesToTrack(gray, 25, 0.01, 10)  # Parameters can be adjusted
corners = corners.astype(int)  # np.int0 is deprecated and was removed in NumPy 2.0

for corner in corners:
    x, y = corner.ravel()
    cv2.circle(image, (int(x), int(y)), 3, (255, 0, 0), -1)  # Draw filled blue circles on corners

cv2.imshow('Harris & Shi-Tomasi Corner Detection', image)
cv2.waitKey(0)
cv2.destroyAllWindows()
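
The Harris detector scores each pixel with R = det(M) - k * trace(M)^2, where M sums the products of image gradients over a small window; R is large and positive at corners, negative along edges, and near zero in flat regions. The sketch below is a simplified NumPy version for illustration only: it uses central differences and a 3x3 box window (both assumptions of this sketch, not necessarily what `cv2.cornerHarris` does internally), with the same k = 0.04 as in the cell above. The function names are hypothetical.

```python
import numpy as np

def box_sum(values, size=3):
    """Sum each size x size neighborhood, treating pixels outside the image as zero."""
    pad = size // 2
    padded = np.pad(values, pad, mode="constant")
    h, w = values.shape
    return sum(padded[dy:dy + h, dx:dx + w]
               for dy in range(size) for dx in range(size))

def harris_response(gray, k=0.04):
    """Harris corner response R = det(M) - k * trace(M)^2 for every pixel."""
    iy, ix = np.gradient(gray.astype(np.float64))   # derivatives along rows, then columns
    sxx = box_sum(ix * ix)
    syy = box_sum(iy * iy)
    sxy = box_sum(ix * iy)
    return sxx * syy - sxy ** 2 - k * (sxx + syy) ** 2

# Synthetic image: a bright quadrant whose corner sits at row 4, column 4
gray = np.zeros((9, 9))
gray[4:, 4:] = 1.0

response = harris_response(gray)
print(response[4, 4])   # positive: corner
print(response[6, 4])   # negative: edge
print(response[1, 1])   # zero: flat region
```

Thresholding the response relative to its maximum, as the cell above does with `0.01 * harris_corners.max()`, keeps only the strongly positive corner pixels.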




In [14]:
import cv2

# Load image
image = cv2.imread('My.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# SIFT (Scale-Invariant Feature Transform)
sift = cv2.SIFT_create()
keypoints_sift, descriptors_sift = sift.detectAndCompute(gray, None)

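# SURF (Speeded-Up Robust Features)
# SURF is patented and only ships in opencv-contrib builds with the non-free
# modules enabled, so the calls are left commented out here.
#surf = cv2.xfeatures2d.SURF_create()
#keypoints_surf, descriptors_surf = surf.detectAndCompute(gray, None)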

# ORB (Oriented FAST and Rotated BRIEF)
orb = cv2.ORB_create()
keypoints_orb, descriptors_orb = orb.detectAndCompute(gray, None)

# BRISK (Binary Robust Invariant Scalable Keypoints)
brisk = cv2.BRISK_create()
keypoints_brisk, descriptors_brisk = brisk.detectAndCompute(gray, None)

# Draw keypoints on the image
image_sift = cv2.drawKeypoints(image, keypoints_sift, None, flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
#image_surf = cv2.drawKeypoints(image, keypoints_surf, None, flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
image_orb = cv2.drawKeypoints(image, keypoints_orb, None, flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
image_brisk = cv2.drawKeypoints(image, keypoints_brisk, None, flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)

# Display images
cv2.imshow('SIFT Keypoints', image_sift)
#cv2.imshow('SURF Keypoints', image_surf)
cv2.imshow('ORB Keypoints', image_orb)
cv2.imshow('BRISK Keypoints', image_brisk)
cv2.waitKey(0)
cv2.destroyAllWindows()


# Object Detection and Recognition

In [17]:
import cv2

image_path = "My.jpg"
window_name = f"Detected Objects in {image_path}"
original_image = cv2.imread(image_path)

# Convert the image to grayscale for easier computation (cv2.imread returns BGR)
image_grey = cv2.cvtColor(original_image, cv2.COLOR_BGR2GRAY)

cascade_classifier = cv2.CascadeClassifier(
    f"{cv2.data.haarcascades}haarcascade_eye.xml")
detected_objects = cascade_classifier.detectMultiScale(image_grey, minSize=(50, 50))

# Draw rectangles on the detected objects
if len(detected_objects) != 0:
    for (x, y, width, height) in detected_objects:
        # The bottom-right corner is (x + width, y + height)
        cv2.rectangle(original_image, (x, y),
                      (x + width, y + height),
                      (0, 255, 0), 2)

cv2.namedWindow(window_name, cv2.WINDOW_KEEPRATIO)
cv2.imshow(window_name, original_image)
cv2.resizeWindow(window_name, 400, 400)
cv2.waitKey(0)
cv2.destroyAllWindows()

In [18]:
import tensorflow as tf
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Conv2D, Reshape

# Define the model architecture
def create_object_detection_model(input_shape, num_classes):
    base_model = VGG16(weights='imagenet', include_top=False, input_shape=input_shape)
    
    x = base_model.output
    x = Conv2D(1024, (3, 3), activation='relu')(x)
    predictions = Conv2D(num_classes, (1, 1), activation='softmax')(x)
    predictions = Reshape((-1, num_classes))(predictions)
    
    model = tf.keras.Model(inputs=base_model.input, outputs=predictions)
    
    return model

# Example usage
input_shape = (224, 224, 3)
num_classes = 20  # Number of classes
model = create_object_detection_model(input_shape, num_classes)


Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg16/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5
58889256/58889256 ━━━━━━━━━━━━━━━━━━━━ 206s 4us/step


In [20]:
import torch
import torch.nn as nn
import torchvision.models as models

# Define the model architecture
class ObjectDetectionModel(nn.Module):
    def __init__(self, num_classes):
        super(ObjectDetectionModel, self).__init__()
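        # 'pretrained=True' is deprecated in recent torchvision; request the ImageNet weights explicitly
        self.base_model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)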
        self.features = self.base_model.features
        self.classifier = nn.Sequential(
            nn.Conv2d(512, 1024, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(1024, num_classes, kernel_size=1)
        )

    def forward(self, x):
        x = self.features(x)
        x = self.classifier(x)
        x = x.permute(0, 2, 3, 1)
        return x.view(x.size(0), -1, x.size(-1))

# Example usage
num_classes = 20  # Number of classes
model = ObjectDetectionModel(num_classes)


Downloading: "https://download.pytorch.org/models/vgg16-397923af.pth" to C:\Users\varun/.cache\torch\hub\checkpoints\vgg16-397923af.pth
100%|████████████████████████████████████████████████████████████████████████████████| 528M/528M [16:17<00:00, 566kB/s]


# Image Transformation and Geometric Operations

In [21]:
import cv2

def resize_image(image, new_width=None, new_height=None, scale_factor=None):
    """
    Resize the image to the specified dimensions.

    Args:
    - image: Input image.
    - new_width: Desired width of the output image.
    - new_height: Desired height of the output image.
    - scale_factor: Scaling factor to be applied to the image.

    Returns:
    - Resized image.
    """
    if new_width is not None and new_height is not None:
        # Resize based on specific width and height
        resized_image = cv2.resize(image, (new_width, new_height))
    elif scale_factor is not None:
        # Resize based on scale factor
        resized_image = cv2.resize(image, None, fx=scale_factor, fy=scale_factor)
    else:
        raise ValueError("Either new_width and new_height or scale_factor must be provided.")

    return resized_image

# Load an image
image = cv2.imread('My.jpg')

# Resize the image to a specific width and height
resized_image_wh = resize_image(image, new_width=300, new_height=200)

# Resize the image based on a scale factor
resized_image_sf = resize_image(image, scale_factor=0.5)

# Display the original and resized images
cv2.imshow('Original Image', image)
cv2.imshow('Resized Image (Width, Height)', resized_image_wh)
cv2.imshow('Resized Image (Scale Factor)', resized_image_sf)
cv2.waitKey(0)
cv2.destroyAllWindows()


In [22]:
import cv2
import numpy as np

def rotate_image(image, angle):
    """
    Rotate the image by the specified angle.

    Args:
    - image: Input image.
    - angle: Angle of rotation in degrees.

    Returns:
    - Rotated image.
    """
    h, w = image.shape[:2]
    center = (w // 2, h // 2)
    rotation_matrix = cv2.getRotationMatrix2D(center, angle, 1.0)
    rotated_image = cv2.warpAffine(image, rotation_matrix, (w, h))
    return rotated_image

def apply_affine_transformation(image, translation=(0, 0), scale=1.0, shear=0.0, angle=0.0):
    """
    Apply affine transformation to the image.

    Args:
    - image: Input image.
    - translation: Tuple (tx, ty) specifying the translation in the x and y directions.
    - scale: Scaling factor.
    - shear: Shearing angle in degrees.
    - angle: Rotation angle in degrees.

    Returns:
    - Transformed image.
    """
    h, w = image.shape[:2]
    center = (w // 2, h // 2)

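    # Build the rotation and scaling matrix about the image center
    transformation_matrix = cv2.getRotationMatrix2D(center, angle, scale)

    # Compose a horizontal shear about the center: x' = x + tan(shear) * (y - cy)
    shear_factor = np.tan(np.radians(shear))
    shear_matrix = np.array([[1.0, shear_factor, -shear_factor * center[1]],
                             [0.0, 1.0, 0.0],
                             [0.0, 0.0, 1.0]])
    transformation_matrix = transformation_matrix @ shear_matrix  # shear is applied first

    transformation_matrix[:, 2] += translation  # Apply translation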

    # Apply affine transformation
    transformed_image = cv2.warpAffine(image, transformation_matrix, (w, h), flags=cv2.INTER_LINEAR)

    return transformed_image

# Load an image
image = cv2.imread('My.jpg')

# Rotate the image by 45 degrees
rotated_image = rotate_image(image, angle=45)

# Apply affine transformation (translation, scale, shear, and rotation)
transformed_image = apply_affine_transformation(image, translation=(50, 50), scale=1.5, shear=20, angle=30)

# Display the original, rotated, and transformed images
cv2.imshow('Original Image', image)
cv2.imshow('Rotated Image', rotated_image)
cv2.imshow('Transformed Image', transformed_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
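
For readers curious what `cv2.getRotationMatrix2D` returns, its documentation describes a 2x3 matrix built from α = scale·cos(angle) and β = scale·sin(angle). The sketch below rebuilds that matrix in NumPy and checks that the rotation center maps to itself. The function names `rotation_matrix_2d` and `transform_point` are hypothetical helpers for this example.

```python
import numpy as np

def rotation_matrix_2d(center, angle_degrees, scale=1.0):
    """Build the 2x3 affine matrix for a rotation about `center` with uniform scaling."""
    cx, cy = center
    theta = np.radians(angle_degrees)
    alpha = scale * np.cos(theta)
    beta = scale * np.sin(theta)
    return np.array([[alpha, beta, (1 - alpha) * cx - beta * cy],
                     [-beta, alpha, beta * cx + (1 - alpha) * cy]])

def transform_point(matrix, point):
    """Apply a 2x3 affine matrix to an (x, y) point."""
    x, y = point
    return matrix @ np.array([x, y, 1.0])

matrix = rotation_matrix_2d(center=(100, 50), angle_degrees=90)

print(transform_point(matrix, (100, 50)))  # the center stays where it is
print(transform_point(matrix, (110, 50)))  # a point 10 px to the right of the center
```

The second point lands 10 pixels above the center (y decreases upward in image coordinates), which corresponds to a counter-clockwise rotation on screen for positive angles.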


# Camera Calibration and 3D Vision

- **Camera Calibration**: Functions for camera calibration, including intrinsic and extrinsic parameter estimation using chessboard or circle grid patterns.
- **Stereo Vision**: OpenCV supports stereo vision techniques for depth estimation and 3D reconstruction using calibrated stereo camera systems.

In [None]:
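import glob

import cv2
import numpy as np

# Termination criteria for sub-pixel corner refinement:
# stop after 30 iterations or when the corner moves by less than 0.001
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)

# Prepare object points, like (0,0,0), (1,0,0), (2,0,0) ..., (6,5,0)
objp = np.zeros((6*7, 3), np.float32)
objp[:,:2] = np.mgrid[0:7,0:6].T.reshape(-1,2)

# Arrays to store object points and image points from all the images.
objpoints = [] # 3d point in real world space
imgpoints = [] # 2d points in image plane.

# Collect the calibration images (chessboard photos) from the current directory
images = glob.glob('*.jpg')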

for fname in images:
    img = cv2.imread(fname)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Find the chessboard corners
    ret, corners = cv2.findChessboardCorners(gray, (7,6), None)

    # If found, add object points, image points (after refining them)
    if ret == True:
        objpoints.append(objp)

        corners2 = cv2.cornerSubPix(gray, corners, (11,11), (-1,-1), criteria)
        imgpoints.append(corners2)

        # Draw and display the corners
        img = cv2.drawChessboardCorners(img, (7,6), corners2, ret)
        cv2.imshow('img', img)
        cv2.waitKey(500)

cv2.destroyAllWindows()

# Calibrate camera
ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(objpoints, imgpoints, gray.shape[::-1], None, None)


In [None]:
# Load the image
img = cv2.imread('My.jpg')

# Undistort the image using camera matrix and distortion coefficients
undistorted_img = cv2.undistort(img, mtx, dist, None, mtx)

# Perform other 3D vision tasks such as depth estimation or 3D reconstruction
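
The camera matrix `mtx` returned by calibration holds the focal lengths (fx, fy) and the principal point (cx, cy). As a hedged illustration of how it is used, the sketch below projects a 3D point given in camera coordinates onto the image plane with the ideal pinhole model, ignoring lens distortion. The numeric values are made up for the example and are not calibration results; `project_point` is a hypothetical helper.

```python
import numpy as np

# Hypothetical intrinsics, chosen only for illustration
fx, fy = 800.0, 800.0      # focal lengths in pixels
cx, cy = 320.0, 240.0      # principal point in pixels
camera_matrix = np.array([[fx, 0.0, cx],
                          [0.0, fy, cy],
                          [0.0, 0.0, 1.0]])

def project_point(camera_matrix, point_3d):
    """Project a 3D point (X, Y, Z) in camera coordinates to pixel coordinates (u, v)."""
    homogeneous = camera_matrix @ np.asarray(point_3d, dtype=np.float64)
    return homogeneous[:2] / homogeneous[2]

u, v = project_point(camera_matrix, (0.5, -0.25, 2.0))
print(u, v)   # 520.0 140.0
```

The division by the third component is the perspective step: u = fx·X/Z + cx and v = fy·Y/Z + cy, so points farther away move toward the principal point.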

In [4]:
import cv2
import numpy as np

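# Load left and right images as grayscale.
# Note: both reads point at the same file here, which cannot produce meaningful depth;
# replace the paths with a real left/right pair from a stereo camera.
left_image = cv2.imread('My.jpg', 0)
right_image = cv2.imread('My.jpg', 0)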

# Stereo calibration (you need to provide calibration parameters)
# For stereo calibration, you typically capture a set of images of a calibration pattern
# and use them to compute camera matrices and distortion coefficients
# Once you have these parameters, you can use them to rectify your stereo images.

# Stereo rectification
# Once you have the calibration parameters, you can rectify your stereo images to ensure corresponding
# points in the two images lie along the same horizontal lines. This simplifies stereo correspondence
# by making it a 1D search problem instead of a 2D search problem.

# Perform stereo matching to find correspondences between the left and right images
stereo = cv2.StereoBM_create(numDisparities=16, blockSize=15)  # Adjust parameters as needed
disparity = stereo.compute(left_image, right_image)

# Normalize the disparity map to 8-bit for visualization (let OpenCV allocate the output)
disparity = cv2.normalize(disparity, None, alpha=0, beta=255, norm_type=cv2.NORM_MINMAX, dtype=cv2.CV_8U)

# Display the disparity map
cv2.imshow('Disparity Map', disparity)
cv2.waitKey(0)
cv2.destroyAllWindows()
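
Once a disparity map is available from a rectified, calibrated pair, depth follows from Z = f·B / d, where f is the focal length in pixels, B the baseline between the cameras, and d the disparity in pixels. The sketch below applies this formula to a small made-up disparity array; the focal length and baseline are assumed values, and `disparity_to_depth` is a hypothetical helper. It divides by 16 first, because `StereoBM.compute` returns 16-bit fixed-point disparities scaled by 16.

```python
import numpy as np

def disparity_to_depth(raw_disparity, focal_length_px, baseline_m):
    """Convert StereoBM-style fixed-point disparities to depth in meters."""
    disparity = raw_disparity.astype(np.float32) / 16.0
    depth = np.full(disparity.shape, np.inf, dtype=np.float32)
    valid = disparity > 0          # zero or negative disparity means no usable match
    depth[valid] = focal_length_px * baseline_m / disparity[valid]
    return depth

# Made-up raw disparities (already multiplied by 16, as StereoBM would return them)
raw = np.array([[16 * 8, 16 * 4],
                [16 * 2, -16]], dtype=np.int16)

depth = disparity_to_depth(raw, focal_length_px=800.0, baseline_m=0.1)
print(depth)
```

Depth is inversely proportional to disparity, so halving the disparity doubles the estimated distance, and pixels without a valid match are left at infinity here.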


# Video Processing

- **Video Input and Output**: Functions for reading and writing videos, and capturing frames from live camera streams.
- **Video Analysis**: Techniques for object tracking, motion detection, and optical flow estimation in video sequences.

In [5]:
import cv2

# Read video from file
video_capture = cv2.VideoCapture('My.mp4')

# Check if the video file is opened successfully
if not video_capture.isOpened():
    raise SystemExit("Error: Could not open video file.")

# Get video properties
frame_width = int(video_capture.get(cv2.CAP_PROP_FRAME_WIDTH))
frame_height = int(video_capture.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = video_capture.get(cv2.CAP_PROP_FPS)  # keep as float; int() would turn 29.97 into 29
total_frames = int(video_capture.get(cv2.CAP_PROP_FRAME_COUNT))

# Define video writer
output_video = cv2.VideoWriter('output_video.mp4', cv2.VideoWriter_fourcc(*'mp4v'), fps, (frame_width, frame_height))

# Process each frame in the video
while video_capture.isOpened():
    ret, frame = video_capture.read()

    if not ret:
        break

    # Perform any operations on the frame (e.g., image processing)
    # For example, you can convert the frame to grayscale
    # gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Write processed frame to the output video
    output_video.write(frame)

    # Display the frame
    cv2.imshow('Frame', frame)

    # Break the loop if 'q' key is pressed
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release video capture and writer
video_capture.release()
output_video.release()

# Close all OpenCV windows
cv2.destroyAllWindows()


In [3]:
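import cv2
import numpy as np

cap = cv2.VideoCapture('My.mp4')

# Read the first frame and let the user select the region to track
ret, frame = cap.read()
if not ret:
    raise SystemExit("Error: Could not read the first frame.")
x, y, w, h = cv2.selectROI(frame)
track_window = (x, y, w, h)

# Build a hue histogram of the selected region, ignoring dark or unsaturated pixels
roi = frame[y:y+h, x:x+w]
hsv_roi = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
mask = cv2.inRange(hsv_roi, np.array((0., 60., 32.)), np.array((180., 255., 255.)))
roi_hist = cv2.calcHist([hsv_roi], [0], mask, [180], [0, 180])
cv2.normalize(roi_hist, roi_hist, 0, 255, cv2.NORM_MINMAX)

# Stop mean shift after 10 iterations or when the window moves by less than 1 pixel
term_crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)

while True:
    ret, frame = cap.read()
    if not ret:
        break

    # Back-project the histogram onto the frame and shift the window toward the densest region
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    dst = cv2.calcBackProject([hsv], [0], roi_hist, [0, 180], 1)
    ret, track_window = cv2.meanShift(dst, track_window, term_crit)

    # Draw the tracked window
    x, y, w, h = track_window
    img2 = cv2.rectangle(frame, (x, y), (x+w, y+h), 255, 2)
    cv2.imshow('img2', img2)

    # Exit when Esc is pressed
    k = cv2.waitKey(30) & 0xff
    if k == 27:
        break

cap.release()
cv2.destroyAllWindows()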

In [10]:
# importing libraries
import cv2

# Defining a function motionDetection
def motionDetection():
    # capturing video in real time
    cap = cv2.VideoCapture(0)

    # reading the first two frames
    ok1, frame1 = cap.read()
    ok2, frame2 = cap.read()

    # stop as soon as a frame cannot be read
    while cap.isOpened() and ok1 and ok2:

        # difference between the frames
        diff = cv2.absdiff(frame1, frame2)
        diff_gray = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)
        blur = cv2.GaussianBlur(diff_gray, (5, 5), 0)
        _, thresh = cv2.threshold(blur, 20, 255, cv2.THRESH_BINARY)
        dilated = cv2.dilate(thresh, None, iterations=3)
        contours, _ = cv2.findContours(
            dilated, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

        for contour in contours:
            # skip small contours, which are most likely noise
            if cv2.contourArea(contour) < 900:
                continue
            (x, y, w, h) = cv2.boundingRect(contour)
            cv2.rectangle(frame1, (x, y), (x+w, y+h), (0, 255, 0), 2)
            cv2.putText(frame1, "STATUS: MOTION DETECTED", (10, 60), cv2.FONT_HERSHEY_SIMPLEX,
                        1, (217, 10, 10), 2)

        # cv2.drawContours(frame1, contours, -1, (0, 255, 0), 2)

        cv2.imshow("Video", frame1)
        frame1 = frame2
        ok2, frame2 = cap.read()

        # exit when Esc is pressed
        if cv2.waitKey(50) & 0xFF == 27:
            break

    cap.release()
    cv2.destroyAllWindows()


if __name__ == "__main__":
    motionDetection()

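
The core of the motion detector above is frame differencing followed by a threshold. The following is a minimal NumPy sketch of that step on two synthetic grayscale frames, so it runs without a camera. The frames, the helper names `motion_mask` and `bounding_box`, and the single box around all changed pixels (instead of one box per contour) are assumptions made for illustration; the threshold of 20 is borrowed from the cell above.

```python
import numpy as np

def motion_mask(prev_frame, next_frame, threshold=20):
    """Return a 0/255 mask of pixels whose intensity changed by more than `threshold`."""
    # Widen to int16 so the subtraction cannot wrap around as it would in uint8
    diff = np.abs(prev_frame.astype(np.int16) - next_frame.astype(np.int16))
    return (diff > threshold).astype(np.uint8) * 255

def bounding_box(mask):
    """Return (x, y, w, h) around all non-zero pixels, or None if the mask is empty."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    x, y = int(xs.min()), int(ys.min())
    return x, y, int(xs.max()) - x + 1, int(ys.max()) - y + 1

# Two synthetic frames: a uniform background, then a bright rectangle appears
frame_a = np.full((100, 100), 50, dtype=np.uint8)
frame_b = frame_a.copy()
frame_b[30:50, 40:70] = 200

mask = motion_mask(frame_a, frame_b)
box = bounding_box(mask)
print("changed pixels:", int((mask > 0).sum()), "bounding box:", box)
```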

# Machine Learning and Deep Learning

In [4]:
import cv2
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

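# Generate synthetic data for demonstration
# Features and labels are random, so the accuracy printed below reflects chance, not a learned pattern
X = np.random.rand(100, 10)  # Features
y = np.random.randint(0, 2, 100)  # Labels (binary classification)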

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train a machine learning model using scikit-learn
clf = RandomForestClassifier(n_estimators=100)
clf.fit(X_train, y_train)

# Make predictions on the test set
predictions = clf.predict(X_test)

# Evaluate the model
accuracy = accuracy_score(y_test, predictions)
print("Accuracy:", accuracy)

# Now, let's integrate with OpenCV for further image processing or computer vision tasks

# Generate a synthetic test image for demonstration
test_image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)

# Perform some preprocessing or feature extraction on the image (if needed)
# For example, you might resize the image or convert it to grayscale

# Make predictions on the image using the trained model
# Assuming you have extracted features from the image and stored them in 'features'
# For the sake of this example, let's assume 'features' is a random feature vector
features = np.random.rand(10)  # Random feature vector
result = clf.predict([features])

# Display the result
print("Prediction:", result)


Accuracy: 0.75
Prediction: [1]
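
The cell above substitutes a random vector for real image features. One simple way to get a fixed-length vector from an image is a normalized intensity histogram. The sketch below assumes 10 bins so that the length matches the 10 features the classifier was trained on. The helper name `histogram_features` and the choice of histogram features are assumptions for illustration, and a classifier trained on random data would still not produce meaningful predictions from them.

```python
import numpy as np

def histogram_features(image, bins=10):
    """Return a normalized intensity histogram as a feature vector (hypothetical helper)."""
    # Average the colour channels to get a single intensity per pixel
    gray = image.mean(axis=2) if image.ndim == 3 else image
    hist, _ = np.histogram(gray, bins=bins, range=(0, 256))
    # Normalize so the vector does not depend on the image size
    return hist / hist.sum()

# Synthetic test image, as in the cell above
rng = np.random.default_rng(0)
test_image = rng.integers(0, 256, size=(100, 100, 3), dtype=np.uint8)

features = histogram_features(test_image)
print("feature vector shape:", features.shape)
print("sum of features:", float(features.sum()))
# The vector could then be passed to the classifier as clf.predict([features])
```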


In [5]:
import tensorflow as tf
from tensorflow.keras import layers, models

# Define the architecture of the neural network
def create_model():
    model = models.Sequential([
        layers.Input(shape=(784,)),  # explicit input layer instead of input_shape on Dense
        layers.Dense(64, activation='relu'),
        layers.Dense(64, activation='relu'),
        layers.Dense(10, activation='softmax')
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model

# Load dataset (for example, MNIST)
mnist = tf.keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Preprocess the data
train_images, test_images = train_images / 255.0, test_images / 255.0

# Flatten the images
train_images = train_images.reshape((-1, 784))
test_images = test_images.reshape((-1, 784))

# Create the model
model = create_model()

# Train the model
model.fit(train_images, train_labels, epochs=5)

# Evaluate the model
test_loss, test_acc = model.evaluate(test_images, test_labels)
print('Test accuracy:', test_acc)


Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11490434/11490434 ━━━━━━━━━━━━━━━━━━━━ 1179s 103us/step

Epoch 1/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 4s 1ms/step - accuracy: 0.8585 - loss: 0.4787
Epoch 2/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 2s 871us/step - accuracy: 0.9617 - loss: 0.1262
Epoch 3/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 2s 845us/step - accuracy: 0.9723 - loss: 0.0895
Epoch 4/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 2s 870us/step - accuracy: 0.9787 - loss: 0.0690
Epoch 5/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 2s 847us/step - accuracy: 0.9836 - loss: 0.0534
313/313 ━━━━━━━━━━━━━━━━━━━━ 0s 719us/step - accuracy: 0.9693 - loss: 0.1037
Test accuracy: 0.973800003528595
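
The Keras model above ends in a softmax layer and is trained with `sparse_categorical_crossentropy`, which takes integer class labels instead of one-hot vectors. Below is a small NumPy sketch of the two formulas, written from their textbook definitions and not taken from the Keras source. The logits and labels are made-up values, and the function names are chosen for this example.

```python
import numpy as np

def softmax(logits):
    """Turn each row of raw scores into probabilities that sum to 1."""
    # Subtracting the row maximum avoids overflow in exp without changing the result
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def sparse_categorical_crossentropy(labels, probs):
    """Mean negative log-probability assigned to the true class of each sample."""
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.log(picked).mean())

# Two samples, three classes (illustrative values)
logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 2.5, 0.3]])
labels = np.array([0, 1])

probs = softmax(logits)
loss = sparse_categorical_crossentropy(labels, probs)
print("probabilities:\n", probs.round(3))
print("loss:", loss)
```

The loss shrinks toward zero as the probability assigned to the correct class approaches 1, which is the trend visible in the per-epoch `loss` values printed above.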


In [6]:
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms

# Define the architecture of the neural network
class NeuralNet(nn.Module):
    def __init__(self):
        super(NeuralNet, self).__init__()
        self.fc1 = nn.Linear(784, 64)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(64, 64)
        self.fc3 = nn.Linear(64, 10)

    def forward(self, x):
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        x = self.relu(x)
        x = self.fc3(x)
        return x

# Load dataset (for example, MNIST)
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))])
trainset = torchvision.datasets.MNIST(root='./data', train=True, download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=32, shuffle=True)

# Create the model
model = NeuralNet()

# Define loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Train the model
for epoch in range(5):
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data
        inputs = inputs.view(-1, 784)
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
        if i % 1000 == 999:  # Print every 1000 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 1000))
            running_loss = 0.0

print('Finished Training')

# Evaluate the model
testset = torchvision.datasets.MNIST(root='./data', train=False, download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=32, shuffle=False)

correct = 0
total = 0
with torch.no_grad():
    for data in testloader:
        images, labels = data
        images = images.view(-1, 784)
        outputs = model(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %d %%' % (100 * correct / total))


[1,  1000] loss: 0.478
[2,  1000] loss: 0.214
[3,  1000] loss: 0.163
[4,  1000] loss: 0.132
[5,  1000] loss: 0.117
Finished Training
Accuracy of the network on the 10000 test images: 96 %
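
Two steps in the PyTorch cell above are easy to miss: `Normalize((0.5,), (0.5,))` rescales the [0, 1] pixel values produced by `ToTensor` to [-1, 1], and the predicted class is the index of the largest output score. The NumPy sketch below reproduces both calculations on made-up numbers. The helper names `normalize` and `accuracy` are assumptions for this example, not torchvision functions.

```python
import numpy as np

def normalize(pixels, mean=0.5, std=0.5):
    """Apply (x - mean) / std, the per-channel formula used by Normalize."""
    return (pixels - mean) / std

def accuracy(scores, labels):
    """Fraction of rows whose highest-scoring class matches the label."""
    predicted = scores.argmax(axis=1)
    return float((predicted == labels).mean())

# Pixel values in [0, 1] map to [-1, 1]
pixels = np.array([0.0, 0.5, 1.0])
print("normalized:", normalize(pixels))

# Four samples, three classes (illustrative scores)
scores = np.array([[0.1, 2.0, -1.0],
                   [1.5, 0.2, 0.3],
                   [0.0, 0.1, 0.2],
                   [0.9, 0.8, 0.7]])
labels = np.array([1, 0, 2, 1])
print("accuracy:", accuracy(scores, labels))
```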


# Graphical User Interface (GUI)
HighGUI module: OpenCV's HighGUI module provides functions for creating simple graphical user interfaces to visualize images, videos, and processing results.

In [7]:
import cv2

def display_image(image_path):
    # Load image
    image = cv2.imread(image_path)

    # Check if image is loaded successfully
    if image is None:
        print("Error: Unable to load image.")
        return

    # Display the image in a window
    cv2.imshow("Image", image)

    # Wait for a key press and close the window when any key is pressed
    cv2.waitKey(0)
    cv2.destroyAllWindows()

# Path to the image you want to display
image_path = "My.jpg"

# Display the image
display_image(image_path)
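
The display loops in this notebook depend on the return value of `cv2.waitKey`, which is the code of the pressed key, or -1 if no key was pressed before the timeout. The sketch below shows, in plain Python and without opening a window, how such a value can be mapped to an action and why it is commonly masked with `0xFF` first: on some platforms the returned integer may carry extra high bits. The function `key_action`, its action names, and the example value with a high bit set are hypothetical.

```python
def key_action(raw_key):
    """Map a waitKey-style return value to an action name (hypothetical helper)."""
    if raw_key == -1:
        return "none"  # no key was pressed before the timeout
    key = raw_key & 0xFF  # keep only the low byte
    if key == ord('q'):
        return "quit"
    if key == 27:  # Esc
        return "escape"
    return "ignore"

print(key_action(-1))                    # no key pressed
print(key_action(ord('q')))              # plain 'q'
print(key_action(0x100000 | ord('q')))   # 'q' with an extra high bit set
print(key_action(27))                    # Esc
```

Note that masking -1 directly gives 255, so a comparison such as `cv2.waitKey(1) & 0xFF == ord('q')` is simply false when no key is pressed.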


# Performance Optimization

In [8]:
import cv2

def load_and_process_image(image_path):
    # Load image
    image = cv2.imread(image_path)

    # Check if image is loaded successfully
    if image is None:
        print("Error: Unable to load image.")
        return

    # Convert the image to grayscale
    gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # Perform some image processing operations
    processed_image = cv2.blur(gray_image, (5, 5))

    return processed_image

def main():
    # Path to the image you want to process
    image_path = "My.jpg"

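    # Load and process the image
    processed_image = load_and_process_image(image_path)

    # Stop if the image could not be loaded (the function returned None)
    if processed_image is None:
        return

    # Display the processed image
    cv2.imshow("Processed Image", processed_image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()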

if __name__ == "__main__":
    main()
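
One common optimization when working with image arrays in Python is to replace per-pixel loops with whole-array NumPy operations. The sketch below averages each pixel over a 5x5 neighbourhood, the same kind of box filter that `cv2.blur` applies in the cell above, first with explicit loops and then with a summed-area table. It handles interior pixels only, so the output is smaller than the input and will not match `cv2.blur` at the borders. The function names and the synthetic image are assumptions for this example, and the timings depend on the machine.

```python
import time
import numpy as np

def box_blur_loop(image, k=5):
    """Mean over each k x k window using explicit Python loops (interior pixels only)."""
    rows, cols = image.shape
    out = np.empty((rows - k + 1, cols - k + 1), dtype=np.float64)
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = image[r:r + k, c:c + k].mean()
    return out

def box_blur_vectorized(image, k=5):
    """Same result computed from a summed-area table with array operations."""
    # integral[r, c] holds the sum of image[:r, :c]
    integral = np.zeros((image.shape[0] + 1, image.shape[1] + 1), dtype=np.float64)
    integral[1:, 1:] = image.astype(np.float64).cumsum(axis=0).cumsum(axis=1)
    # Sum of every k x k window from four table lookups
    window_sums = (integral[k:, k:] - integral[:-k, k:]
                   - integral[k:, :-k] + integral[:-k, :-k])
    return window_sums / (k * k)

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(120, 160)).astype(np.uint8)

start = time.perf_counter()
slow = box_blur_loop(image)
loop_seconds = time.perf_counter() - start

start = time.perf_counter()
fast = box_blur_vectorized(image)
vectorized_seconds = time.perf_counter() - start

print("max difference:", float(np.abs(slow - fast).max()))
print(f"loop: {loop_seconds:.4f} s, vectorized: {vectorized_seconds:.4f} s")
```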


In [9]:
import threading
import time

# Function to perform some computation (dummy example)
def compute_task(data):
    result = 0
    for i in range(data):
        result += i
    return result

# Function to be executed by each thread
def thread_task(data, thread_num):
    print("Thread {} started".format(thread_num))
    result = compute_task(data)
    print("Thread {} finished. Result: {}".format(thread_num, result))

def main():
    # Data to be processed
    data = 10000000

    # Number of threads to create
    num_threads = 4

    # Create and start threads
    threads = []
    for i in range(num_threads):
        thread = threading.Thread(target=thread_task, args=(data, i))
        thread.start()
        threads.append(thread)

    # Wait for all threads to finish
    for thread in threads:
        thread.join()

    print("All threads have finished execution.")

if __name__ == "__main__":
    start_time = time.time()
    main()
    end_time = time.time()
    print("Execution time:", end_time - start_time, "seconds")


Thread 0 started
Thread 1 started
Thread 2 started
Thread 3 started
Thread 0 finished. Result: 49999995000000
Thread 1 finished. Result: 49999995000000
Thread 2 finished. Result: 49999995000000
Thread 3 finished. Result: 49999995000000
All threads have finished execution.
Execution time: 1.5782699584960938 seconds
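
In the threading example above, `thread_task` prints its result but the value returned by `compute_task` is never collected. Because the loop is pure Python, standard CPython builds also run it under the global interpreter lock, so four threads are unlikely to finish faster than doing the same work sequentially. Below is a sketch that uses `concurrent.futures.ThreadPoolExecutor` from the standard library to gather the return values. The helper name `run_in_threads` and the smaller workload are assumptions chosen to keep the example quick.

```python
from concurrent.futures import ThreadPoolExecutor

def compute_task(n):
    """Sum the integers 0..n-1 (same dummy computation as above)."""
    result = 0
    for i in range(n):
        result += i
    return result

def run_in_threads(workloads, max_workers=4):
    """Run compute_task on each workload in a thread pool and return the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map returns results in the same order as the inputs
        return list(pool.map(compute_task, workloads))

workloads = [100_000] * 4
results = run_in_threads(workloads)
print(results)
```

Threads tend to help most when the work is dominated by I/O or by library calls that release the interpreter lock; for CPU-bound pure-Python loops, a process pool is the usual alternative.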
