# OpenCV + NumPy Cheatsheet

---

## 🗾️ 1. Image I/O & Properties

| Task | Code / Explanation |
|------|--------------------|
| Read a color image | `img = cv2.imread(image_path)` |
| Read grayscale image | `img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)` |
| Get image dimensions | `h, w, c = img.shape` (height, width, channels) |
| Check image type | `img.dtype` |
| Convert to float | `img = img.astype(np.float32)` |
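
The `h, w, c = img.shape` unpacking only works for color images, because a grayscale array has a two-element shape. A minimal sketch of a shape-agnostic lookup follows; the helper name `image_dims` is hypothetical (not part of OpenCV), and synthetic arrays stand in for `cv2.imread` output:

```python
import numpy as np

def image_dims(img):
    """Return (height, width, channels); grayscale images report 1 channel."""
    if img.ndim == 2:
        h, w = img.shape
        return h, w, 1
    h, w, c = img.shape
    return h, w, c

color = np.zeros((120, 160, 3), np.uint8)   # stands in for a BGR image
gray = np.zeros((120, 160), np.uint8)       # stands in for IMREAD_GRAYSCALE output

print(image_dims(color))  # (120, 160, 3)
print(image_dims(gray))   # (120, 160, 1)
```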

---

## 🎨 2. Image Initialization

| Task | Code |
|------|------|
| Create empty (zeros) image | `img = np.zeros((h, w, c), np.uint8)` |
| Clone shape & type | `img2 = np.zeros_like(img)` |
| Create image filled with 1s | `img = np.ones((h, w), np.float32)` |

---

## 🔀 3. Image Manipulation

| Task | Code |
|------|------|
| Invert image | `img_inv = 255 - img` |
| Normalize values to [0, 1] | `img = img / 255.0` |
| Custom normalization | `img[y,x] = ((img[y,x] - min) / (max - min)) * 255` |
| Resize image | `resized = cv2.resize(img, (width, height))` |
| Concatenate images | `cv2.hconcat([img1, img2])` (horizontal) |
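
Arithmetic on `uint8` arrays wraps around modulo 256, which is why `255 - img` is safe but brightening with `img + 50` is not. The sketch below shows the pitfall and one common workaround (widen, then clip); the helper name `brighten` is an assumption made for illustration:

```python
import numpy as np

def brighten(img, offset):
    """Add an offset to a uint8 image, saturating at 0 and 255 instead of wrapping."""
    wide = img.astype(np.int16) + offset          # widen so the sum cannot overflow
    return np.clip(wide, 0, 255).astype(np.uint8)

img = np.array([[0, 100, 250]], np.uint8)

wrapped = img + np.uint8(50)      # uint8 arithmetic: 250 + 50 wraps to 44
saturated = brighten(img, 50)     # 250 + 50 is clipped to 255
inverted = 255 - img              # stays within [0, 255], so nothing wraps

print(wrapped, saturated, inverted)
```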

---

## 🖛️ 4. Filtering & Convolution

| Task | Code |
|------|------|
| 2D Convolution | `cv2.filter2D(img, -1, kernel)` |
| Normalize result | `cv2.normalize(imgRes, imgRes, 0, 255, cv2.NORM_MINMAX)` |
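
`cv2.filter2D` slides the kernel over the image and sums the element-wise products at each position (a correlation: the kernel is not flipped). The NumPy sketch below illustrates that idea for odd-sized kernels. The function name `correlate2d_reflect` is hypothetical, and the reflect padding is chosen to resemble OpenCV's default border handling; treat it as an illustration rather than a drop-in replacement:

```python
import numpy as np

def correlate2d_reflect(img, kernel):
    """Correlate a 2-D image with an odd-sized kernel, using reflect padding."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img.astype(np.float64), ((ph, ph), (pw, pw)), mode="reflect")
    out = np.zeros(img.shape, np.float64)
    # Accumulate one shifted copy of the image per kernel coefficient
    for dy in range(kh):
        for dx in range(kw):
            out += kernel[dy, dx] * padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out

img = np.arange(25, dtype=np.float64).reshape(5, 5)
box = np.ones((3, 3)) / 9.0                 # 3x3 mean filter
smoothed = correlate2d_reflect(img, box)
print(smoothed[2, 2])   # mean of the 3x3 block around the centre pixel
```

The output stays in floating point here, so any conversion back to `uint8` (with clipping or normalization) is left to the caller.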

---

## 🎺 5. Histogram & Plotting (Matplotlib)

| Task | Code |
|------|------|
| Init histogram | `histo = np.zeros((256, 1), np.uint32)` (`uint16` bins overflow above 65535 pixels) |
| Calculate histogram | `cv2.calcHist([img], [0], None, [256], [0,256])` |
| Plot histogram | `plt.plot(histo); plt.xlim([0, 255]); plt.show()` |
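
For an 8-bit grayscale image the histogram is simply a count of each value from 0 to 255, so `np.bincount` gives the same counts as a manual loop. A minimal sketch on a synthetic image (the random array is made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)   # synthetic grayscale image

# One bin per gray level; minlength keeps 256 bins even if bright levels are absent
histo = np.bincount(img.ravel(), minlength=256)

print(histo.shape)   # (256,)
print(histo.sum())   # equals the number of pixels, 64 * 64
```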

---

## 🔱 6. Trackbars (Interactive Sliders)

| Task | Code |
|------|------|
| Create trackbar | `cv2.createTrackbar("thresh", "window", 0, 255, callback_fn)` |

---

## 🧬 7. Morphological Operations

| Task | Code |
|------|------|
| Thresholding | `cv2.threshold(img, 128, 255, cv2.THRESH_BINARY, img)` |
| Erode | `cv2.erode(img, kernel)` |
| Dilate | `cv2.dilate(img, kernel)` |
| MorphologyEx (e.g., Gradient) | `cv2.morphologyEx(img, cv2.MORPH_GRADIENT, kernel)` |
| Kernel structure | `cv2.getStructuringElement(cv2.MORPH_CROSS, (size, size))` |

---

## 🔍 8. Gradient & Edge Detection

| Task | Code |
|------|------|
| Compute gradients | `grad_x = img[:, :-1].astype(np.float32) - img[:, 1:]` (cast first: `uint8` differences wrap around) |
| Pad arrays | `grad_x = np.pad(grad_x, ((0,0),(0,1)), mode='constant')` |
| Gradient magnitude | `grad = np.sqrt(grad_x**2 + grad_y**2)` |

---

## 🎥 9. Video & Camera Input

| Task | Code |
|------|------|
| From webcam | `cv2.VideoCapture(0)` |
| From file | `cv2.VideoCapture('video.avi')` |
| From phone | `cv2.VideoCapture("http://IP:PORT/video")` |
| Get frame size | `w, h = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)), int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))` |
| Read frame | `ret, frame = cap.read()` |
| Flip frame | `cv2.flip(frame, 1, frame)` |

---

## 📀 10. Video Output (Recording)

| Task | Code |
|------|------|
| Define codec | `fourcc = cv2.VideoWriter_fourcc('X','V','I','D')` |
| Init writer | `out = cv2.VideoWriter('file.avi', fourcc, 30, (w, h))` |
| Write frame | `out.write(frame)` |
| Release | `out.release()` |

---

## 🔴 11. Image Color Spaces

| Task | Code |
|------|------|
| Split BGR | `img_b[:,:,0], img_g[:,:,1], img_r[:,:,2] = img[:,:,0], img[:,:,1], img[:,:,2]` |
| BGR to gray (manual) | `gray = (b.astype(np.float32) + g + r) / 3` (cast first to avoid `uint8` overflow), or use channel weights |
| BGR to HSV | `img_hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)` |
| BGR to Grayscale | `cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)` |

---

## 🔄 12. Utility / Control Flow

| Task | Code |
|------|------|
| Wait for key | `cv2.waitKey(0)` |
| Destroy all windows | `cv2.destroyAllWindows()` |
| Quit condition | `if cv2.waitKey(20) & 0xFF == ord('q'):` |
| Random point | `x, y = randrange(w), randrange(h)` |

---

## 🧠 13. Data Types in NumPy

| Type | Range |
|------|--------|
| `uint8` | [0, 255] |
| `int8` | [-128, 127] |
| `uint16` | [0, 65535] |
| `int16` | [-32768, 32767] |
| `uint32` | [0, 4294967295] |
| `int32` | [-2147483648, 2147483647] |
| `float32` | ≈ ±3.4e+38 (smallest positive normal value ≈ 1.2e-38) |
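
The ranges above can be queried at runtime with `np.iinfo` (integer types) and `np.finfo` (floating-point types), which avoids hard-coding them:

```python
import numpy as np

for dtype in (np.uint8, np.int8, np.uint16, np.int16, np.uint32, np.int32):
    info = np.iinfo(dtype)
    print(f"{dtype.__name__:>7}: [{info.min}, {info.max}]")

f32 = np.finfo(np.float32)
# tiny is the smallest positive normal value; max is the largest finite value
print(f"float32: tiny={f32.tiny:.3e}, max={f32.max:.3e}")
```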


In [1]:
''' TP 1 – Negative Transformation of an Image '''
import cv2
import numpy as np

# Load the image from file
image_path = 'sadcat.jpeg'
image = cv2.imread(image_path)

# Check if the image was loaded successfully
if image is None:
    print("Error: The image couldn't be loaded. Please check the file path.")
    exit(0)
else:
    # Get image dimensions (height, width, channels)
    h, w, c = image.shape

    # Create a float32 image with the same shape as the original
    imgRes = np.zeros(image.shape, np.float32)

    # Compute the negative of the image using vectorized operation
    # Subtract each pixel value from 255 (max intensity) to invert the image
    imgRes[:] = 255 - image[:]

    # Normalize the result to [0, 1] range (optional step depending on use case)
    imgRes = imgRes / 255.

    # Display the original and the resulting images
    cv2.imshow("Original Image", image)
    cv2.imshow("Negative Image", imgRes)

    # Wait for a key press before closing the image windows
    cv2.waitKey(0)
    cv2.destroyAllWindows()


In [None]:
''' TP 2 – Manual Normalization of a Grayscale Image '''
import cv2
import numpy as np
import matplotlib.pyplot as plt

# Load the image in grayscale mode
image_path = 'sadcat.jpeg'
img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)

# Check if the image was loaded successfully (before touching the pixel data)
if img is None:
    print("Error: The image couldn't be loaded. Please check the file path.")
    exit(0)

# Simulate an image with low brightness by halving every pixel value
img[:] = img[:] / 2

# Uncomment to save this darker version if needed
# cv2.imwrite("img2.jfif", img)

# Create an empty image to store the normalized result
imgNorm = np.zeros(img.shape, np.uint8)

# Get the image dimensions
h, w = img.shape

# Initialize min and max values for normalization
min_val = 255
max_val = 0

# Find the minimum and maximum pixel values in the image
for y in range(h):
    for x in range(w):
        if img[y, x] > max_val:
            max_val = img[y, x]
        if img[y, x] < min_val:
            min_val = img[y, x]

# Normalize the pixel values to span the full range [0, 255]
for y in range(h):
    for x in range(w):
        imgNorm[y, x] = ((img[y, x] - min_val) / (max_val - min_val)) * 255

# Print the original min and max values (for verification)
print("Min pixel value:", min_val)
print("Max pixel value:", max_val)

# Display the original (darkened) and normalized images
cv2.imshow("Original Image (Darkened)", img)
cv2.imshow("Normalized Image", imgNorm)

# Wait for a key press and then close all windows
cv2.waitKey(0)
cv2.destroyAllWindows()
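
The two nested loops above can be replaced by array operations. The sketch below is one possible vectorized version; the function name `normalize_minmax` is hypothetical, and the guard for a flat image (where max equals min) is an addition that the loop version does not have:

```python
import numpy as np

def normalize_minmax(img):
    """Stretch a grayscale image to the full [0, 255] range (vectorized)."""
    lo, hi = int(img.min()), int(img.max())
    if hi == lo:                       # flat image: avoid dividing by zero
        return np.zeros_like(img, dtype=np.uint8)
    scaled = (img.astype(np.float32) - lo) / (hi - lo) * 255
    return scaled.astype(np.uint8)     # truncates, like assigning into a uint8 array

dark = np.array([[10, 20], [60, 110]], np.uint8)   # synthetic low-contrast image
stretched = normalize_minmax(dark)
print(stretched)
```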


In [None]:
# Create an empty histogram array of size 256 (for 256 grayscale levels)
# np.uint32 is used because a single bin can exceed 65535 (the np.uint16 limit) on large images
histo1 = np.zeros((256, 1), np.uint32)

# Manually compute the histogram of the original (darkened) image
for y in range(h):
    for x in range(w):
        histo1[img[y, x], 0] += 1

# Use OpenCV’s built-in function to calculate histogram of the normalized image
# (the upper bound of the range is exclusive, so [0, 256] covers gray levels 0..255)
histo2 = cv2.calcHist([imgNorm], [0], None, [256], [0, 256])

# Plot both histograms using Matplotlib
plt.figure()
plt.title("Histograms: Original vs. Normalized Image")
plt.xlabel("Gray Level")
plt.ylabel("Number of Pixels")

# Plot the manually computed histogram for the original image
plt.plot(histo1, label="Original Image (Manual Histogram)")

# Plot the OpenCV-computed histogram for the normalized image
plt.plot(histo2, label="Normalized Image (cv2.calcHist)", linestyle='dashed')

# Limit x-axis to valid grayscale range
plt.xlim([0, 255])
plt.legend()
plt.show()


In [6]:
# === HISTOGRAM EXAMPLE 2 – Visualizing Horizontal and Vertical Projections ===

import cv2 
import numpy as np 

# Load the grayscale image
img = cv2.imread('sadcat.jpeg', cv2.IMREAD_GRAYSCALE)

# Apply binary thresholding: pixels > 130 become 255, others become 0
cv2.threshold(img, 130, 255, cv2.THRESH_BINARY, img)

# Get image dimensions
h, w = img.shape 

# Initialize arrays to count black pixels per row and column
lignes = np.zeros((h), np.uint16)  # Horizontal projection (per row)
cols = np.zeros((w), np.uint16)    # Vertical projection (per column)

# Count black pixels (value = 0) in each row and column
for i in range(h):
    for j in range(w):
        if img[i, j] == 0:
            lignes[i] += 1
            cols[j] += 1

# Create white images to visualize horizontal and vertical projections
imgLignes = np.full(img.shape, 255, dtype=np.uint8)
imgCols = np.full(img.shape, 255, dtype=np.uint8)

# Draw black lines in each row based on the count of black pixels
for i in range(h):
    for j in range(lignes[i]):
        imgLignes[i, j] = 0

# Draw black lines in each column based on the count of black pixels
for j in range(w):
    for i in range(cols[j]):
        imgCols[i, j] = 0

# Display the original and the histogram visualizations side by side
cv2.imshow("Source Image", img)
cv2.imshow("Horizontal Projection (Row Histogram)", cv2.hconcat([img, imgLignes]))
cv2.imshow("Vertical Projection (Column Histogram)", cv2.hconcat([img, imgCols]))

# Wait for a key press and close all windows
cv2.waitKey(0)
cv2.destroyAllWindows()
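
The per-row and per-column counts can also be computed without Python loops by summing a boolean mask along each axis. A minimal sketch on a small synthetic binary image:

```python
import numpy as np

binary = np.array([[0, 255, 0, 0],
                   [255, 255, 255, 255],
                   [0, 0, 255, 0]], np.uint8)   # synthetic thresholded image

black = (binary == 0)
rows = black.sum(axis=1)   # black pixels per row (horizontal projection)
cols = black.sum(axis=0)   # black pixels per column (vertical projection)

print(rows)   # [3 0 3]
print(cols)   # [2 1 1 2]
```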


In [4]:
from random import randrange 
import numpy as np
import cv2

# === EXERCISE 3 – Random Point Movement on a Binary Image ===

# Function to create an image with all white pixels and a single black pixel at a random location
def createImgWithPointRand(h, w):
    img = np.ones((h, w), np.float32)  # Create a white image (all values = 1.0)
    randPointY, randPointX = randrange(h), randrange(w)  # Choose a random point
    img[randPointY, randPointX] = 0  # Set that random point to black (0.0)
    return img

# Function to find the position of the black pixel (value = 0) in the image
def findBlackPixel(img):
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            if img[y, x] == 0:
                return (y, x)

# === Image configuration ===
heightImg = 200
widthImg = 400
step = 3  # Number of pixels to move each time

# Create the initial image and locate the black pixel
img = createImgWithPointRand(heightImg, widthImg)
(py, px) = findBlackPixel(img)

# Initialize input key
q = 'a'

# === Main interaction loop ===
while True:
    # Key '2' (ASCII 50, "down" on the numeric keypad): move the black pixel downward
    if q == 50 and py + step < heightImg:
        img[py, px] = 1
        img[py + step, px] = 0
        py = py + step

    # Key '8' (ASCII 56, "up" on the numeric keypad): move the black pixel upward
    if q == 56 and py - step >= 0:
        img[py, px] = 1
        img[py - step, px] = 0
        py = py - step

    # Key '4' (ASCII 52, "left" on the numeric keypad): move the black pixel to the left
    if q == 52 and px - step >= 0:
        img[py, px] = 1
        img[py, px - step] = 0
        px = px - step

    # Key '6' (ASCII 54, "right" on the numeric keypad): move the black pixel to the right
    if q == 54 and px + step < widthImg:
        img[py, px] = 1
        img[py, px + step] = 0
        px = px + step

    # Display the updated image
    cv2.imshow('Image', img)

    # Wait for a key press and get its ASCII code (only the last byte)
    q = cv2.waitKey(0) & 0xFF

    # Press '0' to quit the loop (ASCII code of '0' is 48)
    if ord('0') == q:
        break

# Close all OpenCV windows
cv2.destroyAllWindows()
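
The pixel search in `findBlackPixel` can be expressed with `np.argwhere`, which lists matching coordinates in the same row-by-row order as the nested loops. A small sketch; the name `find_black_pixel` and the explicit `None` return for "no match" are choices made here for illustration:

```python
import numpy as np

def find_black_pixel(img):
    """Return (y, x) of the first zero-valued pixel, or None if there is none."""
    hits = np.argwhere(img == 0)
    return tuple(int(v) for v in hits[0]) if len(hits) else None

img = np.ones((200, 400), np.float32)
img[37, 123] = 0

print(find_black_pixel(img))                             # (37, 123)
print(find_black_pixel(np.ones((4, 4), np.float32)))     # None
```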


In [7]:
''' TP 3 – Smoothing Filters: Mean and Median '''

import cv2
import numpy as np
import matplotlib.pyplot as plt

# Load the image in grayscale mode
image_path = 'sadcat.jpeg'
image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)

# Size of the neighborhood (must be odd)
vois = 3  # Kernel size (3x3 neighborhood)

# === Mean Filter ===
def filtreMoy(img):
    h, w = img.shape
    imgMoy = np.zeros(img.shape, img.dtype)

    for y in range(h):
        for x in range(w):
            # Skip pixels near the border to avoid out-of-bounds access
            if y < vois // 2 or y >= h - vois // 2 or x < vois // 2 or x >= w - vois // 2:
                imgMoy[y, x] = img[y, x]
            else:
                m = int(vois / 2)
                # Extract the neighborhood (voisinage)
                imgVois = img[y - m:y + m + 1, x - m:x + m + 1]
                # Compute the mean and assign it to the current pixel
                imgMoy[y, x] = np.mean(imgVois)
    return imgMoy

# === Median Filter ===
def filtreMedian(img):
    h, w = img.shape
    imgMed = np.zeros(img.shape, img.dtype)

    for y in range(h):
        for x in range(w):
            # Skip border pixels
            if y < vois // 2 or y >= h - vois // 2 or x < vois // 2 or x >= w - vois // 2:
                imgMed[y, x] = img[y, x]
            else:
                m = int(vois / 2)
                # Extract the neighborhood
                imgVois = img[y - m:y + m + 1, x - m:x + m + 1]
                # Compute the median and assign it to the current pixel
                imgMed[y, x] = np.median(imgVois)
    return imgMed

# Apply filters to the image
imgMoy = filtreMoy(image)
imgMed = filtreMedian(image)

# Display original and filtered images
cv2.imshow("Original Image", image)
cv2.imshow("Mean Filtered Image", imgMoy)
cv2.imshow("Median Filtered Image", imgMed)

cv2.waitKey(0)
cv2.destroyAllWindows()
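
The per-pixel loops above are easy to follow but slow on large images. One possible vectorized variant uses `sliding_window_view` (available in NumPy 1.20 and later) to expose every neighbourhood at once. The function name `window_filter` is hypothetical, and border pixels are copied unchanged, as in the loop version:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # NumPy >= 1.20

def window_filter(img, size=3, reducer=np.mean):
    """Apply reducer (np.mean or np.median) over size x size windows.

    Border pixels without a full neighbourhood are copied unchanged.
    """
    m = size // 2
    h, w = img.shape
    out = img.copy()
    windows = sliding_window_view(img, (size, size))   # shape (h-2m, w-2m, size, size)
    out[m:h - m, m:w - m] = reducer(windows, axis=(-2, -1))
    return out

img = np.zeros((5, 5), np.uint8)
img[2, 2] = 90                                 # single bright pixel (impulse noise)

mean_img = window_filter(img, 3, np.mean)      # impulse is spread over its neighbourhood
median_img = window_filter(img, 3, np.median)  # impulse is removed entirely
print(mean_img)
print(median_img)
```

The example also shows why the median filter is usually preferred for impulse ("salt and pepper") noise: the mean smears the outlier, while the median discards it.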


In [8]:
''' TP 4 – Interactive Image Thresholding with Trackbars '''

import cv2
import numpy as np
import matplotlib.pyplot as plt
import math

# === Load the grayscale image ===
image_path = 'sadcat.jpeg'
img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)

# Initial threshold value and type
th = 0           # Threshold level (0 to 255)
type_th = 0      # Thresholding method (OpenCV types 0 to 4)

# === Function to apply thresholding and display the result ===
def afficher():
    # Create an output image with the same shape and type as input
    imgRes = np.zeros_like(img)

    # Apply thresholding using OpenCV's built-in function
    # Parameters:
    # - img: input grayscale image
    # - th: threshold value
    # - 255: maximum value to use
    # - type_th: thresholding type (e.g., binary, inverse, trunc, etc.)
    # - imgRes: destination image
    cv2.threshold(img, th, 255, type_th, imgRes)

    # Show the result in a window
    cv2.imshow('img', imgRes)

# === Callback function: updates threshold value ===
def change_th(x):
    global th
    th = x
    afficher()

# === Callback function: updates thresholding method ===
def change_type(x):
    global type_th
    type_th = x
    afficher()

# Initial display of the image
afficher()

# === Create trackbars ===

# Threshold value trackbar: lets user choose a threshold between 0 and 255
cv2.createTrackbar("thresh", "img", 0, 255, change_th)

# Thresholding type trackbar: choose among 5 OpenCV thresholding modes
# 0: Binary
# 1: Binary Inverse
# 2: Truncate
# 3: To Zero
# 4: To Zero Inverse
cv2.createTrackbar("type", "img", 0, 4, change_type)

# Wait for a key press before closing the windows
cv2.waitKey(0)
cv2.destroyAllWindows()
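
To make the five thresholding modes concrete, the sketch below reproduces their per-pixel rules in NumPy. The function name `threshold_np` is hypothetical and the code is only meant to illustrate the rules selected by the "type" trackbar, not to replace `cv2.threshold`:

```python
import numpy as np

def threshold_np(img, thresh, maxval, kind):
    """NumPy sketch of the five basic threshold types (kind = 0..4)."""
    above = img > thresh
    if kind == 0:    # binary: maxval above the threshold, 0 elsewhere
        return np.where(above, maxval, 0).astype(img.dtype)
    if kind == 1:    # binary inverse: 0 above the threshold, maxval elsewhere
        return np.where(above, 0, maxval).astype(img.dtype)
    if kind == 2:    # truncate: values above the threshold are set to the threshold
        return np.where(above, thresh, img).astype(img.dtype)
    if kind == 3:    # to zero: values at or below the threshold become 0
        return np.where(above, img, 0).astype(img.dtype)
    if kind == 4:    # to zero inverse: values above the threshold become 0
        return np.where(above, 0, img).astype(img.dtype)
    raise ValueError("kind must be in 0..4")

img = np.array([50, 128, 200], np.uint8)
for kind in range(5):
    print(kind, threshold_np(img, 128, 255, kind))
```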


In [9]:
''' Gradient-Based Edge Detection with Interactive Thresholding '''

import cv2
import numpy as np

# === Load the image in grayscale ===
image_path = 'sadcat.jpeg'
img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)

# Initial threshold value for edge detection
th = 0

# === Function to compute and display the edge map ===
def afficher():
    # Create an empty image to store the contours (edges)
    imgContour = np.zeros_like(img)  # Same size and type as the original

    # Work in float32: uint8 subtraction would wrap around on negative differences
    src = img.astype(np.float32)

    # Compute the gradient in the X direction (horizontal difference)
    grad_x = src[:, :-1] - src[:, 1:]

    # Compute the gradient in the Y direction (vertical difference)
    grad_y = src[:-1, :] - src[1:, :]

    # Pad gradients to match the original image size
    grad_x = np.pad(grad_x, ((0, 0), (0, 1)), mode='constant')  # Pad last column
    grad_y = np.pad(grad_y, ((0, 1), (0, 0)), mode='constant')  # Pad last row

    # Compute the gradient magnitude (Euclidean norm)
    grad = np.sqrt(grad_x**2 + grad_y**2)

    # Apply threshold: highlight edges where gradient magnitude > th
    imgContour[grad > th] = 255  # Strong edge
    imgContour[grad <= th] = 0   # Weak or no edge

    # Display the resulting edge image
    cv2.imshow('img', imgContour)

# === Callback function to update the threshold from the trackbar ===
def change_th(x):
    global th
    th = x
    afficher()

# Create the display window
cv2.namedWindow("img")

# Create a trackbar to adjust the threshold value
cv2.createTrackbar("thresh", "img", 0, 256, change_th)

# Initial display
afficher()

# Wait for key press before closing
cv2.waitKey(0)
cv2.destroyAllWindows()


In [None]:
# === Section 1: Gaussian Blur (Smoothing) ===

import cv2
import numpy as np

# Load grayscale image
image_path = 'sadcat.jpeg'
img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)

# Define a 3x3 Gaussian kernel
kernel = np.array([[1, 2, 1],
                   [2, 4, 2],
                   [1, 2, 1]]) / 16

# Apply the Gaussian filter using convolution
imgRes = cv2.filter2D(img, -1, kernel)

# Normalize the result to the 0–255 range
cv2.normalize(imgRes, imgRes, 0, 255, cv2.NORM_MINMAX)

# Display the original and filtered image
cv2.imshow("Original Image", img)
cv2.imshow("Gaussian Blurred", imgRes)

cv2.waitKey(0)
cv2.destroyAllWindows()




In [None]:
# === Section 2: Laplacian Filtering + Edge Enhancement ===

import cv2
import numpy as np

# Load image in grayscale
image_path = 'sadcat.jpeg'
img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)

# Define a Laplacian kernel (edge detection)
kernel = np.array([[0, -1, 0],
                   [-1, 4, -1],
                   [0, -1, 0]])

# Apply convolution
imgRes = cv2.filter2D(img.astype(np.int16), -1, kernel)

# Add the result to the original image to enhance edges
imgRes = img + imgRes

# Normalize and display
cv2.normalize(imgRes, imgRes, 0, 255, cv2.NORM_MINMAX)
cv2.imshow("Original Image", img)
cv2.imshow("Laplacian Enhanced", imgRes.astype(np.uint8))

cv2.waitKey(0)
cv2.destroyAllWindows()


In [None]:
# === Section 3: Custom Gradient Kernel + Enhancement ===

import cv2
import numpy as np

# Load grayscale image
image_path = 'sadcat.jpeg'
img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)

# Define a simple custom gradient kernel
kernel = np.array([[0, -1, 0],
                   [-1, 1, -1],
                   [0, -1, 0]])

# Apply filter and enhance result
imgRes = cv2.filter2D(img.astype(np.int16), -1, kernel)
imgRes = img + imgRes

# Normalize and display
cv2.normalize(imgRes, imgRes, 0, 255, cv2.NORM_MINMAX)
cv2.imshow("Original Image", img)
cv2.imshow("Gradient Enhanced", imgRes.astype(np.uint8))

cv2.waitKey(0)
cv2.destroyAllWindows()


In [None]:
# === Section 4: Sharpening with Laplacian Kernel ===

import cv2
import numpy as np

# Load grayscale image
image_path = 'sadcat.jpeg'
img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)

# Sharpening kernel (Laplacian + identity)
kernel = np.array([[0, -1, 0],
                   [-1, 5, -1],
                   [0, -1, 0]])

# Apply sharpening filter
imgRes = cv2.filter2D(img.astype(np.int16), -1, kernel)

# Normalize and display
cv2.normalize(imgRes, imgRes, 0, 255, cv2.NORM_MINMAX)
cv2.imshow("Original Image", img)
cv2.imshow("Sharpened", imgRes.astype(np.uint8))

cv2.waitKey(0)
cv2.destroyAllWindows()


In [None]:
# === Section 5: Gaussian Kernel Generator and Display ===

import math
import numpy as np

# Function to compute the Gaussian value for coordinates (x, y)
def gauss(x, y, sigma):
    part1 = 1 / (2 * math.pi * sigma**2)
    part2 = -((x**2 + y**2) / (2 * sigma**2))
    return part1 * math.exp(part2)

# Print and view Gaussian kernel
def print_gauss(sigma=1.4, vois_mat=5):
    vois = vois_mat // 2
    som = 0.0
    for i in range(-vois, vois + 1):
        for j in range(-vois, vois + 1):
            val = round(gauss(i, j, sigma) * 185, 0)
            print('{:02.2f}'.format(val), '\t', end="")
            som += val
        print('')
    print('Sum =', som)

print_gauss()


In [None]:
# === Section 6: Return the Gaussian kernel as a NumPy matrix ===

def check_gauss(mat):
    for row in mat:
        print(" ".join(map(str, row)))

def get_gauss(sigma=1.4, vois_mat=5):
    mat_gauss = np.zeros((vois_mat, vois_mat), float)
    vois = vois_mat // 2
    som = 0.0
    for i in range(-vois, vois + 1):
        for j in range(-vois, vois + 1):
            val = round(gauss(i, j, sigma) * 185, 0)
            mat_gauss[i + vois][j + vois] = val
            som += val
    return som, mat_gauss

som, mat_gauss = get_gauss()
check_gauss(mat_gauss)
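
For filtering it is often more convenient to have a kernel whose coefficients sum to 1, so the overall brightness is preserved. A minimal vectorized sketch follows; the function name `gaussian_kernel` is hypothetical, and unlike `get_gauss` it returns normalized floating-point weights instead of rounded integers:

```python
import numpy as np

def gaussian_kernel(sigma=1.4, size=5):
    """Return a size x size Gaussian kernel whose coefficients sum to 1."""
    half = size // 2
    ax = np.arange(-half, half + 1)
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    # Normalizing by the sum makes the 1 / (2 * pi * sigma^2) factor unnecessary
    return kernel / kernel.sum()

k = gaussian_kernel()
print(k.shape, round(float(k.sum()), 6))   # (5, 5) 1.0
```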


In [None]:
# === Section 7: Apply Custom Gaussian Kernel ===

import cv2
import numpy as np

# Load grayscale image
image_path = 'sadcat.jpeg'
img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)

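# Generate Gaussian kernel and normalize it so its coefficients sum to 1
# (unnormalized, the filter scales brightness by `som` and can exceed the int16 range)
som, kernel = get_gauss()
kernel = kernel / som

# Apply the kernel as a filter
imgRes = cv2.filter2D(img.astype(np.int16), -1, kernel)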

# Normalize and display result
cv2.normalize(imgRes, imgRes, 0, 255, cv2.NORM_MINMAX)
cv2.imshow("Original Image", img)
cv2.imshow("Gaussian Filtered (Custom)", imgRes.astype(np.uint8))

cv2.waitKey(0)
cv2.destroyAllWindows()

In [None]:
''' TP 6 – Morphological Operations with Trackbars (Erosion, Dilation, Gradient) '''

import cv2
import numpy as np

# === Load the grayscale image ===
image_path = 'sadcat.jpeg'
img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)

# Convert to binary image using a fixed threshold
cv2.threshold(img, 128, 255, cv2.THRESH_BINARY, img)

# Create a window for each operation
cv2.namedWindow("Erosion")
cv2.namedWindow("Dilation")
cv2.namedWindow("Morph Gradient")

# === Initial sizes for structuring elements (adjusted by trackbars) ===
sizeErode = 1
sizeDilate = 1
sizeMorph = 1

# === Erosion Function ===
def erode_func():
    size = sizeErode * 2 + 1  # Ensure odd kernel size (e.g., 3, 5, 7...)
    kernel = cv2.getStructuringElement(cv2.MORPH_CROSS, (size, size))
    print("Erode kernel:\n", kernel)
    img_erode = cv2.erode(img, kernel)
    cv2.imshow("Erosion", img_erode)

# === Dilation Function ===
def dilate_func():
    size = sizeDilate * 2 + 1
    kernel = cv2.getStructuringElement(cv2.MORPH_CROSS, (size, size))
    print("Dilate kernel:\n", kernel)
    img_dilate = cv2.dilate(img, kernel)
    cv2.imshow("Dilation", img_dilate)

# === Morphological Gradient Function ===
def morph_func():
    size = sizeMorph * 2 + 1
    kernel = cv2.getStructuringElement(cv2.MORPH_CROSS, (size, size))
    print("Morph Gradient kernel:\n", kernel)
    # Morph gradient = dilation - erosion
    img_morph = cv2.morphologyEx(img, cv2.MORPH_GRADIENT, kernel)
    cv2.imshow("Morph Gradient", img_morph)

# === Trackbar callbacks ===
def changeErodeSize(x):
    global sizeErode
    sizeErode = x
    erode_func()

def changeDilateSize(x):
    global sizeDilate
    sizeDilate = x
    dilate_func()

def changeMorphSize(x):
    global sizeMorph
    sizeMorph = x
    morph_func()

# === Create trackbars to adjust structuring element size dynamically ===
cv2.createTrackbar("Erode Size", "Erosion", sizeErode, 17, changeErodeSize)
cv2.createTrackbar("Dilate Size", "Dilation", sizeDilate, 17, changeDilateSize)
cv2.createTrackbar("Morph Size", "Morph Gradient", sizeMorph, 17, changeMorphSize)

# === Show the original binary image ===
cv2.imshow("Original Binary", img)

# === Initial display ===
erode_func()
dilate_func()
morph_func()

# === Wait until key is pressed ===
cv2.waitKey(0)
cv2.destroyAllWindows()
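
The relation "morphological gradient = dilation - erosion" can be checked by hand on a tiny image. The NumPy sketch below implements erosion and dilation for a fixed 3x3 cross only; the helper names are hypothetical, and the padding values are chosen so that the image border neither erodes nor grows:

```python
import numpy as np

def _cross_neighbours(img, pad_value):
    """Stack each pixel with its 4 neighbours (3x3 cross) along a new axis."""
    p = np.pad(img, 1, mode="constant", constant_values=pad_value)
    return np.stack([p[1:-1, 1:-1],                 # centre
                     p[:-2, 1:-1], p[2:, 1:-1],     # up, down
                     p[1:-1, :-2], p[1:-1, 2:]])    # left, right

def erode_cross(img):
    return _cross_neighbours(img, 255).min(axis=0)  # minimum over the cross

def dilate_cross(img):
    return _cross_neighbours(img, 0).max(axis=0)    # maximum over the cross

img = np.zeros((5, 5), np.uint8)
img[1:4, 1:4] = 255                                 # 3x3 white square

eroded = erode_cross(img)
dilated = dilate_cross(img)
gradient = dilated - eroded                         # outline around the square
print(gradient)
```

Because dilation is never smaller than erosion at any pixel, the `uint8` subtraction cannot wrap around here.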


In [2]:
''' TP 7 '''

import cv2
import numpy as np

# === Read the original color image ===
image_path = 'sadcat.jpeg'
img = cv2.imread(image_path, cv2.IMREAD_COLOR)

# Create empty images for each color channel with the same shape and type as original
img_b = np.zeros(img.shape, img.dtype)
img_g = np.zeros(img.shape, img.dtype)
img_r = np.zeros(img.shape, img.dtype)

# Get image dimensions
h, w, c = img.shape

'''
# Method 1: Manual pixel-wise assignment (commented out for performance)
for y in range(h):
    for x in range(w):
        img_b[y,x,0] = img[y,x,0]
        img_g[y,x,1] = img[y,x,1]
        img_r[y,x,2] = img[y,x,2]
'''

# Method 2: Efficient slicing to separate channels
img_b[:, :, 0], img_g[:, :, 1], img_r[:, :, 2] = img[:, :, 0], img[:, :, 1], img[:, :, 2]

'''
# Compute grayscale manually using channel weighting:
# Grayscale = 0.1*Blue + 0.6*Green + 0.3*Red
# Using weights helps avoid truncation and keeps float precision
'''

# Basic grayscale computation using equal contribution from each channel
img_gray = (np.float32(img_b[..., 0]) + 
            np.float32(img_g[..., 1]) + 
            np.float32(img_r[..., 2])) / 3

# Optional: Normalize grayscale to [0, 1] for floating point visualization
img_gray = img_gray / 255  # (divide by 3*255 instead if the channels were summed without averaging)

# Convert BGR image to HSV color space (Hue, Saturation, Value)
img_hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Convert original image to float representation in range [0, 1]
img_float = img / 255

# === Display results ===
cv2.imshow("image B", img_b)
cv2.imshow("image G", img_g)
cv2.imshow("image R", img_r)
cv2.imshow("image gray", img_gray)
cv2.imshow("image hsv", img_hsv)
cv2.imshow("image float", img_float)

cv2.waitKey(0)
cv2.destroyAllWindows()


In [None]:
''' TP 8 – Video Sampling and Recording '''

import cv2
import numpy as np
import time 

# === Capture source ===
# Choose one of the following:
# cap = cv2.VideoCapture(0)                       # 1. For webcam
# cap = cv2.VideoCapture('output.avi')           # 2. From a saved video
# url = "http://192.168.226.189:8080/video"      
# cap = cv2.VideoCapture(url)                    # 3. IP camera (mobile phone via IP Webcam app)

cap = cv2.VideoCapture(0)  # ← Example: using webcam

# === Check if capture opened successfully ===
if not cap.isOpened():
    print("Error: Unable to open video source.")
    exit(0)

# === Get frame dimensions (only meaningful once the source is open) ===
frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

# === Video writer (output file with XVID codec, 30 fps) ===
fourcc = cv2.VideoWriter_fourcc(*'XVID')
out = cv2.VideoWriter('output2.avi', fourcc, 30, (frame_width, frame_height))

# === Frame reading loop ===
while cap.isOpened():

    debut = time.time()  # Time before reading the frame

    ret, frame = cap.read()  # Read a frame

    if not ret:
        print("Error: Unable to read frame.")
        break

    # Optional: Flip the frame (uncomment if needed)
    # frame = cv2.flip(frame, 0)

    out.write(frame)              # Save the frame to the output file
    cv2.imshow("image", frame)    # Display the frame

    # Exit loop if 'q' is pressed
    if cv2.waitKey(20) & 0xFF == ord('q'): 
        break

    # Print time per frame and FPS
    time_iter = time.time() - debut
    print("time =", round(time_iter, 4), "s | fps =", round(1. / time_iter, 2))

# === Release resources ===
out.release()
cap.release()
cv2.destroyAllWindows()


In [None]:
''' TP 9 '''

import cv2
import numpy as np

# Uncomment this line to use your phone's IP webcam
# url = "http://192.168.226.189:8080/video"
# VideoCap = cv2.VideoCapture(url)

# Use default webcam (computer camera)
VideoCap = cv2.VideoCapture(0)

# Define HSV color range for detection (example: blue object)
lo = np.array([95, 80, 60])      # Lower HSV bound
hi = np.array([115, 255, 255])   # Upper HSV bound

# Optional: set up video recording
fourcc = cv2.VideoWriter_fourcc(*'XVID')
out = cv2.VideoWriter('recorded_output.avi', fourcc, 30, (640, 480))

# === Function to detect objects within HSV range ===
def detect_inrange(image, area_min, area_max):
    points = []
    image = cv2.blur(image, (5, 5))  # Blur to reduce noise
    image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)  # Convert to HSV

    # Create mask from the global HSV range (lo, hi)
    mask = cv2.inRange(image, lo, hi)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, None, iterations=2)  # Clean mask

    # Find contours ([-2] selects the contour list in both OpenCV 3 and OpenCV 4)
    elements = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
    elements = sorted(elements, key=lambda x: cv2.contourArea(x), reverse=True)

    for element in elements:
        area = int(cv2.contourArea(element))
        # Keep only contours whose area lies strictly between the two bounds
        if area_min < area < area_max:
            ((x, y), radius) = cv2.minEnclosingCircle(element)
            points.append(np.array([int(x), int(y), int(radius), area]))

    return image, mask, points

# === Verify capture is working ===
if not VideoCap.isOpened():
    print("Error: Cannot open video stream.")
    exit(0)

# === Main loop ===
while True:
    ret, frame = VideoCap.read()
    if not ret:
        print("Error: Cannot read frame.")
        break

    frame = cv2.resize(frame, (640, 480))
    cv2.flip(frame, 1, frame)  # Flip horizontally

    image, mask, points = detect_inrange(frame, 1000, 3000)

    # Debug point
    cv2.circle(frame, (100, 100), 20, (0, 255, 0), 5)
    print(image[100, 100])

    # Draw circle and info if any point found
    if len(points) > 0:
        cv2.circle(frame, (points[0][0], points[0][1]), points[0][2], (0, 0, 255), 2)
        cv2.putText(frame, str(points[0][3]), (points[0][0], points[0][1]),
                    cv2.FONT_HERSHEY_COMPLEX, 1, (255, 0, 0), 2, cv2.LINE_AA)

    # Display mask and frame
    if mask is not None:
        cv2.imshow("mask", mask)

    cv2.imshow("image", frame)
    out.write(frame)  # Save to video file

    # Press 'q' to stop
    if cv2.waitKey(10) & 0xFF == ord('q'):
        break

# Cleanup
VideoCap.release()
out.release()
cv2.destroyAllWindows()


In [1]:
''' TP 10 : Kalman Filter Tracking '''

import numpy as np

class KalmanFilter:
    def __init__(self, dt, point):
        self.dt = dt  # Time step

        # Initial state vector [x, y, vx, vy]
        self.E = np.matrix([[point[0]], [point[1]], [0], [0]])

        # State transition matrix (A)
        self.A = np.matrix([
            [1, 0, self.dt, 0],
            [0, 1, 0, self.dt],
            [0, 0, 1, 0],
            [0, 0, 0, 1]
        ])

        # Observation matrix (we only observe position x, y)
        self.H = np.matrix([
            [1, 0, 0, 0],
            [0, 1, 0, 0]
        ])

        # Process noise covariance (Q)
        self.Q = np.matrix([
            [1, 0, 0, 0],
            [0, 1, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]
        ])

        # Measurement noise covariance (R)
        self.R = np.matrix([
            [1, 0],
            [0, 1]
        ])

        # Initial estimation error covariance (P)
        self.P = np.eye(4)

    def predict(self):
        # Predict the next state
        self.E = self.A @ self.E

        # Update error covariance
        self.P = self.A @ self.P @ self.A.T + self.Q

        return self.E

    def update(self, z):
        # Compute Kalman Gain
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)

        # Update estimate with measurement z
        self.E = np.round(self.E + K @ (z - self.H @ self.E))

        # Update error covariance
        I = np.eye(self.H.shape[1])
        self.P = (I - K @ self.H) @ self.P

        return self.E
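
To see the predict/update cycle in action without a camera, the sketch below runs the same equations on a synthetic target moving at constant velocity. It is a standalone illustration using plain `ndarray` objects rather than the class above; the motion, the noise-free measurements, the step `dt = 1.0`, and the omission of the rounding step are all assumptions made for the example:

```python
import numpy as np

dt = 1.0
A = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], float)      # constant-velocity transition
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], float)      # only the position is observed
Q = np.eye(4)                            # process noise covariance (assumed)
R = np.eye(2)                            # measurement noise covariance (assumed)

x = np.array([[10.0], [10.0], [0.0], [0.0]])   # initial state [x, y, vx, vy]
P = np.eye(4)

true_pos = np.array([[10.0], [10.0]])
velocity = np.array([[2.0], [1.0]])            # synthetic target motion per step

for _ in range(50):
    true_pos = true_pos + velocity * dt
    # Predict
    x = A @ x
    P = A @ P @ A.T + Q
    # Update with a noise-free measurement of the true position
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (true_pos - H @ x)
    P = (np.eye(4) - K @ H) @ P

error = float(np.linalg.norm(H @ x - true_pos))
print("position error after 50 steps:", round(error, 6))
print("estimated velocity:", x[2:].ravel())
```

Although the velocity is never measured directly, the filter recovers it from successive position measurements, which is what makes the velocity arrow in a tracking display possible.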


In [None]:
''' TP 11 : Face Detection with Kalman Filter '''

import cv2
import numpy as np


# HSV range for optional object detection (e.g., blue object)
lo = np.array([95, 100, 30])
hi = np.array([125, 255, 255])

# Function to detect objects in a specific HSV color range
def detect_inrange(image, surfaceMin, surfaceMax):
    points = []
    image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)  # Convert to HSV
    image = cv2.blur(image, (5, 5))                 # Blur to reduce noise
    mask = cv2.inRange(image, lo, hi)               # Threshold in range
    mask = cv2.erode(mask, None, iterations=2)      # Morphological erosion
    mask = cv2.dilate(mask, None, iterations=2)     # Morphological dilation

    # Detect contours in the mask
    elements = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
    elements = sorted(elements, key=lambda x: cv2.contourArea(x), reverse=True)

    for element in elements:
        area = cv2.contourArea(element)  # Compute the contour area once
        print('Area:', area)
        if surfaceMin < area < surfaceMax:
            ((x, y), rayon) = cv2.minEnclosingCircle(element)
            points.append(np.array([int(x), int(y)]))
            break

    return points, mask

# Load OpenCV's built-in Haar cascade once, not on every frame
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_alt2.xml")
if face_cascade.empty():
    raise IOError("Failed to load Haar cascade XML file.")

# Function to detect faces using the Haar cascade classifier
def detect_visage(image):
    points = []
    rects = []
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.2, minNeighbors=3)

    for x, y, w, h in faces:
        x, y, w, h = int(x), int(y), int(w), int(h)
        points.append(np.array([x + w // 2, y + h // 2]))  # Face center
        rects.append(((x, y), (x + w, y + h)))             # (top-left, bottom-right) as int tuples

    return points, rects


# Open webcam stream
VideoCap = cv2.VideoCapture(0)

# Initialize Kalman Filter with timestep = 0.1s and initial position [10,10]
KF = KalmanFilter(0.1, [10, 10])

while True:
    mask = None
    rects = None

    ret, frame = VideoCap.read()  # Read frame from webcam
    if not ret:                   # Stop if the camera returned no frame
        break

    frame = cv2.flip(frame, 1)    # Mirror the frame (optional for user-facing cameras)

    # Uncomment if using color range detection:
    # points, mask = detect_inrange(frame, 3000, 7000)

    points, rects = detect_visage(frame)  # Use face detection instead

    # Predict next state with the Kalman filter, flattened to a 1-D array
    # so that etat[i] is a scalar: [x, y, vx, vy]
    etat = np.asarray(KF.predict()).ravel().astype(np.int32)

    # Draw predicted position (green)
    cv2.circle(frame, (int(etat[0]), int(etat[1])), 2, (0, 255, 0), 5)

    # Draw velocity vector (green arrow)
    cv2.arrowedLine(
        frame,
        (int(etat[0]), int(etat[1])),
        (int(etat[0] + etat[2]), int(etat[1] + etat[3])),
        color=(0, 255, 0),
        thickness=3,
        tipLength=0.2
    )

    # If a new measurement is available (face detected)
    if len(points) > 0:
        KF.update(np.expand_dims(points[0], axis=-1))  # Update Kalman filter
        cv2.circle(frame, (int(points[0][0]), int(points[0][1])), 10, (0, 0, 255), 2)  # Draw measured position (red)

    # Draw face rectangle if detected (rects is an empty list when no face is found)
    if rects:
        (x1, y1), (x2, y2) = rects[0]
        cv2.rectangle(frame, (int(x1), int(y1)), (int(x2), int(y2)), (0, 0, 255), 1, cv2.LINE_AA)

    # Show the mask (only if color detection was used)
    if mask is not None:
        cv2.imshow('mask', mask)

    # Show the result frame
    cv2.imshow('frame', frame)

    # Exit on key 'q'
    if cv2.waitKey(10) & 0xFF == ord('q'):
        break

# Release resources
VideoCap.release()
cv2.destroyAllWindows()


In [None]:
'''TP 12: Cell phone detection and tracking'''
import cv2  # Frame capture, drawing, and display
from deep_sort_realtime.deepsort_tracker import DeepSort  # Multi-object tracker
from ultralytics import YOLO  # Object detector


# Uncomment (after adding `import os`) to work around duplicate OpenMP library errors
# os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"


# Initialize the DeepSort tracker (a single instance).
# DeepSort uses:
#   - Kalman filters and the Hungarian algorithm for prediction and association
#   - a neural network for appearance-based re-identification
# This keeps object identities consistent even when an object temporarily
# disappears from view or the camera moves slightly.
# Each tracked object is assigned a unique ID used to follow it across frames.
object_tracker = DeepSort()

# Initialize the YOLO model through the Ultralytics API,
# using a nano-sized checkpoint (yolov5n) for faster inference
# DETAILS
'''
model = YOLO(model, task=None)
model (required):
    Path to the model weights, custom or pretrained.
        Pretrained YOLOv8 weights are named "yolov8<c>.pt", where <c> is one of
            n (nano), s (small), m (medium), l (large), x (extra large)
task (optional):
    The task the model performs, for example:
        "detect": Object detection.
        "segment": Instance segmentation.
        "classify": Image classification.
    When omitted, the task is inferred from the weights.
Modes are not constructor arguments. Each mode is a method on the model:
    model.train(): Train a new or custom YOLO model.
    model.val(): Validate the performance of the model on a dataset.
    model.predict() or model(...): Make predictions on images or videos.
    model.export(): Export the model for inference (e.g., ONNX, TensorRT).

Task inferred from the weights file:
- model = YOLO("yolov8n.pt")     -> "detect"
- model = YOLO("yolov8n-seg.pt") -> "segment"
- model = YOLO("yolov8n-cls.pt") -> "classify"
'''
model = YOLO("yolo-Weights/yolov5n.pt")


# Open the webcam (index 0 is the default camera, here the laptop's built-in one)
# DETAILS
'''
cap = cv2.VideoCapture(source, apiPreference)
Opens a video source that is then read frame by frame.
Parameters:
    source (required):
        Specifies the video source.
        Integer: index of the camera.
            0: Default camera
            1: Second connected camera, and so on.
        String: path to a video file
    apiPreference (optional):
        Specifies which API backend to use (e.g., DirectShow, Media Foundation).
        Common values include:
            cv2.CAP_ANY (default): Auto-select the backend.
            cv2.CAP_DSHOW: DirectShow (Windows).
            cv2.CAP_AVFOUNDATION: AVFoundation (macOS).
'''
cap = cv2.VideoCapture(0)

# Set the capture width and height
# DETAILS
'''
cap.set(propId, value)
- cv2.CAP_PROP_FRAME_WIDTH (numeric id 3): the video frame width in pixels.
- cv2.CAP_PROP_FRAME_HEIGHT (numeric id 4): the video frame height in pixels.
'''
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

# Read the class names known to the YOLO model
# DETAILS
'''
COCO dataset
model.names is a dictionary mapping class index (key) to class name (value):
    {
        0: "person",
        1: "bicycle",
        2: "car",
        ...
        79: "toothbrush"
    }
'''
classNames = model.names

# Index of the class to detect; in COCO, "cell phone" is class 67
phone_class_index = 67

# Loop for as long as the webcam is open
while cap.isOpened():
    # cap.read() returns:
    #   success: True if a frame was read, False otherwise.
    #   img: the frame (image), valid only when success is True.
    success, img = cap.read()

    # Exit the loop if the frame could not be read
    if not success:
        break
    
    # DETAILS
    # img is the frame that was just read.
    # stream=True makes YOLO return a generator of results instead of a list.
    # Results are produced one at a time as the loop consumes them, so each
    # one can be processed individually, which also saves memory.
    #
    # The call performs:
    #   preprocessing: resizing and normalizing the frame for the network
    #   inference: running the YOLO CNN to detect objects
    #   streaming: yielding the results lazily, as described above
    results = model(img, stream=True)

    # Prepare a list to store detections for DeepSORT
    detections = []

    # Process results from YOLO.
    # Each result r exposes, per detection:
    #   r.boxes.xyxy: bounding box corners (x1, y1, x2, y2)
    #   r.boxes.cls:  index of the detected class
    #   r.boxes.conf: confidence score of the detection (a probability)
    for r in results:
        # All boxes detected in this frame
        boxes = r.boxes
        for box in boxes:
            # Class index of this detection
            cls = int(box.cls[0])

            # Only process (detect and track) objects of class 'cell phone'
            if cls == phone_class_index:
                # Get the box corners and convert them from float to int,
                # because image pixels are discrete and indexed by integers
                x1, y1, x2, y2 = box.xyxy[0]
                x1, y1, x2, y2 = int(x1), int(y1), int(x2), int(y2)

                # Calculate the center of the bounding box
                center_x = (x1 + x2) // 2
                center_y = (y1 + y2) // 2

                # Add detection to the list for DeepSORT, which expects
                # ([left, top, width, height], confidence, class)
                detections.append(([x1, y1, x2 - x1, y2 - y1], float(box.conf[0]), cls))

                # DISPLAY
                # cv2.rectangle(frame, top-left corner, bottom-right corner, color, border thickness)
                # cv2.circle(frame, center, radius, color, thickness); a thickness of -1 fills the circle
                # Draw the bounding box and its center
                cv2.rectangle(img, (x1, y1), (x2, y2), (255, 0, 255), 3)
                cv2.circle(img, (center_x, center_y), 5, (0, 255, 0), -1)

                # Display the center position (x, y) of the object as text above the box
                center_text = f"Center: ({center_x}, {center_y})"
                org = (x1, y1 - 10)
                font = cv2.FONT_HERSHEY_SIMPLEX
                fontScale = 0.7
                color = (0, 255, 0)
                thickness = 2
                # cv2.putText(frame, text, position, font, fontScale, color, thickness)
                cv2.putText(img, center_text, org, font, fontScale, color, thickness)


    # Pass detections to DeepSORT to track the object.
    # Matching detections to existing tracks:
    #     DeepSort tries to match the new detections to previously tracked objects using:
    #         Bounding box overlap (Intersection over Union, IoU).
    #         Appearance features (if enabled).
    # Updating tracks:
    #     For matched detections, DeepSort updates the state of the corresponding track.
    # Creating new tracks:
    #     A detection that cannot be matched to an existing track starts a new track.
    # Removing lost tracks:
    #     Tracks that have not been updated for several frames are deleted.
    #
    # Return value:
    #     tracks: a list of track objects, each providing
    #         track_id: the unique ID of the track.
    #         to_ltrb(): the bounding box [left, top, right, bottom] of the tracked object.
    #         is_confirmed(): whether the track is active and confirmed.
    #         Optionally, additional information.
    tracks = object_tracker.update_tracks(detections, frame=img)

    # Draw a moving dot for each track (phone)
    for track in tracks:
        
        if not track.is_confirmed():
            continue
        track_id = track.track_id
        ltrb = track.to_ltrb()

        # Calculate the center of the bounding box for the dot
        center_x = int((ltrb[0] + ltrb[2]) // 2)
        center_y = int((ltrb[1] + ltrb[3]) // 2)

        # Draw the dot at the center of the tracked phone object
        cv2.circle(img, (center_x, center_y), 5, (0, 0, 255), -1)  # Red dot for tracking

    

    # Display the frame with all the drawn annotations
    cv2.imshow('Webcam', img)

    # Press q to quit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release the webcam and close all windows
cap.release()
cv2.destroyAllWindows()
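
YOLO reports boxes as corner coordinates `(x1, y1, x2, y2)`, whereas `update_tracks` in deep_sort_realtime takes each detection as `([left, top, width, height], confidence, class)`. A small conversion helper keeps the two conventions apart. The sketch below uses hypothetical helper names (`xyxy_to_ltwh`, `ltwh_to_center`) and made-up sample values.

```python
def xyxy_to_ltwh(x1, y1, x2, y2):
    """Convert corner coordinates to [left, top, width, height]."""
    return [x1, y1, x2 - x1, y2 - y1]


def ltwh_to_center(box):
    """Center point of a [left, top, width, height] box, in integer pixels."""
    left, top, width, height = box
    return left + width // 2, top + height // 2


# Made-up detection: corners (100, 50)-(180, 210), confidence 0.9, COCO class 67
bbox = xyxy_to_ltwh(100, 50, 180, 210)
detection = (bbox, 0.9, 67)
print(detection)
print(ltwh_to_center(bbox))
```

Passing corner coordinates where width and height are expected does not raise an error; the tracker simply works with boxes that are too large, so the mistake is easy to miss.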

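The bounding box overlap that DeepSort uses as one of its matching cues is the Intersection over Union (IoU). As a rough illustration only (this is not DeepSort's internal code, and the function name `iou_ltrb` is hypothetical), IoU for two `[left, top, right, bottom]` boxes can be computed like this:

```python
def iou_ltrb(a, b):
    """Intersection over Union of two [left, top, right, bottom] boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


previous = [0, 0, 10, 10]     # box of an existing track
shifted = [5, 0, 15, 10]      # same box moved 5 px to the right
far_away = [20, 20, 30, 30]   # unrelated box

print(iou_ltrb(previous, previous))
print(iou_ltrb(previous, shifted))
print(iou_ltrb(previous, far_away))
```

A detection that overlaps a track's predicted box strongly is a good candidate for the same object, while an IoU of zero rules the pairing out on geometry alone.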