## Converting Pixels in an Image to Physical Distance with a Monocular Camera

Import needed libraries

In [4]:
import numpy as np
import cv2 as cv
import glob
import math

Camera Calibration Code from https://www.youtube.com/watch?v=3h7wgR5fYik

In [5]:
# Chessboard size
chessBoardSize = (9, 9)  # Size of the chessboard
frameSize = (1280, 720)   # Size of the images

# Termination criteria 
criteria = (cv.TERM_CRITERIA_EPS + cv.TERM_CRITERIA_MAX_ITER, 30, 0.001)

# Prepare object points like (0,0,0), (1,0,0), ..., (8,8,0)
objp = np.zeros((chessBoardSize[0] * chessBoardSize[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:chessBoardSize[0], 0:chessBoardSize[1]].T.reshape(-1, 2)

# Arrays to store object points and image points from all images
objPoints = []  # 3D point in real world space 
imgPoints = []  # 2D point in image space 

# Read images from the captured_images folder
images = glob.glob('captured_images/*.jpg')

# Debug: Check if images are found
if not images:
    print("No images found in the 'captured_images' folder. Check the folder path and file extensions.")
else:
    print("Found {} images.".format(len(images)))

for image in images:
    print("Processing:", image)
    img = cv.imread(image)
    if img is None:
        print("Failed to load image {}".format(image))
        continue
    gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
    # Sharpen the image, which may make corner detection more tolerant of blur
    kernel = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]]) 
    sharpened = cv.filter2D(gray, -1, kernel)

    # Find the chessboard corners
    ret, corners = cv.findChessboardCorners(sharpened, chessBoardSize, None)

    # If found, add object points and image points (after refining them)
    if ret:
        objPoints.append(objp)
        corners2 = cv.cornerSubPix(sharpened, corners, (11, 11), (-1, -1), criteria)
        imgPoints.append(corners2)

        # Draw and display the corners 
        cv.drawChessboardCorners(img, chessBoardSize, corners2, ret)
        cv.imshow('img', img)
        cv.waitKey(5000)
    else:
        print("Chessboard corners not found in image {}".format(image))

cv.destroyAllWindows()

# Check if we have enough points for calibration
if len(objPoints) == 0 or len(imgPoints) == 0:
    print("No valid points found for calibration. Please check your images and chessboard pattern.")
else:
    # Perform calibration; ret is the overall RMS reprojection error (in pixels)
    ret, cameraMatrix, dist, rvecs, tvecs = cv.calibrateCamera(objPoints, imgPoints, frameSize, None, None)
    print("Camera Calibrated:", ret)
    print("\nCamera Matrix:\n", cameraMatrix)
    # print("\nDistortion Parameters:\n", dist)
    # print("\nRotation Vectors:\n", rvecs)
    # print("\nTranslation Vectors:\n", tvecs)

Found 5 images.
Processing: captured_images\image_1.jpg
Processing: captured_images\image_2.jpg
Processing: captured_images\image_3.jpg
Processing: captured_images\image_4.jpg
Processing: captured_images\image_5.jpg
Camera Calibrated: 0.8788909130594953

Camera Matrix:
 [[1.73836169e+04 0.00000000e+00 6.01859190e+02]
 [0.00000000e+00 1.73671699e+04 4.59314572e+02]
 [0.00000000e+00 0.00000000e+00 1.00000000e+00]]


We need the focal length of the camera in the x and y directions, fx and fy. <br>
According to https://stackoverflow.com/questions/58269814/how-to-get-camera-focal-length-with-opencv, <br>
the `cameraMatrix` printed above already contains these focal lengths in pixels: fx is entry `[0][0]` and fy is entry `[1][1]`.

In [3]:
fx = cameraMatrix[0][0]
fy = cameraMatrix[1][1]
fx, fy

(17383.616892447702, 17367.169914570306)

Since we use a single monocular camera, the focal length values are constants.
However, the snake's 2D motion is assumed to take place on a different physical level (z value) from the chessboard used for calibration. To convert a number of pixels between two points into an actual distance, we therefore need the distance between the camera and the 2D plane where the motion happens. To find that distance, we still need a reference object of known physical size on that plane.

The formula is F = P * D / W
where:
- F = focal length, in pixels
- P = number of pixels between two points in the image
- D = distance from the camera to the plane of movement
- W = physical distance between the same two points, in the same unit as D

In actual data collection, the reference will be the snake itself. For demo purposes, a zoomed-in image of the chessboard is used as the reference.
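The relation F = P * D / W and its two rearrangements can be sketched as three small helpers. This is a minimal illustration rather than part of the original notebook: the function names are hypothetical, the numbers in the example are made up, and it assumes F and P are in pixels while D and W share one physical unit.

```python
def focal_length_px(P, D, W):
    """F = P * D / W: focal length in pixels."""
    return P * D / W


def distance_to_plane(W, F, P):
    """D = W * F / P: camera-to-plane distance, in the unit of W."""
    return W * F / P


def physical_width(D, P, F):
    """W = D * P / F: physical distance spanned by P pixels on the plane."""
    return D * P / F


# Illustrative numbers only (not measurements): a 0.5 mm feature that spans
# 40 px when seen with a focal length of 16000 px.
D_example = distance_to_plane(W=0.5, F=16000.0, P=40.0)
print(D_example)  # 200.0
```

All three functions express the same similar-triangles relation, so feeding the result of one into another should recover the original input.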

### Calculate distance between the chessboard and the camera for `diff_z.jpg`
To calculate this distance, we rearrange the formula:
D = W * F / P
where:
- D = distance from the camera to the chessboard
- W = physical distance between adjacent corners = 0.5 mm
- F = focal length, in pixels
- P = number of pixels between two adjacent corners

All variables are either known or can be measured from the image, except D, which cannot be read directly from the image.
We obtain P from the code below:

In [15]:
W = 0.5

In [14]:
ref = cv.imread('diff_z.jpg')
chessBoardSize = (9, 9)  # Size of the chessboard
criteria = (cv.TERM_CRITERIA_EPS + cv.TERM_CRITERIA_MAX_ITER, 30, 0.001)

# Detect the chessboard corners in the reference image
gray = cv.cvtColor(ref, cv.COLOR_BGR2GRAY)
kernel = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]]) 
sharpened = cv.filter2D(gray, -1, kernel)
ret, corners = cv.findChessboardCorners(sharpened, chessBoardSize, None)
if not ret:
    raise RuntimeError("Chessboard corners not found in diff_z.jpg")
corners = cv.cornerSubPix(sharpened, corners, (11, 11), (-1, -1), criteria)

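# Measure the pixel distance between horizontally adjacent corners
side_length = []
cols = chessBoardSize[0]  # number of corners per row

for i in range(chessBoardSize[1]):
    for j in range(cols - 1):
        curr_corner = corners[j + i * cols][0]
        next_corner = corners[j + i * cols + 1][0]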
        pixel_num = math.sqrt((curr_corner[0]-next_corner[0])**2 + (curr_corner[1]-next_corner[1])**2)
        side_length.append(pixel_num)
    

P = np.average(side_length)

P

37.787073626947965
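As a cross-check on the loop above, the same horizontal spacing can be computed in vectorized form with NumPy, which also makes it easy to inspect the spread of the spacing. This is a sketch under stated assumptions: `mean_row_spacing` is a hypothetical helper, `corners` is assumed to have the `(N, 1, 2)` shape returned by `cv.findChessboardCorners` and to be ordered row by row, and the grid below is synthetic rather than taken from `diff_z.jpg`.

```python
import numpy as np


def mean_row_spacing(corners, pattern_size):
    """Return (mean, std) of pixel distances between horizontally adjacent corners."""
    cols, rows = pattern_size                      # (corners per row, corners per column)
    grid = np.asarray(corners, dtype=np.float64).reshape(rows, cols, 2)
    steps = np.diff(grid, axis=1)                  # differences along each row
    dists = np.linalg.norm(steps, axis=-1)         # Euclidean length of each step
    return dists.mean(), dists.std()


# Synthetic 9x9 grid of corners with exactly 10 px spacing
xs, ys = np.meshgrid(np.arange(9) * 10.0, np.arange(9) * 10.0)
synthetic = np.stack([xs, ys], axis=-1).reshape(-1, 1, 2).astype(np.float32)
mean_px, std_px = mean_row_spacing(synthetic, (9, 9))
```

A standard deviation that is large relative to the mean may indicate that the board is tilted with respect to the image plane or that lens distortion is significant, in which case a single value of P is only an approximation.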

Calculate D:

In [16]:
D = W * fx / P
D

230.02068199388856

Any other unknown physical distance on this plane can then be calculated by rearranging the formula: <br>
W' = D * P' / F <br>
`W'` is the physical distance to be determined on this plane. <br>
`P'` is the length in pixels of the corresponding path in the image.

Test with P' = 2P, which should give W' = 2W = 1.0 mm:

In [17]:
W_prime = D * P * 2 / fx
W_prime

1.0
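For a path made of several points, such as successive tracked positions, P' can be taken as the summed pixel length of the segments between them. The sketch below is one possible way to apply W' = D * P' / F and is not the notebook's own method: `path_length_physical` is a hypothetical name, it uses a single focal length `F` (reasonable only when fx and fy are close, as they are here), it ignores lens distortion, and the example values are made up.

```python
import numpy as np


def path_length_physical(points_px, D, F):
    """Convert a polyline given in pixel coordinates to a physical length.

    points_px: sequence of (x, y) pixel coordinates along the path
    D: distance from the camera to the plane (any physical unit)
    F: focal length in pixels
    Returns the path length in the same unit as D.
    """
    pts = np.asarray(points_px, dtype=np.float64)
    segments = np.linalg.norm(np.diff(pts, axis=0), axis=1)  # pixel length of each segment
    return float(D * segments.sum() / F)


# Two segments of 5 px and 6 px, so P' = 11 px
path = [(0, 0), (3, 4), (3, 10)]
print(path_length_physical(path, D=100.0, F=1000.0))  # 1.1
```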